BIND 10 #363: "H" and "I" are probably harmful for PyArg_ParseTuple()
BIND 10 Development
do-not-reply at isc.org
Wed Mar 16 06:12:00 UTC 2011
#363: "H" and "I" are probably harmful for PyArg_ParseTuple()
-------------------------------------+-------------------------------------
Reporter: jinmei | Owner: jinmei
Type: defect | Status: reviewing
Priority: minor | Milestone: A-Team-
Component: | Sprint-20110316
DNSPacket API | Resolution:
Keywords: | Sensitive: 0
Estimated Number of Hours: 0.0 | Add Hours to Ticket: 0
Billable?: 1 | Total Hours: 0
Internal?: 0 |
-------------------------------------+-------------------------------------
Comment (by jinmei):
Replying to [comment:9 zzchen_pku]:
> > It didn't pass a test for me:
> For my 32bit OS, the range of interger is (-0x80000000, 0x7fffffff),so
the test will throw TypeError for 0x80000000. Since the unittest is
platform dependent, I don't think it make sense to keep it.
You mean "since the unittest shouldn't be platform dependent"? Anyway
I see the point. We could still test the overflow cases selectively
(with a helper routine to identifying the size of int, long, etc), but
that's probably overkilling. So I'm okay with removing it.
> > Also, it was not clear to me why you chose long (instead of int) in
> > some cases while choosing int for others. Please clarify the
> > rationale.
> I find this in python2 doc:
> {{{
> python2 supports numeric types: plain integers, long integers
> Plain integers (also just called integers) are implemented using long in
C, which gives
> them at least 32 bits of precision,Long integers have unlimited
precision.
> }}}
> and also
> {{{
> In Python 3, there is only one integer type, called int, which mostly
behaves like the
> long type in Python 2.
> }}}
> PyArg_ParseTuple() is responsible for converting a Python integer to a
plain C numeric type, as described above, I think long precision is more
suitable to prevent overflow.
On thinking about it further, I now agree long is better than int, but
not for that reason. The above citation is about internal
representation of integers in the python language, and regardless of
the internal representation PyArg_ParseTuple() can convert them to
various C integer types, so the question of int vs long (vs short,
etc) still holds.
In our usage many integers are unsigned 16-bit ones. To convert them
safely (without worrying about overflow), we need at least 32-bit
signed integers. That's why I agree long (which is ensured to be at
least 32-bit) is better than int (which can be e.g. 16-bit). For this
rationale, in some cases we really don't need long, and in some other
cases even long is not safe enough. See specific cases below.
Now, specific comments about the code:
- this may not specific to this patch, but some calls to PyErr_Clear()
seem to be redundant. example:
{{{
if (messageflag < 0 || messageflag > 0xffff) {
PyErr_Clear();
PyErr_SetString(PyExc_OverflowError, "Message header flag out of
range");
}}}
- As far as I can see, some of the long int variables don't have to be
so (they can be a plain int): Message_init, Message_clear,
MessageRenderer_setCompressMode. (although being long is not
necessarily incorrect; it just consumes more (stack) space
unnecessarily). Actually, in these cases I think it make more sense
to do this check within the corresponding C++ class (Message,
MessageRenderer, etc) and delegate the check to it.
- on the other hand, in this case the integer has to be long:
- Message_setQid() (because size of int may be 16 bits)
- Message_addRRset: I'd avoid using hardcoded range here. Actually,
in this case I think it make more sense to delegate the check to the
C++ code.
- Name_init: I wouldn't bother to check '> 0xffff'. In fact, the
buffer could store longer data than 64KB. Invalid (positive)
positions will be rejected in setPosition() anyway. (with removing
that check) I'd also consider this case a "type error" rather than
overflow, that is, a negative integer type is passed where an
unsigned is expected. Same comment applies to other similar cases.
- Name_at, Name_split: likewise. And, in this case it doesn't even have
to be
long (int should be sufficient).
- Name_split: is this case tested? (catching it seems to be correct and
good)
{{{
+ PyErr_Clear();
+ PyErr_SetString(PyExc_TypeError,
+ "No valid type in split argument");
}}}
- Rcode_init: ext_code doesn't have to be long.
- RRTTL_init: this doesn't seem to be correct:
{{{
- } else if (PyArg_ParseTuple(args, "I", &i)) {
+ } else if (PyArg_ParseTuple(args, "k", &i)) {
}}}
"k" omits overflow check, so it essentially has the same problem as
"I". One solution is to use "L" with long long (assuming our target
system has it; this is probably acceptable. We already rely on
uint64_t etc, anyway). Another solution is to ignore overflow in
this specific case.
- message_python_test.py: the change makes the following comment
incorrect:
{{{
# this would cause overflow and result in a "valid" flag
}}}
- tests in general. I've not closely checked the test coverage, but
from a quick look there seems to be some cases that were not tested
(e.g. for RRTTL). Please make sure all new code is covered by tests
(note also that the coverage problem is unlikely to happen if it was
developed test driven); or, if I miss something and the code was
fully covered already, please say so.
--
Ticket URL: <http://bind10.isc.org/ticket/363#comment:10>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list