BIND 10 #404: serialize RDATA for in memory data source
BIND 10 Development
do-not-reply at isc.org
Tue Apr 12 00:39:10 UTC 2011
#404: serialize RDATA for in memory data source
-------------------------------------+-------------------------------------
Reporter: jinmei | Owner: jinmei
Type: task | Status: accepted
Priority: major | Milestone:
Component: data | Sprint-20110419
source | Resolution:
Keywords: | Sensitive: 0
Estimated Number of Hours: 0.0 | Add Hours to Ticket: 0
Billable?: 1 | Total Hours: 0
Internal?: 0 |
-------------------------------------+-------------------------------------
Comment (by jinmei):
(I'll first provide some responses to the original review)
Replying to [comment:8 vorner]:
> In the benchmark, using vector<uint8> as a buffer and using casted
pointer to &buffer[0] seems kludge. First of, the fact that it should work
is not so apparent from the vector interface, one needs to know the
guarantee that the elements must be stored in continuous array of memory.
I admit this syntax may look tricky. But at the same time I thought this
was a quite well-known and commoly used idiom for decent level of C++/STL
programmers, and I would say we should allow this usage in our code when
need conversion between the STL world and lower-level pure C interfaces.
I noticed you made explict comments about this point in your revised
branch, to which I wouldn't necessarily make an objection.
> Second, sizeof returns the size in chars and the standard guarantees
aliasing compatibility with unsigned char pointers only, not with uint8 *.
So, the compiler is free to deduce these two pointers can never point to
the same memory and the cast breaks that. So, if you want to use vector
and not bare pointer to array, could you please use unsigned chars?
I'm okay with changing uint8 to char (noticing you didn't change that
part). Frankly, though, if we worry about possible incompatibility
between uint8 and unsigned char, I'm afraid we'll find much more serious
cases in our main code (rather than in a benchmark tool).
> In the AbstractMessageRenderer, all the writeUint8 and alike methods are
really small inside and I can't really imagine any way they would do
anything else than just write the data into some buffer (not counting
writeName). The fact you made them virtual, I guess you made the call to
them much more costly than the code inside them and made them impossible
to inline. Isn't there a way to have only one implementation for them? The
dividing into fields could be done in other functions, like writeName.
Personally I'd think it's premature to worry about the overhead of the
virtual functions at this moment (I wouldn't deny the possibility of
finding it to be a critical performance bottleneck when we reach the
stage of optimization). But I also see most of the writeXXX() methods
would be a duplicate of the base class version, so, as a conclusion
I'm okay with defining them non virtual at least until we see the need
for making them virtual. I have some other comments regarding this
point for the actual patch, which I'll explain separately.
> Is std::vector really not portable? We use it in headers all the time.
Is the hiding in pimpl really worth the prize of the overhead and more
complicated implementation?
It's quite likely to be non portable (or non compatible) at the binary
level when we change the STL implementation (even the debug/normal
modes of STLPort are incompatible; for example, we'd have to build and
use gtest with the same mode of STLPort). Basically, I want to
minimize the risk of breaking the compatibility unless there's a
strong need to expose it. I admit we are already breaking this level
of compatibility in various places of our code, but if we use the fact
as an excuse to break it further, it will be more and more difficult
to restore the compatibility when if and we want to do that.
Performance reasons can be a "strong need" in future, so, once we are
in the optimization phase and find the overhead of indirection is a
critical bottleneck, I'm okay with exposing the vector definition.
Until then, I'd rather hide it.
> RdataFieldComposer has many small methods. Is it worth writing them into
the class header and then putting them outside? Isn't it nice to put their
bodies inside the class and hint the compiler these methods are worth
inlining (though the compiler does mostly what it wants today anyway).
I suspect this comment doesn't apply to your own patch, which
eliminated many of the RdataFieldComposer::writeXXX methods, so I'll
skip this point.
> The RdataFields::toWire has a weard statement: condition ? true : false.
Wouldn't condition (without the ?) be enough?
Ah, this seems to me a common bad practice of mine. I'm not sure why
I tend to make this type of expression unnecessarily complicated, but,
you're right; we don't (shouldn't) need ?:.
> Looking at the FieldSpec ‒ is it probable we use all 16 bits of the
length field? Surely not for a name and if it is possible to have such
long binary fields (I doubt there are any in the wild), we could split it
into more consecutive fields. If we spared 2 bits from the length, we
could put the type of field to the same 2-byte word (either by manually
masking & shifting) or by field width specification (but I fear it's
feature of C99). Since now it takes 3 bytes and another byte will be
padding, because the uint16 can't live on odd address, this would make the
size half.
I agree it makes sense to consider compressing the fields further, but
that would probably be a next step, probably after measuring the
actual memory footprint with real world example (big) zones.
--
Ticket URL: <http://bind10.isc.org/ticket/404#comment:17>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list