BIND 10 #404: serialize RDATA for in memory data source

Wed Apr 13 08:44:29 UTC 2011

#404: serialize RDATA for in memory data source
-------------------------------------+-------------------------------------
                 Reporter:  jinmei   |                Owner:  jinmei
                     Type:  task     |               Status:  reviewing
                 Priority:  major    |            Milestone:
                Component:  data     |  Sprint-20110419
  source                             |           Resolution:
                 Keywords:           |            Sensitive:  0
Estimated Number of Hours:  0.0      |  Add Hours to Ticket:  0
                Billable?:  1        |          Total Hours:  0
                Internal?:  0        |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:20 vorner]:

 > > It's quite likely to be non portable (or non compatible) at the binary
 > > level when we change the STL implementation (even the debug/normal
 > > modes of STLPort are incompatible; for example, we'd have to build and
 > > use gtest with the same mode of STLPort).  Basically, I want to
 > > minimize the risk of breaking the compatibility unless there's a
 > > strong need to expose it.  I admit we are already breaking this level
 > > of compatibility in various places of our code, but if we use the fact
 > > as an excuse to break it further, it will be more and more difficult
 > > to restore the compatibility when if and we want to do that.
 >
 > I'd really be worried to go that way, it really complicates matters a
 lot ‒ we wouldn't be able to put anything except basic types into the
 headers, not even a shared pointer. This kind of thing is usually solved
 by distro maintainers anyway. So, maybe it's a thing we should bring up to
 some call? Or, was it already discussed sometime?

 I don't think the incompatibility like the one between debug/normal
 modes of STLPort will be taken care of by the distro.  As for the
 call/previous discussion, we've discussed things related to this topic
 time to time, but as far as I know we don't really have a clear
 consensus about how much we'd allow non basic types to be in public
 interfaces.

 I'm happy to raise this project wise, but I'm a bit pessimistic we can
 have a consensus.  For this specific case, as you seem to want to
 avoid hiding buffer quite strongly, and as I have to admit it wouldn't
 be very convincing to insist on this case while leaving many others,
 I'm okay with leaving the decision on this point to you.

 > Replying to [comment:18 jinmei]:
 > > Replying to [comment:15 vorner]:
 > > - Also as I said, I'd still hide the buffer using pimpl at the moment.
 >
 > Well, there are actually two issues on that. [...]
 >
 > So I'm little bit reluctant to do that. Is it about the binary
 compatibility as well?

 Yes, that's the same discussion.  I simply repeated it as a summarized
 specific point to the branch code itself.  See above.

 > > - In general I don't like to have a non private member variable unless
 it's absolutely necessary (just as many C++ books suggest), [...]
 >
 > Again, what good does it bring?  Because anybody is able to fully
 overwrite the buffer with the write* methods, extend it, shrink it, etc.
 So it doesn't bring any protection of integrity. It only complicates
 matters for whoever is writing the writeName() function. The rule in
 whatever book is rule of thumb, that you don't want just anybody being
 able to dig in your guts. This doesn't apply here, since everybody is
 already able to do so (and, by definition of what the class does, must
 be). Or do I miss some benefit here?

 My main concern in this context is to ensure the (larger) flexibility
 of changing the internal representation before many other clients,
 hopefully third party ones, relies on the public interfaces.  Once
 someone starts relying on the fact that the backend of MessageRenderer
 is OutputBuffer, we won't be able to change that without breaking
 compatibility of that deployment.  It's true that it's not a problem
 today, and, if we are lucky (or unlucky that no one else is interested
 in using our APIs) this may never be an issue in future.  But exactly
 because it's not an immediate threat we tend to underestimate the
 risk, and will only regret the decision when the risk is realized and
 it's too late.

 I understand it's a tradeoff between that kind of robustness and
 complexity due to more indirections.  I also understand when there are
 many public methods that access the internal data it can effectively
 mean exposing the data itself.  But, in this case, taking these into
 account I still prefer publishing the internal data representation.

 Like I said in the above paragraph, I understand it's a controversial
 point and different people may have different opinion.  If you still
 don't like my suggestion, I propose bringing this to the dev list and
 resolve the conflict consensus-basis.

 > > - [comment] This is an off topic for this ticket, but I've been
 > thinking it makes more sense to hide the output buffer inside the
 > (Abstract)MessageRenderer [...]

 > It does make sense in the terms of clean design. On the other side, it
 might make one optimisation a little bit harder ‒ currently, it is
 possible to reuse the same buffer to render many messages. Maybe it would
 be possible to reuse the whole renderer, though.

 MessageRender should also be able to be reused.  In fact that was my
 original intent.

 > Anyway, could I propose that we take the lazy approach, make a
 > ticket or comment at the interface and leave it as it is until the
 > change brings any advantage?

 That's fine.  In general I agree we should limit the scope of
 individual tickets.

 > > '''rdatafields.h'''
 > > - I don't understand the point of changing the return type of
 getFieldSpecData().  Which specific alias problem does the change try to
 solve?
 >
 > Thinking about it once again, I was wrong. The problem would be if
 anybody tried to access the data by a pointer to something different than
 `char`, `FieldSpec` or `Type/uint16_t` (depending on the part of data
 accessed). So, in that sense, `void *` is incompatible with the data,
 meaning that you can't access the `void` pointed to because of aliasing
 rules. However, no one can really access `void`, so it's not an issue.

 Okay.

 > So I might return it back to `void *`. However, thinking about it, why
 we returned  `void *` in the first place, when the docs say it can be
 type-casted to `FieldSpec`? In that sense, the API seems inconsistent,
 since `void *` usually describe „typeless unknown data“. Wouldn't
 returning a `FieldSpec *` make more sense?

 Hmm, good point.  I don't remember what I exactly thought about the
 interface consistency when I first wrote the original code, but my
 general intent is to treat the exposed data as opaque as possible.
 So, if it doesn't break other part, it will probably make more sense
 to change the second constructor of RdataFields so that it takes void*
 instead of FieldSpec* (and internally convert it to FieldSpec* using
 cast).  And, in that case, it may also make sense to use the actual
 data length rather than the number of spec's.

 > Sorry for disagreeing so much, but I can't help myself feeling it
 differently.

 No worries.  I understand many of the above points are controversial
 where people's opinions may vary.

-- 
Ticket URL: <http://bind10.isc.org/ticket/404#comment:21>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development