BIND 10 #2850: Define and implement ZoneTableSegmentMapped

BIND 10 Development do-not-reply at isc.org
Fri May 3 11:47:21 UTC 2013


#2850: Define and implement ZoneTableSegmentMapped
-------------------------------------+-------------------------------------
            Reporter:  jinmei        |                        Owner:
                Type:  task          |  jinmei
            Priority:  medium        |                       Status:
           Component:  data source   |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130514
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  5             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |  shared memory data source
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
Changes (by muks):

 * owner:  muks => jinmei


Comment:

 Hi Jinmei

 Firstly, this is an incomplete set of changes. I have implemented most
 points from your review comments, but before spending more time on this
 ticket, I want to be sure that the design looks reasonable to you.  I
 have to implement more detailed tests, but I want to hold off on it
 until the interfaces and their documentation look fine to you.

 The review comments that have not been addressed have a reply of FIXME
 below.

 Replying to [comment:10 jinmei]:
 > '''general'''
 >
 > - On a closer look, we really don't even need `getZoneWriter()` method
 >   at all, as `ZoneWriter` is now a concrete class and can be
 >   instantiated directly.  That will make the user side simpler.  I
 >   suggest:
 >   - removing this method
 >   - checking the "writable" condition in the `ZoneWriter` constructor
 >   - adjusting the user side accordingly (which should be trivial).  I
 >     committed this change for client_list.cc (85873bf)

 The method has been removed, and code everywhere has been adjusted
 for it. We check the writable condition in the `ZoneWriter`
 constructor now.

 > '''Makefile.am (multiple)'''
 >
 > - We need to make sure "mapped" related sources are compiled only if
 >   "USE_SHARED_MEMORY" is set.  I also suggest building and testing the
 >   whole tree `--without-shared-memory`.

 This is implemented.

 > '''zone_table_segment.cc'''
 >
 > - Like Makefile.am, we need to make sure we won't include or
 >   instantiate mapped-related things when unavailable.

 This is implemented.

 > - getZoneWriter: I suggest using `isc::InvalidOperation` instead of
 >   `Unexpected`.  I'd use the latter only when it's really
 >   "unexpected", such as a very rare system error the case where it's
 >   most likely an internal bug but we cannot completely eliminate the
 >   possibility of a bad user input.  (but per the above general
 >   suggestion, this suggestion would go to the `ZoneWriter` constructor).

 This is now changed (in the `ZoneWriter` constructor).

 > '''zone_table_segment.h'''
 >
 > - getZoneWriter: it needs to describe possible exception.
 >   (but per the above general suggestion, this suggestion would go to
 >   the `ZoneWriter` constructor).

 Done.

 > - isWritable doc: in general, I'd like to keep the base class
 >   documentation agnostic about derived class details.  So I suggest
 >   changing it to:
 > {{{#!cpp
 >     /// The user of the zone table segment will load or update zones
 >     /// into the segment only for writable ones.  The precise definition
 >     /// of "writability" differs in different derived classes (see
 derived
 >     /// class documentation).  In general, however,  the user should
 only rely
 >     /// on this interface rather than assume a specific definition for a
 >     /// specific type of segment.
 > }}}
 >   And update derived classes documentation about their definition of
 >   writability.

 Done.

 > - isWritable doc: it's probably safe to say an implementation of this
 >   method must be exception free (simply because it should be able to
 >   be so).

 Done.

 > - reset() must be defined as virtual (consider, e.g., how we implement
 >   #2852.  no one other than the base `ZoneTableSegment` class (and
 >   memmgr, though still indirectly in that case) has to be aware of the
 >   (non)existence of specific derived classes).

 This is now updated now. See the end of this ticket comment for further
 discussion.

 > '''zone_table_segment_mapped.h'''
 >
 > - As noted above, `reset` must be defined at the base class, and so be
 >   `MemorySegmentOpenMode`.  The documentation should be abstracted
 >   (CREATE, READ_WRITE, READ_ONLY should be explained at some
 >   conceptual level, not directly referring to a "file", etc).

 Done.

 > '''zone_table_segment_mapped.cc'''
 >
 > - you don't have to define the destructor (in this case)

 Removed.

 > - I'd avoid hardcoding magic keywords like "zone_table_checksum".

 Done.

 > - isWritable(): I'd simply return false if `mem_sgmt_` is NULL.
 >   In fact, this is completely valid because it's indeed "non writable"
 >   without the underlying memory segment.  If we throw in that case,
 >   the caller (possibly) needs to do behave differently depending on
 >   whether it results in an exception, which is convenient.  And, I
 >   actually expect we'll encounter this situation in client_list.

 Did you mean to say "inconvenient" instead of "convenient" above? I have
 made `isWritable()` not throw anymore (in API specification too), and it
 returns false for uninitialized situation.

 > - getHeader, getMemorySegment: in my sense I wouldn't use `Unexpected`
 >   in these cases (see above).  Also, if we allow getHeader to throw,
 >   this should be documented in the base class.

 Done.

 > '''ZoneTableSegmentMapped::reset'''
 > - I'd perform validation checks at the beginning of the method.

 I don't know exactly which validation checks are meant here, but if
 these are the config validation, it is done early now.

 > - this method is too big to understand.  in particular, the
 >   huge switch block severely damages the readability (for example in
 >   my screen I cannot even be sure how many cases it has "at a
 >   glance").  I suggest delegating each case to helper private methods:
 > {{{#!cpp
 > void
 > ZoneTableSegmentMapped::reset(MemorySegmentOpenMode mode,
 >                               isc::data::ConstElementPtr params)
 > {
 >     // validation
 >
 >     switch (mode) {
 >     case CREATE:
 >        resetCreate();
 >        break;
 >     case READ_WRITE:
 >        resetReadWrite();
 >        break;
 >     case READ_ONLY:
 >        resetReadOnly();
 >        break;
 >     default:
 >        // hmm, what should we do? (see below)
 >     }
 > }
 > }}}
 >   If the method is concise like this, it's easier to notice that there
 >   could be a missing case as commented above (and see the next point).

 You are absolutely right. I did not spend time in rearranging the code,
 being distracted by getting the various cases in the implementation
 correctly handled. The code indeed was too big there.

 It has been split into individual methods now. :)

 > - we need to handle the "default" case above.  enums cannot completely
 >   eliminate this case.

 I think of this everytime when writing `switch`, but I believe the
 current approach is superior. This is because modern compilers warn if
 you miss handling an enum value in a switch statement, which will result
 in a compile error in our project.

 Consider you have enum code written with a `default` case in `switch`
 statements in the codebase. After some time, you add another value to
 the enum. The tree compiles without any errors, and you have a situation
 where you have to find every occurance of affected switch statements that
 you need to update, but it's possible to miss some when checking
 manually. You notice they are not handled only when the execution path
 hits a particular switch statement and executes its `default` case.

 In our switch, every enum value is handled, and if something isn't, the
 compiler will error at compile time.

 As it is an error when an `int` is passed for an `enum`, we are not
 going to have values out of enum range in this use of `enum`.

 > - this method repeats same/similar patterns for handling checksum and
 >   table header.  I'd consider unify them, especially because some
 >   cases would be tricky (see below).

 These have been unified now.

 > - I'd like to clarify the level of exception guarantee as much as
 >   possible.  The current implementation only provides weak guarantee
 >   (strictly speaking it doesn't even provide the weak level because
 >   some allocated resource could remain in the mapped file).  I noticed
 >   it's probably impossible to provide the strong guarantee (either
 >   success or fall back to original state) for all cases due to the
 >   restriction of mapped memory segment, but I can still see it
 >   possible in many other cases (and these are actually our intended
 >   use cases).  For example, if the files are different, we can delay
 >   releasing the original segment (we need to remember the original
 >   file name).  So, I would:
 >   - clarify in which case we can provide the strong guarantee
 >
 >   - if it's not too complicated, provide the strong guarantee when
 >     it's possible
 >   - also if possible, make it possible for the caller to tell which
 >     level of exception guarantee is provided in case of exception
 >     (e.g., by using different types of exception).
 >   - document these, including what the caller should do in each case.
 >     note also that this method must be virtual, and the documentation
 >     should be generic (not relying on the mapped version specifics).

 These are done. Please review if the exception guarantees are
 reasonable.

 > - maybe we should also make it possible to clear (just close any
 >   existing segment) the segment?

 A `clear()` method has been added to `ZoneTableSegment` now.

 > - in general, we need to cover the case where allocate() results in
 >   `MemorySegmentGrown` exception. same for `ZoneTable::create()`, and
 >   setNamedAddress (if it returns true).  for that matter, I noticed
 >   `ZoneTable::create()` itself would need some tricker consideration
 >   regarding this point.  One solution would be to handle these things
 >   correctly, of course.  But, considering it should be quite unlikely
 >   at the initialization time (with a sufficiently large size of
 >   initial segment size), maybe some optimistic simplification is
 >   acceptable:
 > {{{#!cpp
 >     // allocate the space for the named address first
 >     segment->setNamedAddress("zone_table_header", NULL);
 >     try {
 >         void* ptr = segment->allocate(sizeof(ZoneTableHeader));
 >         ZoneTableHeader* new_header = new(ptr)
 >             ZoneTableHeader(ZoneTable::create(*segment, rrclass_));
 >         const bool grown =
 >             segment->setNamedAddress("zone_table_header", new_header);
 >         assert(!grown); // note that we've already allocated the space
 above.
 >         header_ = new_header;
 >     } catch (const MemorySegmentGrown&) {
 >         // Unlikely case.  Don't bother to try to recover.
 >         isc_throw(MaybeSomeCustomException);
 >     }
 > }}}

 I have adopted a variant of it that is a little more graceful if
 `allocate()` returns NULL. Please review this code carefully.

 > - I'd assert checksum is not NULL here:
 > {{{#!cpp
 >             uint32_t* checksum = static_cast<uint32_t*>
 >                 (segment->getNamedAddress("zone_table_checksum"));
 > }}}

 Such cases are taken care of now, in the code using `std::pair` returned
 by `getNamedAddress()`.

 > - not directly related, but I just realized the return value of
 >   `getNamedAddress()` is ambiguous: it cannot distinguish the case
 >   where the given name doesn't exist from the case where the named
 >   address is NULL.  If not too heavy, I'd also revise the interface
 >   so it returns `std::pair<bool, void*>`, whose first member indicates
 >   whether the name exists.

 This has been implemented.

 > - In my sense I wouldn't use `Unexpected` in this case (see above)
 > {{{#!cpp
 >             if (saved_checksum != new_checksum) {
 >                  isc_throw(isc::Unexpected,
 >                            "Saved checksum doesn't match mapped segment
 data");
 >             }
 > }}}
 >   The most likely cause of this would be that the segment data is
 >   broken, which is completely possible.  I'd probably use a dedicated
 >   exception type for cases of broken segments in general; the caller
 >   may want to handle that case specifically (e.g., remove it and
 >   create a new one from the scratch).  I'd also include the file name
 >   in the what() message (maybe also for some other cases too).

 I decided not to use a dedicated exception type for it, as it would
 result in two more exception types for whether the segment is usable or
 not after that (depending on whether it is the same segment being
 opened: see how the exception guarantee was implemented). The code
 throws one of `ResetFailed` and `ResetFailedAndSegmentCleared` now.

 The filename is included now in the exception message.

 > - this breaks strong exception guarantee:
 > {{{#!cpp
 >         header_ = static_cast<ZoneTableHeader*>
 >             (segment->getNamedAddress("zone_table_header"));
 >         if (!header_) {
 >             isc_throw(isc::Unexpected,
 >                       "There is no previously saved ZoneTableHeader in a
 "
 >                       "mapped segment opened in read-only mode.");
 >         }
 > }}}

 I couldn't tell how it was an issue in the earlier state of code, but
 can you check the current state of code?

 > - this comment should be able to be removed:
 > {{{#!cpp
 >         // The segment was already shrunk when it was last closed. Check
 >         // that its checksum is consistent.
 >         // FIXME: We can't really do this as we can't set the checksum
 >         // to 0 for checksum calculation in a read-only segment.
 > }}}

 I have revised the comment, but see discussion at the end of this ticket
 comment.

 Replying to [comment:12 jinmei]:
 > '''zone_writer.h/cc'''
 >
 > - maybe a matter of taste, but we might pass a reference of
 >   `ZoneTableSegment` to the `ZoneWriter` constructor if it cannot be
 >   NULL.

 This has been implemented.

 > '''zone_writer.h'''
 >
 > - name is not described, install_action is not part of the parameters:
 > {{{#!cpp
 >     /// \brief Constructor
 >     ///
 >     /// \param segment The zone table segment to store the zone into.
 >     /// \param load_action The callback used to load data.
 >     /// \param install_action The callback used to install the loaded
 zone.
 >     /// \param rrclass The class of the zone.
 >     ZoneWriter(ZoneTableSegment* segment,
 >                const LoadAction& load_action, const dns::Name& name,
 >                const dns::RRClass& rrclass);
 > }}}

 Fixed.

 > - cleanup: I'm not sure about the intent of "generally" in the
 >   original base class version, but if that was simply because derived
 >   classes could behave differently, I guess we can simply remove this
 >   word:
 > {{{#!cpp
 >     /// Generally, this should never throw.
 >     void cleanup();
 > }}}
 >   In fact, I don't understand how it could throw now.

 The API doc has been updated using a simpler `\throw` directive.

 > '''zone_writer.cc'''
 >
 > - I'd include zone_table_segment.h explicitly as long as the .cc file
 >   directly refers to the definition.

 The `ZoneWriter` class interface in `zone_writer.h` requires
 `ZoneTableSegment` to be known, and `zone_writer.cc` implements that
 interface, so it ought to be known by then to the `.cc` code too. (This
 is why the code compiles currently.)

 There is the small possibility that `ZoneTableSegment` as known to the
 class interface in `zone_writer.h` is an opaque type. But compile will
 fail anyway if this is the case.

 I don't have a strong opinion as it doesn't cause problems. It will just
 be redundant. If you want this anyway, I'll add it. :)

 > '''zone_table_segment_test.h'''
 >
 > - We shouldn't need getZoneWriter() method at all.

 Removed.

 > '''zone_table_segment_mapped_unittest.cc'''
 >
 > - I guess some comments on the main code would require more tests
 >   (checking exception guarantee, invalid mode, etc)

 FIXME

 > - a minor point, but why you define both the destructor and
 >   `TearDown`?  Especially when you have the former, there's basically
 >   no point to have `TearDown` unless, e.g., you want to check
 >   something with `ASSERT_xx`.

 `TearDown()` has been removed now.

 > - it's generally better to avoid initializing an object of a non
 >   trivial type in a namespace level:
 > {{{#!cpp
 > const std::string mapped_file = TEST_DATA_BUILDDIR "/test.mapped";
 > }}}
 >   to avoid fiasco.

 This has been changed to a `const char*`.

 > - resetBadConfig: missing case: params is NULL.

 A test has been added for it.

 > - reset test: it generally seems to lack (some) necessary checks for
 >   each mode:
 >   - For CREATE: we need to confirm it really creates a new file or at
 >     least removes any existing content, and the resulting segment is
 >     writable.
 >   - For READ_WRITE: in case the segment already exists we need to
 >     confirm it doesn't change existing content (except for the
 >     checksum), and the resulting segment is writable.
 >   - For READ_ONLY: we need to confirm it doesn't change any existing
 >     content, and the resulting segment is not writable.

 FIXME

 > - I believe the previous cases can be tested by inspecting/tweaking the
 >   underlying memory segment.  Also, I'd separate these three cases.
 >   Each case is complicated enough, so mixing them would tend to cause
 >   missing cases, and even if it's comprehensive it's not clear for the
 >   reader of the test code (= me).

 FIXME

 > - As for corrupted data, I think it's better to test it at some
 >   realistic level.  Data corruption will certainly happen some time,
 >   while quite rare, so it's better we can be sure it works as we
 >   expect when it finally happens.

 FIXME

 > This completes my review.  I made and pushed relatively small and
 > minor changes (basically trivial cleanups and editorial/style matters)
 > directly.

 These look fine to me.

 I am still not entirely satisfied with the branch. Some points:

 * I think the checksum should be stored separately from the segment, and
   there should be no possibility that the memory area for the checksum
   could itself influence its computation. The current code of patching 0
   during checksum computation is hackish and opens the possibility of
   broken code. Why do this when we can cleanly separate the checksum
   storage?

   Another option is to remove checksums altogether. It only checks one
   byte of each page for corruption, and I wonder what value there is to
   do this, esp. now that we have code to mutually exclude writer and
   reader instances. If it is to check for memory corruption, we don't
   checksum many other things that may get corrupted in the auth server.

 * It seems that most of the functionality of
   `ZoneTableSegmentMapped::reset()` can be abstracted out to the base
   class so it works for all implementations of
   `ZoneTableSegment`. Methods like `isWritable()`, `getHeader()`,
   etc. also need not be implemented outside the base `ZoneTableSegment`
   then. Without this, a caller would have to know about specialization
   details, such as not to call `reset()` on a
   `ZoneTableSegmentLocal`, etc.

 * A question about the design: why support `reset()`?  Is it for
   performance that we re-use a `ZoneTableSegment` instead of
   `reset()`ing during construction and simply creating another object in
   case the underlying storage changes?

-- 
Ticket URL: <http://bind10.isc.org/ticket/2850#comment:13>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list