BIND 10 #2831: define and implement MemorySegmentMapped

Tue Apr 9 09:09:34 UTC 2013

#2831: define and implement MemorySegmentMapped
-------------------------------------+-------------------------------------
            Reporter:  jinmei        |                        Owner:
                Type:  task          |  jinmei
            Priority:  medium        |                       Status:
           Component:  data source   |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130423
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  5             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |  shared memory data source
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
Changes (by muks):

 * owner:  muks => jinmei

Comment:

 Hi Jinmei

 Apologies for the delayed review. Many factors took away time.

 Here are my review comments. I kept asking why something is designed the
 way it is, so some commentary on design decisions in the API doc would
 be helpful.

 * Let's look at the following API doc comment of `setNamedAddress()`:

 {{{
     /// \c addr must be 0 (NULL) or an address that belongs to this
 segment.
     /// The latter case means it must be the return value of a previous
 call
     /// to \c allocate().  The actual implementation is encouraged to
 detect
     /// violation of this restriction and signal it with an exception, but
     /// it's not an API requirement.  It's generally the caller's
     /// responsibility to meet the restriction.
     ///
 }}}

 It seems like this restriction can be enforced in the API itself to
 prevent programming mistakes, by having a single static method that
 allocates a memory buffer with an associated name. Deallocating can also
 be done similarly by passing the name string without requiring the
 (perhaps relocated) addresses.

 * The API doc comment of `allocate()`:
 {{{
     /// Depending on the implementation details, it may have to grow the
     /// internal memory segment (again, in an implementation dependent
 way)
     /// to allocate the required size of memory.  In that case the
     /// implementation must grow the internal segment sufficiently so the
     /// next call to allocate() for the same size will succeed, and throw
     /// a \c MemorySegmentGrown exception (not really allocating the
 memory
     /// yet).
     ///
     /// An application that uses this memory segment abstraction to
 allocate
     /// memory should expect this exception, and should normally catch it
     /// at an appropriate layer (which may be immediately after a call to
     /// \c allocate() or a bit higher layer).  It should interpret the
     /// exception as any raw address that belongs to the segment may have
     /// been remapped and must be re-fetched via an already established
     /// named address using the \c getNamedAddress() method.
     ///
 }}}

 Because `MemorySegmentGrown` is thrown and no memory is allocated
 according to the API doc, should the caller call `allocate()` again to
 allocate the memory?

 I am trying to follow the design behind why `MemorySegmentGrown` is an
 exception, and not a value returned as the method's return value, or in
 a method argument. It is not an exceptional situation that should be
 allowed to unwind the stack all the way upwards, even if by mistake.
 `allocate()` can return `NULL` and a `bool` in an argument this case,
 which the caller can use and pass on even to its caller.

 It also seems inconsistent that `setNamedAddress()` doesn't throw if it
 allocates memory, which may also cause addresses previously returned by
 `allocate()` to be relocated (similar to when `allocate()` throws
 `MemorySegmentGrown`).

 * The `setNamedAddress()` API doc does not describe what happens if
   `NULL` is passed in the `addr` argument. If `NULL` will clear the
   association, then `clearNamedAddress()` would be redundant (we should
   not have to support two different ways of clearing the
   association). If indeed `NULL` can be associated with a name, it
   should be documented (with what a sample use-case may be for storing
   `NULL`).

   **Update:** I see now that there is a test for the `NULL` case. We
   should document the behavior anyway for the question raised above.

 * In the API doc of `MemorySegmentMapped()`, is it assumed that if
   `create` argument is true, then any existing file will be overwritten
   (truncated)?  If this is not the case, can the `create` argument be
   removed, such that this read+write version of `MemorySegmentMapped()`
   auto-creates the file if it doesn't exist, or opens it if it does
   exist?

 * Is there a reason why you use 0 instead of `NULL` in a lot of places,
   including unittests? (They are equivalent, but `NULL` is a gentle
   reminder that it's a pointer).
 {{{
 void*
 MemorySegmentLocal::getNamedAddress(const char* name) {
     std::map<std::string, void*>::iterator found =
 named_addrs_.find(name);
     if (found != named_addrs_.end()) {
         return (found->second);
     }
     return (0);
 }
 }}}

 * I think this public variable's definition should go into a `.cc` file:
 {{{
     /// \brief The default value of the mapped file size when newly
 created.
     ///
     /// Its value, 32KB, is an arbitrary choice, but considered to be
     /// sufficiently but not too large.
     static const size_t INITIAL_SIZE = 32768;
 }}}

 * Can the following comment be removed from
 `MemorySegmentMapped::setNamedAddress()`? It seems weird reading it by
 itself (it is as if by default the base `MemorySegment` interface
 doesn't expect `true` to be returned):
 {{{
     /// This version can actually return true.
     ///
 }}}

 * `getNamedAddress()` has different behavior in the different
 implementations with `name=NULL`.  If we don't want to support
 `name=NULL` , it should be documented in the
 `MemorySegment::getNamedAddress()` base class interface. Or we should
 add:

 {{{
 diff --git a/src/lib/util/memory_segment_local.cc
 b/src/lib/util/memory_segment_local.cc
 index 52de66b..e230f18 100644
 --- a/src/lib/util/memory_segment_local.cc
 +++ b/src/lib/util/memory_segment_local.cc
 @@ -53,6 +53,9 @@ MemorySegmentLocal::allMemoryDeallocated() const {

  void*
  MemorySegmentLocal::getNamedAddress(const char* name) {
 +    if (name == NULL) {
 +        return (NULL);
 +    }
      std::map<std::string, void*>::iterator found =
 named_addrs_.find(name);
      if (found != named_addrs_.end()) {
          return (found->second);
 diff --git a/src/lib/util/memory_segment_mapped.cc
 b/src/lib/util/memory_segment_mapped.cc
 index 3a90ccd..2264385 100644
 --- a/src/lib/util/memory_segment_mapped.cc
 +++ b/src/lib/util/memory_segment_mapped.cc
 @@ -172,6 +172,9 @@ MemorySegmentMapped::allMemoryDeallocated() const {

  void*
  MemorySegmentMapped::getNamedAddress(const char* name) {
 +    if (name == NULL) {
 +        return (NULL);
 +    }
      offset_ptr<void>* storage =
          impl_->base_sgmt_->find<offset_ptr<void> >(name).first;
      if (storage) {
 diff --git a/src/lib/util/tests/memory_segment_common_unittest.cc
 b/src/lib/util/tests/memory_segment_common_unittest.cc
 index 2ffdb7c..f721564 100644
 --- a/src/lib/util/tests/memory_segment_common_unittest.cc
 +++ b/src/lib/util/tests/memory_segment_common_unittest.cc
 @@ -25,6 +25,9 @@ namespace test {

  void
  checkSegmentNamedAddress(MemorySegment& segment, bool out_of_segment_ok)
 {
 +    // getNamedAddress(NULL) should not crash.
 +    EXPECT_EQ(static_cast<void*>(NULL), segment.getNamedAddress(NULL));
 +
      // If not exist, null pointer will be returned.
      EXPECT_EQ(static_cast<void*>(0), segment.getNamedAddress("test
 address"));

 }}}

 * Same in `clearNamedAddress()` too.

 * The exception message is incorrect:
 {{{
 void
 MemorySegmentMapped::deallocate(void* ptr, size_t) {
     if (impl_->read_only_) {
         isc_throw(InvalidOperation, "allocate attempt on read-only
 segment");
     }

 }}}

 * What are the conditions under which this assertion is triggered? i.e.,
   how did you arrive at this `INITIAL_SIZE` in the condition
   expression?
 {{{
     // It appears an assertion failure is triggered within Boost if the
 size
     // is too small.  To work this around we'll make it no-op if the size
 is
     // already reasonably small.
     if (getSize() < INITIAL_SIZE) {
         return;
     }
 }}}

 * Can something like `MappedFile` be used as the type name instead of
   `BaseSegment`? The method calls on `base_sgmt_` seem to match a file
   naming.

 * A general observation about API doc comments, which applies to the
   rest of BIND 10 API doc too: differently written comments seem to be
   used to say the same thing about exceptions:

 {{{
     /// This version never throws.
 }}}

 and

 {{{
     /// \throw None
 }}}

 Documentation about exceptions that can be thrown:

 {{{
     ...

     /// In such a case it throws \c MemorySegmentError.  If it's thrown
 the
     /// segment is not usable anymore.
     ///
     /// This method cannot be called if the segment object is created in
 the
     /// read-only mode; in that case InvalidOperation will be thrown.
 }}}

 Shall we use `\throw` in all cases to be consistent? It would be easier
 when we get around to publishing such API documentation for the BIND 10
 libraries.

 * This is not very important, but I wanted to suggest using XOR instead
   of addition in the checksum computation. I checked about this to make
   sure before suggesting it, and it seems that for 2-bit errors,
   addition has a slightly better probability of generating a different
   checksum. So addition is good in this case.

 * A test should be added to verify that previous name/address
   associations continue to exist and are still usable after
   `shrinkToFit()` is called, and previously allocated data exists and is
   the same. For these names and data, I suggest using random sizes and
   random data, a list of which you can save in `vector`s for matching
   after the shrinking is performed.

 * For the above case, we should also test the case where the mapping
   changes (somehow force it to happen). In this case, we test that
   `getNamedAddress()` returns the updated address and the data at that
   address is consistent with what we expect.

 * Is it not possible to perform `shrink_to_fit()` when the `BaseSegment`
   object is still alive (and the mapped file is open)? The documentation
   doesn't seem to say anything for or against it. Same question applies
   to `grow()` used in `growSegment()` too.

 * Is it necessary to call `flush()` in `growSegment()`? If so, isn't it
   necessary to do this in `shrinkToFit()` also?

 * I don't see any fork tests for interprocess testing. Please add these,
   as all of `MemorySegmentMapped`'s methods and the allocated data ought
   to be tested to function properly from independent process address
   spaces with a shared map file. This is our use-case.

 * We should also have tests for write behavior.. i.e., are writes to
   memory reflected in the maps from other reader processes before the
   writer destroys its segment? Are they when the writer destroys its
   segment?

 * In `checkSegmentNamedAddress()`, why the `memcpy()` instead of just
   `static_cast`ing it to an `uint32_t*` and assigning it? Similarly in
   `MemorySegmentMappedTest.namedAddress` too.
 {{{
 diff --git a/src/lib/util/tests/memory_segment_common_unittest.cc
 b/src/lib/util/tests/memory_segment_common_unittest.cc
 index 45bef7b..612d38e 100644
 --- a/src/lib/util/tests/memory_segment_common_unittest.cc
 +++ b/src/lib/util/tests/memory_segment_common_unittest.cc
 @@ -30,21 +30,21 @@ checkSegmentNamedAddress(MemorySegment& segment, bool
 out_of_segment_ok) {

      // Now set it
      void* ptr32 = segment.allocate(sizeof(uint32_t));
 -    const uint32_t test_val = 42;
 -    std::memcpy(ptr32, &test_val, sizeof(test_val));
 +    uint32_t* test_val = static_cast<uint32_t*>(ptr32);
 +    *test_val = 42;
      EXPECT_FALSE(segment.setNamedAddress("test address", ptr32));

      // we can now get it; the stored value should be intact.
      EXPECT_EQ(ptr32, segment.getNamedAddress("test address"));
 -    EXPECT_EQ(test_val, *static_cast<const uint32_t*>(ptr32));
 +    EXPECT_EQ(42, *static_cast<const uint32_t*>(ptr32));

      // Override it.
      void* ptr16 = segment.allocate(sizeof(uint16_t));
 -    const uint16_t test_val16 = 4200;
 -    std::memcpy(ptr16, &test_val16, sizeof(test_val16));
 +    uint16_t* test_val16 = static_cast<uint16_t*>(ptr16);
 +    *test_val16 = 4200;
      EXPECT_FALSE(segment.setNamedAddress("test address", ptr16));
      EXPECT_EQ(ptr16, segment.getNamedAddress("test address"));
 -    EXPECT_EQ(test_val16, *static_cast<const uint16_t*>(ptr16));
 +    EXPECT_EQ(4200, *static_cast<const uint16_t*>(ptr16));

      // Clear it.  Then we won't be able to find it any more.
      EXPECT_TRUE(segment.clearNamedAddress("test address"));
 }}}

 * I don't understand this assertion in `growSegment()`:
 {{{
         const size_t new_size = prev_size * 2;
         assert(new_size != 0); // assume grow fails before size overflow
 }}}

 * Is this guaranteed by Boost?
 {{{
     // Another shrink shouldn't cause disruption, and the size shouldn't
 change
     segment_->shrinkToFit();
     EXPECT_EQ(shrinked_size, segment_->getSize());
 }}}

 * If `initial_size` is passed as 0 to the constructor, Boost (or
   something else under it) throws a library error. Also 0 passed to
   `allocate()` returns a non-NULL pointer. Not sure if we should bother
   to add lines to the API doc about these extreme cases, whether they
   are allowed or not, or assertions.

 * Though it is an advisory lock, it may make sense to additionally make
   use of `fcntl()` read and write locks to check for mutual exclusion in
   `MemorySegmentMapped`. This will make visible any programming mistakes
   quickly.

 * I have pushed a bunch of commits to the branch. Please also check
   them.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2831#comment:13>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development