BIND 10 #2831: define and implement MemorySegmentMapped

Wed Apr 10 21:41:38 UTC 2013

#2831: define and implement MemorySegmentMapped
-------------------------------------+-------------------------------------
            Reporter:  jinmei        |                        Owner:
                Type:  task          |  jinmei
            Priority:  medium        |                       Status:
           Component:  data source   |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130423
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  5             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |  shared memory data source
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:13 muks]:

 > Apologies for the delayed review. Many factors took away time.

 No problem (this hasn't been a critically blocking task), thanks for
 the careful read and detailed review.

 Before answering the comments: I realized Boost managed_mapped_file
 could cause various build time issues: depending on the combination of
 Boost version, type of compiler (and maybe its version), and the
 system, it may be `-Werror` unfriendly or in the worst case (Solaris +
 !SunStudio, as usual) doesn't seem to compile at all.  So far, the
 only critical failure seemed to be the case for sunstudio, but I had
 to give up making it compile after trying several things.  Since the
 shared memory support is a kind of optional feature for limited large
 scale environments, my suggest at this moment is to introduce a
 ./configure knob so we can disable building it completely.

 I updated confiugre.ac and our m4 macro for boost to make it possible,
 with other portability tricks such as auto-detecting and suppressing
 `-Werror` regression.

 These changes are independent from other changes of the branch, so if
 it's more convenient for you, you can review it separately.  The
 changes are currently committed at the end of the branch, from e01ee17
 to ba0d613 (inclusive).

 > * Let's look at the following API doc comment of `setNamedAddress()`:
 [...]
 > It seems like this restriction can be enforced in the API itself to
 > prevent programming mistakes, by having a single static method that
 > allocates a memory buffer with an associated name. Deallocating can also
 > be done similarly by passing the name string without requiring the
 > (perhaps relocated) addresses.

 Unfortunately, this wouldn't always be the case in one of intended use
 cases (for our scheduled shmem data source work - see #2836).  I've
 added a note on this in the API doc.

 > * The API doc comment of `allocate()`:
 > {{{
 >     /// Depending on the implementation details, it may have to grow the
 >     /// internal memory segment (again, in an implementation dependent
 way)
 >     /// to allocate the required size of memory.  In that case the
 >     /// implementation must grow the internal segment sufficiently so
 the
 >     /// next call to allocate() for the same size will succeed, and
 throw
 >     /// a \c MemorySegmentGrown exception (not really allocating the
 memory
 >     /// yet).
 >     ///
 >     /// An application that uses this memory segment abstraction to
 allocate
 >     /// memory should expect this exception, and should normally catch
 it
 >     /// at an appropriate layer (which may be immediately after a call
 to
 >     /// \c allocate() or a bit higher layer).  It should interpret the
 >     /// exception as any raw address that belongs to the segment may
 have
 >     /// been remapped and must be re-fetched via an already established
 >     /// named address using the \c getNamedAddress() method.
 >     ///
 > }}}
 >
 > Because `MemorySegmentGrown` is thrown and no memory is allocated
 > according to the API doc, should the caller call `allocate()` again to
 > allocate the memory?

 Yes.

 > I am trying to follow the design behind why `MemorySegmentGrown` is an
 > exception, and not a value returned as the method's return value, or in
 > a method argument. It is not an exceptional situation that should be
 > allowed to unwind the stack all the way upwards, even if by mistake.
 > `allocate()` can return `NULL` and a `bool` in an argument this case,
 > which the caller can use and pass on even to its caller.

 Fair question.  I tried to clarify that in doc.  The specific
 intended usage is, again, #2836.  I admit the interface and usage
 is quite tricky, though.  If there's a better way to handle
 intermediate segment extension in the context of #2836, I'm happy to
 adopt it instead of the currently implemented approach.

 > It also seems inconsistent that `setNamedAddress()` doesn't throw if it
 > allocates memory, which may also cause addresses previously returned by
 > `allocate()` to be relocated (similar to when `allocate()` throws
 > `MemorySegmentGrown`).

 Ah, good catch.  I was actually not aware of the inconsistency, but
 the difference was intentional: `setNamedAddress()` is generally
 expected to be independently, not as part of a set of allocations.
 I clarified it in the doc (10d88f0).

 > * The `setNamedAddress()` API doc does not describe what happens if
 >   `NULL` is passed in the `addr` argument. If `NULL` will clear the
 >   association, then `clearNamedAddress()` would be redundant (we should
 >   not have to support two different ways of clearing the
 >   association). If indeed `NULL` can be associated with a name, it
 >   should be documented (with what a sample use-case may be for storing
 >   `NULL`).

 Hmm, I thought it was at least clear NULL is allowed, but I tried to
 explain more details about what it would mean.

 > * In the API doc of `MemorySegmentMapped()`, is it assumed that if
 >   `create` argument is true, then any existing file will be overwritten
 >   (truncated)?  If this is not the case, can the `create` argument be
 >   removed, such that this read+write version of `MemorySegmentMapped()`
 >   auto-creates the file if it doesn't exist, or opens it if it does
 >   exist?

 Not a direct answer to these questions, but I realized we'd also need
 a "create only" mode for our later work, so I revised the interface
 quite substantially.  Please check the new `OpenMode` enum, revised
 constructor, and their documentation.

 > * Is there a reason why you use 0 instead of `NULL` in a lot of places,
 >   including unittests? (They are equivalent, but `NULL` is a gentle
 >   reminder that it's a pointer).
 > {{{
 > void*
 > MemorySegmentLocal::getNamedAddress(const char* name) {
 >     std::map<std::string, void*>::iterator found =
 named_addrs_.find(name);
 >     if (found != named_addrs_.end()) {
 >         return (found->second);
 >     }
 >     return (0);
 > }
 > }}}

 Please see http://bind10.isc.org/ticket/2833#comment:16 (search for
 "NULL") and the followup discussion.  I've reverted to NULL for this
 branch, too.

 > * I think this public variable's definition should go into a `.cc` file:
 > {{{
 >     /// \brief The default value of the mapped file size when newly
 created.
 >     ///
 >     /// Its value, 32KB, is an arbitrary choice, but considered to be
 >     /// sufficiently but not too large.
 >     static const size_t INITIAL_SIZE = 32768;
 > }}}

 Good catch, done.

 > * Can the following comment be removed from
 > `MemorySegmentMapped::setNamedAddress()`? It seems weird reading it by
 > itself (it is as if by default the base `MemorySegment` interface
 > doesn't expect `true` to be returned):
 > {{{
 >     /// This version can actually return true.
 >     ///
 > }}}

 I can do it, but I thought it's useful to add note specific
 differences of specific implementations (of course the caller
 shouldn't assume the derived-class specific behavior when it uses the
 abstract interface).  I updated the documentation in that sense
 at b8d905c.  But if you think it's still rather confusing, I'm okay
 with removing it.

 > * `getNamedAddress()` has different behavior in the different
 > implementations with `name=NULL`.  If we don't want to support
 > `name=NULL` , it should be documented in the
 > `MemorySegment::getNamedAddress()` base class interface.
 [...]
 > * Same in `clearNamedAddress()` too.

 Okay.  In general, I'd prefer explicitly rejecting invalid (or
 unexpected) parameters, so I revised the implementation and
 documentation that way.

 And also for `setNamedAddress`.

 > * The exception message is incorrect:
 > {{{
 > void
 > MemorySegmentMapped::deallocate(void* ptr, size_t) {
 >     if (impl_->read_only_) {
 >         isc_throw(InvalidOperation, "allocate attempt on read-only
 segment");
 >     }
 >
 > }}}

 Good catch, corrected.

 > * What are the conditions under which this assertion is triggered? i.e.,
 >   how did you arrive at this `INITIAL_SIZE` in the condition
 >   expression?
 > {{{
 >     // It appears an assertion failure is triggered within Boost if the
 size
 >     // is too small.  To work this around we'll make it no-op if the
 size is
 >     // already reasonably small.
 >     if (getSize() < INITIAL_SIZE) {
 >         return;
 >     }
 > }}}

 I found it through our unit tests (if I disabled the above 'if', one
 of the unit tests triggered the assertion failure).  I tried to
 clarify the issue in more detail as a code comment.

 > * Can something like `MappedFile` be used as the type name instead of
 >   `BaseSegment`? The method calls on `base_sgmt_` seem to match a file
 >   naming.

 I don't understand the second sentence...anyway, I'm okay with
 renaming it, but boost::basic_managed_mapped_file actually comes from
 the "Managed Memory Segments" abstraction
 (see
 http://www.boost.org/doc/libs/1_53_0/doc/html/interprocess/managed_memory_segments.html),
 rather than file abstraction, so I prefer using the term "memory"
 and/or "segment".

 If `BaseSegment` sounds confusing to you, I propose, e.g.,
 `BoostSegment` and `boost_sgmt_`.  At the moment I didn't change the
 code.

 > * A general observation about API doc comments, which applies to the
 >   rest of BIND 10 API doc too: differently written comments seem to be
 >   used to say the same thing about exceptions:
 >
 > {{{
 >     /// This version never throws.
 > }}}
 >
 > and
 >
 > {{{
 >     /// \throw None
 > }}}
 >
 > Documentation about exceptions that can be thrown:
 [...]
 > Shall we use `\throw` in all cases to be consistent? It would be easier
 > when we get around to publishing such API documentation for the BIND 10
 > libraries.

 As a general matter I tend to agree (at least it's better to be
 consistent, and use of `\throw` would be the best choice), but
 regarding this one:
 {{{
     /// This version never throws.
 }}}
 it's rather implementation note than part of API spec (Basically, as
 API the base class description should be referenced).

 For this branch, I've added some more `\throw`s where they seem
 appropriate.  I also changed some of `InvalidOperation` cases to
 `MemorySegmentError` as it should be basically generic enough at the
 base class level.

 > * This is not very important, but I wanted to suggest using XOR instead
 >   of addition in the checksum computation. I checked about this to make
 >   sure before suggesting it, and it seems that for 2-bit errors,
 >   addition has a slightly better probability of generating a different
 >   checksum. So addition is good in this case.

 This checksum is basically quite poor anyway in terms of integrity
 check, so the difference in that sense is subtle anyway.  Nevertheless
 I'm okay to switch to any other algorithm as long as it's lightweight,
 but as it seems there's no reason to change it for now, I've kept it.

 > * A test should be added to verify that previous name/address
 >   associations continue to exist and are still usable after
 >   `shrinkToFit()` is called, and previously allocated data exists and is
 >   the same. For these names and data, I suggest using random sizes and
 >   random data, a list of which you can save in `vector`s for matching
 >   after the shrinking is performed.

 Okay, added these cases to the `namedAddress` test.

 > * For the above case, we should also test the case where the mapping
 >   changes (somehow force it to happen). In this case, we test that
 >   `getNamedAddress()` returns the updated address and the data at that
 >   address is consistent with what we expect.

 Also added it to the `namedAddress` test.

 > * Is it not possible to perform `shrink_to_fit()` when the `BaseSegment`
 >   object is still alive (and the mapped file is open)? The documentation
 >   doesn't seem to say anything for or against it. Same question applies
 >   to `grow()` used in `growSegment()` too.

 Do you mean whether the underlying Boost library allows that (and
 whether we really need reset `BaseSegment` before grow/shrink)?  If
 so, I thought Boost doesn't allow it, but, hmm, according to
 http://www.boost.org/doc/libs/1_53_0/doc/html/boost/interprocess/basic_managed_mapped_file.html
 it may not so be clear.  It could depend on what reading/writing
 means.  But from an additional quick experiment, if we keep
 `BaseSegment` and perform grow(), a subsequently call to allocate()
 caused crash, so I guess we cannot really assume it works that way.
 I suspect that's the same for shrink.

 ...but are we on the same page, or are you asking something different?

 > * Is it necessary to call `flush()` in `growSegment()`? If so, isn't it
 >   necessary to do this in `shrinkToFit()` also?

 Ah, no, it shouldn't be necessary any more because the destructor
 calls flush().  I guess grow had it before I extended the destructor.
 Now removed.

 > * I don't see any fork tests for interprocess testing. Please add these,
 >   as all of `MemorySegmentMapped`'s methods and the allocated data ought
 >   to be tested to function properly from independent process address
 >   spaces with a shared map file. This is our use-case.

 Okay, added.

 > * We should also have tests for write behavior.. i.e., are writes to
 >   memory reflected in the maps from other reader processes before the
 >   writer destroys its segment? Are they when the writer destroys its
 >   segment?

 These don't work, or more accurately, we cannot guarantee that.  Our
 readers open the mapped memory in read-only mode, which internally
 means they use it as a private memory space.  In this case it's not
 even guaranteed that modifications to the original file is ever
 reflected to reader's memory (different OSes showed different
 behaviors in my previous experiments).  We basically don't allow
 mixture of writer and readers (and that's the responsibility of the
 implementor and operator).

 > * In `checkSegmentNamedAddress()`, why the `memcpy()` instead of just
 >   `static_cast`ing it to an `uint32_t*` and assigning it? Similarly in
 >   `MemorySegmentMappedTest.namedAddress` too.

 hmm, I don't remember why I used memcpy, but at least I guess I wanted
 to avoid the patch you showed in that it repeats the hardcoded
 constants.  Still, however, it doesn't mean it should memcpy.  I
 revised the code using static_cast.

 > * I don't understand this assertion in `growSegment()`:
 > {{{
 >         const size_t new_size = prev_size * 2;
 >       assert(new_size != 0); // assume grow fails before size overflow
 > }}}

 I've revised a comment to clarify the intent at commit da33bc4.  Does
 that answer your question?

 > * Is this guaranteed by Boost?
 > {{{
 >     // Another shrink shouldn't cause disruption, and the size shouldn't
 change
 >     segment_->shrinkToFit();
 >     EXPECT_EQ(shrinked_size, segment_->getSize());
 > }}}

 Boost doesn't guarantee it, but our wrapper (basically) does.  See the
 discussion of "perform `shrink_to_fit()` with `BaseSegment` still
 alive" above.

 > * If `initial_size` is passed as 0 to the constructor, Boost (or
 >   something else under it) throws a library error. Also 0 passed to
 >   `allocate()` returns a non-NULL pointer. Not sure if we should bother
 >   to add lines to the API doc about these extreme cases, whether they
 >   are allowed or not, or assertions.

 As for the initial_size, Boost rejects it if it's too small because it
 internally needs to allocate some management data structures.  But its
 size is not public information (or even the existence of such
 restriction is not publicly described), so we cannot predict the
 exact threshold.  it should be safe to assume 0 will be always
 rejected, so I added a test case for it and added notes about it in
 the documentation.

 Regarding allocate() and deallocate(), I think it's okay to say
 nothing.  After all, the user shouldn't use the return value anyway,
 and as long as deallocate() works for any pointer value returned by
 allocate(), it's not really special.

 > * Though it is an advisory lock, it may make sense to additionally make
 >   use of `fcntl()` read and write locks to check for mutual exclusion in
 >   `MemorySegmentMapped`. This will make visible any programming mistakes
 >   quickly.

 Do you mean acquiring a lock when opening a segment in read-write
 mode (so there should be no more than one writer at the same time)?

 > * I have pushed a bunch of commits to the branch. Please also check
 >   them.

 These generally look good, thanks.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2831#comment:14>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development