BIND 10 #2850: Define and implement ZoneTableSegmentMapped
BIND 10 Development
do-not-reply at isc.org
Mon May 13 14:05:36 UTC 2013
#2850: Define and implement ZoneTableSegmentMapped
-------------------------------------+-------------------------------------
Reporter: jinmei | Owner:
Type: task | jinmei
Priority: medium | Status:
Component: data source | reviewing
Keywords: | Milestone:
Sensitive: 0 | Sprint-20130514
Sub-Project: DNS | Resolution:
Estimated Difficulty: 5 | CVSS Scoring:
Total Hours: 0 | Defect Severity: N/A
| Feature Depending on Ticket:
| shared memory data source
| Add Hours to Ticket: 0
| Internal?: 0
-------------------------------------+-------------------------------------
Changes (by muks):
* owner: muks => jinmei
Comment:
Hi Jinmei
Replying to [comment:17 jinmei]:
> Replying to [comment:13 muks]:
>
> > > - we need to handle the "default" case above. enums cannot
completely
> > > eliminate this case.
> >
> > I think of this everytime when writing `switch`, but I believe the
> > current approach is superior. This is because modern compilers warn if
> > you miss handling an enum value in a switch statement, which will
result
> > in a compile error in our project.
>
> That's not my concern. My concern is we need to be robust against
> buggy code like this:
> {{{#!cpp
>
ztseg_->reset(static_cast<ZoneTableSegment::MemorySegmentOpenMode>(100),
> param);
> }}}
>
> Not all "modern" compilers can detect the logical violation of the
> enum range (in fact, no compilers that I know of complain about it).
I don't understand why we would want to support something like this. A
lot of code in the tree will break when forcibly made like this.
In the interest of moving this bug forward, I'm just adding the
`default` clause.
> > > '''zone_writer.cc'''
> > >
> > > - I'd include zone_table_segment.h explicitly as long as the .cc
file
> > > directly refers to the definition.
> >
> > The `ZoneWriter` class interface in `zone_writer.h` requires
> > `ZoneTableSegment` to be known, and `zone_writer.cc` implements that
> > interface, so it ought to be known by then to the `.cc` code too.
(This
> > is why the code compiles currently.)
> >
> > There is the small possibility that `ZoneTableSegment` as known to the
> > class interface in `zone_writer.h` is an opaque type. But compile will
> > fail anyway if this is the case.
> >
> > I don't have a strong opinion as it doesn't cause problems. It will
just
> > be redundant. If you want this anyway, I'll add it. :)
>
> In general, I personally prefer making .cc not rely on details of
> indirect inclusion of header files so the .cc will be robust against
> future changes to header files. But I don't insist it strongly
> either.
I've added the include.
> Replying to [comment:16 muks]:
>
> > Replying to [comment:14 jinmei]:
>
> > > For this purpose, however, it doesn't have to be a checksum, so one
> > > other option is to just do a simple scan of the pages
> > > (add a new `MemorySegmentMapped::scan()` method instead of
> > > getCheckSum() and use it in `ZoneTableSegmentMapped::reset()`, maybe
> > > if specified so in params).
> >
> > In this case, are you thinking of removing the checksum functionality
> > when adding the `scan()` method?
>
> It would be independent from whether/what to do for checksum.
Separately, are you considering removing the checksum functionality?
> > If we sequential scan the mapped file in `scan()`, it may be useful to
> > use `posix_fadvise()` on the fd (if we have access to the fd).
>
> I'm not sure how effective it is in our usage, but in any case we
> don't have direct access to the FD; it's hidden inside the boost
> implementation.
On my Linux workstation, it doesn't seem to make much of a
difference. Maybe the default is sequential. But the POSIX function
exists for a reason. If we know we want to page in a large file into
core sequentially, we can give the kernel hints about it.
Anyway, if we don't have access to the descriptor, then there's nothing
to do. :)
> > > Like by pulling up the main part of reset() to the base class,
> > > delegating `openReadWrite()`, etc, to derived classes as a protected
> > > method? Hmm, possibly. I'm not necessarily opposed to that idea,
and
> > > I see the possibility that it will help future extensions, such as
> > > supporting other types of shared memory.
> >
> > Shall I implement this, or create another ticket for it? I think
another
> > ticket would be better for it as `ZoneTableSegmentLocal` tests can
also
> > reuse common tests, and this would involve a bit of refactoring.
>
> It's probably better to defer it to a separate ticket. It won't
> affect users of the zone table segment class, and, at least for now it
> doesn't provide real benefit much. But I'd add some comment about
> the future possibility somewhere in the base class documentation.
Nod. Such a comment has been added.
> > > > * A question about the design: why support `reset()`? Is it for
> > > > performance that we re-use a `ZoneTableSegment` instead of
> > > > `reset()`ing during construction and simply creating another
object in
> > > > case the underlying storage changes?
> > >
> > > I don't understand what you mean here. At least it has nothing to
do
> > > with "performance" in my mind. If you mean we don't need the
> > > `reset()` API, specifically what would you be suggesting?
> >
> > I was thinking that we could simply create a new `ZoneTableSegment`
with
> > `create()` taking a `ConstElementPtr` that would contain all the
details
> > (type of backend: local/mapped, the filename, etc.). If we need to
> > reset, we destroy the old one and create a new `ZoneTableSegment` with
> > `create()`.
>
> There can be several pros and cons for both approaches, but there's at
> least one thing that reset() could do better: provide the strong
> exception guarantee depending on the current state and given param of
> the segment.
>
> I'm not necessarily defending the approach with reset(), and if you
> have specific reasons for taking other approaches, I'm open to
> discussions. But if the difference is minor I'd move forward with the
> current design and implementation. Not many other classes would
> depend on these details, so if we find clearer reason for changing it,
> we should be able to do it without affecting others much.
Nod. At this point, even I just want to go ahead with what's in the
branch as it's gone way beyond its estimate. So we can we can consider
it if necessary later.
> > I missed talking about two more issues:
> >
> > * The `setNamedAddress()` interface could be problematic in practice
as
> > it may relocate addresses in the segment. When we `allocate()`, the
> > addresses are anonymous with no associated name and when we pass it
to
> > `setNamedAddress()` there is a possibility the memory could be
"lost"
> > with no pointer to it if the address is relocated. To workaround
this
> > issue, we reserve space first in `ZoneTableSegmentMapped` with
`NULL`,
> > but such code is error-prone and may not be so straightforward to do
> > in other use-cases. In general, a better `setNamedAddress()`
interface
> > will guarantee that the address being passed will not be lost. Such
> > things may seem trivial at this layer, but could cause problems when
> > we mis-use these interfaces in higher layers, and in this case, hard
> > to debug too. It's better to have a robust interface here.
> >
> > However, I don't have any good ideas on how to fix it. :)
>
> I agree it's error prone. One possibility of improving it would be to
> have setNamedAddress() return the latest real address (that may be
> re-addressed after setting the name):
>
> {{{#!cpp
> // the second element of the pair is the real address for the name.
> // normally it's identical to addr, but could be different if
> // setting the name requires the segment to grow.
> pair<bool, void*> setNamedAddress(const char* name, void* addr);
> }}}
>
> or, we might even pass `void**`:
>
> {{{#!cpp
> // associate *addrp and name. *addrp will be reset to the real
address of
> // the name, taking into account the possible readdressing.
> bool setNamedAddress(const char* name, void** addrp);
> }}}
>
> The latter is safer, while a bit inconvenient when we want to pass
> NULL.
I have implemented your latest suggestion for this, which is to reserve
an `offset_ptr` during `MemorySegmentMapped` construction (please see the
previous comment).
I have also removed some code that used to reserve space by passing a
`NULL` address.
> > * If there is a missing checksum or header (by corruption or
programming
> > mistake), `READ_WRITE` mode will not warn (it will think the file
was
> > freshly created). Is there a reason why we'd want to add `CREATE`
> > functionality to this mode too?
>
> Yes. This way the memmgr can always specify READ_WRITE, whether the
> file already exists or not (the latter is the case of the very first
> time). CREATE is normally unnecessary; it would be used only when
> READ_WRITE fails due to possible data corruption (memmgr could remove
> the file by itself and create it with READ_WRITE, though).
Hmm. Shall I check in `READ_WRITE` mode if the file exists before
proceeding to create a fresh checksum, header, etc.?
> Specific comments on the revised branch follow:
>
> '''zone_table_segment.h'''
>
> - general: there seems to be some confusion/inconsistency about
> terminology: what's "backing memory store"? What's the relationship
> between the "store" and "memory segments"? What does open in "open
> a zone table segment" mean? The confusion is understandable in some
> sense as we are completing this class gradually, and we originally
> didn't know how exactly the mapped version would look like. But now
> that we have both local and mapped, I think we should describe
> general overview of this abstraction, define and clarify these
> terminology, and explain any non trivial technical details.
These have been defined more clearly now.
> - `ResetFailedAndSegmentCleared`: this use of strong guarantee is
> awkward:
> {{{#!cpp
> /// existing backing memory store. When this exception is thrown, there
> /// is a strong guarantee that the previously existing backing memory
> /// store was cleared.
> }}}
> "strong guarantee" is a technical term, which usually means if the
> intended operation fails it (effectively) rolls back to the original
> state. See, e.g., http://en.wikipedia.org/wiki/Exception_safety
> So, "strong guarantee" and "previously existing store was cleared"
> cannot coexist.
I didn't know about these definitions of exception safety. The
documentation has been amended accordingly.
> - `ResetFailed`: I guess you meant "cleared" or something, instead of
> "unloaded":
> {{{#!cpp
> /// is still a strong guarantee that the previously existing backing
> /// memory store was not unloaded.
> }}}
This comment has also been updated.
> - `getHeader`: with this description it sounds like the caller needs
> to check if it throws. actually, we have to be able to ensure the
> condition this should succeed. And it reminded me of one missing
> task, which I just created: #2943. With that extension in mind,
> I'd revise the description as follows:
> {{{#!cpp
> /// \brief Return the ZoneTableHeader for the zone table segment.
> ///
> /// As long as isUsable() returns true, this method must always
succeed
> /// without throwing an exception. If isUsable() is \c false, a
derived
> /// class implementation can throw InvalidOperation depending on
> /// its implementation details. Applications are generally expected
to
> /// call this method only when isUsable() is true (either by making
sure
> /// explicitly or by some other indirect means).
> ///
> /// \todo Implement isUsable(): See Trac #2943
> virtual ZoneTableHeader& getHeader() = 0;
> }}}
> Same for the const version and getMemorySegment(), but use DRY, and
> let them refer to unified text.
Done.
> - `reset`: based on my impression that this doc doesn't use the word
> "strong guarantee" as a technical term, I'd clarify it using this
> term, i.e., we provide the strong guarantee for the first two
> failure cases, but not for the third (we might say we still provide
> basic guarantee, but it may not be worth mentioning).
These have been changed appropriately I think. Please check them.
> - `reset`: I'd explain the level of exception safety (guarantee) if
> the segment has never been in use and reset() fails. In this case
> I guess we can say it provides the strong guarantee.
This is also done.
> - `clear`: I don't understand this description:
> {{{#!cpp
> /// Implementations of this method should close any current memory
> /// store and reset the `ZoneTableSegment` to a freshly constructed
> /// state.
> }}}
> If the application needs to reset() it, it should basically simply
> call reset() because it internally clear() any existing segment.
> `clear` would be only useful when the application just wants to
> clear it.
Fixed.
> '''zone_table_segment_mapped.cc'''
>
> - processHeader: this could (though unlikely) result in
`MemorySegmentGrown`:
> {{{#!cpp
> ZoneTableHeader* new_header = new(ptr)
> ZoneTableHeader(ZoneTable::create(segment, rrclass_));
> }}}
> And, as noted previously, `ZoneTable::create` itself doesn't seem to
> be safe for this issue. So, while not entirely clean, I'm inclined
> to not try to recover from these cases and just throw a fatal
> exception such as `std::bad_alloc`.
Done.
> - I believe we can assert() `grew` is false here:
> {{{#!cpp
> const bool grew =
segment.setNamedAddress(ZONE_TABLE_HEADER_NAME,
> new_header);
> if (grew) {
> }}}
Changed accordingly.
> - reset: technically, there's a slight chance of failure (throw) here:
> {{{#!cpp
> current_filename_ = filename;
> }}}
> Mostly just being paranoia, but I'd use swap to be 100% safe:
> {{{#!cpp
> current_filename_.swap(filename);
> }}}
> (Unfortunately we need to give up making `filename` const though)
This was fixed differently so that the passed filename can still be
`const`.
> '''zone_table_segment_test.h'''
>
> - `ZoneTableSegmentTest`: the class name sounds confusing to me
> because it sounds like a test fixture class (in fact, there's a
> fixture class of the same name).
This (and also `MemorySegmentTest`) were changed to use `Mock` in the
name.
> '''tests/memory/zone_loader_util.cc'''
>
> It doesn't have to use scoped_ptr for `ZoneWriter` anymore.
Updated.
> '''zone_finder_context_unittest.cc'''
>
> - it doesn't have to use scoped_ptr for `ZoneWriter`.
Updated.
> '''zone_table_segment_mapped_unittest.cc'''
>
> - The constructor is not fully exception safe. We need, for example,
> to delay creating() the segment until the body of the constructor
> with help of something similar to `SegmentObjectHolder`.
I've changed it to use a `auto_ptr`.
Ideally, this should use something like a SegmentObjectHolder, but a
`SegmentObjectHolder` expects different arguments. In this limited
usecase, `ZoneTableSegment::destroy()` just calls its destructor, and
there are other tests like `ZoneWriterTest.*` which also used it with a
`scoped_ptr` (`ZoneWriterTest` constructor has been changed now to
directly construct a mock object).
> - fileExists: shouldn't we need to include `<cerrno>` for `errno`?
Included.
> - createData: this causes undefined behavior:
> {{{#!cpp
> for (int i = 0; i < 256; ++i) {
> string name("name");
> name += i;
> }}}
> at line 3 an integer is implicitly converted to char, and it's
> undefined for i > 127 if char is signed. besides, when i = 0
> it effectively does nothing (if we use it via c_str()), which I
> guess is not what you actually intended. while these names wouldn't
> have to be human readable, I guess it's better to do, e.g.:
> {{{#!cpp
> name += lexical_cast<string>(i);
> }}}
> Same for verifyData.
Eek! This has been fixed, but using `boost::format` instead of
`boost::lexical_cast`.
> - createData and verifyData: they rely on the fact that
> `UniformRandomIntegerGenerator` produces the same sequence of random
> numbers from the same seed. I don't see much advantage for the
> reliance while it has one clear disadvantage that the reader would
> have to wonder and check the behavior of the generator class. I'd
> rather suggest pre-creating the sequence of data and passing them to
> these functions in the form of a vector or something. Then it's
> super clear that these functions refer to identical data set without
> requiring any extra research or pondering. (and this would make the
> additional changes made to `UniformRandomIntegerGenerator` in this
> branch unnecessary)
This has been updated so that the data for the test is created in one
place. I have not removed my modifications to
`UniformRandomIntegerGenerator` as it may be useful to have a seed
argument in the RNG (and there were also some cleanups made).
> - corruptChecksum: what's the rationale of the magic number?
> {{{#!cpp
> size_t checksum = *static_cast<size_t*>(result.second);
> checksum ^= 0x55555555;
> *static_cast<size_t*>(result.second) = checksum;
> }}}
> If the purpose of this function is to just change the value of the
> original checksum, why not doing something much simpler, e.g, just
> by `+= 1`? In this case, the use of a non trivial magic number and
> the less common operator (xor) just seem to unnecessarily require
> more brain cycles to understand it.
I'm sure you figured it too, that 0x5 is 0b0101. It was just to have a
good operand to the xor operator. I initially started with `+= 1`, but
decided on corrupting it to a radically different value from the
expected value so that any printed test output isn't just off by 1.
Anyway, it now increments the checksum by 1 to corrupt it.
> - reset: this comment seems obsolete:
> {{{#!cpp
> // This must not throw now.
> EXPECT_TRUE(ztable_segment_->isWritable());
> }}}
Nod. Fixed.
> - clear: we need a test what if clear() is called before doing any
> reset() (btw what should happen?)
`clear()` before `reset()` is effectively a nop, as clearing returns it
to the freshly constructed state. A testcase has been added.
> - corruption tests don't seem to be comprehensive
> - in particular, it doesn't seem to confirm exception guarantee
> conditions (whether the original data are still valid in case of
> strong guarantee; whether the segment becomes unusable otherwise)
Tests for this have been added.
> - We might also check CREATE should succeed even if there is
> corrupted original segment.
Added.
> - what about processHeader() failure in case of READ_WRITE?
In case of `READ_WRITE`, if the header is missing, there's no way in the
current implementation for us to know if it's because the file was
created freshly, or because there was corruption.
Shall I add code to check if the mapped file exists in `reset()` before
a `MemorySegmentMapped` is constructed? This way we can know for sure if
a header must exist beforehand.
> - there may be others; so I also suggest you to run lcov/gcov to
> confirm all cases are covered in tests.
All cases are not covered, because it's not possible to cover many of
them (such as `ResetFailedAndSegmentCleared` cases where we cannot
hotwire the currently open mapped file).
Replying to [comment:19 jinmei]:
> While reviewing #2836 I noticed one related thing: memory remapping
> can happen in `ZoneWriter::load()` and `ZoneWriter::install()`,
> which could invalidate the zone table header address (return value
> of getHeader()).
>
> So, we'll need something like this:
>
> - add another virtual method to `ZoneTableSegment`:
> {{{#!cpp
> // update header_ by getting the address of ZONE_TABLE_HEADER_NAME
again.
> // (local version can be nop.)
> virtual void resetHeader() = 0;
> }}}
> - and update load() and install() as follows:
> {{{#!cpp
> void
> ZoneWriter::load() {
> if (state_ != ZW_UNUSED) {
> isc_throw(isc::InvalidOperation, "Trying to load twice");
> }
>
> zone_data_ = load_action_(segment_.getMemorySegment());
>
> if (!zone_data_) {
> // Bug inside load_action_.
> isc_throw(isc::InvalidOperation, "No data returned from load
action");
> }
>
> segment_.resetHeader();
>
> state_ = ZW_LOADED;
> }
>
> void
> ZoneWriter::install() {
> if (state_ != ZW_LOADED) {
> isc_throw(isc::InvalidOperation, "No data to install");
> }
>
> ZoneTable* table(segment_.getHeader().getTable());
> if (!table) {
> isc_throw(isc::Unexpected, "No zone table present");
> }
> const ZoneTable::AddResult result(table->addZone(
> segment_.getMemorySegment(),
> rrclass_, origin_,
zone_data_));
>
> state_ = ZW_INSTALLED;
> zone_data_ = result.zone_data;
> segment_.resetHeader();
> }
> }}}
>
> If you think these are too much for this task, I'm okay with deferring
> it to a separate ticket.
This has been implemented now. Initially I thought it was not possible
to test this as `ZoneTableSegment::create()` was used to create the zone
table segment, but it was possible to test it using a mock object.
Replying to [ticket:2943 jinmei]:
> [...] we need two small extensions to the `ZoneTableSegment`
> class:
>
> {{{#!cpp
> class ZoneTableSegment {
> public:
> /// ZoneTableSegmentLocal returns "local"; ZoneTableSegmentMapped
returns
> /// "mapped"
> virtual std::string getStringType() const = 0;
>
> /// ZoneTableSegmentLocal always returns true;
ZoneTableSegmentMapped
> /// returns true iff the memory segment has been set by reset().
> virtual bool isUsable() const = 0;
> };
> }}}
These have also been implemented in this ticket, as at least one of them
needed to be referred in documentation. `getStringType()` has been
renamed to `getImplType()`.
Also:
Can you revert `3c98ade56f07c44740f1bde2e37f8fdc9842bd18` and check why
`base_sgmt_` is empty in `~Impl()`? I don't understand possibly how
`base_sgmt_->flush()` can empty its enclosing smart pointer.. maybe it's
something else causing it.
--
Ticket URL: <http://bind10.isc.org/ticket/2850#comment:22>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list