BIND 10 #2850: Define and implement ZoneTableSegmentMapped
BIND 10 Development
do-not-reply at isc.org
Wed May 8 00:16:27 UTC 2013
#2850: Define and implement ZoneTableSegmentMapped
-------------------------------------+-------------------------------------
Reporter: jinmei | Owner:
Type: task | jinmei
Priority: medium | Status:
Component: data source | reviewing
Keywords: | Milestone:
Sensitive: 0 | Sprint-20130514
Sub-Project: DNS | Resolution:
Estimated Difficulty: 5 | CVSS Scoring:
Total Hours: 0 | Defect Severity: N/A
| Feature Depending on Ticket:
| shared memory data source
| Add Hours to Ticket: 0
| Internal?: 0
-------------------------------------+-------------------------------------
Comment (by jinmei):
Replying to [comment:13 muks]:
> > - we need to handle the "default" case above. enums cannot completely
> > eliminate this case.
>
> I think of this everytime when writing `switch`, but I believe the
> current approach is superior. This is because modern compilers warn if
> you miss handling an enum value in a switch statement, which will result
> in a compile error in our project.
That's not my concern. My concern is we need to be robust against
buggy code like this:
{{{#!cpp
ztseg_->reset(static_cast<ZoneTableSegment::MemorySegmentOpenMode>(100),
param);
}}}
Not all "modern" compilers can detect the logical violation of the
enum range (in fact, no compilers that I know of complain about it).
> > '''zone_writer.cc'''
> >
> > - I'd include zone_table_segment.h explicitly as long as the .cc file
> > directly refers to the definition.
>
> The `ZoneWriter` class interface in `zone_writer.h` requires
> `ZoneTableSegment` to be known, and `zone_writer.cc` implements that
> interface, so it ought to be known by then to the `.cc` code too. (This
> is why the code compiles currently.)
>
> There is the small possibility that `ZoneTableSegment` as known to the
> class interface in `zone_writer.h` is an opaque type. But compile will
> fail anyway if this is the case.
>
> I don't have a strong opinion as it doesn't cause problems. It will just
> be redundant. If you want this anyway, I'll add it. :)
In general, I personally prefer making .cc not rely on details of
indirect inclusion of header files so the .cc will be robust against
future changes to header files. But I don't insist it strongly
either.
Replying to [comment:16 muks]:
> Replying to [comment:14 jinmei]:
> > For this purpose, however, it doesn't have to be a checksum, so one
> > other option is to just do a simple scan of the pages
> > (add a new `MemorySegmentMapped::scan()` method instead of
> > getCheckSum() and use it in `ZoneTableSegmentMapped::reset()`, maybe
> > if specified so in params).
>
> In this case, are you thinking of removing the checksum functionality
> when adding the `scan()` method?
It would be independent from whether/what to do for checksum.
> If we sequential scan the mapped file in `scan()`, it may be useful to
> use `posix_fadvise()` on the fd (if we have access to the fd).
I'm not sure how effective it is in our usage, but in any case we
don't have direct access to the FD; it's hidden inside the boost
implementation.
> > Like by pulling up the main part of reset() to the base class,
> > delegating `openReadWrite()`, etc, to derived classes as a protected
> > method? Hmm, possibly. I'm not necessarily opposed to that idea, and
> > I see the possibility that it will help future extensions, such as
> > supporting other types of shared memory.
>
> Shall I implement this, or create another ticket for it? I think another
> ticket would be better for it as `ZoneTableSegmentLocal` tests can also
> reuse common tests, and this would involve a bit of refactoring.
It's probably better to defer it to a separate ticket. It won't
affect users of the zone table segment class, and, at least for now it
doesn't provide real benefit much. But I'd add some comment about
the future possibility somewhere in the base class documentation.
> > > * A question about the design: why support `reset()`? Is it for
> > > performance that we re-use a `ZoneTableSegment` instead of
> > > `reset()`ing during construction and simply creating another
object in
> > > case the underlying storage changes?
> >
> > I don't understand what you mean here. At least it has nothing to do
> > with "performance" in my mind. If you mean we don't need the
> > `reset()` API, specifically what would you be suggesting?
>
> I was thinking that we could simply create a new `ZoneTableSegment` with
> `create()` taking a `ConstElementPtr` that would contain all the details
> (type of backend: local/mapped, the filename, etc.). If we need to
> reset, we destroy the old one and create a new `ZoneTableSegment` with
> `create()`.
There can be several pros and cons for both approaches, but there's at
least one thing that reset() could do better: provide the strong
exception guarantee depending on the current state and given param of
the segment.
I'm not necessarily defending the approach with reset(), and if you
have specific reasons for taking other approaches, I'm open to
discussions. But if the difference is minor I'd move forward with the
current design and implementation. Not many other classes would
depend on these details, so if we find clearer reason for changing it,
we should be able to do it without affecting others much.
> I missed talking about two more issues:
>
> * The `setNamedAddress()` interface could be problematic in practice as
> it may relocate addresses in the segment. When we `allocate()`, the
> addresses are anonymous with no associated name and when we pass it to
> `setNamedAddress()` there is a possibility the memory could be "lost"
> with no pointer to it if the address is relocated. To workaround this
> issue, we reserve space first in `ZoneTableSegmentMapped` with `NULL`,
> but such code is error-prone and may not be so straightforward to do
> in other use-cases. In general, a better `setNamedAddress()` interface
> will guarantee that the address being passed will not be lost. Such
> things may seem trivial at this layer, but could cause problems when
> we mis-use these interfaces in higher layers, and in this case, hard
> to debug too. It's better to have a robust interface here.
>
> However, I don't have any good ideas on how to fix it. :)
I agree it's error prone. One possibility of improving it would be to
have setNamedAddress() return the latest real address (that may be
re-addressed after setting the name):
{{{#!cpp
// the second element of the pair is the real address for the name.
// normally it's identical to addr, but could be different if
// setting the name requires the segment to grow.
pair<bool, void*> setNamedAddress(const char* name, void* addr);
}}}
or, we might even pass `void**`:
{{{#!cpp
// associate *addrp and name. *addrp will be reset to the real
address of
// the name, taking into account the possible readdressing.
bool setNamedAddress(const char* name, void** addrp);
}}}
The latter is safer, while a bit inconvenient when we want to pass
NULL.
> * If there is a missing checksum or header (by corruption or programming
> mistake), `READ_WRITE` mode will not warn (it will think the file was
> freshly created). Is there a reason why we'd want to add `CREATE`
> functionality to this mode too?
Yes. This way the memmgr can always specify READ_WRITE, whether the
file already exists or not (the latter is the case of the very first
time). CREATE is normally unnecessary; it would be used only when
READ_WRITE fails due to possible data corruption (memmgr could remove
the file by itself and create it with READ_WRITE, though).
Specific comments on the revised branch follow:
'''zone_table_segment.h'''
- general: there seems to be some confusion/inconsistency about
terminology: what's "backing memory store"? What's the relationship
between the "store" and "memory segments"? What does open in "open
a zone table segment" mean? The confusion is understandable in some
sense as we are completing this class gradually, and we originally
didn't know how exactly the mapped version would look like. But now
that we have both local and mapped, I think we should describe
general overview of this abstraction, define and clarify these
terminology, and explain any non trivial technical details.
- `ResetFailedAndSegmentCleared`: this use of strong guarantee is
awkward:
{{{#!cpp
/// existing backing memory store. When this exception is thrown, there
/// is a strong guarantee that the previously existing backing memory
/// store was cleared.
}}}
"strong guarantee" is a technical term, which usually means if the
intended operation fails it (effectively) rolls back to the original
state. See, e.g., http://en.wikipedia.org/wiki/Exception_safety
So, "strong guarantee" and "previously existing store was cleared"
cannot coexist.
- `ResetFailed`: I guess you meant "cleared" or something, instead of
"unloaded":
{{{#!cpp
/// is still a strong guarantee that the previously existing backing
/// memory store was not unloaded.
}}}
- `getHeader`: with this description it sounds like the caller needs
to check if it throws. actually, we have to be able to ensure the
condition this should succeed. And it reminded me of one missing
task, which I just created: #2943. With that extension in mind,
I'd revise the description as follows:
{{{#!cpp
/// \brief Return the ZoneTableHeader for the zone table segment.
///
/// As long as isUsable() returns true, this method must always
succeed
/// without throwing an exception. If isUsable() is \c false, a
derived
/// class implementation can throw InvalidOperation depending on
/// its implementation details. Applications are generally expected
to
/// call this method only when isUsable() is true (either by making
sure
/// explicitly or by some other indirect means).
///
/// \todo Implement isUsable(): See Trac #2943
virtual ZoneTableHeader& getHeader() = 0;
}}}
Same for the const version and getMemorySegment(), but use DRY, and
let them refer to unified text.
- `reset`: based on my impression that this doc doesn't use the word
"strong guarantee" as a technical term, I'd clarify it using this
term, i.e., we provide the strong guarantee for the first two
failure cases, but not for the third (we might say we still provide
basic guarantee, but it may not be worth mentioning).
- `reset`: I'd explain the level of exception safety (guarantee) if
the segment has never been in use and reset() fails. In this case
I guess we can say it provides the strong guarantee.
- `clear`: I don't understand this description:
{{{#!cpp
/// Implementations of this method should close any current memory
/// store and reset the `ZoneTableSegment` to a freshly constructed
/// state.
}}}
If the application needs to reset() it, it should basically simply
call reset() because it internally clear() any existing segment.
`clear` would be only useful when the application just wants to
clear it.
'''zone_table_segment_mapped.cc'''
- processHeader: this could (though unlikely) result in
`MemorySegmentGrown`:
{{{#!cpp
ZoneTableHeader* new_header = new(ptr)
ZoneTableHeader(ZoneTable::create(segment, rrclass_));
}}}
And, as noted previously, `ZoneTable::create` itself doesn't seem to
be safe for this issue. So, while not entirely clean, I'm inclined
to not try to recover from these cases and just throw a fatal
exception such as `std::bad_alloc`.
- I believe we can assert() `grew` is false here:
{{{#!cpp
const bool grew = segment.setNamedAddress(ZONE_TABLE_HEADER_NAME,
new_header);
if (grew) {
}}}
- reset: technically, there's a slight chance of failure (throw) here:
{{{#!cpp
current_filename_ = filename;
}}}
Mostly just being paranoia, but I'd use swap to be 100% safe:
{{{#!cpp
current_filename_.swap(filename);
}}}
(Unfortunately we need to give up making `filename` const though)
'''zone_table_segment_test.h'''
- `ZoneTableSegmentTest`: the class name sounds confusing to me
because it sounds like a test fixture class (in fact, there's a
fixture class of the same name).
'''tests/memory/zone_loader_util.cc'''
It doesn't have to use scoped_ptr for `ZoneWriter` anymore.
'''zone_finder_context_unittest.cc'''
- it doesn't have to use scoped_ptr for `ZoneWriter`.
'''zone_table_segment_mapped_unittest.cc'''
- The constructor is not fully exception safe. We need, for example,
to delay creating() the segment until the body of the constructor
with help of something similar to `SegmentObjectHolder`.
- fileExists: shouldn't we need to include `<cerrno>` for `errno`?
- createData: this causes undefined behavior:
{{{#!cpp
for (int i = 0; i < 256; ++i) {
string name("name");
name += i;
}}}
at line 3 an integer is implicitly converted to char, and it's
undefined for i > 127 if char is signed. besides, when i = 0
it effectively does nothing (if we use it via c_str()), which I
guess is not what you actually intended. while these names wouldn't
have to be human readable, I guess it's better to do, e.g.:
{{{#!cpp
name += lexical_cast<string>(i);
}}}
Same for verifyData.
- createData and verifyData: they rely on the fact that
`UniformRandomIntegerGenerator` produces the same sequence of random
numbers from the same seed. I don't see much advantage for the
reliance while it has one clear disadvantage that the reader would
have to wonder and check the behavior of the generator class. I'd
rather suggest pre-creating the sequence of data and passing them to
these functions in the form of a vector or something. Then it's
super clear that these functions refer to identical data set without
requiring any extra research or pondering. (and this would make the
additional changes made to `UniformRandomIntegerGenerator` in this
branch unnecessary)
- corruptChecksum: what's the rationale of the magic number?
{{{#!cpp
size_t checksum = *static_cast<size_t*>(result.second);
checksum ^= 0x55555555;
*static_cast<size_t*>(result.second) = checksum;
}}}
If the purpose of this function is to just change the value of the
original checksum, why not doing something much simpler, e.g, just
by `+= 1`? In this case, the use of a non trivial magic number and
the less common operator (xor) just seem to unnecessarily require
more brain cycles to understand it.
- reset: this comment seems obsolete:
{{{#!cpp
// This must not throw now.
EXPECT_TRUE(ztable_segment_->isWritable());
}}}
- clear: we need a test what if clear() is called before doing any
reset() (btw what should happen?)
- corruption tests don't seem to be comprehensive
- in particular, it doesn't seem to confirm exception guarantee
conditions (whether the original data are still valid in case of
strong guarantee; whether the segment becomes unusable otherwise)
- We might also check CREATE should succeed even if there is
corrupted original segment.
- what about processHeader() failure in case of READ_WRITE?
- there may be others; so I also suggest you to run lcov/gcov to
confirm all cases are covered in tests.
--
Ticket URL: <http://bind10.isc.org/ticket/2850#comment:17>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list