BIND 10 #408: General control logic of NSASthe (was: General control logic of NSAS)
BIND 10 Development
do-not-reply at isc.org
Mon Dec 13 12:25:36 UTC 2010
#408: General control logic of NSASthe
---------------------------+------------------------------------------------
Reporter: vorner | Owner: vorner
Type: task | Status: reviewing
Priority: major | Milestone:
Component: recurser | Resolution:
Keywords: | Sensitive: 0
Estimatedhours: 0.0 | Hours: 0
Billable: 1 | Totalhours: 0
Internal: 0 |
---------------------------+------------------------------------------------
Changes (by stephen):
* owner: stephen => vorner
Comment:
>> Comment: the suggestion in the "to do" comment about doing part of the
lookup using a shareable lock
>> seems a good idea. But I would use an upgradeable lock, allowing the
conversion between a read lock
>> to an exclusive lock without releasing it. That would make the code
simpler: acquire read lock and
>> look up entry, if not found upgrade to exclusive lock and create entry.
> Reading documentation, it seems it would not work. It suggests only one
can have upgradable lock at the time, which is the same as exclusive in
our case.
Ah yes, you are right. Although there seem to be three states:
* shared ownership - allow other threads to shared ownership or upgrtade
ownership
* upgrade ownership - allow other threads to have shared ownsership
* exclusive ownership - do not allow any other form of ownership
So we could acquire shared ownership then, if we want to modify the table,
upgrade to exclusive ownership in two steps.
>> ''newNS()''
>> As this function is purely local to this module, suggest that it is
declared "static"
> It is in an anonymous namespace, which has equivalent effect to being
static. It is unreachable from other
> modules and compiler knows it is not called from outside, allowing it to
optimise more.
Yes, you are right, ignore my comment.
>> '''ZoneEntry::dispatchFailures()'''
>> Can't the arguments be passed by reference to save a copy operation?
> With family, there's no gain (it's an enum, which might be even smaller
than the reference/pointer). There
> might be with the shared pointer, but the function is private, so the
compiler can (and as it is small,
> probably will) inline it and eliminate all the stuff there. So I think
this does not really matter at all.
Probably not. But if in the future the code is extended and moved to a
.cc file, it will matter. The suggestion is just a bit of future-
proofing.
>> ''ZoneEntry::process()''
>> The header says that this function "processes the nameservers". What
processing is done?
> I hope this explanation is better.
It is - thanks.
>> Could we have a bit more documentation on the locks. Under what
circumstances should the
>> caller pass a lock and under what circumstances should the zone entry's
own lock be used?
>> Also, the header says that if an exception is generated the lock is not
released; what if
>> the lock is the internal one?
> The working of locks was probably too complicated. I found out that the
problem/complications
> may be solved by using a recursive lock, so I changed that one and
removed the parameter.
> So we will have less documentation on the locks O:-).
OK
>> Is there really a need for the process guard? If there are multiple
calls on the stack...
>>:
>> There is. It waits only for the nameservers already in cache, not for
the ones from external fetches. I
>> changed the comment explaining why it is done.
OK
>> Comment on removal of LRU from nameserver entries...
>> :
> Actually, I do not want a LRU list with nameservers at all. I want only
the hash table with weak pointers.
> The thing keeping them alive would be the !ZoneEntry holding the
nameserver.
>
> The with weak pointers there would delete them safely (shared_ptr
promisses that). The gotcha there is
> the weak pointer is left inside the hash table and throws an exception
upon access. So we would need to
> catch that exception when we access that hash slot and delete the dead
weak pointer (which points nowhere
> at that time).
Will an access always throw an exception? The pointer points to a block
of memory which could easily be allocated to another nameserver object.
> And the LRU list approach probably does not work, because an active zone
entry never touches the
> hash table again if the nameserver list stays the same. It keeps the
original pointers, avoiding lookups.
> Therefore we will drop a nameserver of the oldest zone entry, but that
one might still be active.
Hmm... on thinking about it, you are right. Even though a zone entry is
deleted, the associated nameserver entries only get deleted when they
reach the end of the LRU list. So if we have one zone active for a long
time and flood the table with short-lived zones (with different
nameservers), even though the short-lived zones might be explicitly
deleted, the presence of their nameservers would force those associated
with the long-lived zone out of the table.
>> '''src/lib/nsas/nameserver_entry.h'''
>> '''src/lib/nsas/nameserver_entry.cc'''
>> Is there any reason why the header commentsfor askIP() are delimited by
/*...*/? All other comments in
>> the .h file use the Doxygen "///" form.
> Well, yes, there is.
>
> For one, it is /** ... */ ...
The point was not what style of comments are used, it's just that there is
a mixture of styles in the file which looks messy. In general, I think
changes to a file should be in the same style as the existing code.
>> Is the five minutes mentioned in this method a reference to caching
server failures? If so, the five minutes
>> is defined in RFC 2308 (section 7).
> I didn't find it there. I think this is a slightly different problem ‒
the server itself might not be dead.
> I'm just unable to obtain its address (maybe because the nameserver that
knows it is down, or because no
> nameserver I know address of knows them or because my network cable is
suddenly eaten by a mouse...).
Is that really different? You can't reach the server, so as far as you
are concerned, it is down.
>> ''NameserverEntry::getAddresses()''
>> :
> I'm not sure I really understand this comment. What is the problem with
expiring while there's a fetch in
> progress?
Pointers might become invalid because something referred to by the fecth
expires (and is deleted) before the fetch returns.
>> (Code) In the switch (getState()) check, under what circumstances would
the state be IN_PROGRESS but we do
>> :
> I added a comment into the code. In short, yes, the entry might already
received (or failed to) the IP
> address of this family while still waiting for the other one. In that
case, we just provide them.
OK
>> ''Test ChangedNS''
>> The header comment suggests that this does three tests. Perhaps it
should be split?
> They test how the entry reacts to series of events, so, no, they need to
go one after another (and there's a
> part that needs to happen before this test, so it is tested together
with it as well).
OK
>> '''Documentation'''
>> It would be worth re-visiting the NSAS Design and check that the
description of the data structures and
>> processing match what is now in place.
> Yes, I want to do it some time around merging. But IMO keeping
documentation on wiki seems bad,
> because while the code has multiple versions (with possibly multiple
different ways how it does
> something), the wiki shows only the latest one.
Something for the F2F? Perhaps we should keep this overview documentation
with the code, perhaps in the form of an !OpenOffice and/or PDF file (to
allow diagrams to be more easily included)?
However, all points have been addressed, please go ahead and merge with
#356.
--
Ticket URL: <https://bind10.isc.org/ticket/408#comment:10>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list