BIND 10 #408: General control logic of NSASthe

Mon Dec 13 16:04:14 UTC 2010

#408: General control logic of NSASthe
---------------------------+------------------------------------------------
      Reporter:  vorner    |        Owner:  vorner   
          Type:  task      |       Status:  reviewing
      Priority:  major     |    Milestone:           
     Component:  recurser  |   Resolution:           
      Keywords:            |    Sensitive:  0        
Estimatedhours:  0.0       |        Hours:  0        
      Billable:  1         |   Totalhours:  0        
      Internal:  0         |  
---------------------------+------------------------------------------------

Comment(by vorner):

 Replying to [comment:10 stephen]:
 >  * shared ownership - allow other threads to have shared ownership or
 upgrade ownership
 >  * upgrade ownership - allow other threads to have shared ownership
 >  * exclusive ownership - do not allow any other form of ownership
 >
 > So we could acquire shared ownership then, if we want to modify the
 table, upgrade to exclusive ownership in two steps.

 It makes no sense to provide any way for a shared lock to upgrade to
 exclusive (either directly or through another state). It would lead to
 unpreventable deadlocks.

 If there are two threads holding a shared lock and both decide they want
 exclusive, neither of them can get it, because the other one has shared.
 But neither of them is willing to give up its shared one until it reaches
 exclusive and finishes the work.

 This is the reason upgrade ownership exists. It is like the shared one,
 but only one thread can hold it at a time, so that thread has a guarantee
 it can later decide to want the exclusive lock.
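
 To make the three ownership levels concrete, here is a minimal sketch of
 the rules as a non-blocking toy model. The class OwnershipModel and all
 its method names are made up for illustration; this is not how Boost
 implements its locks, only a way to check who may hold what at once.

```cpp
#include <cassert>
#include <mutex>

// Toy, non-blocking model of the three ownership levels (hypothetical names).
// Every try* method returns false where a real lock would have to wait.
class OwnershipModel {
public:
    // Shared ownership: allowed unless someone holds exclusive ownership.
    bool tryShared() {
        std::lock_guard<std::mutex> guard(state_);
        if (exclusive_) return false;
        ++shared_;
        return true;
    }
    void releaseShared() {
        std::lock_guard<std::mutex> guard(state_);
        --shared_;
    }
    // Upgrade ownership: coexists with shared owners, but only one at a time.
    bool tryUpgrade() {
        std::lock_guard<std::mutex> guard(state_);
        if (exclusive_ || upgrade_) return false;
        upgrade_ = true;
        return true;
    }
    void releaseUpgrade() {
        std::lock_guard<std::mutex> guard(state_);
        upgrade_ = false;
    }
    // Only the upgrade owner may ask for this; it succeeds once all shared
    // owners have left. There is no path from shared to exclusive.
    bool tryUpgradeToExclusive() {
        std::lock_guard<std::mutex> guard(state_);
        if (!upgrade_ || shared_ > 0) return false;
        upgrade_ = false;
        exclusive_ = true;
        return true;
    }
    // Plain exclusive ownership: no other owner of any kind may exist.
    bool tryExclusive() {
        std::lock_guard<std::mutex> guard(state_);
        if (exclusive_ || upgrade_ || shared_ > 0) return false;
        exclusive_ = true;
        return true;
    }
    void releaseExclusive() {
        std::lock_guard<std::mutex> guard(state_);
        exclusive_ = false;
    }
private:
    std::mutex state_;       // protects the counters below
    int shared_ = 0;
    bool upgrade_ = false;
    bool exclusive_ = false;
};
```

 With two shared owners present, tryExclusive() fails for everybody, which
 is the deadlock case above. The single upgrade owner just has to wait for
 the shared owners to leave and then tryUpgradeToExclusive() succeeds.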

 Anyway, I believe it is not a problem now, since we have many more hash
 cells than threads and the chance of two colliding is small. And if the
 thread only looks up the entry (does not create one), it holds the lock
 for a really short time. I think we can leave this be unless we measure
 that it is a bottleneck.

 > > The with weak pointers there would delete them safely (shared_ptr
 promises that). The gotcha there is
 > > the weak pointer is left inside the hash table and throws an exception
 upon access. So we would need to
 > > catch that exception when we access that hash slot and delete the dead
 weak pointer (which points nowhere
 > > at that time).
 >
 > Will an access always throw an exception?  The pointer points to a block
 of memory which could easily be allocated to another nameserver object.

 According to the documentation, it does.
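
 A minimal sketch of the cleanup described above, assuming the standard
 library std::shared_ptr/std::weak_ptr rather than the Boost ones (they
 behave the same way here). Entry, Table and lookup() are hypothetical
 stand-ins, not the real NSAS classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a nameserver entry; not the real NSAS class.
struct Entry {
    std::string name;
};

typedef std::unordered_map<std::string, std::weak_ptr<Entry> > Table;

// Look up an entry. A dead weak pointer found in the slot is erased.
std::shared_ptr<Entry> lookup(Table& table, const std::string& key) {
    Table::iterator it = table.find(key);
    if (it == table.end()) {
        return std::shared_ptr<Entry>();
    }
    try {
        // Constructing a shared_ptr from an expired weak_ptr throws
        // std::bad_weak_ptr; it never hands out recycled memory.
        return std::shared_ptr<Entry>(it->second);
    } catch (const std::bad_weak_ptr&) {
        table.erase(it);  // drop the slot that now points nowhere
        return std::shared_ptr<Entry>();
    }
}
```

 If the exception is not wanted, weak_ptr::lock() returns an empty
 shared_ptr in the same situation, so the slot can be checked and erased
 without a try block.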

 From what I partly guess from the documentation and from how the
 shared_ptr looks in a debugger, it holds two pointers: one to the real
 data and one to a small piece of memory where it keeps the reference
 counts. So it can hold two counts ‒ one of the shared_ptrs and one of the
 total weak_ptrs+shared_ptrs. If the shared count goes to zero, the real
 object is destroyed, but the small piece of memory is not. So the weak_ptr
 looks there, finds that the shared_ptr count is zero, concludes the object
 was deleted, and throws. The piece of memory is destroyed when the last
 weak_ptr stops pointing to it.

 It is probably a little bit simplified, but it shows it can be done
 easily. The fancy part will be making it thread-safe and fast, but for
 modelling the behaviour this is probably enough.
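
 The two counts can be modelled with a few lines. This is only a toy,
 single-threaded sketch of the guess above; CountModel and its methods are
 invented names and say nothing about the real shared_ptr internals.

```cpp
#include <cassert>

// Toy model of the two reference counts (hypothetical, single-threaded).
class CountModel {
public:
    // Starts as if one shared_ptr had just been created for a new object.
    CountModel() : shared_(1), total_(1) {}

    void addShared() { ++shared_; ++total_; }
    void addWeak() { ++total_; }
    void dropShared() { --shared_; --total_; }
    void dropWeak() { --total_; }

    // The real object lives while at least one shared owner exists.
    bool objectAlive() const { return shared_ > 0; }
    // The small count block lives while anything still points to it.
    bool blockAlive() const { return total_ > 0; }
    // A weak owner checks the shared count before handing out the object.
    bool weakAccessThrows() const { return shared_ == 0; }

private:
    int shared_;  // number of shared owners
    int total_;   // shared owners plus weak owners
};
```

 After the last shared owner is dropped while a weak one remains, the
 object is gone, the count block is still there, and weak access reports
 that it would throw.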

 > >> Is the five minutes mentioned in this method a reference to caching
 server failures? If so, the five minutes
 > >> is defined in RFC 2308 (section 7).
 > > I didn't find it there. I think this is a slightly different problem ‒
 the server itself might not be dead.
 > > I'm just unable to obtain its address (maybe because the nameserver
 that knows it is down, or because no
 > > nameserver I know address of knows them or because my network cable is
 suddenly eaten by a mouse...).
 > Is that really different?  You can't reach the server, so as far as you
 are concerned, it is down.

 Well, it is. A dead server is one where I know its address and it doesn't
 answer. This is a case where I do not know whom to ask, so there's nobody
 I could call dead. The difference is also whose fault it is. I think I
 should leave the FIXME there.

 > >> ''NameserverEntry::getAddresses()''
 > >> :
 > > I'm not sure I really understand this comment. What is the problem
 with expiring while there's a fetch in
 > > progress?
 > Pointers might become invalid because something referred to by the fetch
 expires (and is deleted) before the fetch returns.

 Ah, right. Unless I have a bug in the code somewhere, this can't happen.
 The only pointers that exist there are to NameserverEntry and ZoneEntry.
 And the callbacks (from NameserverEntry or resolver) contain shared
 pointers, so the entries survive until the answer is provided and the
 callback is destroyed.
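
 As an illustration of why this is safe, here is a small sketch, assuming
 std::shared_ptr and a std::function callback in place of the real
 callback classes. ZoneEntry here is a hypothetical stand-in and
 makeCallback() is an invented helper.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in; not the real NSAS ZoneEntry.
struct ZoneEntry {
    std::string name;
};

// Build a callback that owns a shared pointer to the entry, so the entry
// cannot be deleted while the callback still exists.
std::function<std::string()> makeCallback(
    const std::shared_ptr<ZoneEntry>& entry) {
    return [entry]() { return entry->name; };
}
```

 Even if every other owner lets go of the entry while the fetch is in
 progress, the copy held inside the callback keeps it alive. The entry is
 destroyed only once the callback itself is destroyed.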

 > >> '''Documentation'''
 > >> It would be worth re-visiting the NSAS Design and check that the
 description of the data structures and
 > >> processing match what is now in place.
 > > Yes, I want to do it some time around merging. But IMO keeping
 documentation on wiki seems bad,
 > > because while the code has multiple versions (with possibly multiple
 different ways how it does
 > > something), the wiki shows only the latest one.
 > Something for the F2F?  Perhaps we should keep this overview
 documentation with the code, perhaps in the form of an !OpenOffice and/or
 PDF file (to allow diagrams to be more easily included)?

 I was thinking about doxygen. We already have API documentation there and
 it allows creating separate pages, including images, etc. So having a docs
 subdirectory alongside the tests one might make sense: we would have
 everything under one system, we could see diffs (it is hard with a PDF)
 and we could generate the documentation for the whole thing.

 But you are right, it might make sense to bring this up at the F2F. I'm
 going to write a mail about it.

 > However, all points have been addressed, please go ahead and merge with
 #356.

 I have merged it. I'll close the ticket when I update the wiki
 documentation and it is merged to trunk.

-- 
Ticket URL: <http://bind10.isc.org/ticket/408#comment:11>
BIND 10 Development <http://bind10.isc.org>