BIND 10 #2609: SERVFAIL on all queries while b10-loadzone is running

BIND 10 Development do-not-reply at isc.org
Mon Jan 7 19:34:52 UTC 2013


#2609: SERVFAIL on all queries while b10-loadzone is running
-------------------------------------+-------------------------------------
            Reporter:  vorner        |                        Owner:
                Type:  defect        |                       Status:  new
            Priority:  medium        |                    Milestone:  New
           Component:  Unclassified  |  Tasks
            Keywords:                |                   Resolution:
           Sensitive:  0             |                 CVSS Scoring:
         Sub-Project:  DNS           |              Defect Severity:  N/A
Estimated Difficulty:  0             |  Feature Depending on Ticket:
         Total Hours:  0             |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:1 jinmei]:
 > Does that happen with the old b10-loadzone, too?
 >
 > Not looked into the implementation, but I thought this scenario
 > shouldn't cause such disruption because it's not a write-write
 > conflict.

 Hmm, actually, SQLite3 doesn't even allow reads while there's a
 transaction in which some write operations have been performed:
 http://www.sqlite.org/faq.html#q5

 To solve this, I guess we need some tricky operations within
 sqlite_accessor:

 - if it's for replacing the entire zone, don't start a transaction,
   but assign a new zone ID in the "zones" table (preventing it from
   matching actual queries accidentally).
 - build the new zone using the new zone ID (without a transaction)
 - on completion, start a transaction and swap the old and new zone
   IDs; also update the zone ID in the diff table to the new one;
   commit the transaction.
 - remove records for the old version of the zone (if it's reasonably
   fast, in a transaction; otherwise do it without making a
   transaction)
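
 The steps above can be sketched roughly as follows.  This is only an
 illustration under stated assumptions, not the actual sqlite_accessor
 code: the three-table schema is a simplified, hypothetical stand-in for
 the real one, replace_zone() and the "!loading!" / "!retired!"
 placeholder names are made up for this example, and the "swap" is done
 by renaming rows in the "zones" table rather than by rewriting IDs.
 The connection is assumed to be in autocommit mode
 (isolation_level=None), so that only the swap runs in an explicit
 transaction.

```python
import sqlite3

# Hypothetical, simplified schema; the real BIND 10 SQLite3 schema differs.
SCHEMA = """
CREATE TABLE zones (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE records (zone_id INTEGER NOT NULL, name TEXT,
                      rdtype TEXT, rdata TEXT);
CREATE TABLE diffs (zone_id INTEGER NOT NULL, serial INTEGER);
"""


def replace_zone(conn, zone_name, new_records):
    """Replace every record of zone_name; only the swap uses a transaction.

    conn must be in autocommit mode (isolation_level=None).
    new_records is an iterable of (name, rdtype, rdata) tuples.
    Returns the zone ID of the new version.
    """
    old_id = conn.execute("SELECT id FROM zones WHERE name = ?",
                          (zone_name,)).fetchone()[0]

    # Step 1: allocate a new zone ID under a placeholder name that is
    # assumed never to match a real query.
    new_id = conn.execute("INSERT INTO zones (name) VALUES (?)",
                          ("!loading!" + zone_name,)).lastrowid

    # Step 2: build the new version without a long-lived transaction.
    try:
        for name, rdtype, rdata in new_records:
            conn.execute("INSERT INTO records (zone_id, name, rdtype, rdata)"
                         " VALUES (?, ?, ?, ?)",
                         (new_id, name, rdtype, rdata))
    except Exception:
        # No transaction to roll back, so undo the partial load by hand.
        conn.execute("DELETE FROM records WHERE zone_id = ?", (new_id,))
        conn.execute("DELETE FROM zones WHERE id = ?", (new_id,))
        raise

    # Step 3: short transaction that makes the new version visible and
    # repoints the diff table at it.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("UPDATE zones SET name = ? WHERE id = ?",
                     ("!retired!" + zone_name, old_id))
        conn.execute("UPDATE zones SET name = ? WHERE id = ?",
                     (zone_name, new_id))
        conn.execute("UPDATE diffs SET zone_id = ? WHERE zone_id = ?",
                     (new_id, old_id))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise

    # Step 4: remove the old version outside of any explicit transaction.
    conn.execute("DELETE FROM records WHERE zone_id = ?", (old_id,))
    conn.execute("DELETE FROM zones WHERE id = ?", (old_id,))
    return new_id
```

 Note that this sketch does nothing about a concurrent loadzone and
 xfrin on the same zone.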

 This is tricky in several respects: we now need to do the rollback
 operation ourselves.  Updating the diff table may also be tricky.  If
 loadzone and xfrin happen at the same time (though that's quite
 unlikely in practice), that would result in a mess.  Etc...

 So, a higher level alternative is to declare that the SQLite3 data
 source shouldn't be used for a huge zone.  In that case, we should add
 support for another data source (using a database that has more
 fine-grained locking) very soon, though.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2609#comment:2>
BIND 10 Development <http://bind10.isc.org>