BIND 10 #851: b10-auth (+ hotspot cache) can crash in handling query when DB is busy

BIND 10 Development do-not-reply at isc.org
Thu Apr 21 09:49:46 UTC 2011


#851: b10-auth (+ hotspot cache) can crash in handling query when DB is busy
------------------------------------------+-------------------------------------
                   Reporter:  jinmei      |                 Owner:  jinmei
                       Type:  defect      |                Status:  reviewing
                   Priority:  critical    |             Milestone:  Sprint-20110503
                  Component:  data source |            Resolution:
                   Keywords:              |             Sensitive:  0
            Defect Severity:  N/A         |           Sub-Project:  DNS
Feature Depending on Ticket:              |  Estimated Difficulty:  0.0
        Add Hours to Ticket:  0           |           Total Hours:  0
                  Internal?:  0           |
------------------------------------------+-------------------------------------
Changes (by vorner):

 * owner:  vorner => jinmei


Comment:

 Good morning

 I haven't read the code in depth, but as I understand it, it worked like
 this before the change:
  * Someone asked a query, the answer got cached.
  * Some time passes.
  * The DB becomes busy (by xfrin maybe).
  * Someone asks the same query.
  * The cache says it's valid, so we continue processing.
  * We get part of the answer from cache.
  * We ask the DB something more (e.g. the `zoneinfo.getEnclosingZone()` on
 line 947). It fails because the DB is busy, and returns NULL. We don't
 expect NULL here, so we crash.
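
 The last step above is the actual crash: a lookup result is used without
 checking it for NULL. A minimal sketch of the defensive check follows. All
 names here (`Zone`, `FakeDataSource`, `answerQuery`, `Rcode`) are
 hypothetical stand-ins for illustration, not the real b10-auth classes, and
 answering with a failure code is only one possible policy.

```cpp
#include <string>

// Hypothetical stand-ins; these are NOT the real BIND 10 classes.
struct Zone {
    std::string name;
};

struct FakeDataSource {
    bool busy = false;
    Zone zone{"example.org."};

    // Returns nullptr while the backend is busy, mimicking the failure
    // mode described in the list above.
    const Zone* getEnclosingZone() const {
        return busy ? nullptr : &zone;
    }
};

enum class Rcode { Ok, ServFail };

// Check the lookup result instead of dereferencing it blindly.
Rcode answerQuery(const FakeDataSource& ds, std::string& zone_name_out) {
    const Zone* zone = ds.getEnclosingZone();
    if (zone == nullptr) {
        // DB busy: fail this one query rather than crash the server.
        return Rcode::ServFail;
    }
    zone_name_out = zone->name;
    return Rcode::Ok;
}
```

 With `busy` set, `answerQuery` returns `Rcode::ServFail` and leaves the
 output string untouched; otherwise it fills in the zone name.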

 What I think can happen now is this:
  * Someone asked a query, the answer got cached.
  * Some time passes.
  * Someone asks the same query.
  * We first check with the DB (in `doQueryTask`) whether the cached answer
 is still valid. The DB confirms.
  * We get part of the answer from cache.
  * The DB becomes busy right now (by xfrin). It's a tiny time window, but
 it can happen, because we didn't lock the DB in a transaction.
  * We ask the DB something more (again, the `zoneinfo.getEnclosingZone()`
 on line 947), it fails, crash.
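
 The sequence above is a check-then-use race: the validity check and the
 later lookup are two separate DB accesses, so a successful check says
 nothing about the second call. A small deterministic simulation of that
 window, using a made-up `FakeDb` that turns busy after a fixed number of
 calls (all names are hypothetical, chosen only for this sketch):

```cpp
// Hypothetical fake backend: it serves `calls_until_busy` requests and
// then reports busy, which lets us place the "xfrin starts" moment
// exactly between two calls.
struct FakeDb {
    int calls_until_busy;

    bool available() { return calls_until_busy-- > 0; }

    // First DB access: is the cached answer still valid?
    bool cacheStillValid() { return available(); }

    // Second DB access: returns nullptr when the DB is busy.
    const char* getEnclosingZone() {
        return available() ? "example.org." : nullptr;
    }
};

enum class Outcome { Answered, CacheNotConfirmed, LookupFailed };

Outcome handleQuery(FakeDb& db) {
    if (!db.cacheStillValid()) {
        return Outcome::CacheNotConfirmed;
    }
    // <-- the race window: the DB may become busy right here
    const char* zone = db.getEnclosingZone();
    if (zone == nullptr) {
        return Outcome::LookupFailed;  // would be the crash if unchecked
    }
    return Outcome::Answered;
}
```

 A `FakeDb{1}` passes the validity check and then fails the second lookup,
 which is exactly the window described above; `FakeDb{2}` answers normally.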

 Does that make sense?

 Or, is that kept somewhere between the calls? Should I read the code more
 thoroughly?

 Anyway, asking the DB several times without having a transaction is IMO
 wrong, because we can get inconsistent information (someone can modify the
 DB between our queries to it).
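
 To illustrate the point about consistency, here is an in-memory analogy
 only, not the real data source code: if every read that belongs together
 happens under one lock (the in-process counterpart of one DB transaction),
 a writer cannot slip in between the reads. `ZoneStore` and its members are
 hypothetical names invented for this sketch.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Hypothetical in-memory store; the mutex plays the role a transaction
// would play for a real database.
class ZoneStore {
public:
    // Writer: updates both fields as one unit.
    void update(int serial, const std::string& data) {
        std::lock_guard<std::mutex> lock(mutex_);
        serial_ = serial;
        data_ = data;
    }

    // Reader: both fields are read under a single lock, so the returned
    // pair always comes from the same update.
    std::pair<int, std::string> readSnapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::make_pair(serial_, data_);
    }

private:
    mutable std::mutex mutex_;
    int serial_ = 0;
    std::string data_;
};
```

 Reading `serial_` and `data_` through two separately locked getters would
 reintroduce the problem, since an `update` could run between them.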

 With regards

-- 
Ticket URL: <http://bind10.isc.org/ticket/851#comment:9>
BIND 10 Development <http://bind10.isc.org>