Bind code question

Kevin Darcy kcd at daimlerchrysler.com
Fri Feb 16 00:29:49 UTC 2001


The only thing I can think of, given your description, is that the TTL on the
glue records is 0. Apart from the obvious case where no glue from the response
gets cached at all, that is the only other situation in which named's internal
nslookup() would call sysquery() to get the glue.
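
To make the TTL-0 case concrete, here is a minimal sketch in C of the decision
as I describe it above. It is not the BIND 8 source: struct glue_rec,
usable_glue() and needs_sysquery() are hypothetical names, and the rule that a
TTL of 0 makes a record unusable is taken from the description above rather
than from ns_forw.c.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cached glue (address) record. */
struct glue_rec {
    unsigned long ttl;   /* TTL as received in the response, in seconds */
};

/*
 * Count the cached glue records that could be used to forward a query.
 * Assumption: a record with TTL 0 is treated as unusable, just as if
 * it had never been cached.
 */
static size_t usable_glue(const struct glue_rec *recs, size_t n)
{
    size_t usable = 0;

    for (size_t i = 0; i < n; i++) {
        if (recs[i].ttl != 0)
            usable++;
    }
    return usable;
}

/*
 * Return 1 when no usable address is cached, which is the situation
 * where a separate address lookup (sysquery() in BIND 8 terms) is needed.
 */
static int needs_sysquery(const struct glue_rec *recs, size_t n)
{
    return usable_glue(recs, n) == 0;
}
```

In this model, a response whose glue all carries TTL 0 is indistinguishable
from a response with no glue at all: both leave nothing usable in the cache.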

Are you *absolutely* sure that the responses you're getting via "nslookup" are
*exactly* the same as what named itself would be getting? You could probably
verify this by cranking up your named debug level to 10 or more and configuring
the "packets" logging category to capture to a file (the "packets" category only
seems to generate useful information at high debug levels). That should show you
exactly what named is receiving in the responses. My guess is that your named is
in fact seeing glue-less responses.

If for some reason -- like missing glue from a response -- named *does* need to
call sysquery() to get those address records, then all bets are off. BIND 8
doesn't have "query restart" (see the infamous "too hard" comment on line 1427 of
ns_resp.c). It just fires off the sysquery() calls and drops the original query,
hoping that resolution can complete when the client retries.
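
The drop-and-retry behaviour can be illustrated with a toy model in C. This is
only a sketch of the control flow described above, not code from ns_resp.c;
struct toy_cache, handle_query() and sysquery_answered() are made-up names
used for illustration.

```c
#include <stdbool.h>

/* Hypothetical toy state for a single nameserver address. */
struct toy_cache {
    bool ns_addr_cached;     /* is the nameserver's address in the cache? */
    bool sysquery_pending;   /* has an address lookup been fired off? */
};

enum outcome { FORWARDED, DROPPED };

/* One attempt at handling a client query. */
static enum outcome handle_query(struct toy_cache *c)
{
    if (c->ns_addr_cached)
        return FORWARDED;        /* address known: the query can proceed */

    c->sysquery_pending = true;  /* fire off the address lookup ... */
    return DROPPED;              /* ... and abandon the original query */
}

/* The address lookup completes later and populates the cache. */
static void sysquery_answered(struct toy_cache *c)
{
    if (c->sysquery_pending) {
        c->ns_addr_cached = true;
        c->sysquery_pending = false;
    }
}
```

With no "query restart", the first handle_query() call returns DROPPED; only a
second call made after sysquery_answered() (that is, the client's retry)
returns FORWARDED.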


- Kevin


Joe Hooper wrote:

> I'm having difficulty understanding a couple of things in the BIND
> implementation (I'm working with 8.2.3).
>
> When a query with the response bit set comes into the function ns_resp(),
> I can follow it up until it calls nslookup() to grab the address for each
> NS in nsp (list of databuf structs). This occurs when the response doesn't
> contain the answer and the iterative process is about to start.
>
> Adding some additional debugging output, I've noticed that the return value
> for the nslookup() call in ns_resp() is 0 quite often, indicating no addresses
> could be found for any of the nameservers. This shuttles us to the timeout
> label and pulls the query off the allocated queries queue.
>
> nslookup() is calling sysquery(), which I understand if no glue records were
> returned with the response. As the comments in the code mention, a zero return
> value can indicate a sysquery() timeout or simply a response w/ no glue.
> However, in nearly all my test cases, a dump of the actual records returned by
> nslookup (the userland command) showed that the servers were indeed returning
> the necessary glue A records. Even so, this timeout condition still appears
> during normal named operation (I'm flushing the cache, etc., between all
> this).
>
> What's going on here? If the glue is being returned properly, is there some
> reason I'm overlooking why the code would still need to do a sysquery()? Is it
> indicative of some different error altogether? After a timeout, once the query
> gets popped off, does BIND just hope the response eventually comes back, gets
> cached, and is available when the user program requests the same data again?
>
> Many thanks in advance for any help....
>
> --
> ********************************
> Joe Hooper | hooper at isolated.org
> ********************************




