Strange problem with a query deleting a record...

Gordon A. Lang glang at goalex.com
Sat Aug 24 15:48:09 UTC 2013


Making some assumptions about where your dig queries are being sent, I would 
say it looks like Squid is simply failing its DNS lookup (for whatever 
reason), and the Squid system is then retaining a 5-minute negative cache 
entry.  If this is true, then the question becomes: why does the Squid system 
fail on that one lookup but (presumably) succeed on others?
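
As a rough illustration of where a 5-minute negative cache lifetime can come 
from: under RFC 2308, a negative answer is cached for the smaller of the SOA 
record's own TTL and its MINIMUM field.  The sketch below parses the SOA data 
from the NXDOMAIN response quoted further down.  The SOA's original TTL is not 
visible in that output (298 is the already-decremented cached value), so 
ASSUMED_SOA_TTL is a made-up number used only for illustration, and the 
function names are hypothetical.

```python
def parse_soa_rdata(rdata: str) -> dict:
    """Split SOA RDATA as dig prints it:
    mname rname serial refresh retry expire minimum."""
    mname, rname, serial, refresh, retry, expire, minimum = rdata.split()
    return {
        "mname": mname,
        "rname": rname,
        "serial": int(serial),
        "refresh": int(refresh),
        "retry": int(retry),
        "expire": int(expire),
        "minimum": int(minimum),
    }


def negative_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """RFC 2308: cache a negative answer for min(SOA TTL, SOA MINIMUM)."""
    return min(soa_ttl, soa_minimum)


# SOA data taken from the NXDOMAIN response in the quoted message.
soa = parse_soa_rdata(
    "ns1.thisdomain.com. serv.anotherdomain.com. "
    "2006062510 3600 3600 2592000 300"
)

# Assumption: the real TTL on the SOA record is unknown; 3600 is illustrative.
ASSUMED_SOA_TTL = 3600

print(negative_ttl(ASSUMED_SOA_TTL, soa["minimum"]))
```

If that reading is right, the 300 seconds comes from the last SOA field 
(MINIMUM) rather than from the retry field, which is 3600 here.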

--
Gordon A. Lang

--------------------------------------------------
From: "John E.P. Hynes" <johnh at primebuchholz.com>
Sent: Saturday, August 24, 2013 8:55 AM
To: "Barry Margolin" <barmar at alum.mit.edu>
Cc: <comp-protocols-dns-bind at isc.org>; <bind-users at isc.org>
Subject: Re: Strange problem with a query deleting a record...

> On 08/24/2013 12:46 AM, Barry Margolin wrote:
>> In article <mailman.1159.1377301811.20661.bind-users at lists.isc.org>,
>>   Mark Andrews <marka at isc.org> wrote:
>>
>>> In message <52177D81.8020206 at chrysler.com>, Kevin Darcy writes:
>>>> On 8/22/2013 12:55 PM, johnh at primebuchholz.com wrote:
>>>>> Greetings All,
>>>>>
>>>>> First of all, I apologize if this is out of place - I'm having a very
>>>>> strange issue that is either a problem with bind itself, or at least,
>>>>> affecting it.  Summary:
>>>>>
>>>>> For only ONE address, whenever I attempt to access it through my squid
>>>>> proxy, the record disappears from DNS, and the retry time changes too.
>>>>> Essentially, accessing www.thisdomain.com works, but a link to a portal
>>>>> on that page to the subdomain login.thisdomain.com causes the problem.
>>>>> I'm willing to bet the problem lies with squid, but as to how it could
>>>>> possibly change a record in bind... Well, I'm stumped.  If you don't go
>>>>> through squid, everything works.  All other requests to bind for the
>>>>> address of the host in question work fine.  Here's the output of dig
>>>>> from before accessing the page through squid:
>>>>>
>>>>> ; <<>> DiG 9.4.1-P1 <<>> login.thisdomain.com
>>>>> ;; global options:  printcmd
>>>>> ;; Got answer:
>>>>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45037
>>>>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0
>>>>>
>>>>> ;; QUESTION SECTION:
>>>>> ;login.thisdomain.com.            IN      A
>>>>>
>>>>> ;; ANSWER SECTION:
>>>>> login.thisdomain.com.     17      IN      A       111.222.333.123
>>>>>
>>>>> ;; AUTHORITY SECTION:
>>>>> thisdomain.com.         168319  IN      NS      ns1.thisdomain.com.
>>>>> thisdomain.com.         168319  IN      NS      ns2.thisdomain.com.
>>>>>
>>>>> ;; Query time: 0 msec
>>>>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>>> ;; WHEN: Thu Aug 22 12:29:57 2013
>>>>> ;; MSG SIZE  rcvd: 88
>>>>>
>>>>> You can do anything to request the address from bind and it works,
>>>>> *except* try to access it through squid.  Bypassing squid and going
>>>>> directly through the firewall works fine.
>>>>>
>>>>> Now, immediately after you try to access it through squid:
>>>>>
>>>>> ; <<>> DiG 9.4.1-P1 <<>> login.thisdomain.com
>>>>> ;; global options:  printcmd
>>>>> ;; Got answer:
>>>>> ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 43943
>>>>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
>>>>>
>>>>> ;; QUESTION SECTION:
>>>>> ;login.thisdomain.com.            IN      A
>>>>>
>>>>> ;; AUTHORITY SECTION:
>>>>> thisdomain.com.         298     IN      SOA     ns1.thisdomain.com. serv.anotherdomain.com. 2006062510 3600 3600 2592000 300
>>>>>
>>>>> ;; Query time: 0 msec
>>>>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>>> ;; WHEN: Thu Aug 22 12:30:06 2013
>>>>> ;; MSG SIZE  rcvd: 95
>>>>>
>>>>> After the 5-minute retry shown above expires, the original record
>>>>> reappears.
>>>>>
>>>>> Ideas?  I'm stumped.  It seems like squid is somehow able to corrupt
>>>>> bind's info, but I can't imagine how.
>>>> I have a theory. If this is a name that's hosted on a stupid
>>>> load-balancer, and that load-balancer doesn't understand non-A-record
>>>> query types, then if Squid is sending a non-A query type (e.g. SRV,
>>>> possibly even AAAA, if it's *really* stupid), then the load-balancer may
>>>> be erroneously "poisoning" your cache with an NXDOMAIN response.
>>>>
>>>> We ran into this many years ago with Cisco GSSes (Global Site Selectors)
>>>> and worked around it by having a "shadow" version of the zone, which the
>>>> GSSes proxy to for QTYPEs they don't handle. That "shadow" version of
>>>> the zone has a wildcard entry in it which forces responses to be NODATA
>>>> instead of NXDOMAIN, and this prevents the cache poisoning.
>>>>
>>>>                                                               - Kevin
>>> The load balancer should be able to correct for such misconfigurations
>>> by changing the rcode of the response from NXDOMAIN to NOERROR.  It
>>> knows what names it is answering for, so it can know that the NXDOMAIN
>>> is an erroneous response.
>> If I understand what Kevin was saying, the load balancer IS the DNS
>> server. If you ask it for the A record it's responsible for, it sends a
>> reasonable reply. If you ask it for some other record type for that
>> name, it sends NXDOMAIN instead of NOERROR.
>>
>> It's a design flaw in these load balancers.
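
The NXDOMAIN versus NOERROR distinction above can be read straight off dig's 
header lines.  A minimal sketch, assuming the header format shown in the dig 
output earlier in this thread; the function name classify_dig_header is 
hypothetical and not part of any tool:

```python
import re


def classify_dig_header(header: str, flags: str) -> str:
    """Classify a dig response from its HEADER and flags lines.

    NXDOMAIN: the name does not exist at all (any record type).
    NODATA:   NOERROR with zero answers; the name exists, but not this type.
    ANSWER:   NOERROR with at least one answer record.
    """
    status = re.search(r"status: (\w+)", header).group(1)
    answers = int(re.search(r"ANSWER: (\d+)", flags).group(1))
    if status == "NXDOMAIN":
        return "NXDOMAIN"
    if status == "NOERROR":
        return "NODATA" if answers == 0 else "ANSWER"
    return status  # e.g. SERVFAIL, REFUSED


# Header lines copied from the two dig runs quoted in this thread.
before = classify_dig_header(
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45037",
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0",
)
after = classify_dig_header(
    ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 43943",
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0",
)
print(before, after)
```

A well-behaved server answering for a name that has only an A record would 
give the NODATA shape (NOERROR, zero answers) for other query types.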
>>
>
> Thanks everyone who's been helping with this.
>
> In order to investigate this further, I did a tcpdump of both a "working" 
> conversation of a browser requesting the site, not going through the squid 
> proxy, and another of the "broken" conversation through the proxy.
>
> Result:  There is an NXDOMAIN response to a request for an AAAA record 
> that the proxy makes that is causing this.  The browser never asks for 
> anything but an A record, which succeeds.
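
To show why an NXDOMAIN for the AAAA query can take the A record down with it, 
here is a toy model of a resolver cache.  It is only a sketch of the RFC 2308 
rule that NXDOMAIN is cached per name while NODATA is cached per name and 
type.  It is not how BIND is implemented, it ignores TTLs, and the class and 
method names are invented for illustration.

```python
class ToyCache:
    """Toy resolver cache: NXDOMAIN is per name, NODATA is per (name, type)."""

    def __init__(self):
        self.records = {}      # (name, qtype) -> cached value
        self.nxdomain = set()  # names believed not to exist at all
        self.nodata = set()    # (name, qtype) pairs known to be empty

    def store_answer(self, name, qtype, value):
        self.records[(name, qtype)] = value

    def store_nxdomain(self, name):
        # "This name does not exist" contradicts every cached record for it,
        # so the toy model drops them all.
        self.nxdomain.add(name)
        for key in [k for k in self.records if k[0] == name]:
            del self.records[key]

    def store_nodata(self, name, qtype):
        # Only this one type is marked empty; other types are untouched.
        self.nodata.add((name, qtype))

    def lookup(self, name, qtype):
        if name in self.nxdomain:
            return "NXDOMAIN"
        if (name, qtype) in self.nodata:
            return "NODATA"
        return self.records.get((name, qtype), "MISS")


NAME = "login.thisdomain.com."
ADDR = "111.222.333.123"  # obfuscated address from the dig output

# Broken server: answers the AAAA query with NXDOMAIN.
broken = ToyCache()
broken.store_answer(NAME, "A", ADDR)
broken.store_nxdomain(NAME)
print(broken.lookup(NAME, "A"))

# Correct server: answers the AAAA query with NODATA.
correct = ToyCache()
correct.store_answer(NAME, "A", ADDR)
correct.store_nodata(NAME, "AAAA")
print(correct.lookup(NAME, "A"))
```

In the first case the A lookup comes back as NXDOMAIN until the negative entry 
expires; in the second the A record is still served.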
>
> I've contacted the site in question with this info, so hopefully it'll get 
> resolved.  I'll keep the list posted on any results or info for posterity.
>
> -John

