SERVFAIL for some domains on some servers

Cihan Subasi (Garanti Teknoloji) CihanS at garanti.com.tr
Mon Mar 1 09:47:46 UTC 2010


We have a similar problem with a master running 9.5.1-P3 and 4 slaves running 9.5.2-P1. We occasionally get SERVFAIL on the slaves, and we work around it by restarting named. How did you resolve this issue, or have you resolved it? Thank you.



-----Original Message-----
From: bind-users-bounces+cihans=garanti.com.tr at lists.isc.org [mailto:bind-users-bounces+cihans=garanti.com.tr at lists.isc.org] On Behalf Of Oliver Henriot
Sent: Saturday, February 27, 2010 7:52 PM
To: bind-users at lists.isc.org
Subject: SERVFAIL for some domains on some servers

Dear list users,

Maybe you can help me out here. Please bear with me if I'm stating the obvious, but my computing skills are limited and I still have a lot to learn.

I have a series of name servers, some of which fail to resolve hosts in other domains whereas others don't have any problem.

My setup is as follows:
- server "1": master for my domain, recursion disabled for all except localhost. Setup is BIND 9.5.1-P2 on SunOS 5.9.
- servers "2", "3" and "4": slaves for my domain, recursion allowed for all, official resolvers for my clients, same configuration on all 3. Setup is BIND 9.3.6-P1 on CentOS 5.4.

Servers "2" and "4" fail to resolve domain x.fr, whereas "1" and "3" have no problem (when queried locally, in the case of "1"). The error I get is:


dig -t A @"2" www.x.fr

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> -t A @"2" www.x.fr
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 37397
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.x.fr.                IN      A

;; Query time: 4622 msec
;; SERVER: "2"#53("2")
;; WHEN: Sat Feb 27 18:20:07 2010
;; MSG SIZE  rcvd: 40


The behavior is the same for "4" and for any host in domain x.fr (and 
the domain itself).
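
To compare the four servers side by side, the status field of the header line in the dig output can be extracted from each reply. What follows is a minimal Python sketch and not part of the original report: the helper names `dig_status` and `query_status` are hypothetical, it assumes the `dig` binary is on the PATH, and it assumes the header line keeps the `status: ...` layout seen in the output above.

```python
import re
import subprocess

# Matches the header line dig prints, for example:
# ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 37397"
_STATUS_RE = re.compile(r"->>HEADER<<-.*?status:\s*([A-Z]+)")


def dig_status(dig_output):
    """Return the status (rcode) found in dig's text output, or None."""
    match = _STATUS_RE.search(dig_output)
    return match.group(1) if match else None


def query_status(server, name, rtype="A"):
    """Run dig against one server and return the status it reports.

    `server` is a placeholder for the address of server "1" to "4".
    """
    result = subprocess.run(
        ["dig", "-t", rtype, "@" + server, name],
        capture_output=True, text=True, timeout=30,
    )
    return dig_status(result.stdout)
```

Calling `query_status` once per server for the same name would give one status per server, which makes it easier to see whether the failures are limited to "2" and "4".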

It's not a network problem: I can telnet to port 53 of the name servers 
for domain x.fr from "2" (using the IP address, obviously, as the name 
can't be resolved by the server).
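
A telnet connection only exercises TCP, whereas a resolver normally sends its queries over UDP first, so a UDP check may be worth adding to the test above. The Python sketch below is an illustration only and not something from the original report: the function names are hypothetical, and it handles just enough of the DNS wire format to send one A query and read the response code.

```python
import socket
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, query_id=0x1234, recursion_desired=False):
    """Build a minimal DNS query for an A record (type 1, class IN)."""
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: id, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def rcode_of(response):
    """Decode the RCODE (low 4 bits of the flags word) of a DNS response."""
    if len(response) < 12:
        raise ValueError("DNS response shorter than the 12-byte header")
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, "RCODE%d" % code)


def udp_query(server_ip, name, timeout=5.0):
    """Send the query over UDP and return the response code as text."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server_ip, 53))
        response, _ = sock.recvfrom(512)
    return rcode_of(response)
```

Run from "2", a call such as `udp_query("192.0.2.53", "www.x.fr")`, with the placeholder address replaced by that of ns1.x.fr, would either return a response code or raise a timeout. A timeout while telnet succeeds would suggest that UDP traffic to port 53 is being filtered somewhere along the path.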

Also, reverse queries for hosts in domain x.fr from "2" do not fail.

Finally, and even stranger, if I use dig's +trace option, servers "2" and 
"4" no longer fail and can resolve www.x.fr (although the query 
lags quite a bit on the last step of the resolution, from x.fr to 
www.x.fr).

Here's the output:

dig www.x.fr @"2" +trace

; <<>> DiG 9.5.1-P3 <<>> www.x.fr @"2" +trace
;; global options:  printcmd
.                       518400  IN      NS      F.ROOT-SERVERS.NET.
.                       518400  IN      NS      G.ROOT-SERVERS.NET.
.                       518400  IN      NS      H.ROOT-SERVERS.NET.
.                       518400  IN      NS      I.ROOT-SERVERS.NET.
.                       518400  IN      NS      J.ROOT-SERVERS.NET.
.                       518400  IN      NS      K.ROOT-SERVERS.NET.
.                       518400  IN      NS      L.ROOT-SERVERS.NET.
.                       518400  IN      NS      M.ROOT-SERVERS.NET.
.                       518400  IN      NS      A.ROOT-SERVERS.NET.
.                       518400  IN      NS      B.ROOT-SERVERS.NET.
.                       518400  IN      NS      C.ROOT-SERVERS.NET.
.                       518400  IN      NS      D.ROOT-SERVERS.NET.
.                       518400  IN      NS      E.ROOT-SERVERS.NET.
;; Received 500 bytes from "2"#53("2") in 2 ms

fr.                     172800  IN      NS      E.EXT.NIC.fr.
fr.                     172800  IN      NS      B.EXT.NIC.fr.
fr.                     172800  IN      NS      F.EXT.NIC.fr.
fr.                     172800  IN      NS      A.NIC.fr.
fr.                     172800  IN      NS      C.NIC.fr.
fr.                     172800  IN      NS      G.EXT.NIC.fr.
fr.                     172800  IN      NS      D.NIC.fr.
fr.                     172800  IN      NS      D.EXT.NIC.fr.
;; Received 444 bytes from 192.58.128.30#53(J.ROOT-SERVERS.NET) in 44 ms

x.fr.     172800  IN      NS      ns1.x.fr.
x.fr.     172800  IN      NS      ns2.x.fr.
;; Received 108 bytes from 193.176.144.6#53(E.EXT.NIC.fr) in 33 ms

www.x.fr. 300     IN      A       xxx.xxx.xxx.xxx
x.fr.     300     IN      NS      ns2.x.fr.
x.fr.     300     IN      NS      ns1.x.fr.
;; Received 124 bytes from xxx.xxx.xxx.xxx#53(ns1.x.fr) in 0 ms


I'm at a loss as to what's going on (or wrong) here and what I can do 
to solve the problem. Any help would be greatly appreciated.

Thanks in advance.

Oliver






