named daemon hangs

Mark Andrews Mark_Andrews at isc.org
Sat May 2 22:10:29 UTC 2009


	This is a bug in the kernel: it does not honour the socket's
	non-blocking mode and blocks instead.
	Go complain to your OS vendor.

	Mark
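
For readers unfamiliar with the contract being described: on a socket in
non-blocking mode, a send is expected either to succeed at once or to fail
immediately with EAGAIN/EWOULDBLOCK, never to sleep in the kernel. The log
quoted below, where internal_send only returns after Ctrl+C interrupts it,
is consistent with that contract being broken. A minimal Python sketch of
the expected behaviour follows. It is an illustration only, not BIND's code
(named is written in C); the helper name try_send and the loopback receiver
are assumptions made for the example.

```python
import socket
import time


def try_send(sock, payload, addr):
    """Attempt one datagram send on a non-blocking socket.

    Returns True if the kernel accepted the datagram, False if it
    reported that the send would block. In both cases the call is
    expected to return promptly instead of sleeping in the kernel.
    """
    try:
        sock.sendto(payload, addr)
        return True
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: the caller should retry later.
        return False


if __name__ == "__main__":
    # Hypothetical stand-in for the remote name server: a local UDP socket.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.setblocking(False)  # same intent as setting O_NONBLOCK in C

    start = time.monotonic()
    accepted = try_send(sender, b"probe", receiver.getsockname())
    elapsed = time.monotonic() - start
    print(f"accepted={accepted} elapsed={elapsed:.3f}s")

    sender.close()
    receiver.close()
```

On a kernel that honours non-blocking mode, the elapsed time should be
negligible whether or not the datagram was accepted. A send that stalls
until a signal arrives, as in the quoted log, points at the kernel or a
driver in the send path and not at the application.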

In message <38a4524a0905020806s4c939382n80c1c3da656c8182 at mail.gmail.com>, Nelson Vale writes:
> Hi all,
> 
> 
> I've been facing a problem in my private network that I have not been
> able to fix yet.
> 
> On my gateway (a Debian-like Linux) I have BIND 9.5 installed and running,
> and I have one IPSec tunnel to another gateway over the internet. BIND is
> also configured with a forward zone whose name server is the other
> gateway's internal address (accessible through the IPSec tunnel only).
> 
> Recently the other IPSec endpoint was shut down and, of course, my queries
> to the forward domain started failing. Nothing strange here...
> 
> The real problem is that I was suddenly unable to resolve any other DNS
> names, like www.google.com, from inside my network:
> 
> "host www.google.com
> ;; connection timed out; no servers could be reached"
> 
> I took a look at the named daemon and I see that it does not respond to
> anything as long as the IPSec tunnel is down, but only if it's the other
> endpoint that is down. I've tried stopping my endpoint, and the problem
> does not occur then, provided I restart named. I think this happens
> because as long as my endpoint is up the routes to the other endpoint are
> set, and named tries to query the forward zone's name server. The problem
> is that the queries do not time out and named hangs there.
> 
> The configuration I have is:
> 
> Bind: BIND 9.5.0-P2
> IP Address (private): 192.168.9.254
> Forwarders: ADSL provider (2 forwarders)
> Forward Zone: mylan.loc
> Name Server: 192.168.90.254
> 
> 
> After it starts, if I try to query one of the forward zone's records
> (box.mylan.loc), it displays:
> 
> "...
> 02-May-2009 14:22:21.843 socket 0xb7bd5548: dispatch_recv:  event 0xb7be3d28
> -> task 0xb7b74d18
> 02-May-2009 14:22:21.844 socket 0xb7bd5548: internal_recv: task 0xb7b74d18
> got event 0xb7bd559c
> 02-May-2009 14:22:21.844 socket 0xb7bd5548 192.168.9.2#47869: packet
> received correctly
> 02-May-2009 14:22:21.844 socket 0xb7bd5548: processing cmsg 0xb7bb2120
> 02-May-2009 14:22:21.844 client 192.168.9.2#47869: UDP request
> 02-May-2009 14:22:21.844 client 192.168.9.2#47869: using view '_default'
> 02-May-2009 14:22:21.845 client 192.168.9.2#47869: request is not signed
> 02-May-2009 14:22:21.845 client 192.168.9.2#47869: recursion available
> 02-May-2009 14:22:21.845 client 192.168.9.2#47869: query
> 02-May-2009 14:22:21.845 client 192.168.9.2#47869: ns_client_attach: ref = 1
> 02-May-2009 14:22:21.845 client 192.168.9.2#47869: query (cache)
> 'box.mylan.loc/A/IN' approved
> 02-May-2009 14:22:21.845 client 192.168.9.2#47869: replace
> 02-May-2009 14:22:21.845 clientmgr @0xb7baa608: createclients
> 02-May-2009 14:22:21.846 clientmgr @0xb7baa608: recycle
> 02-May-2009 14:22:21.846 createfetch: box.mylan.loc A
> 02-May-2009 14:22:21.846 fctx 0xb7bae408(box.mylan.loc/A'): create
> 02-May-2009 14:22:21.846 fctx 0xb7bae408(box.mylan.loc/A'): join
> 02-May-2009 14:22:21.846 fetch 0xb7bb4148 (fctx
> 0xb7bae408(box.mylan.loc/A)): created
> 02-May-2009 14:22:21.846 client @0xb7bda008: udprecv
> 02-May-2009 14:22:21.846 socket 0xb7bd5548: socket_recv: event 0xb7bd4b48 ->
> task 0xb7bb1690
> 02-May-2009 14:22:21.847 fctx 0xb7bae408(box.mylan.loc/A'): start
> 02-May-2009 14:22:21.847 fctx 0xb7bae408(box.mylan.loc/A'): try
> 02-May-2009 14:22:21.847 fctx 0xb7bae408(box.mylan.loc/A'): cancelqueries
> 02-May-2009 14:22:21.847 fctx 0xb7bae408(box.mylan.loc/A'): getaddresses
> 02-May-2009 14:22:21.847 findaddrinfo: new entry 0xb7aec4a0
> 02-May-2009 14:22:21.847 fctx 0xb7bae408(box.mylan.loc/A'): query
> 02-May-2009 14:22:21.848 socket 0xb7b79938: created
> 02-May-2009 14:22:21.848 socket 0xb7b79938 0.0.0.0#43841: bound
> 02-May-2009 14:22:21.848 dispatchmgr 0xb7bbb168: created UDP dispatcher
> 0xb7b6d378
> 02-May-2009 14:22:21.848 dispatch 0xb7b6d378: created task 0xb7b74d70
> 02-May-2009 14:22:21.848 dispatch 0xb7b6d378: created socket 0xb7b79938
> 02-May-2009 14:22:21.848 resquery 0xb7b80008 (fctx
> 0xb7bae408(box.mylan.loc/A)): send
> 02-May-2009 14:22:21.849 dispatch 0xb7b6d378 response 0xb7ba7848
> 192.168.90.254#53: attached to task 0xb7b6f2c8
> 02-May-2009 14:22:21.849 socket 0xb7b79938: socket_recv: event 0xb7b81698 ->
> task 0xb7b74d70
> 
> 
> and it hangs here forever. Even if I restart the named server it does not
> respond to any of my queries. If I stop the named server with Ctrl + C it
> displays:
> 
> "...
> ^C02-May-2009 14:23:46.773 socket.c:1226: unexpected error:
> 02-May-2009 14:23:46.773 internal_send: 192.168.90.254#53: Interrupted
> system call should be restarted
> 02-May-2009 14:23:46.774 errno2result.c:111: unexpected error:
> 02-May-2009 14:23:46.774 unable to convert errno to isc_result: 85:
> Interrupted system call should be restarted
> 02-May-2009 14:23:46.774 resquery 0xb7b80008 (fctx
> 0xb7bae408(box.mylan.loc/A)): sent
> 02-May-2009 14:23:46.774 resquery 0xb7b80008 (fctx
> 0xb7bae408(box.mylan.loct/A)): senddone
> 02-May-2009 14:23:46.774 fctx 0xb7bae408(ebox.voiplan.pt/A'): cancelquery
> 02-May-2009 14:23:46.774 dispatch 0xb7b6d378 response 0xb7ba7848
> 192.168.90.254#53: detaching from task 0xb7b6f2c8
> 02-May-2009 14:23:46.774 dispatch 0xb7b6d378: detach: refcount 0
> 02-May-2009 14:23:46.774 dispatch 0xb7b6d378: got packet: requests 0,
> buffers 1, recvs 1
> 02-May-2009 14:23:46.775 dispatch 0xb7b6d378: shutting down; detaching from
> sock 0xb7b79938, task 0xb7b74d70
> 02-May-2009 14:23:46.775 socket 0xb7b79938: destroying
> 02-May-2009 14:23:46.775 dispatchmgr 0xb7bbb168: destroy_mgr_ok:
> shuttingdown=0, listnonempty=1, epool=10, rpool=0, dpool=10
> 02-May-2009 14:23:46.775 shutting down
> 02-May-2009 14:23:46.775 stopping command channel on 127.0.0.1#953
> 02-May-2009 14:23:46.776 res 0xb7bbe200: shutdown
> 02-May-2009 14:23:46.776 res 0xb7bbe200: exiting
> 02-May-2009 14:23:46.776 dns_requestmgr_shutdown: 0xb7b75008
> 02-May-2009 14:23:46.776 send_shutdown_events: 0xb7b75008
> 02-May-2009 14:23:46.777 no longer listening on 127.0.0.1#53
> 02-May-2009 14:23:46.777 clientmgr @0xb7baa3f8: destroy
> 02-May-2009 14:23:46.777 no longer listening on 192.167.200.254#53
> 02-May-2009 14:23:46.777 clientmgr @0xb7baa548: destroy
> ..."
> 
> 
> If anybody could give me a hand with this, I would surely appreciate it.
> 
> 
> Nelson Vale
> 
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: Mark_Andrews at isc.org


