BIND 10 #988: Infinite loop on xfrout
BIND 10 Development
do-not-reply at isc.org
Wed Jun 1 07:48:50 UTC 2011
#988: Infinite loop on xfrout
-------------------------------------+-------------------------------------
Reporter: shane                      | Owner:
Type: defect                         | Status: new
Priority: major                      | Milestone: New Tasks
Component: Unclassified              | Keywords:
Sensitive: 0                         | Defect Severity: Medium
Sub-Project: DNS                     | Feature Depending on Ticket:
Estimated Difficulty: 0              | Add Hours to Ticket: 0
Total Hours: 0                       | Internal?: 0
-------------------------------------+-------------------------------------
There is an infinite loop in xfrout on my server.
The thread list shows one thread (LWP 3771) at 96% CPU, with more than
three days of accumulated CPU time:
{{{
root@h:/opt/bind10/var/bind10-devel/log# ps -eLf | grep xfrout
root 625 621 625 0 5 May25 pts/1 00:00:01 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root 625 621 632 0 5 May25 pts/1 00:00:00 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root 625 621 633 0 5 May25 pts/1 00:00:52 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root 625 621 3771 96 5 May28 pts/1 3-04:54:56 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root 625 621 3772 0 5 May28 pts/1 00:00:00 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root 12322 12117 12322 0 1 07:42 pts/0 00:00:00 grep xfrout
}}}
The busy thread appears to get a read event on a file descriptor, read
0 bytes from it, and then take no action, so select() immediately reports
the descriptor as readable again:
{{{
root@h:/opt/bind10/var/bind10-devel/log# strace -p 3771 2>&1 | head -10
Process 3771 attached - interrupt to quit
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
}}}
This file descriptor is for the Unix-domain socket used to transport file
descriptors:
{{{
root@h:/opt/bind10/var/bind10-devel/log# lsof -p 625 | grep 13u
b10-xfrou 625 root 13u unix 0xffff88001e642900 0t0 35734 /opt/bind10/var/auth_xfrout_conn
}}}
The zero-byte read is almost certainly being returned to this code in
notify_out.py:
{{{
def _get_notify_reply(self, sock, tgt_addr):
    try:
        msg, addr = sock.recvfrom(512)
    except socket.error:
        self._log_msg('error', "notify to %s failed: can't read notify reply" % addr_to_str(tgt_addr))
        return None
    return msg
}}}
The zero-byte read should be recognized as an error, and the socket
should be re-opened.
Ideally, logging should also be added to record when this happens, so we
can figure out why we are getting this response.
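A minimal sketch of the handling described above, not the actual xfrout code. It assumes a stream socket, where a zero-byte read on a readable descriptor normally means the peer has closed the connection. The names `read_event`, `poll_loop`, `reopen` and the logger name are hypothetical and chosen only for illustration.

```python
import logging
import select
import socket

# Hypothetical logger name; the real daemon has its own logging setup.
logger = logging.getLogger("xfrout_sketch")


def read_event(conn, timeout=None):
    """Wait for one read event on conn and classify the result.

    Returns ("data", bytes), ("timeout", b"") or ("closed", b"").
    """
    readable, _, _ = select.select([conn], [], [], timeout)
    if not readable:
        return ("timeout", b"")
    data = conn.recv(1)
    if not data:
        # Zero bytes on a readable stream socket: treat it as EOF.
        # Log it and close the socket, so that select() cannot keep
        # reporting the same descriptor as readable forever.
        logger.warning("peer closed connection on fd %d", conn.fileno())
        conn.close()
        return ("closed", b"")
    return ("data", data)


def poll_loop(conn, reopen, max_events):
    """Handle up to max_events events, calling reopen() after an EOF.

    reopen is a caller-supplied (hypothetical) function that returns a
    freshly opened socket.
    """
    received = []
    for _ in range(max_events):
        status, data = read_event(conn, timeout=1.0)
        if status == "closed":
            conn = reopen()
        elif status == "data":
            received.append(data)
    return conn, received
```

The point of the sketch is that the EOF case is handled explicitly: the descriptor is closed and replaced, and a log line records when it happened, instead of returning to select() with the dead descriptor still in the read set.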
--
Ticket URL: <http://bind10.isc.org/ticket/988>
BIND 10 Development <http://bind10.isc.org>