BIND 10 #988: Infinite loop on xfrout
BIND 10 Development
do-not-reply at isc.org
Wed Jun 1 09:50:03 UTC 2011
#988: Infinite loop on xfrout
-------------------------------------+-------------------------------------
Reporter: shane                      | Owner:
Type: defect                         | Status: new
Priority: major                      | Milestone: New Tasks
Component: Unclassified              | Resolution:
Keywords:                            | Sensitive: 0
Defect Severity: Medium              | Sub-Project: DNS
Feature Depending on Ticket:         | Estimated Difficulty: 0.0
Add Hours to Ticket: 0               | Total Hours: 0
Internal?: 0                         |
-------------------------------------+-------------------------------------
Description changed by shane:
Old description:
> There is an infinite loop in xfrout on my server.
>
> The process list:
>
> {{{
> root@h:/opt/bind10/var/bind10-devel/log# ps -eLf | grep xfrout
> root   625   621   625  0  5 May25 pts/1 00:00:01 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
> root   625   621   632  0  5 May25 pts/1 00:00:00 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
> root   625   621   633  0  5 May25 pts/1 00:00:52 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
> root   625   621  3771 96  5 May28 pts/1 3-04:54:56 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
> root   625   621  3772  0  5 May28 pts/1 00:00:00 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
> root 12322 12117 12322  0  1 07:42 pts/0 00:00:00 grep xfrout
> }}}
>
> It appears to be getting a read event on a file descriptor, and then
> getting 0 bytes when reading, and then not taking action:
>
> {{{
> root@h:/opt/bind10/var/bind10-devel/log# strace -p 3771 2>&1 | head -10
> Process 3771 attached - interrupt to quit
> recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
> select(14, [9 13], [], [], NULL) = 1 (in [13])
> recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
> select(14, [9 13], [], [], NULL) = 1 (in [13])
> recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
> select(14, [9 13], [], [], NULL) = 1 (in [13])
> recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
> select(14, [9 13], [], [], NULL) = 1 (in [13])
> recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
> }}}
>
> This file descriptor is for the Unix-domain socket used to transport file
> descriptors:
>
> {{{
> root@h:/opt/bind10/var/bind10-devel/log# lsof -p 625 | grep 13u
> b10-xfrou 625 root 13u unix 0xffff88001e642900 0t0 35734 /opt/bind10/var/auth_xfrout_conn
> }}}
>
> This is almost certainly being returned to notify_out.py:
>
> {{{
> def _get_notify_reply(self, sock, tgt_addr):
>     try:
>         msg, addr = sock.recvfrom(512)
>     except socket.error:
>         self._log_msg('error', "notify to %s failed: can't read notify reply" % addr_to_str(tgt_addr))
>         return None
>
>     return msg
> }}}
>
> What should happen is that an error needs to be recognized and the socket
> needs to be re-opened.
>
> Ideally, logging will be added to record when this happens, so that we
> can figure out why we are getting this response.
New description:
There is an infinite loop in xfrout on my server.

The process list:

{{{
root@h:/opt/bind10/var/bind10-devel/log# ps -eLf | grep xfrout
root   625   621   625  0  5 May25 pts/1 00:00:01 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root   625   621   632  0  5 May25 pts/1 00:00:00 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root   625   621   633  0  5 May25 pts/1 00:00:52 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root   625   621  3771 96  5 May28 pts/1 3-04:54:56 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root   625   621  3772  0  5 May28 pts/1 00:00:00 /usr/bin/python3 /opt/bind10/libexec/bind10-devel/b10-xfrout
root 12322 12117 12322  0  1 07:42 pts/0 00:00:00 grep xfrout
}}}

It appears to be getting a read event on a file descriptor, and then
getting 0 bytes when reading, and then not taking action:

{{{
root@h:/opt/bind10/var/bind10-devel/log# strace -p 3771 2>&1 | head -10
Process 3771 attached - interrupt to quit
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
select(14, [9 13], [], [], NULL) = 1 (in [13])
recvmsg(13, {msg_name(0)=NULL, msg_iov(1)=[{"\0", 1}], msg_controllen=0, msg_flags=0}, 0) = 0
}}}
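
The trace above is consistent with a peer that has closed its end of a
stream socket: a read of zero bytes signals end-of-file, and select() keeps
reporting the descriptor as readable until the reader stops watching it.
The following is a minimal sketch of that behaviour and of one way to break
the loop. It is not the b10-xfrout code; it assumes a stream socket, and the
name drain_until_eof and the iteration cap are hypothetical.

```python
import select
import socket


def drain_until_eof(sock, max_iterations=1000):
    """Read from sock until the peer closes; return (data, iterations).

    A zero-length read on a stream socket means end-of-file. The socket
    stays readable after that, so the loop must stop selecting on it.
    """
    received = b""
    iterations = 0
    while iterations < max_iterations:
        iterations += 1
        readable, _, _ = select.select([sock], [], [], 1.0)
        if not readable:
            break  # timed out with nothing to read
        chunk = sock.recv(1)
        if chunk == b"":
            # Peer closed the connection: close and stop watching it.
            sock.close()
            break
        received += chunk
    return received, iterations


if __name__ == "__main__":
    left, right = socket.socketpair()
    right.sendall(b"\0")
    right.close()
    print(drain_until_eof(left))
```

Without the check for an empty read, the loop would spin until the
iteration cap, which mirrors the repeating select/recvmsg pairs in the trace.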

This file descriptor is for the Unix-domain socket used to transport file
descriptors:

{{{
root@h:/opt/bind10/var/bind10-devel/log# lsof -p 625 | grep 13u
b10-xfrou 625 root 13u unix 0xffff88001e642900 0t0 35734 /opt/bind10/var/auth_xfrout_conn
}}}
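
For context, descriptor passing over a Unix-domain socket usually carries
one placeholder byte plus SCM_RIGHTS ancillary data, which matches the
one-byte buffer in the recvmsg calls. The sketch below uses only the
standard Python socket module and is an illustration, not how BIND 10
implements it; send_one_fd and recv_one_fd are hypothetical names. The
receiver returns None when it reads zero bytes with no control data, so a
caller can treat that as a closed connection instead of retrying.

```python
import array
import os
import socket


def send_one_fd(conn, fd):
    """Send one descriptor alongside a single placeholder byte."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    conn.sendmsg([b"\0"], ancillary)


def recv_one_fd(conn):
    """Receive one descriptor sent with SCM_RIGHTS; return None on EOF."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = conn.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    if not msg and not ancdata:
        return None  # zero bytes and no control data: the peer has closed
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial item before decoding.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise OSError("message carried no file descriptor")


if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    read_end, write_end = os.pipe()
    send_one_fd(sender, write_end)
    passed = recv_one_fd(receiver)
    os.write(passed, b"ok")
    print(os.read(read_end, 2))
    sender.close()
    print(recv_one_fd(receiver))
```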
--
Ticket URL: <http://bind10.isc.org/ticket/988#comment:2>
BIND 10 Development <http://bind10.isc.org>