[bind10-dev] fatal problems: AUTH_SERVER_FAILED and BIND10_MSGQ_DISAPPEARED

Jeremy C. Reed jreed at isc.org
Mon Nov 19 16:22:12 UTC 2012


The server was started on 2012-10-01. When it was restarted then, the 
operators re-routed the traffic elsewhere; that was fixed on the 22nd, 
after I reported that the server was receiving no traffic. Logging 
started again on the 25th with various NOTIFY problems (see my previous 
email and http://bind10.isc.org/ticket/1938), and these messages have 
continued. DEBUG logging was not enabled.

See the FATAL AUTH_SERVER_FAILED below. I don't see any core dump for 
that one; I think it may have been overwritten by a later core file.

2012-10-31 01:17:04.633 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 139.168.192.IN-ADDR.ARPA. 
(class IN) is not known to the zone manager
2012-10-31 01:17:04.633 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(139.168.192.IN-ADDR.ARPA., IN) doesn't belong to zonemgr"
2012-10-31 01:17:04.633 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(139.168.192.IN-ADDR.ARPA., IN) doesn't belong to zonemgr"
2012-11-01 01:00:05.661 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: Can't assign requested address
2012-11-01 01:00:05.663 INFO  [b10-boss.boss] 
BIND10_LOST_SOCKET_CONSUMER consumer 33 of sockets disconnected, 
considering all its sockets closed
2012-11-01 01:00:05.664 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED 
process 76219 of b10-auth-3 ended with status 256
2012-11-01 01:00:05.664 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED 
component b10-auth-3 (pid 76219) failed: process exited normally with 
exit status 256
2012-11-01 01:00:05.664 INFO  [b10-boss.boss] BIND10_COMPONENT_START 
component b10-auth-3 is starting
2012-11-01 01:00:05.664 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS 
starting process b10-auth
2012-11-01 01:00:05.785 INFO  [b10-auth.auth] AUTH_SERVER_STARTED server 
started
2012-11-01 17:20:39.318 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 10.in-addr.arpa. (class IN) 
is not known to the zone manager
2012-11-01 17:20:39.319 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(10.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-01 17:20:39.749 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 17.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-01 17:20:39.749 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 10.in-addr.arpa. (class IN) 
is not known to the zone manager

There were 44 FATAL AUTH_SERVER_FAILED problems between when it was 
started on August 1 and October 1, but I am ignoring them for now, since 
the BIND 10 suite was upgraded to the next version after them and it 
always restarted the failed servers.
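
A minimal sketch of how such a count could be reproduced for a given 
date window, broken down by the reason text after "server failed:". 
It assumes each entry is a single physical line in the log file (the 
wrapping in this mail comes from the mailer) and that the layout matches 
the excerpts quoted here. The function name and the regular expression 
are illustrative, not part of BIND 10.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, taken from the excerpts in this mail:
#   DATE TIME.mmm FATAL [logger] AUTH_SERVER_FAILED server failed: REASON
FAILED_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d{3}\s+FATAL\s+\[[^\]]+\]\s+"
    r"AUTH_SERVER_FAILED server failed: (.*)$"
)


def count_auth_failures(lines, start, end):
    """Count AUTH_SERVER_FAILED entries with start <= time < end, by reason."""
    reasons = Counter()
    for line in lines:
        m = FAILED_RE.match(line.rstrip("\n"))
        if not m:
            continue  # some other message, or a line in another layout
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if start <= when < end:
            reasons[m.group(2).strip()] += 1
    return reasons
```

Called with an open log file and two datetime objects, it returns a 
Counter keyed by reason, so different failure reasons within the same 
window can be told apart.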

Here is a later FATAL example:

2012-11-11 01:17:02.300 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 139.168.192.IN-ADDR.ARPA. 
(class IN) is not known to the zone manager
2012-11-11 01:17:02.300 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(139.168.192.IN-ADDR.ARPA., IN) doesn't belong to zonemgr"
2012-11-11 01:17:02.300 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(139.168.192.IN-ADDR.ARPA., IN) doesn't belong to zonemgr"
2012-11-12 01:01:40.038 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 30.168.192.in-addr.arpa. 
(class IN) is not known to the zone manager
2012-11-12 01:01:40.038 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(30.168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-12 01:01:43.516 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 30.168.192.in-addr.arpa. 
(class IN) is not known to the zone manager
2012-11-12 01:01:43.517 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(30.168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-12 01:01:47.375 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 30.168.192.in-addr.arpa. 
(class IN) is not known to the zone manager
2012-11-12 01:01:47.376 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(30.168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-12 01:01:55.906 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 30.168.192.in-addr.arpa. 
(class IN) is not known to the zone manager
2012-11-12 01:01:55.907 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(30.168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-12 01:02:03.968 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 30.168.192.in-addr.arpa. 
(class IN) is not known to the zone manager
2012-11-12 01:02:03.969 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(30.168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-12 01:02:20.218 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 30.168.192.in-addr.arpa. 
(class IN) is not known to the zone manager
2012-11-12 01:02:20.219 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(30.168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-14 02:42:58.656 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: Can't assign requested address
2012-11-14 02:42:58.659 INFO  [b10-boss.boss] 
BIND10_LOST_SOCKET_CONSUMER consumer 48 of sockets disconnected, 
considering all its sockets closed
2012-11-14 02:42:58.659 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED 
process 76221 of b10-auth-5 ended with status 256
2012-11-14 02:42:58.659 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED 
component b10-auth-5 (pid 76221) failed: process exited normally with 
exit status 256
2012-11-14 02:42:58.659 INFO  [b10-boss.boss] BIND10_COMPONENT_START 
component b10-auth-5 is starting
2012-11-14 02:42:58.659 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS 
starting process b10-auth
2012-11-14 02:42:58.779 INFO  [b10-auth.auth] AUTH_SERVER_STARTED server 
started
2012-11-15 18:28:52.895 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 168.192.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:52.896 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-15 18:28:53.396 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 168.192.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.396 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 21.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.397 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(168.192.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-15 18:28:53.397 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 21.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.397 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(21.172.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-15 18:28:53.397 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 26.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.397 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(21.172.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-15 18:28:53.398 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 17.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.398 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 26.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.398 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(17.172.in-addr.arpa., IN) doesn't belong to zonemgr"
2012-11-15 18:28:53.398 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 17.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.398 ERROR [b10-zonemgr.zonemgr] 
ZONEMGR_UNKNOWN_ZONE_NOTIFIED notified zone 28.172.in-addr.arpa. (class 
IN) is not known to the zone manager
2012-11-15 18:28:53.398 ERROR [b10-auth.auth] AUTH_ZONEMGR_ERROR 
received error response from zone manager: "[b10-zonemgr] Notified zone 
(17.172.in-addr.arpa., IN) doesn't belong to zonemgr"

The auth server was restarted again. (It was one of eight b10-auth 
servers.)
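
A side note on the "status 256" in the BIND10_PROCESS_ENDED and 
BIND10_COMPONENT_FAILED lines above: if that number is the raw wait 
status as returned by waitpid() (an assumption; this was not checked 
against the boss code), then it decodes to a normal exit with exit 
code 1. A small sketch using the standard Python os module; the 
function name is illustrative only.

```python
import os


def describe_wait_status(status):
    """Decode a raw POSIX wait status into a short description."""
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    return "unknown status %d" % status


print(describe_wait_status(256))
```

Under that assumption the b10-auth processes did not die from a signal; 
they exited on their own with code 1 after logging the FATAL message.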

Now see the FATAL BIND10_MSGQ_DISAPPEARED below; this was not recoverable.

2012-11-16 14:20:10.743 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED 
component b10-stats (pid 76217) failed: process exited normally with 
exit status 256
2012-11-16 14:20:10.790 INFO  [b10-boss.boss] BIND10_COMPONENT_START 
component b10-stats is starting
2012-11-16 14:20:10.842 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS 
starting process b10-stats
2012-11-16 14:20:11.902 ERROR [b10-auth.cc] CC_TIMEOUT timeout reading 
data from command channel
2012-11-16 14:20:11.903 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: Timeout while reading data from cc 
session
2012-11-16 14:20:11.903 ERROR [b10-auth.cc] CC_LENGTH_NOT_READY length 
not ready
2012-11-16 14:20:11.904 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: ASIO read: data length is not ready
2012-11-16 14:20:11.905 ERROR [b10-auth.cc] CC_TIMEOUT timeout reading 
data from command channel
2012-11-16 14:20:11.905 ERROR [b10-auth.cc] CC_TIMEOUT timeout reading 
data from command channel
2012-11-16 14:20:11.907 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: Timeout while reading data from cc 
session
2012-11-16 14:20:11.907 ERROR [b10-auth.cc] CC_LENGTH_NOT_READY length 
not ready
2012-11-16 14:20:11.907 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: ASIO read: data length is not ready
2012-11-16 14:20:11.908 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: Timeout while reading data from cc 
session
2012-11-16 14:20:11.908 ERROR [b10-auth.cc] CC_LENGTH_NOT_READY length 
not ready
2012-11-16 14:20:11.908 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: ASIO read: data length is not ready
2012-11-16 14:20:12.405 FATAL [b10-boss.boss] BIND10_MSGQ_DISAPPEARED 
msgq channel disappeared
2012-11-16 14:20:12.410 INFO  [b10-boss.boss] BIND10_SHUTDOWN stopping 
the server
2012-11-16 14:20:12.693 ERROR [b10-boss.config] 
CONFIG_SESSION_STOPPING_FAILED error sending stopping message: [Errno 
32] Broken pipe
2012-11-16 14:20:12.725 INFO  [b10-boss.boss] BIND10_CONFIGURATOR_STOP 
bind10 component configurator is shutting down
2012-11-16 14:20:12.749 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:12.749 ERROR [b10-auth.cc] CC_READ_ERROR error reading 
data from command channel (End of file.)
2012-11-16 14:20:12.749 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:12.750 ERROR [b10-auth.cc] CC_READ_ERROR error reading 
data from command channel (End of file.)
2012-11-16 14:20:12.784 ERROR [b10-auth.cc] CC_READ_ERROR error reading 
data from command channel (End of file.)
2012-11-16 14:20:12.790 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:13.296 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:13.296 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:13.296 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: Error while reading data from cc 
session: End of file.
2012-11-16 14:20:13.297 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: Error while reading data from cc 
session: End of file.
2012-11-16 14:20:13.297 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: asynchronous read failed
2012-11-16 14:20:13.297 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: asynchronous read failed
2012-11-16 14:20:13.297 ERROR [b10-auth.cc] CC_LENGTH_NOT_READY length 
not ready
2012-11-16 14:20:13.297 ERROR [b10-auth.cc] CC_LENGTH_NOT_READY length 
not ready
2012-11-16 14:20:13.297 ERROR [b10-auth.cc] CC_LENGTH_NOT_READY length 
not ready
2012-11-16 14:20:13.297 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: ASIO read: data length is not ready
2012-11-16 14:20:13.297 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: error: unexpected character 
2012-11-16 14:20:13.297 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: ASIO read: data length is not ready
2012-11-16 14:20:13.917 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: ASIO read: data length is not ready
2012-11-16 14:20:13.917 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:13.917 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:13.918 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: Error while reading data from cc 
session: End of file.
2012-11-16 14:20:13.918 ERROR [b10-auth.cc] CC_LENGTH_NOT_READY length 
not ready
2012-11-16 14:20:13.918 ERROR [b10-auth.auth] AUTH_ZONEMGR_COMMS error 
communicating with zone manager: ASIO read: data length is not ready
2012-11-16 14:20:13.918 ERROR [b10-auth.cc] CC_ASYNC_READ_FAILED 
asynchronous read failed (error code = 2)
2012-11-16 14:20:14.101 ERROR [b10-xfrin.xfrin] XFRIN_UNKNOWN_ERROR 
unknown error: Read of 0 bytes: connection closed
2012-11-16 14:20:14.132 ERROR [b10-zonemgr.config] 
CONFIG_SESSION_STOPPING_FAILED error sending stopping message: [Errno 
32] Broken pipe
2012-11-16 14:20:14.132 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP 
component b10-auth-2 is being stopped
2012-11-16 14:20:14.132 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking 
b10-auth-2 to shut down
2012-11-16 14:20:14.132 ERROR [b10-boss.boss] 
BIND10_CONFIGURATOR_PLAN_INTERRUPTED configurator plan interrupted, only 
0 of 16 done
2012-11-16 14:20:14.247 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: asynchronous read failed
2012-11-16 14:20:14.251 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: asynchronous read failed
2012-11-16 14:20:14.262 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: asynchronous read failed
2012-11-16 14:20:14.268 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: asynchronous read failed
2012-11-16 14:20:14.314 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: asynchronous read failed
2012-11-16 14:20:14.332 FATAL [b10-auth.auth] AUTH_SERVER_FAILED server 
failed: asynchronous read failed
2012-11-16 14:20:14.493 FATAL [b10-auth.cc] CC_WRITE_ERROR error writing 
data to command channel (Broken pipe)
2012-11-16 14:20:14.493 FATAL [b10-auth.cc] CC_WRITE_ERROR error writing 
data to command channel (Broken pipe)
2012-11-16 14:20:14.495 FATAL [b10-auth.cc] CC_WRITE_ERROR error writing 
data to command channel (Broken pipe)
2012-11-16 14:20:14.497 FATAL [b10-auth.cc] CC_WRITE_ERROR error writing 
data to command channel (Broken pipe)
2012-11-16 14:20:14.501 FATAL [b10-auth.cc] CC_WRITE_ERROR error writing 
data to command channel (Broken pipe)
2012-11-16 14:20:14.543 FATAL [b10-auth.cc] CC_WRITE_ERROR error writing 
data to command channel (Broken pipe)
2012-11-16 14:20:14.933 ERROR [b10-xfrin.config] 
CONFIG_SESSION_STOPPING_FAILED error sending stopping message: [Errno 
32] Broken pipe
2012-11-16 14:20:15.200 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED 
process 76210 of msgq ended with status 256
2012-11-16 14:20:15.203 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-zonemgr (PID 76224)
2012-11-16 14:20:15.215 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-auth-5 (PID 93366)
2012-11-16 14:20:15.215 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-stats (PID 17522)
2012-11-16 14:20:15.228 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-auth-3 (PID 67352)
2012-11-16 14:20:15.283 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to Socket creator (PID 76209)
2012-11-16 14:20:15.283 WARN  [b10-boss.boss] BIND10_SOCKCREATOR_KILL 
killing the socket creator
2012-11-16 14:20:15.283 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to cfgmgr (PID 76211)
2012-11-16 14:20:15.295 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-stats-httpd (PID 76212)
2012-11-16 14:20:15.301 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-xfrin (PID 76213)
2012-11-16 14:20:15.301 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-cmdctl (PID 76214)
2012-11-16 14:20:15.301 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-auth (PID 76215)
2012-11-16 14:20:15.301 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-auth-8 (PID 76216)
2012-11-16 14:20:15.301 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-auth-6 (PID 76218)
2012-11-16 14:20:15.301 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM 
sending SIGTERM to b10-auth-2 (PID 76220)
...
2012-11-16 14:20:17.288 INFO  [b10-boss.boss] BIND10_SHUTDOWN_COMPLETE 
all processes ended, shutdown complete

Now all of bind10 has crashed.

Later the operations team noticed it was down. They logged in, found 
that no bind10 was running, and started it:

2012-11-16 20:27:47.126 INFO  [b10-boss.boss] 
BIND10_READING_BOSS_CONFIGURATION reading boss configuration

This is on a FreeBSD 8.1 amd64 system which handles around 6000 qps 
using eight b10-auth processes.

The following is a backtrace from a b10-auth core dump from 14:20:

#0  0x00000008026763cc in kill () from /lib/libc.so.7
#1  0x00000008026751cb in abort () from /lib/libc.so.7
#2  0x00000008020aefa4 in __gnu_cxx::__verbose_terminate_handler ()
   from /usr/lib/libstdc++.so.6
#3  0x00000008020b34a3 in std::set_unexpected () from 
/usr/lib/libstdc++.so.6
#4  0x00000008020b34e3 in std::terminate () from /usr/lib/libstdc++.so.6
#5  0x00000008020b344a in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x0000000800dbd960 in isc::cc::SessionImpl::writeData ()
   from /usr/local/lib/libb10-cc.so.0
#7  0x0000000800dbf65e in isc::cc::Session::sendmsg ()
   from /usr/local/lib/libb10-cc.so.0
#8  0x0000000800dbf99e in isc::cc::Session::unsubscribe ()
   from /usr/local/lib/libb10-cc.so.0
#9  0x0000000800c7b60e in 
isc::config::ModuleCCSession::removeRemoteConfig ()
   from /usr/local/lib/libb10-cfgclient.so.1
#10 0x000000000042e221 in boost::function1<bool, 
boost::shared_ptr<isc::data::Element const> const&>::dummy::nonnull ()

I tried to enable some DEBUG logging, even at debuglevel 54 (to skip 
AUTH_SEND_NORMAL_RESPONSE and AUTH_PACKET_RECEIVED), but the logging was 
too much (17 megabytes within a few seconds).

A quick count:

# cut -f 5 -d " " bind10.log* | sort | uniq -c | sort -n -r
45046 DATASRC_MEM_SUCCESS
44987 DATASRC_MEMORY_MEM_FIND_ZONE
44770 DATASRC_MEM_NOT_FOUND
  16 CC_GROUP_RECEIVED
  16 CC_GROUP_RECEIVE
  11 CONFIG_LOG_WILD_MATCH
   8 CC_START_READ
   8 CC_REPLY
   8 AUTH_RECEIVED_COMMAND
   1 STATS_SEND_STATISTICS_REQUEST
   1 CMDCTL_COMMAND_SENT
   1 BIND10_RECEIVED_COMMAND

The top three above are at level 45. I could log at debuglevel 44, but I 
don't know what useful data I will lose.
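
One caveat about the quick count above: cut with a single-space 
delimiter does not merge repeated spaces. In the excerpts earlier in 
this mail, INFO and WARN are followed by two spaces, so for those lines 
field 5 is the logger name rather than the message ID. A minimal sketch 
of a whitespace-tolerant count, assuming one entry per physical line and 
the layout shown in the excerpts; the function name and the regular 
expression are illustrative.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this mail:
#   DATE TIME.mmm SEVERITY [logger] MESSAGE_ID free text...
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\s+"
    r"(?P<severity>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+(?P<msgid>[A-Z0-9_]+)"
)


def count_message_ids(lines):
    """Count (severity, message ID) pairs; lines in other layouts are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            counts[(m.group("severity"), m.group("msgid"))] += 1
    return counts
```

Keeping the severity in the key shows which message IDs account for 
the volume at DEBUG, which may help when choosing a lower debuglevel.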

Any ideas about any of this?

Thanks,

  Jeremy C. Reed
  ISC

