BIND 10 #1762: loading huge zone crashes all

BIND 10 Development do-not-reply at isc.org
Tue Mar 6 12:36:19 UTC 2012


#1762: loading huge zone crashes all
-------------------------------------+-------------------------------------
            Reporter:  jreed         |                        Owner:
                Type:  defect        |                       Status:  new
            Priority:  major         |                    Milestone:  New
           Component:  Unclassified  |  Tasks
           Sensitive:  0             |                     Keywords:
         Sub-Project:  DNS           |              Defect Severity:  N/A
Estimated Difficulty:  0             |  Feature Depending on Ticket:
         Total Hours:  0             |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
 I tried to load 100,000,003 records in a single zone. After six minutes
 bind10 crashed.

 {{{
 2012-03-05 21:59:37.971 INFO  [b10-auth.auth] AUTH_SERVER_CREATED server
 created
 2012-03-05 21:59:38.164 INFO  [b10-stats.stats] STATS_STARTING starting
 2012-03-05 21:59:38.177 INFO  [b10-xfrout.xfrout] XFROUT_NEW_CONFIG Update
 xfrout configuration
 2012-03-05 21:59:38.177 INFO  [b10-xfrout.xfrout] XFROUT_NEW_CONFIG_DONE
 Update xfrout configuration done
 2012-03-05 21:59:38.237 INFO  [b10-stats-httpd.stats-httpd]
 STATHTTPD_STARTED listening on 127.0.0.1#8000
 Traceback (most recent call last):
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-cfgmgr",
 line 107, in <module>
 Traceback (most recent call last):
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-xfrout",
 line 1041, in <module>
 Traceback (most recent call last):
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10
 -stats-httpd", line 829, in <module>
 2012-03-05 22:05:52.114 2012-03-05 22:05:52.114INFO  [b10-boss.boss]
 BIND10_PROCESS_ENDED process 16839 of Socket creator ended with status 9
  ERROR [b10-xfrin.xfrin] XFRIN_UNKNOWN_ERROR unknown error: Read of 0
 bytes: connection closed
 2012-03-05 22:06:00.128    sys.exit(main())
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-cfgmgr",
 line 95, in main
 2012-03-05 22:06:00.697    cm.run()
   ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED component Socket creator
 (pid 16839) failed with 9 exit status
 ERROR [  File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/config/cfgmgr.py", line 526, in run
 b10-zonemgr.config] CONFIG_SESSION_STOPPING_FAILED error sending stopping
 message: [Errno 32] Broken pipe
     xfrout_server.run()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-xfrout",
 line 1014, in run
     stats_httpd.start()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10
 -stats-httpd", line 292, in start
     self.mccs.check_command(nonblock=False)
     self._cc.check_command(False)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/config/ccsession.py", line 258, in check_command
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/config/ccsession.py", line 258, in check_command
     msg, env = self._session.group_recvmsg(nonblock)
     msg, env = self._session.group_recvmsg(nonblock)
 2012-03-05 22:06:03.862  File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 266, in group_recvmsg
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 266, in group_recvmsg
  FATAL [b10-boss.boss] BIND10_COMPONENT_UNSATISFIED component Socket
 creator is required to run and failed
 Traceback (most recent call last):
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-zonemgr",
 line 693, in <module>
     env, msg  = self.recvmsg(nonblock, seq)
     msg, env = self.cc.group_recvmsg(False)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 266, in group_recvmsg
     env, msg  = self.recvmsg(nonblock, seq)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 121, in recvmsg
     data = self._receive_full_buffer(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 203, in _receive_full_buffer
     self._receive_len_data()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 163, in _receive_len_data
     new_data = self._receive_bytes(self._recv_len_size)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 151, in _receive_bytes
     env, msg  = self.recvmsg(nonblock, seq)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 121, in recvmsg
     data = self._receive_full_buffer(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 203, in _receive_full_buffer
     self._receive_len_data()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 163, in _receive_len_data
     new_data = self._receive_bytes(self._recv_len_size)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 151, in _receive_bytes
     raise ProtocolError("Read of 0 bytes: connection closed")
     raise ProtocolError("Read of 0 bytes: connection closed")
 isc.cc.session.ProtocolError: Read of 0 bytes: connection closed
 2012-03-05 22:06:07.267 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process
 16840 of msgq ended with status 9
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 133, in recvmsg
     return self.recvmsg(nonblock, seq)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/installlib/python3.1
 /site-packages/isc/cc/session.py", line 121, in recvmsg
     data = self._receive_full_buffer(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 203, in _receive_full_buffer
     self._receive_len_data()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 163, in _receive_len_data
     new_data = self._receive_bytes(self._recv_len_size)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 151, in _receive_bytes
     raise ProtocolError("Read of 0 bytes: connection closed")
 isc.cc.session.ProtocolError: Read of 0 bytes: connection closed
 isc.cc.session.ProtocolError: Read of 0 bytes: connection closed
 2012-03-05 22:06:07.267 ERROR [b10-stats.config]
 CONFIG_SESSION_STOPPING_FAILED error sending stopping message: [Errno 32]
 Broken pipe
     zonemgrd.run()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-zonemgr",
 line 663, in run
 2012-03-05 22:06:13.788 FATAL [b10-boss.boss] BIND10_MSGQ_DISAPPEARED msgq
 channel disappeared
     self._module_cc.check_command(False)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/config/ccsession.py", line 258, in check_command
     msg, env = self._session.group_recvmsg(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 266, in group_recvmsg
     env, msg  = self.recvmsg(nonblock, seq)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 121, in recvmsg
     data = self._receive_full_buffer(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 203, in _receive_full_buffer
     self._receive_len_data()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 163, in _receive_len_data
     new_data = self._receive_bytes(self._recv_len_size)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 151, in _receive_bytes
     raise ProtocolError("Read of 0 bytes: connection closed")
 Traceback (most recent call last):
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-stats",
 line 406, in <module>
 isc.cc.session.ProtocolError: Read of 0 bytes: connection closed
 Exception in thread Thread-1:
 Traceback (most recent call last):
   File "/home/jreed/pkg/lib/python3.1/threading.py", line 509, in
 _bootstrap_inner
     self.run()
   File "/home/jreed/pkg/lib/python3.1/threading.py", line 462, in run
     self._target(*self._args, **self._kwargs)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-cmdctl",
 line 351, in _handle_msg_from_msgq
     self._module_cc.check_command(False)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/config/ccsession.py", line 258, in check_command
     msg, env = self._session.group_recvmsg(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 266, in group_recvmsg
     env, msg  = self.recvmsg(nonblock, seq)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 121, in recvmsg
     data = self._receive_full_buffer(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 203, in _receive_full_buffer
     self._receive_len_data()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 163, in _receive_len_data
     new_data = self._receive_bytes(self._recv_len_size)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 151, in _receive_bytes
     raise ProtocolError("Read of 0 bytes: connection closed")
 isc.cc.session.ProtocolError: Read of 0 bytes: connection closed

 2012-03-05 22:06:17.8552012-03-05 22:06:17.855  ERROR [b10-xfrin.config]
 CONFIG_SESSION_STOPPING_FAILED error sending stopping message: [Errno 32]
 Broken pipe
 INFO  [b10-boss.boss] BIND10_SHUTDOWN stopping the server
     stats.start()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/libexec/bind10-devel/b10-stats",
 line 189, in start
     self.mccs.check_command(False)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/config/ccsession.py", line 258, in check_command
 2012-03-05 22:06:31.553 ERROR [b10-boss.config]
 CONFIG_SESSION_STOPPING_FAILED error sending stopping message: [Errno 32]
 Broken pipe
 2012-03-05 22:06:34.286 INFO  [b10-boss.boss] BIND10_CONFIGURATOR_STOP
 bind10 component configurator is shutting down
     msg, env = self._session.group_recvmsg(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 266, in group_recvmsg
     env, msg  = self.recvmsg(nonblock, seq)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 121, in recvmsg
 2012-03-05 22:06:40.777 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP
 component msgq is being stopped
 2012-03-05 22:06:46.091 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP
 component b10-xfrin is being stopped
     data = self._receive_full_buffer(nonblock)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 203, in _receive_full_buffer
     self._receive_len_data()
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 163, in _receive_len_data
     new_data = self._receive_bytes(self._recv_len_size)
   File
 "/home/jreed/dnsbenchsuite/work/master-2012-02-15/20120222215946/install/lib/python3.1
 /site-packages/isc/cc/session.py", line 151, in _receive_bytes
     raise ProtocolError("Read of 0 bytes: connection closed")
 2012-03-05 22:06:50.005isc.cc.session.ProtocolError: Read of 0 bytes:
 connection closed
  INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-xfrin to shut down
 2012-03-05 22:07:00.415 ERROR [b10-boss.boss]
 BIND10_CONFIGURATOR_PLAN_INTERRUPTED configurator plan interrupted, only 1
 of 9 done
 2012-03-05 22:07:13.574 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM sending
 SIGTERM to cfgmgr (PID 16841)
 2012-03-05 22:07:18.566 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM sending
 SIGTERM to b10-zonemgr (PID 16842)
 2012-03-05 22:07:18.567 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM sending
 SIGTERM
 ...
 2012-03-05 22:07:19.350 INFO  [b10-boss.boss] BIND10_SHUTDOWN_COMPLETE all
 processes ended, shutdown complete
 }}}

 (The log above is from February code, but I reproduced the crash with more
 recent code from March 1 too.)
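
 The repeated "Read of 0 bytes: connection closed" in the tracebacks is the
 error a client raises when a blocking read on its socket returns no data,
 which is what happens once the peer has closed the connection (presumably
 msgq going away here). A minimal sketch of that behaviour follows. It is not
 the actual isc.cc.session code: the ProtocolError class, receive_bytes() and
 demo() are hypothetical stand-ins, and a socketpair plays the role of the
 msgq connection.

```python
import socket


class ProtocolError(Exception):
    """Hypothetical stand-in for isc.cc.session.ProtocolError."""


def receive_bytes(sock, size):
    """Read up to `size` bytes; an empty read means the peer closed."""
    data = sock.recv(size)
    if len(data) == 0:
        raise ProtocolError("Read of 0 bytes: connection closed")
    return data


def demo():
    """Close one end of a socket pair and read from the other end."""
    client, server = socket.socketpair()
    server.close()  # simulate the peer process disappearing
    try:
        receive_bytes(client, 4)
    except ProtocolError as err:
        return str(err)
    finally:
        client.close()
    return None


print(demo())
```

 Under that reading, every component blocked in group_recvmsg() would fail
 the same way at about the same time, which would explain the many
 near-identical tracebacks.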

 The long wait occurred while datasrc was adding RRsets to the zone (the
 "DATASRC_MEM_ADD_RRSET adding RRset" messages).

 The files auth_xfrout_conn and msgq_socket were left behind after the
 shutdown.

 I have repeated this with --verbose debugging but don't yet see any specific
 error. It crashed after loading about 15% of the master file. It may have
 been caused by a communication timeout (busy while loading the zone?) or by
 hitting some memory limit.
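
 One way to narrow this down is to decode the "status 9" that boss reports
 for the Socket creator and msgq. The sketch below assumes that the logged
 number is a raw POSIX wait status as returned by os.waitpid(), which this
 ticket does not confirm. describe_wait_status() is a hypothetical helper,
 not part of BIND 10.

```python
import os
import signal


def describe_wait_status(status):
    """Decode a raw POSIX wait status (assumed to be what boss logs)."""
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        return "killed by signal %d (%s)" % (sig, signal.Signals(sig).name)
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "other status %d" % status


# The value from BIND10_PROCESS_ENDED / BIND10_COMPONENT_FAILED above.
print(describe_wait_status(9))
```

 If the assumption holds, the processes were killed by a signal rather than
 exiting on their own, which would fit the memory limit guess better than a
 timeout. The log alone does not establish who sent the signal.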

 {{{
 2012-03-05 23:31:37.406 DEBUG [b10-auth (7232).datasrc]
 DATASRC_MEM_ADD_RRSET adding RRset 'ns2.4104824.example./A' into zone
 'example.'
 2012-03-05 23:31:41.226 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED
 component Socket creator (pid 7223) failed with 9 exit status
 2012-03-05 23:31:42.381 FATAL [b10-boss.boss] BIND10_COMPONENT_UNSATISFIED
 component Socket creator is required to run and failed
 2012-03-05 23:31:44.381 DEBUG [b10-auth (7232).datasrc]
 DATASRC_MEM_ADD_RRSET adding RRset '4104825.example./NS' into zone
 'example.'
 }}}

 b10-auth kept logging many more DATASRC_MEM_ADD_RRSET messages after that,
 even while the other components were shutting down, until a SIGTERM was sent
 to it.

-- 
Ticket URL: <http://bind10.isc.org/ticket/1762>
BIND 10 Development <http://bind10.isc.org>