BIND 10 #1597: traceback in bind10 when cfgmrg can't start

BIND 10 Development do-not-reply at isc.org
Wed Jan 18 19:18:53 UTC 2012


#1597: traceback in bind10 when cfgmrg can't start
-------------------------------------+-------------------------------------
            Reporter:  jreed         |                        Owner:
                Type:  defect        |                       Status:  new
            Priority:  major         |                    Milestone:  New
           Component:  Boss of BIND  |  Tasks
           Sensitive:  0             |                     Keywords:
         Sub-Project:  Core          |              Defect Severity:  High
Estimated Difficulty:  0             |  Feature Depending on Ticket:
         Total Hours:  0             |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
 This maybe related to #1291.

 I tried bind10 --user. I forgot about config permissions.

 This is from shutdown first:

 {{{

 2012-01-18 12:54:51.411 ERROR [b10-boss.boss]
 BIND10_CONFIGURATOR_PLAN_INTERRUPTED configurator plan interrupted, only 2
 of 8 done
 2012-01-18 12:54:52.503 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process
 1481 of Socket creator ended with status 0

 }}}

 Not sure what that 2 of 8 means or if it matters. Anyways here is the
 problem:

 {{{

 2012-01-18 12:55:23.192 INFO  [b10-boss.boss] BIND10_STARTING starting
 BIND10: bind10 20110223 (BIND 10 20120119)
 2012-01-18 12:55:23.192 DEBUG [b10-boss.boss]
 BIND10_CHECK_MSGQ_ALREADY_RUNNING checking if msgq is already running
 2012-01-18 12:55:23.204 INFO  [b10-boss.boss] BIND10_CONFIGURATOR_START
 bind10 component configurator is starting up
 2012-01-18 12:55:23.204 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_BUILD
 building plan '{}' -> '{'sockcreator': {'priority': 200, 'kind': 'core',
 'special': 'sockcreator'}, 'msgq': {'priority': 199, 'kind': 'core',
 'special': 'msgq'}, 'cfgmgr': {'priority': 198, 'kind': 'core', 'special':
 'cfgmgr'}}'
 2012-01-18 12:55:23.204 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_RUN
 running plan of 3 tasks
 2012-01-18 12:55:23.204 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_TASK
 performing task start on Socket creator
 2012-01-18 12:55:23.204 INFO  [b10-boss.boss] BIND10_COMPONENT_START
 component Socket creator is starting
 2012-01-18 12:55:23.231 INFO  [b10-boss.boss] BIND10_SOCKCREATOR_INIT
 initializing socket creator parser
 2012-01-18 12:55:23.231 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID
 started b10-sockcreator (PID 10365)
 2012-01-18 12:55:23.231 INFO  [b10-boss.boss] BIND10_SETUID setting UID to
 1000
 2012-01-18 12:55:23.232 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_TASK
 performing task start on msgq
 2012-01-18 12:55:23.232 INFO  [b10-boss.boss] BIND10_COMPONENT_START
 component msgq is starting
 2012-01-18 12:55:23.232 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS
 starting process b10-msgq
 2012-01-18 12:55:23.348 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID
 started b10-msgq (PID 16077)
 2012-01-18 12:55:23.569 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_TASK
 performing task start on cfgmgr
 2012-01-18 12:55:23.569 INFO  [b10-boss.boss] BIND10_COMPONENT_START
 component cfgmgr is starting
 2012-01-18 12:55:23.569 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS
 starting process b10-cfgmgr
 2012-01-18 12:55:23.637 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID
 started b10-cfgmgr (PID 13175)
 2012-01-18 12:55:23.637 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:23.951 FATAL [b10-cfgmgr.cfgmgr] CFGMGR_DATA_READ_ERROR
 error reading configuration database from disk: Can't read configuration
 file: [Errno 13] Permission denied:
 '/home/reed/opt/bind10/var/bind10-devel/b10-config.db'
 2012-01-18 12:55:23.967 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:24.977 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:25.987 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:26.998 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:28.008 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:29.018 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:30.029 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:31.039 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:32.050 DEBUG [b10-boss.boss] BIND10_WAIT_CFGMGR waiting
 for configuration manager process to initialize
 2012-01-18 12:55:33.060 ERROR [b10-boss.boss]
 BIND10_COMPONENT_START_EXCEPTION component cfgmgr failed to start:
 Configuration manager process has not started
 2012-01-18 12:55:33.061 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED
 component cfgmgr (pid None) failed with unknown exit status
 2012-01-18 12:55:33.061 FATAL [b10-boss.boss] BIND10_COMPONENT_UNSATISFIED
 component cfgmgr is required to run and failed
 2012-01-18 12:55:33.062 ERROR [b10-boss.boss]
 BIND10_CONFIGURATOR_PLAN_INTERRUPTED configurator plan interrupted, only 2
 of 3 done
 2012-01-18 12:55:33.062 INFO  [b10-boss.boss] BIND10_KILLING_ALL_PROCESSES
 killing all started processes
 2012-01-18 12:55:33.063 INFO  [b10-boss.boss] BIND10_KILL_PROCESS killing
 process Socket creator
 2012-01-18 12:55:33.063 WARN  [b10-boss.boss] BIND10_SOCKCREATOR_KILL
 killing the socket creator
 Traceback (most recent call last):
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/component.py", line 165, in start
     self._start_internal()
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/component.py", line 403, in _start_internal
     procinfo = self._start_func()
   File "/home/reed/opt/bind10/sbin/bind10", line 476, in start_cfgmgr
     raise ProcessStartError("Configuration manager process has not
 started")
 __main__.ProcessStartError: Configuration manager process has not started

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/home/reed/opt/bind10/sbin/bind10", line 638, in startup
     self.start_all_components()
   File "/home/reed/opt/bind10/sbin/bind10", line 599, in
 start_all_components
     self._component_configurator.startup(self.__core_components)
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/component.py", line 504, in startup
     self.__reconfigure_internal(self._components, configuration)
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/component.py", line 492, in __reconfigure_internal
     self._run_plan(self._build_plan(old, new))
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/component.py", line 633, in _run_plan
     component.start()
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/component.py", line 168, in start
     self.failed(None)
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/component.py", line 222, in failed
     self._boss.component_shutdown(1)
   File "/home/reed/opt/bind10/sbin/bind10", line 671, in
 component_shutdown
     raise Exception("Component failed during startup");
 Exception: Component failed during startup

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/home/reed/opt/bind10/sbin/bind10", line 1170, in <module>
     main()
   File "/home/reed/opt/bind10/sbin/bind10", line 1148, in main
     startup_result = boss_of_bind.startup()
   File "/home/reed/opt/bind10/sbin/bind10", line 640, in startup
     self.kill_started_components()
   File "/home/reed/opt/bind10/sbin/bind10", line 343, in
 kill_started_components
     self.components[pid].kill(True)
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/special_component.py", line 68, in kill
     self.__creator.kill()
   File "/home/reed/opt/bind10/lib/python3.1/site-
 packages/isc/bind10/sockcreator.py", line 239, in kill
     self.__process.kill()
   File "/usr/pkg/lib/python3.1/subprocess.py", line 1403, in kill
     self.send_signal(signal.SIGKILL)
   File "/usr/pkg/lib/python3.1/subprocess.py", line 1393, in send_signal
     os.kill(self.pid, sig)
 OSError: [Errno 1] Operation not permitted

 }}}

 I tried to start bind10 again without --user and it failed due to running
 still:


 {{{
 2012-01-18 12:56:05.693 INFO  [b10-boss.boss] BIND10_STARTING starting
 BIND10: bind10 20110223 (BIND 10 20120119)
 2012-01-18 12:56:05.694 DEBUG [b10-boss.boss]
 BIND10_CHECK_MSGQ_ALREADY_RUNNING checking if msgq is already running
 2012-01-18 12:56:05.695 FATAL [b10-boss.boss] BIND10_MSGQ_ALREADY_RUNNING
 msgq daemon already running, cannot start
 2012-01-18 12:56:05.695 FATAL [b10-boss.boss] BIND10_STARTUP_ERROR error
 during startup: b10-msgq already running, or socket file not cleaned ,
 cannot start
 [b10-msgq] Closing socket fd 5
 [b10-msgq] Receive error: EOF
 [b10-msgq] Closing socket fd 4
 [b10-msgq] Receive error: EOF
 [b10-msgq] Closing socket fd 4
 [b10-msgq] Receive error: EOF

 }}}

 with these processes:

 {{{

 root 10365  0.0  0.2  17920  1592 ttyp1 I    12:55PM  0:00.01
 b10-sockcreator
 reed 16077  0.0  1.7 109924 15296 ttyp1 I    12:55PM  0:00.28
 /usr/pkg/bin/python3.1 /home/reed/opt/bind10/libexec/bind10-devel/b10-msgq

 }}}

 I killed these manually and started bind10 without --user.

 The traceback should not happen.

 sockcreator should never stay running (I have seen this before).

 msgq should never stay running.

 bind10 should say to stderr if can't start (don't assume logging works).

-- 
Ticket URL: <http://bind10.isc.org/ticket/1597>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list