BIND 10 #1271: exception on extremely slow startup
BIND 10 Development
do-not-reply at isc.org
Mon Oct 10 17:57:52 UTC 2011
#1271: exception on extremely slow startup
--------------------------------------------+-----------------------------
Reporter: jreed | Owner:
Type: defect | Status: new
Priority: minor | Milestone:
Component: Unclassified | Resolution:
Keywords: | Sensitive: 0
Defect Severity: N/A | Sub-Project: Core
Feature Depending on Ticket: | Estimated Difficulty: 0
Add Hours to Ticket: 0 | Total Hours: 0
Internal?: 0 |
--------------------------------------------+-----------------------------
Comment (by jinmei):
Replying to [comment:1 shane]:
> Even though the report is that it took over two minutes, the program
> thought it only took 9 seconds.
>
> I'm not sure what we can do on a machine so badly broken. :(
The exact same thing happened on my personal server. (You might call it
broken :-) By today's standards it is unusually slow, but otherwise
it works perfectly fine.) In my case sleeping for 1 second wasn't
sufficient for cfgmgr to start up:
{{{
    def start_cfgmgr(self, c_channel_env):
        """
            Starts the configuration manager process
        """
        [...]
        # sleep until b10-cfgmgr is fully up and running, this is a good place
        # to have a (short) timeout on synchronized groupsend/receive
        # TODO: replace the sleep by a listen for ConfigManager started
        #       message
        time.sleep(1)
}}}
and so it encounters "broken pipe" in start_ccsession:
{{{
self.start_cfgmgr(c_channel_env)
self.start_ccsession(c_channel_env)
}}}
I tentatively changed the sleep time to 5 seconds, and it worked,
but needless to say that is just an ugly workaround.
As noted in the TODO, what the boss process should actually do is
establish more explicit synchronization with cfgmgr: the boss waits on
msgq, cfgmgr sends an "I'm ready" message to the boss when it starts up,
and the boss then continues its startup. We might also want to specify
a limit on the waiting time (and since this happens before the boss gets
the configuration, the limit must be given via a command line option or
an environment variable).
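To make the idea concrete, here is a minimal, standalone sketch of "wait
for a ready message, with a limit" in place of the fixed sleep. It is not
the boss code: a plain queue.Queue stands in for msgq, and the "ready"
message text, the B10_CFGMGR_WAIT environment variable, and the function
names are all made up for illustration.

```python
import os
import queue
import threading
import time

# Hypothetical default and variable name; the limit has to come from the
# environment (or the command line) because the configuration is not
# available yet at this point.
DEFAULT_WAIT = 10.0


def get_wait_limit(environ=os.environ):
    """Return the startup wait limit in seconds."""
    return float(environ.get("B10_CFGMGR_WAIT", DEFAULT_WAIT))


def wait_for_ready(channel, limit):
    """Block until a "ready" message arrives or the limit expires.

    channel is a stand-in for msgq: any object with a blocking
    get(timeout=...) that raises queue.Empty on timeout.
    Returns True if the ready message was seen, False otherwise.
    """
    deadline = time.monotonic() + limit
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            msg = channel.get(timeout=remaining)
        except queue.Empty:
            return False
        if msg == "ready":
            return True
        # Unrelated message: ignore it and keep waiting.


if __name__ == "__main__":
    chan = queue.Queue()
    # Simulate a slow cfgmgr that reports readiness after 0.2 seconds.
    threading.Timer(0.2, chan.put, args=("ready",)).start()
    print(wait_for_ready(chan, get_wait_limit()))
```

With something along these lines the boss proceeds as soon as cfgmgr
reports in, and a slow machine only costs extra waiting time up to the
limit instead of a failure.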
I'd also suggest revising the log messages in this failure case. It's
difficult to know what was actually wrong from these messages. Seeing
the bare 'broken pipe' exception is also not cool.
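As a rough illustration of what a more helpful failure report could look
like, the sketch below wraps a startup step and turns a bare EPIPE into
an error that names the step and the component it depends on. The
StartupError class, the start_with_context helper, and the message
wording are hypothetical, not existing boss code.

```python
import errno
import logging

logger = logging.getLogger("boss-sketch")


class StartupError(Exception):
    """Hypothetical exception carrying a readable startup failure."""


def start_with_context(step_name, start_func, depends_on):
    """Run start_func(); report EPIPE with some context attached."""
    try:
        return start_func()
    except OSError as ex:
        if ex.errno != errno.EPIPE:
            raise
        msg = ("failed to start %s: connection closed (broken pipe); "
               "%s may not have finished starting up"
               % (step_name, depends_on))
        logger.error(msg)
        raise StartupError(msg) from ex
```

A call such as start_with_context("ccsession", func, "b10-cfgmgr") would
then log which step failed and hint at the likely cause, rather than
surfacing only the low-level exception.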
Finally, not directly related to this issue, but this log message is
not good:
{{{
2011-09-29 15:39:49.376 INFO [b10-boss.boss] BIND10_STARTING_PROCESS starting process ccsession
}}}
because ccsession is not a "process". This is a result of a naive
copy-paste here:
{{{
self.log_starting("ccsession")
}}}
It should be fixed.
I proposed this ticket to be handled in the next sprint.
--
Ticket URL: <http://bind10.isc.org/ticket/1271#comment:3>
BIND 10 Development <http://bind10.isc.org>