BIND 10 #213: Change hard-coded process startups to configuration-driven
BIND 10 Development
do-not-reply at isc.org
Tue Oct 25 15:57:42 UTC 2011
#213: Change hard-coded process startups to configuration-driven
-------------------------------------+-------------------------------------
Reporter: shane | Owner: vorner
Type: | Status: reviewing
enhancement | Milestone:
Priority: major | Sprint-20111025
Component: Boss | Resolution:
of BIND | Sensitive: 0
Keywords: | Sub-Project: Core
Defect Severity: N/A | Estimated Difficulty: 9
Feature Depending on Ticket: | Total Hours:
Add Hours to Ticket: |
Internal?: 0 |
-------------------------------------+-------------------------------------
Comment (by vorner):
Hello
Replying to [comment:16 jinmei]:
> - kill_started_processes(): from a quick look at the library
> reference, os.kill(signal.SIGTERM) is less portable than
> Poepen.kill(), which was used for the original code (especially if
> we consider Windows support), and in that sense this is a
> regression. The same comment applies to other cases where os.kill()
> is used.
The killing was moved into the components which handle them in concrete
way.
> - read_bind10_config(): the method's doc string may need to be
> updated, depending on whether this work also means we are going to
> decide how/where a list of running components should be defined.
I believe that this ticket explicitly requires it to be in boss (it
doesn't seem doable to search for the correct configuration everywhere).
Besides, configuration for a module should be something the module itself
needs for correct working. As it is not the job of each component to start
itself, it doesn't seem to be the best place to put the configuration into
each respective component. But if you think we should talk this over on
the dev list, it is possible.
> - start_auth(), etc: While the comment above start_auth() seems to
> suggest the use of them in tests, they now are not used in
> tests (overkilling?). Also, as already stated, ideally I'd move
> these outside the bind10 module.
I'd like that too, in the long term. But as they use quite a lot from boss
internals, I think we should defer it.
> - start_all_processes(): is it okay to remove this?
> {{{
> - # Everything after the main components can run as non-root.
> - # TODO: this is only temporary - once the privileged socket
creator is
> - # fully working, nothing else will run as root.
> - if self.uid is not None:
> - posix.setuid(self.uid)
> }}}
Not really, I missed that one. But it's unhappy in the current state, it
doesn't do what it should, as there's no right place. We should finish the
socket creator soon enough.
> - stop_process(): I was not sure how we could remove this comment due
> to this branch:
> {{{
> - # TODO: Some timeout to solve processes that don't want to die
would
> - # help. We can even store it in the dict, it is used only as a
set
> }}}
Ah, that one was removed because it now belongs somewhere else. But it got
lost in the ether until it got there, so I put it there now.
> and this line (or the concept of "expected_shutdowns"):
> {{{
> - self.expected_shutdowns[process] = 1
> }}}
> (I'm not necessarily saying we can't. I simply didn't understand).
The expected shutdowns are no longer needed. A component is told to shut
down, which puts it into a stopped state. A failed is called only on the
running components. So the component remembers itself instead of relying
on external dictionary.
> - component_shutdown(): shouldn't this check be at the beginning of the
> method? At least before setting exitcode?
> {{{
> if not self.__started:
> raise Exception("Component failed during startup");
> }}}
No, it shouldn't, as this is not an exception from this method, but rather
two different ways to stop the boss, depending on when it happens. I added
a comment explaining what it does and why.
> - component_shutdown(): if I read the code correctly, `__stopping` can
> never be True in the context where this method can be called:
> {{{
> if self.__stopping:
> return
> }}}
> If the intent is to prevent a duplicate call, we'll probably need a
> dedicated attribute like "shutting_down", and set it to True in this
> own method.
I don't think it can be called twice currently and it probably wouldn't
hurt either. I don't know what I was thinking before, when I added it, so
I removed it.
> - reap_children(): I was not sure if it's safe to remove this:
> {{{
> - self.dead_processes[proc_info.pid] = proc_info
> }}}
The dead_processes were used for the delayed starts. We need to
reintroduce them sometime, but maybe in a completely different way. So
that's why they are removed from the code now.
> - restart_processes(): I was not sure if we can safely remove these:
> {{{
> - if proc_info.name in self.expected_shutdowns:
> - # We don't restart, we wanted it to die
> - del self.expected_shutdowns[proc_info.name]
> - continue
> }}}
> see also comments about stop_process().
Yes, the same explanation applies here ‒ it is handled differently now.
> - not actually about the spec itself, but when we specify the default
> for some configuration parameters, the application that uses them
> should get the default from the spec, rather than hardcoding them in
> the source code. One example:
> {{{
> component = creator(params.get('process', cname),
self.__boss,
> params.get('kind', 'dispensable'),
> params.get('address'),
> params.get('params'))
> }}}
The only one that had a default value was the kind, so I set it to
required instead of optional.
The rest is optional and has no default. These defaults are either „empty“
(eg. nothing there) in the code or guessed from something that changes for
each component (the process to start), which is impossible to state in the
spec file.
> - I'm not sure "b10-xxx" is the best choice for the component name.
> It seems more like a kind of component, rather than the program
> (file) name, so things like auth (or Auth) sound a bit better to
> me. But this is probably a minor preference matter.
That is because I use the name as the default value of process to start,
to save the user some typing when introducing a component. I'd like to
unify it with the address on the message queue somehow as well, so the
only thing needed to add a new component would be to type:
{{{
config add Boss/components b10-component
}}}
And be done.
> - likewise, while "params" are defined as a list of strings,
> - I can't find "priority" in the spec.
> - dhcp related config is missing, too.
What dhcp related config? I think the DHCP wasn't started before (and it
didn't really have a start_dhcp option yet). But if a user wants, he can
always add the component, right? There's no special need for DHCP. Or did
I overlook something?
> General comment: TestStartStopProcessesBob was completely removed and
> TestBossComponents was completely newly introduced. Both are pretty
> big, and the changes are not trivial, so I was not sure such changes
> in tests ensure safe refactoring. (For example, the original tests
> check whether auth/resolver are started based on some
> run-time/config-time conditions. Such kind of tests seem to be
> missing). Overall, I've not seen anything obviously wrong in the
> added tests, but I couldn't be confident that these tests are
> sufficient, and I was not really comfortable about that.
Well, it's simply because the old way of configuration no longer exists,
so there's no reason (or way) to test it. Instead of testing that if
there's a start_resolver, the resolver is started, we test that if there's
a component in configuration, that component is started (by passing it to
the configurator and by the configurator tests).
> - TestBossComponents.test_correct_run: about this point, my
> understanding is that we are expected to return an error answer in
such
> cases:
> {{{
> # Is it OK to raise, or should it catch also and convert to
error
> # answer?
> self.assertRaises(Exception, bob.config_handler,
> {'components': compconf})
> }}}
> See also the comment about config_handler() above.
I was thinking it should be the job of whatever calls the handler to catch
the exceptions and convert them to error messages. But I probably didn't
have a reason to expect that to do so.
I know I still haven't looked at the system tests, but I'm assigning the
current changes to you for now, as I don't expect to have time for it the
next 2 or 3 days :-|.
Thanks
--
Ticket URL: <http://bind10.isc.org/ticket/213#comment:23>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list