BIND 10 #640: cfgmgr hanging and command and configurations for no-longer-running components
BIND 10 Development
do-not-reply at isc.org
Fri Feb 3 17:38:14 UTC 2012
#640: cfgmgr hanging and command and configurations for no-longer-running
components
-------------------------------------+-------------------------------------
Reporter: jreed | Owner: jinmei
Type: | Status: reviewing
defect | Milestone:
Priority: major | Sprint-20120207
Component: | Resolution:
Unclassified | Sensitive: 0
Keywords: | Sub-Project: DNS
Defect Severity: N/A | Estimated Difficulty: 0.0
Feature Depending on Ticket: | Total Hours: 0
Add Hours to Ticket: 0 |
Internal?: 0 |
-------------------------------------+-------------------------------------
Comment (by jelte):
Replying to [comment:7 jinmei]:
> I've not yet reviewed everything, but as I'm taking a day off tomorrow
> I'll provide feedback on the things I've covered so far.
>
Ack, thanks so far :)
> First off, it was not clear which part of the original report was
> solved. I see the potential that this set of changes can make bindctl
> more responsive in the reported scenario, but was it identified why
> cfgmgr was hanging (or did it really hang)? And does this branch
> solve that problem? What about the traceback output? (BTW I
> understand this branch cannot solve the issue of configuring inactive
> component).
>
Oh, sorry for not being clear; that ConfigManager timeout was actually
closely related; the timeout values for communication from bindctl to
configmanager and from configmanager to any other (in this case non-
running) module are the same; so the 'configmanager is not responding' was
not because configmanager was hanging, but because it was simply waiting
for a timeout itself. So this would address the root cause, but not that
specific problem (we probably need to make cmdctl->cfgmgr timeouts a bit
higher than the rest).
> A related point is that since this is an issue concerning interactions
> of multiple processes, unittests wouldn't be enough to be sure whether
> the problem was solved. It may be okay to defer it to a separate
> ticket, but I think we need a system level test for this kind of
> thing.
>
oh yeah, good point. I've added a few steps to terrain/bind10_control.py
that I needed, and then cleaned it up a bit.
Added a new feature file 'bindctl_commands.feature', which currently only
contains this test (run basic setup, remove a number of modules, and check
if they are not shown anymore)
> Next, I made a few trivial changes to the branch.
>
> '''ccsession.cc'''
>
> - Not a big deal, but `sendStopping()` is a private method, not very
> big, and used only within the ModuleCCSession destructor. Is it
> worth a separate method?
I personally prefer it in a separate method; not so much because of its
size, but because the action it performs is not directly related to local
resource cleanup, and is kind of a side effect, so constructing the
message and sending it seems (imho) nicer in a separate method. No strong
opinion though (side not, i even considered making it a public call, and
having some check to see if it had been called already in the destructor,
so the client code can decide when it should be called. But that did not
seem worth it).
> - logging itself could cause an exception, so the destructor could
> still throw. However, this is not the only point that has this
> problem, so maybe we can leave it open for now...
>
hmm, imo we should either make logging calls exception-free or add
exception-free ones, since it is usually about the only thing you can do
in a destructor.
> '''ccsession.h'''
>
> - I thought we decided to use C++ style comments for doxygen
oops, fixed
> - In our convention we'd explicitly add 'virtual' (for clarity) to the
> destructor declaration.
>
actually, it's even needed for technical reasons for classes that are
subclassed, but this one isn't supposed to be subclassed afaik I realize
we can't stop people from doing it, but personally i tend to look at the
virtualness of the destructor to see if it should be subclassed in the
first place. Of course, one could argue that it would be better to add it
when it's not necessary than to not have it when it is, so if you insist
i'll make it virtual.
> '''config_message.mes'''
>
> If not yet done, you may want to reorder the messages using
> tools/reorder_message_file.py
>
done (it only moved an older one)
> '''ccsession_unittests.cc'''
>
> - maybe a matter of taste, but the intermediate block just to have a
> separate scope looks a bit awkward. I'd do something like this:
> {{{#!c++
> scoped_ptr<ModuleCCSession> mccs(new ModuleCCSession(
> ccspecfile("spec2.spec"),
> session, NULL, NULL,
> true, false));
> ...
> mccs.reset(); // this will invoke destructor
> }}}
> At least I think some comments about the intent of the scope would
> be necessary.
changed
> - it would be nicer if we could test the case where the destructor
> throws.
>
ack, added a simple throw_on_send option to fakesession, and added test
> '''ccsession.py'''
>
> - send_stopping: while the pydoc says "any errors are logged", only
> SessionError is caught. Is that intentional? Can we assume
> create_command() doesn't raise an exception?
Oh, right, made it except Exception. create_command can raise as well, but
i don't really want to catch those; imo they can only be programmer errors
(well, except for out of mem of course), and should fail as loudly as
possible. Added a comment to that effect.
> - send_stopping: I believe you don't need 'str' for 'se'. (might not
> be a big deal in this case, but delaying 'str' could possibly help
> improve performance).
>
ack, changed
> '''cfgmgr.py'''
>
> - if `_handle_module_stopping` is private, should it better be
> prefixed with `__`? Same for `_send_module_spec_to_cmdctl`.
as you can see from the rest of the code here, i tend to only use the one
when I don't think the class/method name mangling is necessary. Should we
start doing this as a principle? (and if so, should we update all
'private' methods?)
> - `_handle_module_stopping` returns None, saying "This command is not
> expected to be answered". But returning None doesn't seem to
> suppress sending a response.
ah doh, fixed.
> - Not directly related to this ticket, but the if-else in handle_msg
> is getting too big to understand/manage. I guess it's time to
> consider refactoring, maybe in some OO way.
>
depends on how; as it is now, it is indeed getting long, but is
essentially a switch. I'm open to proposals, but some 'smart' solutions
might actually be more complicated :p
> '''ccsession_test.py'''
>
> - test_stop: not necessarily a problem, but it seems the same pattern
> repeats in multiple tests. It would be better to unify the common
> sequence of tests for conciseness.
Yes, actually, more generally, this is partly why I was looking at the
callcounter thingy a few nights ago; i am working during my down-hours on
a slightly more general version of that, and I think we can replace a lot
of 'checking if stuff has been called' tests with such a beasty. But I
guess you're not just talking about that. I actually think we should not
just unify more test code; we can probably do a much better job of
unifying the common patterns in the modules themselves as well (and
unified test code should then kind of roll out as part of that work).
--
Ticket URL: <http://bind10.isc.org/ticket/640#comment:9>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list