BIND 10 #775: b10-auth should not exit if it cannot bind to ports
BIND 10 Development
do-not-reply at isc.org
Mon Apr 18 09:31:59 UTC 2011
#775: b10-auth should not exit if it cannot bind to ports
-------------------------------------+-------------------------------------
Reporter: shane | Owner: hanfeng
Type: defect | Status: reviewing
Priority: critical | Milestone: Sprint-20110419
Component: b10-auth | Resolution:
Keywords: | Sensitive: 0
Estimated Number of Hours: 0.0 | Add Hours to Ticket: 0
Billable?: 1 | Total Hours: 0
Internal?: 0 |
-------------------------------------+-------------------------------------
Changes (by vorner):
* owner: vorner => hanfeng
Comment:
Hello
Replying to [comment:9 hanfeng]:
> Replying to [comment:8 vorner]:
> > So, in short, the throw must stay there. In case of the first startup,
> > if it throws, we might want to catch it and not exit the whole program,
> > at least as a short-term workaround, before we make it possible to
> > configure things even when they are not running.
> For this point, I don't agree. If we restart every time port binding
> fails, it will make the boss quite busy. During our last test, we could
> see the server launch and quit several times. The auth server is neither
> alive nor dead but keeps jumping between the two, which is quite
> terrible; you don't even get a chance to modify the configuration by hand.
Well, for one, it wouldn't get so busy; one restart every 10 seconds isn't
busy (but it can be annoying, of course).
Anyway, the abort is not responsible for the jumping. If the process is
starting up and it can't bind to the ports, the rollback is to the empty
set of addresses, so the second exception cannot happen, and therefore the
abort can't happen either. What did kill the process was the first
exception, which you now catch.
The effect of the abort is this: if the user changes the configuration at
runtime and it fails, the server tries to return to the original addresses.
If that fails as well (which it should not, in reality), there's some
serious problem. So in that case it aborted, making the boss restart it
(with the old config). That should work, because it already worked some
time before. But if that fails as well, it rolls back to the empty set of
sockets, you catch the exception, and it sits there. So it would jump only
once, and only in a really improbable situation.
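The control flow described above can be modelled roughly as follows. This
is only a minimal Python sketch, not the b10-auth code (which is C++):
BindError, FatalRollbackError, apply_addresses, start, make_binder and the
bind_all callable are all hypothetical names made up for illustration, and
the fatal exception merely stands in for the abort.

```python
class BindError(Exception):
    """Raised by the hypothetical binder when an address cannot be bound."""


class FatalRollbackError(Exception):
    """Stands in for abort(): the new set and the rollback both failed."""


def apply_addresses(bind_all, current, new):
    """Switch from `current` to `new` and return the addresses in use.

    `bind_all(addresses)` is an assumed callable that replaces whatever is
    bound with `addresses` and raises BindError if that is not possible.
    """
    try:
        bind_all(new)
    except BindError:
        try:
            # Roll back. On first startup `current` is empty, so there is
            # nothing to bind and this step cannot fail.
            bind_all(current)
        except BindError as second:
            # The second exception: this is where the real server aborts.
            raise FatalRollbackError("rollback failed") from second
        # Rollback worked; the first exception still reaches the caller.
        raise
    return list(new)


def start(bind_all, configured):
    """First startup: catch the first exception instead of exiting."""
    try:
        return apply_addresses(bind_all, [], configured)
    except BindError:
        # Temporary workaround: stay alive with no sockets bound.
        return []


def make_binder(unavailable):
    """Build a fake binder that refuses the addresses in `unavailable`."""
    def bind_all(addresses):
        for address in addresses:
            if address in unavailable:
                raise BindError(address)
    return bind_all
```

With this model, start(make_binder({"127.0.0.1:53"}), ["127.0.0.1:53"])
returns an empty list rather than raising, and FatalRollbackError can only
come out of apply_addresses when the previous, non-empty set fails too.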
But, having explained the situation, I don't really care much about it;
it's rare. In the long term, we should rewrite the changing of sockets so
that the old ones are released only after the new ones are successfully
bound, so we wouldn't have to care about it (i.e. the rollback couldn't
throw).
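A rough sketch of that long-term idea, again in Python and purely
illustrative: switch_sockets and the open_socket callable are hypothetical
names, and open_socket is assumed to return an object with a close() method
or raise OSError when an address cannot be bound.

```python
def switch_sockets(open_socket, old_sockets, new_addresses):
    """Bind every new address first; close the old sockets only on success.

    If any new address fails, the sockets opened so far are closed again and
    the old sockets are left untouched, so no rollback step is needed.
    """
    fresh = []
    try:
        for address in new_addresses:
            fresh.append(open_socket(address))
    except OSError:
        # Undo the partial work; the old set is still bound and serving.
        for sock in fresh:
            sock.close()
        raise
    # All new sockets are bound, so releasing the old ones is now safe.
    for sock in old_sockets:
        sock.close()
    return fresh
```

One caveat with this ordering: an address that appears in both the old and
the new set usually cannot be bound a second time while the old socket is
still open, so a real implementation would probably have to keep the
existing socket for unchanged addresses instead of reopening it.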
> As for the last throw in port config, I have restored it.
ACK. I don't really like the catch-all there, or the fact that the server
would be sitting there in a completely useless way (and, in fact, in a
somewhat inconsistent configuration). But given the current problem with
configuration, it's probably the lesser evil. So, could you add a comment
around it saying that it's a temporary solution and should be removed
once we are able to configure modules while they are not running?
And maybe it should have a changelog entry.
Otherwise, it is OK.
Thanks
--
Ticket URL: <https://bind10.isc.org/ticket/775#comment:11>
BIND 10 Development <http://bind10.isc.org>