[bind10-dev] proposal: revising listen port configuration
JINMEI Tatuya / 神明達哉
jinmei at isc.org
Thu Mar 29 17:45:45 UTC 2012
(Also related to #1707 http://bind10.isc.org/ticket/1707)
The current implementation of configuring listen ports has a number of
problems:
- It involves communication with the socket creator via the boss.
This communication can block and introduce service disruption.
Also, even releasing (closing) a socket, which is generally
considered a kind of no-fail operation, can fail due to a
communication error.
- We could avoid the delay if we make the communication asynchronous,
but then we'd not be able to return the configuration result to the
cfgmgr in a timely fashion. So, for example, if opening a socket
fails, the state in cfgmgr can be inconsistent with that of
b10-auth/resolver/or-other-apps.
- When the listen_on configuration is updated, the current
implementation first releases all open sockets, and then tries to
open (acquire) sockets for the specified ports. This is wrong for
several reasons: first, it causes a temporary service disruption
(which would be a real problem for busy servers). Further, the
disruption can become permanent if the update re-acquires a socket
for a port that was already in use and that second acquisition
fails.
(I think there are already tickets for some of the issues).
I'd propose the following approach to address these issues:
- the application (b10-auth etc) maintains a list of "configured
ports", which is basically a copy of what's specified in listen_on.
Normally these ports are opened and used by that app, but may not
necessarily have to be so. So each port in the list should also
have some kind of flag that indicates whether it's opened or not.
- configuring listen_on now means updating this list and always
succeeds, whether or not acquiring the socket succeeds.
- on an update to listen_on, the application identifies
  - ports for which new sockets have to be acquired. These include
    ports newly given by the update and ports already in the list that
    don't have associated sockets at that time for some reason.
  - ports that currently have sockets and are to be removed from the list.
  - ports to be removed from the list that currently don't have
    sockets.
and tries to acquire sockets for the first set of ports, tries to
release the sockets for the second set of ports, and simply removes
the third set from the list. If an attempt to acquire a new socket
fails, it marks that port in the list accordingly and logs that fact
(but keeps it in the list). If an attempt to release a socket
fails, it's considered a fatal error and the application should be
terminated (the assumption is that this should be a very rare event;
in fact, when it happens the entire system will be in some
inconsistent state, and it's pretty difficult to recover anyway).
- eventually we'll make communication with the socket creator
asynchronous.
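
To make the proposed update procedure concrete, here is a rough sketch in
Python. It is only an illustration, not existing BIND 10 code: the names
ListenPorts, SocketError, acquire and release are hypothetical, a "port"
stands for whatever key listen_on really uses (e.g. an address/port pair),
and the socket creator is reduced to two callables supplied by the caller.

```python
import logging

logger = logging.getLogger("listen_on_sketch")


class SocketError(Exception):
    """Hypothetical error for a failed acquire or release request."""


class ListenPorts:
    """Configured ports, each with its socket or None if not opened."""

    def __init__(self, acquire, release):
        # Assumed interfaces: acquire(port) returns a socket token,
        # release(sock) gives it back; both raise SocketError on failure.
        self._acquire = acquire
        self._release = release
        self.ports = {}  # port -> socket token, or None when not open

    def update(self, new_ports):
        """Apply a new listen_on list; acquisition failures are not errors."""
        new_ports = set(new_ports)
        # Set 1: newly given ports, plus listed ports lacking a socket.
        to_acquire = [p for p in sorted(new_ports)
                      if self.ports.get(p) is None]
        # Set 2: ports to be removed that currently have a socket.
        to_release = [p for p, s in self.ports.items()
                      if p not in new_ports and s is not None]
        # Set 3: ports to be removed that have no socket.
        to_forget = [p for p, s in self.ports.items()
                     if p not in new_ports and s is None]

        for port in to_acquire:
            try:
                self.ports[port] = self._acquire(port)
            except SocketError as ex:
                # Keep the port in the list, flagged as not opened.
                self.ports[port] = None
                logger.error("failed to open port %s: %s", port, ex)

        for port in to_release:
            try:
                self._release(self.ports[port])
            except SocketError as ex:
                # Treated as fatal: the system state is inconsistent.
                logger.critical("failed to release port %s: %s", port, ex)
                raise SystemExit(1)
            del self.ports[port]

        for port in to_forget:
            del self.ports[port]
```

Note that update() itself never reports an acquisition failure to its
caller, which is what lets the configuration update "always succeed" from
the cfgmgr's point of view. Sockets for ports that stay in the list are
never touched, so there is no window in which a working port is closed.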
So, as a result, the server will keep running even if some (or even
all) specified ports fail to open. This behavior matches that of
BIND 9, so it also makes sense in terms of compatibility.
---
JINMEI, Tatuya