[bind10-dev] interaction between 'active' module list and multi-process modules

Mon Feb 6 12:28:36 UTC 2012

On 06/02/2012 10:54, Jelte Jansen wrote:

> 1) Treat any extra instance of a module as a second class citizen; 
> they 'steal' their config from the first instance, and skip part
> of the normal module protocol. For starters, they would not send a 
> stopping message, and they would not respond to config changes
> (they would apply them, but they would assume the first instance
> reports any errors). In effect, this would mean that for
> configuration and command purposes, it would appear to the system
> that there is only one process for this module, and the others
> would (pretty quietly) run in the background.

I don't really like this idea - I think it will introduce unnecessary
complications.

How are secondary instances started - via the Boss or does the primary
instance fork them?  What happens if the primary instance stops and is
restarted - how does the Boss know not to start up other secondary
instances? etc.

> 2) Provide any instance with its own name; possibly auto-generated 
> (Auth1, Auth2, Auth3), but maybe even configurable. So instead of 
> running 'Auth' (or any module) multi-process, you'd actually
> configure multiple modules, that happen to be the same binary. This
> would require more work; we'd have to modify the basic code so that
> modules get told what their names are, and figure out how to handle
> and respond to naming conflicts.

We could pass the process number through the command line - just start
a counter in Boss and increment it for every process started.  This
guarantees a unique name for every process within a single run of BIND 10.
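
As a rough sketch of the counter idea (not actual Boss code; the class name
ProcessNamer and the "ModuleN" naming format are assumptions made purely for
illustration), a single counter shared by all modules could look like this:

```python
import itertools


class ProcessNamer:
    """Hypothetical helper: hands out process names unique within one run."""

    def __init__(self):
        # One counter for the whole run, shared by all modules.
        self._counter = itertools.count(1)

    def next_name(self, module):
        # The number is never reused, so names stay unique even after
        # a process dies and a replacement is started.
        return "%s%d" % (module, next(self._counter))


namer = ProcessNamer()
names = [namer.next_name("Auth") for _ in range(3)]
other = namer.next_name("Resolver")
print(names, other)
```

Because the counter is shared rather than kept per module, the numbers say
nothing about how many instances of a module exist; they only identify a
process.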

> It would also require more configuration activity to run them. But
> one could then configure 2 Auths to do very different things.

I would be wary of this.  Multiple Auths are there to improve
concurrency - having different configurations would be a can of worms.
For example, as they will all read from the same socket, it will be
an accident of timing that causes a specific instance to read a
particular query, giving essentially random behaviour in the responses.

> 3) Do some form of reference counting. If X modules of type Y
> start up, there should be X messages to that effect, and when they
> close, and the number does not drop to 0, it should still be
> considered an active module. This has the advantage of 'hiding'
> extra instances, but we'd need to make the messaging much more
> robust. Currently, if a module simply gets killed, it wouldn't send
> a 'stopping' message. We can make Boss respond to process deaths in
> this sense, but Boss is not necessarily aware of every module (you
> can still start them by hand). So the only reliable way would be on
> the level of the messaging system (msgq does notice when clients
> disconnect, and I assume so would any msgq replacement), but
> especially right now, msgq operates on an abstraction level below
> 'modules', so putting in module-level communication there seems
> like a break in abstraction (i think there would be nice ways
> around it btw, but haven't really thought about it much).

I don't like the idea of hiding the extra processes, I think that will
lead to complications in the future.  Instead I propose that we
clearly separate the concepts of modules and processes. From the user's
point of view, the system comprises different modules, and users interact
with modules the same way they do now.  Internally, the system
associates a number of processes with each module, and translates
module actions into actions on all related processes.
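
To illustrate the translation step, here is a minimal sketch assuming the
module-to-process mapping is held as a plain dictionary.  The function name
apply_to_module and the callable "action" are hypothetical; in a real system
the action would be a message sent over the command channel.

```python
def apply_to_module(processes, module, action):
    """Apply one module-level action to every process of that module.

    processes: dict mapping module name -> list of process names
    action:    callable taking a process name and returning a result
    """
    return {name: action(name) for name in processes.get(module, [])}


processes = {"Auth": ["Auth1", "Auth2"], "Resolver": ["Resolver3"]}
results = apply_to_module(processes, "Auth",
                          lambda name: "config sent to " + name)
print(results)
```

The user only ever names "Auth"; the per-process results are collected
internally and could be merged into a single module-level answer.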

In the case above, the Boss process will need to hold the concept of
modules. Ignoring processes started by hand, Boss would start "n"
instances of Auth, each with its own name, and keep track of each
process.  If an Auth process dies, it starts another with a different
unique number.  If the user raises the number of Auth processes
required, Boss will need to start additional ones; if they lower it,
Boss will need to stop some.
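
The bookkeeping described above could be sketched as follows.  This is only
a model under stated assumptions: ModuleSupervisor and its methods are
hypothetical names, no real processes are started or stopped, and the choice
to stop the most recently started process first is arbitrary.

```python
import itertools


class ModuleSupervisor:
    """Hypothetical model of Boss tracking processes per module."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.wanted = {}    # module name -> number of processes required
        self.running = {}   # module name -> list of process names

    def _start(self, module):
        # A real Boss would spawn the process here; we only record it.
        name = "%s%d" % (module, next(self._counter))
        self.running.setdefault(module, []).append(name)
        return name

    def set_count(self, module, n):
        """User raised or lowered the number of processes required."""
        self.wanted[module] = n
        procs = self.running.setdefault(module, [])
        while len(procs) < n:
            self._start(module)
        while len(procs) > n:
            procs.pop()     # a real Boss would stop this process

    def process_died(self, module, name):
        """Replace a dead process with one carrying a new unique number."""
        self.running[module].remove(name)
        if len(self.running[module]) < self.wanted.get(module, 0):
            return self._start(module)
        return None


boss = ModuleSupervisor()
boss.set_count("Auth", 3)
replacement = boss.process_died("Auth", "Auth2")
after_death = list(boss.running["Auth"])
boss.set_count("Auth", 2)
after_lower = list(boss.running["Auth"])
print(replacement, after_death, after_lower)
```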

At present, user interaction is at the module level, but we may want
to extend this to the process level, e.g.

Auth shutdown [n]

Without the argument, the command shuts down all Auth processes; with
the argument it shuts down only Auth process "n".
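
A minimal sketch of how the optional argument might be resolved to a set of
processes, assuming the running processes are indexed by their unique
number (the function shutdown_targets and the error handling are
illustrative assumptions, not an existing interface):

```python
def shutdown_targets(running, n=None):
    """Return the names of the processes a shutdown command applies to.

    running: dict mapping process number -> process name
    n:       optional process number from the command line
    """
    if n is None:
        # No argument: the command applies to every process of the module.
        return [running[k] for k in sorted(running)]
    if n not in running:
        raise ValueError("no such process: %d" % n)
    return [running[n]]


running = {1: "Auth1", 3: "Auth3", 4: "Auth4"}
print(shutdown_targets(running))
print(shutdown_targets(running, 3))
```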

Stephen