[bind10-dev] Thoughts on Architecture of Xfrin & Notify-In/Out

Shane Kerr shane at isc.org
Mon May 17 04:53:48 UTC 2010


Likun,

On Mon, 2010-05-17 at 11:28 +0800, zhanglikun wrote:
> > Why are you choosing to use threads here, when we choose not to use
> > threads in something like the auth server, but instead choose to use
> > more than one process?
> > 
> > That is, what does using threads gain over multiple processes?
> 
> I think the number of auth srv processes is fixed, no matter how many
> queries come in (can the number of auth srv processes be set at
> runtime?), but the xfrin processes are different.

Exactly.

For auth we are busy processing, so having a small number of processes
per core makes sense (1 or 2). For xfrin we are mostly waiting on IO, so
we want many more processes/threads/whatever than we have CPUs.
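
To make the comparison concrete, here is a rough sketch of what a
thread-per-transfer model looks like in Python (this is not our actual
xfrin code, and the names are made up):

    import threading

    def transfer_zone(zone_name, master_addr):
        # placeholder for the real AXFR/IXFR logic; this is where the
        # thread spends most of its time blocked on network IO
        pass

    def start_transfer(zone_name, master_addr):
        # one lightweight thread per transfer-in; the GIL is not a big
        # deal here because the threads are almost always waiting on IO
        t = threading.Thread(target=transfer_zone,
                             args=(zone_name, master_addr))
        t.start()
        return t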

> If each transfer-in gets its own process, my concern is that there
> could be too many xfrin processes in the worst case. Is there any
> other reason?

Well, the other concern is administrator annoyance. For example, this is
annoying for people logged into a box:

$ ps ax
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:01 /sbin/init
    2 ?        S      0:00 [kthreadd]
    3 ?        S      0:00 [migration/0]
       .
       .
       .
45211 ?        S      0:01 b10-xfrin fromit.example
45213 ?        S      0:00 b10-xfrin sirspamalot.example
45214 ?        S      0:02 b10-xfrin xn--yus051k.example
45215 ?        S      0:02 b10-xfrin cheese.example
       .
       .
       .
     [1000 more lines saying "b10-xfrin"]

BUT... if we assume reasonable zone refresh rates and/or relatively
decent connectivity, then this won't be such a big problem. Consider
that with 100k zones and a 2-hour refresh interval, we should see only a
few tens (20? 30?) of transfers in progress during normal operation.
Someone with 100k zones won't be surprised to see lots of processes! ;)
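
The back-of-the-envelope arithmetic behind that guess (the 2-second
average transfer time is just an assumption):

    zones = 100000
    refresh_interval = 2 * 3600         # seconds
    avg_transfer_time = 2.0             # assumed seconds per transfer

    starts_per_second = zones / float(refresh_interval)   # ~14 per second
    in_flight = starts_per_second * avg_transfer_time     # ~28 concurrent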

A further complication is what happens in ABNORMAL situations. A simple
example is when the server starts. One improvement that BIND 10 will
have over BIND 9 is that we will be able to maintain zone timers across
runs, so we don't need to do an SOA check for *every* zone we are slave
for. But if a server is offline for a couple of hours (for hardware
maintenance perhaps, or if it is in California and the power goes out),
then a server slaving 100k zones with a 2-hour refresh will have to
refresh all of them at once. The same could happen after a network
outage, or in many other scenarios.

So it is not always possible to limit the number of XFRs in progress to
a "reasonable" number, which is why I think a threaded (or event-driven)
model makes sense here.



What we lose compared to a multiple-process model is the ability to use
multiple CPUs, and also fault isolation (possibly including isolation of
security bugs).

If we decide that using multiple CPUs is a worthwhile benefit, we can
either switch to a one-process-per-XFR model, or move to a mixed model
where we run a number of processes based on the number of cores, each
with multiple threads. This is potentially complicated, so we should
base any such change on a demonstrated (benchmarked) benefit.

Fault isolation is possibly a bigger deal. "Fate sharing" could be a
serious problem. For example, I can imagine a bug triggered by a small
zone that also takes down a big, slow XFR sharing the same process,
making it impossible to ever complete that XFR, so the zone eventually
goes lame.

You could use the mixed model above, with a knob to control the number
of threads per process. For "total safety" you set it to 1 thread per
process. For "minimal annoyance" you set it to 1 million threads per
process.
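
For illustration only, a rough sketch of that mixed model in Python
(the names and the queue-based hand-off are made up, not a design):

    import multiprocessing
    import threading

    THREADS_PER_PROCESS = 16   # the knob: 1 = "total safety",
                               # something huge = "minimal annoyance"

    def transfer_zone(zone_name, master_addr):
        # placeholder for the real AXFR/IXFR logic
        pass

    def run_one(zone_name, master_addr, sem):
        try:
            transfer_zone(zone_name, master_addr)
        finally:
            sem.release()

    def worker(transfer_queue):
        # one of these processes per core; each keeps at most
        # THREADS_PER_PROCESS transfers in flight at any time
        sem = threading.BoundedSemaphore(THREADS_PER_PROCESS)
        while True:
            zone_name, master_addr = transfer_queue.get()
            sem.acquire()
            threading.Thread(target=run_one,
                             args=(zone_name, master_addr, sem)).start()

    def start_workers(num_processes):
        q = multiprocessing.Queue()
        for _ in range(num_processes):
            multiprocessing.Process(target=worker, args=(q,)).start()
        return q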


The XFR code is relatively simple now, and if it remains simple we can
be more sure that it is robust. I propose we go forward with the
threaded model until we discover that it is broken, and then fix it if
necessary.

--
Shane



