Timeouts during cache cleaning and zone collection

Auer, Karl James karl.auer at id.ethz.ch
Mon Jun 20 09:05:06 UTC 2005


Hi there.

We are seeing a problem with BIND 9.3.0, compiled with threading on
Solaris, whereby the servers stop answering queries for a couple of
seconds. Queries in this interval time out. That is, they are not
answered slowly, they are not answered at all.
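
For anyone who wants to reproduce the measurement, here is a minimal
sketch of a probe that sends one query at a time over UDP and reports
any that get no answer at all. It is not our actual monitoring; the
server address 192.0.2.53, the name example.com, the one-second timeout
and the function names build_query and probe are all assumptions made
for illustration.

```python
import socket
import struct
import time


def build_query(name, query_id):
    """Build a minimal DNS query packet for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, other counts zero
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server, name, timeout=1.0, query_id=1):
    """Send one query over UDP; return round-trip seconds, or None on timeout."""
    packet = build_query(name, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(packet, (server, 53))
        try:
            while True:
                data, _ = sock.recvfrom(512)
                # Accept only a reply that carries our query ID
                if len(data) >= 2 and struct.unpack("!H", data[:2])[0] == query_id:
                    return time.monotonic() - start
        except socket.timeout:
            return None


if __name__ == "__main__":
    # Hypothetical target; replace with a real secondary and a name it serves.
    SERVER, NAME = "192.0.2.53", "example.com"
    for i in range(300):
        rtt = probe(SERVER, NAME, query_id=i & 0xFFFF)
        if rtt is None:
            print(time.strftime("%H:%M:%S"), "no answer within timeout")
        time.sleep(0.2)
```

Run alongside a zone reload or a cache cleaning pass, a probe like this
should show whether queries are merely slow or are dropped entirely.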

The servers do this a) when they clean their caches and b) when they are
downloading zones.

Archived messages on the matter of cache cleaning suggest that these
timeouts are normal for BIND, and that the only way to avoid them is to
turn cache cleaning off. I've tried setting the cleaning interval to
only a few minutes, but it just caused more timeouts - there seems to be
a sort of minimum interruption due to cache cleaning.

Of more concern are the interruptions due to zone downloads. We have a
(poorly designed) system whereby our zone files are generated from a
database if required; the zone files are completely rewritten, with a
new serial number, and a hidden master is reloaded. That causes a bunch
of notifies to hit the secondaries, which then reload those zones from
the hidden master. That process causes these timeouts. The secondaries
are the nameservers that field all our queries. Note that about 400
zones are downloaded, though most are very small or even empty. There
are a couple of larger ones, but even there we are only loading 50000 or
so entries.

We have not separated authoritative and caching nameserver functions.
The affected secondaries handle internal and external queries, but are
not under a particularly heavy load. Even if we did separate the
functions, it wouldn't help (I think) because the caching issue would
still be there on the caching servers, and the download issue would
still be there on the authoritative servers. We want to separate the
functions anyway, for all the usual reasons.

Note that this problem isn't new - it's just that our monitoring has
improved :-)

So my question: Is it normal for a BIND server to stop answering queries
during zone downloads? If not, what might be the problem here?

Regards, K.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (karl.auer at id.ethz.ch)       Geschaeft/work +41- 1-6327531
Kommunikation, ETHZ RZ                    Privat/home +41-43-2660706
Eidgenoessische Technische Hochschule, Zuerich    Fax +41- 1-6321225
Clausiusstrasse 59 CH-8092 ZUERICH Switzerland