Strange recursor response time pattern

Havard Eidnes he at uninett.no
Tue Sep 5 14:44:36 UTC 2017


Hi,

one of my users made me aware of this rather unexpected behaviour
in our recursors running BIND 9.10.5-P3 on NetBSD/amd64:

It seems that every 10s, "on the clock", BIND will temporarily
increase the query response time rather drastically for a short while,
only to settle down to normal behaviour until the next 10s event.

A test I did to replicate the result was to run a shell loop with
a small sub-second sleep (yes, nonstandard):

while sleep .1; do
   date +%s
   dig @$recursor $query A +noadd +noauth
done > /tmp/out

Do this with a query for a name which exists and is in-cache.
Let it run for a while.  Then do:

awk '/Query time:/ { qt=$4 } /^1/ { print $1 " " qt }' /tmp/out > /tmp/qt

Plot the result and see a rather wide variation in response time.

Next do

awk '{ print $1 % 10 " " $2 }' /tmp/qt > /tmp/bins

and put the result through a small perl script (attached) to give
some basic statistics, and in my case I see:

Ix    mean   stddev   median min max
0 0.083303 0.555366 0.000000 0 16
1 0.071531 0.353116 0.000000 0 11
2 0.130875 3.170423 0.000000 0 166
3 0.075922 0.467403 0.000000 0 18
4 0.167146 3.928216 0.000000 0 167
5 0.069928 0.357949 0.000000 0 11
6 0.148990 3.536247 0.000000 0 183
7 0.090478 1.786360 0.000000 0 92
8 47.902799 109.998464 0.000000 0 479
9 0.080216 0.373228 0.000000 0 12

This I find rather surprising!

This measurement is based on some 27k samples, measured against a BIND
which processes 6-800 qps, so not overly busy, but similar behaviour
is also observed with BINDs which process only 50-150 qps, so the
query load does not appear to affect this significantly -- the result
for the two other BIND installations I've looked at were (based on 24k
and 20k samples):

Ix    mean   stddev   median min max
0 15.031710 0.736819 15.000000 14 46
1 15.041801 1.359741 15.000000 14 77
2 15.003519 0.248714 15.000000 14 20
3 15.008326 0.258005 15.000000 14 19
4 15.018990 0.447535 15.000000 14 30
5 15.007415 0.289008 15.000000 14 20
6 31.504860 42.598276 15.000000 14 307
7 15.128158 3.613397 15.000000 14 133
8 15.055274 2.329867 15.000000 14 123
9 15.011628 0.305681 15.000000 14 20

and

Ix    mean   stddev   median min max
0 9.015588 2.394387 9.000000 8 121
1 8.994696 2.334736 9.000000 8 124
2 8.966915 0.985907 9.000000 8 55
3 8.998417 2.259316 9.000000 8 122
4 24.782942 42.980507 9.000000 8 265
5 8.961137 0.363112 9.000000 8 17
6 8.947259 0.225413 9.000000 8 10
7 9.048827 3.174344 9.000000 8 128
8 9.004134 2.429312 9.000000 8 128
9 9.010314 2.294347 9.000000 8 121

Any idea what might be causing this?  Anything I can do to remedy
the behaviour?

Best regards,

- Håvard
-------------- next part --------------
#! /usr/bin/perl

use strict;
use Statistics::Descriptive;

our(@d);

while(<>) {
    chomp;
    @_ = split;
    my($i) = $_[0];
    my($v) = $_[1];
    if (!defined($d[$i])) {
	$d[$i] = ();
    }
    push(@{ $d[$i] }, $v);
}

printf("Ix    mean   stddev   median min max\n");
foreach my $i (0 .. $#d) {
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@{ $d[$i] });
    
    my $mean = $stat->mean;
    my $stddev  = $stat->standard_deviation;
    my $median  = $stat->median;
    my $min = $stat->min;
    my $max = $stat->max;

    printf("%d %f %f %f %d %d\n", $i, $mean, $stddev, $median, $min, $max);
}


More information about the bind-users mailing list