[bind10-dev] initial memo on recursive server performance research
JINMEI Tatuya / 神明達哉
jinmei at isc.org
Wed Jul 4 06:41:23 UTC 2012
I've been working on some experiments on recursive server
performance (as Shane reported here a few days ago). I don't have any
exciting results yet, but here are a couple of initial findings.
1. Performance of an "ideal" implementation
I've written an imaginary "ideal" server implementation that does
basically nothing except network I/O, and measured its maximum
possible QPS on a test machine. The "server" is not useful for any
practical purpose, but its performance is still informative in that it
represents the highest possible performance of our real
implementation. The server still behaves somewhat like a recursive
DNS server: it checks the first byte of each incoming "query", and
either immediately sends back a response of a given size or "forwards"
the query to a pre-specified address. It opens another socket for
receiving "responses" to the forwarded packets, and, as with queries,
it either responds to the "original client" or repeats the
"forwarding". By controlling the content of the test packets, we can
control the imaginary "cache hit rate" and the number of "recursions"
needed on a cache miss. A rough sketch of this loop follows.
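For illustration, here is a minimal sketch of what such a loop could
look like (Python for brevity; the addresses, ports, response size
and packet format are all made up for the sketch, and this is not the
actual test program):

    import collections
    import select
    import socket

    RESPONSE = b'x' * 512            # canned "response" of a fixed size
    UPSTREAM = ('192.0.2.1', 5300)   # pre-specified "forward" address

    qsock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    qsock.bind(('0.0.0.0', 5300))    # receives "queries" from clients
    fsock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    fsock.bind(('0.0.0.0', 5301))    # receives "responses" to forwards

    clients = collections.deque()    # original clients of forwarded packets
    while True:
        for sock in select.select([qsock, fsock], [], [])[0]:
            pkt, peer = sock.recvfrom(4096)
            # recover the original client (assumes in-order "responses")
            client = clients.popleft() if sock is fsock else peer
            if pkt[0] == 0:          # first byte 0: "cache hit", answer now
                qsock.sendto(RESPONSE, client)
            else:                    # "forward" and wait for a "response"
                clients.append(client)
                fsock.sendto(bytes([pkt[0] - 1]) + pkt[1:], UPSTREAM)

In this sketch the first byte of a test packet encodes the number of
remaining "recursions", so the test client fully controls the
imaginary cache hit rate and recursion depth.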
Since the resulting throughput was very high, it proved practically
impossible to measure the maximum QPS with a usual tool like
queryperf (partly because the other hosts I can use as the "client"
are much less powerful than the server box). So I gradually increased
the query/response rate until the server clearly started dropping
queries (ignoring some minor intermediate fluctuations).
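The actual procedure was more ad hoc than this, but a toy version of
the ramping idea could look like the following (the server address,
rates and thresholds are all invented for the sketch):

    import socket
    import time

    SERVER = ('192.0.2.2', 5300)     # address of the server under test

    def run_at(qps, seconds=5, query=b'\x02'):
        """Send queries at roughly `qps` for `seconds`; count responses."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setblocking(False)
        sent = received = 0
        start = time.time()
        while time.time() - start < seconds:
            while sent < qps * (time.time() - start):  # pace the sends
                sock.sendto(query, SERVER)
                sent += 1
            try:
                while True:          # drain whatever responses arrived
                    sock.recvfrom(4096)
                    received += 1
            except BlockingIOError:
                pass
        return sent, received

    qps = 10000
    while True:
        sent, received = run_at(qps)
        if received < sent * 0.99:   # clearly dropping queries: stop
            break
        qps = int(qps * 1.1)         # otherwise try a 10% higher rate
    print('max sustainable rate is somewhere below', qps, 'qps')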
This graph summarizes the result of this experiment:
http://bind10.isc.org/~jinmei/rec-max-qps.png
("n iter" means the server would need to do n recursions when cache
miss happens).
The result itself is quite predictable, but I think it's still
informative in that it highlights the point that the cost of each
recursive transaction matters (i.e., if we need more
recursions/transactions on a cache miss, that can affect the overall
performance substantially). In some sense this is obvious from simple
math even without running experiments: if the cache miss rate is m and
each miss needs n external transactions, the per-query I/O cost grows
roughly in proportion to 1 + m*n, so increasing n hurts throughput
even at a modest miss rate. But I guess we were not really aware of
this in the previous discussions. And this means, for example:
- attempts that increase the number of external transactions may not
  really be a good idea, at least for throughput. These include
  querying multiple authoritative servers at the same time and trying
  very hard to identify EDNS availability.
- anything that helps reduce the number of transactions may provide a
  substantial performance improvement. For example, it may make sense
  to keep delegation information at a higher priority than (or
  separately from) other cached records, even if the former is not
  used as often (and so can be an easy target for purging under a
  simple LRU-based algorithm). A sketch of this idea follows the list.
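To illustrate the second point, a purely hypothetical arrangement
(not BIND 10 code; the sizes and the NS-only criterion are arbitrary)
could partition the cache so that churn among ordinary answers can
never purge delegation information:

    from collections import OrderedDict

    class LRUCache:
        """A tiny LRU cache; evicts the least recently used entry."""
        def __init__(self, maxsize):
            self.maxsize = maxsize
            self.data = OrderedDict()
        def get(self, key):
            if key not in self.data:
                return None
            self.data.move_to_end(key)        # mark as recently used
            return self.data[key]
        def put(self, key, value):
            self.data[key] = value
            self.data.move_to_end(key)
            if len(self.data) > self.maxsize:
                self.data.popitem(last=False) # evict the LRU entry

    class RecursiveCache:
        """Keeps delegation (NS) records apart from ordinary answers."""
        def __init__(self):
            self.answers = LRUCache(100000)     # ordinary cached records
            self.delegations = LRUCache(10000)  # NS records, kept apart
        def _select(self, rrtype):
            return self.delegations if rrtype == 'NS' else self.answers
        def store(self, name, rrtype, rrset):
            self._select(rrtype).put((name, rrtype), rrset)
        def lookup(self, name, rrtype):
            return self._select(rrtype).get((name, rrtype))

The point is that even if a delegation record is referenced less
often than the most popular answers, losing it forces the resolver to
re-walk the delegation chain, which costs several extra external
transactions on the next relevant cache miss.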
2. Popularity of queries
I've analyzed an old (from about 5 years ago) but real query sample
taken on a busy recursive server in terms of "query popularity", i.e.,
which queries were asked most and how often.
The result is summarized in this graph:
http://bind10.isc.org/~jinmei/query-popularity.png
The query sample consists of about 2.49M queries containing 290K
unique queries. The summary graph reads as follows: the most "popular"
query accounts for 4.71% of the total 2.49M queries, the top 10
account for 10.62%, and so on. Not so surprisingly, the distribution
is pretty heavy-tailed: the top 10,000 unique queries (roughly the top
1/30) account for about 80% of all queries.
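The analysis itself is nothing sophisticated; it's roughly equivalent
to the following (the file name and the one-query-per-line format are
assumptions for the sketch, not the format of the real sample):

    from collections import Counter

    # count duplicate queries in a log with one query (name and type)
    # per line, then report the cumulative share of the top-N entries
    with open('queries.log') as f:
        counts = Counter(line.strip() for line in f if line.strip())

    total = sum(counts.values())
    print('%d queries, %d unique' % (total, len(counts)))

    ranked = [c for _, c in counts.most_common()]
    for n in (1, 10, 100, 1000, 10000):
        share = 100.0 * sum(ranked[:n]) / total
        print('top %5d unique queries: %5.2f%% of all' % (n, share))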
While the data are quite old and this is only one data point (the
next steps of this work include getting and analyzing more recent
samples from multiple vantage points), if it still holds today and
commonly for busy servers (which I guess is quite likely), it suggests
that optimization approaches like multi-layer caches are promising.
We could even let separate threads or processes running on different
CPU cores have their own "layer 1" cache in the most efficient form
for faster responses (such as pre-rendered responses); a rough sketch
follows.
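Purely as speculation, that "layer 1" idea might look something like
this sketch (render_response is a hypothetical placeholder for real
wire-format rendering, and the sizes are arbitrary):

    import threading

    shared_l2 = {}              # "layer 2": full cache shared by workers
    l2_lock = threading.Lock()
    tls = threading.local()     # per-thread (per-core) private storage
    L1_MAX = 10000              # keep only the hottest entries privately

    def render_response(rrsets):
        # placeholder for real wire-format rendering of a response
        return b'rendered:' + repr(rrsets).encode()

    def lookup(query_key):
        l1 = getattr(tls, 'l1', None)
        if l1 is None:
            l1 = tls.l1 = {}
        wire = l1.get(query_key)
        if wire is not None:
            return wire         # lock-free hit on a pre-rendered packet
        with l2_lock:           # slower, contended path
            rrsets = shared_l2.get(query_key)
        if rrsets is None:
            return None         # cache miss: full resolution needed
        wire = render_response(rrsets)
        if len(l1) < L1_MAX:
            l1[query_key] = wire  # promote into the private "layer 1"
        return wire

Since the hottest 1/30 of unique queries covers some 80% of the
traffic, even a small private layer 1 could answer most queries
without touching the shared lock, if the distribution above holds.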
These are just speculations at the moment, though. We need more
intensive research, which will be our next step.
---
JINMEI, Tatuya