BIND 10 #2228: research requirements for DB-based data source performance

BIND 10 Development do-not-reply at isc.org
Thu Nov 8 18:24:06 UTC 2012


#2228: research requirements for DB-based data source performance
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  larissas
  jinmei                             |                Status:  assigned
                       Type:  task   |             Milestone:
                   Priority:         |            Resolution:
  medium                             |             Sensitive:  0
                  Component:         |           Sub-Project:  DNS
  Unclassified                       |  Estimated Difficulty:  0
                   Keywords:         |           Total Hours:  0
            Defect Severity:  N/A    |
Feature Depending on Ticket:         |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:1 shane]:
 > Just to be clear, we are not building a custom solution for a single
 > user, so there ARE no performance requirements, in the sense that "we
 > must serve X queries per second on hardware Y" or "we must handle N
 > zones of size M".

 Yes, I understand that.  But knowing real examples is still very
 useful.  If we have a sufficient number of such examples, we'll be
 able to figure out a sensible target for a general purpose
 implementation.  Even if we have only one or a few examples, we can at
 least discuss it based on some real requirement, and consider whether
 it can be a reasonable data point for the general purpose.  That's far
 better than just guessing an arbitrary number (or even skipping that)
 and trying random optimization ideas.

 Anyway, thanks for the pointers.  These are exactly what I wanted to
 see.

 > Our main competition in this space is PowerDNS, which has
 > successfully worked its way into DNS hosting markets in Europe.
 >
 > According to this presentation, PowerDNS gets 46k queries per second
 > with 10 million domains:
 >
 > http://www.sanog.org/resources/sanog14/sanog14-devdas-dns-scalability.pdf

 This looks quite useful, but some important points are missing,
 especially about the benchmark.

 - the query pattern for the test data
 - which DB backend was used in the case of PowerDNS
 - regarding "PowerDNS, RBT", whether it used any DB backend at all, or
   whether it's the in-memory data with hot spot cache.  I suspect it
   was the latter.

 In my rough (and quite possibly incorrect) understanding, the response
 performance of PowerDNS relies largely on its cache, especially its
 full packet (response) cache.  So, if the above query pattern results
 in a high cache hit rate, it's not surprising that it shows seemingly
 very good performance.  And, if that pattern is actually based on
 their live queries, such a caching architecture makes complete sense;
 on the other hand, if it was just crafted for the experiments, the
 results are not that informative, if not misleading.

 That's one of the things we still need to know (from someone).

 I'd also note that if startup time is the main issue, a hypothetical
 fully mmap-based version of our in-memory data source could also be a
 solution.
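
 For illustration, here's a minimal sketch of that mmap idea (the file
 format, record layout, and names are all invented for this sketch, not
 an actual BIND 10 design): zone data is pre-serialized into a flat
 file of sorted fixed-size records, so startup cost becomes a single
 mmap() call instead of parsing the whole zone, and lookups
 binary-search the mapped image directly.

```python
# Hypothetical sketch, not BIND 10 code: pre-serialize zone data so
# startup is one mmap() instead of a full zone parse.
import mmap
import struct

REC = struct.Struct("!64s2sH")   # owner name (padded), RR type, RDATA length
RECSIZE = REC.size

def build_image(path, records):
    """records: iterable of (name, rrtype, rdlen), pre-sorted by name."""
    with open(path, "wb") as f:
        for name, rrtype, rdlen in records:
            f.write(REC.pack(name.encode().ljust(64, b"\0"),
                             rrtype.encode(), rdlen))

def lookup(mm, name):
    """Binary-search the mmap'ed image; no parsing needed at startup."""
    key = name.encode().ljust(64, b"\0")
    lo, hi = 0, len(mm) // RECSIZE
    while lo < hi:
        mid = (lo + hi) // 2
        owner, rrtype, rdlen = REC.unpack_from(mm, mid * RECSIZE)
        if owner == key:
            return rrtype.rstrip(b"\0").decode(), rdlen
        if owner < key:
            lo = mid + 1
        else:
            hi = mid
    return None

build_image("zone.img", [("a.example.", "A", 4),
                         ("b.example.", "NS", 12)])
with open("zone.img", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    print(lookup(mm, "b.example."))   # → ('NS', 12)
```

 A real implementation would of course need variable-length RDATA,
 RRset grouping, and so on; the point is only that the load step is
 O(1) in the zone size.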

 Another thing I wonder about is what "user defined / hardcoded schema"
 means (as in p.13 about MyDNS, or p.7 mentioning "user defined
 queries").  In my understanding, PowerDNS also defines a specific
 schema for zone data.  Did they want to define a different DB schema
 for their zone data and configure the DNS server to send queries based
 on it?  Is that possible with PowerDNS (or BIND 9 DLZ)?
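
 For what it's worth, one reading of "user defined schema" (my
 interpretation, not the slides'): the operator keeps zone data in a
 table layout of their own choosing and configures the server with the
 SQL statement to issue per lookup.  Sketched here with sqlite3 and an
 invented table layout:

```python
# Illustrative only: a user-chosen table layout plus the lookup query a
# server would hypothetically be configured to issue against it.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE dns_rr (
                   owner TEXT, rrtype TEXT, ttl INTEGER, rdata TEXT)""")
conn.execute("INSERT INTO dns_rr VALUES "
             "('www.example.com.', 'A', 300, '192.0.2.1')")

# The "user defined query" the server would be configured with:
LOOKUP_SQL = "SELECT ttl, rdata FROM dns_rr WHERE owner = ? AND rrtype = ?"

print(conn.execute(LOOKUP_SQL, ("www.example.com.", "A")).fetchall())
# → [(300, '192.0.2.1')]
```

 Whether PowerDNS or BIND 9 DLZ actually supports this degree of
 flexibility is exactly the question above.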

 > We might not need that level of performance, but the user has seen
 > DoS attacks of 10k queries per second, so we *do* need at least that
 > level of performance.

 Sure, this part is probably the most informative bit in the slides (it
 also suggests that in their normal operation the expected qps is much
 lower than that).  I'd note, however, that smart attackers would
 choose query patterns so that the internal cache isn't very effective,
 so we need to think about achieving that level of performance without
 benefiting from things like a full packet cache.
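
 To make that concern concrete, here's a toy model (my own sketch, not
 PowerDNS internals): a full packet cache keyed by (qname, qtype).  A
 skewed "live traffic" pattern hits the cache almost every time, while
 an attacker sending unique qnames never does, so every attack query
 falls through to the backend.

```python
# Toy model of a full packet (response) cache and why random-qname
# attacks defeat it; numbers and names are illustrative.
import random

class PacketCache:
    def __init__(self):
        self.cache = {}
        self.hits = self.misses = 0

    def query(self, qname, qtype):
        key = (qname, qtype)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1                     # simulated backend DB lookup
        self.cache[key] = b"response for %s" % qname.encode()
        return self.cache[key]

random.seed(0)
popular = ["www.example%d.com." % i for i in range(100)]

live = PacketCache()
for _ in range(10000):                       # live traffic: few hot names
    live.query(random.choice(popular), "A")

attack = PacketCache()
for i in range(10000):                       # attack: unique qnames
    attack.query("x%d.example.com." % i, "A")

print(live.hits / 10000)     # 0.99: cache absorbs nearly everything
print(attack.hits / 10000)   # 0.0: every query reaches the backend
```

 In the first run only 100 of 10000 queries reach the backend; in the
 second, all 10000 do, so the backend must sustain the full attack rate
 on its own.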

 > Note that this does not match my previous research, which was from
 > around the same time (at my previous job):
 >
 > https://lists.dns-oarc.net/pipermail/dns-operations/2009-February/003556.html

 I think your numbers actually match the other slides to some extent.
 I suspect the latter mixed apples and oranges and pineapples and can
 be misleading (I don't think that was intentional, though).  The data
 in the slides that best matches your experiments is probably the
 "PowerDNS, hash as cache, 3M domains" case.  There, the cache was
 probably not that effective, so the bottleneck of DB queries would
 affect the overall performance.

 (Your email message to dns-oarc also seems to mix different fruits,
 but that's pretty obvious from the context, so it's not that
 misleading :-)

 > In any case, 2k queries/second is not enough.

 If you're referring to my previous experiments:
 https://lists.isc.org/pipermail/bind10-dev/2012-October/003866.html
 I'd note that they only used a single core, and ran on my laptop (not
 that poor in terms of CPU, but generally far from ideal for production
 server operations).  I can also think of other types of optimization
 that wouldn't rely on a cache tied to specific query patterns, so
 assuming we do that and run it on a reasonably high-performance
 machine with several CPU cores, I guess it's not that difficult to
 reach 10K-ish qps.
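
 As a back-of-envelope check (assuming linear scaling across cores,
 which ignores lock contention and shared-cache effects):

```python
# Illustrative arithmetic with numbers from this thread only.
single_core_qps = 2000      # measured in the earlier single-core experiment
target_qps = 10000          # the DoS level cited in the slides
cores_needed = -(-target_qps // single_core_qps)   # ceiling division
print(cores_needed)         # → 5
```

 So even without the cache-independent optimizations, a 5-plus-core
 machine would be in the right ballpark under that (optimistic)
 assumption.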

-- 
Ticket URL: <http://bind10.isc.org/ticket/2228#comment:3>
BIND 10 Development <http://bind10.isc.org>