lwresd performance with server down
romash at sonusnet.com
Thu Mar 29 03:17:10 UTC 2012
We are using lwresd to resolve DNS ENUM queries on a Solaris 10 Netra 5220 system, with the cache TTL set to 1 second (effectively off) and only two servers configured. Performance is reasonable while the first server is up, but when the first server stops responding, performance becomes unreasonably bad.
With a 9.2.6 lwresd, we see only 4 requests leave lwresd, and no further requests are sent or processed until these 4 complete. Anything much above 8 requests per second ultimately leads to response times of more than 10 seconds from lwresd to our application.
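To put rough numbers on the 9.2.6 behavior, here is a back-of-the-envelope sketch in Python. It assumes lwresd acts like a fixed window of 4 in-flight requests, each held for some stall time while the dead server is tried. The 0.5 second stall is a hypothetical value chosen only because it reproduces the roughly 8 requests per second ceiling we see; it is not a measured or documented lwresd timeout, and the function names are made up for illustration.

```python
def max_sustainable_rate(in_flight_limit: int, stall_seconds: float) -> float:
    """Upper bound on completed requests/second for a fixed in-flight window.

    Little's law: throughput = concurrency / time each request occupies a slot.
    """
    if stall_seconds <= 0:
        raise ValueError("stall_seconds must be positive")
    return in_flight_limit / stall_seconds


def backlog_after(arrival_rate: float, in_flight_limit: int,
                  stall_seconds: float, elapsed_seconds: float) -> float:
    """Requests still queued after elapsed_seconds of steady arrivals."""
    service_rate = max_sustainable_rate(in_flight_limit, stall_seconds)
    # The queue only grows when arrivals outpace the service rate.
    return max(0.0, (arrival_rate - service_rate) * elapsed_seconds)


if __name__ == "__main__":
    # Hypothetical: 4 slots, each stalled 0.5 s on the dead first server.
    print(max_sustainable_rate(4, 0.5))
    # Offered load of 10 requests/second for one minute under the same assumption.
    print(backlog_after(10.0, 4, 0.5, 60.0))
```

If the model is even roughly right, any sustained load above the ceiling builds an unbounded queue, which would account for response times climbing past 10 seconds instead of levelling off.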
With a 9.6 lwresd, performance is better, but at 30 requests per second we are still seeing greater than 1% of the requests exceeding 3 seconds. At 60 requests per second, this goes up to over 10%.
Can anyone explain either of these behaviors and what we might do to improve this?
I'd also welcome an explanation of the retransmission behavior of lwresd in this sort of situation (or a pointer to documentation that might describe such...).