[bind10-dev] bind10.isc.org web server lagginess
David W. Hankins
dhankins at isc.org
Mon Aug 30 19:50:36 UTC 2010
I added %T to the CustomLog format we use to log activity on bind10,
so that after a bit of waiting I could see which requests take the
most CPU time. (%T only records the wall-clock time, in seconds, that
it took to serve the request, but since we're seeing apache2 spin the
CPU, it is a reasonable approximation of the CPU time apache2 is
using.) Some requests take 30-50 seconds of apache2's time per
individual hit:
root@bind10:/var/log/apache2# awk '
BEGIN { botsecs = 0; totalsecs = 0; }
# Lines that mention a bot and end in a %T value.
/.*[bB][Oo][Tt].*[0-9]$/ { botsecs += $(NF); }
# Every line ending in a %T value: tally seconds per client and per page.
/[0-9]$/ {
    totalsecs += $(NF);
    addresses[$1] += $(NF);
    pages[$7] += $(NF);
}
END {
    printf("Top 10 pages:\n");
    for (page in pages) {
        if (pages[page] > 0) {
            printf("%3d (%d%%) %s\n", pages[page], (pages[page] * 100) / totalsecs, page) | "sort -n | tail -10";
        }
    }
    close("sort -n | tail -10");
    printf("---\n");
    printf("Top 5 client addresses:\n");
    for (addr in addresses) {
        if (addresses[addr] > 0) {
            printf("%3d (%d%%) %s\n", addresses[addr], (addresses[addr] * 100) / totalsecs, addr) | "sort -n | tail -5";
        }
    }
    close("sort -n | tail -5");
    printf("---\nBots: %d (%d%%)\nTotal: %d\n", botsecs, (botsecs * 100) / totalsecs, totalsecs);
}' access.log ssl_access.log
Top 10 pages:
12 (2%) /timeline?from=2010-08-03T01%3A22%3A09Z%2B0000&precision=second
12 (2%) /timeline?from=2010-08-05T18%3A52%3A56Z%2B0000&precision=second
15 (2%) /timeline?from=2010-03-08T10%3A20%3A20Z%2B0000&precision=second
17 (3%) /timeline?from=2010-05-16&daysback=30
23 (4%) /ticket/216
33 (6%) /roadmap
37 (7%) /timeline
39 (7%) /report/8?USER=each&page=1
91 (17%) /report/8
118 (23%) /timeline?ticket=on&changeset=on&milestone=on&wiki=on&max=50&daysback=90&format=rss
---
Top 5 client addresses:
46 (9%) 95.108.247.253
66 (12%) 66.249.68.100
70 (13%) 24.236.85.181
118 (23%) 149.20.50.219
155 (30%) 2001:4f8:3:65:226:c6ff:fe73:d8a2
---
Bots: 164 (32%)
Total: 511
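The totals above are summed per page and per client, so they do not
show which individual hits ran long. A minimal sketch for listing
them, assuming the same log layout the script above relies on (client
address in field 1, request path in field 7, %T value as the last
field). The function name slow_hits and the threshold argument are
illustrative assumptions, not part of the original session:

```shell
# List individual hits that took at least MIN seconds, slowest last.
# Assumes the %T value is the last field on each log line.
# Usage: slow_hits MIN logfile...
slow_hits() {
    min="$1"; shift
    awk -v min="$min" '
    /[0-9]$/ && $(NF) + 0 >= min + 0 {
        printf("%3d %s %s\n", $(NF), $1, $7);
    }' "$@" | sort -n
}
```

For example, "slow_hits 30 access.log ssl_access.log | tail -10" would
print the ten slowest hits of 30 seconds or more, one per line, as
seconds, client address and path.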
I'm surprised that apache2's behaviour is to head-of-line block when
one of these long-running hits is in the queue. Everyone else seems to
wait and then get handled quickly in a burst. This may actually be
due to locking in the web application rather than an apache2 issue.
--
David W. Hankins                    BIND 10 needs more DHCP voices.
Software Engineer                   There just aren't enough in our heads.
Internet Systems Consortium, Inc.   http://bind10.isc.org/