[bind10-dev] bind10.isc.org web server lagginess

David W. Hankins dhankins at isc.org
Mon Aug 30 19:50:36 UTC 2010


I added %T to the CustomLog format we use for activity on bind10, so
that after a bit of waiting I could see what's taking the most CPU
time.  (%T just prints the wall-clock time, in seconds, that it took
to process the request, but since we're seeing apache2 spin CPU these
are approximately the amount of CPU time apache2 is using.)  Some
individual hits seem to take 30-50 seconds of apache2's time each:

root@bind10:/var/log/apache2# awk '
BEGIN { botsecs = 0; totalsecs = 0; }

# Lines that mention "bot" (any case) and end in the %T field.
/.*[bB][Oo][Tt].*[0-9]$/ { botsecs += $(NF); }

# Every line ending in a digit: add %T (the last field) to the totals,
# keyed by client address ($1) and by requested page ($7).
/[0-9]$/ {
    totalsecs += $(NF);
    addresses[$1] += $(NF);
    pages[$7] += $(NF);
}

END {
    printf("Top 10 pages:\n");
    for (page in pages) {
        if (pages[page] > 0) {
            printf("%3d (%d%%) %s\n", pages[page], (pages[page] * 100) / totalsecs, page) | "sort -n | tail -10";
        }
    }
    close("sort -n | tail -10");
    printf("---\n");
    printf("Top 5 client addresses:\n");
    for (addr in addresses) {
        if (addresses[addr] > 0) {
            printf("%3d (%d%%) %s\n", addresses[addr], (addresses[addr] * 100) / totalsecs, addr) | "sort -n | tail -5";
        }
    }
    close("sort -n | tail -5");
    printf("---\nBots:  %d (%d%%)\nTotal: %d\n", botsecs, (botsecs * 100) / totalsecs, totalsecs);
}' access.log ssl_access.log

Top 10 pages:
 12 (2%) /timeline?from=2010-08-03T01%3A22%3A09Z%2B0000&precision=second
 12 (2%) /timeline?from=2010-08-05T18%3A52%3A56Z%2B0000&precision=second
 15 (2%) /timeline?from=2010-03-08T10%3A20%3A20Z%2B0000&precision=second
 17 (3%) /timeline?from=2010-05-16&daysback=30
 23 (4%) /ticket/216
 33 (6%) /roadmap
 37 (7%) /timeline
 39 (7%) /report/8?USER=each&page=1
 91 (17%) /report/8
118 (23%) /timeline?ticket=on&changeset=on&milestone=on&wiki=on&max=50&daysback=90&format=rss
---
Top 5 client addresses:
 46 (9%) 95.108.247.253
 66 (12%) 66.249.68.100
 70 (13%) 24.236.85.181
118 (23%) 149.20.50.219
155 (30%) 2001:4f8:3:65:226:c6ff:fe73:d8a2
---
Bots:  164 (32%)
Total: 511
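
The trailing number on each log line that the script sums comes from
%T.  A format along these lines would produce it.  This is only a
sketch, assuming the stock "combined" format with %T appended; the
nickname combined_time is hypothetical, since the actual configuration
is not shown here:

```apache
# Assumed: the standard combined format, plus %T (seconds taken to
# serve the request) as the last field, which awk reads as $(NF).
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %T" combined_time

# "combined_time" is an illustrative nickname, not the real one.
CustomLog /var/log/apache2/access.log combined_time
```

With this layout the client address is field 1 and the request path is
field 7 when split on whitespace, matching the $1 and $7 used above.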


I'm surprised that apache2's behaviour is to head-of-line block when
one of these long-running hits is in the queue.  Everyone else seems
to wait and then gets handled quickly in a burst.  This may actually
be due to locking in the web application rather than an apache issue.
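
One way to check whether the stalls line up with individual slow
requests would be to list just those hits along with their timestamps.
A minimal sketch, assuming the same log layout the awk script relies on
(client address in field 1, timestamp in field 4, request path in
field 7, %T as the last field); the slow_hits helper name and the
threshold argument are illustrative, not something in use on bind10:

```shell
# slow_hits MIN_SECONDS FILE...
# Print "seconds address timestamp path" for every hit whose %T value
# (the last field) is at least MIN_SECONDS, sorted by time taken.
slow_hits() {
    threshold="$1"
    shift
    awk -v min="$threshold" '
        # Only consider lines that end in a digit, i.e. carry a %T field.
        /[0-9]$/ && $(NF) + 0 >= min + 0 {
            print $(NF), $1, $4, $7
        }
    ' "$@" | sort -n
}
```

For example, "slow_hits 30 access.log ssl_access.log" would show only
the hits that took 30 seconds or more, slowest last.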

-- 
David W. Hankins	BIND 10 needs more DHCP voices.
Software Engineer		There just aren't enough in our heads.
Internet Systems Consortium, Inc.		http://bind10.isc.org/


