Slow expireover (again)
davidsen at tmr.com
Mon May 24 12:42:47 UTC 2004
Russ Allbery wrote:
> Mike Zanker <mike-sender-6677e0 at zanker.org> writes:
>>This particular box has only ever had 2.4.1 on it. Initially it was a
>>feeder+reader with tradspool. Pretty soon expireover and expire times
>>started going downhill so I changed to timecaf which seemed to help.
> Did they go down in direct proportion to the number of articles on the
> system, or did they start going downhill later on after the article count
> had stabilized?
>>I then moved feeding to a new P4 with 1GB RAM and U320 SCSI disks (which
>>rocks) and changed the Ultra 10 to a reader using timehash (because
>>timehash had worked well on the Linux box). This is when expireover
>>times really started increasing, especially when I started carrying a
>>full text-only feed (about 41,000 groups).
> What's puzzling me in part is why your system is so much slower than mine,
> when I'm also running Solaris (albeit on much faster disk). But I do have
> only a fourth as many newsgroups as you do, so if the time to change
> between groups while expiring is actually dominating, that could explain
> it. That would intuitively surprise me, though. Hm.
> I wonder if we could manage to build a profiled expireover and gather some
> statistics from that as to what functions are dominating the profile.
>>Thanks for looking into this - I'm sure expireover will still be running
>>first thing tomorrow so I'll try and get a more interesting trace.
> Sure thing. Slowness in expiration is one of INN's big problems. I have
> a few ideas as to how to possibly fix it, but none of them are completely
> baked yet.
The problem is really that the whole expire approach is probably a bad
idea at current volumes. Old articles accumulate all day and then we try
to clean up all at once. That may have made sense in the days when the
system was idle from 2-6am in most places, but it doesn't seem to fit
now. In the long run, trying to improve it is a band-aid, like coding a
bubble sort in assembler. Like the leap from file-per-article to CNFS,
to get an order of magnitude improvement you need to change WHAT you do,
not HOW you do it.
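To make the WHAT-versus-HOW point concrete, here is a minimal sketch in
Python. It is not INN code, only a toy in the spirit of a cyclic buffer
like CNFS: the store has a fixed size and each new article overwrites the
oldest one, so there is no separate batch expire pass at all. The class
name CyclicStore and its methods are hypothetical, invented for this
illustration, and the assumption is that articles fit one per slot.

```python
class CyclicStore:
    """Toy fixed-capacity article store (hypothetical, not INN's CNFS)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot holds (msgid, body) or None
        self.index = {}                 # msgid -> slot number
        self.next_slot = 0              # slot the next article is written to

    def store(self, msgid, body):
        """Store an article; return False if the message ID is already held."""
        if msgid in self.index:
            return False                # refuse duplicates
        slot = self.next_slot
        old = self.slots[slot]
        if old is not None:
            # Overwriting the oldest article is the only "expiration" step.
            del self.index[old[0]]
        self.slots[slot] = (msgid, body)
        self.index[msgid] = slot
        self.next_slot = (slot + 1) % self.capacity
        return True

    def fetch(self, msgid):
        """Return the article body, or None if it was never stored or has
        since been overwritten."""
        slot = self.index.get(msgid)
        return None if slot is None else self.slots[slot][1]


store = CyclicStore(capacity=2)
store.store("<1@example>", "first")
store.store("<2@example>", "second")
store.store("<3@example>", "third")  # wraps around and overwrites <1@example>
```

The cost of removing an old article is paid one at a time, as part of
each store, instead of piling up for a nightly run. The trade-off in this
sketch is that retention is set by volume rather than by age.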
I think it's time for a radical rethink of history and overview. The
idea of overview in a database is appealing, but short of trying it I
can't even guess whether it would be fast enough. I have some ideas on
history, but until I find some time they will be just ideas. Hopefully
in the next 6-12 months I will have time to try them.
-bill davidsen (davidsen at tmr.com)
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me