Natterings about history files
bill davidsen
davidsen at tmr.com
Thu Mar 8 16:06:39 UTC 2001
In article <ylg0gvm6od.fsf at windlord.stanford.edu>,
Russ Allbery <rra at stanford.edu> wrote:
|
| Bill Davidsen <davidsen at tmr.com> writes:
| > And while disk is cheap, building a copy of the database on the fly is
| > still a lot of i/o.
|
| So's doing it throughout the entire day. The total I/O cannot be *more*
| by doing it at night; it's almost certainly *less*.
Less total i/o, but it concentrates the i/o into a short time and gets
much higher io/sec rates, which in turn hurts article accept and reader
lookup to some extent. If your servers are heavily loaded, the hit may
be big enough to be really visible.
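
To put rough numbers on that, here is a back-of-the-envelope sketch. The
operation counts are made up purely for illustration (they are not
measurements from any server); the only thing taken from this thread is
the shape of the argument: the nightly run does less total work, but
squeezes it into about an hour.

```python
# Hypothetical figures, for illustration only.
DAY_SECONDS = 24 * 3600
HOUR_SECONDS = 3600

def io_rate(total_ops, window_seconds):
    """Average I/O operations per second over a time window."""
    return total_ops / window_seconds

spread_ops = 1_200_000   # assumed: expire work trickled out over the whole day
nightly_ops = 900_000    # assumed: less total work when done in one batch

spread_rate = io_rate(spread_ops, DAY_SECONDS)     # added load, all day long
nightly_rate = io_rate(nightly_ops, HOUR_SECONDS)  # added load, for one hour

print(f"spread:  {spread_rate:.1f} ops/sec all day")
print(f"nightly: {nightly_rate:.1f} ops/sec for an hour")
print(f"peak ratio: {nightly_rate / spread_rate:.0f}x")
```

With these assumed numbers the batch does 25% less work in total, but the
extra load while it runs is many times higher than the steady trickle.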
| Third, it *is* more efficient on disk I/O to do it all at the same time
| rather than spreading out small reads and writes throughout the day and
| mixed into your normal news traffic so that your disk cache isn't as
| useful. I think that's pretty obvious.
And running expire all at once hits memory, buffers and i/o all at the
same time. If the performance is down a bit all day it's less obvious
than a big drop for an hour. There are tradeoffs; I never meant to state
otherwise.
| I can see places where spreading some portions of expire out is better,
| and I think INN should support that. And you may be correct that
| spreading out all of expire makes for a cleaner and easier architecture to
| maintain. But I don't think it's as obvious of a win as you think.
In terms of overall work, it's probably a loss. In terms of not having
bad times, it's probably a win.
| Sure, it should be possible for people to try stuff out. I think you're
| fooling yourself if you think that steady-state expiration is going to buy
| you any scaling over nightly expire given the same basic underlying
| structures, though. (In other words, different structures will give you
| better performance, but those performance gains would be realized by the
| same structure using a nightly expire too.)
I suspect things have to change anyway: the size of the files is getting
near the per-process address space of many systems, and at some point
soon we will lose the ability to mmap() them on those systems. I have to
do serious hacks to AIX executables to get more than the default 256MB
process data space, and I can only have 2GB with the fix. This is a
limit on address space, not file size; a different ugliness.
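
As a minimal sketch of what "losing mmap()" means for code that reads
such a file: map the file only when it fits under whatever address space
the process can actually get, and otherwise fall back to seek/read. The
function name read_at and the limit_bytes parameter are hypothetical and
not taken from INN; the real limit depends on the platform and on how
the executable was built.

```python
import mmap
import os

def read_at(path, offset, length, limit_bytes):
    """Read `length` bytes at `offset` from `path`.

    Uses mmap() when the whole file fits under `limit_bytes` (an assumed
    per-process address-space budget), otherwise plain seek()/read().
    Returns (data, method) so the caller can see which path was taken.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # An empty file cannot be mapped, so it also takes the read() path.
        if 0 < size <= limit_bytes:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return m[offset:offset + length], "mmap"
        f.seek(offset)
        return f.read(length), "read"
```

The fallback keeps working as the file grows past the limit, but every
lookup then costs a system call instead of a memory access.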
--
bill davidsen <davidsen at tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.