Natterings about history files

bill davidsen davidsen at tmr.com
Thu Mar 8 16:06:39 UTC 2001


In article <ylg0gvm6od.fsf at windlord.stanford.edu>,
Russ Allbery  <rra at stanford.edu> wrote:
| 
| Bill Davidsen <davidsen at tmr.com> writes:

| > And while disk is cheap, building a copy of the database on the fly is
| > still a lot of i/o.
| 
| So's doing it throughout the entire day.  The total I/O cannot be *more*
| by doing it at night; it's almost certainly *less*.

  Less total i/o, but it concentrates the i/o into a short time and
produces much higher i/o-per-second rates, which in turn hurts article
acceptance and reader lookups to some extent. If your servers are
heavily loaded, the hit may be large enough to be really visible.
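
  A rough sketch of that arithmetic, in case it helps. Every number here
is made up for illustration (the operation counts, the assumption that
the nightly run needs only half the i/o, and the one-hour window); none
of it is measured from a real server.

```python
def ops_per_second(total_ops, window_seconds):
    """Average i/o operations per second over a time window."""
    return total_ops / window_seconds

# Hypothetical figures, chosen only to show the shape of the tradeoff.
SPREAD_OPS = 2_000_000    # expire i/o mixed into normal traffic all day (assumed)
NIGHTLY_OPS = 1_000_000   # same job as one batch, assumed to need half the i/o
DAY = 24 * 3600           # seconds in the spread-out case
HOUR = 3600               # assumed length of the nightly run

spread_rate = ops_per_second(SPREAD_OPS, DAY)
nightly_rate = ops_per_second(NIGHTLY_OPS, HOUR)

print(f"spread over the day:  {spread_rate:.1f} ops/sec")
print(f"one-hour nightly run: {nightly_rate:.1f} ops/sec")
print(f"peak rate ratio:      {nightly_rate / spread_rate:.1f}x")
```

  Even with half the total work, the batch run in this sketch puts a much
higher i/o rate on the disks while it lasts, and that rate is what
article accept and reader lookups compete with.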


| Third, it *is* more efficient on disk I/O to do it all at the same time
| rather than spreading out small reads and writes throughout the day and
| mixed into your normal news traffic so that your disk cache isn't as
| useful.  I think that's pretty obvious.

  And running expire all at once hits memory, buffers and i/o all at the
same time. If performance is down a bit all day, it's less obvious than
a big drop for an hour. There are tradeoffs; I never meant to state
otherwise.

| I can see places where spreading some portions of expire out is better,
| and I think INN should support that.  And you may be correct that
| spreading out all of expire makes for a cleaner and easier architecture to
| maintain.  But I don't think it's as obvious of a win as you think.

  In terms of overall work, it's probably a loss. In terms of not having
bad times, it's probably a win.

| Sure, it should be possible for people to try stuff out.  I think you're
| fooling yourself if you think that steady-state expiration is going to buy
| you any scaling over nightly expire given the same basic underlying
| structures, though.  (In other words, different structures will give you
| better performance, but those performance gains would be realized by the
| same structure using a nightly expire too.)

  I suspect things have to change anyway. The size of the files is
getting near the per-process address space of many systems, and at some
point soon we will lose the ability to mmap() them on those systems. I
have to do serious hacks to AIX executables to get more than the default
256MB process data space, and I can only have 2GB with the fix. This is
a limit on address space, not file size, which is a different ugliness.
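
  A minimal sketch of the kind of check and fallback this implies. The
helper names (fits, map_or_none) and the sample file sizes are
hypothetical. The 256MB and 2GB figures are borrowed from above only as
sample limits; how much of a process's address space is really available
to mmap() depends on the system and is not modelled here.

```python
import mmap

MB = 1024 * 1024

def fits(file_size, limit, headroom=0):
    """True if a file of file_size bytes could be mapped whole under a
    given address-space budget, leaving headroom bytes for everything
    else the process needs."""
    return file_size + headroom <= limit

def map_or_none(path):
    """Map the whole file read-only. Return None if the mapping fails,
    for instance because the process has run out of address space, so
    the caller can fall back to ordinary seek/read access."""
    with open(path, "rb") as f:
        try:
            return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except (OSError, ValueError):
            return None

# Hypothetical file sizes checked against the two sample limits.
for size_mb in (100, 300, 1500, 2500):
    size = size_mb * MB
    print(size_mb, "MB:",
          "fits 256MB" if fits(size, 256 * MB) else "too big for 256MB,",
          "fits 2GB" if fits(size, 2048 * MB) else "too big for 2GB")
```

  The point of the fallback is that the failure shows up at mmap() time,
not when the file is opened or written, so code that assumes the mapping
always succeeds breaks once the file outgrows the address space.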
-- 
bill davidsen <davidsen at tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

