I would really like to see expire gone

bill davidsen davidsen at tmr.com
Sun Jan 26 14:03:17 UTC 2003


In article <p05100314ba571b3a2f6f@[128.253.64.63]>,
Todd Olson  <tco2 at cornell.edu> wrote:

| I suspect the Sleepycat database would be way too slow.
| The e-mail-list server, Listproc, uses ver 2.6.7 of Sleepycat
| to build a simple database that lets you answer the question
| 'what e-lists is this address on'
| 
| When this list has to be rebuilt from scratch with approximately 100,000
| addresses, it takes close to two hours on a Sun E250 (dual 400MHz UltraSPARC
| processors, 2G RAM, fastwide SCSI mirrored disks, ufs with Veritas volume mgmt).
| 
| When the same build is done to a tmp filesystem, it takes about 1 min.

If you mean that the same Sleepycat build takes about a minute on a
faster disk, then clearly the db code itself (i.e. CPU usage) isn't the
bottleneck. Any database is going to depend on disk speed to some
extent; sure as hell dbz does.

And this is a very different problem. I assume that it is building the
db in such a way that it is fast to use, rather than fast to build but
slow to use. So it may be creating an index from the list to each name,
and from each name to each list. Since a name may be on multiple lists,
it has to handle multiple records with the same key, something we try to
avoid in news by indexing on msg-id. And "duplicate entry" warnings are
not a surprise during expire, so I have to think that dbz doesn't do a
perfect job every time.
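
As a rough illustration of the two-way index guessed at above, here is a
minimal Python sketch. The names (`ListIndex`, `subscribe`, `lists_for`)
are made up for the example and say nothing about how Listproc or
Sleepycat actually lay out their data; the point is only that one address
key maps to several records, where a msg-id key in news should map to at
most one.

```python
from collections import defaultdict


class ListIndex:
    """Hypothetical two-way index: address -> lists and list -> addresses."""

    def __init__(self):
        self.lists_by_address = defaultdict(set)
        self.addresses_by_list = defaultdict(set)

    def subscribe(self, address, list_name):
        # One logical fact costs two index updates, one per direction.
        self.lists_by_address[address].add(list_name)
        self.addresses_by_list[list_name].add(address)

    def lists_for(self, address):
        # Answers "what e-lists is this address on".
        return sorted(self.lists_by_address.get(address, ()))


index = ListIndex()
index.subscribe("user@example.com", "inn-workers")
index.subscribe("user@example.com", "inn-announce")
index.subscribe("other@example.com", "inn-workers")
print(index.lists_for("user@example.com"))
```

Every subscription touches both maps, so a build like this does at least
twice the writes of a single-key history file.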

| Most of the time with the disks the system seems to be waiting for
| the disk system to commit a data item written to the database.

Does "tmp filesystem" mean a ramdisk, or just a filesystem with a less
robust way of committing writes?
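
To make the commit-wait point concrete, here is a small Python sketch. It
is not Listproc or Sleepycat code, and `build` and `sync_every` are names
invented for the example. It writes the same records twice, once forcing
a sync after every record and once syncing in batches, and counts how
many times it has to wait for the disk.

```python
import os
import tempfile


def build(path, records, sync_every):
    """Write records to path, calling fsync after every sync_every records.

    Returns the number of fsync calls made.
    """
    syncs = 0
    with open(path, "wb") as f:
        for i, rec in enumerate(records, start=1):
            f.write(rec + b"\n")
            if i % sync_every == 0:
                # Push Python's buffer to the OS, then wait for the disk.
                f.flush()
                os.fsync(f.fileno())
                syncs += 1
        # Final sync so both strategies end up equally durable.
        f.flush()
        os.fsync(f.fileno())
        syncs += 1
    return syncs


records = [b"addr%06d list%03d" % (i, i % 50) for i in range(200)]
with tempfile.TemporaryDirectory() as d:
    per_record = build(os.path.join(d, "a.db"), records, sync_every=1)
    batched = build(os.path.join(d, "b.db"), records, sync_every=50)
print(per_record, batched)
```

On a RAM-backed filesystem each of those syncs has essentially nothing to
wait for, which would be consistent with a two hour build dropping to a
minute. Whether the real build commits per item is an assumption here.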

| I would worry that using something like Sleepycat would greatly slow down
| the handling of incoming news.

Eliminating expire with a database doesn't have to mean Sleepycat, just
a database which allows removal of entries. Can someone comment on the
performance of db for overview? I would think that's at least as
intensive as history, since info needs to be in multiple places for
cross-posting. I haven't tried it in ages.
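
As a sketch of what "a database which allows removal of entries" could
look like, here is a toy history store in Python using the standard
`sqlite3` module. SQLite is only a stand-in picked because it ships with
Python; the table layout and the function names (`open_history`,
`remember`, `forget_older_than`) are hypothetical, this is not INN's
history format, and it says nothing about whether such a store would keep
up with a real feed.

```python
import sqlite3


def open_history(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        " msgid TEXT PRIMARY KEY,"
        " arrived INTEGER NOT NULL)"
    )
    # Index on arrival time so old entries can be found without a full scan.
    db.execute("CREATE INDEX IF NOT EXISTS history_arrived ON history(arrived)")
    return db


def remember(db, msgid, arrived):
    """Record a message-id; return False if it is already known."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute("INSERT INTO history VALUES (?, ?)", (msgid, arrived))
        return True
    except sqlite3.IntegrityError:
        return False


def forget_older_than(db, cutoff):
    """Delete entries that arrived before cutoff; return how many went."""
    with db:
        cur = db.execute("DELETE FROM history WHERE arrived < ?", (cutoff,))
    return cur.rowcount


db = open_history()
remember(db, "<a@example.com>", 100)
remember(db, "<b@example.com>", 200)
duplicate_accepted = remember(db, "<a@example.com>", 300)
removed = forget_older_than(db, 150)
remaining = [row[0] for row in db.execute("SELECT msgid FROM history")]
print(duplicate_accepted, removed, remaining)
```

The unique key on msg-id rejects duplicates at insert time, and old
entries are removed in place instead of rebuilding the whole file.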
-- 
bill davidsen <davidsen at tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


More information about the inn-workers mailing list