Theory questions about ovdb

Forrest J. Cavalier III mibsoft at
Fri Jul 7 14:18:45 UTC 2000

[Posted to the list, but I want Heath to see and respond here,
since I haven't seen this discussed on list.]

Two short questions, and then the important one.

How often is delete_old_stuff() called, and for how many
newsgroups does it call delete_all_records()?

In other words, what does Berkeley DB do internally
for a sequence like this:

   key.data = &dk;
   key.size = sizeof dk;
   val.flags = DB_DBT_PARTIAL;

   ret = dbcursor->c_get(dbcursor, &key, &val, DB_SET_RANGE);
     ret = dbcursor->c_get(dbcursor, &key, &val, DB_NEXT);

Is it O(1)?  (I hope!)
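
To make the question concrete, here is my mental model of that cursor
sequence, assuming the database uses the btree access method. It is a
sketch in plain C, not Berkeley DB code: a sorted array stands in for
the tree, the DB_SET_RANGE call is modelled as a binary search that
lands on the smallest key >= the search key, and each DB_NEXT is a
step to the neighbouring slot. The names toy_cursor, toy_set_range()
and toy_next() are made up for illustration. If this model is right,
the initial positioning is logarithmic rather than O(1), and only the
DB_NEXT steps are constant time (ignoring page reads).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a btree cursor: a sorted array of integer keys.
 * These names are hypothetical and not part of the Berkeley DB API. */
struct toy_cursor {
    const unsigned *keys;  /* sorted ascending */
    size_t n;              /* number of keys */
    size_t pos;            /* current slot; pos == n means "no record" */
    unsigned steps;        /* comparisons made by the last toy_set_range() */
};

/* Model of DB_SET_RANGE: position on the smallest key >= want.
 * Returns 0 on success, -1 if no such key exists. O(log n). */
static int toy_set_range(struct toy_cursor *c, unsigned want)
{
    size_t lo = 0, hi = c->n;

    assert(c->keys != NULL || c->n == 0);
    c->steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        c->steps++;
        if (c->keys[mid] < want)
            lo = mid + 1;
        else
            hi = mid;
    }
    c->pos = lo;
    return lo < c->n ? 0 : -1;
}

/* Model of DB_NEXT: step to the adjacent key. O(1) per call.
 * Returns 0 on success, -1 when there is no next key. */
static int toy_next(struct toy_cursor *c)
{
    if (c->pos + 1 >= c->n) {
        c->pos = c->n;
        return -1;
    }
    c->pos++;
    return 0;
}
```

With eight keys, toy_set_range() needs at most four comparisons to
position the cursor, and every later toy_next() is a single step. On a
real database each of those steps may also cost a page read, which is
what the big question below is about.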


Does it make sense to add a retry limit, with a logged failure when
it is exceeded (in ovdb_groupadd() and elsewhere)?  How often in
real-life circumstances is a retry actually needed?  Right now it
looks like the retry could loop forever.
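
Something along these lines is what I have in mind. This is only a
sketch in plain C and not a patch against ovdb: MAX_RETRIES,
ERR_TRANSIENT, run_with_retry_limit() and flaky_op() are all
hypothetical names, the limit of 5 is arbitrary, and fprintf() stands
in for whatever logging call the real code would use. I am assuming
the retry is triggered by some "try again" return code from the
database.

```c
#include <stdio.h>

#define MAX_RETRIES   5      /* hypothetical limit, not taken from ovdb */
#define ERR_TRANSIENT (-2)   /* stand-in for a "try again" error code */

typedef int (*op_fn)(void *arg);

/* Run op until it succeeds, fails hard, or the retry budget is used up.
 * Makes one initial attempt plus at most MAX_RETRIES retries.
 * Returns the last result from op; logs to stderr when giving up. */
static int run_with_retry_limit(op_fn op, void *arg, const char *what)
{
    int ret = ERR_TRANSIENT;
    int attempt;

    for (attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        ret = op(arg);
        if (ret != ERR_TRANSIENT)
            return ret;      /* success or a hard error: stop here */
    }
    fprintf(stderr, "%s: giving up after %d retries\n", what, MAX_RETRIES);
    return ret;
}

/* Example operation: fails transiently until *countdown reaches zero. */
static int flaky_op(void *arg)
{
    int *countdown = arg;

    if (*countdown > 0) {
        (*countdown)--;
        return ERR_TRANSIENT;
    }
    return 0;
}
```

An operation that needs three retries still succeeds, while one that
never stops failing returns ERR_TRANSIENT to the caller after six
attempts instead of spinning.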


Now the big question.

The other overview schemes which have been used in INN
have (more or less) well known read/write/seek characteristics.

How do the internals of Berkeley DB work for things like:
   - groupstats->put() and db->put()
       (done for each call to ovdb_add())

   - and c_get()
       (done in ovdb_search())

I guess the main questions for comparison are:

     Are records with the same key clustered on disk?

     Is there any magic by which new allocations ( ->put() )
     get written to sequential locations on disk?

I don't see how either of these is possible, unless
writing into a fresh database, or one that was swept
and packed, or maybe some other magic.  

If neither of these holds, there is a per-article seek
penalty (2 seeks for writing, 1 for reading).

For reading, that sounds a lot like uniover to me.
   (uniover is slow at reading, fast at writing.)

For writing, that sounds a lot like tradindexed.
   (tradindexed is slow at writing, and fast at reading.)
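
To put the per-article seek penalty in numbers, here is a trivial cost
model in C. It only encodes the worst case I described (no clustering,
no sequential allocation); the function name seek_cost_ms() is made up,
and the average seek time is an assumption that the caller supplies,
not a measurement.

```c
/* Seeks per article in the worst case described above
 * (no clustering of same-key records, no sequential allocation). */
#define SEEKS_PER_WRITE 2
#define SEEKS_PER_READ  1

/* Estimated milliseconds spent seeking for a batch of articles.
 * seek_ms is an assumed average seek time for the disk in question. */
static double seek_cost_ms(unsigned long articles, int seeks_per_article,
                           double seek_ms)
{
    return (double)articles * seeks_per_article * seek_ms;
}
```

For example, assuming a 10 ms average seek, seek_cost_ms(1000,
SEEKS_PER_WRITE, 10.0) gives 20000 ms, so writing overview data for
1000 articles would spend about 20 seconds seeking, and reading them
back about 10 seconds.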

Hope you can shed some light on this for us.




More information about the inn-workers mailing list