Theory questions about ovdb

Fri Jul 7 14:18:45 UTC 2000

[Posted to the list, but I want Heath to see and respond here,
since I haven't seen this discussed on list.]

Two short questions, and then the important one.
---------------------------------------------------------

How often is delete_old_stuff() called, and for how many
newsgroups does it call delete_all_records()?

In other words, what does Berkeley DB do internally
for a sequence like:
   key.data = &dk;
   key.size = sizeof dk;
   val.flags = DB_DBT_PARTIAL;

   ret = dbcursor->c_get(dbcursor, &key, &val, DB_SET_RANGE);
   while (ret == 0)
     ret = dbcursor->c_get(dbcursor, &key, &val, DB_NEXT);

Is it O(1)?  (I hope!)
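
To make the question concrete, here is a toy model of the access
pattern. It is NOT Berkeley DB code: it assumes the keys are kept in
sorted order (as a B-tree would keep them; the post does not say which
access method is in use), and every name in it (toy_cursor,
toy_set_range, toy_next) is made up for illustration. In this model the
initial positioning is a search, while each following step only moves
to the neighbouring key.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cursor over sorted keys (hypothetical, not the
 * Berkeley DB API). */
struct toy_cursor {
    const int *keys;    /* sorted ascending */
    size_t     nkeys;
    size_t     pos;     /* current position; nkeys means "off the end" */
    size_t     probes;  /* key comparisons performed so far */
};

/* In the spirit of DB_SET_RANGE: position on the smallest key >= want.
 * Binary search, so about log2(nkeys) comparisons.
 * Returns 0 on success, -1 if no such key exists. */
static int toy_set_range(struct toy_cursor *c, int want)
{
    size_t lo = 0, hi = c->nkeys;

    assert(c->keys != NULL || c->nkeys == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        c->probes++;
        if (c->keys[mid] < want)
            lo = mid + 1;
        else
            hi = mid;
    }
    c->pos = lo;
    return lo < c->nkeys ? 0 : -1;
}

/* In the spirit of DB_NEXT: step to the neighbouring key, no search.
 * Returns 0 on success, -1 when there is no next key. */
static int toy_next(struct toy_cursor *c)
{
    if (c->pos + 1 >= c->nkeys) {
        c->pos = c->nkeys;
        return -1;
    }
    c->pos++;
    return 0;
}
```

Whether the real library behaves this way, and what each step costs in
disk reads rather than comparisons, is exactly what the question asks.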

---------------------------------------------------

Does it make sense to add a retry limit, failing with a log
message once it is reached (in ovdb_groupadd() and elsewhere)?
How often is a retry actually needed in real-life circumstances?
Right now it looks like it could loop forever.
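
A minimal sketch of what such a retry limit could look like. This is
not the ovdb code: MAX_RETRIES, ERR_RETRY, run_with_retry_limit() and
flaky_op() are hypothetical names, ERR_RETRY stands in for whatever
"try again" error the real operation returns, and the limit of 10 is
an arbitrary choice. Logging goes to stderr here only to keep the
example self-contained.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_RETRIES 10    /* hypothetical limit, pick to taste */
#define ERR_RETRY   (-1)  /* hypothetical "try again" return code */

typedef int (*op_fn)(void *arg);

/* Call op until it returns something other than ERR_RETRY, or give up
 * after MAX_RETRIES attempts and log the failure.  Returns the last
 * result of op, so the caller still sees ERR_RETRY on give-up. */
static int run_with_retry_limit(op_fn op, void *arg, const char *what)
{
    int attempt, ret = ERR_RETRY;

    assert(op != NULL);
    for (attempt = 1; attempt <= MAX_RETRIES; attempt++) {
        ret = op(arg);
        if (ret != ERR_RETRY)
            return ret;
    }
    fprintf(stderr, "%s: giving up after %d attempts\n", what, MAX_RETRIES);
    return ret;
}

/* Demo operation: asks for a retry until the counter in *arg hits 0. */
static int flaky_op(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        return ERR_RETRY;
    }
    return 0;
}
```

With a counter of 3, flaky_op() succeeds on the fourth attempt; with a
counter larger than MAX_RETRIES the wrapper stops, logs, and hands the
error back instead of spinning.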

---------------------------------------------------

Now the big question.

The other overview schemes which have been used in INN
have (more or less) well known read/write/seek characteristics.

How do the internals of Berkeley DB work for things like:
   - groupstats->put() and db->put()
       (done for each call to ovdb_add())

   - and c_get()
       (done in ovdb_search())

I guess the main questions for comparison are:

     Are records with the same key clustered on disk?

     Is there any magic by which new allocations ( ->put() )
     get written to sequential locations on disk?

I don't see how either of these is possible, unless
writing into a fresh database, or one that was swept
and packed, or maybe some other magic.  

If neither of these holds, there is a per-article seek
penalty (2 seeks for writing, 1 for reading).

For reading, that sounds a lot like uniover to me.
   (uniover is slow at reading, fast at writing.)

For writing, that sounds a lot like tradindexed.
   (tradindexed is slow at writing, and fast at reading.)
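
To put the per-article seek penalty into rough numbers, the arithmetic
can be sketched as below. The seek counts (2 per write, 1 per read) are
the ones stated above; the function name seek_cost_ms() and any seek
time passed to it are assumptions for illustration, not measurements.

```c
#include <assert.h>

/* Seeks per article as stated in the post. */
enum { SEEKS_PER_WRITE = 2, SEEKS_PER_READ = 1 };

/* Total time spent seeking, in milliseconds, for a batch of articles.
 * seek_ms is whatever average seek time the caller assumes. */
static long seek_cost_ms(long articles, int seeks_per_article, long seek_ms)
{
    assert(articles >= 0 && seek_ms >= 0);
    return articles * seeks_per_article * seek_ms;
}
```

For example, assuming a 10 ms average seek, 1000 articles would cost
seek_cost_ms(1000, SEEKS_PER_WRITE, 10) = 20000 ms of seeking to write
and seek_cost_ms(1000, SEEKS_PER_READ, 10) = 10000 ms to read, if no
clustering or sequential allocation helps.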

Hope you can shed some light on this for us.

Thanks,

Forrest

Since this is Berkeley DB