OVDB crashing on db->put with BerkeleyDB (was ovdb.c warning with INN 2.3.3 and BerkeleyDB)
Steven C. Neighorn
neighorn at qiclab.scn.rain.com
Wed Jun 5 15:19:28 UTC 2002
[In the message entitled OVDB crashing on db->put with BerkeleyDB (was ovdb.c warning with INN 2.3.3 and BerkeleyDB) , On Wed, Jun 05, 2002 at 02:52:25PM +0200, Antoine Delvaux wrote:]
> > > crashing very often and leaving the overview DB not in a clean state,
> > > we were wondering if this warning could indeed be a problem.
> > >
> > Crashing or hanging? If it's hanging, it sounds like it's the (known?)
> > problem with the mutex handling going wrong.
> Well, crashing.
> Here is what is found in the log (crash last hour...):
> Jun 5 14:09:39 naxos innd: [ID 200958 news.error] OVDB: add: db->put:
> Invalid argument
> Jun 5 14:09:39 naxos innd: [ID 508530 news.error] SERVER cant store
> overview for @03024348554E4B36350000141E8800000003@
> What is quite strange is that this server had been running very well
> for nearly a year, and in the last few weeks the crashes have become
> very recurrent (one every two days!). So we're really considering
> switching to another solution.
> > I did start digging into it, but really got nowhere (and we're now
> > running tradindexed) so I stopped looking. Solaris and ovdb seem to
> > be the common factor.
> >  with a full feed - we must be mad!
> So our other solution seems not to lie in a different overview storage
> method. We have the choice of switching to another platform on another
> box (Linux...?) or to another software (Diablo ?).
> Anyone have clues how to solve this OVDB problem ?
I would like to chime in with a similar description of a problem.
This is a msg I posted back in November of 2001:
>My system is running inn-STABLE-20011009, and uses CNFS and ovdb with BerkeleyDB.3.3.11
>and after a clean start expireover takes a little longer each day. After about
>3 weeks I believe, my expireover takes (from the nightly report):
>expireover start Wed Nov 14 03:03:31 PST 2001: ( -Z/usr/local/news/log/expire.lowmark)
>expireover end Wed Nov 14 09:09:17 PST 2001
>lowmarkrenumber begin Wed Nov 14 09:09:17 PST 2001: (/usr/local/news/log/expire.lowmark)
>lowmarkrenumber end Wed Nov 14 09:09:18 PST 2001
>This is with: /usr/local/news/bin/news.daily expireover lowmark
>I also have the "once every 2-4 weeks all the news processes get stuck" problem, ie
>they are all in lwp_mutex_lock(). I can kill TERM/9 everything, clean out the __db.???
>in the overview dir, and restart. I have not had to reboot after a hang. This lockup
>also appears to be quantum-partically linked to my leaving the building, at lunch,
>or being otherwise occupied.
I am still seeing the exact same problem as Antoine and it has survived INN upgrades.
I am now running inn-2.3.3 and BerkeleyDB.3.3.11 on solaris 8 for x86. I see these
sorts of errors every few weeks:
May 5 16:00:32 quark innd: [ID 200958 news.error] OVDB: add: db->put: Invalid argument
May 5 16:04:49 quark nnrpd: [ID 843286 news.error] OVDB: groupnum: get: DB_RUNRECOVERY: Fatal error, run database recovery
May 5 16:04:57 quark nnrpd: [ID 843286 news.error] OVDB: groupnum: get: DB_RUNRECOVERY: Fatal error, run database recovery
Jun 1 18:46:42 quark innd: [ID 200958 news.error] OVDB: add: db->put: Invalid argument
Jun 1 18:46:42 quark innd: [ID 508530 news.error] SERVER cant store overview for @0301425546303600000005016EBD0000000D@
When this happens, I have to kill -TERM or -9 all news processes, dbprocs, etc., clean out
the __db.??? files in the overview dir, restart news, and let ovdb_recover run; then, usually,
I am OK again.
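
For anyone wanting to spot these failures before users notice, here is a minimal sketch of a log scanner. It assumes the syslog line shape shown in the excerpts above (timestamp, host, program, optional Solaris "[ID ...]" tag, then "OVDB: ..."). The function name scan_ovdb_errors and the needs_recovery field are hypothetical names chosen for illustration; they are not part of INN.

```python
import re

# Line shape assumed from the excerpts in this message, e.g.
#   "May 5 16:00:32 quark innd: [ID 200958 news.error] OVDB: add: db->put: Invalid argument"
OVDB_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"  # "May 5 16:00:32"
    r"(?P<host>\S+)\s+(?P<prog>\w+):\s+"            # "quark innd:"
    r"(?:\[ID \d+ [\w.]+\]\s+)?"                    # optional "[ID 200958 news.error]"
    r"OVDB:\s+(?P<detail>.*)$"                      # everything after "OVDB:"
)


def scan_ovdb_errors(lines):
    """Return one dict per OVDB error line; other lines are skipped."""
    events = []
    for line in lines:
        match = OVDB_LINE.match(line.strip())
        if match is None:
            continue
        detail = match.group("detail")
        events.append({
            "stamp": match.group("stamp"),
            "host": match.group("host"),
            "prog": match.group("prog"),
            "detail": detail,
            # Flag the lines where Berkeley DB itself asks for recovery.
            "needs_recovery": "DB_RUNRECOVERY" in detail,
        })
    return events


if __name__ == "__main__":
    sample = [
        "May 5 16:00:32 quark innd: [ID 200958 news.error] OVDB: add: db->put: Invalid argument",
        "May 5 16:04:49 quark nnrpd: [ID 843286 news.error] OVDB: groupnum: get: DB_RUNRECOVERY: Fatal error, run database recovery",
        "Jun 1 18:46:42 quark innd: [ID 508530 news.error] SERVER cant store overview for @0301425546303600000005016EBD0000000D@",
    ]
    for event in scan_ovdb_errors(sample):
        print(event["stamp"], event["prog"], event["needs_recovery"])
```

The "SERVER cant store overview" line is deliberately not matched, since it does not carry the "OVDB:" tag; widening the pattern to catch it would be a straightforward change.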
I know inn-CURRENT-* has OVDB2 but I haven't tried it yet.
Steven C. Neighorn neighorn at qiclab.scn.rain.com http://www.scn.rain.com
SCN Research, Inc. "Where we train the Star Fighters who defend the
9120 N.W. Wiley Lane frontier against Xur and the Ko-dan Armada"
Portland, Oregon 97229-8067 Voice: +1 (503) 297-3039 Fax: +1 (503) 297-3726