Chapter n+1, ovdb woes and woes

Jaye Mathisen mrcpu at internetcds.com
Tue Oct 10 05:15:21 UTC 2000



I am(was) getting similar problems with the 10/3 snapshot, and 3.1.17.

I ended up punting back to my 7/28 snapshot with DB 2.7.7 which works just
fine, until I can get a test machine up and running with the new DB stuff,
and see what the problem is.

My scenario was that the first time I ran rc.news, it started everything
up just fine, and everything ran for several hours.  Then I rebooted the
machine, tried rc.news again, it fired up ovdb_init, which then sat there
burning up 99% of the CPU, and no I/O that I could see for about 2-3 hours
processing the one .log file, then core dumped.  ONly way I could get it
restarted was to wipe out overview.

While I realize this doesn't solve your problem, you aren't alone in
having problems with it. :)

I'm running on FreeBSD 4.1.1.  

On Tue, 10 Oct 2000, Anonymous OVDB luser wrote:

> 
> Me again.  You can turn off your machines and go outside.
> 
> Latest ovdb including the patches, 3.1.17, mid-September INN...
> 
> Everything seems fine, I announce the fact, and watch.  Falls over
> after about an hour.  So do I.  Oh well, find a few old version ovdb
> processes I hadn't completely killed when doing the upgrade.
> 
> Then after another half hour following a fresh start, it's wedged
> again (readers pile in but fail to get a greeting).  I kill most
> processes, some with -9.  I pay attention to the four monitor
> processes, since I've been unable to get nnrpd to give me a greeting
> when everything else is gone.
> 
> I kill the parent, the man page claims it should take its children,
> but while the parent is gone (no -9 needed), all three children are
> still there, won't die with a simple kill, but need a -9.
> 
> After the parent was killed, I was able to get nnrpd to give me a
> 503 failure message.
> Oct 10 03:45:49 news02.inet.tele.dk nnrpd[22302]: OVDB: can not open database un
> less ovdb_monitor is running.
> Oct 10 03:45:49 news02.inet.tele.dk nnrpd[22302]: OVDB: ovdb_getlock failed: No
> such process
> 
> 
> Just for fun, I try ovdb_init standalone, which worked earlier fine as
> part of starting innd.
> 
> 
> [03:47:17]news at news02.inet.tele.dk:/news/src/inn-STABLE-20000917{1166}$ ovdb_init  -r
> ovdb_init: Database is quiescent.  Running normal recovery.
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 128241
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2498167
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2498319
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2506595
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2506667
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2679754
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2679826
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 3055452
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 121927
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5328328
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5328480
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5336756
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 3055524
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 5337853
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5336828
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 6294267
> OVDB: ov00025: close: 16 blocks left pinned
> ovdb_init: Could not open BerkeleyDB: DB_INCOMPLETE: Cache flush was unable to c
> omplete
> 
> Such is life.  Oh well, start innd:
> 
> INND:  PID file exists -- unclean shutdown!
> Starting ovdb.
> ovdb_init: Database is quiescent.  Running normal recovery.
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 128241
> [...]
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5336828
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 6294267
> OVDB: ov00025: close: 16 blocks left pinned
> ovdb_init: Starting ovdb monitor
> Starting innd.
> 
> Well.  nnrpd seems fine, innd is running...
> Server running
> Allowing remote connections
> Parameters c 14 i 900 (17) l 10000000 o 1011 t 300 H 2 T 60 X 0 normal specified
> 
> 
> But I still had things freeze up earlier because of the apparent lock.
> 
> Am I missing something, like, should I yank all my local hacks out of
> 0917 and merge them along with the latest ovdb beta patch into, oh,
> 06.Okt -STABLE source, or -CURRENT from the same date, or should I
> just eat lots more chocolate?
> 
> And why am *I* the only one seeing this on my Slowaris machine,
> always?  And why is it cold?  And why does it rain?  And so on.
> 
> I need sleep, if there's anything more I should poke at, let me
> know, but I'll let it run without readers for a spell.
> 
> The only thing in the logs was one luser making several parallel
> connections, the first three of which were successful...
> 
> Oct 10 03:25:01 news02.inet.tele.dk nnrpd[22066]: SERVER perl filtering enabled
> Oct 10 03:25:01 news02.inet.tele.dk nnrpd[22067]: SERVER perl filtering enabled
> Oct 10 03:25:01 news02.inet.tele.dk nnrpd[22068]: SERVER perl filtering enabled
> Oct 10 03:25:02 news02.inet.tele.dk nnrpd[22066]: kob-he-1-5.dsl.cybercity.dk co
> nnect
> Oct 10 03:25:02 news02.inet.tele.dk nnrpd[22067]: kob-he-1-5.dsl.cybercity.dk co
> nnect
> Oct 10 03:25:02 news02.inet.tele.dk nnrpd[22068]: kob-he-1-5.dsl.cybercity.dk co
> nnect
> 
> ... two connections were closed and a new one made, but there are no
> further log entries from process 22069, which has hung...
> 
> Oct 10 03:25:03 news02.inet.tele.dk nnrpd[22064]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1441 size 142471
> Oct 10 03:25:03 news02.inet.tele.dk nnrpd[22069]: SERVER perl filtering enabled
> 
> ... further connections are closed and logged, with the mysterious
> double logging noted earlier at these times...
> 
> Oct 10 03:25:03 news02.inet.tele.dk nnrpd[22065]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1620 size 145286
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk gr
> oup alt.binaries.pictures.PR0N 2
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ex
> it articles 2 groups 1
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ti
> mes user 0.570 system 0.050 elapsed 3.706
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1879 size 139386
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk gr
> oup alt.binaries.pictures.PR0N 0 <======
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ex
> it articles 2 groups 1
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ti
> mes user 0.570 system 0.060 elapsed 3.709 <=========
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1879 size 139386
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22063]: kob-he-1-5.dsl.cybercity.dk gr
> oup alt.binaries.pictures.PR0N 2
> 
> (more connections are made, but also pile up)
> 
> Meow.
> 
> 
> 




More information about the inn-workers mailing list