Chapter n+1, ovdb woes and woes
Jaye Mathisen
mrcpu at internetcds.com
Tue Oct 10 05:15:21 UTC 2000
I am(was) getting similar problems with the 10/3 snapshot, and 3.1.17.
I ended up punting back to my 7/28 snapshot with DB 2.7.7 which works just
fine, until I can get a test machine up and running with the new DB stuff,
and see what the problem is.
My scenario was that the first time I ran rc.news, it started everything
up just fine, and everything ran for several hours. Then I rebooted the
machine, tried rc.news again, it fired up ovdb_init, which then sat there
burning up 99% of the CPU, and no I/O that I could see for about 2-3 hours
processing the one .log file, then core dumped. ONly way I could get it
restarted was to wipe out overview.
While I realize this doesn't solve your problem, you aren't alone in
having problems with it. :)
I'm running on FreeBSD 4.1.1.
On Tue, 10 Oct 2000, Anonymous OVDB luser wrote:
>
> Me again. You can turn off your machines and go outside.
>
> Latest ovdb including the patches, 3.1.17, mid-September INN...
>
> Everything seems fine, I announce the fact, and watch. Falls over
> after about an hour. So do I. Oh well, find a few old version ovdb
> processes I hadn't completely killed when doing the upgrade.
>
> Then after another half hour following a fresh start, it's wedged
> again (readers pile in but fail to get a greeting). I kill most
> processes, some with -9. I pay attention to the four monitor
> processes, since I've been unable to get nnrpd to give me a greeting
> when everything else is gone.
>
> I kill the parent, the man page claims it should take its children,
> but while the parent is gone (no -9 needed), all three children are
> still there, won't die with a simple kill, but need a -9.
>
> After the parent was killed, I was able to get nnrpd to give me a
> 503 failure message.
> Oct 10 03:45:49 news02.inet.tele.dk nnrpd[22302]: OVDB: can not open database un
> less ovdb_monitor is running.
> Oct 10 03:45:49 news02.inet.tele.dk nnrpd[22302]: OVDB: ovdb_getlock failed: No
> such process
>
>
> Just for fun, I try ovdb_init standalone, which worked earlier fine as
> part of starting innd.
>
>
> [03:47:17]news at news02.inet.tele.dk:/news/src/inn-STABLE-20000917{1166}$ ovdb_init -r
> ovdb_init: Database is quiescent. Running normal recovery.
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 128241
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2498167
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2498319
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2506595
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2506667
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2679754
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 2679826
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 3055452
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 121927
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5328328
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5328480
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5336756
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 3055524
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 5337853
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5336828
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 6294267
> OVDB: ov00025: close: 16 blocks left pinned
> ovdb_init: Could not open BerkeleyDB: DB_INCOMPLETE: Cache flush was unable to c
> omplete
>
> Such is life. Oh well, start innd:
>
> INND: PID file exists -- unclean shutdown!
> Starting ovdb.
> ovdb_init: Database is quiescent. Running normal recovery.
> OVDB: Log sequence error: page LSN 2033:6873525; previous LSN 2070 128241
> [...]
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 5336828
> OVDB: Log sequence error: page LSN 2032:8697421; previous LSN 2070 6294267
> OVDB: ov00025: close: 16 blocks left pinned
> ovdb_init: Starting ovdb monitor
> Starting innd.
>
> Well. nnrpd seems fine, innd is running...
> Server running
> Allowing remote connections
> Parameters c 14 i 900 (17) l 10000000 o 1011 t 300 H 2 T 60 X 0 normal specified
>
>
> But I still had things freeze up earlier because of the apparent lock.
>
> Am I missing something, like, should I yank all my local hacks out of
> 0917 and merge them along with the latest ovdb beta patch into, oh,
> 06.Okt -STABLE source, or -CURRENT from the same date, or should I
> just eat lots more chocolate?
>
> And why am *I* the only one seeing this on my Slowaris machine,
> always? And why is it cold? And why does it rain? And so on.
>
> I need sleep, if there's anything more I should poke at, let me
> know, but I'll let it run without readers for a spell.
>
> The only thing in the logs was one luser making several parallel
> connections, the first three of which were successful...
>
> Oct 10 03:25:01 news02.inet.tele.dk nnrpd[22066]: SERVER perl filtering enabled
> Oct 10 03:25:01 news02.inet.tele.dk nnrpd[22067]: SERVER perl filtering enabled
> Oct 10 03:25:01 news02.inet.tele.dk nnrpd[22068]: SERVER perl filtering enabled
> Oct 10 03:25:02 news02.inet.tele.dk nnrpd[22066]: kob-he-1-5.dsl.cybercity.dk co
> nnect
> Oct 10 03:25:02 news02.inet.tele.dk nnrpd[22067]: kob-he-1-5.dsl.cybercity.dk co
> nnect
> Oct 10 03:25:02 news02.inet.tele.dk nnrpd[22068]: kob-he-1-5.dsl.cybercity.dk co
> nnect
>
> ... two connections were closed and a new one made, but there are no
> further log entries from process 22069, which has hung...
>
> Oct 10 03:25:03 news02.inet.tele.dk nnrpd[22064]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1441 size 142471
> Oct 10 03:25:03 news02.inet.tele.dk nnrpd[22069]: SERVER perl filtering enabled
>
> ... further connections are closed and logged, with the mysterious
> double logging noted earlier at these times...
>
> Oct 10 03:25:03 news02.inet.tele.dk nnrpd[22065]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1620 size 145286
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk gr
> oup alt.binaries.pictures.PR0N 2
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ex
> it articles 2 groups 1
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ti
> mes user 0.570 system 0.050 elapsed 3.706
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1879 size 139386
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk gr
> oup alt.binaries.pictures.PR0N 0 <======
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ex
> it articles 2 groups 1
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ti
> mes user 0.570 system 0.060 elapsed 3.709 <=========
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22062]: kob-he-1-5.dsl.cybercity.dk ar
> tstats get 2 time 1879 size 139386
> Oct 10 03:25:04 news02.inet.tele.dk nnrpd[22063]: kob-he-1-5.dsl.cybercity.dk gr
> oup alt.binaries.pictures.PR0N 2
>
> (more connections are made, but also pile up)
>
> Meow.
>
>
>
More information about the inn-workers
mailing list