INN 2.5.4 strange crash

Petr Novopashenniy pety at rusnet.ru
Wed Feb 18 13:40:47 UTC 2015


Good day, Julien!

On Thu, 5 Feb 2015, Julien ÉLIE wrote:

J?? Hi Petr,
J?? 
J?? > J??     if (!CHANsleeping(cp)) {
J?? > J??         notice("%s was not sleeping", CHANname(cp));
J?? > J??         return;
J?? > J??     }
J?? > 
J?? > I see "was not sleeping" for all peers (many times per second) after start
J?? > patched innd.
J?? 
J?? The interesting stuff is whether the line "was not sleeping" occurs after
J?? "sleeping without Waker".  If it occurs for the same peer, it is abnormal
J?? and would explain that SCHANremove does not remove the sleeping state.

Today I got this error:

Feb 18 12:09:17 ns innd: innfeed-neva2! exit 0 elapsed 31726 pid 16361
Feb 18 12:09:17 ns innd: innfeed-neva2!:49:proc:50653 was not sleeping
Feb 18 12:09:17 ns innd: innfeed-neva2!:49:proc:50653 49 sleeping without Waker
Feb 18 12:09:17 ns innd: innfeed-neva2!:49:proc:50653 closed
Feb 18 12:09:17 ns innd: ?:49:proc was not sleeping
Feb 18 12:09:17 ns innfeed[50653]: ME source lost . Exiting

[dd]

Feb 18 12:09:17 ns innd: aspen1.stu.neva.ru:49 was not sleeping
Feb 18 12:09:17 ns innd: aspen1.stu.neva.ru:49 49 sleeping without Waker
Feb 18 12:09:17 ns innd: aspen1.stu.neva.ru:49 checkpoint seconds 0 accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0 rejected size 0
Feb 18 12:09:17 ns innd: aspen1.stu.neva.ru:49 closed seconds 0 accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0 rejected size 0
Feb 18 12:09:17 ns innd: aspen1.stu.neva.ru:49 was not sleeping

And several days earlier:

Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com connected 26 streaming allowed
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 was not sleeping
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 26 sleeping without Waker
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 checkpoint seconds 0 accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0 rejected size 0
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 closed seconds 0 accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0 rejected size 0
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 was not sleeping
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com connected 26 streaming allowed
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 was not sleeping
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 26 sleeping without Waker
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 checkpoint seconds 0 accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0 rejected size 0
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 closed seconds 0 accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0 rejected size 0
Feb 15 03:20:08 ns innd: voer-me.highwinds-media.com:26 was not sleeping
Feb 15 03:20:08 ns innd: free:-1 26 free but was in SMASK
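
To check a larger log for the ordering Julien describes ("was not sleeping"
after "sleeping without Waker" for the same channel), something like the
following rough sketch could be used. It is only an illustration: the names
tracker, channel_name and feed_line are made up here and are not part of INN.
It assumes one complete, unwrapped syslog line per call, and that the channel
name is the first token after "innd: ". It does not forget a channel when it
is closed, so a reused name:fd pair is reported as the same channel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PEERS 64
#define NAME_LEN  128

/* Channels that have already logged "sleeping without Waker". */
struct tracker {
    char seen[MAX_PEERS][NAME_LEN];
    int  count;
};

/* Copy the first token after " innd: " into name.
 * Returns 1 on success, 0 if this is not an innd line. */
static int
channel_name(const char *line, char *name, size_t size)
{
    const char *p = strstr(line, " innd: ");
    size_t n;

    if (p == NULL)
        return 0;
    p += strlen(" innd: ");
    n = strcspn(p, " \n");
    if (n == 0 || n >= size)
        return 0;
    memcpy(name, p, n);
    name[n] = '\0';
    return 1;
}

/* Return 1 if name was recorded earlier, 0 otherwise. */
static int
known(const struct tracker *t, const char *name)
{
    int i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->seen[i], name) == 0)
            return 1;
    return 0;
}

/* Feed one log line.  Returns 1 when the line is a "was not sleeping"
 * notice for a channel that earlier logged "sleeping without Waker". */
static int
feed_line(struct tracker *t, const char *line)
{
    char name[NAME_LEN];

    assert(t != NULL && line != NULL);
    if (!channel_name(line, name, sizeof name))
        return 0;
    if (strstr(line, "sleeping without Waker") != NULL) {
        /* name is shorter than NAME_LEN here, so strcpy is safe. */
        if (!known(t, name) && t->count < MAX_PEERS)
            strcpy(t->seen[t->count++], name);
        return 0;
    }
    return strstr(line, "was not sleeping") != NULL && known(t, name);
}
```

Start with a zeroed struct tracker and call feed_line() for each line read
from the log (for example with fgets()); every line for which it returns 1 is
a candidate for the abnormal case.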


J?? 
J?? 
J?? > Feb  5 11:15:20 ns innd: innfeed-neva1!:63:proc:77050 cant write: Resource temporarily unavailable
J?? > Feb  5 11:15:20 ns innd: innfeed-neva1!:63:proc:77050 blocked sleeping 5
J?? > Feb  5 11:15:20 ns innd: innfeed-neva1!:63:proc:77050 cant write: Resource temporarily unavailable
J?? > Feb  5 11:15:20 ns innd: innfeed-neva1!:63:proc:77050 blocked sleeping 5
J?? >
J?? > And have many "wakeup" messages.
J?? 
J?? Incidentally, these EAGAIN errors are very frequent on your server.
J?? Don't you have issues with other programs?  Is it only happening with INN?

It looks like it, yes: only with innd.
But I remind you that this is an old FreeBSD box (4.11). The same INN version
works without this problem on a more modern Linux box.

INN 2.4.3 worked fine on this FreeBSD box. From time to time I saw
"blocked sleeping" messages, but innd never crashed.
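
For reference, "Resource temporarily unavailable" is the text for EAGAIN,
which a non-blocking write() returns when the kernel buffer of the pipe or
socket is full. Below is a minimal sketch of how a caller can tell "retry
later" apart from a real failure. This is not innd's code: try_write,
set_nonblocking and the write_status values are hypothetical names for this
illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not INN's own API. */
enum write_status { WRITE_DONE, WRITE_PARTIAL, WRITE_BLOCKED, WRITE_FAILED };

/* Put a descriptor into non-blocking mode.  Returns 0 on success. */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try once to write len bytes to a non-blocking descriptor.
 * *written receives the number of bytes the kernel accepted. */
static enum write_status
try_write(int fd, const char *buf, size_t len, size_t *written)
{
    ssize_t n;

    assert(buf != NULL && written != NULL);
    *written = 0;

    do {
        n = write(fd, buf, len);
    } while (n < 0 && errno == EINTR);  /* interrupted by a signal: retry */

    if (n < 0) {
        /* Buffer full: not an error, the caller should retry later. */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return WRITE_BLOCKED;
        return WRITE_FAILED;
    }
    *written = (size_t) n;
    return (size_t) n == len ? WRITE_DONE : WRITE_PARTIAL;
}
```

On WRITE_BLOCKED the caller keeps the data and tries again once the descriptor
becomes writable or after a delay; my assumption is that the "blocked sleeping
5" message corresponds to such a delayed retry.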

--pety


