innd lockup

Fletcher Mattox fletcher at cs.utexas.edu
Fri Mar 31 15:26:58 UTC 2000


Just for the record, I just experienced this problem under Solaris 7.
I had seen it fairly frequently on earlier versions of Solaris, but hadn't
seen it since putting up Solaris 7 last summer.  So I assumed that it was
an OS bug, too.  This belief was supported by the fact that, in those
earlier cases, every process (not just innd) which tried to access the
disk became similarly stuck.
However, this time, only innd was stuck.  I/O to the disk by other
processes worked just fine.  If I had to bet, I'd probably still put
my money on an OS bug.  But my most recent experience has shaken that
belief a little.

This is a nearly impossible problem to debug, because you can't
ptrace(2) the process once it gets into this state, so debuggers
and truss won't work.  I was able to get one line out of
/usr/proc/bin/pstack, which indicated innd was inside write(2).  Duh.
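
For anyone who wants to catch this state before resorting to a reboot,
here is a rough sketch of filtering ps output for processes in
uninterruptible wait.  It assumes BSD-style ps columns (PID, STAT, WCHAN,
COMMAND) where a STAT beginning with "D" marks uninterruptible disk wait;
Solaris ps uses different state codes, so this would need adapting there.
The function name and the sample listing are made up for illustration and
are not output from any real system.

```python
def find_stuck(ps_output):
    """Return (pid, stat, wchan, command) for processes whose STAT starts with 'D'.

    Assumes four whitespace-separated columns with a header line, roughly
    what a BSD-style `ps -ax -o pid,stat,wchan,comm` is expected to print.
    """
    stuck = []
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header line
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # malformed or truncated line; ignore it
        pid, stat, wchan, command = fields
        if stat.startswith("D"):
            stuck.append((int(pid), stat, wchan, command))
    return stuck


# Hypothetical listing, invented for this example.
SAMPLE = """\
  PID STAT WCHAN  COMMAND
  214 Ss   select syslogd
  377 D    inode  innd
  402 Is   pause  cron
  511 D+   inode  imapd
"""

for pid, stat, wchan, command in find_stuck(SAMPLE):
    print(f"{command} (pid {pid}) is in state {stat}, waiting on {wchan}")
```

A process that shows up here only briefly is normal; one that stays in
this state across repeated runs is the interesting case.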

> From: brian at UCSD.Edu (Brian Kantor)
> Subject: Re: innd lockup
> 
> >
> >> I am currently experiencing a problem where innd locks up completely at
> >> random times.  In this state not even a kill -9 will do it and I have to
> >> reboot.
> >
> I missed which operating system you're running.
> 
> I have encountered this problem as well, but I believe it is a flaw in
> the OS I'm using (BSD/OS 4.1) which may only be triggered by a very busy
> system.  I would not be surprised to hear that the same thing is occurring
> in other 4.4BSD-derived OSs.
> 
> When it occurs, INND and other processes are stuck in 'inode' wait,
> which is a highly unusual place to see a process stuck.  No kill will
> release that, necessitating a reboot.  In some cases, the reboot will
> fail (usually the last message on the console is 'syncing filesystems'),
> and the system never shuts down, necessitating a hardware reboot.
> 
> I don't think it's INN that's doing it, as our Cyrus mail server has
> done this as well.  My suspicions lie in the area of memory-mapped file
> I/O, but that's just a guess.  I've gone no further for lack of time.
> Perhaps someone else can.
> 	- Brian
> 
More information about the inn-workers mailing list