Causes of NULL message-IDs (was Re: INN 2.4.0 dumps core on Sola

Forrest J. Cavalier III mibsoft at epix.net
Mon Oct 27 15:47:32 UTC 2003


> Hm.  Okay, I have no clue.  The message ID is null deep into the middle of
> ARTpost but isn't null at the beginning of that same routine.  So something
> else is going on, but lots of other people are running this same code without
> this problem.  I have no idea what's going on here.
> 

I've recently seen segfaults due to NULL Message-ID, but not in that
bit of code.  I don't think it is related, but it might give someone
an idea....

What happened (I think) was that a reader box using tradindexed over NFS
had mmap'ed a tradindexed .DAT file, which was later extended by a writer
running on a different box.

The reading box (running Linux) kept the old copy of the page in memory,
extended with NUL fills!  The .IDX updated properly, so when
nnrpd got the overview record, everything was NUL.  I think nnrpd
was smart enough not to send this record back to the
client, but it tried to add the Message ID to the history cache,
which segfaulted in HashMessageID.
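
A minimal sketch of the kind of guard that would avoid that crash, assuming
the caller has a pointer and length for the Message-ID field of the overview
record. msgid_looks_valid() is a hypothetical helper, not an existing INN
function, and the checks are deliberately crude:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of INN): returns 1 if the buffer looks
 * like a plausible Message-ID, 0 if it is NULL, too short, not wrapped
 * in angle brackets, or contains NUL bytes. */
static int msgid_looks_valid(const char *p, size_t len)
{
    if (p == NULL || len < 3)
        return 0;
    if (p[0] != '<' || p[len - 1] != '>')
        return 0;
    /* A stale page padded with NULs shows up as embedded zero bytes. */
    if (memchr(p, '\0', len) != NULL)
        return 0;
    return 1;
}
```

A caller would skip both the reply to the client and the history cache
insert whenever this returns 0.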

Synchronizing mmap'ed NFS reads on reader boxes that are separate from the
writer box is turning into a tremendous problem in a current INN deployment.

nfsreader is set to true, and I'm seeing the inode metadata
getting updated, and nnrpd does the tradindexed remap.  But Linux
persists in using the already-present, but stale, page instead of fetching
the updated one.  (It may be that more than one process has the
same page mapped, and an individual nnrpd request for remapping is
insufficient.)  I'm hoping there are some nfs/vfs/Linux gurus on the list
who can help me understand it.
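
For readers unfamiliar with what a "remap" means here, the following is a
rough sketch only: drop the old mapping, reopen the file, and map it again
at its current size. struct mapped_file and remap_fresh() are hypothetical
names, not INN code, and as described above this step alone did not cure
the stale pages in the deployment in question.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct mapped_file {        /* hypothetical bookkeeping, not an INN struct */
    void  *base;
    size_t len;
};

/* Unmap any old mapping, reopen the path, and map the file at its
 * current size.  Returns 0 on success, -1 on error. */
static int remap_fresh(const char *path, struct mapped_file *mf)
{
    struct stat st;
    void *p;
    int fd;

    if (mf->base != NULL && mf->len > 0)
        munmap(mf->base, mf->len);
    mf->base = NULL;
    mf->len = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {  /* empty file: nothing to map */
        close(fd);
        return 0;
    }
    p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);              /* the mapping survives the close */
    if (p == MAP_FAILED)
        return -1;
    mf->base = p;
    mf->len = (size_t) st.st_size;
    return 0;
}
```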

I've totally given up on mmapping group.index over NFS; instead,
group.index is now copied to the local file system and used from there.
That works.  But doing that for all the overviews is impractical.
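
The copy approach can be sketched roughly as follows, assuming plain
read()/write() on the NFS file and a rename() so that local readers never
see a half-written copy. copy_to_local() and the ".tmp" suffix are
illustrative choices, not what the deployment actually runs:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst through a temporary file, then
 * rename it into place.  Returns 0 on success, -1 on error. */
static int copy_to_local(const char *src, const char *dst)
{
    char tmp[4096];
    char buf[65536];
    ssize_t n;
    int in, out, len;

    len = snprintf(tmp, sizeof(tmp), "%s.tmp", dst);
    if (len < 0 || (size_t) len >= sizeof(tmp))
        return -1;
    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        char *p = buf;
        while (n > 0) {     /* write() may be partial, so loop */
            ssize_t w = write(out, p, (size_t) n);
            if (w < 0) {
                close(in);
                close(out);
                unlink(tmp);
                return -1;
            }
            p += w;
            n -= w;
        }
    }
    close(in);
    /* n is 0 at end of file and negative on a read error. */
    if (close(out) < 0 || n < 0) {
        unlink(tmp);
        return -1;
    }
    return rename(tmp, dst);    /* atomic replace on the local fs */
}
```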

I'm working on a solution to do overview deltas, but it isn't simple,
for obvious reasons.  The only guaranteed-correct MMAPed overview
files are on the writer, so we have to ask the writer to return
everything we need to update local overviews.  And if articles
arrive out of order (the writer is an xrefslave), doing an
"XOVER <localhi>-" isn't going to find the backfills.
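
To make the backfill problem concrete, here is a small sketch, assuming the
reader keeps a sorted list of the article numbers it already holds for a
group. find_gaps() is a hypothetical helper and not the code under
development; it only shows why a request starting above the local high
mark cannot see the holes below it:

```c
#include <stddef.h>

/* Hypothetical helper: have[0..n-1] is the sorted list of article
 * numbers already in the local overview.  Store the numbers in
 * [lo, hi] that are absent into missing[], up to max entries, and
 * return how many were stored. */
static size_t find_gaps(const unsigned long *have, size_t n,
                        unsigned long lo, unsigned long hi,
                        unsigned long *missing, size_t max)
{
    size_t i = 0, count = 0;
    unsigned long art;

    for (art = lo; art <= hi && count < max; art++) {
        while (i < n && have[i] < art)
            i++;
        if (i >= n || have[i] != art)
            missing[count++] = art;
        if (art == hi)      /* avoid wrapping when hi is ULONG_MAX */
            break;
    }
    return count;
}
```

With local articles 1, 2, 5 and 6, the local high mark is 6, so a request
for everything from 7 onward never reports 3 and 4 if they arrive late.
A gap can also be an article that will never exist, so gaps are only
candidates to ask the writer about.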

No easy solutions that I see, but I nearly have code which solves
it.  It turns out that modifying nnrpd to give back tokens and
XOVER data including backfills was pretty easy.  Efficiently selecting
and updating overviews is the hard part in this.

Anyone else here have any insights?



