Rebuilding group.index

Russ Allbery rra at stanford.edu
Wed Nov 10 23:48:40 UTC 2004


Lars Magne Ingebrigtsen <larsi at gnus.org> writes:

> I started up inn again, and it inflooped.  After poking around a bit, I
> discovered that the group.index file was the problem.  If I understand
> things correctly, there's an off_t in there that turned into a loff_t,
> which understandably threw things off.

> So I moved the old group.index file out of the way.

group.index was probably fine; it doesn't contain any offsets.  What
probably broke were the individual .IDX files for each group.  That said,
I don't know if ino_t changes with large file support; if it does, you
have to patch up group.index as well.

The fastest recovery mechanism is going to be to rewrite all of the .IDX
files to handle the change in data types.  Each .IDX file is just a bunch
of these:

/* An entry in the per-group .IDX index file. */
struct index_entry {
    off_t       offset;
    int         length;
    time_t      arrived;
    time_t      expires;        /* Expiration time from Expires: header. */
    TOKEN       token;
};

which means you should be able to write a pretty simple C program to read
in a struct (with off_t set to an unsigned long) and then write it out
again (copying that data into a real off_t).  It should be possible in any
language; the nice thing about C is that you could use the INN headers to
get you the other data sizes easily.  This is going to be faster and more
reliable than pretty much anything else you could do, I think.
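
A minimal sketch of such a converter follows.  It is not taken from INN:
convert_idx, old_index_entry, new_index_entry, and TOKEN_STANDIN are
hypothetical names, and the token layout is a stand-in, so a real version
should include the INN headers to get TOKEN and the other sizes right.  It
also assumes it is compiled on the same machine and ABI that wrote the old
files, so that unsigned long matches the old off_t.

```c
#define _FILE_OFFSET_BITS 64    /* ask for a 64-bit off_t on 32-bit systems */

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

/* Stand-in for INN's TOKEN (assumption); use the real INN header instead. */
typedef struct {
    unsigned char type;
    unsigned char storage_class;
    char          token[16];
} TOKEN_STANDIN;

/* The entry as written before the change: off_t read as an unsigned long. */
struct old_index_entry {
    unsigned long offset;
    int           length;
    time_t        arrived;
    time_t        expires;
    TOKEN_STANDIN token;
};

/* The entry as the large-file build expects it: a real off_t. */
struct new_index_entry {
    off_t         offset;
    int           length;
    time_t        arrived;
    time_t        expires;
    TOKEN_STANDIN token;
};

/* Read old-format entries from in and write new-format entries to out.
   Returns the number of entries converted, or -1 on an I/O error.  A
   trailing partial entry, if any, is dropped. */
long
convert_idx(FILE *in, FILE *out)
{
    struct old_index_entry old;
    struct new_index_entry entry;
    long count = 0;

    assert(in != NULL && out != NULL);
    while (fread(&old, sizeof(old), 1, in) == 1) {
        memset(&entry, 0, sizeof(entry));   /* zero any padding bytes */
        entry.offset  = (off_t) old.offset;
        entry.length  = old.length;
        entry.arrived = old.arrived;
        entry.expires = old.expires;
        entry.token   = old.token;
        if (fwrite(&entry, sizeof(entry), 1, out) != 1)
            return -1;
        count++;
    }
    return ferror(in) ? -1 : count;
}
```

One would open each old .IDX file for reading and a new file for writing,
call convert_idx on the pair, and rename the new file into place only once
the call returns a non-negative count.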

If you have to rebuild group.index as well, it's just one of these:

struct group_header {
    int         magic;
    struct loc  hash[TDX_HASH_SIZE];
    struct loc  freelist;
};

followed by some number of these:

struct group_entry {
    HASH        hash;           /* MD5 hash of the group name. */
    HASH        alias;          /* Intended to point to the group this group
                                   is an alias for.  Not currently used. */
    ARTNUM      high;           /* High article number in the group. */
    ARTNUM      low;            /* Low article number in the group. */
    ARTNUM      base;           /* Article number of the first entry in the
                                   .IDX index file for the group. */
    int         count;          /* Number of articles in group. */
    int         flag;           /* Posting/moderation status. */
    time_t      deleted;        /* When this group was deleted, or 0 if the
                                   group is still valid. */    
    ino_t       indexinode;     /* The inode of the index file for the group,
                                   used to detect when the file has been
                                   recreated and swapped out. */
    struct loc  next;           /* Next block in this chain. */
};

where struct loc is just an int (don't ask).  So you want to skip the
first (16 * 1024) * sizeof(int) + 2 * sizeof(int) bytes of that file (the
group_header) and then, for each of the remaining entries in the file, do
the same sort of transformation to widen the ino_t element.
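
A rough sketch of that transformation is below.  It rests on several
assumptions that should be checked against the INN headers: the old ino_t
is taken to be an unsigned long, HASH is taken to be 16 bytes, ARTNUM is
taken to be an unsigned long, and the header is copied through unchanged on
the assumption that struct loc holds record numbers rather than byte
offsets.  The names convert_group_index, old_group_entry, new_group_entry,
HASH_STANDIN, ARTNUM_STANDIN, and GROUP_HEADER_BYTES are hypothetical.

```c
#define _FILE_OFFSET_BITS 64    /* build with large file support */

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

/* Size of struct group_header as described above. */
#define GROUP_HEADER_BYTES ((16 * 1024) * sizeof(int) + 2 * sizeof(int))

/* Stand-ins for INN's HASH and ARTNUM (assumptions). */
typedef struct { char hash[16]; } HASH_STANDIN;
typedef unsigned long ARTNUM_STANDIN;

struct old_group_entry {
    HASH_STANDIN   hash;
    HASH_STANDIN   alias;
    ARTNUM_STANDIN high;
    ARTNUM_STANDIN low;
    ARTNUM_STANDIN base;
    int            count;
    int            flag;
    time_t         deleted;
    unsigned long  indexinode;  /* assumed width of the old ino_t */
    int            next;        /* struct loc is just an int */
};

struct new_group_entry {
    HASH_STANDIN   hash;
    HASH_STANDIN   alias;
    ARTNUM_STANDIN high;
    ARTNUM_STANDIN low;
    ARTNUM_STANDIN base;
    int            count;
    int            flag;
    time_t         deleted;
    ino_t          indexinode;  /* the real ino_t of the new build */
    int            next;
};

/* Copy the header verbatim, then rewrite every group entry with the new
   ino_t.  Returns the number of entries converted, or -1 on error. */
long
convert_group_index(FILE *in, FILE *out)
{
    char buf[4096];
    size_t left = GROUP_HEADER_BYTES;
    struct old_group_entry old;
    struct new_group_entry entry;
    long count = 0;

    assert(in != NULL && out != NULL);
    while (left > 0) {
        size_t want = left < sizeof(buf) ? left : sizeof(buf);

        if (fread(buf, 1, want, in) != want)
            return -1;
        if (fwrite(buf, 1, want, out) != want)
            return -1;
        left -= want;
    }
    while (fread(&old, sizeof(old), 1, in) == 1) {
        memset(&entry, 0, sizeof(entry));   /* zero any padding bytes */
        entry.hash       = old.hash;
        entry.alias      = old.alias;
        entry.high       = old.high;
        entry.low        = old.low;
        entry.base       = old.base;
        entry.count      = old.count;
        entry.flag       = old.flag;
        entry.deleted    = old.deleted;
        entry.indexinode = (ino_t) old.indexinode;
        entry.next       = old.next;
        if (fwrite(&entry, sizeof(entry), 1, out) != 1)
            return -1;
        count++;
    }
    return ferror(in) ? -1 : count;
}
```

As with the .IDX files, it seems safest to write to a new file and keep the
original group.index around until the result has been checked.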

Alternately, you could try building INN with large file support for
history but no large file support for the overview portions of the code,
which might actually work, since INN doesn't pass any off_t's into the
overview code or expect any back from it.

-- 
Russ Allbery (rra at stanford.edu)             <http://www.eyrie.org/~eagle/>

    Please send questions to the list rather than mailing me directly.
     <http://www.eyrie.org/~eagle/faqs/questions.html> explains why.

