storage tokens duplications
Julien ÉLIE
julien at trigofacile.com
Wed Aug 26 18:15:20 UTC 2009
Hi Kamil,
Thanks for your report!
> My *guess* is that there are only four hex digits describing the position of
> the article in the *.CAF file.
> Since ~200 000 articles were posted into a single .CF file, a "number
> overflow" occurred and the numbers were reused. But it's only my guess.
@040200470388000200000000000000000000@
is in timecaf-02/03/4788.CF
04 = TIMECAF
02 = the storage.conf class
00470388 = path 03/4788
0002 = article number in the file
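
The breakdown above can be sketched in C as a small decoder. This is only an illustration of the field layout described in this message, not INN's own parsing code; the names timecaf_fields, hexval and parse_token are hypothetical, and the bytes after the article number are simply ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the fields listed above (not an INN type). */
struct timecaf_fields {
    unsigned type;    /* 0x04 = TIMECAF */
    unsigned sclass;  /* the storage.conf class */
    uint32_t bucket;  /* the 4 bytes that determine the path */
    unsigned seqnum;  /* article number in the file */
};

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode the first 8 bytes of "@<hex digits>@".
   Returns 0 on success, -1 if the text is malformed or too short. */
static int parse_token(const char *text, struct timecaf_fields *out)
{
    unsigned char b[8];
    size_t i;

    if (text[0] != '@')
        return -1;
    for (i = 0; i < 8; i++) {
        int hi, lo;

        /* Check the first digit before touching the second one, so a
           short string stops at its terminating NUL. */
        hi = hexval(text[1 + 2 * i]);
        if (hi < 0)
            return -1;
        lo = hexval(text[2 + 2 * i]);
        if (lo < 0)
            return -1;
        b[i] = (unsigned char) (hi * 16 + lo);
    }
    out->type   = b[0];
    out->sclass = b[1];
    out->bucket = ((uint32_t) b[2] << 24) | ((uint32_t) b[3] << 16)
                | ((uint32_t) b[4] << 8) | (uint32_t) b[5];
    out->seqnum = ((unsigned) b[6] << 8) | b[7];
    return 0;
}
```

Calling parse_token() on the token above fills in type 0x04, class 0x02, the path bytes 0x00470388 and article number 2.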
The path is generated from the article's arrival time.
%convdate -n "Wed, 3 Oct 2007 14:17:36 +0200"
1191413856
in binary: 01000111 00000011 10001000 01100000
The low 8 bits are dropped. Shifting the remaining 24 bits by >> 8, >> 16
and >> 0 (with proper endianness) gives the three path components:
00000011/01000111|10001000.CF (that is, 03/4788.CF)
which means that all articles arriving within the same 256 seconds (2^8)
are put into the same file. And we can have 65535 articles (2^16-1) in a
file, if I understand correctly how that works.
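
As a rough sketch of that derivation in C (the function name timecaf_path is hypothetical, and the upper-case hex simply follows the spelling used in this message; the on-disk name may differ in case):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Build the relative path of the CAF file for an article of class sclass
   that arrived at Unix time arrival.  Returns snprintf's result. */
static int timecaf_path(char *buf, size_t len, unsigned sclass,
                        uint32_t arrival)
{
    /* Drop the low 8 bits: every arrival time within the same
       256-second window maps to the same value, hence the same file. */
    uint32_t bucket = arrival >> 8;

    return snprintf(buf, len, "timecaf-%02X/%02X/%02X%02X.CF",
                    sclass & 0xff,
                    (unsigned) ((bucket >> 8) & 0xff),
                    (unsigned) ((bucket >> 16) & 0xff),
                    (unsigned) (bucket & 0xff));
}
```

With class 2 and the arrival time 1191413856 from the convdate call, this yields timecaf-02/03/4788.CF. The low byte of that timestamp is 0x60, so the file name stays the same for the next 159 seconds and changes to 4789.CF after that.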
The article number in the file is given by:
s = htons(seqnum); // uint16_t htons(uint16_t hostshort)
memcpy(&token.token[4], &s + (sizeof(s) - 2), 2);
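
To illustrate what the overflow hypothesis would look like, here is a minimal sketch that stores a sequence number in two bytes the same way as the lines above and reads it back. It assumes the sequence number can be wider than 16 bits before the htons() call, which I have not checked in the source; stored_seqnum is a hypothetical helper, not an INN function.

```c
#include <arpa/inet.h>   /* htons(), POSIX */
#include <stdint.h>
#include <string.h>

/* Return the article number that would end up in the token for seqnum. */
static unsigned stored_seqnum(unsigned long seqnum)
{
    unsigned char bytes[2];
    /* htons() takes a uint16_t, so only the low 16 bits survive;
       the cast makes that truncation explicit. */
    uint16_t s = htons((uint16_t) seqnum);

    memcpy(bytes, &s, sizeof(s));
    /* The two bytes are now in network (big-endian) order. */
    return ((unsigned) bytes[0] << 8) | bytes[1];
}
```

Under that assumption, stored_seqnum(2) and stored_seqnum(65538) both return 2, so two different articles in the same file would get the same token, and article 200000 would be stored as 3392.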
> %head -n 1000000 /var/lib/news/history|cut -f 3 |grep "@04" |sort |
> uniq -c|sort -nr |head -n 10
> 2 @0401004A94E8057000000000000000000000@
Can you look at the corresponding file (timecaf-01/94/4AE8.CF) and
see how many articles you have in it?
(Maybe a count of 'Message-Id: ' would roughly give that.)
Another possibility would be race conditions, as Ray once suggested.
I read in the source:
/*
** variables for keeping track of currently pending write.
** FIXME: assumes only one article open for writing at a time.
*/
Maybe your count will tell us.
Thanks,
--
Julien ÉLIE
« Nam et ipsa scientia potestas est. » (Francis Bacon)