"tdx-util -A" spurious "xxx incorrect" warnings in 2.5.x ?

Julien ÉLIE julien at trigofacile.com
Wed Oct 26 20:47:07 UTC 2011


Hi Matija,

> when I run "tdx-util -A" on tradspool/tradindex based
> inn2-2.5.2-2~squeeze1 (Debian Stable inn2) I get errors like:
> 
> tdx-util: tradindexed: count incorrect for alt.fan.tolkien: 68 != 69
> tdx-util: tradindexed: low water mark incorrect for misc.answers: 2566 != 2561
> tdx-util: tradindexed: count incorrect for misc.answers: 5 != 10
> tdx-util: tradindexed: count incorrect for misc.health.diabetes: 321 != 326
> tdx-util: tradindexed: count incorrect for rec.answers: 47 != 50
> tdx-util: tradindexed: count incorrect for rec.arts.books.tolkien: 202 != 203
> tdx-util: tradindexed: count incorrect for rec.autos.makers.chrysler: 18 != 20
> tdx-util: tradindexed: count incorrect for rec.games.trading-cards.magic.rules: 7 != 10

I also see similar things on my server (2.5.3rc1).
Let's have a look at:

tdx-util: tradindexed: count incorrect for fr.sci.sociologie: 941 != 942




LIST ACTIVE fr.sci.sociologie
215 Newsgroups in form "group high low status"
fr.sci.sociologie 0000003144 0000001251 y
.

LIST COUNTS fr.sci.sociologie
215 Newsgroups in form "group high low count status"
fr.sci.sociologie 3144 1251 942 y
.

% tdx-util -i -n fr.sci.sociologie            
fr.sci.sociologie 3144 1251 1123 942 y 0 14155854

% tdx-util -g -n fr.sci.sociologie | wc -l
2022

% tdx-util -g -n fr.sci.sociologie | grep -v 000000000000000000000000000000000000 | wc -l
941


Hmm, that's interesting.
"tdx-util -g -n fr.sci.sociologie" returns lines like:

3142 0 0 0 0 @000000000000000000000000000000000000@
3143 778359 1070 1319608541 0 @030346523100000000000005B99B00000003@
3144 0 0 0 0 @000000000000000000000000000000000000@

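So 941 is the number of entries that still have a valid token, one fewer than the
stored count of 942 reported by "tdx-util -i" and LIST COUNTS.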
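
The same check as the grep pipeline above can be written as a small C helper.
This is only a sketch: line_has_live_token() and count_live_tokens() are
hypothetical names that are not part of INN, and the code assumes the line layout
shown above, with the storage token between two '@' characters as the last field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of INN): returns 1 if one line of
   "tdx-util -g" output carries a live storage token, that is, if the
   @...@ field contains at least one character other than '0'. */
static int line_has_live_token(const char *line)
{
    const char *start, *end, *p;

    assert(line != NULL);
    start = strchr(line, '@');
    if (start == NULL)
        return 0;
    end = strchr(start + 1, '@');
    if (end == NULL)
        return 0;
    for (p = start + 1; p < end; p++)
        if (*p != '0')
            return 1;
    return 0;
}

/* Count the lines that still reference a stored article. */
static size_t count_live_tokens(const char *const *lines, size_t n)
{
    size_t i, live = 0;

    for (i = 0; i < n; i++)
        if (line_has_live_token(lines[i]))
            live++;
    return live;
}
```

Fed the three sample lines above (3142, 3143, 3144), count_live_tokens() counts
only the entry for article 3143.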




> The list changes over time. It looks like a race condition where the
> data changes while tdx-util runs. Does "tdx-util -A" under 2.5.x
> need to run against a throttled/stopped server?

No, it does not.
I have just tested with a throttled server, and the result is the same.



> This problem never happened under 2.4.6 for several years, and under
> 2.5.2 it happens on practically every nightly run. Maybe tdx-util just
> didn't have those checks back then, or it might have been locking
> data itself to ensure consistency?
> 
> Ideas?

I bet that it comes from cancelled articles.

% grephistory -l '<hk14rdff$7mrr1$1 at news.trigofacile.com>'
[99C1F7FE2815C88E2597F629C1D7BAE9]	1280501569~-~1280489007	@050000000007000001CD0000000000000000@

% tdx-util -i -n trigofacile.test
trigofacile.test 516 7 1 461 y 0 11731380

% tdx-util -g -n trigofacile.test | grep -v 000000000000000000000000000000000000 | wc -l
461

% ctlinnd cancel '<hk14rdff$7mrr1$1 at news.trigofacile.com>'
Ok

% tdx-util -g -n trigofacile.test | grep -v 000000000000000000000000000000000000 | wc -l
460

% tdx-util -i -n trigofacile.test
trigofacile.test 516 7 1 461 y 0 11731380

... and:

% tdx-util -A
tdx-util: tradindexed: count incorrect for trigofacile.test: 460 != 461




Changes in INN 2.5.2:
"When using tradindexed, the overview data for a cancelled article is now immediately
removed from the overview."

http://inn.eyrie.org/trac/changeset/8966




As for your second reported error ("low water mark incorrect for misc.answers: 2566 != 2561"),
it may also be related because you have "count incorrect for misc.answers: 5 != 10" at the same time.
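
To illustrate how cancellations could produce both messages at once, here is a
sketch with made-up data.  It assumes that the audit recomputes the low water mark
as the lowest article number that still has overview data; I have not checked that
this is exactly what tdx-util does.  The struct and function names are hypothetical,
and the article numbers 2561 to 2570 are invented to match the reported message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one overview index slot.  This is
   not INN's real on-disk structure. */
struct slot {
    unsigned long artnum; /* article number */
    int live;             /* non-zero if overview data is still present */
};

/* Hypothetical summary of what an audit might recompute for a group. */
struct summary {
    unsigned long low;   /* lowest live article number, 0 if none */
    unsigned long count; /* number of live articles */
};

static struct summary recompute(const struct slot *slots, size_t n)
{
    struct summary s = { 0, 0 };
    size_t i;

    assert(slots != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!slots[i].live)
            continue;
        /* The first live slot seen, or any lower one, sets the low mark. */
        if (s.count == 0 || slots[i].artnum < s.low)
            s.low = slots[i].artnum;
        s.count++;
    }
    return s;
}
```

With ten slots numbered 2561 to 2570, cancelling the five lowest ones makes
recompute() return low 2566 and count 5, while a stored entry that was never
updated would still say low 2561 and count 10.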




I am unsure of the right fix.
When an article is added, the count is increased by tdx_data_add() in tdx-group.c, called
by tradindexed_add() in tradindexed.c.
I believe that the count should likewise be decreased when an article is cancelled.  This is
not currently done.  Doing it properly seems to require changing the prototype of
tdx_data_cancel() in tdx-group.c, because the group entry is not currently passed to this
function, which is called by tradindexed_cancel().  We need to decrease entry->count.

bool tdx_data_add(struct group_index *, struct group_entry *,
                  struct group_data *, const struct article *);

bool tdx_data_cancel(struct group_data *, ARTNUM);

=> bool tdx_data_cancel(struct group_entry *, struct group_data *, ARTNUM);
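
A rough sketch of the idea, using stand-in types rather than the real tradindexed
structures: sketch_entry, sketch_data and sketch_cancel() are hypothetical, and the
real code would also have to lock the group entry and mark the on-disk index record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long ARTNUM; /* stand-in for INN's ARTNUM typedef */

/* Hypothetical stand-in for the per-group index entry. */
struct sketch_entry {
    ARTNUM low;
    ARTNUM high;
    ARTNUM count;
};

/* Hypothetical stand-in for the per-group overview data. */
struct sketch_data {
    ARTNUM base; /* article number stored in live[0] */
    size_t size; /* number of slots in live[] */
    bool *live;  /* true while overview data is present */
};

/* Cancel one article and keep the stored count in step with the data.
   Returns false if the article is out of range or already cancelled. */
static bool sketch_cancel(struct sketch_entry *entry,
                          struct sketch_data *data, ARTNUM artnum)
{
    size_t offset;

    assert(entry != NULL && data != NULL);
    if (artnum < data->base || artnum - data->base >= data->size)
        return false;
    offset = (size_t) (artnum - data->base);
    if (!data->live[offset])
        return false; /* already cancelled: do not decrement twice */
    data->live[offset] = false;
    if (entry->count > 0)
        entry->count--;
    return true;
}
```

The guard against an already cancelled article matters here: without it, a
repeated cancel would push the stored count below the real number of articles.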


** Does that look like the right thing to do?  (A question probably for Russ, who knows the
internals of tradindexed far better than I do.)




Regarding other overview methods, hmm...

bool buffindexed_cancel(const char *group UNUSED, ARTNUM artnum UNUSED)
{
    return true;
}

bool ovdb_cancel(const char *group UNUSED, ARTNUM artnum UNUSED)
{
    return true;
}

Well, there is not much to do there!  In any case, we do not have tools like tdx-util for them.
(And the count returned by LIST ACTIVE/COUNTS does not need to be exact.)

-- 
Julien ÉLIE

"Five and a half million French drivers have poor eyesight.
  Fortunately, their number is decreasing day by day." (Coluche)


