wipcheck and wipexpire

Mon May 15 17:47:42 UTC 2000

In article <20000514003441N.kondou at inn.do.mms.mt.nec.co.jp>, you say...
>
>In article <200005122039.NAA82428 at pathlink.net>,
>        Kachun Lee <kachun at pathlink.com> wrote;
>
>} I could not understand why there should be wipcheck and wipexpire, instead
>} of just one value. Any insight will be greatly appreciated.
>
>The reason why both exist is to differentiate the offered channel
>from other channels.  I think both can be the same in most
>cases.  That's simply to have more flexibility, though I admit
>that may leave people confused.
>-- 
>Katsuhiro Kondou
>

Thank you for the response. The number of duplicate rejects on our inn 2.2 feed
server was very high, close to 80+% of the accepted number. I finally decided to
look into it last week, since duplicates at that rate would almost double our
inbound bandwidth.

After turning on tracing and adding some more syslog calls, I found that it was
quite typical for 7-8 of our peers to offer the same article within a few
seconds. With the current defaults (wipcheck=5, wipexpire=10), the 1st CHECK was
answered with SENDIT. Then 1 or 2 of the following CHECKs were answered with
RESEND, which indicated that the msgid was in WIP. However, CHECKs that arrived
2-3 seconds after the first one were answered with SENDIT instead, which
resulted in 3-5 DUPLICATEs and caused the same article to be sent over many
times.

After examining the source code, I saw that this was how WIPinprogress was
coded. After wipcheck (5 sec), WIPinprogress would return negative, until
wipexpire. In other words, any CHECK for the same msgid that came in 5-10 sec
after the 1st one would result in a DUPLICATE. Even though one may argue that
5 sec is a long time to respond to a CHECK, I do not see a practical reason for
having this gap.
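
To make the gap concrete, here is a minimal model of the behavior as I
described it above. It is not the innd source: the function name, its
parameters, and the idea of passing the entry age directly are all
simplifications I made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the behavior described above, NOT the innd code.
 * age          seconds since the msgid entered WIP (hypothetical input)
 * same_channel true if the CHECK comes from the channel that was first
 *              answered with SENDIT (the "offered" channel)
 * Returns true when the CHECK should be answered with RESEND, false when
 * it would be answered with SENDIT.
 */
static bool wip_model_current(int age, bool same_channel,
                              int wipcheck, int wipexpire)
{
    assert(age >= 0);
    /* Other channels are only held off for wipcheck seconds; the
     * offered channel is held off for the full wipexpire. */
    int limit = same_channel ? wipexpire : wipcheck;
    return age < limit;
}
```

With wipcheck=5 and wipexpire=10, a CHECK from another peer at age 3 gets
RESEND in this model, but the same CHECK at age 7 gets SENDIT even though the
first transfer may still be running. That is the window that produces the
DUPLICATEs.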

If different behavior was intended for the offered channel between wipcheck and
wipexpire, I think it should be inverted, i.e. negative for the offered
channel and positive for others. The logic is that if the offered channel sends
another CHECK for the same msgid, it must have lost the previous SENDIT
response, so it should be answered with SENDIT again. All others should always
be answered with RESEND as long as the msgid is in WIP.
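
The inverted logic I am suggesting could be sketched like this. Again, this is
only an illustration of the idea; the function name and parameters are
hypothetical and do not come from innd.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the proposed (inverted) behavior, NOT existing innd code.
 * Returns true for RESEND, false for SENDIT.
 */
static bool wip_model_inverted(int age, bool same_channel,
                               int wipcheck, int wipexpire)
{
    assert(age >= 0);
    if (age >= wipexpire)
        return false;           /* entry has expired: nobody is held off */
    if (same_channel)
        return age < wipcheck;  /* after wipcheck, assume the SENDIT was
                                   lost and answer SENDIT again */
    return true;                /* other channels: RESEND while in WIP */
}
```

Here a CHECK from another peer at age 7 (wipcheck=5, wipexpire=10) is still
answered with RESEND, while a repeated CHECK from the offered channel at the
same age is answered with SENDIT.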

Also, in real time, it looks like the wipcheck time can be 1-2 sec off. That
is probably because most of the timing in innd is done with Now.time, which can
have a resolution of 1 sec or more.

Nevertheless, I set both wipcheck and wipexpire to 15 sec and reduced the
DUPLICATEs by 70%. Then I looked further and found that most of the remaining
DUPLICATEs were due to big binary articles taking longer than 15 sec to
transfer. I understand that 15 sec is a very long time to transfer a 600K
article if there are only a few feed channels. But when that is spread out
over 150 feed channels, 15 sec (or worse, 5 sec, the default wipcheck) is
really not enough. With a patch to vary the wipcheck depending on the previous
article transfer time of the channel, the DUPLICATE number was reduced to under
1000/day and the inbound bandwidth from 20+ mbit/sec to under 15 mbit/sec over
the weekend. I can post the patch (to inn-patch?) after I recheck some numbers
and if someone is kind enough to confirm that I have not missed some purpose of
wipcheck/wipexpire.
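
One possible way to vary the hold time with the previous transfer time of a
channel is sketched below. This is not the patch itself. The formula (take the
larger of the configured wipcheck and the last transfer time, then cap it) and
all names are assumptions chosen only to show the idea.

```c
#include <assert.h>

/*
 * Hypothetical helper, NOT the actual patch.
 * base_wipcheck     configured wipcheck, in seconds
 * last_transfer_sec how long the previous article took on this channel
 * cap_sec           upper bound, so one slow transfer cannot hold a
 *                   msgid in WIP forever
 * Returns the number of seconds other channels are answered with RESEND.
 */
static int effective_wipcheck(int base_wipcheck, int last_transfer_sec,
                              int cap_sec)
{
    assert(base_wipcheck >= 0 && last_transfer_sec >= 0 && cap_sec >= 0);
    int hold = base_wipcheck;
    if (last_transfer_sec > hold)
        hold = last_transfer_sec;   /* slow channel: wait at least as long
                                       as its previous article took */
    if (hold > cap_sec)
        hold = cap_sec;
    return hold;
}
```

For a channel whose last article took 40 sec, with wipcheck=15 and a cap of
120 sec, this would hold off other peers for 40 sec instead of 15.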

Best regards