wipcheck and wipexpire
kachun at pathlink.com
Mon May 15 17:47:42 UTC 2000
In article <20000514003441N.kondou at inn.do.mms.mt.nec.co.jp>, you say...
>In article <200005122039.NAA82428 at pathlink.net>,
> Kachun Lee <kachun at pathlink.com> wrote;
>} I could not understand why there should be wipcheck and wipexpire, instead
>} of just one value. Any insight will be greatly appreciated.
>The reason why both exist is to differentiate the offered channel
>from other channels. I think both can be the same in most
>cases. That's simply to have more flexibility, though I admit
>that may confuse people.
Thank you for the response. The duplicate reject number on our inn 2.2 feed
server was very high: 80% or more of the accepted number. I finally decided to
look into it last week, since that would almost double our inbound bandwidth.
After turning on trace and adding some more syslog calls, I found out that it
was quite typical for 7-8 of our peers to offer the same article within a few
seconds. With the current defaults (wipcheck=5, wipexpire=10), the first CHECK
was answered with SENDIT. Then 1 or 2 of the following CHECKs were answered
with RESEND, which indicated that the msgid was in WIP. However, CHECKs that
arrived 2-3 seconds after the first one were answered with SENDIT instead,
which resulted in 3-5 DUPLICATEs and caused the same article to be sent over
many times.
After examining the source code, I saw that this was how WIPinprogress was
coded. After wipcheck (5 sec), WIPinprogress would return negative, until
wipexpire. In other words, any CHECK for the same msgid that came in 5-10 sec
after the first one would result in a DUPLICATE. Even though one may argue that
5 sec is a long time for responding to a CHECK, I do not see a practical reason
for having this gap.
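
The gap described above can be modelled in a few lines of Python. This is only
a sketch of the behaviour as reported in this message, not the code from innd:
check_reply is a made-up name, age is the number of seconds since the first
CHECK for the msgid was answered with SENDIT, and the exact comparisons at the
boundaries are an assumption.

```python
def check_reply(age, wipcheck=5, wipexpire=10):
    """Reply to a CHECK from another peer for a msgid already offered.

    Toy model only; check_reply is a hypothetical name, not an innd function.
    age: seconds since the first CHECK for this msgid was answered with SENDIT.
    """
    if age < wipcheck:
        return "RESEND"  # msgid is still reported as work in progress
    if age < wipexpire:
        return "SENDIT"  # the gap: entry exists but is reported as not in progress
    return "SENDIT"      # entry has expired, so the article is requested again


# Compare the defaults with wipcheck and wipexpire both set to 15.
for age in (1, 4, 6, 9, 12):
    print(age, check_reply(age), check_reply(age, wipcheck=15, wipexpire=15))
```

With the defaults, the CHECKs at 6 and 9 seconds are answered with SENDIT and
become DUPLICATEs. With both values at 15, every CHECK in the list is answered
with RESEND.
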
If different behavior was intended for the offered channel between wipcheck and
wipexpire, I think it should be inverted, i.e. negative for the offered
channel and positive for the others. The logic is that if the offered channel
sends another CHECK for the same msgid, it must have lost the previous SENDIT
response, so it should be answered with SENDIT again. All the others should
always be answered with RESEND as long as the msgid is in WIP.
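
A minimal sketch of the inverted rule, under the same caveats: proposed_reply
and from_offered_channel are hypothetical names, and answering everyone with
RESEND before wipcheck is one reading of the proposal rather than something
taken from the innd source.

```python
def proposed_reply(from_offered_channel, age, wipcheck=5, wipexpire=10):
    """Reply to a repeated CHECK under the inverted rule (illustration only).

    from_offered_channel: True if the CHECK comes from the peer that was
    already told SENDIT for this msgid.
    age: seconds since that first SENDIT.
    """
    if age >= wipexpire:
        return "SENDIT"  # entry has expired: treat the msgid as new
    if from_offered_channel and age >= wipcheck:
        return "SENDIT"  # this peer presumably lost the first SENDIT
    return "RESEND"      # all others wait while the msgid is in WIP


print(proposed_reply(True, 7))   # offered channel asks again inside the window
print(proposed_reply(False, 7))  # another peer offers the same msgid
```

Under this rule the 5-10 sec window no longer produces a second SENDIT for a
different peer, so it cannot cause a DUPLICATE by itself.
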
Also, in real time, it looks like the wipcheck time can be 1-2 sec off. That is
probably because most of the timing in innd is done with Now.time, which can
have a resolution of 1 sec or more.
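
The effect of a coarse clock can be illustrated as follows. The sketch assumes
that timestamps are simply truncated to whole seconds; whether Now.time in innd
behaves exactly like this is an assumption, and coarse_age is a hypothetical
helper.

```python
import math


def coarse_age(start, now):
    """Age in whole seconds when both timestamps are truncated to seconds."""
    return math.floor(now) - math.floor(start)


print(coarse_age(10.95, 15.05))  # real elapsed time is about 4.1 sec
print(coarse_age(10.05, 15.95))  # real elapsed time is about 5.9 sec
```

Both calls report an age of 5, so a nominal 5 sec wipcheck can correspond to
anything from just over 4 to just under 6 sec of real time. A clock that is
refreshed less often than once per second would widen that range further.
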
Nevertheless, I set both wipcheck and wipexpire to 15 sec and reduced the
DUPLICATEs by 70%. Then I looked further and found out that most of the
remaining DUPLICATEs were due to a lot of the big binary articles taking longer
than 15 sec to transfer. I understand that 15 sec is a very long time to
transfer a 600K article if there are only a few feed channels. But when that is
spread out over 150 feed channels, 15 sec (or worse, 5 sec, the default
wipcheck) is really not enough. With a patch to vary the wipcheck depending on
the previous article transfer time of the channel, the DUPLICATE number was
reduced to under 1000/day and the inbound bandwidth went from 20+ mbit/sec to
under 15 mbit/sec over the weekend. I can post the patch (to inn-patch?) after
I recheck some numbers and if someone is kind enough to confirm that I have not
missed some purpose about
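
The patch itself is not included in this message, so the following is not the
patch, only a sketch of the general idea of a per-channel wipcheck. The rule
used here (hold the WIP entry at least as long as the channel's previous
article took, within fixed bounds) and the names adaptive_wipcheck,
per_channel_kbit, minimum and maximum are all assumptions made for
illustration. The second function just works out the arithmetic behind the
600K example, taking 600K as 600,000 bytes.

```python
def adaptive_wipcheck(prev_transfer_secs, minimum=5, maximum=60):
    """Hypothetical rule: keep the msgid in WIP at least as long as the
    channel's previous article took to arrive, bounded by minimum and
    maximum (all values in seconds)."""
    return max(minimum, min(prev_transfer_secs, maximum))


def per_channel_kbit(article_bytes, secs):
    """Rate (kbit/sec) one channel must sustain to move an article in secs."""
    return article_bytes * 8 / secs / 1000


# Fast channel, slow channel, and a stalled channel capped at the maximum.
print(adaptive_wipcheck(2), adaptive_wipcheck(22), adaptive_wipcheck(300))

# A 600,000 byte article delivered within 15 sec.
print(per_channel_kbit(600_000, 15))
```

A single channel needs 320 kbit/sec to deliver such an article within 15 sec.
If all 150 channels were busy at that rate at once, the total would be 48
mbit/sec, well above the 20+ mbit/sec of inbound traffic mentioned above,
which is why a fixed 15 sec cannot cover the large articles.
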