Artsize still not quite right in CURRENT

Fabien Tassin fta at sofaraway.org
Fri Apr 13 23:03:30 UTC 2001


According to bill davidsen:
> 
>   Worse than that, I found that sm is just hanging on some articles, not
> ever returning anything, and accumulating a small amount of CPU. I'll
> try to get a trace of what's happening.

Same here. sm and innxmit from CURRENT are both eating all the memory
and spending most of their time in D state (uninterruptible sleep)
because of this.

strace -tt sm @0302425546303100000000AE4B9F00000007@
....
00:43:34.664324 open("/news/spool/articles/BUF01", O_RDONLY|0x8000) = 4
00:43:34.664421 fcntl(4, F_GETFD)       = 0
00:43:34.664491 fcntl(4, F_SETFD, FD_CLOEXEC) = 0
00:43:34.664564 mmap2(NULL, 13123584, PROT_READ, MAP_SHARED, 4, 0) = 0x4012a000
00:43:34.664672 pread(4, "7G)M%8D:LB9@", 12, 1553415680) = 12
00:43:34.664772 old_mmap(NULL, 927412224, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40dae000
00:43:34.664880 pread(4, "\"@BSA,.G#6[K03YUHS at UPD+@]ZN<[\r\nM"..., 927410509, 1553415692) = 927410509
00:44:43.659923 munmap(0x4012a000, 13123584) = 0
00:44:43.676237 close(4)                = 0
then nothing more.

About 69 seconds pass between the start of the big pread() and the
following munmap(), going by the timestamps.
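
The time between two syscalls can be read off the strace -tt timestamps.
A minimal sketch in C, assuming every timestamp has the HH:MM:SS.uuuuuu
shape shown in the trace above (exactly six fractional digits) and that
both fall on the same day; parse_tt and elapsed_usec are hypothetical
helper names, not part of strace or INN:

```c
#include <stdio.h>

/* Hypothetical helper: convert "HH:MM:SS.uuuuuu" to microseconds since
 * midnight. Returns -1 if the string does not match that shape. */
static long long parse_tt(const char *ts)
{
    int h, m, s, us;

    if (sscanf(ts, "%d:%d:%d.%d", &h, &m, &s, &us) != 4)
        return -1;
    return ((h * 60LL + m) * 60LL + s) * 1000000LL + us;
}

/* Hypothetical helper: microseconds elapsed between two timestamps,
 * or -1 if either one cannot be parsed. */
static long long elapsed_usec(const char *start, const char *end)
{
    long long a = parse_tt(start);
    long long b = parse_tt(end);

    if (a < 0 || b < 0)
        return -1;
    return b - a;
}
```

Applied to the timestamps on the pread() and munmap() lines above, it
gives the wall-clock time spent in the read.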

Here is what gdb shows:

(gdb) r  @0302425546303100000000AE4B9F00000007@
Starting program: /home/news/bin/sm @0302425546303100000000AE4B9F00000007@

Ctrl-C after one minute :

Program received signal SIGINT, Interrupt.
0x8049c6b in FromWireFmt (
    article=0x40dae008 "\"@BSA,.G#6[K03YUHS at UPD+@]ZN<[\r\nM$QP=F+3_%(_:9+>3#?W6;XJL1C+NR\"\\!D\"P&2H<3_A4K'R4-I8(.D!PY)SJ.\r\nM;=\">\"Q31']^7I2GE2:V5<Z/8K0,5>S6VZN677H\"/2)E1[7.!?DI2SB7P0ANQ\r\nM2>X1M_,.:$+I83_L%$_A_TLB?[@'00_WIK,#OU=_5W"..., len=927410509, newlen=0xbffffc80) at interface.c:179
179             } else if (p < &article[len-1] && *p == '\r' && p[1] == '\n') { 
(gdb) bt
#0  0x8049c6b in FromWireFmt (
    article=0x40dae008 "\"@BSA,.G#6[K03YUHS at UPD+@]ZN<[\r\nM$QP=F+3_%(_:9+>3#?W6;XJL1C+NR\"\\!D\"P&2H<3_A4K'R4-I8(.D!PY)SJ.\r\nM;=\">\"Q31']^7I2GE2:V5<Z/8K0,5>S6VZN677H\"/2)E1[7.!?DI2SB7P0ANQ\r\nM2>X1M_,.:$+I83_L%$_A_TLB?[@'00_WIK,#OU=_5W"..., len=927410509, newlen=0xbffffc80) at interface.c:179
#1  0x80499a2 in main (argc=2, argv=0xbffffd0c) at sm.c:105
(gdb) c
Continuing.

30 seconds later :

Program received signal SIGINT, Interrupt.
0x8049c6b in FromWireFmt (
    article=0x40dae008 "\"@BSA,.G#6[K03YUHS at UPD+@]ZN<[\r\nM$QP=F+3_%(_:9+>3#?W6;XJL1C+NR\"\\!D\"P&2H<3_A4K'R4-I8(.D!PY)SJ.\r\nM;=\">\"Q31']^7I2GE2:V5<Z/8K0,5>S6VZN677H\"/2)E1[7.!?DI2SB7P0ANQ\r\nM2>X1M_,.:$+I83_L%$_A_TLB?[@'00_WIK,#OU=_5W"..., len=927410509, newlen=0xbffffc80) at interface.c:179
179             } else if (p < &article[len-1] && *p == '\r' && p[1] == '\n') { 
(gdb) p len
$1 = 927410509
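
For context, a wire-format to native conversion has to look at every
byte of the article, so with len=927410509 it walks the whole anonymous
mapping. A minimal sketch of such a conversion (CRLF to LF, dot-stuffing
removed, stop at the ".\r\n" terminator); this is an illustration only,
not the code at interface.c:179, and wire_to_native is a hypothetical
name:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch, NOT INN's FromWireFmt. Returns a malloc'ed,
 * NUL-terminated copy of the article in native format and stores its
 * length in *newlen. Returns NULL if allocation fails. */
static char *wire_to_native(const char *article, size_t len, size_t *newlen)
{
    char *out = malloc(len + 1);   /* output is never longer than input */
    size_t i = 0, o = 0;
    int at_line_start = 1;

    if (out == NULL)
        return NULL;

    while (i < len) {
        if (at_line_start && article[i] == '.') {
            /* A lone "." followed by CRLF ends the article. */
            if (i + 2 < len && article[i + 1] == '\r' && article[i + 2] == '\n')
                break;
            /* ".." at line start is a stuffed dot: drop the first one. */
            if (i + 1 < len && article[i + 1] == '.')
                i++;
        }
        if (article[i] == '\r' && i + 1 < len && article[i + 1] == '\n') {
            out[o++] = '\n';       /* CRLF becomes a single LF */
            i += 2;
            at_line_start = 1;
        } else {
            out[o++] = article[i++];
            at_line_start = 0;
        }
    }
    out[o] = '\0';
    *newlen = o;
    return out;
}
```

The loop is linear in len, so on a bogus 927 MB "article" it touches
every page of the buffer, which fits the memory use and D state seen
here.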

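It looks like the CNFS buffer (50 GB) is not clean: sm picks up an
article length of 927410509 bytes (roughly 900 MB), allocates that much
and then walks all of it in FromWireFmt.
Is there a tool to check the integrity of a cycbuff? Hmm, at 50 GB each,
I assume it is faster to just reset them.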
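
A sanity check on the length before allocating would at least avoid the
927 MB anonymous mmap seen in the trace. A minimal sketch, assuming the
caller knows the cycbuff size and some configured maximum article size;
article_len_plausible and its parameters are hypothetical, not an
existing INN function:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether an article length read from a
 * buffer can possibly be valid before trusting it for an allocation.
 *   offset       - where the article body starts in the buffer
 *   len          - length claimed for the article
 *   buf_size     - total size of the buffer
 *   max_art_size - largest article the server is willing to handle */
static bool article_len_plausible(uint64_t offset, uint64_t len,
                                  uint64_t buf_size, uint64_t max_art_size)
{
    if (len == 0 || len > max_art_size)
        return false;
    if (offset > buf_size)
        return false;
    /* Written as a subtraction so offset + len cannot overflow. */
    if (len > buf_size - offset)
        return false;
    return true;
}
```

With the values from the trace (offset 1553415692, len 927410509) and a
maximum article size of, say, 1000000 bytes, the check fails and the
article can be reported as corrupt instead of being read.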

>   The hangs are so far only on Linux with largefile support. I doubt
> that has anything to do with anything.

This box is also running Linux with largefile support.

-- 
Fabien Tassin -+- fta at sofaraway.org

