how does history file get too large?

Anne Wilson anne at unidata.ucar.edu
Wed Jan 28 19:39:58 UTC 2004


bill davidsen wrote:
> 
> What are you running for version and o/s which doesn't support large
> files? That's the real problem here, most o/s have allowed large files
> for some years.
> 

I'm running Linux 2.4.22, which does support large files.

But I'm not convinced of the benefit of having a huge history file.  For one thing, innd does seem much faster with my new, relatively tiny history file (although it's grown by almost a factor of 8 in about 24 hours).  And speed is a significant concern for us.

My 'remember' value was (and still is) 10, and things have worked great for months.  I'm happy with remembering articles for 10 days generally - it's not that I want to remember them longer than that. 


> Warning, you first want to be sure that you don't delete entries for
> anything which is still in spool. That will *really* bite you over time.
> 

My understanding is that entries in the history file that have only two fields are either expired or rejected, and thus everything else is still in the spool.  But I got that idea from an example in the makehistory man page - I assume my interpretation is correct.
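(As a quick sanity check on that interpretation, one could tally the two flavors of line.  This is just a sketch, assuming the usual tab-separated history layout where live entries carry an `@...@` storage token and expired/rejected ones don't:

```python
def tally(lines):
    """Count history lines with and without a storage token.

    Assumes live entries contain an '@...@' token field and
    expired/rejected entries do not.
    """
    live = gone = gone_bytes = 0
    for line in lines:
        if '@' in line:
            live += 1
        else:
            gone += 1
            gone_bytes += len(line)  # bytes that pruning could reclaim
    return live, gone, gone_bytes
```

Running it over the history file would show how many bytes are tied up in token-less entries, i.e. the most that pruning could ever reclaim.)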


> Given that, it's pretty simple to write a program, let's call it a perl
> script because the source is very easy, and do thus:
> 
> - calculate what you need to delete
> 
...
> 
> You're done.
> 

Yeah - this is what I was thinking...  

> ================ I made this up, untested ================
> 
> #!/usr/bin/perl
> 
> $DropGoal = shift @ARGV;
> 
> while (<>) {
>   if (m/@/) {
>     print $_;
>   } else {
>     $dropped += length($_);
>     last if $dropped >= $DropGoal;
>   }
> }
> 
> # copy the rest w/o checks
> print $_ while <>;
> exit 0;
> ================================================================
> 

Thanks!  Although my Perl is weak - I tend to write such things in Python.
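(A rough Python port of the same idea might look like this - untested like the original, and assuming the same rule that lines containing '@' still have a storage token and must be kept:

```python
import sys

def prune(lines, drop_goal):
    """Yield history lines, skipping token-less entries until roughly
    drop_goal bytes of them have been dropped, then pass the rest
    through unchanged."""
    dropped = 0
    lines = iter(lines)
    for line in lines:
        if '@' in line:            # entry still has a storage token: keep it
            yield line
        else:                      # expired/rejected entry: drop it
            dropped += len(line)
            if dropped >= drop_goal:
                break
    # copy the rest without checks, as in the Perl version
    yield from lines

if __name__ == '__main__':
    drop_goal = int(sys.argv[1])
    sys.stdout.writelines(prune(sys.stdin, drop_goal))
```

Same caveat as the Perl: run it against a copy of the history file first, and rebuild the dbz index afterwards.)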


>   Do remember there are other ways to do it, setting remember is
> probably better, but you DID say you wanted to check and recover...


The tradeoff, as I see it, is that the pruning-script approach keeps the file within a specified size limit, while using remember is completely independent of the file size (and, indeed, has caused some problems).  Am I missing anything?

I could write the pruning script and then see how old the timestamps are for the article tokens (or whatever they are) being removed.  If they're way less than 10 days old, then my size is definitely too small.  If they're near the 10-day threshold and few in number, then my size would be good enough.
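(That age check could be a few lines bolted onto the pruning pass.  A sketch, assuming the second tab-separated field of a history line begins with the arrival time in Unix epoch seconds - the traditional layout, but worth verifying against the history man page:

```python
import time

def age_days(history_line, now=None):
    """Age in days of a history entry, assuming the second
    tab-separated field starts with the arrival time in epoch
    seconds (an assumption to check against the history format)."""
    now = time.time() if now is None else now
    times = history_line.split('\t')[1]
    arrival = int(times.split('~')[0])
    return (now - arrival) / 86400

def dropped_ages(lines, now=None):
    """Ages of the token-less entries a pruning pass would drop."""
    return [age_days(line, now) for line in lines if '@' not in line]
```

If most of the reported ages sit well under 10 days, the size limit is too small; if they cluster near 10, the limit and remember are roughly in agreement.)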

Anne
-- 
***************************************************
Anne Wilson			UCAR Unidata Program		
anne at unidata.ucar.edu		       P.O. Box 3000
              			  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://my.unidata.ucar.edu/
****************************************************


More information about the inn-workers mailing list