optimizing overview expiry (or: overview without expiry)

Harald Welte laforge at gnumonks.org
Sun Feb 3 10:53:18 UTC 2002


Hi All!

I'm running a medium-sized news server with about 200GB of storage and
24GB of overview (inn-2.3.2, buffindexed, cnfs).

The expireover process causes extremely high load on the box and takes
28 hours (PIII-400, 512MB RAM, Linux 2.2).

The expireover process is very 'stupid':  it iterates over all groups,
and over all articles in every group, to see whether each article is
still there.
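
To make the cost concrete, here is a minimal sketch of that full scan in C.
It is a toy model only: toy_spool, toy_group and toy_expire_full_scan() are
hypothetical names, not INN's real data structures or functions.  The point
is that the number of existence checks equals the total number of overview
entries, no matter how few articles have actually gone away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ARTICLES 64   /* arbitrary size for the toy spool */

/* Toy spool: live[t] says whether the article with token t still exists. */
typedef struct {
    bool live[TOY_MAX_ARTICLES];
} toy_spool;

/* Toy overview for one group: one token per overview entry. */
typedef struct {
    unsigned tokens[TOY_MAX_ARTICLES];
    size_t   count;
} toy_group;

/*
 * Full scan in the style described above: visit every entry of every
 * group and ask the spool whether the article is still there.  Dead
 * entries are dropped by compacting the array in place.
 * Returns the number of existence checks performed.
 */
size_t toy_expire_full_scan(toy_group *groups, size_t ngroups,
                            const toy_spool *spool)
{
    size_t checks = 0;

    for (size_t g = 0; g < ngroups; g++) {
        size_t kept = 0;

        for (size_t i = 0; i < groups[g].count; i++) {
            unsigned t = groups[g].tokens[i];

            checks++;
            if (t < TOY_MAX_ARTICLES && spool->live[t])
                groups[g].tokens[kept++] = t;   /* keep live entry */
        }
        groups[g].count = kept;
    }
    return checks;
}
```

With five overview entries spread over two groups, the function performs
five checks even if only two articles have disappeared.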

There are a couple of potential approaches to reduce the overhead of expireover:

a) Have the cnfs code write a list of storage tokens for the articles it
   overwrites.  This is possible without major modifications to the current
   code.  The remaining problem is that the overview doesn't have an index
   for storage token -> overview block lookup, since it's usually needed the
   other way around.  Without this index there would be no big improvement
   over the current situation.

b) Include a pointer to the overview information ('ovtoken') in
   CNFSARTHEADER.  This way, at the time cnfs overwrites an article,
   it could directly call into the overview code and tell it to remove
   the information about this article.
  
   In more detail, the integration within innd would be:
	- add a new SMupdateovptr() function to the storage method

	- Flow of code as follows:
		- innd.c:ARTpost()
			- call ARTstore()
				- call SMstore()
					- store article
			- call OVadd()
				- add overview for article
				- return value: ovtoken
			- call SMupdateovptr(arth.token, ovtoken)
				- add ovtoken to article
				

   Inside CNFSstore() we check CNFSUsedBlock() for every to-be-overwritten
   block.  If CNFSUsedBlock() returns TRUE, we look up the ovtoken member of
   CNFSARTHEADER and call into the overview code, which removes the overview
   info for this article.
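
The flow of approach b) can be sketched with a toy cyclic buffer in C.
Everything here is an assumption made for illustration: the toy_* types and
functions are hypothetical stand-ins (toy_store for SMstore/CNFSstore,
toy_ov_add for an OVadd that returns an ovtoken, toy_update_ovptr for the
proposed SMupdateovptr), and an ovtoken is modelled as a plain array index.
None of this is INN's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS  2    /* tiny cyclic buffer so wrap-around happens fast */
#define TOY_OV_MAX 16   /* capacity of the toy overview table */

/* Toy overview: an ovtoken is simply an index into these arrays. */
typedef struct {
    bool     valid[TOY_OV_MAX];
    unsigned article[TOY_OV_MAX];
    size_t   next;
} toy_overview;

/* Toy article header carrying the proposed back-pointer. */
typedef struct {
    bool     used;
    bool     has_ovtoken;
    size_t   ovtoken;
    unsigned article;
} toy_art_header;

typedef struct {
    toy_art_header slot[TOY_SLOTS];
    size_t         next;
} toy_cycbuff;

/* Add overview data for an article and return its ovtoken. */
size_t toy_ov_add(toy_overview *ov, unsigned article)
{
    assert(ov->next < TOY_OV_MAX);   /* toy model: no reuse of entries */
    size_t t = ov->next++;

    ov->valid[t] = true;
    ov->article[t] = article;
    return t;
}

/*
 * Store an article in the next slot.  If that slot holds an older
 * article with a back-pointer, remove its overview entry directly
 * instead of waiting for a later full scan.
 */
size_t toy_store(toy_cycbuff *cb, toy_overview *ov, unsigned article)
{
    size_t s = cb->next;
    toy_art_header *h = &cb->slot[s];

    if (h->used && h->has_ovtoken)
        ov->valid[h->ovtoken] = false;   /* overview of the victim */

    h->used = true;
    h->has_ovtoken = false;
    h->article = article;
    cb->next = (s + 1) % TOY_SLOTS;
    return s;
}

/* Record the ovtoken in the stored article's header. */
void toy_update_ovptr(toy_cycbuff *cb, size_t slot, size_t ovtoken)
{
    cb->slot[slot].ovtoken = ovtoken;
    cb->slot[slot].has_ovtoken = true;
}

/* Same order as the flow above: store, add overview, update pointer. */
size_t toy_post(toy_cycbuff *cb, toy_overview *ov, unsigned article)
{
    size_t slot = toy_store(cb, ov, article);
    size_t ovtoken = toy_ov_add(ov, article);

    toy_update_ovptr(cb, slot, ovtoken);
    return ovtoken;
}
```

With two slots, posting a third article wraps around and overwrites the
first one, and that article's overview entry is invalidated in the same
step.  The work done is proportional to the number of overwritten articles
rather than to the size of the whole overview.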


I'd like to receive your comments on this.  Has this been proposed before?
What do you expect with regard to performance?  Would people be interested
if I implemented this?

Thanks.


-- 
Live long and prosper
- Harald Welte / laforge at gnumonks.org               http://www.gnumonks.org/
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M- 
V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)

