Optimization for the expireover procedure.

Kirill Berezin kyb at online.ru
Wed Oct 17 08:03:41 UTC 2007


Russ Allbery wrote:
> Kirill Berezin <kyb at online.ru> writes:
>
>   
>> At the moment, I am sure not. The current implementation of the overview
>> database is optimized for user access only; that is, the key for any
>> search operation is a group number and an article number. But for
>> expiry, the main keys are the arrival time and the group number. In the
>> current implementation we have to retrieve every overview record and use
>> the arrival time stored in it. For example, there are about 15 million
>> overview records on our server and about 2 million expired articles, so
>> we have to analyze an extra 13 million records before removal. This is
>> not good for me.
>>     
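To illustrate the cost described above, here is a minimal Python sketch, not INN code. It assumes a hypothetical in-memory overview keyed by (group, article number), with the arrival time stored in each record, and counts how many records must be examined to find the expired ones.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an overview database: the key is
# (group, article number) and the value is the arrival time in seconds.
Overview = Dict[Tuple[str, int], int]


def expire_by_full_scan(
    overview: Overview, cutoff: int
) -> Tuple[List[Tuple[str, int]], int]:
    """Return (keys of expired records, number of records examined)."""
    expired = []
    examined = 0
    for key, arrived in overview.items():
        examined += 1  # every record is read just to see its arrival time
        if arrived < cutoff:
            expired.append(key)
    return expired, examined


# 15 records, of which only the 2 oldest arrived before the cutoff.
overview = {("misc.test", n): 1000 + n for n in range(1, 16)}
expired, examined = expire_by_full_scan(overview, cutoff=1003)
print(len(expired), examined)  # 2 15
```

Scaled up, this is the same ratio as in the example: 2 million removals require reading all 15 million records.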
>
> The overview database is heavily optimized for reads.  Looking at lots of
> articles is fast.
>
> What makes it slow is rebasing the overview files for tradindexed and
> thereby rewriting the .DAT files, not the scanning of all the articles.
> This is an inherent performance drawback to the very simple data structure
> that they use.  I'm fairly sure that there are better data structures that
> would be faster, but tradindexed is also very robust, and I wouldn't want
> to lose that.
>   
The main idea is to keep the current structure of the overview untouched.
Strictly speaking, I am trying to optimize disk access operations by
using shorter data structures for expireover.
>> My proposal is to have separate storage for the expiry procedure. The
>> structure of this storage must be optimized to perform expiration as
>> fast as possible. For example, we can use a list of pointers to overview
>> records sorted by arrival date or even expiration date (this is a
>> little bit tricky). I believe this will be much faster.
>>     
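The proposal above could look roughly like the following Python sketch. The `ExpiryIndex` class and its methods are hypothetical names chosen for illustration and are not part of INN. It keeps short (arrival time, group, article number) entries sorted by arrival time, so finding everything older than a cutoff is a binary search rather than a scan of every overview record. It covers only the lookup side and says nothing about the cost of updating the overview files themselves.

```python
import bisect
from typing import List, Tuple

# Hypothetical expiry index entry: (arrival time, group, article number).
# The (group, article number) pair acts as the pointer to the overview record.
Entry = Tuple[int, str, int]


class ExpiryIndex:
    """Entries kept sorted by arrival time (assumed layout, not INN's)."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def add(self, arrived: int, group: str, artnum: int) -> None:
        # insort keeps the list ordered by arrival time first.
        bisect.insort(self._entries, (arrived, group, artnum))

    def pop_expired(self, cutoff: int) -> List[Entry]:
        # (cutoff,) sorts before any entry whose arrival time is >= cutoff,
        # so split is the position of the first entry that is not expired.
        split = bisect.bisect_left(self._entries, (cutoff,))
        expired = self._entries[:split]
        del self._entries[:split]
        return expired

    def __len__(self) -> int:
        return len(self._entries)


index = ExpiryIndex()
for n in range(1, 16):
    index.add(1000 + n, "misc.test", n)

expired_entries = index.pop_expired(cutoff=1003)
print(len(expired_entries), len(index))  # 2 13
```

Sorting by expiration date instead of arrival date would need a per-group retention period folded into the sort key, which is the tricky part mentioned above; the sketch uses a single cutoff for simplicity.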
>
> This won't make it any faster if you still have to rebase the tradindexed
> data structures.
>
> If you're using some other overview method, I can't really comment, since
> I don't know much about their internals.
>   

