Natterings about history files

Fabien Tassin fta at sofaraway.org
Sun Feb 11 20:36:14 UTC 2001


According to Bill Davidsen:
> > I've tried to evaluate the gain in having multiple history files but I'm
> > not convinced it will be that interesting. The problem is more in
> > the expire design than in history. We want it to undo very quickly what has
> > been done in a day. A discrete approach should give better results.
> > I see two ways to achieve that: a cyclic history (constant size)
> > or an expired. The latter is easier because it requires no change to the
> > model to bring pure CNFS performance to other spool methods. The
> > bad point is that it still needs to be periodically cleaned.
> 
> My view is that trying to eliminate the performance hit of expire is like
> trying to improve the code in a bubble sort. No matter how cleverly you
> code it, it is still not efficient.

I don't want to try to improve the code of expire but rather change the
way entries pass from the existing state to the remembered state, including
article removal. It can be done without changing a single line in expire.
If article removal can be done on the fly, the daily task will only have
to rewrite the history files, dropping remembered entries that are too old.
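
A minimal sketch of what that reduced daily task could look like, in Python
for brevity. This is not INN code and not the real history format: the Entry
record, its fields and the daily_rewrite name are all hypothetical, and it
assumes an entry whose storage token is None has already had its article
removed on the fly.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Entry:
    """Hypothetical history record (not INN's on-disk format)."""
    msgid: str
    arrived: float           # arrival time, seconds since the epoch
    token: Optional[str]     # storage token; None means "remembered" only


def daily_rewrite(entries: Iterable[Entry], now: float,
                  remember_secs: float) -> Iterator[Entry]:
    """Yield the entries that should survive the daily history rewrite."""
    cutoff = now - remember_secs
    for e in entries:
        if e.token is not None:
            yield e          # article still exists: always keep
        elif e.arrived >= cutoff:
            yield e          # remembered and still inside the remember window
        # otherwise: remembered and too old, so it is simply not copied
```

The pass never touches the spool; it only decides which lines to copy into
the new history, which is why it should be cheap.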

> Your comment about CNFS type performance is sort of on the track of what I
> was thinking, but how about an expire daemon which simply trims old info
> as it reaches /remember/ time? Going to history per day would make that
> quite easy, going to a database (and based on how well db hasn't worked
> for many people, some *other* database) setup would allow deletion on the
> fly.

Well, perhaps expired is a confusing name. It is not supposed to rewrite
the history files but just walk through them and expire the articles that
are due (probably using hooks to actually remove these articles).
The daily rewrite can be done by expired too, but that part can only be
periodic, as it is now.
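
To make the idea concrete, here is a rough sketch of one pass of such a
daemon, again in Python and again not INN code. The Entry record, the
expires field and the remove_article hook are assumptions made for the
example; a real implementation would work on the history files and go
through the storage API of each spool method.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Entry:
    """Hypothetical history record (not INN's on-disk format)."""
    msgid: str
    expires: float           # time at which the article should be removed
    token: Optional[str]     # storage token; None means "remembered" only


def expire_pass(entries: List[Entry], now: float,
                remove_article: Callable[[str], None]) -> int:
    """Walk the entries once and expire the articles that are due.

    Nothing is deleted from the list: a due entry just moves to the
    remembered state. Returns the number of articles expired.
    """
    expired = 0
    for e in entries:
        if e.token is not None and e.expires <= now:
            remove_article(e.token)   # hook does the actual removal
            e.token = None            # entry is now remembered only
            expired += 1
    return expired
```

The daemon would call something like this repeatedly, so that removals are
spread over the day and the history is never rewritten by this pass.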

> The argument for this is that the speed of the server stays more nearly
> constant, and if it can keep up it will. The against is that you can
> probably be more efficient doing an expire in batch. But we are talking a
> small number of articles per day compared to systems which do an order of
> magnitude more and keep up.

?

> Certainly a steady state would be desirable, I'm working Sunday because of
> an expire problem, maybe I'm biased. But expire seems to have more
> problems than anything else since CNFS came on board, and low time cost is
> really important to keeping INN viable.

I have no problem with expire used for a pure CNFS spool. The idea behind
an expire daemon is to make daily expire for other spool methods as fast
as it is for CNFS (a few minutes instead of several hours).

-- 
Fabien Tassin -+- fta at sofaraway.org


More information about the inn-workers mailing list