Benchmark of overview storage methods

Julien ÉLIE julien at trigofacile.com
Sun Jan 17 10:34:34 UTC 2021


Hi all,

Following a recent question in news.software.nntp about buffindexed 
buffer size, I've just done a little benchmark of our 4 storage methods.

I'll put some of these results in our documentation (FAQ maybe) in the 
hope that they will help our users choose an appropriate size and 
overview storage method.

In 2020, the volume for a full-text Usenet feed is about 18,000 articles 
/ day, with peaks of up to 1,200 articles / hour.
Article storage size is about 65 MB / day.
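
As a quick sanity check on these figures, here is a small Python sketch 
that derives the average article size and compares the average hourly 
rate with the quoted peak.  It assumes 1 MB = 10^6 bytes, which the 
statistics pages do not specify; the function name is only illustrative.

```python
# Figures quoted above for a full-text Usenet feed in 2020.
ARTICLES_PER_DAY = 18_000
PEAK_ARTICLES_PER_HOUR = 1_200
STORAGE_BYTES_PER_DAY = 65 * 10**6  # assumption: 1 MB = 10**6 bytes


def feed_summary():
    """Return (average article size in bytes, average articles per hour,
    ratio of the peak hourly rate to the average hourly rate)."""
    avg_article_bytes = STORAGE_BYTES_PER_DAY / ARTICLES_PER_DAY
    avg_per_hour = ARTICLES_PER_DAY / 24
    peak_factor = PEAK_ARTICLES_PER_HOUR / avg_per_hour
    return avg_article_bytes, avg_per_hour, peak_factor


if __name__ == "__main__":
    size, rate, peak = feed_summary()
    print(f"average article size: {size:.0f} bytes")
    print(f"average rate: {rate:.0f} articles / hour")
    print(f"peak is {peak:.1f} times the average rate")
```

Under that assumption, an article weighs about 3.6 KB on average, and 
the peak rate is 1.6 times the average of 750 articles / hour.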

Source:
   https://www.eternal-september.org/stats/
   https://news.aioe.org/stats/innreport-reports/


As for overview storage size, if you store more header fields in 
overview data than the standard ones, the space needed to store overview 
data will be larger than the figures that follow.  These extra fields 
are configured with the extraoverviewadvertised and extraoverviewhidden 
inn.conf parameters.


Benchmark with:
  - 3,278,095 overview records stored (with standard overview fields)
  - corresponding to about 2,852,200 news articles (each newsgroup an 
article is crossposted to counts as one overview record)
  - for 620 newsgroups (I do not carry a full-text Usenet feed)
  - on a slow Intel Atom CPU N2800 1.86GHz with 2GB RAM

With better hardware, overview rebuild and expiration will of course be 
faster.


buffindexed
-----------

% inndf -no
3278095 overview records stored
58.31% overview space used

I set up 2 buffers of 1.57 GB, so the fixed used space on disk is 3.15 GB.

overview rebuild in 26 min
expireover in 14:30 min
used space on disk is fixed, and equal to the space allocated to buffers 
(3.15 GB in this case, though only 1.83 GB is really used)
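
To show how the inndf percentage relates to the buffer sizes, here is a 
rough Python sketch.  It assumes 1 GB = 10^9 bytes and that space usage 
grows linearly with the number of records; both are simplifications of 
mine, and the function name is only illustrative.

```python
RECORDS = 3_278_095      # reported by "inndf -no"
USED_FRACTION = 0.5831   # 58.31% overview space used
BUFFER_GB = 1.57         # size of each buffer
BUFFER_COUNT = 2


def buffindexed_usage(records=RECORDS, used_fraction=USED_FRACTION,
                      buffer_gb=BUFFER_GB, buffer_count=BUFFER_COUNT):
    """Return (allocated GB, used GB, bytes per record, estimated capacity)."""
    allocated_gb = buffer_gb * buffer_count
    used_gb = allocated_gb * used_fraction
    # Assumption: 1 GB = 10**9 bytes.
    bytes_per_record = used_gb * 10**9 / records
    # Assumption: linear extrapolation of the current fill level.
    capacity_records = records / used_fraction
    return allocated_gb, used_gb, bytes_per_record, capacity_records


if __name__ == "__main__":
    allocated, used, per_record, capacity = buffindexed_usage()
    print(f"allocated: {allocated:.2f} GB, used: {used:.2f} GB")
    print(f"about {per_record:.0f} bytes per overview record")
    print(f"room for roughly {capacity:,.0f} records at this density")
```

With these inputs, the used space comes out at about 1.83 GB, which 
matches the figure above.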

In a previous test with 5 buffers of 1.57 GB, for a total space of 7.86 
GB, overview rebuild took 32 min and expireover 41 min.  The advice is 
therefore not to allocate too much spare space for buffindexed, and to 
add new buffers when needed.
*** An interesting piece of information to add to our documentation!
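
To put numbers on the cost of the extra buffers, this short Python 
sketch compares the two buffindexed runs using only the timings quoted 
above.  The function and variable names are illustrative.

```python
# (allocated GB, rebuild minutes, expireover minutes) for each run
TWO_BUFFERS = (3.15, 26.0, 14.5)
FIVE_BUFFERS = (7.86, 32.0, 41.0)


def slowdown(small=TWO_BUFFERS, large=FIVE_BUFFERS):
    """Return the large/small ratios for space, rebuild time and expireover time."""
    space_ratio = large[0] / small[0]
    rebuild_ratio = large[1] / small[1]
    expire_ratio = large[2] / small[2]
    return space_ratio, rebuild_ratio, expire_ratio


if __name__ == "__main__":
    space, rebuild, expire = slowdown()
    print(f"allocated space: x{space:.2f}")
    print(f"overview rebuild: x{rebuild:.2f}")
    print(f"expireover: x{expire:.2f}")
```

For the same overview data, expireover slows down by a factor of about 
2.8 while rebuild is only about 1.2 times slower.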



tradindexed
-----------

overview rebuild in 17 min
expireover in 5:00 min
used space on disk is 2.01 GB after rebuild, and 1.77 GB after first run 
of expireover

*** Is tradindexed really that slow to write after all?  It seems to 
perform pretty well!  Full-text traffic is not high nowadays.



ovdb
----

overview rebuild in 38 min
expireover in 4:00 min
used space on disk is 3.19 GB



compressed ovdb
---------------

overview rebuild in 40 min
expireover in 3:55 min
used space on disk is 2.61 GB (overview data < 600 bytes are kept 
uncompressed)



ovsqlite (new storage method in INN 2.7.0)
--------

overview rebuild in 34 min
expireover in 6:30 min
used space on disk is 2.67 GB



compressed ovsqlite (new storage method in INN 2.7.0)
-------------------

overview rebuild in 36 min
expireover in 5:20 min
used space on disk is 1.12 GB (all overview data is compressed)



Each overview storage method has its strengths and drawbacks.
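
To compare the methods side by side, here is a Python sketch that turns 
the timings and sizes above into a rebuild rate and a per-record disk 
footprint.  It assumes 1 GB = 10^9 bytes, uses the allocated buffer 
space for buffindexed and the size after the first expireover run for 
tradindexed; the names are illustrative.

```python
RECORDS = 3_278_095  # overview records in the benchmark

# (method, rebuild minutes, expireover minutes, disk usage in GB)
RESULTS = [
    ("buffindexed", 26.0, 14.5, 3.15),            # allocated buffer space
    ("tradindexed", 17.0, 5.0, 1.77),             # after first expireover
    ("ovdb", 38.0, 4.0, 3.19),
    ("compressed ovdb", 40.0, 3 + 55 / 60, 2.61),
    ("ovsqlite", 34.0, 6.5, 2.67),
    ("compressed ovsqlite", 36.0, 5 + 20 / 60, 1.12),
]


def summarize(results=RESULTS, records=RECORDS):
    """Return one dict per method with derived per-record metrics."""
    rows = []
    for name, rebuild_min, expire_min, disk_gb in results:
        rows.append({
            "method": name,
            "rebuild_rec_per_s": records / (rebuild_min * 60),
            "expire_min": expire_min,
            # Assumption: 1 GB = 10**9 bytes.
            "bytes_per_record": disk_gb * 10**9 / records,
        })
    return rows


if __name__ == "__main__":
    print(f"{'method':<20} {'rebuild rec/s':>13} {'expire min':>10} {'bytes/rec':>9}")
    for row in summarize():
        print(f"{row['method']:<20} {row['rebuild_rec_per_s']:>13.0f} "
              f"{row['expire_min']:>10.1f} {row['bytes_per_record']:>9.0f}")
```

On this hardware, tradindexed has the fastest rebuild and compressed 
ovsqlite the smallest footprint.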

-- 
Julien ÉLIE

« Il n'y a pas moyen de contenter ceux qui veulent savoir le pourquoi
   des pourquoi. » (Leibniz)

