Scaling

Russell Vincent russellv at uk.uu.net
Fri Dec 8 10:04:13 UTC 2000


> The point is not how sites are providing service today.  The point
> is how full sites will continue to do a good job in the long term.
> Full feed growth is outstripping Moore's law, and that is a problem.

"Imminent death of Usenet predicted. Film at 11."  :-)

The above comment has been mentioned almost every year since I started
administering Usenet servers around 12 years ago. It still holds
true today.

> Suppose we take Usenet doubling every 6 months to be true, as Joe
> St. Sauver estimated, that means Usenet is outstripping Moore's law,
> which is the challenge I hoped would be discussed.

It has been "doing that" for a decade, but we are all still here
and in the same boat we were then: struggling to keep Usenet on
the hardware we have. We are still managing.

Getting serious for a moment, your main concern seems to be about
storage. I say that storage is not a problem. Many of us are in
the multi-terabyte range for news spools already and have been
for some time. We are keeping up just fine with feeds.

As always, it boils down to cost: how much are you willing to spend
on hardware? Joe Greco is deploying 6 or 7TB spools on commodity
(PC and similar "low" cost) hardware and I am deploying high-end 2TB
hardware RAID5 systems for distributed news spools. Despite the RAID5,
I am finding the systems quite capable of accepting a full newsfeed and
serving it to thousands (and probably tens of thousands when I get going)
of simultaneous clients. Joe is doing something similar, but at the lower
cost end. Granted, it isn't cheap, but then Usenet hardware never has
been cheap. I would even stick my neck out and say that we are
not paying much more for the hardware now than we did at each of the
upgrades our servers have needed every couple of years.

All my newsfeeding (farm of around 30 servers, with each server handling
20-30 average feeds - some more) is happening off 100GB RAID5 systems
and disk is not the bottleneck. Slow Sun CPUs are. Even memory
requirements are not significantly higher than they have been in the
past. I have just set up a feeder taking in 200GB per day and serving out
700GB per day. It could do more (though not much!), and this is on the
same hardware we have been using for over a year.
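
To put those daily volumes in terms of sustained bandwidth, here is a
rough sketch of the arithmetic. It assumes decimal gigabytes (10^9 bytes)
and a perfectly even spread over the day, neither of which is stated
above; real traffic is burstier, so peaks will be higher. The function
name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_mbit_per_s(gb_per_day, bytes_per_gb=10**9):
    """Average rate in Mbit/s for a given daily volume.

    bytes_per_gb defaults to decimal gigabytes, which is an assumption.
    """
    bits_per_day = gb_per_day * bytes_per_gb * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    # Daily figures quoted for the feeder described above.
    for label, gb in (("inbound", 200), ("outbound", 700)):
        rate = average_mbit_per_s(gb)
        print(f"{label}: {gb} GB/day is about {rate:.1f} Mbit/s sustained")
```

Under those assumptions the feeder averages a little under 20 Mbit/s in
and roughly 65 Mbit/s out.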

I am mainly using Diablo for software (and being the main developer
at the moment, I know it has a lot more life in it), but I do still
have an old INN 2.2 trad spool handling a full feed. Not very well,
but it mostly keeps up, which is impressive considering it has been
running solidly for over a year now (up 395 days). INN CNFS should
easily be able to keep up on similar hardware to what I am using for
Diablo (Sun Ultra60 with 512MB RAM and 100GB-2TB spools). PC hardware
is capable of a lot more, in my experience.

An interesting datapoint I have just come across is that a full feed
these days is in excess of 200GB, of which less than 1GB is non-binary
(after stripping every binary article based on its type, not its newsgroup).
That means you could still run a full text-only feed on hardware
you had over 5 years ago - try that with most other Internet services.
It also means that you can keep text articles for a long time on
relatively cheap storage.
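
As a back-of-the-envelope check on the retention point, the sketch below
treats 1GB per day as an upper bound on text and 200GB per day as a
lower bound on the full feed, and asks how long a spool of a given size
lasts. The 100GB spool size is just the smallest one mentioned in this
message, used for illustration; overheads such as history files and
filesystem slack are ignored.

```python
def text_fraction(text_gb_per_day, full_gb_per_day):
    """Share of the daily feed volume that is text."""
    return text_gb_per_day / full_gb_per_day


def retention_days(spool_gb, gb_per_day):
    """Days of articles a spool can hold at a constant daily volume."""
    return spool_gb / gb_per_day


if __name__ == "__main__":
    text_gb, full_gb, spool_gb = 1, 200, 100  # illustrative inputs
    print(f"text share of volume: at most {text_fraction(text_gb, full_gb):.1%}")
    print(f"text-only retention:  at least {retention_days(spool_gb, text_gb):.0f} days")
    print(f"full-feed retention:  at most {retention_days(spool_gb, full_gb):.1f} days")
```

So the same disk that holds about half a day of full feed holds over
three months of text.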

References:

Date          Articles      Volume
19990226       822411       28395MB
19991207       926306       76359MB
20001206      1206628      201770MB
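
For anyone who wants to compare these figures against the "doubling
every 6 months" estimate, here is a small sketch that derives a doubling
time from each pair of consecutive samples. It assumes smooth
exponential growth between samples, which three single-day snapshots can
only roughly support, so treat the results as indicative.

```python
from datetime import date
from math import log2

# (date, articles, volume in MB), copied from the table above.
SAMPLES = [
    (date(1999, 2, 26), 822411, 28395),
    (date(1999, 12, 7), 926306, 76359),
    (date(2000, 12, 6), 1206628, 201770),
]


def doubling_time_days(d0, v0, d1, v1):
    """Days needed to double, assuming exponential growth from v0 to v1."""
    elapsed = (d1 - d0).days
    return elapsed / log2(v1 / v0)


if __name__ == "__main__":
    for (d0, a0, v0), (d1, a1, v1) in zip(SAMPLES, SAMPLES[1:]):
        vol = doubling_time_days(d0, v0, d1, v1)
        art = doubling_time_days(d0, a0, d1, a1)
        print(f"{d0} to {d1}: volume doubles in ~{vol:.0f} days, "
              f"article count in ~{art:.0f} days")
```

On these samples the volume doubles roughly every six and a half to
eight and a half months, while the article count grows far more slowly,
so most of the growth is in article size rather than article count.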

A very nice ASCII picture of Joe Greco's news farm, given as
part of a presentation at SANE2000:

  http://www.nntp.sol.net/sane2000/slides/30.html

or another Diablo news farm (with storage sizes) by Cor Bosman at:

  http://www.xs4all.nl/~scorpio/news.jpg

Hope that helps.

 -Russell




More information about the inn-workers mailing list