[PATCH 0/2] CNFSv4: move to 4K blocksize

"Miquel van Smoorenburg" list-inn-workers at news.cistron.nl
Wed Nov 26 09:57:09 UTC 2008


Currently the CNFS storage method uses a 512 byte (1 sector)
granularity for its "filesystem". That was great in the nineties,
but nowadays that is very limiting:

- most filesystems use 4K blocks, so a write to a 512 byte
  CNFS block can result in a read-modify-write cycle, slowing
  down writes enormously (effectively making them synchronous)
- with larger devices, the block bitmap at the start balloons in size
- the size limit of a CNFS file/partition is 2^31 * 512 bytes = 1 TB,
  because the block offset is stored in the CNFS token as a signed integer
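To put numbers on the last two points, here is a minimal sketch in C. The function names are made up for illustration, and the bitmap calculation assumes one bit per block, which the text above does not spell out.

```c
#include <stdint.h>

/* Largest addressable size in bytes when the block offset has
 * offset_bits usable bits (must be < 64) and each block is
 * blocksize bytes. */
static uint64_t cnfs_max_bytes(unsigned offset_bits, uint64_t blocksize)
{
    return ((uint64_t)1 << offset_bits) * blocksize;
}

/* Size of the block bitmap in bytes, ASSUMING one bit per block
 * (an assumption of this sketch), rounded up to a whole byte. */
static uint64_t cnfs_bitmap_bytes(uint64_t device_bytes, uint64_t blocksize)
{
    uint64_t blocks = device_bytes / blocksize;

    return (blocks + 7) / 8;
}
```

With a signed 32-bit offset there are 31 usable bits, so cnfs_max_bytes(31, 512) gives 2^40 bytes, the 1 TB limit. Under the one-bit-per-block assumption, a 1 TB device needs a 256 MB bitmap with 512-byte blocks and a 32 MB bitmap with 4K blocks.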

So I have updated storage/cnfs/ to use 4K blocks.

This introduces a new CNFS version in the CNFS header, version 4.
The header now includes a blocksize member, which is 4K by default.
The block offset is now encoded in the CNFS token as an unsigned int.
CNFSv4 supports files/partitions up to 16 TB with a 4K blocksize.
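As a rough illustration of where the 16 TB figure comes from (2^32 blocks * 4096 bytes = 2^44 bytes), the sketch below converts a byte offset to an unsigned 32-bit block number and serializes it. The helper names and the big-endian byte order are assumptions of this sketch, not taken from the patch.

```c
#include <stdint.h>

/* Convert a byte offset into a block number that fits an unsigned
 * 32-bit field. Returns 0 on success, -1 if the offset is not
 * block-aligned or the block number does not fit in 32 bits. */
static int offset_to_block(uint64_t byte_offset, uint32_t blocksize,
                           uint32_t *block)
{
    uint64_t b;

    if (blocksize == 0 || byte_offset % blocksize != 0)
        return -1;
    b = byte_offset / blocksize;
    if (b > UINT32_MAX)
        return -1;
    *block = (uint32_t)b;
    return 0;
}

/* Store a 32-bit value in 4 bytes, most significant byte first
 * (byte order is an assumption of this sketch). */
static void put_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Inverse of put_be32(). */
static uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

A byte offset of 8 TB with 4K blocks is block number 0x80000000. That has the top bit set, so it would be negative as a signed 32-bit value but is fine as an unsigned one. An offset of exactly 16 TB is block number 2^32, which no longer fits and is rejected.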

If we want to support > 16 TB with 4K blocks, that is doable by
stealing a few bits from the 'cycnum' value in the CNFS token.
I've updated the code so that for CNFSv4 and up the cycle number
wraps at 2^24 instead of 2^32 (with one wrap per day, that's good
for about 45000 years, so I see no problems there). That frees up
8 bits for the block offset, but I haven't written the rest of the
code yet.
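One way the spare bits could be used is sketched below. This is purely hypothetical: the packing layout, the names, and the choice to restart the counter at 0 are assumptions of the sketch, not the patch. It keeps the cycle number in the low 24 bits of the 32-bit field and puts the top 8 bits of a 40-bit block offset in the high byte.

```c
#include <stdint.h>

#define CYCNUM_MASK_V4 ((UINT32_C(1) << 24) - 1)

/* Advance the cycle number. For version 4 and up it wraps at 2^24;
 * older versions wrap at 2^32 through ordinary unsigned overflow.
 * Restarting at 0 is a simplification made by this sketch. */
static uint32_t next_cycnum(uint32_t cycnum, int version)
{
    uint32_t next = cycnum + 1;

    if (version >= 4)
        next &= CYCNUM_MASK_V4;
    return next;
}

/* HYPOTHETICAL layout: high byte = bits 32..39 of the block number,
 * low 24 bits = cycle number. */
static uint32_t pack_cycfield(uint32_t cycnum, uint64_t block)
{
    uint32_t hi = (uint32_t)((block >> 32) & 0xFF);

    return (hi << 24) | (cycnum & CYCNUM_MASK_V4);
}

/* Rebuild the 40-bit block number from the packed field and the
 * existing 32-bit offset field. */
static uint64_t unpack_block(uint32_t cycfield, uint32_t block_lo)
{
    return ((uint64_t)(cycfield >> 24) << 32) | block_lo;
}

/* Extract the 24-bit cycle number from the packed field. */
static uint32_t unpack_cycnum(uint32_t cycfield)
{
    return cycfield & CYCNUM_MASK_V4;
}
```

With 40 bits of block offset and 4K blocks, the limit would be 2^40 * 4096 = 2^52 bytes, or 4 PB.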

The code works fine with existing CNFSv3 files/partitions. I have
2 news servers running with old CNFS partitions (512-byte blocks)
and 2 servers with the new CNFS (4K blocks) on 6 TB arrays.

cnfsstat and cnfsheadconf have also been updated to understand CNFSv4.

Right now a new CNFS file/device is always initialized with a
4K blocksize, but it would be trivial to make that configurable.
With larger blocksizes we might want to look at the CNFS write
padding, though: I don't think it is useful to pad CNFS writes
to blocks larger than 4K. It doesn't do any harm, though.
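For illustration, padding a write comes down to rounding its length up to a multiple of some unit. The sketch below shows that rounding, plus one hypothetical reading of the suggestion above, where the padding unit is capped at 4K even if the blocksize is larger. Both function names and the cap are assumptions of this sketch, not existing CNFS code.

```c
#include <stddef.h>

/* Round len up to the next multiple of unit (unit must be > 0). */
static size_t round_up(size_t len, size_t unit)
{
    return (len + unit - 1) / unit * unit;
}

/* HYPOTHETICAL policy: pad writes to the blocksize, but never to
 * more than 4K. */
static size_t write_pad_unit(size_t blocksize)
{
    return blocksize < 4096 ? blocksize : 4096;
}
```

For example, a 3000-byte article would be padded to 4096 bytes with a 4K unit, but to 65536 bytes if writes were padded to a 64K blocksize.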

Mike.
-- 
The From: and Reply-To: addresses are internal news2mail gateway addresses.
Reply to the list or to "Miquel van Smoorenburg" <miquels at cistron.nl>


