[bind10-dev] experimental mmap-based in-memory data

Wed Oct 17 07:10:36 UTC 2012

Over the last weekend I played with some Boost interprocess libraries
to see how easy or difficult it is to support mmap or shared-memory
based in-memory zone data.  It turned out to be surprisingly easy: I
was able to write a complete mmap backend and a workable memory
management daemon ("b10-memmgr") in about 1.5 days.  (Of course, a
full implementation will have non-trivial issues in the details.)

The implementation is available in the jinmei-shmem branch of the
public repository.  If you are interested, you can try it as
follows:

- build BIND 10 as usual.  b10-memmgr should also be built.
- start BIND 10
- enable the memmgr component:
> config add Boss/components b10-memmgr
> config set Boss/components/b10-memmgr/kind needed
> config set Boss/components/b10-memmgr/address Memmgr
> config commit

- enable the on-disk image for mmap.  Assuming you already have a
  setup that uses the old-style in-memory data source for class IN,
  the configuration would be:
> config set data_sources/classes/IN[0]/segment-type mmap
> config set data_sources/classes/IN[0]/mmap-file <some path>
> config commit

Then b10-memmgr will create an mmap image named "<some path>.0" and
tell b10-auth to map it into memory.  You can reload the zone data by
sending a "loadzone" command to b10-memmgr, just as you would to
b10-auth.  b10-memmgr will then create a new mmap image named
"<some path>.1" and tell b10-auth to map the new one into memory.
b10-auth will do so and then munmap the old version (this switch is
done "instantly" as expected, no matter how large the zone is; I
confirmed it using a >1GB image).

b10-xfrin now also tells b10-memmgr when it completes an incoming
zone transfer.  The same process as for "loadzone" then takes place
at b10-memmgr and b10-auth, and auth will serve the new version of
the zone from memory.
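
For readers who want to see the switch-over in miniature, here is a
rough sketch in Python.  It is not the BIND 10 code (which is C++ on
top of Boost); the names write_image, ImageReader and demo are made up
for illustration, and the image contents are placeholder bytes.  It
only shows the ordering: the new version is completely written and
mapped before the old mapping is released.

```python
import mmap
import os
import tempfile


def write_image(path, payload):
    """Writer side (the role b10-memmgr plays): build a complete image file."""
    with open(path, "wb") as f:
        f.write(payload)


class ImageReader:
    """Reader side (the role b10-auth plays): holds one read-only mapping."""

    def __init__(self):
        self._map = None

    def switch_to(self, path):
        # Map the new image first, so there is always a valid mapping.
        with open(path, "rb") as f:
            new_map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        old_map, self._map = self._map, new_map
        # Only now release the old version (the munmap step).
        if old_map is not None:
            old_map.close()

    def read(self):
        return self._map[:]


def demo():
    # Hypothetical base path; versions get ".0", ".1", ... appended.
    base = os.path.join(tempfile.mkdtemp(), "zone.img")
    reader = ImageReader()

    write_image(base + ".0", b"zone data v0")
    reader.switch_to(base + ".0")
    first = reader.read()

    # A "loadzone"-like reload: build version 1, then switch to it.
    write_image(base + ".1", b"zone data v1")
    reader.switch_to(base + ".1")
    second = reader.read()
    return first, second
```

In this sketch the reader only swaps one mapping for another, so the
cost of the switch itself does not grow with the size of the image.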

The things I've learned from these experiments are:

- Boost interprocess works pretty well.  I used the
  boost::interprocess::managed_mapped_file class, and it takes care
  of most of the difficult parts of the implementation, such as
  memory allocation/deallocation and extending the region (file) when
  the current space is not large enough.  The wrapper
  MemorySegmentMmap class is pretty small and quite straightforward.
- Our internal in-memory data structure is almost ready for an
  mmap/shared-memory version.  The offset pointers seem to work
  pretty well (I've not seen any strange crashes after remapping,
  etc.), and the MemorySegment abstraction hides the details of the
  underlying memory model quite nicely, so I didn't have to change
  much of the existing in-memory implementation.
- This experimental implementation is naive in some respects.  For
  example, on reload it makes a full copy (file) of the old version
  and applies the modifications to the copy.  But experimental
  results show the overhead isn't that big (as long as the files are
  cached in memory at the OS level, I guess).  For the large zone I
  used in the experiment (containing about 10 million RRs), it took
  about 2.5 minutes to create the initial data on our performance
  benchmark machine, and a full reload took just about 3 minutes.
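
To illustrate why offset-based links survive remapping (the second
point above), here is a minimal sketch in Python.  It is not how our
data structures or Boost's offset_ptr are implemented; the node
layout and the names NODE, HEADER, build_list and read_list are
assumptions made up for this example.  One simplification to note:
the links here are offsets from the start of the file, whereas
boost::interprocess::offset_ptr stores the distance from the pointer
object itself.  Either way, no absolute address is stored in the
image, so it does not matter where the file ends up being mapped.

```python
import mmap
import os
import struct
import tempfile

# Each node is (value, offset of the next node); offset 0 means "end".
NODE = struct.Struct("<qq")
# Reserve the first bytes so that offset 0 never refers to a real node.
HEADER = 8


def build_list(path, values):
    """Write a singly linked list into a file through a writable mapping."""
    size = HEADER + NODE.size * len(values)
    with open(path, "w+b") as f:
        f.truncate(size)
        m = mmap.mmap(f.fileno(), size)
        for i, value in enumerate(values):
            pos = HEADER + i * NODE.size
            nxt = pos + NODE.size if i + 1 < len(values) else 0
            NODE.pack_into(m, pos, value, nxt)
        m.flush()
        m.close()


def read_list(path):
    """Map the file again (possibly at another address) and walk the list."""
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    values = []
    pos = HEADER if len(m) > HEADER else 0
    while pos:
        value, pos = NODE.unpack_from(m, pos)
        values.append(value)
    m.close()
    return values


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "list.img")
    build_list(path, [10, 20, 30])
    print(read_list(path))
```

The reader never needs to know the address at which the writer had
the file mapped, which is the property that makes remapping safe.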

So, overall, I think it's not very difficult to implement a
"production ready" version of it.  We can begin with a naive version,
offered as an experimental complement to the current local-segment
model.  We can then improve it, e.g., so that we skip the initial
creation after a restart when we can safely assume the pre-existing
data are the latest.  That would give us nearly zero startup time,
which should be a feature unique to BIND 10.

---
JINMEI, Tatuya