[bind10-dev] Simplification of the memory manager

Thu Aug 22 08:27:42 UTC 2013

Hello

As discussed yesterday on the call, the memory manager feels over-complicated
(at least to me). I was thinking about some ways to simplify it. This may not be
a complete list of what would be possible, though.

Please, discuss.

Threads
=======

Currently, the memory manager has two threads. One that does long-term work and
one that controls things. However, splitting the logic across threads makes
things more complex.

It would be easier to just do everything in a single thread, with sequential
programming. However, if we did that, a long-running job would block
communication with the rest of the world. That would not be much of a problem
in itself (we'll mostly be sending notifications to the daemon, the shutdown
command being the exception), but msgq could decide that the socket is full and
close it.

So I propose we still have two threads, but one of them would only read things
from the msgq socket and queue them for the other. The other would do all the
rest of the work (i.e. the communication would be one way only and there would
be no sending of state updates back and forth).
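
To make the split concrete, here is a minimal sketch of the two threads. It is
only an illustration: the `incoming` iterable stands in for the msgq socket, and
the names `reader`, `worker`, `run` and `SHUTDOWN` are made up for this example,
they are not existing memory manager code.

```python
import queue
import threading

SHUTDOWN = "shutdown"  # hypothetical command name, for illustration only


def reader(incoming, commands):
    """Only drain the (stand-in) msgq socket and queue every command."""
    for command in incoming:
        commands.put(command)


def worker(commands, handled):
    """Do all the rest of the work, one command at a time, in order."""
    while True:
        command = commands.get()
        if command == SHUTDOWN:
            break
        handled.append(command)  # the real (possibly long) job would run here


def run(incoming):
    """Start both threads; `incoming` must end with SHUTDOWN to terminate."""
    commands = queue.Queue()
    handled = []
    threads = [
        threading.Thread(target=reader, args=(incoming, commands)),
        threading.Thread(target=worker, args=(commands, handled)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled
```

The queue is the only thing the threads share and data flows through it in one
direction, so the worker can stay plain sequential code.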

One state for all the data sources
==================================

Currently we have a state instance for each data source. There's a dictionary
mapping (class, name) -> state automaton. Each of them has a separate state and
a separate set of readers. However, the readers are the same for all of
them, while the state may differ.

Let's synchronize the state too and have only one copy of the state and the
readers. We do some updates (potentially on all of them) and then send updates
on all the changed ones. After all the updates are applied, start with more
updates. Effectively, change the whole complex ping-pong logic into this:

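  while True:
    command = read_next_command()
    if command.is_update():
      # Apply locally and remember that the readers are now out of date.
      apply_update(command.datastore)
      mark_dirty(command.datastore)
    else:
      # Flush all the pending changes before handling anything else.
      for datastore in datastores:
        if datastore.dirty():
          send_updates(datastore)
          mark_clean(datastore)
      apply_other_command(command)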

We don't actually need to store the state anywhere (as it is implicit within the
while).

The updates should be synchronous
=================================

Currently, we do the updates in a somewhat asynchronous way. We send updates
and store that we are expecting an ACK. Once we receive the ACK, we remove it
from the expected set and check if we still expect something. But while we wait
for the ACKs, we can't really update anything on that datasource.

Let's not store the set of clients we expect an ACK from across the
application. Let's do the communication in a code-local fashion. Just broadcast
the updates and wait for all the answers. Don't do anything else until they are
back. This should be fast, as the readers need only map the segment.
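
A rough sketch of what "code-local" could look like follows. `FakeReader` and
its `send`/`recv` methods are hypothetical stand-ins for a reader reached over
msgq, not the real API. The point is only that the set of outstanding ACKs is a
local variable of one function.

```python
class FakeReader:
    """Hypothetical stand-in for a reader (for example an auth instance)."""

    def __init__(self, name):
        self.name = name
        self.mapped = []
        self._answers = []

    def send(self, segment):
        # A real reader would map the segment and answer over msgq.
        self.mapped.append(segment)
        self._answers.append(("ack", segment))

    def recv(self):
        # A real implementation would block here until the answer arrives.
        return self._answers.pop(0)


def broadcast_update(readers, segment):
    """Send the update to all readers, then wait for every answer."""
    for reader in readers:
        reader.send(segment)
    # Outstanding ACKs live only in this function, not in application state.
    pending = {reader.name for reader in readers}
    for reader in readers:
        status, acked = reader.recv()
        if status != "ack" or acked != segment:
            raise RuntimeError("unexpected answer from " + reader.name)
        pending.discard(reader.name)
    return pending  # empty once everybody has answered
```

Nothing else runs between the broadcast and the last answer, which is the price
paid for not having to track the ACKs anywhere else.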

Do the updates as a single command
==================================

Currently, we do two commands. The memory manager sends each reader a command
„Please, apply this update“, and an answer is sent, which means nothing. Then,
later on, the reader sends a command „I have finished“, and another meaningless
answer is sent by the memory manager.

This is based on the assumption that mapping the segment can take a long time.
However, I think this assumption is wrong and that mmap is reasonably fast
(it'll happen within a few milliseconds). Why not just use the command-answer
exchange?

This would require a small update to the auth code too, because currently auth
does a useless thread exercise there. It takes the update, which takes no time,
queues it for the work thread and waits for the work thread to notify it that
it is done. Why so complicated? Let's just lock, mmap, send the answer and be
done.
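
As an illustration of "lock, mmap, send answer", here is a small sketch using
Python's standard `mmap` module. The `Reader` class, the `"ok"` answer and the
`demo` helper are assumptions made up for this example. The real auth code is
structured differently and the real answer format is whatever msgq expects.

```python
import mmap
import os
import tempfile
import threading


class Reader:
    """Hypothetical reader that maps the segment inside the command handler."""

    def __init__(self):
        self._lock = threading.Lock()
        self._segment = None

    def handle_update(self, path):
        """Lock, map the segment file read-only, swap it in, and answer."""
        with self._lock:
            with open(path, "rb") as segment_file:
                new_segment = mmap.mmap(
                    segment_file.fileno(), 0, access=mmap.ACCESS_READ)
            old_segment, self._segment = self._segment, new_segment
        if old_segment is not None:
            old_segment.close()
        return "ok"  # the answer itself signals that the update is applied

    def read(self, start, end):
        with self._lock:
            return self._segment[start:end]

    def close(self):
        with self._lock:
            if self._segment is not None:
                self._segment.close()
                self._segment = None


def demo():
    """Write a tiny stand-in segment file and apply one update to it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"zone data")
        os.close(fd)
        reader = Reader()
        answer = reader.handle_update(path)
        data = reader.read(0, 4)
        reader.close()
        return answer, data
    finally:
        os.remove(path)
```

There is no work thread and no second command here. The handler returns only
after the new segment is mapped, so the answer can carry the "finished" meaning.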

Have just one history snapshot
==============================

I'm not even sure how this one is supposed to work. But when there's a new
config, we just create another complete set of the state guts and append it to
the list. When yet another new config comes, we create yet another one. And so
on. They live beside each other forever (or, at least, I didn't find a ticket
dealing with removing the old ones in any sane way and didn't spot a place to do
that logically). Furthermore, they use the concept of a generation ID, and it is
not yet decided how to implement it, or whether to do it at all.

Let's keep just the current one. We are not capable of keeping track of more
across history and I don't see a reason to do that. We do two things
differently:

• Instead of sending updates for separate data sources, we send all the
  parameters in one update, for all of them. We mark which ones are changed from
  the previous time and which ones are not. The auth usually applies only the
  ones that were changed, unless it has a fresh config (when it applies
  everything).
• We bundle the whole config (or some kind of hash of it) with the update.

The auth server compares the config it has with the config the update is for. If
they match, the update is applied.

Any update is stored for later (just the latest one). If a new config comes,
auth applies it and tries to apply the stored update (again comparing the
configs). If it matches, the update is applied (this is the case where the
update came first, then the config). If not, we wait for the next update, which
is about to come.

Disadvantages
=============

This approach is a lot simpler, because we need to keep (and wrap into objects)
less state. But it is also likely to have slightly smaller update throughput
(because the current design allows updates to one data source while another is
being synchronized). I don't think this is a big problem. For one, we don't
know that there would be a problem with catching up with updates, and most
people are likely to have just one data source (or just one busy one).

Another disadvantage is that we would have to change some of the code we have
already written, or throw some out, as it would no longer be needed. But I hope
it would still be worth it, as we have a lot of work ahead of us and it would
pay off there (and also in debugging).

I'm going to suspend work on #2858 for now, until we decide what we do with it.
Also, #2857 may be put aside, as it would be one of the things that would not be
needed (or, most of it at least).

With regards

-- 
All flame and insults will go to /dev/null (if they fit)

Michal 'vorner' Vaner