[bind10-dev] Profiling of b10-auth

Michal 'vorner' Vaner michal.vaner at nic.cz
Fri Jan 7 13:24:40 UTC 2011


Hello

On Thu, Jan 06, 2011 at 02:18:39PM -0800, JINMEI Tatuya / 神明達哉 wrote:
> > I learned few things. First, the huge zone turned out to be usable
> > only with half a million RRs, since that already took 2GB of
> > memory. That is way too much IMO and we should do something with it
> > (with ~250million of records for the .com zone, it would go near a
> > TB of RAM).
> 
> That's a predictable issue with the current implementation.  It uses
> memory very luxuriously (for faster development of the workable
> "feature-review" version).  To name a few, each RBT node contains a
> dns::Name object, and each RRset in the tree is represented as a
> dns::RRset object.  Both are very fat, and some information is
> redundant (the rbt node essentially contains the information of the
> name of the node, so, technically, the name object contained in each
> RRset is redundant).  We'll eventually need memory-conscious
> representation as tried in #404, but that's currently a lower priority
> task.

Yes, that is what I suspect too. Still, 4kB seems too much for even that. Maybe
there's more overhead with the allocation of small data structures or something.

Anyway, I had one crazy idea. I'm not sure if it's doable or reasonable, but it
is inspired by perl6 ‒ perl6 says there's at most one instance of each string in
memory. Therefore it probably has some kind of global hash table of strings and
actual string is just a pointer to the only one instance. If we did this with
names, we would both save the space, could do it per-label, have additional
hashing for free (since we would already know the hash by difference of the hash
table base pointer and the string pointer) and could use the per-label hierarchy
for the compression. But I'm not sure if it would be gain, because there's the
need to compute the hashes and so on.

> > Name::compare
> >   This one has 50% of the cache misses, and nearly all of them is reading from
> >   the „other“ argument. However, this one was a lot higher in the large zone
> >   case than in the small one. What I gather from that is there are lot of name
> >   compares called from the RBTree, with the „other“ argument being the names
> >   stored in the tree (therefore not cached).
> > 
> >   There was about 15% of mispredicted branches of the whole program.
> 
> As noted above, we'll eventually have to revisit the representation of
> node keys (which is currently a dns::Name object, heavily eating
> memory), and will have to have a customized comparison logic.  At that
> point we can also think about how to place the data in terms of CPU
> cache efficiency, like the one you previously proposed.

Or, if we don't want to invest so much effort in the implementation, some radix
tree would still look better than RB-tree and it is much simpler and more
compact.

One thing I forgot yesterday. 50% of the samples were from kernel. I'll need to
persuade oprofile to tell me where in kernel it is, but we seem to use a lot of
syscalls. It might be only the network IO, but there might be more we ask the
kernel to do.

I agree that it is lower priority, but I wanted to know if there's some low
hanging fruit. It turned out that probably not, but that there are no real
surprises, which is good to know as well.

Have a nice day

-- 
Next sleep is scheduled after 1k lines of code

Michal 'vorner' Vaner
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: <https://lists.isc.org/pipermail/bind10-dev/attachments/20110107/7f9ad54e/attachment.bin>


More information about the bind10-dev mailing list