[bind10-dev] Scaling the resolver across multiple cores

Shane Kerr shane at isc.org
Fri Mar 15 14:54:59 UTC 2013


Michal,

On Wed, 13 Mar 2013 10:13:42 +0100
Michal 'vorner' Vaner <michal.vaner at nic.cz> wrote:
> On Mon, Mar 11, 2013 at 11:31:51AM +0100, Michal 'vorner' Vaner wrote:
> > Unfortunately, they are mostly feelings. Does anybody have an idea
> > how to measure them reasonably (and without actually implementing
> > the whole resolver)?
> 
> As discussed yesterday on the call, here's a concrete proposal for how to
> measure them. We need something similar enough to a resolver, but we
> don't want to do real resolution.
> 
> So, we replace the resolution with something dummy. Let's say a
> resolution of a „query“ will contain these steps:

The steps look good; we can refine the actual numbers in the simulation
if they give us unreasonable results.

For example, I think the cache hit rate is a bit high; possibly we
should change the miss check to random(10) or even random(5) (i.e. a
10% or 20% miss rate instead of 2%).
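
To make that easy to play with, the check could take the divisor as a
parameter. A minimal sketch (randomInt() is only a stand-in for your
random(n), assumed to return a uniform integer in [0, n)):

    #include <cstdlib>

    // Stand-in for random(n): uniform-ish integer in [0, n).
    int randomInt(int n) {
        return std::rand() % n;
    }

    // Miss with probability 1/missDivisor: 50 -> ~2%, 10 -> 10%, 5 -> 20%.
    bool cacheMiss(int missDivisor) {
        return randomInt(missDivisor) == 0;
    }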

>  • Receive: { for (size_t i = 0; i < 1000000; i ++) doNothing(); }
>  • Parse: { for (size_t i = 0; i < 1000000000; i ++) doNothing(); }
>  • Look into the cache:
>    ◦ { for (size_t i = 0; i < 1000000000; i ++) doNothing(); }
>    ◦ if (random(50) == 0) {
>        Not found in cache → Proceed to upstream queries (do 1 + random(3) upstream queries)
>      } else {
>        Found, good
>      }
>  • Render: { for (size_t i = 0; i < 1000000000; i ++) doNothing(); }
>  • Send: { for (size_t i = 0; i < 1000000; i ++) doNothing(); }
> 
> Upstream query would look like:
>  • Some rendering, etc: { for (size_t i = 0; i < 1000000000; i ++) doNothing(); }
>  • Schedule a timeout after a short time (random(30) ms).
>  • After the time, update the cache: { for (size_t i = 0; i < 1000000000; i ++) doNothing(); }
> 
> Now, doNothing() would be an empty function (in some other
> compilation unit, so the compiler will not be able to optimise it
> away completely).
> 
> Then, we would implement this work in each of the models, with all
> the locking, etc. We would then generate the „queries“ artificially,
> queue a bunch of them at the beginning, and see how long it takes to
> answer all of them with a reasonable number of threads, tweaking the
> parameters (like the number of queries batched in the landlords
> model), etc.
> 
> Do you think it would work? 

I think it falls into the 'better than nothing' category, and nothing
is what we otherwise have. :)
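
For the actual measurement, I would keep it as simple as timing how long
it takes to drain a fixed batch of the dummy queries with a varying
number of threads. A rough sketch of what I have in mind (runModel() is
just a placeholder for whatever entry point each model's experiment
provides; I am using std::chrono here, Boost timers would do as well):

    #include <chrono>
    #include <cstddef>
    #include <iostream>

    // Placeholder: each model (threads, landlords, coroutines) provides
    // its own implementation in its ticket.
    void runModel(std::size_t query_count, std::size_t thread_count);

    int main() {
        const std::size_t queries = 10000;   // batch queued up front
        for (std::size_t threads = 1; threads <= 8; threads *= 2) {
            const auto start = std::chrono::steady_clock::now();
            runModel(queries, threads);
            const auto stop = std::chrono::steady_clock::now();
            std::cout << threads << " threads: "
                      << std::chrono::duration_cast<std::chrono::milliseconds>(
                             stop - start).count()
                      << " ms for " << queries << " queries" << std::endl;
        }
        return 0;
    }

The numbers of queries and threads are of course just placeholders to
tune once we see how long a run takes.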

> If so, we would have something like 4 tickets: 
> • Write the functions to do the work described here that would be
> placed in the respective parts of the model. This would be to ensure
> every experiment measures the same „workload“.
> • 3 tickets, one for each model.

Yes, this seems correct to me.
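
For the first of those tickets, I picture the shared workload piece
roughly like this (only a sketch, with your loop counts copied in;
doNothing() lives in a separate compilation unit exactly as you
describe, and randomInt() is the same stand-in for random(n) as above):

    #include <cstddef>
    #include <cstdlib>

    void doNothing();   // defined in another compilation unit, empty body

    namespace workload {

    // Burn CPU in a way the compiler cannot optimise away.
    inline void burn(std::size_t iterations) {
        for (std::size_t i = 0; i < iterations; ++i) {
            doNothing();
        }
    }

    // Stand-in for random(n): uniform-ish integer in [0, n).
    inline int randomInt(int n) {
        return std::rand() % n;
    }

    inline void receive()     { burn(1000000); }
    inline void parse()       { burn(1000000000); }
    inline void cacheLookup() { burn(1000000000); }
    inline void render()      { burn(1000000000); }
    inline void send()        { burn(1000000); }

    // Number of upstream queries to issue: 0 on a cache hit,
    // 1 + random(3) on a miss (miss probability 1/missDivisor).
    inline int upstreamQueriesNeeded(int missDivisor) {
        return randomInt(missDivisor) == 0 ? 1 + randomInt(3) : 0;
    }

    // Per-upstream-query work; the random(30) ms timeout between the
    // two halves is scheduled by the model itself.
    inline void upstreamRender() { burn(1000000000); }
    inline void cacheUpdate()    { burn(1000000000); }

    } // namespace workload

Each model would then call these in its own receive/parse/lookup/
upstream/render/send flow, with its own locking around the cache steps,
so every experiment really does measure the same workload.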

> Now, I don't know if anybody has a coroutine library at hand; I don't
> know of any, and I don't think writing one with ucontext.h is a
> reasonable approach.

I only know what I saw on Wikipedia:

http://en.wikipedia.org/wiki/Coroutine#Implementations_for_C.2B.2B

Possibly we should try the Boost.Coroutine stuff, since we are using
Boost anyway?
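
I haven't used it myself, so treat the following as a sketch based on
the documentation examples rather than tested code; it follows the
pull_type/push_type interface from the newer docs, and the names may
well differ in whatever Boost version we end up pinning to. The shape
for "run a query until it needs an upstream answer, suspend, resume
later" would be something like:

    #include <boost/coroutine2/all.hpp>
    #include <iostream>

    // One suspended "query in progress": the coroutine runs until it
    // needs an upstream answer, yields, and the event loop resumes it
    // when the (here: pretend) answer arrives.
    using coro_t = boost::coroutines2::coroutine<void>;

    int main() {
        coro_t::pull_type query([](coro_t::push_type& yield) {
            std::cout << "parse + cache lookup" << std::endl;
            yield();   // suspend: upstream query outstanding
            std::cout << "cache update + render + send" << std::endl;
        });

        // ...the event loop would wait for the upstream answer here...
        std::cout << "upstream answer arrived, resuming" << std::endl;
        query();       // resume the suspended query
        return 0;
    }

That would let us keep the per-query logic linear while the event loop
juggles many suspended queries, which is exactly what we want to
compare against the other two models.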

Cheers,

--
Shane