[bind10-dev] Resolver skeleton, location of logic, and plans/discussionstarter for that

Thu Jan 20 22:55:47 UTC 2011

OK. So, I've also begun working in and around this code and I certainly 
agree that there's a lot of stuff scattered about. Clearly, debugging 
would be a big issue.

It seems that we ought to take a step back and lay out a 
design<->implementation relationship that says what will be where. For 
instance, I thought that asiolink was a utility library which would 
provide some sort of communication object - a udp/tcp handle or whatever 
- something generic and abstract. I also expected the actual 
query/answer objects would be elsewhere (perhaps in resolver). In short, 
I agree that RunningQuery should not be in asiolink. I'm less clear on 
whether it is a somewhat redundant object or that its function should be 
handled in some other object, although I highly favor simplicity.

Anyway, I do think a discussion is in order.

However, I'm not sure that this it is necessary to do a pre-April 
discussion of this topic. Which then implies that I think that the code 
should be reviewed/merged and refactored later. Sometimes having a 
poorly implemented prototype that works is a good thing.

Just my $.02.

-Scott

On 01/20/2011 10:02 AM, Jelte Jansen wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> (warning, bit of a long mail, sorry)
>
> Ok, so in branch experiments/resolver I have tidied up my little experiment from
> last week a bit, and since people have started working on related code, we might
> want to get some progress on this. I'm not sure if we should review, merge, then
> refactor, or the other way around.
>
> I had a brief discussion with Stephen yesterday, and we seem to disagree on
> where this code should actually be (though I think we both agree on the 'not
> where it is now' part, as well as on that the current general layout is
> somewhat... misfortunate)
>
> Let me first describe how I read what it does, and why the resolver logic is in
> the location it is now. That might provide a good starting point for either a
> discussion or a refactor (or probably both).
>
> The resolver binary (main.cc) sets up a Resolver object (from resolver.cc),
> which in turn creates DNSServers (tcpdns.cc and udpdns.cc) and a DNSService
> (asiolink.cc), as well as a RecursiveQuery object (asiolink.cc).
>
> The servers handle the accepting of queries, the service does the processing, by
> passing events on to a lookup callback or an answer callback, both from
> resolver.cc again. The RecursiveQuery object is used to start recursive lookups.
>
> The answer callback does some final polishing of the answer and renders it, then
> control falls back to the DNSServer in question which sends it back to the client.
>
> The lookup callback calls another function in the Resolver object
> (processMessage() in resolver.cc) which does some sanity checks, and either
> fails on those or starts a recursive lookup. In the failure case, it'll create
> an error dns packet and let control fall back to the DNSServer object, similar
> to the answer handler. In the case a lookup is started, the existing
> RecursiveQuery object (one fixed object, this holds the actual settings for
> queries) creates a RunningQuery object, which sends out the actual query, and
> gets an event if there is either an answer or a timeout.
>
> Now in the case of the forwarder mode, this answer is what it wants, and it'll
> copy that, send a resume() to the DNSServer (which makes it continue into
> calling the answer callback), and then deletes itself. In the case of a timeout,
> it'll resend, until it runs out of retries, in which case it'll also tell the
> DNSServer to resume, but to do nothing further regarding the query in progress
> (and then it deletes itself).
>
>
> The above describes how I understand the code to work.
>
> So here's what I added/changed:
>
> Since RunningQuery is the object doing the actual work for a lookup (i.e.
> deciding where to send it, sending it, handling events regarding an answer or a
> timeout), this seemed to me the most logical place to put the logic for now.
>
> I've added a handleRecursiveAnswer() method, which operator() calls if it is in
> recursive mode (i.e. upstream_ is empty). That method looks at the answer it got
> back, and either causes a new send() (if it was a delegation; it updates the
> list of servers to try, and then calls send again), or decides it is done.
>
> Depending on the return value of that function, operator() will call resume() on
> the server (if it's done), or simply do nothing until the next event comes in
> (which should be cause by the new send() in handleRecursiveAnswer(), and can be
> either a timeout or a new response).
>
> I have added a MessagePtr answer_message to the mix, where the answer is to be
> built. This ptr is created by the DNSServer (udpdns.cc), and propagated into
> RunningQuery, that could for instance add CNAME records while it it looking for
> the final answer. The buffer, though less useful for this specific purpose, is
> still used in sending out queries and parsing responses (and left in at a few
> places where it may not be essential anymore).
>
>
> Soooo. That's the short of what it does right now in experiments/resolver.
>
> There are a few problems, one of which is that is seems somewhat... complicated.
>
> An obvious conclusion is that logic and control is kind of flying around in some
> very different places right now. Part of the handling is done in resolver.cc,
> part in tcp/udpdns.cc and part in asiolink.cc.
>
> A second one is that RunningQuery does the actual recursive handling, which
> seems to me the most logical place (right now, since that's the part getting and
> creating events), but not everyone seems to agree :)
>
> I do think we should get it out of asiolink though. And rename RecursiveQuery to
> RecursiveQueryStarter (or Factory or something), or perhaps even remove it
> completely. It really doesn't do much besides storing data and starting
> RunningQueries. (another option, if possible would be to make RecursiveQuery do
> the actual logic, but then RunningQuery would be reduced to send+timeout,
> without even retries, in which case i'd probably argue for removing that one :))
>
> We should also move all the handling to a common place, whether or not that is
> in one or more files, and certainly out of asiolink. The 'hidden' tcpdns.h and
> udpdns.h seem strange to me as well. I want the full resolver process to be
> callable as a library call so some things that are now in resolver.cc may need
> to move to that common place as well (and then resolver.cc should call that).
>
> There's probably more, but this mail has been long enough for now.
>
> I shall go hide under my desk until the chairs have stopped flying.
>
> Jelte
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk04apAACgkQ4nZCKsdOncV1GwCghug3ac/F5VXLI1/82Pr+1A4b
> ScUAoL52WHPzK1XdeyPUOEr+2FkG57or
> =tFDY
> -----END PGP SIGNATURE-----
> _______________________________________________
> bind10-dev mailing list
> bind10-dev at lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind10-dev