INSIST in lwdclient.c

Kory Keith korykeith.tx at gmail.com
Mon Aug 10 14:46:48 UTC 2015


I tried the patch, but given enough time and enough queries it just moves
the failure somewhere else. The code you provided fixes that specific race,
but the code just below that section has the same kind of problem:

        ISC_LIST_UNLINK(cm->idle, client, link);
        ISC_LIST_APPEND(cm->running, client, link);

Sometimes I get an INSIST failure or REQUIRE failure here because two
threads can manipulate the idle and running lists at the same time.
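
To make the failure mode concrete, here is a minimal standalone sketch of
serializing the idle-to-running move with a lock. It uses plain pthreads and a
hand-rolled list, not the ISC_LIST macros or isc_mutex_t, and every name in it
(mgr_t, client_t, mgr_start_one, worker, and so on) is hypothetical. It is not
a proposed patch to lwdclient.c, only an illustration of the idea under those
assumptions.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT the BIND types. */
typedef struct client {
    struct client *prev, *next;
} client_t;

typedef struct {
    client_t *head, *tail;
} clientlist_t;

typedef struct {
    pthread_mutex_t lock;   /* assumption: a lock added for this sketch */
    clientlist_t idle;
    clientlist_t running;
} mgr_t;

/* Append c to the tail of l. Caller must hold the manager lock. */
static void list_append(clientlist_t *l, client_t *c) {
    c->next = NULL;
    c->prev = l->tail;
    if (l->tail != NULL)
        l->tail->next = c;
    else
        l->head = c;
    l->tail = c;
}

/* Remove c from l. Caller must hold the manager lock. */
static void list_unlink(clientlist_t *l, client_t *c) {
    if (c->prev != NULL)
        c->prev->next = c->next;
    else
        l->head = c->next;
    if (c->next != NULL)
        c->next->prev = c->prev;
    else
        l->tail = c->prev;
    c->prev = c->next = NULL;
}

/* Count the entries on l (used only for checking the result). */
static size_t list_count(const clientlist_t *l) {
    size_t n = 0;
    for (const client_t *c = l->head; c != NULL; c = c->next)
        n++;
    return n;
}

static void mgr_init(mgr_t *m) {
    pthread_mutex_init(&m->lock, NULL);
    m->idle.head = m->idle.tail = NULL;
    m->running.head = m->running.tail = NULL;
}

/*
 * Move one client from idle to running as a single critical section,
 * so no other thread can observe or modify either list halfway through.
 * Returns the client moved, or NULL if the idle list was empty.
 */
static client_t *mgr_start_one(mgr_t *m) {
    pthread_mutex_lock(&m->lock);
    client_t *c = m->idle.head;
    if (c != NULL) {
        list_unlink(&m->idle, c);
        list_append(&m->running, c);
    }
    pthread_mutex_unlock(&m->lock);
    return c;
}

/* Thread body: keep moving clients until the idle list is drained. */
static void *worker(void *arg) {
    mgr_t *m = arg;
    while (mgr_start_one(m) != NULL)
        ;
    return NULL;
}
```

With the lock in place, several threads running worker() against one manager
should each take distinct clients, and every client should end up on the
running list exactly once. Without the lock, two threads can both read the same
idle head and unlink it twice, which is the kind of corruption the list
assertions catch.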

One question I had about the clientmgr code in general - is there any
benefit to having more than one client per clientmgr? Or can I avoid a lot
of these data race problems/thread contention problems by setting NRECVS =
1?

Thank you,

Kory

On Fri, Aug 7, 2015 at 3:51 PM, Mark Andrews <marka at isc.org> wrote:

>
> Yes, there is a race.
>
> diff --git a/bin/named/lwdclient.c b/bin/named/lwdclient.c
> index a843134..df85c7c 100644
> --- a/bin/named/lwdclient.c
> +++ b/bin/named/lwdclient.c
> @@ -295,19 +295,23 @@ ns_lwdclient_startrecv(ns_lwdclientmgr_t *cm) {
>         INSIST(NS_LWDCLIENT_ISIDLE(client));
>
>         /*
> +        * Set the flag to say there is a recv pending.  If isc_socket_recv
> +        * fails we will clear the flag otherwise it will be cleared by
> +        * ns_lwdclient_recv.
> +        */
> +       cm->flags |= NS_LWDCLIENTMGR_FLAGRECVPENDING;
> +
> +       /*
>          * Issue the recv.  If it fails, return that it did.
>          */
>         r.base = client->buffer;
>         r.length = LWRES_RECVLENGTH;
>         result = isc_socket_recv(cm->sock, &r, 0, cm->task, ns_lwdclient_recv,
>                                  client);
> -       if (result != ISC_R_SUCCESS)
> +       if (result != ISC_R_SUCCESS) {
> +               cm->flags &= ~NS_LWDCLIENTMGR_FLAGRECVPENDING;
>                 return (result);
> -
> -       /*
> -        * Set the flag to say we've issued a recv() call.
> -        */
> -       cm->flags |= NS_LWDCLIENTMGR_FLAGRECVPENDING;
> +       }
>
>         /*
>          * Remove the client from the idle list, and put it on the running
>
> In message <CAK22deqpUWTYVaipRrc0FTEgUG_B8RtEt+9Jry_FJmTNeSXr5Q at mail.gmail.com>, Kory Keith writes:
> > I have an application that is using BIND lightweight resolver ~9.4 -
> > although I've checked the 9.10.2 code and it looks the same.
> >
> > I'm seeing an INSIST check failing in lwdclient.c:
> > INSIST((cm->flags & NS_LWDCLIENTMGR_FLAGRECVPENDING) != 0);
> >
> > I have multi-threading enabled - I have 3 worker threads running in the
> > resolver.
> >
> > My question about lwdclient.c is how can the resolver support multiple
> > threads with no locking mechanism on the ns_lwdclientmgr_t?
> >
> > Through debugging I'm seeing one thread creating the event and another
> > thread waking up to process that event before the first thread has set this
> > flag, and thus the INSIST check fails.
> >
> > I understand that the timing has to be incredibly precise for this case to
> > happen, but with multiple threads acting on the same client manager, isn't
> > it just a matter of time until things go bad?
> >
> > Am I missing some key piece of code that should prevent this?
> >
> > Thank you.
> --
> Mark Andrews, ISC
> 1 Seymour St., Dundas Valley, NSW 2117, Australia
> PHONE: +61 2 9871 4742                 INTERNET: marka at isc.org
>