[bind10-dev] What to do when msgq can't write to a reader?

Scott Mann smann at isc.org
Wed Jan 5 19:36:16 UTC 2011


Shane,

Yup. You're probably right :)

-Scott

On 01/05/2011 10:58 AM, Shane Kerr wrote:
> Scott,
>
> As for whether processes are well-behaved when messages don't arrive, we
> have to assume that messages might get lost. After all, components can
> crash. We don't assume this is a common case, so error handling may be
> to punt operations on the large scale rather than retry.
>
> Of course, if some process is not reading properly, we're probably
> better off having interaction with it fail than the entire system
> becoming unusable. Probably. :)
>
> --
> Shane
>
> On Wed, 2011-01-05 at 09:59 -0700, Scott Mann wrote:
>> On the surface of it, I would also vote for #1 (or #1+ where you kill
>> the process - the crash-only software approach) because it affords a lot
>> of simplicity. My only question is, are the other processes well behaved
>> when a message they've sent is never received? In my experience, option
>> #2 is a tar baby.
>>
>> -Scott
>>
>> On 01/05/2011 09:34 AM, Shane Kerr wrote:
>>> All,
>>>
>>> I have a ticket concerning a bug that makes BIND 10 stop working
>>> properly on my server, ticket #420:
>>>
>>> https://bind10.isc.org/ticket/420
>>>
>>> Basically, if something is not reading its msgq properly then the msgq
>>> blocks writing to it.
>>>
>>> It's annoying me enough that I want to fix it. The question is, what is
>>> the correct behavior if a reader is not reading?
>>>
>>> I think there are two basic options:
>>>
>>>        1. Close the connection.
>>>        2. Buffer the data.
>>>
>>> Neither is perfect or simple.
>>>
>>> Option #1 means that a process that is busy for a period of time may get
>>> disconnected.
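
A minimal sketch of what option #1 could look like, assuming a
non-blocking stream socket per reader. The function name send_or_drop and
its name parameter are hypothetical and are not taken from the msgq
source. The policy shown is the simplest one: if the kernel cannot accept
the whole message immediately, log which reader was stuck and close its
connection.

```python
import logging
import socket

log = logging.getLogger("msgq_sketch")


def send_or_drop(sock, data, name="unknown"):
    """Send data without blocking; close sock if the reader is not keeping up.

    Returns True if all of data was written, False if the connection
    was closed instead. (Hypothetical helper, not the msgq API.)
    """
    sock.setblocking(False)
    try:
        sent = sock.send(data)
    except BlockingIOError:
        # The kernel buffer is full: the reader is not reading.
        sent = 0
    except OSError:
        # The peer is already gone (for example a broken pipe).
        sent = -1
    if sent < len(data):
        # Name the offending reader for debugging, then disconnect it.
        log.warning("closing connection to %s: write would block", name)
        sock.close()
        return False
    return True
```

Note that a partial write is also treated as "not reading" here, which is
exactly the downside described above: a reader that is merely busy for a
while gets disconnected.
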
>>>
>>> Option #2 still needs some sort of way to handle a full buffer (after
>>> all, Unix sockets have something like 4096 bytes of buffer, so there is
>>> some already), uses more resources, and delays handling of
>>> broken connections.
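
To illustrate why option #2 still needs a full-buffer policy, here is a
rough sketch of a per-reader output buffer with a hard cap. The class
name BoundedOutBuffer and the max_bytes parameter are assumptions made
for illustration only. When queue() returns False the caller is back to
the original question of what to do with a reader that is not reading.

```python
import socket


class BoundedOutBuffer:
    """User-space output buffer with a hard size limit (hypothetical)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.pending = bytearray()

    def queue(self, data):
        """Queue data for later sending; return False if it would overflow."""
        if len(self.pending) + len(data) > self.max_bytes:
            return False
        self.pending += data
        return True

    def flush(self, sock):
        """Write as much as a non-blocking sock accepts right now.

        Returns the number of bytes still waiting in the buffer.
        """
        while self.pending:
            try:
                sent = sock.send(self.pending)
            except BlockingIOError:
                # Kernel buffer is full; try again when sock is writable.
                break
            del self.pending[:sent]
        return len(self.pending)
```

The cap only moves the problem: it adds memory on top of the kernel's own
socket buffer and postpones the moment a stuck reader is noticed.
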
>>>
>>> So, I think I would like to implement option #1 (also naming & shaming
>>> the process involved for debugging purposes!), unless there are any
>>> objections.
>>>
>>> (Note that a more extreme version of #1 might be to terminate the stuck
>>> process, rather than simply closing the connection.)
>>>
>>> BTW, when looking at this I found a cool page about Unix domain sockets:
>>>
>>> http://www.wsinnovations.com/softeng/articles/uds.html
>>>
>>> This even points out some Linux-specific features that I wish were
>>> portable. :-P
>>>
>>> --
>>> Shane
>>>
>>> _______________________________________________
>>> bind10-dev mailing list
>>> bind10-dev at lists.isc.org
>>> https://lists.isc.org/mailman/listinfo/bind10-dev
>



