BIND 10 #1859: many auth servers results in Unable to open domain socket on macmini

BIND 10 Development do-not-reply at isc.org
Mon Apr 2 09:26:54 UTC 2012


#1859: many auth servers results in Unable to open domain socket on macmini
--------------------------------------------+-------------------------------------
                   Reporter:  jreed         |                 Owner:
                       Type:  defect        |                Status:  new
                   Priority:  medium        |             Milestone:  New Tasks
                  Component:  Unclassified  |            Resolution:
                   Keywords:                |             Sensitive:  0
            Defect Severity:  N/A           |           Sub-Project:  DNS
Feature Depending on Ticket:                |  Estimated Difficulty:  0
        Add Hours to Ticket:  0             |           Total Hours:  0
                  Internal?:  0             |
--------------------------------------------+-------------------------------------

Comment (by vorner):

 Hello

 Replying to [comment:3 jreed]:
 > {{{
 >      The backlog parameter defines the maximum length for the queue of
 >      pending connections.  If a connection request arrives with the queue
 >      full, the client may receive an error with an indication of
 >      ECONNREFUSED.
 > }}}
 >
 > I don't know yet how this is related, if it is.

 I think it is. What I see as a possible problem is that the boss listens
 on a Unix domain socket in order to send over the file descriptors it gets
 from the socket creator. Imagine the following:
  * All 15 auths are started at approximately the same time.
  * So they all connect to the boss at approximately the same time and ask
 for sockets.
  * The boss picks one and starts handling it (it is not blocked on the
 socket creator at this moment; that happened earlier). But since it is
 Python, it takes some time to get through the data structures and answer.
  * In the meantime, the 14 others try to connect and the queue fills up.
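
 The scenario above can be tried in isolation with a small sketch. This is
 not the boss code: the socket name and the function are made up for
 illustration. A listener is created with a given backlog and never calls
 accept(), while a number of non-blocking clients connect to it. How many
 connects are queued and how many fail (and with which errno) depends on
 the operating system, so the sketch only counts them.

```python
import os
import socket
import tempfile


def count_pending_connects(backlog, clients):
    """Connect `clients` sockets to a listener that never calls accept().

    Returns (queued, rejected). The split depends on the operating system.
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "boss.sock")  # hypothetical socket name
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    conns = []
    queued = rejected = 0
    try:
        server.bind(path)
        server.listen(backlog)
        for _ in range(clients):
            c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            # Non-blocking, so a full queue raises instead of hanging.
            c.setblocking(False)
            conns.append(c)
            try:
                c.connect(path)
                queued += 1
            except OSError:
                # For example ECONNREFUSED or EAGAIN, depending on the platform.
                rejected += 1
    finally:
        for c in conns:
            c.close()
        server.close()
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)
    return queued, rejected


if __name__ == "__main__":
    # 15 clients against a backlog of 5, as in the scenario described above.
    print(count_pending_connects(backlog=5, clients=15))
```

 Running it with a backlog of 5 and again with a larger value on the
 affected machine should show whether the backlog matters there.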

 It could also happen while the boss is blocked requesting another socket
 from the socket creator (the socket is first „reserved“ over msgq, then
 sent over the Unix domain socket).
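
 For readers unfamiliar with passing descriptors over a Unix domain socket,
 here is a minimal sketch of the mechanism (SCM_RIGHTS ancillary data). It
 is not the BIND 10 implementation: the helper names are hypothetical, a
 socketpair stands in for the connection between boss and auth, and a UDP
 socket bound to a random local port stands in for the socket the creator
 would provide.

```python
import array
import socket


def send_fd(channel, fd):
    """Send one file descriptor over a Unix domain socket (SCM_RIGHTS)."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary data must accompany the ancillary data.
    channel.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel):
    """Receive one file descriptor sent by send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = channel.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial integer in the control data.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def pass_socket_demo():
    """Create a UDP socket, hand it across a socketpair, compare addresses."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    original = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    original.bind(("127.0.0.1", 0))
    try:
        send_fd(sender, original.fileno())
        new_fd = recv_fd(receiver)
        # Wrap the received descriptor; it refers to the same bound socket.
        copy = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, fileno=new_fd)
        try:
            return original.getsockname(), copy.getsockname()
        finally:
            copy.close()
    finally:
        original.close()
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(pass_socket_demo())
```

 Both addresses returned by pass_socket_demo() should be identical, since
 the received descriptor refers to the same underlying socket.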

 The difference I see: I guess Linux just sets the listen queue to some
 effectively unlimited number no matter what you pass there, while we set
 the listen parameter to 5 in the boss. Could you try setting it to
 something larger? It is on line 900 of boss.

 This still wouldn't explain the msgq failure, which might be something
 unrelated. Anyway, this listening problem would go away once we teach
 msgq to send file descriptors in-band; we could then get rid of the whole
 listening socket in the boss.

-- 
Ticket URL: <http://bind10.isc.org/ticket/1859#comment:4>
BIND 10 Development <http://bind10.isc.org>