BIND 10 #1522: Implementation of isc::server_common::SocketRequestor

BIND 10 Development do-not-reply at isc.org
Wed Jan 4 15:25:12 UTC 2012


#1522: Implementation of isc::server_common::SocketRequestor
---------------------------------------------------------------------------
                    Reporter:  vorner
                       Owner:  jinmei
                      Status:  reviewing
                        Type:  task
                   Milestone:  Sprint-20120110
                    Priority:  blocker
                  Resolution:
                   Component:  Unclassified
                   Sensitive:  0
                 Sub-Project:  Core
                    Keywords:
        Estimated Difficulty:  0
             Defect Severity:  N/A
                 Total Hours:  10
 Feature Depending on Ticket:  Socket creator
         Add Hours to Ticket:  0
                   Internal?:  0
---------------------------------------------------------------------------
Changes (by vorner):

 * owner:  vorner => jinmei


Comment:

 Hello

 Replying to [comment:18 jinmei]:
 > I'm afraid I was misunderstood.  Assuming your comment is about
 > "btw I suspect this is also the case for the boss", what I wanted to
 > point out is that a bug in a client component could make the boss
 > crash, too.

 Ah, right, that would be bad. But I believe the boss already ignores
 SIGPIPE for some other reason.
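
 (For reference, a minimal sketch of what ignoring SIGPIPE amounts to;
 this is only an illustration, not the actual boss code, and the function
 name is made up:)

     #include <signal.h>

     // Ignore SIGPIPE process-wide so that a write() to a peer that has
     // already closed its end fails with EPIPE instead of killing us.
     void ignoreSigpipe() {
         signal(SIGPIPE, SIG_IGN);
     }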

 > I understood that, but it can be done from the fact that the
 > application dies itself.  In any case, as long as a socket could be in
 > a broken state, I don't think it acceptable to pretend that it's safe
 > to keep using it.  The choices I can see are: drop everything and
 > create all necessary sockets from scratch; similar to the previous
 > one, but just abort the application and restart it; or separate the
 > socket state and connection state.

 Maybe killing the application could be the way to go for now. If it turns
 out this happens in practice, we can look at whether it can be solved in a
 better way, right?

 > Maybe my practical suggestion would be to just abort the application
 > if such "should not happen" thing happens with an assert or non
 > catchable exception.  If it really doesn't happen, it's just fine; if
 > it can happen in a practical scenario, it's better to know that fact
 > sooner.  And I insist we should do it sooner than later - if not for
 > the coming release, in the next sprint or so.

 Aborting or doing something similarly simple is OK, but I was afraid you
 wanted to implement some complicated resynchronisation protocol to work
 around possible bugs which, if ever encountered, should simply be fixed,
 and which we don't even know exist at all. That would seem like a waste of
 time to me.
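
 To be concrete about what "aborting" would mean here, a hypothetical
 sketch (the function name and the message are made up):

     #include <cstdlib>
     #include <iostream>

     // Hypothetical "should not happen" handler: record what we know and
     // terminate rather than trying to resynchronise over a channel that
     // may already be broken.
     void fatalProtocolError(const char* what) {
         std::cerr << "socket creator protocol error: " << what << std::endl;
         std::abort();   // dumps core; nothing can catch this
     }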

 > The difference is that in the current style recv_fd() can be called
 > only after we examine the status, and it just wouldn't make sense to
 > call it if we know the status code is bogus.  If we piggy-back the
 > status in the recvmsg, we can detect the code is bogus while still
 > retrieving the FD from the ancillary data, then close the FD (all
 > within recv_fd()).

 But then, calling close() on _something_ we got from the ancillary data
 (if the status is bogus, the socket may be as well) might lead to closing
 something completely unrelated. That doesn't sound too good to me either.
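
 Just to make the variant under discussion concrete, here is a rough
 sketch of a piggy-backed receive (this is not the existing recv_fd(); the
 'S' status marker and the function name are made-up assumptions):

     #include <sys/socket.h>
     #include <sys/uio.h>
     #include <unistd.h>
     #include <cstring>

     // Hypothetical piggy-backed receive: one status byte in the normal
     // data, the descriptor (if any) in SCM_RIGHTS ancillary data.  A
     // bogus status closes whatever descriptor arrived so it cannot leak.
     int recvFdWithStatus(const int sock) {
         char status = 0;
         struct iovec iov;
         iov.iov_base = &status;
         iov.iov_len = 1;

         char control[CMSG_SPACE(sizeof(int))];
         struct msghdr msg;
         std::memset(&msg, 0, sizeof(msg));
         msg.msg_iov = &iov;
         msg.msg_iovlen = 1;
         msg.msg_control = control;
         msg.msg_controllen = sizeof(control);

         if (recvmsg(sock, &msg, 0) <= 0) {
             return (-1);            // read error or EOF
         }

         int fd = -1;
         struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
         if (cmsg != NULL && cmsg->cmsg_level == SOL_SOCKET &&
             cmsg->cmsg_type == SCM_RIGHTS) {
             std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
         }

         if (status != 'S') {        // 'S' = success; made-up marker
             if (fd != -1) {
                 close(fd);          // don't leak the received descriptor
             }
             return (-1);
         }
         return (fd);
     }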

 > > Few notes for the new code:
 > >  * Is there no logging when the request fails for some reason (error
 > >    response over msgq, CREATOR_SOCKET_UNAVAILABLE, other failures…).
 >
 > My intent was that in these cases we throw an exception and it will be
 > logged in an appropriate context at a higher layer.  (But when we
 > separate the error case of failing to create a socket from other rare
 > failures, it would probably make sense to log the event about the
 > former).

 OK, that is fine, I was just making sure it was not overlooked.

 > Okay, done, and while doing so I noticed that the current code could
 > accept bogus 'protocol' and 'share mode'.  Even though it may be
 > rejected after exchanging messages with the boss, it would be better
 > to reject such a case at an earlier point.  I added the check and
 > corresponding tests.

 Well, these are enums, so the application should not be able to create
 invalid values in any legal way, but being too paranoid is probably better
 than being too careless.
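
 Just to illustrate the kind of early check meant here, a rough sketch
 (the enum values and the function name are stand-ins, not necessarily the
 real SocketRequestor interface):

     #include <stdexcept>

     enum Protocol { TCP, UDP };
     enum ShareMode { DONT_SHARE, SHARE_SAME, SHARE_ANY };

     // A caller can still static_cast an arbitrary integer into an enum,
     // so reject unknown values before any message goes to the boss.
     void checkRequestParams(const Protocol protocol, const ShareMode mode) {
         if (protocol != TCP && protocol != UDP) {
             throw std::invalid_argument("invalid protocol");
         }
         if (mode != DONT_SHARE && mode != SHARE_SAME && mode != SHARE_ANY) {
             throw std::invalid_argument("invalid share mode");
         }
     }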

 > >  * What is wrong with ssize_t in
 > >    85bf91d5836306a39b471567952ad1e0af91d948?  ssize_t is signed, so
 > >    the error handling should work, and seems more clean than int, as
 > >    it is 64bit on 64bit systems (therefore it allows sending more
 > >    than 2GB of data at once, though it is unlikely anyone would do
 > >    that).
 >
 > I simply tried to use the same type as the return value of read() or
 > write(), but I didn't necessarily have to change it.  In any case
 > using a larger size than int doesn't help anyway because
 > read()/write() can only return int values.  So in either case the
 > behavior of the code should be exactly the same.  If you still want to
 > revert this part, however, please do so.

 Well, my man 2 read (and write) pages tell me that read and write return
 ssize_t. But as I said, the practical difference is none, I guess, so it
 doesn't really matter.
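
 For completeness, the pattern in question is roughly this (illustrative
 sketch only, not the ticket's code):

     #include <unistd.h>

     // read(2)/write(2) are declared to return ssize_t; keeping the
     // result in ssize_t preserves the signed error value (and the full
     // width on LP64 systems), although for the small messages exchanged
     // here an int behaves the same in practice.
     ssize_t readOnce(const int fd, void* buf, const size_t len) {
         return (read(fd, buf, len));   // -1 with errno set on error
     }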

 Thank you

-- 
Ticket URL: <http://bind10.isc.org/ticket/1522#comment:20>
BIND 10 Development <http://bind10.isc.org>