BIND 10 #1522: Implementation of isc::server_common::SocketRequestor
BIND 10 Development
do-not-reply at isc.org
Mon Jan 2 17:48:40 UTC 2012
#1522: Implementation of isc::server_common::SocketRequestor
-------------------------------------+-------------------------------------
             Reporter:  vorner       |                 Owner:  jinmei
                 Type:  task         |                Status:  reviewing
             Priority:  blocker      |             Milestone:  Sprint-20120110
            Component:  Unclassified |            Resolution:
             Keywords:               |             Sensitive:  0
      Defect Severity:  N/A          |           Sub-Project:  Core
  Add Hours to Ticket:  0            |  Estimated Difficulty:  0
            Internal?:  0            |           Total Hours:  10
  Feature Depending on Ticket:  Socket creator
-------------------------------------+-------------------------------------
Changes (by vorner):
* owner: vorner => jinmei
Comment:
Hello
Trac is back, so the answer should go here as well.
Michal 'vorner' Vaner
Replying to [comment:14 jinmei]:
> Replying to [comment:13 vorner]:
> Is there any documentation that describes the protocol between the
> boss and other clients? I cannot find it in bind10/README or in
> lib/bind10/README, so I've looked at the bind10.py implementation and
> guessed what's expected, and updated the code based on it.
No, I didn't find the time to write the documentation yet, and I doubt
anyone else did, as it was mostly me (with the exception of this
particular branch) who worked on the socket creator. I know I should
have, so I'll probably create a ticket for it sometime soon.
> This caused an additional, unexpected issue: if the remote end (in
> practice, boss) closes the connection first, requestSocket() would
> result in SIGPIPE (and the application would die). Just like we did
> for `SocketSession`, we need to at least ignore that signal. I've
> added that code too (btw I suspect this is also the case for the boss -
> an app could die after requestSocket(), which would automatically
> close the connection). The fact that we did the same trick again is
> another reason why we should unify this and `SocketSession` (but
> that's beyond the scope of our current work).
Well, this wouldn't be such a problem, because the boss is one of the
components that aren't allowed to crash, and it doesn't close the socket
anywhere in its code. But it's probably better to have it there anyway.
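
For reference, ignoring the signal on the application side could look
roughly like the sketch below; this is only an illustration with a
hypothetical helper name, called once at startup, and the real
SocketSession/SocketRequestor code may do it differently:

    #include <signal.h>
    #include <stdexcept>

    // Ignore SIGPIPE process-wide, so that writing to a connection the
    // peer (in practice the boss) has already closed makes send() fail
    // with EPIPE instead of killing the application.
    // (Hypothetical helper; not the actual library code.)
    void
    ignoreSigPipe() {
        struct sigaction sa;
        sa.sa_handler = SIG_IGN;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        if (sigaction(SIGPIPE, &sa, NULL) == -1) {
            throw std::runtime_error("failed to ignore SIGPIPE");
        }
    }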
> > Anyway, it would be bad to close the connection. That would signal
> > the boss to reclaim all the sockets transferred previously.
>
> Hmm, on further considering this and other bullet points below, I now
> think we should probably manage the states of open sockets at the boss
> and applications separately from the channel between them (i.e, the
> connection between UNIX domain sockets). We should probably introduce
> (re)synchronization mechanism about the open socket states between the
> boss and the application when the connection is (re)established, and
> then we can simply drop the connection if it's possibly in an unusable
> state (and re-establish a new one).
Well, they are bound to the connection they were sent over for a simple
reason: we need to know when an application crashed, so we can release
its sockets. And doing so much to handle a possible bug seems overkill
to me, at least at this stage of the project.
And anyway, I consider this code to be mostly temporary, for two reasons:
 * If/when we adopt dbus instead of msgq, we can register at the bus to
   tell the boss when an application that sent a given request
   disconnects.
 * If/when we adopt dbus, we might want to get rid of all this low-level,
   error-prone code and send the sockets over dbus (it doesn't allow
   sending them on Windows, but our current code doesn't either). The
   socket could then go directly in the response to the request, avoiding
   this two-phase way.
So, having a note in the code or a low-priority ticket for it is OK, but
I'd be against trying to solve it now, or in the next sprint, or the like.
> > Hmm, what do you mean? Like if the boss sent a "2\n" and the socket?
> > Yes, then it would leak, but can we really avoid that in any way?
>
> See the third bullet, for example. If we pass the pair of status (1
> byte) and FD via a single sendmsg()/recvmsg() (the status will be in
> the 1-byte "normal" data; the FD is passed as ancillary data),
> requestSocket() could examine both the FD and status and can close the
> socket (e.g.) if FD is bogus.
But if the „status“ byte were bogus, the fd could leak anyway, couldn't
it? That is, if one was sent at all?
Anyway, I do agree that sending it within the sendmsg packet, all at
once, is a good idea.
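
For illustration, sending the 1-byte status and the fd in one sendmsg()
call could look roughly like the sketch below; the receiver mirrors it
with a single recvmsg() and can close the fd if either the status or the
fd itself turns out to be bogus. The names here are made up for the
example, not taken from the actual boss/SocketRequestor code:

    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <string.h>

    // Send one status byte as normal data and the file descriptor as
    // SCM_RIGHTS ancillary data, in a single message.
    // Returns 0 on success, -1 on failure (errno set by sendmsg()).
    int
    sendStatusAndFd(int conn, char status, int fd) {
        struct iovec iov;
        iov.iov_base = &status;                 // the 1-byte "normal" data
        iov.iov_len = 1;

        char control[CMSG_SPACE(sizeof(int))];  // room for one fd
        memset(control, 0, sizeof(control));

        struct msghdr msg;
        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = control;
        msg.msg_controllen = sizeof(control);

        struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        return (sendmsg(conn, &msg, 0) == 1 ? 0 : -1);
    }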
> > > - I personally don't think it the best idea to signal the case of
> > > "CREATOR_SOCKET_UNAVAILABLE" by exception. At the very least it
> > > should be specifically separated from other exceptions such as the
> > > one as a result of bogus status data; otherwise, assuming the app
> > > should generally drop the connection on the latter case, we could
> > > drop and re-establish the connection unnecessarily
> >
> > I'd like to point out that such a situation shouldn't happen in
> > practice and would be the same kind of bug as wrong data sent, etc.
> > By the time the client picks up the socket, it was already prepared
> > and reserved. It can be unavailable only in case the token is wrong.
>
> I thought "unavailable" can be possible for a reason we cannot
> control, such as unrelated third-party application already opens the
> port. Am I mistaken?
We might have misunderstood each other. The CREATOR_SOCKET_UNAVAILABLE
constant is for the case I described. If the socket can't be created for
reasons like the port being taken already, the result would be sent over
the msgq. But you're right that this case isn't signalled by a different
exception now. The problem lies within the boss, as it sends the same
error status for all errors, which should be improved.
Another cleanup ticket, maybe?
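
If/when that cleanup happens, separating the two failure modes could
look roughly like this on the library side; the class names are only
illustrative and the isc::Exception base is an assumption, not the
actual SocketRequestor interface:

    #include <exceptions/exceptions.h>

    namespace isc {
    namespace server_common {

    // The creator refused to hand out the socket (e.g. unknown token);
    // the connection itself is still usable, so the caller may just
    // report the error and carry on.
    class SocketUnavailable : public isc::Exception {
    public:
        SocketUnavailable(const char* file, size_t line, const char* what) :
            isc::Exception(file, line, what)
        {}
    };

    // The data read from the connection is malformed; the connection is
    // probably in an unusable state and should be dropped and
    // re-established.
    class SocketRequestorProtocolError : public isc::Exception {
    public:
        SocketRequestorProtocolError(const char* file, size_t line,
                                     const char* what) :
            isc::Exception(file, line, what)
        {}
    };

    } // namespace server_common
    } // namespace isc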
> > The newline is there to preserve some kind of consistency, and I
> > wasn't sure if python wouldn't want to do some kind of line based
> > buffering (well, it probably wouldn't).
>
> Hmm, I'm not sure whether you mean the newline is good or bad by the
> Python thing, but as long as we use multi-byte data that's probably
> just a matter of taste. More substantially, if we could switch to
> 1-byte status code (obviously it would exclude a newline), we'll be
> able to pass the pair of status code and FD via a single call to
> sendmsg()/recvmsg(), which will make error handling simpler.
My point was, if python did line-oriented buffering, the newline would
force it to flush. Looking back, it seems a crazy idea to think there
would be line-oriented buffering, but as you say, this may turn out to
be irrelevant when we go for the status with sendmsg.
A few notes for the new code:
 * Is there no logging when the request fails for some reason (error
   response over msgq, CREATOR_SOCKET_UNAVAILABLE, other failures…)?
 * The protocolString function could be used inside
   createRequestSocketMessage as well, to reuse some code.
 * Is it possible to make gtest produce a warning (similar to the „You
   have disabled tests“ line at the bottom of the tests) if some of the
   tests are skipped when it is impossible to do the timeout?
 * When reading the tokens in the child, the hardcoded length is 4.
   Maybe add a note to the function that it shouldn't be used with tokens
   of different lengths?
 * What is wrong with ssize_t in 85bf91d5836306a39b471567952ad1e0af91d948?
   ssize_t is signed, so the error handling should still work, and it
   seems cleaner than int, as it is 64-bit on 64-bit systems (therefore it
   allows sending more than 2GB of data at once, though it is unlikely
   anyone would do that). See the sketch after this list.
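
To illustrate the ssize_t point, a minimal sketch of a write loop under
the assumption that the code loops over send(); this is not the actual
code from the commit:

    #include <sys/types.h>
    #include <sys/socket.h>

    // ssize_t keeps the "< 0" error check valid and, being 64-bit on
    // 64-bit systems, doesn't truncate a (theoretical) single write of
    // more than 2GB, unlike int.
    bool
    writeAll(const int fd, const void* data, size_t len) {
        const char* cp = static_cast<const char*>(data);
        while (len > 0) {
            const ssize_t cc = send(fd, cp, len, 0);
            if (cc < 0) {
                return (false);   // error; errno set by send()
            }
            cp += cc;
            len -= cc;
        }
        return (true);
    }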
--
Ticket URL: <http://bind10.isc.org/ticket/1522#comment:17>
BIND 10 Development <http://bind10.isc.org>