BIND 10 #2738: Clarify high-level design of the CC protocol
BIND 10 Development
do-not-reply at isc.org
Tue Apr 23 03:31:11 UTC 2013
#2738: Clarify high-level design of the CC protocol
-------------------------------------+-------------------------------------
 Reporter: vorner                    | Owner: jinmei
 Type: task                          | Status: reviewing
 Priority: medium                    | Milestone: Sprint-20130423
 Component: Inter-module             | Resolution:
   communication                     | CVSS Scoring:
 Keywords:                           | Defect Severity: N/A
 Sensitive: 0                        | Feature Depending on Ticket:
 Sub-Project: DNS                    | Add Hours to Ticket: 0
 Estimated Difficulty: 5             | Internal?: 0
 Total Hours: 0                      |
-------------------------------------+-------------------------------------
Comment (by jinmei):
Replying to [comment:16 vorner]:
> I tried to rewrite the document. It is probably less formal and detailed
> than the example of yours, but I tried to adopt the structure. I also
> added some examples.
This version looks much better, or at least close to what I originally
envisioned. I have more points to discuss, which follow.
I don't mind removing ipc-high-2.txt at this point.
> Replying to [comment:14 jinmei]:
> > So, in general, I'm not pushing the idea of the details written there,
> > but I'll briefly answer the points below.
> >
> > > * I don't think the session establishment is non-blocking.
> >
> > But how would it block?
>
> The connect() call is blocking, it waits until the other side calls
> accept().
Do you have a reference for that behavior? In my understanding
connect() returns as long as the other end has listen()ed on it (and
if the remote end doesn't even listen(), connect() immediately fails
with ECONNREFUSED). At least that's how implementations of TCP as
described in RFC 793 work, and from a quick experiment that's the same
for UNIX domain stream sockets. In the case of real TCP, connect()
can certainly block because the SYN or SYN-ACK packet may be lost, but
that's not the case for the UNIX domain protocol.
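The kind of quick experiment I mean can be sketched as follows. This is
only a minimal sketch, not code from the BIND 10 tree; the function names
are made up for illustration, and the 2-second timeout is just a guard so
the script cannot hang. It assumes a platform with UNIX domain sockets.
As far as I understand, connect() can still wait when the listen backlog
is full, which this sketch does not exercise.

```python
import errno
import os
import socket
import tempfile


def connect_without_accept():
    """Return True if connect() completes although accept() is never called."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            server.bind(path)
            server.listen(1)        # listen() only; no accept() anywhere
            client.settimeout(2.0)  # guard: give up instead of hanging
            client.connect(path)
            return True
        except socket.timeout:
            return False
        finally:
            client.close()
            server.close()


def connect_without_listen():
    """Return the errno name seen when the peer is bound but not listening."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            server.bind(path)       # bound, but listen() is never called
            client.settimeout(2.0)
            try:
                client.connect(path)
            except OSError as exc:
                return errno.errorcode.get(exc.errno)
            return None             # connect() unexpectedly succeeded
        finally:
            client.close()
            server.close()


if __name__ == "__main__":
    print("connect() without accept() completed:", connect_without_accept())
    print("connect() without listen() failed with:", connect_without_listen())
```

On Linux I would expect the first call to report that connect() completed
and the second to report a refused connection.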
> Also, the getting of lname needs a round-trip to the other process. So
> there are at least 4 context switches in between and msgq might be busy.
In my version, I separated the "establish" and the initial
round-trip. But even if that part is included, I thought our
assumption was that msgq wouldn't cause a significant delay (otherwise
many things of this design would be broken). But see the next
paragraph first.
> The session establishment can be IMO considered a fast operation, but I
> don't think we can consider it non-blocking. Creating a new session for
> each query answered would be a bad idea, but it makes no difference if you
> create one session on startup or four.
I guess we are basically on the same page and just differ on
the definition of "non blocking" (and I admit the word "non blocking"
may not be a good choice as it could be interpreted as something like
a system call that doesn't cause a context switch). I didn't expect the
usage like opening a new session for every query in this context; I
think my specific comment on the "Opening a session" part of the
revised version of ipc-high.txt clarifies that. If it still leaves a
significant gap of understanding, please continue the discussion
there.
> > > * Is it OK to consider the synchronous read non-blocking, even in
> > > the case of talking to msgq? I'm OK with calling it fast, but I don't
> > > think we can call it non-blocking.
> >
> > Do you mean (synchronous) send? In any case whether it's "fast" or
> > "non-blocking", I admit the definition is a bit vague.
>
> No, I meant synchronous receive. You said it can be considered
> non-blocking if the other side of communication is msgq.
>
> I don't think we can say that, for the same reasons as above. It is OK
> to do such things in xfrout, for example, but trying to subscribe and
> unsubscribe once per auth query would be too slow.
I guess this is also about the definition of "non blocking". If you
don't like that word, "short amount of time" or whatever is fine, but
my main point is that I wanted to clarify that the user can get the
service from the system using the synchronous interface (e.g.,
subscribing to a group) even in the middle of an interactive session.
> > > * As mentioned above, the message types seem very low level. On
> > > the other hand, I think we should document the higher-level (JSON)
> > > somewhere too ‒ format of command, reply, etc.
> >
> > I didn't mean high level things are not necessary. That would simply
> > belong to a different level of conceptual layer. Low level message
> > types may not be necessary if we can explain the system properties
> > without them.
>
> Well, we explain what operations the session can do. What protocol is
> behind is described in the cc-protocol.txt.
This simply seems to be a difference in what we envisioned as the
"high level design". I was rather interested in clarifying what kind
of operation can be blocked, what can be delayed (so the user
generally needs to use asynchronous interfaces), whether a group send
to multiple sessions can have a reasonable definition of answer, etc.
Such information is generally missing from cc-protocol.txt right now,
and I wanted to see it documented, whether by extending cc-protocol.txt
or in a separate document.
Comments on the revised ipc-high.txt:
'''Definitions/message'''
- the difference may be subtle, but to be very precise I think
recipients of a message are "the system itself, a session of some
(normally different) user, or a set of sessions (called a group,
see below, which is possibly empty)", that is, they are generally
"sessions", not "users", even if at the application level the
intended recipient is the user of the session(s). Same for "The
session/Send", "Assumptions", etc.
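To make the distinction concrete, here is a toy in-memory model of what I
mean by "recipients are sessions". It is only a sketch under my own
assumptions: `ToySystem` and all of its method names are hypothetical, it
is not how msgq is implemented, and it ignores aliases and the system
itself as a recipient. The point is merely that a recipient always
resolves to a (possibly empty) set of sessions, never to a "user".

```python
class ToySystem:
    """Hypothetical model: every recipient resolves to a set of sessions."""

    def __init__(self):
        self._next_id = 1
        self._inboxes = {}  # session ID -> list of (sender, message)
        self._groups = {}   # group name -> set of subscribed session IDs

    def open_session(self):
        """Create a session and return its ID; one user may open several."""
        sid = "s%d" % self._next_id
        self._next_id += 1
        self._inboxes[sid] = []
        return sid

    def subscribe(self, sid, group):
        self._groups.setdefault(group, set()).add(sid)

    def unsubscribe(self, sid, group):
        self._groups.get(group, set()).discard(sid)

    def resolve(self, recipient):
        """Return the set of session IDs a recipient name stands for."""
        if recipient in self._inboxes:
            return {recipient}                      # a single session
        return set(self._groups.get(recipient, ()))  # a group, maybe empty

    def send(self, sender, recipient, message):
        """Deliver to every resolved session; return how many received it."""
        targets = self.resolve(recipient)
        for sid in sorted(targets):
            self._inboxes[sid].append((sender, message))
        return len(targets)

    def inbox(self, sid):
        return list(self._inboxes[sid])


if __name__ == "__main__":
    system = ToySystem()
    sender, first, second = (system.open_session() for _ in range(3))
    system.subscribe(first, "Auth")
    system.subscribe(second, "Auth")
    print(system.send(sender, "Auth", {"command": ["ping"]}))
    print(system.send(sender, "NoSuchGroup", {"command": ["ping"]}))
```

In this model a send to a group with no subscribers simply reaches zero
sessions, which is the case where the document needs to say what the
sender observes.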
'''The session'''
- Opening a session: I'd like to clarify how long the "short amount of
time" is. My understanding of the assumption is that making a
connection is essentially done in an instant, and getting the
session ID is also done without a delay (the session is initially
empty and should never block and the system responds to the request
immediately). With these assumptions, it should be okay, for
example, to open a new session in the middle of handling an
interactive session (e.g., handling a human user's request) while it
may still be considered heavy weight for doing many times per second
(e.g., open a new session every time a DNS server receives a query).
- Send: this description is quite open about whether a user needs to
expect blocking on send. I'd say either the user can assume it's
non blocking (which would be ensured not only by OS level but
also, e.g., with a library level buffer) or it should use some
asynchronous interface if it cannot accept blocking.
- Terminate: I'd clarify if this operation can be blocking or can be
assumed non blocking. It won't matter if the user itself is
exiting, but can matter if some particular session is terminated in
the middle of the user's lifetime.
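Regarding the Send point, what I have in mind with a "library level
buffer" could look like the sketch below. It is an illustration under my
own assumptions, not the existing session code: `BufferedSender` and its
methods are hypothetical names. send() never blocks; whatever the kernel
does not accept immediately stays queued in the library, and the caller
(or an event loop) calls flush() again once the socket is writable.

```python
import collections
import socket


class BufferedSender:
    """Hypothetical wrapper whose send() never blocks the caller."""

    def __init__(self, sock):
        sock.setblocking(False)
        self._sock = sock
        self._pending = collections.deque()  # chunks not yet handed to the OS

    def send(self, data):
        """Queue data and push as much as the kernel accepts right now."""
        self._pending.append(bytes(data))
        self.flush()

    def flush(self):
        """Try to drain the queue; return True if nothing is left pending."""
        while self._pending:
            chunk = self._pending[0]
            try:
                sent = self._sock.send(chunk)
            except BlockingIOError:
                return False                     # kernel buffer is full
            if sent < len(chunk):
                self._pending[0] = chunk[sent:]  # keep the unsent tail
            else:
                self._pending.popleft()
        return True

    def pending_bytes(self):
        return sum(len(chunk) for chunk in self._pending)


if __name__ == "__main__":
    left, right = socket.socketpair()
    sender = BufferedSender(left)
    for _ in range(64):
        sender.send(b"x" * 65536)  # returns at once even if nobody reads
    print("bytes still buffered in the library:", sender.pending_bytes())
    left.close()
    right.close()
```

The obvious cost is unbounded memory use if the peer never reads, so a
real design would also need a limit and a defined behavior when it is
reached.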
'''The system'''
There are some things that are not clear to me.
- it's not clear to me how a user shows the "interest" on a group or
session.
- likewise, it's not clear whether a user can cancel its interest on a
group or a session, and if it can, how it does so.
- it's also not clear whether there are "notifications about
aliases" (remember the use of "groups" to represent aliases are
implementation details and conceptually they are separated).
- This note seems to be implementation details rather than high level
property:
{{{
Note that in early stages of startup (before the configuration
manager's session is opened), the `Msgq` alias is not yet available.
}}}
I was not sure how this note matters at this level; if it doesn't, I'd
remove the note to avoid the confusion. If it does, I'd describe
the specific issue, and also generalize the description without the
concept of "configuration manager", e.g., "In the actual
implementation there can be a time lag before the 'Msgq' alias is
available for other users even when other users can open a session
to the system. So, ...??? (I don't know so what)".
'''Details of the higher-level'''
- this looks awkward:
{{{
s12345 -> s0
{"command": ["ping"]}
msgq -> s12345
{"reply": [-1, "No such recipient"]}
}}}
in that "msgq" (which is an alias, isn't it?) is used as the sender
(which I thought was expected to specify a session ID).
- reply is missing in this example:
{{{
s3 -> s12345
}}}
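For reference, the JSON bodies in the first example above can be built
and read with a few lines of Python. This sketch only covers the two
shapes that appear in that example ({"command": [name]} and
{"reply": [code, text]}); the helper names are mine, and I am not
claiming anything about other fields the real format may have.

```python
import json


def make_command(name):
    """Build the JSON body of a command as written in the example."""
    return json.dumps({"command": [name]})


def parse_reply(text):
    """Return (code, detail) from a body of the form {"reply": [code, text]}."""
    reply = json.loads(text)["reply"]
    code = reply[0]
    detail = reply[1] if len(reply) > 1 else None
    return code, detail


if __name__ == "__main__":
    print(make_command("ping"))
    print(parse_reply('{"reply": [-1, "No such recipient"]}'))
```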
'''Others'''
- I'd describe what users would do, just like "the system" section.
- I now find the term "undelivery notification" confusing because
it's named "notification" but not described as a notification
message. I'd rename it, e.g., "undelivery signal" or whatever, so
that it at least doesn't reuse the term "notification".
'''Known limitations'''
- For wording consistency, "the daemon" should probably be "the system".
--
Ticket URL: <http://bind10.isc.org/ticket/2738#comment:17>
BIND 10 Development <http://bind10.isc.org>