BIND 10 master, updated. 27b1c2767aa9f7e7b74a640b548aa099605b28c4 Merge #2738
BIND 10 source code commits
bind10-changes at lists.isc.org
Fri Apr 26 08:14:39 UTC 2013
The branch, master has been updated
via 27b1c2767aa9f7e7b74a640b548aa099605b28c4 (commit)
via 0c84643032d36c4b28fcc730872f8866dacebe2c (commit)
via 7b3ab876a7fb697245c81d03bd477f81136355e8 (commit)
via 4963031b007d9e9f15cacfd058edc20a6d33bf37 (commit)
via b4e44a1c5f4f1fa5c16af5bcb2a3ed48a29c1da6 (commit)
via 73feee77d8a05f458c90990c8ff95f3338c54d72 (commit)
via 430bc7504e0b6d439440376bc6b1c56fc69280f7 (commit)
via c6c92db7bae905d428a876996d8414ee1d278fa1 (commit)
via 223e1a93c3a17e229194899151cd728d3f6e72a1 (commit)
via f0f75ecda5cb7e51fbebbcf35ca40f3d38b4b535 (commit)
via 70a919f029b159119a50ac6d6fded6b843167ef1 (commit)
via dfe737706fed6ca7aac475fb91d988222799b21b (commit)
via 122da90997b11cee73dd63291216aa4c6c6ecd11 (commit)
via 400ddf1d85976eb07188e6fa20ae0a83274895fc (commit)
via 2d0fc10cb714c3e34bc670a86ba645b42d3bd777 (commit)
from 88e8eec059a9e91725f8a12ff76e6f9c3af40155 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
commit 27b1c2767aa9f7e7b74a640b548aa099605b28c4
Merge: 88e8eec 0c84643
Author: Michal 'vorner' Vaner <michal.vaner at nic.cz>
Date: Fri Apr 26 10:13:48 2013 +0200
Merge #2738
Clarify high-level design of IPC
commit 0c84643032d36c4b28fcc730872f8866dacebe2c
Author: Michal 'vorner' Vaner <michal.vaner at nic.cz>
Date: Fri Apr 26 10:13:21 2013 +0200
[2738] Minor wording fixes.
-----------------------------------------------------------------------
Summary of changes:
doc/design/ipc-high.txt | 382 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 382 insertions(+)
create mode 100644 doc/design/ipc-high.txt
-----------------------------------------------------------------------
diff --git a/doc/design/ipc-high.txt b/doc/design/ipc-high.txt
new file mode 100644
index 0000000..3f46b5c
--- /dev/null
+++ b/doc/design/ipc-high.txt
@@ -0,0 +1,382 @@
+The IPC protocol
+================
+
+While the cc-protocol.txt describes the low-level primitives, here we
+describe how the whole IPC should work and how to use it.
+
+Definitions
+-----------
+
+system::
+ The system that moves data between the users and does bookkeeping.
+ In our current implementation, it is implemented as the MsgQ daemon,
+ which the users connect to and it routes the data.
+user::
+ Usually a process; generally an entity that wants to communicate
+ with the other users.
+session::
+ A session is the interface by which the user communicates with the
+ system. A single user may have multiple sessions; a session belongs
+ to a single user.
+message::
+ A data blob sent by one user. The recipient might be the system
+ itself, another session, or a set of sessions (called a group, see
+ below; it may be empty). A message is either a response or an
+ original message (TODO: Better name?).
+group::
+ A named set of sessions. Conceptually, all the possible groups
+ exist, there's no explicit creation and deletion of groups.
+session id::
+ Unique identifier of a session. It is not reused for the whole
+ lifetime of the system. Historically called `lname` in the code.
+undelivery signal::
+ While sending an original message, a client may request an
+ undelivery signal. If the recipient specification yields no
+ sessions to deliver the message to, the system informs the sending
+ user about the situation.
+sequence number::
+ Each message sent through the system carries a sequence number. The
+ number should be unique per sender. It can be used to pair a
+ response to the original message, since the response specifies the
+ sequence number of the message it responds to. Even responses and
+ messages not expecting an answer have a sequence number, but it is
+ generally unused.
+non-blocking operation::
+ Operation that will complete without waiting for anything.
+fast operation::
+ Operation that may wait for another process, but only for a very short
+ time. Generally, this includes communication between the user and
+ system, but not between two clients. It can be expected to be fast
+ enough to use this inside an interactive session, but may be too
+ heavy in the middle of query processing, for example. Every
+ non-blocking operation is considered fast.
+
+The session
+-----------
+
+The session interface allows for several operations interacting with
+the system. In the code, it is represented by a class.
+
+Possible operations include:
+
+Opening a session::
+ The session is created and connects to the system. This operation is
+ fast. The session receives a session ID from the system.
+
+Group management::
+ A user may subscribe to (become a member of) a group, or unsubscribe
+ from a group. These are fast operations.
+
+Send::
+ A user may send a message, addressed to the system, or other
+ session(s). This operation is expected to be non-blocking
+ (current implementation is based on assumption of how OS handles the
+ sends, which may need to be revisited if it turns out to be false).
+
+Receive synchronously::
+ A user may wait for an incoming message in blocking mode. It is
+ possible to specify the kind of message to wait for, either original
+ message or response to a message. This interface has a timeout.
+
+Receive asynchronously::
+ Similar to the previous one, but non-blocking: the call returns
+ immediately. The user provides a callback that is invoked when the
+ requested message arrives.
+
+Terminate::
+ A session may be terminated. No more messages are sent or received
+ over it, and the session is automatically unsubscribed from all the
+ groups. This operation is non-blocking. A session is terminated
+ automatically if the user exits.
+
+Assumptions
+-----------
+
+We assume reliability and order of delivery. Messages sent from user A
+to B are all delivered unchanged in original order as long as B
+exists.
+
+All of the above operations are expected to always succeed. If an
+error is reported, it should be considered fatal and the user should
+exit. In case a user still wants to continue, the session must be
+considered terminated and a new one must be created. Care must be
+taken not to use any information obtained from the previous session,
+since the state in other users and the system may have changed during
+the reconnect.
+
+Addressing
+----------
+
+Addressing happens in three ways:
+
+By group name::
+ The message is routed to all the sessions subscribed to this group.
+ It is legal to address an empty group; such a message is then
+ delivered to no sessions.
+By session ID::
+ The message is sent to the single session, if it is still alive.
+By an alias::
+ A session may have any number of aliases (well-known names). Only a
+ single session may hold a given alias (but this is not yet enforced
+ by the system). The message is delivered to the one session owning
+ the alias, if any. Internally, the aliases are implemented as groups
+ with a single subscribed session, so it is the same as the first
+ option on the protocol level, but semantically it is different.
+
+The system
+----------
+
+The system performs these tasks:
+
+ * Maintains the open sessions and allows creating new ones.
+ * Keeps information about groups and which sessions are subscribed to
+ which group.
+ * Routes the messages between users.
+
+Also, the system itself is a user of the system. It can be reached by
+the alias `Msgq` and provides the following high-level services (see
+below):
+
+Notifications about sessions::
+ When a session is opened to the system or when a session is
+ terminated, a notification is sent to interested users. The
+ notification contains the session ID of the session in question.
+ The termination notification is probably more useful (if a user
+ communicated with a given session before, it might be interested to
+ learn that it is no longer available); the opening notification is
+ provided mostly for completeness.
+Notifications about group subscriptions::
+ When a session subscribes to a group or unsubscribes from a group, a
+ notification is sent to interested users. The notification contains
+ both the session ID of the session subscribing/unsubscribing and
+ the name of the group. This includes notifications about aliases
+ (since aliases are groups internally).
+Commands to list sessions::
+ There's a command to list session IDs of all currently opened sessions
+ and a command to list session IDs of all sessions subscribed to a
+ given group. Note that using these lists might need some care, as
+ the information might be outdated at the time it is delivered to the
+ user.
+
+A user shows interest in notifications about sessions and group
+subscriptions by subscribing to a group with a well-known name (as
+with any notification).
+
+Note that due to implementation details, the `Msgq` alias is not yet
+available during the early stage of the bootstrap of the bind10
+system. This means some very core services can't rely on the above
+services of the system. The alias is guaranteed to be working before
+the first non-core module is started.
+
+Higher-level services
+---------------------
+
+While the system is able to send any kind of data, the payload sent by
+users in bind10 is structured data encoded as JSON. The messages sent
+are of three general types:
+
+Command::
+ A message sent to a single destination, with the undeliverable
+ signal turned on and expecting an answer. This is a request
+ to perform some operation on the recipient (it can have side effects
+ or not). The command is identified by a name and it can have
+ parameters. A command with the same name may behave differently (or
+ have different parameters) on different receiving users.
+Reply::
+ An answer to the `Command`. It is sent directly to the session where
+ the command originated from, does not expect a further answer and
+ the undeliverable signal is not set. It either confirms the
+ command was run successfully and contains an optional result, or
+ notifies the sender of failure to run the command. Success and
+ failure differ only in the payload sent through the system, not in
+ the way it is sent. The undeliverable signal is a failure
+ reply sent by the system on behalf of the missing recipient.
+Notification::
+ A message sent to any number of destinations (e.g. sent to a group),
+ not expecting an answer. It notifies other users about an event or
+ change of state.
+
+Details of the higher-level
+---------------------------
+
+While there are libraries implementing the communication in a
+convenient way, it is useful to know what happens inside.
+
+The notifications are probably the simplest. Users interested in
+receiving notifications of some family subscribe to the corresponding
+group. Then, a client sends a message to the group. For example, if
+clients `receiver-A` and `receiver-B` want to receive notifications
+about changes to zone data, they'd subscribe to the
+`Notifications/ZoneUpdates` group. Then, another client (let's say
+`XfrIn`, with session ID `s12345`) would send something like:
+
+ s12345 -> Notifications/ZoneUpdates
+ {"notification": ["zone-update", {
+ "class": "IN",
+ "origin": "example.org.",
+ "serial": 123456
+ }]}
+
+Both receivers would receive the message and know that the
+`example.org` zone is now at version 123456. Note that multiple users
+may produce the same kind of notification. Also, a single group may be
+used to send multiple notification names (but they should be related;
+in our example, the `Notifications/ZoneUpdates` could be used for
+`zone-update`, `zone-available` and `zone-unavailable` notifications
+for a change in zone data, configuration of a new zone in the system
+and removal of a zone from configuration).
+
+Sending a command to a single recipient is slightly more complex. The
+sending user sends a message to the receiving one, addressed either by
+session ID or by an alias (group to which at most one session may be
+subscribed). The message contains the name of the command and
+parameters. It is sent with the undeliverable signal turned on.
+The user also starts a timer (with a reasonably long timeout). The
+sender also subscribes to notifications about terminated sessions or
+unsubscription from the alias group.
+
+The receiving user gets the message, runs the command and sends a
+response back, with the result. The response has the undeliverable
+signal turned off and it is marked as response to the message
+containing the command. The sending user receives the answer and pairs
+it with the command.
+
+There are several things that may go wrong.
+
+* There might be an error on the receiving user (bad parameters, the
+ operation failed, the recipient doesn't know a command of that name).
+ The receiving side sends the response as before; the only
+ difference is the content of the payload. The sending user is
+ notified about it without delay.
+* The recipient user doesn't exist (either the session ID is wrong or
+ the session has already terminated, or the alias is empty). The
+ system sends a failure response and the sending user knows
+ immediately that the command failed.
+* The recipient disconnects while processing the command (possibly
+ crashes). The sender gets a notification about disconnection or
+ unsubscription from the alias group and knows the answer won't come.
+* The recipient ``blackholes'' the command. It receives it, but never
+ answers. The timer in the sender expires. As this is a serious
+ programmer error in the recipient and should be rare, the sender
+ should at least log an error to notify about the case.
+
+One example would be asking the question of life, the universe and
+everything (all the examples assume the sending user is already
+subscribed to the notifications):
+
+ s12345 -> DeepThought
+ {"command": ["question", {
+ "what": ["Life", "Universe", "*"]
+ }]}
+ s23456 -> s12345
+ {"reply": [0, 42]}
+
+`DeepThought` is an alias, but the answer is sent from the session
+ID of its holder. The `0` in the reply means ``success''.
+
+Another example might be asking for some data at a bureau and getting
+an error:
+
+ s12345 -> Bureau
+ {"command": ["provide-information", {
+ "about": "me",
+ "topic": "taxes"
+ }]}
+ s23456 -> s12345
+ {"reply": [1, "You need to fill in other form"]}
+
+And, in this example, the sender is trying to reach a non-existent
+session. The `msgq` here is not the alias `Msgq`, but a special
+``phantom'' session ID that is not listed anywhere.
+
+ s12345 -> s0
+ {"command": ["ping"]}
+ msgq -> s12345
+ {"reply": [-1, "No such recipient"]}
+
+Last, an example when the other user disconnects while processing the
+command.
+
+ s12345 -> s23456
+ {"command": ["shutdown"]}
+ msgq -> s12345
+ {"notification": ["disconnected", {
+ "lname": "s23456"
+ }]}
+
+The system does not support sending a command to multiple users
+directly. It can be accomplished as follows:
+
+* The sending user calls a command on the system to get the list of
+ sessions in a given group. This is a command to an alias, so it can
+ be done in the way described above.
+* After receiving the list of session IDs, multiple copies of the
+ command are sent by the sending user, one to each of the session
+ IDs.
+* Successes and failures are handled the same as above, since these
+ are just single-recipient commands.
+
+So, this would be an example with an unhelpful war council.
+
+ s12345 -> Msgq
+ {"command": ["get-subscriptions", {
+ "group": "WarCouncil"
+ }]}
+ msgq -> s12345
+ {"reply": [0, ["s1", "s2", "s3"]]}
+ s12345 -> s1
+ {"command": ["advice", {
+ "topic": "Should we attack?"
+ }]}
+ s12345 -> s2
+ {"command": ["advice", {
+ "topic": "Should we attack?"
+ }]}
+ s12345 -> s3
+ {"command": ["advice", {
+ "topic": "Should we attack?"
+ }]}
+ s1 -> s12345
+ {"reply": [0, true]}
+ s2 -> s12345
+ {"reply": [0, false]}
+ s3 -> s12345
+ {"reply": [1, "Advice feature not implemented"]}
+
+Users
+-----
+
+While there's a lot of flexibility for the behaviour of a user, it
+usually comes down to something like this (during the lifetime of the
+user):
+
+* The user starts up.
+* Then it creates one or more sessions (there may be technical reasons
+ to have more than one session, such as threads, but it is not
+ required by the system).
+* It subscribes to some groups to receive notifications in the future.
+* It binds to some aliases if it wants to be reachable by others by a
+ nice name.
+* It invokes some start-up commands (to get the configuration, for
+ example).
+* During the lifetime, it listens for notifications and answers
+ commands. It also invokes remote commands and sends notifications
+ about things that are happening.
+* Eventually, the user terminates, closing all the sessions it had
+ opened.
+
+Known limitations
+-----------------
+
+It is meant mostly as a signalling protocol. Sending millions of
+messages or messages of several tens of megabytes is probably a bad
+idea. While there's no architectural limitation with regard to the
+number of transferred messages and the maximum size of a message is
+4GB, the code is not optimised and it would probably be very slow.
+
+We currently expect the system not to be under heavy load. Therefore,
+we expect the system to keep up with users sending messages. The
+libraries write in blocking mode, which is no problem if the
+expectation is true, as the write buffers will generally be empty and
+the write wouldn't block, but if it turns out it is not the case, we
+might need to reconsider.
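
The routing rules in ipc-high.txt (addressing by session ID or by group/alias name, and the undeliverable signal) can be illustrated with a toy in-memory model. The sketch below is not part of the commit and is not the MsgQ implementation: the class `ToySystem` and its methods `open_session`, `subscribe`, `terminate` and `send` are hypothetical names chosen for illustration, and it assumes that session IDs and group names never collide. Only the payload shapes and the `msgq` phantom sender are taken from the examples in the document.

```python
from collections import defaultdict
from itertools import count


class ToySystem:
    """Toy stand-in for the routing system (hypothetical, not MsgQ)."""

    def __init__(self):
        self._ids = count(1)             # session IDs are never reused
        self.inboxes = {}                # session ID -> [(sender, payload)]
        self.groups = defaultdict(set)   # group name -> subscribed session IDs

    def open_session(self):
        sid = "s%d" % next(self._ids)
        self.inboxes[sid] = []
        return sid

    def subscribe(self, sid, group):
        # An alias is modelled as a group with one subscribed session.
        self.groups[group].add(sid)

    def terminate(self, sid):
        # A terminated session is unsubscribed from all groups.
        del self.inboxes[sid]
        for members in self.groups.values():
            members.discard(sid)

    def send(self, sender, recipient, payload, want_undelivery=False):
        # The recipient is either a live session ID or a group/alias name.
        if recipient in self.inboxes:
            targets = [recipient]
        else:
            targets = sorted(self.groups.get(recipient, ()))
        for sid in targets:
            self.inboxes[sid].append((sender, payload))
        if not targets and want_undelivery:
            # Failure reply sent on behalf of the missing recipient.
            self.inboxes[sender].append(
                ("msgq", {"reply": [-1, "No such recipient"]}))
        return len(targets)


system = ToySystem()
xfrin = system.open_session()
receiver_a = system.open_session()
receiver_b = system.open_session()
for sid in (receiver_a, receiver_b):
    system.subscribe(sid, "Notifications/ZoneUpdates")

# Notification: delivered to every subscriber, no answer expected.
update = {"notification": ["zone-update", {
    "class": "IN", "origin": "example.org.", "serial": 123456}]}
delivered = system.send(xfrin, "Notifications/ZoneUpdates", update)

# Command to an alias nobody holds: the sender gets a failure reply.
system.send(xfrin, "DeepThought", {"command": ["ping"]},
            want_undelivery=True)
failure = system.inboxes[xfrin][-1]

# Command to a session that has terminated: same failure path.
system.terminate(receiver_b)
reached = system.send(xfrin, receiver_b, {"command": ["ping"]},
                      want_undelivery=True)
```

In this model a notification to an empty group is silently dropped, while a command sent with the undeliverable signal turned on always produces either a delivery or a failure reply, which is the property the sender relies on before falling back to its timer.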