[kea-dev] Initial proposal for Kea Control API

Thu Jul 14 14:29:52 UTC 2016

On 7/13/16 3:41 PM, Tomek Mrugalski wrote:
> Ok, it definitely took me more time, but I finally got round to this one.
>
> On 26.05.2016 22:01, Thomas Markwalder wrote:
>> On 5/24/16 7:27 PM, Tomek Mrugalski wrote:
>> This is a good first cut.
> Thanks. And thanks for the detailed review.
>
> As there are also comments from Shawn and Marcin pending, I didn't want
> to alter the requirements numbering. In certain cases I added extra
> level (e.g. H.17.1). Once we finish collecting feedback from external
> parties, I'll go through and renumber them properly.
>
>> General:
>> --------
>>
>> 1. We have the command channel, across which we issue commands, yet you
>> refer to them frequently as "calls".   As in the Kea Admin guide,  we
>> should refer to them as "commands" throughout this document.
> Updated.
>
>> 2. We need requirements that describe command channel security.  You
>> cited large response sizes as a potential DOS vector. Actually, once
>> they have access to the command channel, the hammer can be dropped in
>> any number of ways.  Making it secure is a vital aspect and needs to be
>> covered by requirements.
> Making something absolutely secure today is not possible. But I get your
> point. Added requirement A.8 "Authentication MUST be supported on
> command channel.". Note the wording implies it is possible to enforce
> authentication, but does not mandate it on every deployment.
>
What you've added is good.
>> 3. People may want an audit log of commands issued, so we should
>> consider adding requirements for this.   These may be satisfied simply
>> by a dedicated logger but that's an implementation detail.
> Good idea. Added requirement A.9.
>
>> Administrative Management
>> --------------------------
>>
>> 1. The paragraph under A.1,  "Supporting large command parameters end
>> and responses..."
>>
>> Really this is part of the A.2 discussion and should be incorporated there.
> Updated.
>
>> 2. A.5 - implies, but does not explicitly state, that content supplied
>> with set-config would replace the entire existing in-memory
>> configuration.  In other words, set-config is intended to supply a full
>> configuration, anything not supplied in it does not exist.  We should
>> maybe state this explicitly.
> Added text. Hopefully it's more explicit now.
>
>> 3. One thing we might consider is to allow set-config to automatically
>> dump a successful configuration to a file for diagnostic purposes. 
>> Somebody does a set-config, possibly from a remote box and no one
>> records the config. If Kea then crashes we would have no certain
>> knowledge of its configuration at the time it went down.  The file
>> could be saved to a time-based file name.  This would not overwrite the
>> existing configuration, nor be reloaded at startup (though that has
>> interesting possibilities).
> I would be cautious about it. For a moment I had a write-config command
> described, but then realized how dangerous such a call could be. If
> exploited remotely, it would possibly allow Kea to write files in
> locations specified by the attacker. I think a better approach is to add a
> flag ("write") that, when set, would cause Kea to write down its current
> configuration somewhere, but that somewhere must not be arbitrary. I
> don't know, maybe we will limit it to the Kea state directory?
> Anyway, that's something to be figured out in the design.
>
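
To make the "somewhere must not be arbitrary" idea concrete, here is a rough
sketch of confining the dump to one directory. The directory path and the
helper name are assumptions for illustration only; nothing in the proposal
fixes either of them.

```python
from pathlib import Path

# Hypothetical location; the thread only suggests "Kea state directory".
STATE_DIR = Path("/var/lib/kea")


def resolve_dump_path(filename: str, state_dir: Path = STATE_DIR) -> Path:
    """Return the absolute path for a config dump, confined to state_dir."""
    base = state_dir.resolve()
    candidate = (base / filename).resolve()
    # Reject absolute paths and ".." tricks that land outside state_dir.
    if base not in candidate.parents:
        raise ValueError(f"refusing to write outside {base}: {filename!r}")
    return candidate
```

With this check, a name such as "kea-config-20160714.json" is accepted, while
"../../etc/passwd" or "/etc/passwd" is rejected before any file is opened.
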
>> Lease Management
>> ----------------
>>
>> 3. "Those two calls will be used to retrieve..."  should be "These two
>> commands will be..."
> Updated.
>
>> 4.  "Q: Do we want to have a single query (e.g. get-lease4) with
>> multiple parameter sets or do we want separate queries..."
>>
>> Initially I thought they should be separate commands but I think
>> having a single ("overloaded") command is more flexible should we decide
>> to add variants in the future.  I don't think the extra parameter logic
>> to deal with permutations would be significant.  Whatever we do decide
>> here we should apply universally throughout the API.  Either we
>> "overload" commands or we do not.
> Updated the text to go with overloaded approach.
>
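
As a rough illustration of the "overloaded" approach, a single handler can
pick the lookup from whichever parameter set was supplied. The parameter
names, the lease fields and the in-memory store below are all hypothetical;
the proposal does not define them here.

```python
from typing import Optional

# Hypothetical in-memory lease store; field names are illustrative only.
LEASES = [
    {"ip-address": "192.0.2.10", "hw-address": "00:11:22:33:44:55", "subnet-id": 1},
    {"ip-address": "192.0.2.20", "hw-address": "66:77:88:99:aa:bb", "subnet-id": 1},
]

# Each accepted parameter set is one "overload" of the command.
PARAMETER_SETS = [{"ip-address"}, {"hw-address", "subnet-id"}]


def get_lease4(arguments: dict) -> Optional[dict]:
    """Return one lease or None, choosing the lookup from the keys supplied."""
    if set(arguments) not in PARAMETER_SETS:
        raise ValueError(f"unsupported parameter set: {sorted(arguments)}")
    for lease in LEASES:
        if all(lease[key] == value for key, value in arguments.items()):
            return lease
    return None
```

Adding a variant later means adding one entry to PARAMETER_SETS rather than
introducing a new command name.
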
>> 5. "Q: Do we want to support multi-tenancy..."
>>
>> This seems like a broader question than just at our command API level.
>> This is likely to have ramifications in other places.  In such a scenario
>> then, what would get-lease6(ip-addr) return if there were more than one
>> lease on different subnets?  It could be the first such lease we found OR a
>> collection of the leases.
> The comments you and Shawn made give me some ideas. If we want to do
> multi-tenancy, it likely should be on a global scope rather than leases
> or subnets. In any case, this is out of scope for now, so the question
> has been removed.
>
>> We could decide now that all get commands return collections, just as
>> SQL selects return rows/result sets, giving us ample flexibility for any
>> number of future requirements.
> I prefer for each call to return one or zero objects (leases, subnets,
> etc). get-something type of calls are relatively easy to implement, but
> set-something is trickier. We would have to implement some sort of
> transaction (to roll back the leases we already inserted if the next
> lease insertion fails, do the same for subnets and hosts). This would be
> very difficult to implement (e.g. the change caused a subnet with all
> leases in it to be removed, then the next subnet failed, so we need to
> recreate the subnet *and* all leases that we just removed). To avoid this
> sort of elaborate logic, it's simpler to go with a one object per call
> approach.
To be clear, what I meant by "all commands" was all commands that fetch
objects.  We could always adopt a naming convention that any command which
returns more than one has a plural name: get<Object> returns a single
instance (or none), get<Object>s or get<Object>List returns more than one.

>> 6. Does update equate to replacing the entire lease with what is sent
>> with the update command?  In other words, is an update equivalent to
>> delete/add?  If so it implies that every value for the updated lease
>> must be in the JSON supplied to the update.  I'm not saying this is a
>> bad thing, I'm simply looking for clarification.
> Added extra text. It should be possible to update only some parameters.
> The example given is that the sysadmin wants to change lease lifetime,
> so he only has to specify IP address and lifetime. In this case only
> lifetime will be updated as IP address stays the same for the duration
> of the lease lifetime. That's convenient, because he just wants the
> client to keep its lease longer and doesn't want to be bothered with
> details, like what the subnet-id or cltt was.
>
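
A minimal sketch of the partial-update semantics described above, assuming
leases are keyed by IP address and held in a plain dictionary (the field
names and the function name are illustrative, not taken from the document):

```python
def update_lease(leases: dict, changes: dict) -> dict:
    """Apply a partial update to the lease identified by its IP address."""
    address = changes["ip-address"]
    if address not in leases:
        raise KeyError(f"no lease for {address}")
    # Merge: fields absent from `changes` keep their stored values.
    updated = {**leases[address], **changes}
    leases[address] = updated
    return updated


leases = {
    "192.0.2.10": {
        "ip-address": "192.0.2.10",
        "valid-lft": 3600,
        "subnet-id": 1,
        "cltt": 1468506592,
    }
}
# The sysadmin supplies only the address and the new lifetime.
update_lease(leases, {"ip-address": "192.0.2.10", "valid-lft": 7200})
```

Only valid-lft changes; subnet-id and cltt are left exactly as they were.
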
>> 7. Should we allow them to change the subnet id of a lease?  This might
>> come in handy for repairing some unforeseen situation but I'm not certain it
>> is a good idea.
> Yes. The commands can be dangerous if you misuse them. We will put
> necessary warnings in the documentation. But a clueless sysadmin can wreak
> havoc by writing directly to the DB anyway. It's sorta "here's a
> shotgun, this is your foot, have a nice day" attitude :)
>
>> 8. Do we not also have the multi-tenancy question with update and delete?
> I removed the multi-tenancy from the doc for now.
>
>> 9. "Q: Do we want a way to delete all leases in a subnet? ..."
>>
>> Yes, I think this is useful.
> Ok, added.
>
>> 10. "Q: Do we want to delete all leases that belong to certain identifier?"
>>
>> Are you talking about an identifier having leases in more than one subnet?
>> I imagine this could also be useful.
> Added overloaded delete-lease{4,6}.
>
>> 11. "Note: There are currently no plans to implement calls that retrieve
>> multiple leases..."
>>
>> We could implement a row limit with some reasonable number, and maybe a
>> flag or parameter for overriding that limit.
>> As noted above under general comments, security is a bigger issue than
>> just this item.  In theory only admins should be using this and
>> ultimately it is up to them to use the commands safely.
> I decided to leave this out of scope, but also added it to the list as a
> note that we considered this alternative. I'm hoping to get some
> feedback from a couple of friendly companies. It's good to have some
> alternatives.
>
>> Host Reservation Management
>> -----------------------------
>>
>> 12. H.17, as with updating leases, is update-reservation equivalent to
>> delete/add?
> Added clarification. A user has to specify only those parameters that
> are to be changed.
>
>> 13. "Q: For IPv6 there may be multiple IPv6 addresses and/or prefixes
>> reserved. There is no easy way to identify them..."
>>
>> This question also applies to host options, no?  Users might find it
>> equally useful to add, update, or delete options without having to
>> update the entire reservation.  We could add these commands as "MAY"
>> support.
> Yes. Added requirements H.17.1 and H.17.2 to
>
>> 14. "Q: Do we want to specify delete-reservation with (identifier-type,
>> identifier, subnet-id)?"
>>
>> Yes I think we should include this.
> Updated the text.
>
>> Subnet Management
>> -----------------
>> 16. "TBD: What to do about subnets modification? There are several
>> options:..."
>>
>> I think a mode parameter is a good idea, certainly one that allows
>> choosing between #1 and #3.
>>
>> I'm not sure how we would implement #2's subnet validity check as a
>> parameter pertaining to a single update.  These checks would have to
>> continue over time, rather than as part of the update processing.  How
>> would one turn this off again?  What we could do, if we are worried
>> about performance is make it a global level parameter, that admins turn
>> on or off, if we think the performance impact warrants this or it
>> prohibits some form of host reservation behavior.
> Updated text. I decided to keep #2 for now, but added a note that it is
> not the preferred way. If we get more voices against it, we'll remove
> it. To somewhat defend it, I think having such a check (if the lease
> belongs to active pool) would be useful as a way for Kea to
> automatically recover from someone or something messing up its database.
>
>> 17."Q: How do we want the subnet removal procedure to work? There are
>> several possible options:"
>>
>> I do not believe #2 is viable because it introduces a violation of
>> referential integrity.  This is true even though Memfile doesn't have
>> foreign key constraints and for RDBMS's they aren't mandatory. 
>>
>> As with the update, I think we should provide a delete "mode" parameter
>> that lets them pick either #1 or #3, "retire" or "immediate".
>>
>> Under "retire", new subnets cannot have the same ID as a retiring
>> subnet, so there is no break in integrity. Retired subnets are deleted
>> only after their last lease is removed.  Mode #3 avoids the issue
>> entirely by performing a cascading delete. 
>>
>> I think that #4, "reconfigure process", might actually be a subnet
>> command in itself.  Wouldn't your use case be something like this:
>>
>> 1. Delete the current subnet
>> 2. Add the new subnet
>> 3. Tell clients of the subnet to reconfigure
> Ok, I think this is a scope creep. The goal here was to design the API,
> not the underlying features. We don't have reconfigure support now, so
> anything related to reconfigure is vague at best at this stage.
My comments on #4 were more exploratory, not suggesting this should be in
scope.
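
For the "retire" and "immediate" delete modes suggested under item 17, a
rough sketch of how the two could differ follows. The data layout, the
"retired" flag and both function names are assumptions made up for this
example; the proposal only names the modes.

```python
def purge_retired(subnets: dict, leases: list) -> None:
    """Drop retired subnets once their last lease is gone."""
    in_use = {lease["subnet-id"] for lease in leases}
    for subnet_id in [s for s, cfg in subnets.items()
                      if cfg.get("retired") and s not in in_use]:
        del subnets[subnet_id]


def delete_subnet(subnets: dict, leases: list, subnet_id: int, mode: str) -> None:
    """Delete a subnet either by cascading or by retiring it."""
    if mode == "immediate":
        # Cascading delete: remove the subnet's leases, then the subnet.
        leases[:] = [l for l in leases if l["subnet-id"] != subnet_id]
        del subnets[subnet_id]
    elif mode == "retire":
        # The ID stays reserved until the last lease expires or is removed.
        subnets[subnet_id]["retired"] = True
        purge_retired(subnets, leases)
    else:
        raise ValueError(f"unknown delete mode: {mode!r}")
```

Because a retired subnet stays in the table, its ID cannot be handed to a new
subnet while leases still point at it, which is the integrity property the
"retire" mode is meant to keep.
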
>> 18. As an aside, for a datacenter-type setup, where they are going to delete
>> thousands of subnets and add thousands more, at some point the ID pool
>> runs out.  We may need an administrative command for handling this.
> We have it already. You can explicitly specify subnet-id when specifying
> a subnet. And even if you let Kea assign subnet-ids automatically, it's not
> going to run out. Let's assume 20k subnets. That's more than I ever
> heard anyone was using. Let's further assume this particular deployment
> is insane and removes and then adds all of those 20k subnets every 5
> minutes. At this pace they would loop the subnet-id in a bit over 2 years.
>
> That's not a problem, though. Their subnet-id (assuming they use
> automatic numbering) would simply loop back to 0 and start counting up
> again. If you really think that's an issue, we can consider upgrading
> subnet-id to 64 bits.
>
>
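
The "bit over 2 years" estimate can be checked with a few lines of
arithmetic. The one assumption here is that subnet-id is currently a 32-bit
value, which the remark about upgrading to 64 bits suggests but does not
state outright.

```python
SUBNET_ID_BITS = 32          # assumption: current width of subnet-id
SUBNETS_PER_CYCLE = 20_000   # subnets removed and re-added each cycle
CYCLE_MINUTES = 5            # one full remove/add cycle every 5 minutes

ids_available = 2 ** SUBNET_ID_BITS
cycles_until_wrap = ids_available / SUBNETS_PER_CYCLE
minutes_until_wrap = cycles_until_wrap * CYCLE_MINUTES
years_until_wrap = minutes_until_wrap / (60 * 24 * 365)

print(f"{years_until_wrap:.2f} years")  # prints "2.04 years"
```
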
>> Options management - Option Definition commands
>> -----------------------------------------------
>>  
>> 19. "Kea allows specifying options in several scopes:..."  Does not
>> mention client class as a scope.
> Updated.
>
>> 20. The add-optionX-def commands appear to be the only add commands
>> which support adding multiple elements
> Because we don't have any way to reference them. We could do something
> like saying: get-option offset x (i.e. the xth option specified), but that
> would be even more confusing.
>
>> - Why support multiples with this one and not others? 
>> Consistency and clarity in APIs are important.  We could decide, that
>> all add commands should accept a collection of objects.  If I can add
>> one host reservation, why not ten?  Or maybe we adopt a naming
> I think we should go the other way and try to make all calls single
> objects. There are several serious issues with handling multiple
> objects. First, we may get hit by the fragmentation. It's solvable, but
> requires extra effort. The second issue is much more important. What if
> we process half of the objects and then the next one fails for whatever
> reason? Do we continue or roll back? Sometimes rollback may be extremely
> difficult or even impossible. For example if you delete subnets, you may
> trigger DNS removals that are already ongoing. You can't roll them back.
> So for these reasons I prefer to go with 1 call = 1 object approach.
>
> So we have 2 choices here:
>
> a) update the set/get-options{4,6}-def commands once we figure out how
> to reference options that are already added. Any suggestions?
>
> b) decide that it's ok to work on option sets. Symmetry between
> different calls is nice, but not that critical. Oh, and we already allow
> setting multiple options, multiple IPv6 reservations in host (for the
> same reason).
>
> Which one do you prefer?

I think my issue was actually that the add-option<X>-def commands (O.1
and O.2) allow you to add one or more definitions while the add-option<X>
commands (O.9 and O.10) allow you to only add a single option.  Why can I
add more than one definition at a time but only one option at a time?

The IPv6 address case you mention is slightly different.  We can add
more than one, but that is because they are children of a host reservation.
The add-reservation command itself (H.1 thru H.12) does not support adding
multiple reservations.

>> - If we're going to support adding more than one at a time, what happens
>> if one of the entries is invalid, does the whole addition "rollback" and fail?
> Yup, that's a problem that is really not solvable. See the DNS update
> example above.
>
>> 21. "set-optionsX-def calls set all option definitions. New options will
>> replace whatever old definitions may have been there."
>> - Does this include the pre-defined "standard" option definitions or
>> does this only apply to custom options?  I'm not sure I see the
>> usefulness of this command.  Do you have a use case in mind?
> No, I thought only about custom options. You can't have option
> definitions for standard options. These are built in and cannot be
> overridden from config file.
>
>> - What if there is an option definition X, values for option X have been
>> specified by two subnets and three host reservations, but the new
>> set of definitions does not define option X?  Does the set command fail?
> I wasn't really planning for that, because it's unenforceable. Right now you can
> insert into database an entry that does not match your config. And you
> can't enforce that. If you really think this is an issue, maybe the way
> to go would be to have a sanity-check type of a call? It would verify
> that all your data is consistent.
I think it is incumbent upon the API to sanity check requests for changes
as much as is practical.  We may not be able to defend against everything
but we should strive to make it as robust as we can.  For now, we may wish
to add a general requirement, stating that all commands carry out a sanity
check step which tests the requested command against business rules
pertinent to the command.  If the sanity check fails, we respond with an
error response/explanation.

What those rules are is something we can define and grow over time.  We
don't have to specify them right now.

These rules would also be something that we publish as part of the API
spec.

Some commands might be - "no sanity checking on this, use at your own
risk" ;)
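
To illustrate the kind of sanity-check step I have in mind, here is a rough
sketch of a per-command rule registry. Everything in it is hypothetical: the
command name, the rule, the shape of the server state, and the use of a
result code of 0 for success and 1 for error are assumptions for the example.

```python
from typing import Callable, Dict, List, Optional

# A rule inspects (arguments, state) and returns an error text or None.
Rule = Callable[[dict, dict], Optional[str]]
RULES: Dict[str, List[Rule]] = {}


def rule(command: str) -> Callable[[Rule], Rule]:
    """Register a business rule for one command name."""
    def register(check: Rule) -> Rule:
        RULES.setdefault(command, []).append(check)
        return check
    return register


@rule("delete-option4-def")  # hypothetical command name
def definition_not_in_use(arguments: dict, state: dict) -> Optional[str]:
    code = arguments["code"]
    users = [s["id"] for s in state["subnets"] if code in s["option-codes"]]
    if users:
        return f"option {code} is still used by subnets {users}"
    return None


def sanity_check(command: str, arguments: dict, state: dict) -> dict:
    """Run every rule registered for the command; stop at the first failure."""
    for check in RULES.get(command, []):
        error = check(arguments, state)
        if error:
            return {"result": 1, "text": error}
    return {"result": 0, "text": "sanity checks passed"}
```

A command with no registered rules passes trivially, which covers the "use at
your own risk" case, and the rule list can grow over time without touching
the command handlers.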

>> 23. The delete command names abbreviate "delete" to "del".  We need to
>> be consistent. Either always spell it out or always abbreviate it.
> We already have commands that use full names. So no abbreviations.
>
>> 24. As with the set command, how do we handle a delete option definition
>> command that  attempts to delete a definition that is in use?
> Depends on where it is in use. If it's in a configuration, we can
> probably catch that and reject the configuration. However, if it's used
> somewhere in the host reservations stored in database, this would be a
> problem, as we would have to iterate over all reservations from the DB
> and check them one by one. Note we don't have any API for that, so we
> would have to extend all backends. Doable, but not worth the effort imho.
I disagree that it's not worth the effort.  In the near term perhaps, but
in the long run we should defend against what we can and seek to
continually improve it.  See my text above.

>> 25. "Q: Do we want also get-optionX-def which would return a single
>> option definition?"
>> Yes, I think we do.
> Ok, added.
>
>> Options management - Option commands
>> -----------------------------------------------
>>
>> 26. "Kea allow option specification on global and per subnet level. Both
>> can be manipulated using the same call. There will be an optional
>> parameter subnet-id. If it's not specified, the code applies to global
>> level. If there is subnet-id specified, the change applies to specific
>> subnet. The same rule applies to all call related to options."
>>
>> Actually they can be set at  global, subnet, host and eventually class
>> level.  Are these commands intended only to address the global and
>> subnet scopes, while options for hosts or classes will be handled under
>> object specific commands?
> Yes. There's a comment for add-reservation and update-reservation that
> explains it would also cover options. There's currently no API designed
> for options defined on class level. I simply decided it's not worth the
> effort at this stage.
>
>> 27. "add-option4 and add-option6 add new DHCPv4 or DHCPv6 option values."
>>
>> Do these commands take one option specification or a collection of one or
>> more? It's not clear.
> Added "of a single DHCPvX option". Hopefully it's clear now.
>
>> 29. "set-options4 and set-options6 set all option values for a given
>> scope...." 
>>
>> 30. "get-options4 and get-options6 command returns all option values
>> defined for a given scope. It has one optional parameter: subnet-it. If
>> it's defined, all global options are returned. If it's defined, only
>> options for a given scope are returned."
>>
>> This has a few issues. First, by "subnet-it" you mean "subnet-id".  And I
>> think you mean "if it is not defined, global options will be returned, if
>> it is defined..."
>>
>>
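
The corrected reading of the optional subnet-id parameter (absent means
global scope, present means that subnet's scope) can be sketched as follows.
The configuration layout and the function name are assumptions for
illustration only.

```python
from typing import Optional


def get_options(config: dict, arguments: Optional[dict] = None) -> list:
    """Return option values for the global scope or for one subnet."""
    arguments = arguments or {}
    if "subnet-id" not in arguments:
        # No subnet-id supplied: the command applies to the global scope.
        return config["option-data"]
    for subnet in config["subnets"]:
        if subnet["id"] == arguments["subnet-id"]:
            return subnet["option-data"]
    raise KeyError(f"no subnet with id {arguments['subnet-id']}")


config = {
    "option-data": [{"name": "domain-name-servers", "data": "192.0.2.1"}],
    "subnets": [
        {"id": 1, "option-data": [{"name": "routers", "data": "192.0.2.254"}]},
    ],
}
```
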
>> Interfaces Management
>> ---------------------
>>
>> 31. "Q: Do we want those calls at all? ...."
>>
>> I think we probably do want these eventually.  You could change them
>> from MUST to MAY.  But I could see somebody having everything else
>> correct but forgetting an interface name or something.  Certainly, I can
>> see them wanting the redetect-interfaces.
>>
>>
>> Client Classification
>> ---------------------
>>
>> 32. You don't think we know enough about them to write the
>> requirements?  I won't press the issue but I think we do.
> At this stage I simply ran out of steam to propose something. If we
> implement all those features and people are using them and asking
> for more, that would be a good time to implement a classes modification
> API. Personally I consider this document a long lived one (think years).
> We will add it eventually, but not in the first iteration.

Sure.
>> Runtime Operations
>> ------------------
>>
>> 33. Since these are only statistics commands, you could maybe rename
>> this section.
> No. These are statistics only *for now*. I think over time we will have
> more parameters, like maybe cpu usage, free memory, database connection
> status etc. So I really meant that name when I chose it.
Np.
> Tomek