[kea-dev] Design for Kea Host Reservation

Tomek Mrugalski tomasz at isc.org
Tue Oct 7 12:51:08 UTC 2014


On 06/10/14 19:28, Marcin Siodelski wrote:
>>   "If the allocation of the new lease fails for the reserved address or
>>   prefix, the allocation engine retries using the client's hint. If that
>>   fails, it proceeds with a normal allocation process."
>>
>> That's completely wrong, I'm afraid. Host reservation is not a
>> guideline or a suggestion, it's a strict rule the server must follow. HR
>> can be used to grant special service, but also to confine users
>> in various ways. We can't simply give them a regular address instead.
>>
> 
> I take this point. But, you might have seen my email sent earlier today
> to Thomas where I state this:
> 
> "For the Host Reservation, there is an assumption that the server will
> always try to use reserved resources for a host, if any. But, if the
> reserved resource is unavailable for some reason (e.g. is in use) the
> server should still be able to provision the client by allocating some
> other resource. We may obviously speculate whether it is always
> appropriate for the server to allocate a different address than the one
> that the administrator wanted a client to get and whether the client
> should rather not be provisioned in such case. But, I think it is not a
> problem to make this configurable at a later time once the whole logic
> is in place. So, this discussion is out of scope in the doc."
> 
> In my opinion we should not make too strict assumptions because I can
> imagine customers having some use cases in which this would be allowed.
> And, I state this again: it is much easier to restrict something (with a
> configuration knob) than extend the mechanism if the use case appears.
> 
> I tend to agree that this is going to be a rare case. For that very
> reason I don't expect a massive escalation of issues about someone
> having received a different address than the reserved one.
> 
> As you seem to be pretty confident here, I would like you to make a
> final statement on this: "that we will never, ever need a configuration
> knob which would allow for dynamic allocation if the reserved address is
> unavailable". If you can make this statement I can remove this from the
> design.
I can't speak for all the users. I thought about this a bit more. I'm ok
with the server assigning an address different from the one in the host
reservation, on the condition that the reason is clearly logged in a very
visible (warning?) way. Something like "Client A has a reservation for
address B, but B is currently assigned to client C. Temporarily
assigning a new address D to A. Will change that address as soon as C
attempts to renew B. That correction is expected to happen within E
seconds.". A bit long, but unambiguous. The other way to express it
("hey admin, you messed up and made a reservation for an address that is
in use, we'll correct this mess for you, but it will take some time")
would likely be considered less appropriate ;)
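
Roughly what I have in mind, as a sketch only (the types and lookups
below are made up for illustration, not the real allocation engine
interfaces):

#include <iostream>
#include <optional>
#include <string>

// Hypothetical view of an existing lease on the reserved address.
struct Lease {
    std::string address;
    std::string client_id;
    long seconds_to_expiry;  // upper bound on how long the conflict can last
};

// 'existing' is the lease currently held on the reserved address (if any),
// 'dynamic_candidate' is a fallback picked from the dynamic pool.
std::string allocateForReservedClient(const std::string& client_id,
                                      const std::string& reserved_address,
                                      const std::optional<Lease>& existing,
                                      const std::string& dynamic_candidate) {
    if (!existing || existing->client_id == client_id) {
        return reserved_address;  // reservation honoured
    }
    // Reserved address is held by another client: fall back to a dynamic
    // address, but log the conflict loudly so the admin knows why.
    std::cerr << "WARN: client " << client_id << " has a reservation for "
              << reserved_address << ", but it is currently leased to "
              << existing->client_id << "; temporarily assigning "
              << dynamic_candidate << ". The reservation will take effect "
              << "when the conflicting lease renews or expires (at most "
              << existing->seconds_to_expiry << " seconds).\n";
    return dynamic_candidate;
}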

> One of the possible approaches would be to wait for the first client to
> renew his address and once the server sees the renewing client it may
> send 0 lifetimes to this client to say: "don't use this address anymore,
> because I have reservation for it. Instead I am giving you this
> dynamically allocated address". The reserved address gets back to the
> server and waits for the second client to renew in which case the second
> client gets the reserved address and the previously allocated address is
> de-allocated. So, over time there is a transition and both clients
> remain in service and they finally get their addresses as appropriate.
> What is wrong with this?
That's acceptable behaviour and we can go ahead with this. But it has
its drawbacks. First, the client gets an address from a dynamic pool.
That may be a problem if host reservation is used for confining or
segregating clients, e.g. for a handful of clients that forgot to pay
and are redirected to a captive portal. Second, the address the client
gets (from the dynamic pool) will change in the near future. I suppose both
are ok, as this is a misconfiguration recovery mechanism.
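
For completeness, the renew-time side of that transition could look
roughly like this (again just a sketch with invented types; the real
packet handling is obviously more involved):

#include <cstdint>
#include <string>
#include <vector>

// One address as it would appear in a (DHCPv6-style) reply to a Renew.
struct IaAddress {
    std::string address;
    uint32_t preferred_lifetime;
    uint32_t valid_lifetime;
};

// If the renewed address is now reserved for a different client, answer with
// zero lifetimes so the client stops using it, and hand out a dynamically
// allocated replacement in the same reply. Otherwise extend the lease
// normally (the lifetime values below are placeholders).
std::vector<IaAddress> handleRenew(const std::string& renewed_address,
                                   bool reserved_for_other_client,
                                   const std::string& dynamic_replacement) {
    if (reserved_for_other_client) {
        return { IaAddress{renewed_address, 0, 0},
                 IaAddress{dynamic_replacement, 1800, 3600} };
    }
    return { IaAddress{renewed_address, 1800, 3600} };
}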

As you said, having a knob to let the admin decide would ultimately be
the best solution.

> I don't want to panic over this. Maybe let's ask users? Maybe let's
> disable this by default and display warnings when enabled?
Sure, that will work.
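
Something as simple as a boolean knob, off by default and logged loudly
when enabled, would do (the name below is invented, not an actual
parameter):

#include <iostream>

// Hypothetical knob: off by default, so a client whose reserved address is
// taken gets nothing; when enabled, it gets a temporary dynamic address
// instead and we say so loudly at configuration time.
struct HostReservationConfig {
    bool allow_dynamic_fallback = false;
};

void logReservationConfig(const HostReservationConfig& cfg) {
    if (cfg.allow_dynamic_fallback) {
        std::cerr << "WARN: dynamic fallback for unavailable reserved "
                     "addresses is enabled; affected clients will receive "
                     "temporary dynamic addresses.\n";
    }
}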

> 
>> Here's how it should work in my opinion:
>>
>> 1. there are no reservations
>> 2. client A gets address X
>> 3. admin adds a reservation for address X to client B
>> 4. client B requests an address, the server discovers that there is a
>>    reservation, but also a lease for client A. It logs an error and
>>    client B is not assigned any address.
>> 5. Client B repeats the discovery process in a loop, with exponential
>>    backoff.
>> 6. Client A eventually renews, the server discovers that the address
>>    it has is reserved for someone else, sends X with 0 lifetime back
>>    to A and assigns another address Y.
>> 7. Client B does another discovery attempt and gets the reserved
>>    address X.
>>
>> Obviously, 3. is a misconfiguration, but we can't completely prevent
>> that from happening.
>>
>> This is the right way to recover from a misconfiguration in my
>> opinion. If we implement it the way you propose, then users will start
>> asking questions: why didn't the host reservation work? How long until
>> the server starts using the host reservation I specified? And there
>> would be no easy answers, because it would depend on T1 and lease
>> lifetimes (think about clients that get an address and disappear:
>> waiting till T1 wouldn't be enough, you'd have to wait till
>> valid-lifetime).
>>
>> You may argue that the plan I described above generates more
>> traffic. That is true, but it's a weak argument. First, such a
>> misconfiguration is expected to be a rare event. Second, it gives a
>> much better recovery time. The usual exponential backoff caps out at
>> 120 seconds (or 3600 seconds if a client supports RFC 7083). That's
>> still a much better recovery time than the lease lifetimes in some
>> real networks we have heard about (e.g. 7 days in cable networks).
> 
> Ok, so this is an exponential backoff for Client B. But Client B
> still needs to wait for Client A to renew so that the server can
> replace the address it is using with a new (not reserved) address. So,
> Client B's retransmission period doesn't mean anything on its own.
> If Client A waits for 7 days before it renews, Client B is out of
> service for 7 days. Whereas, with the approach I described, it could use
> some address during the transition period.
> 
> 
>>
>> In time, when we get reconfigure support, we will trigger it after
>> step 4 to make the recovery much faster.
> 
> Obviously not for DHCPv4.
Why? See RFC 3203 (and RFC 6704). Ok, I'm not sure how widely
forcerenew is actually implemented, though.
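
To illustrate: a FORCERENEW is an ordinary DHCPv4 message whose
message-type option (53) carries the value 9, unicast to the client,
with RFC 6704 adding nonce authentication on top. A sketch with made-up
structures (not a real packet library):

#include <cstdint>
#include <string>
#include <vector>

constexpr uint8_t DHO_DHCP_MESSAGE_TYPE = 53;  // DHCP message type option
constexpr uint8_t DHCPFORCERENEW = 9;          // message type from RFC 3203

struct DhcpOption {
    uint8_t code;
    std::vector<uint8_t> data;
};

struct ForceRenewMessage {
    std::string client_ip;                // unicast destination
    std::vector<uint8_t> client_hwaddr;   // chaddr of the target client
    std::vector<DhcpOption> options;
};

ForceRenewMessage buildForceRenew(const std::string& client_ip,
                                  const std::vector<uint8_t>& hwaddr) {
    ForceRenewMessage msg{client_ip, hwaddr, {}};
    msg.options.push_back({DHO_DHCP_MESSAGE_TYPE, {DHCPFORCERENEW}});
    // RFC 6704 would also require the forcerenew-nonce authentication
    // option negotiated with the client earlier; omitted in this sketch.
    return msg;
}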

> I don't see a reason why the reservation can't be out of the pool. So
> you're proposing that when I define a reservation, the configuration
> mechanism checks if this reservation happens to be in one of the pools
> defined for a subnet? And if it is, reject the reservation? How would I
> guarantee this for the HR configuration in the database? What about the
> cases that someone reconfigured the server as we were discussing above?
That's what I was thinking about. If the HR is defined in the config, we
could sanity check it during config reload. If the HR is in the database,
we could do two things. First, sanity check it at runtime when we
happen to read it from the DB. Second, we could implement a command,
something like host-reservation-check, that would read all HRs and
sanity check them. We may implement such a command anyway, regardless of
what we decide for this particular case.
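
The check itself could be as simple as walking all reservations and
flagging those that land inside a dynamic pool. Illustration only, with
addresses shown as plain integers and invented structure names:

#include <cstdint>
#include <string>
#include <vector>

struct Pool {            // a dynamic pool as an inclusive address range
    uint32_t first;
    uint32_t last;
};

struct Reservation {
    uint32_t address;
    std::string host_id;
};

// Returns the reservations that conflict with the configured dynamic pools.
std::vector<Reservation>
checkReservations(const std::vector<Reservation>& reservations,
                  const std::vector<Pool>& pools) {
    std::vector<Reservation> conflicts;
    for (const auto& r : reservations) {
        for (const auto& p : pools) {
            if (r.address >= p.first && r.address <= p.last) {
                conflicts.push_back(r);
                break;
            }
        }
    }
    return conflicts;
}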

The primary reason why, in my opinion, reservations should be kept out of
the pool is performance. As I said before, if you allow in-pool
reservations, lease selection will become slower.

The second argument is monitoring. Right now you could take some
measurements and optimize based on them. For example, check that
your pool is 1000 addresses long and you already have 1000 valid leases
in the db, so there is no point searching for an available lease. You
could build statistics and triggers on top of that ("hey admin, you're
running out of addresses, there are only 5 available left out of 1000
total") etc. If you allow in-pool HRs, you won't be able to do any of
that.
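
To make that concrete: with out-of-pool reservations the pool size and
the lease count are directly comparable, so a check like this stays
trivial (numbers and names invented):

#include <cstddef>
#include <iostream>

// Decide whether it is even worth searching for a free lease, and warn when
// the pool is nearly exhausted. Only meaningful if every address in the pool
// is dynamically allocatable, i.e. there are no in-pool reservations.
bool shouldSearchForFreeLease(std::size_t pool_size, std::size_t valid_leases) {
    if (valid_leases >= pool_size) {
        return false;  // 1000 addresses, 1000 valid leases: nothing to find
    }
    std::size_t remaining = pool_size - valid_leases;
    if (remaining * 100 < pool_size) {  // below 1% free, e.g. 5 out of 1000
        std::cerr << "WARN: running out of addresses, only " << remaining
                  << " available left out of " << pool_size << " total\n";
    }
    return true;
}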

>> Is there HR for this client? If yes, use whatever is reserved and be
>> done with it. If not, use dynamic allocation as it is defined now,
>> without performing any HR queries at all. It's faster and the code is
>> simpler.
> 
> So you're proposing that the server doesn't check if the lease exists
> for the particular address in the lease database when it has HR?
No, I meant the opposite. When the server picks a candidate for a
dynamic lease, it only checks whether the address is in use (i.e. whether
there is a lease for it). It doesn't check whether that particular
address is reserved for someone else.

In the general case, allowing in-pool reservations will degrade
performance even if there are no HRs specified at all. The code would be
sending HR queries anyway.

> I also don't understand "without performing any HR queries at all". The
> query to the HR database has to be made to obtain information if the
> reservation is specified for the host.
Yup, but only once. For one incoming packet, it is queried once. We're
talking about a case where we checked that there's no HR for this
client and decided to pick a dynamic address. The question here is
whether we should keep sending HR queries for the candidate leases that
we're picking from the dynamic pool.
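
In other words (sketch only, with placeholder callbacks instead of the
real backends): the HR lookup for the incoming client happens once per
packet; the open question is whether every dynamic candidate additionally
needs its own HR query.

#include <functional>
#include <optional>
#include <string>
#include <vector>

using LeaseExists       = std::function<bool(const std::string&)>;
using AddressIsReserved = std::function<bool(const std::string&)>;

// Out-of-pool reservations: one lease query per candidate is enough.
std::optional<std::string>
pickCandidateFast(const std::vector<std::string>& candidates,
                  const LeaseExists& leaseExists) {
    for (const auto& addr : candidates) {
        if (!leaseExists(addr)) {
            return addr;
        }
    }
    return std::nullopt;
}

// In-pool reservations allowed: every candidate needs an extra HR query to
// make sure it is not reserved for some other client, even when no
// reservations are configured at all.
std::optional<std::string>
pickCandidateWithHrChecks(const std::vector<std::string>& candidates,
                          const LeaseExists& leaseExists,
                          const AddressIsReserved& isReserved) {
    for (const auto& addr : candidates) {
        if (!leaseExists(addr) && !isReserved(addr)) {
            return addr;
        }
    }
    return std::nullopt;
}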

Anyway, if my arguments don't convince you, then so be it. We'll measure
the performance after HR is implemented anyway. We may discover that
allocating new leases will be slower than it used to be. If the
degradation will be significant, we could add a configuration knob to
forbid in-pool reservations. Until there's empirical data, there's no
point in continuing this discussion. Let's keep the design as you proposed.

> I have no issue with performance tests. In my opinion we should run them
> as soon as possible for all changes we make.
Agree.

Tomek


