[kea-dev] Call for comments about the Lease Expiration design
Tomek Mrugalski
tomasz at isc.org
Mon Aug 3 11:05:50 UTC 2015
On 27.07.2015 14:17, Marcin Siodelski wrote:
>> Can you consider renaming reclaimLeases4() to processExpiredLeases4? It
>> is maybe obvious what reclaim means in the lease expiration context, but
>> when you look at it some time in the future, it won't be that easy to
>> guess what "reclaim" means. It could just as well be confused with
>> reclaiming declined leases, or reclaiming leases that no longer belong
>> to a subnet after reconfiguration. On the other hand, if the decline
>> design is accepted as proposed, it may also reclaim declined leases, so
>> maybe that name is not that bad after all?
>
> Perhaps reclaimExpiredLeases to make it consistent with the terminology
> in the design document and remove ambiguity?
Sounds good.
>> Periodically executing tasks
>> From the tasks listed at the end of your design, I presume that you
>> favor approach 2. I like it more than approach 1, so given choice of 1
>> or 2, I would go with 2. There's another possible approach, though.
>> Let's call it approach 3. You may think of it as simplified approach 2.
>>
>> Approach 3
>> Everything happens in the main thread. The code is allowed to create new
>> timers wherever needed, but it must use the IOService created in the
>> Daemon class. Right now the main DHCPv{4,6} loop calls
>> LeaseMgrFactory::instance().getIOServiceExecInterval(); to determine the
>> timeout, but it would call TimerMgr::getTimeout() instead. (That would
>> also solve the problem described in #3696). That method would go over
>> all registered timers and find the shortest timeout. That's not a big
>> deal, as I don't think we'll ever have more than 3 or 4 timers (one for
>> LFC, a second for expiration, maybe a third one for failover, and maybe
>> one more custom one for hooks). In this approach, TimerMgr is little
>> more than a container for timers with a couple of utility methods. No
>> WatchSocket would
>> be necessary.
>
> This approach doesn't scale well. I doubt that 3 or 4 timers, as
> you say, are all we're going to have. But even with two timers we're
> going to have some trouble. Consider two timers which are supposed to be
> executed at different time intervals: 13s and 10s. How do I guarantee
> that they are really executed at these intervals? If I pick the smaller
> value of 10s and run the IO service every so often, it will be executed
> at: 0s, 10s, 20s, 30s and so on. The other timer requires that it is
> executed at 0s, 13s, 26s, 39s, so it gets out of sync with the times
> when I actually want to run my IO service. I could maybe find some
> common denominator, but how do I do it for many timers?
No. At t=0 you iterate over all timers and pick the shortest remaining
time (that would be 10s). Then run the IO service and do the evaluation
again. The first timer just elapsed, so it's restarted (10s out of 10s),
and for the second timer there are 3 seconds left, so you pick 3 seconds.
Run the IO service for 3 seconds. The second timer has elapsed, so you
carry out whatever action it was supposed to perform and restart it. You
then have 7s left out of 10s for the first timer and 13s out of 13s for
the second one. You pick 7 seconds and carry on.
This may sound complex, but it's not. For each timer you keep only two
pieces of information: the deadline timestamp (so you can calculate how
much time is left until the timer elapses) and the total interval
(needed only if the timer is recurring).
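To make it concrete, here's a rough sketch of that evaluation (TimerMgr,
TimerInfo and the member names are illustrative only, not actual Kea code):

#include <chrono>
#include <map>
#include <string>

struct TimerInfo {
    std::chrono::steady_clock::time_point deadline_;  // when the timer elapses
    std::chrono::milliseconds interval_;  // total interval, for recurring timers
};

class TimerMgr {
public:
    // Return the time remaining until the nearest deadline; the main
    // DHCPv{4,6} loop would run the IO service for at most this long.
    std::chrono::milliseconds getTimeout() const {
        auto now = std::chrono::steady_clock::now();
        auto shortest = std::chrono::milliseconds::max();
        for (const auto& t : timers_) {
            auto left = std::chrono::duration_cast<std::chrono::milliseconds>(
                t.second.deadline_ - now);
            if (left < shortest) {
                shortest = left;
            }
        }
        // A timer that has already elapsed yields zero: run its callback now.
        return (shortest < std::chrono::milliseconds::zero() ?
                std::chrono::milliseconds::zero() : shortest);
    }

private:
    std::map<std::string, TimerInfo> timers_;
};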
> I also want to point out that I have a strong desire to migrate the DHCP
> servers into a more asynchronous model (similar to D2), as their
> operation is in fact asynchronous, i.e. they may receive a packet at any
> given time. This may be out of scope for a few upcoming releases, but
> there is/was some consensus in the team that this is where we should go.
> In that case, the use of the asynchronous functions offered by IO
> Service is something that we'll do anyway.
> Finally, Approach 2 brings some implementation complexity, but this
> is a price we will pay only once.
One clarifying question:
> Each created timer is associated with an instance of the WatchSocket
> object. This object is registered in the IfaceMgr as an external
> socket with the appropriate callback function. The generic callback
> function picks the appropriate WatchSocket to use and marks it
> "ready".
If there are X timers, how many WatchSockets are there: 1 or X? The text
above does not specify that.
If the answer is X, I don't really like it. WatchSocket is implemented
using pipes. That means that for X timers, there will be X pipes.
Depending on the state of those pipes, the code may behave differently.
This is non-trivial complexity that will be paid every time we debug
anything related to data reception or waiting. It will NOT be paid just
once.
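For illustration, here's what one WatchSocket per timer would amount to,
sketched with a plain pipe (WatchSocketLike and the other names below are
made up for this example; they are not Kea's actual classes):

#include <unistd.h>

#include <map>
#include <memory>
#include <string>

// Stand-in for the WatchSocket idea: a pipe whose read end is handed to the
// main loop's select(), and whose write end is used to mark it "ready".
class WatchSocketLike {
public:
    WatchSocketLike() { static_cast<void>(::pipe(fds_)); }
    ~WatchSocketLike() { ::close(fds_[0]); ::close(fds_[1]); }
    int getSelectFd() const { return (fds_[0]); }
    void markReady() { char c = 0; static_cast<void>(::write(fds_[1], &c, 1)); }
    void clearReady() { char c; static_cast<void>(::read(fds_[0], &c, 1)); }
private:
    int fds_[2];
};

// One WatchSocketLike per registered timer: X timers mean X pipes (and 2*X
// file descriptors) that the main loop has to track and that can each be in
// a different state while debugging.
std::map<std::string, std::shared_ptr<WatchSocketLike> > timer_sockets;

// Generic timer callback: it only marks the corresponding socket ready; the
// real work runs later, in the main thread, once select() wakes up on that
// descriptor and the handler calls clearReady() plus the actual action.
void timerElapsed(const std::string& timer_name) {
    timer_sockets.at(timer_name)->markReady();
}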
> Strings are more intuitive, as you pointed out, and it is easier to find
> the unique identifier than in the case of integers. So, I’d stick to
> this, unless other people object.
I'm ok with that. In my opinion the number of timers will be small and
they'll be accessed rather infrequently, so it's ok.
> I was thinking that the message at INFO level is not appropriate here,
> as you're going to have a lot of them if you set the processing time to
> a short one. The aim was to have a WARN message if we discover that the
> server can't catch up. So, you keep running the reclamation routine and
> after a couple of cycles you still have expired leases in the database.
> I don't think that I'd have the WARN message logged repeatedly in the
> case that you have described. If the server was able to clear the
> expired leases from the database, it would reset the counter. The
> counter would be increased only if it was unable to clear all expired
> leases.
Ok, that will work.
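In other words, something along these lines (a sketch only; the counter and
threshold names below are made up):

#include <cstddef>
#include <iostream>

std::size_t failed_cycles_ = 0;                  // consecutive incomplete cycles
const std::size_t max_unreclaimed_cycles_ = 5;   // threshold before warning

void reclaimCycleDone(std::size_t remaining_expired_leases) {
    if (remaining_expired_leases == 0) {
        // The routine caught up: reset the counter, so no repeated warnings.
        failed_cycles_ = 0;
        return;
    }
    // Expired leases are still left after this cycle; warn only once the
    // condition has persisted over several consecutive cycles.
    if (++failed_cycles_ >= max_unreclaimed_cycles_) {
        std::cerr << "WARN: lease reclamation cannot keep up ("
                  << failed_cycles_ << " consecutive incomplete cycles)\n";
    }
}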
>> - I think idle-time should be specified in seconds, not milliseconds.
>> Even in the most aggressive setups ("I need to know about the expiration
>> exactly the second when it happens"), the smallest value used would be
>> 1000. The name is also confusing, as it suggests that the server will do
>> expiration only if nothing else is happening. People may be afraid that
>> if it's set to 1 second and there's a packet coming in every second, the
>> expiration will never be triggered. A better name would be
>> "run-every-x-seconds", "expiration-interval" or simply "interval". The
>> documentation should clarify that it won't happen strictly every X
>> seconds, but every X+(time it took to process the last expiration cycle).
> The problem is that the value set here doesn't specify the interval
> between the starts of two consecutive lease reclamation routines, but
> the interval between the end of the previous routine and the beginning
> of the next one. This is to prevent starting the routine while the
> previous instance hasn't finished yet, and to allow some "idle" time
> between runs.
> So, neither of your proposed names really conveys this meaning. Maybe
> "wait-time"?
Sounds good.
> I don't have an ability to run the reclamation routine during the
> "start time", as the server needs to fully process its configuration
> before it can do so. We may make sure that the server doesn't wait for
> the first timer to elapse before running the reclamation routine, but
> simply runs it at time zero.
Ok, running it at t=0 would do the trick.
> However, I don’t see a lot of value in it, as the timers will be
> scheduled for relatively short periods of time anyway.
You don't know that. Once you expose the timer knob to the users, all
bets are off. There will be people setting it to insane values "to make
the server run faster".
One extra comment. The design says that with "idle-time" (or whatever
you decide to rename it to) set to 0, lease expiration is disabled.
That's ok, but I think there should still be a way to run it when the
server starts.
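To put both points together, here's roughly what I mean (a sketch only;
reclamationLoop() and reclaimExpiredLeases() are illustrative names, and in
the real server the waiting would be driven by the IO service rather than a
blocking sleep):

#include <chrono>
#include <thread>

void reclaimExpiredLeases() {
    // ... reclaim expired (and possibly declined) leases ...
}

void reclamationLoop(std::chrono::seconds wait_time) {
    // Running once at t=0 covers the "do it when the server starts" case.
    reclaimExpiredLeases();
    for (;;) {
        // "wait-time" is counted from the END of the previous run to the
        // START of the next one, so runs can never overlap...
        std::this_thread::sleep_for(wait_time);
        // ...and consecutive runs end up spaced by wait_time plus however
        // long the previous reclamation itself took.
        reclaimExpiredLeases();
    }
}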
>> Extra steps for recovering declined addresses
>> If the Decline design I proposed gets through, there will be one extra
>> minor step to do. See point 4 in Implementation details here:
>> http://kea.isc.org/wiki/DeclineDesign.
> I think that the decline and expiration are logically two separate
> things, and should be processed separately.
Lease expiration is a mechanism that does something with leases after
their timers expire. In exactly the same way, declined leases are
recycled after a certain period of time. Think of the decline mechanism
as a consumer of the mechanism you're implementing for lease expiration.
Why do you want to implement duplicate code for exactly the same thing?
The only difference is one hook call and one log message.
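To illustrate how small that difference is, a sketch (all types and names
below are made up for this example, not actual Kea code):

#include <iostream>
#include <string>

// Purely illustrative types and names.
struct Lease {
    enum State { STATE_DEFAULT, STATE_DECLINED } state_;
    std::string address_;
};

void removeOrUpdateLease(const Lease& lease) {
    // The shared reclamation work: remove or update the lease in the
    // database, update statistics, run the common hook, etc.
    std::cout << "reclaiming " << lease.address_ << "\n";
}

void reclaimLease(const Lease& lease) {
    if (lease.state_ == Lease::STATE_DECLINED) {
        // The only decline-specific additions: one hook call and one log.
        std::cout << "hook + log: recovering declined " << lease.address_ << "\n";
    }
    removeOrUpdateLease(lease);
}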
Tomek