[kea-dev] Call for comments about the Lease Expiration design
Marcin Siodelski
marcin at isc.org
Mon Jul 27 12:17:27 UTC 2015
Tomek,
Thanks for reading it and commenting. My answers inline.
On 13.07.2015 19:11, Tomek Mrugalski wrote:
> On 06.07.2015 14:43, Marcin Siodelski wrote:
>> I have put together the document
>> http://kea.isc.org/wiki/LeaseExpirationDesign, which presents the design
>> for the Lease Expiration in Kea 1.0.
> I reviewed your proposal and let me be the first one to say: good work!
> Expiration is a tricky topic and I'm sure there were temptations to do
> things with threads, processes or other tricky things. We may get there
> eventually, but 1.0 is not the right time to step into this muddy area. :)
>
> Anyway, on to the details:
>
> Lease Reclamation Routine
> Can you consider renaming reclaimLeases4() to processExpiredLeases4? It
> is maybe obvious what "reclaim" means in the lease expiration context,
> but when you look at it some time in the future, it won't be that easy
> to guess. It could just as well be confused with reclaiming declined
> leases, or with reclaiming leases that no longer belong to a subnet
> after reconfiguration. On the other hand, if the decline design is
> accepted as proposed, it may also reclaim declined leases, so maybe
> that name is not that bad after all?
>
Perhaps reclaimExpiredLeases to make it consistent with the terminology
in the design document and remove ambiguity?
> One thing is not clear to me. In the initial implementation, with a
> configuration of 100 subnets, how many times would reclaimLeases4 be
> called: once or 100 times? While I can certainly see a benefit of
> having per-subnet tweaks in the future, I think the base approach
> should be to reclaim leases uniformly.
>
Once. This assumes that the number of expired leases is below the
threshold for a single pass of the routine, but I guess you assumed that.
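To illustrate the "one call, capped per pass" idea, here is a minimal
sketch. The names (Lease, reclaimExpiredLeases) and the flat in-memory
lease list are hypothetical stand-ins, not the actual Kea lease backend;
the point is only that a single pass sweeps expired leases from all
subnets uniformly, up to a global limit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lease record; real Kea leases carry much more state.
struct Lease {
    std::string address;
    int subnet_id;
    bool expired;
};

// One pass over the whole lease database: reclaim up to max_leases
// expired leases regardless of which subnet they belong to, so no
// subnet is favored over another.  Returns the number reclaimed.
std::size_t reclaimExpiredLeases(std::vector<Lease>& leases,
                                 std::size_t max_leases) {
    std::size_t reclaimed = 0;
    for (auto it = leases.begin();
         it != leases.end() && reclaimed < max_leases;) {
        if (it->expired) {
            it = leases.erase(it);  // stands in for the real reclaim steps
            ++reclaimed;
        } else {
            ++it;
        }
    }
    return reclaimed;
}
```

If the pass hits the cap, the remaining expired leases simply wait for
the next scheduled pass.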
> On a related note, it brings us to the interesting problem of measuring
> timeout. If you plan to implement only global (not per subnet)
> expiration in 1.0, this discussion becomes academic and we can skip it
> in the 1.0 timeframe. Anyway, if you have 100 subnets and the user has
> specified that expiration can take no longer than 10 milliseconds, would
> the code favor expiration of leases from subnet 1 over those from subnet
> 100? I think the answer is "no" (which is good) if there is one call
> for all subnets. The answer would likely be "yes" if there were
> separate calls for each subnet. That's another reason why per-subnet
> tweaks may be fragile. The code would have to implement some sort of
> round robin with memory (the last expiration pass stopped at subnet X,
> so this one continues from X or X+1).
>
Since each subnet could potentially use a different interval for
executing the lease reclamation routine, I assume that we'd need a
separate timer for each subnet. The user would need to be careful to set
an appropriate threshold for the number of leases to be reclaimed in
each subnet in one pass of the reclamation routine, though.
Nevertheless, this is out of scope for now.
> Comment on unexpected shutdown:
> I think the problem with unexpected shutdown is really a non-issue if
> you order the steps appropriately. In particular, the actual lease
> removal must be done after the hooks run and the name change request is
> sent. If the server experiences an unexpected shutdown, the lease will
> stay in the database. After restart, a possibly extra hook call and D2
> name removal request will be issued, but that's OK: the data will be
> consistent. So I agree with your comment that this aspect (not really
> an issue in my opinion) should be documented. Describing it in detail
> in the Guide for Hooks Developers is fine, but a brief mention in the
> User's Guide is also necessary. Even without hooks, you can get
> duplicate DNS update attempts (with the latter failing) if you shut
> down your server abruptly.
>
Ok. I’ll make sure this is documented.
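For the record, the crash-safe ordering being discussed can be sketched
as follows. This is a simplified illustration with hypothetical names;
the real server would run the lease{4,6}_expire callouts and queue a
name change request to D2, which are injected here as callbacks so that
only the ordering is demonstrated.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical bundle of the per-lease reclaim actions.
struct ReclaimSteps {
    std::function<void(const std::string&)> runExpireCallout;
    std::function<void(const std::string&)> sendNameChangeRequest;
    std::function<void(const std::string&)> removeLeaseFromDatabase;
};

// Crash-safe ordering: the database update comes last, so an unexpected
// shutdown between steps leaves the lease in place and the whole routine
// can simply be re-run after restart.  The earlier steps may then run a
// second time, which is the duplicate (but harmless) hook call / DNS
// update mentioned above.
void reclaimOneLease(const std::string& address, const ReclaimSteps& steps) {
    steps.runExpireCallout(address);         // 1. hooks first
    steps.sendNameChangeRequest(address);    // 2. then DNS removal
    steps.removeLeaseFromDatabase(address);  // 3. lease removal last
}
```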
> Discussion regarding skip flag in the lease{4,6}_expire hook. I think
> the skip flag should work as you described. Whether they will "pile up"
> or not really depends on the client patterns. If you have a limited
> number of clients (e.g. in a company that does not permit visitors), you
> may want to skip expiration as a way to provide stable addresses.
> There are better ways to do it (very long lease times for example), but
> there may be extra reasons (monitoring when devices are up, thus short
> lifetimes and desire to keep addresses stable). Anyway, explaining this
> in the hooks guide would be very useful. BTW keep in mind ticket #3499
> about replacing the skip flag with an enum. 1.0 is the last moment when
> we can make such a change.
I agree this must be very well documented in the Guide for Hooks Developers.
>
> Sending Name Change Requests from callouts
> I have mixed feelings whether that is really that useful and if it's
> really part of the expiration design. The same thing could be done for
> lease assignment ("callout implementor would want to send a name change
> request on his own"). I'm ok keeping it as part of the design, but the
> tickets associated with this particular part should be considered
> optional and have low priority.
>
Since hooks libraries typically contain external code that we're not
responsible for, we should try to help implementors write correct code,
as it may save us time debugging the problems they encounter. In
particular, if they can use a common (and tested) function in their
application rather than reinvent the wheel, so much the better. I agree
with you that this is not a lease-expiration-specific change, but it is
a change we want to make while implementing lease expiration. If it is
not documented in the design, it will likely be omitted. Note that
anybody can end up implementing the lease expiration work, so making it
clear what should be done is essential.
> Periodically executing tasks
> From the tasks listed at the end of your design, I presume that you
> favor approach 2. I like it more than approach 1, so given choice of 1
> or 2, I would go with 2. There's another possible approach, though.
> Let's call it approach 3. You may think of it as simplified approach 2.
>
> Approach 3
> Everything happens in the main thread. The code is allowed to create new
> timers wherever needed, but it must use the IOService created in the
> Daemon class. Right now the main DHCPv{4,6} loop calls
> LeaseMgrFactory::instance().getIOServiceExecInterval(); to determine the
> timeout, but it would call TimerMgr::getTimeout() instead. (That would
> also solve the problem described in #3696). That method would go over
> all registered timers and find the shortest timeout. That's not a big
> deal, as I don't think we'll ever have more than 3 or 4 timers (one for
> LFC, a second for expiration, maybe a third for failover and maybe one
> more custom timer for hooks). In this approach, TimerMgr is little more
> than a container for timers with a couple of utility methods. No
> WatchSocket would be necessary.
This approach doesn't scale well. I doubt that the 3 or 4 timers you
mention are all we're ever going to have, and even with two timers we're
going to have some trouble. Consider two timers which are supposed to be
executed at different intervals: 13s and 10s. How do I guarantee that
they are really executed at those intervals? If I pick the smaller value
of 10s and run the IO service at that period, it will execute at 0s,
10s, 20s, 30s and so on. The other timer requires execution at 0s, 13s,
26s, 39s, so it is desynchronized from the times when the IO service
actually runs. I could maybe find some common divisor, but how do I do
that for many timers?
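To make the desynchronization concrete, here is a small sketch (the
function name and the flat schedule are illustrative only) that merges
the firing schedules of several timers. It shows that no single fixed
wake-up period short of the intervals' greatest common divisor hits
every deadline; the event loop would have to recompute the time to the
nearest deadline on every pass instead.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Merge the firing schedules of timers with the given intervals (in
// seconds) up to a horizon, returning the sorted, de-duplicated list of
// wake-up times the event loop would have to honor.
std::vector<int> wakeupTimes(const std::vector<int>& intervals, int horizon) {
    std::set<int> times;
    for (int iv : intervals) {
        for (int t = iv; t <= horizon; t += iv) {
            times.insert(t);
        }
    }
    return std::vector<int>(times.begin(), times.end());
}
```

For the 10s/13s example the gaps between consecutive wake-ups come out
as 10, 3, 7, 6, 4, 9, 1, ... seconds: irregular, so any
TimerMgr::getTimeout()-style method would have to return a different
delta on every call rather than a fixed period.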
I also want to point out that I have a strong desire to migrate the DHCP
servers to a more asynchronous model (similar to D2), as their operation
is in fact asynchronous, i.e. a packet may arrive at any given time.
This may be out of scope for a few upcoming releases, but there is/was
some consensus in the team that this is where we should go. In that
case, the use of the asynchronous functions offered by the IO service is
something that we'll do anyway.
Finally, Approach 2 brings some implementation complexity, but this is
a price we will pay only once.
>
> Regarding approach 2 and 3
> Do you want to register timers by string? It would be more efficient if
> each timer had a unique integer value assigned upon registration. We had
> similar discussion about statistics access patterns. In that case the
> decision was to stick with strings as it's more natural. Also, hook
> users may ask how they can ensure that their timer name is unique. A
> method that returns timer id would ensure that. But I don't insist on
> it, if you prefer to keep access by name pattern.
>
Strings are more intuitive, as you pointed out, and it is easier to come
up with a unique identifier for a string name than for an integer. So
I'd stick with strings, unless other people object.
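A string-keyed registry can still give hook authors the uniqueness
guarantee an integer-id scheme would have provided, simply by failing
loudly on duplicate registration. The sketch below is hypothetical (the
class and method names are not from the design) and omits the actual
scheduling; it only shows the registration contract.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical string-keyed timer registry.  A conventional prefix in
// the name (e.g. "my-hook-lib/cleanup") plus the duplicate check below
// is enough to keep hook-supplied timer names unique.
class TimerRegistry {
public:
    void registerTimer(const std::string& name,
                       std::function<void()> callback,
                       long interval_ms) {
        if (timers_.count(name) > 0) {
            throw std::invalid_argument("timer already registered: " + name);
        }
        timers_[name] = Timer{std::move(callback), interval_ms};
    }

    void unregisterTimer(const std::string& name) {
        timers_.erase(name);
    }

    bool isRegistered(const std::string& name) const {
        return timers_.count(name) > 0;
    }

private:
    struct Timer {
        std::function<void()> callback;
        long interval_ms;
    };
    std::map<std::string, Timer> timers_;
};
```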
> Configuration structure
> - the parameter unwarned-cycles doesn't make sense, at least not in the
> form you propose. Here's an example. I have the thing configured to run
> expiration every minute. I also have clients that come and go and every
> minute there's one lease that expires. Each expiration fully processes
> all expired leases, yet after 5 minutes I start getting warnings that
> something is wrong in my setup. Instead, I would log an INFO message
> saying that the expiration pass was not completed because the allowed
> time budget was reached and there are still X leases awaiting
> reclamation.
>
I was thinking that a message at INFO level is not appropriate here, as
you're going to get a lot of them if you set the processing time to a
short value. The aim was to emit a WARN message if we discover that the
server can't catch up: you keep running the reclamation routine and
after a couple of cycles you still have expired leases in the database.
I don't think the WARN message would be logged repeatedly in the case
you describe. If the server was able to clean the expired leases out of
the database, it would reset the counter. The counter would be increased
only if it was unable to clear all expired leases, in which case it
would try a couple more times and emit the WARN message if there were
still expired leases afterwards. It could reset again after emitting the
WARN message; perhaps it would catch up later on.
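The counter behaviour described above can be sketched like this (class
and method names are hypothetical). The counter grows only across
consecutive incomplete cycles; any cycle that clears the backlog resets
it, so a setup that keeps up never warns, and after a warning the
counter resets again to give the server a chance to catch up.

```cpp
#include <cassert>

// Sketch of the "can't catch up" warning logic described above.
class ReclamationWatch {
public:
    explicit ReclamationWatch(int warn_after_cycles)
        : warn_after_(warn_after_cycles), incomplete_cycles_(0) {}

    // Called after each reclamation pass; returns true when a WARN
    // message should be logged.
    bool cycleFinished(bool leases_remaining) {
        if (!leases_remaining) {
            incomplete_cycles_ = 0;  // caught up: reset the counter
            return false;
        }
        if (++incomplete_cycles_ >= warn_after_) {
            incomplete_cycles_ = 0;  // warned: reset and keep trying
            return true;
        }
        return false;
    }

private:
    int warn_after_;
    int incomplete_cycles_;
};
```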
> - I think idle-time should be specified in seconds, not milliseconds.
> Even in the most aggressive setups ("I need to know about the
> expiration exactly the second it happens"), the smallest value used
> would be 1000. The name is also confusing, as it suggests that the
> server will do expiration only if nothing else is happening. People
> may be afraid that if it's set to 1 second and there's a packet coming
> in every second, the expiration will never be triggered. A better name
> would be "run-every-x-seconds", "expiration-interval" or simply
> "interval". The documentation should clarify that it won't happen
> strictly every X seconds, but every X + (time it took to process the
> last expiration cycle).
>
The problem is that the value set here doesn't specify the interval
between the starts of two consecutive runs of the lease reclamation
routine; it specifies the interval between the end of the previous run
and the beginning of the next one. This is to prevent starting the
routine while the previous instance hasn't finished yet, allowing for
"idle" time between runs.
So none of your proposed names really conveys this meaning. Maybe
"wait-time"?
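The difference between the two interpretations of the parameter can be
shown with a small sketch (names and the simplified time model are
hypothetical): under a fixed-rate schedule the Nth pass starts at
N * interval regardless of how long passes take, while under the
proposed "wait-time" semantics the timer is re-armed only when a pass
finishes, so passes can never overlap.

```cpp
#include <cassert>
#include <vector>

// Compute the start times of successive reclamation passes, given the
// duration of each pass, the configured interval, and which semantics
// the interval uses.
std::vector<int> startTimes(const std::vector<int>& pass_durations,
                            int interval, bool wait_time_semantics) {
    std::vector<int> starts;
    int t = interval;  // first pass after one full interval
    for (int d : pass_durations) {
        starts.push_back(t);
        if (wait_time_semantics) {
            t = t + d + interval;  // idle period measured from the end
        } else {
            t += interval;         // fixed rate, ignores pass duration
        }
    }
    return starts;
}
```

With wait-time semantics the gap between consecutive starts is
interval + duration, which is exactly the "X + processing time"
behaviour described in the quoted comment.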
> The design does not explain if it will be possible to run expiration
> procedure during start-up. There should be a switch for it and the
> default should be to do full expiration of all leases at startup.
> That would be useful when recovering after server downtime.
>
I don't have the ability to run the reclamation routine at "start time",
as the server needs to fully process its configuration before it can do
so. We could make sure that the server doesn't wait for the first timer
to elapse before running the reclamation routine, but simply runs it at
time zero. However, I don't see a lot of value in that, as the timers
will be scheduled for relatively short periods of time anyway.
> New server command
> You may consider renaming the command to "leases-expire" for the reasons
> stated above.
>
Sorry, I didn’t get that. What reasons?
> Implementation tasks
> I like this list. I need to update decline design with a similar one.
>
> Extra steps for recovering declined addresses
> If the Decline design I proposed gets through, there will be one extra
> minor step to do. See point 4 in Implementation details here:
> http://kea.isc.org/wiki/DeclineDesign.
>
I think that the decline and expiration are logically two separate
things, and should be processed separately.
Marcin