Help me make DHCP's documentation better.

David W. Hankins David_Hankins at
Mon Aug 28 17:56:49 UTC 2006

Thanks to all of you for the feedback.  I've written a second pass
at this, collapsing it all into one (related) section.

On Fri, Aug 18, 2006 at 03:11:38PM +0200, Patrick Schoo wrote:
> I think MAC Affinity is not a proper description for what it is used for. If I 
> understand the function correctly Lease Affinity would be a better 
> description. A lease either has affinity to 'reside' on the primary, or it 
> has affinity to 'reside' on the secondary. This is an extra property apart 
> from the 'binding state'. For example a lease can have 'binding state active' 
> and also 'affinity secondary'. Once the lease expires it will move to the 
> FREE status, while keeping the 'affinity secondary' property. The lease 
> rebalancer only needs to touch the affinity property, once a misbalance is 
> detected.

That's an interesting distinction, but the property of the affinity
remains the client identification - be that MAC or client identifier
option.  So the lease's state has an affinity to its identifier.

Find below this cut a second pass at documenting these four options.

I'm rather thinking the max-lease-misbalance value will be modified
later in 3.1.0. Now that we're on a schedule, it doesn't make
sense not to move _any_ leases if we only try once an hour...we might
as well move a few rather than wait for it to exceed (x)%.

       The Failover pool balance statements.

	   max‐lease‐misbalance percentage;
	   max‐lease‐ownership percentage;
	   min‐balance seconds;
	   max‐balance seconds;

	  This version of the DHCP Server evaluates pool balance on a
	  schedule, rather than on demand as leases are allocated.  The
	  latter approach proved to be slightly klunky when pool misbalance
	  reached total saturation: when any server ran out of leases to
	  assign, it also lost its ability to notice it had run dry.

	  In order to understand pool balance, some elements of its  operation
	  first  need  to  be  defined.   First, there are ’free’ and ’backup’
	  leases.  Both of these are  referred  to  as  ’free  state  leases’.
	  ’free’  and  ’backup’  are ’the free states’ for the purpose of this
	  document.  The difference is that only the primary may allocate from
	  ’free’  leases unless under special circumstances, and only the sec‐
	  ondary may allocate ’backup’ leases.

	  When pool balance is performed, the only plausible expectation is to
	  provide  a  50/50  split  of  the  free state leases between the two
	  servers.  This is because no one can predict which server will fail,
	  regardless of the relative load placed upon the two servers, so giv‐
	  ing each server half the leases gives both servers the  same  amount
	  of ’failure endurance’.  Therefore, there is no way to configure any
	  different behaviour, outside of some  very  small  windows  we  will
	  describe shortly.

	  The first thing calculated on any pool balance run is a value
	  referred to as ’lts’, or "Leases To Send".  For the primary, this
	  is the count of free leases minus the count of backup leases,
	  divided by two.  For the secondary, it is the count of backup
	  leases minus the count of free leases, divided by two.  The
	  resulting value is signed: if it is positive, the local server is
	  expected to hand out leases to retain a 50/50 balance.  If it is
	  negative, the remote server would need to send leases to balance
	  the pool.  Once the lts value reaches zero, the pool is perfectly
	  balanced (give or take one lease in the case of an odd number of
	  total free state leases).
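The lts calculation above can be sketched in a few lines.  This is only an illustration, not the server's code: the function name and arguments are made up for the example, and truncating the division toward zero is an assumption (the text only says "divided by two").

```python
def leases_to_send(free: int, backup: int, is_primary: bool) -> int:
    """Signed 'lts' for the local server.

    Positive: the local server should give leases to its peer.
    Negative: the peer would need to send leases to the local server.
    """
    # The primary allocates from 'free', the secondary from 'backup'.
    local, remote = (free, backup) if is_primary else (backup, free)
    # Assumption: truncate toward zero, so an odd difference leaves the
    # pool within one lease of a 50/50 split.
    return int((local - remote) / 2)


if __name__ == "__main__":
    print(leases_to_send(120, 80, is_primary=True))
    print(leases_to_send(120, 80, is_primary=False))
```

With 120 free and 80 backup leases, the primary sees an lts of 20 and the secondary sees -20, so both servers agree on which side should give leases away.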

	  The current approach is still something of a hybrid of the old
	  approach, marked by the presence of the max‐lease‐misbalance
	  statement.  This parameter configures what used to be a fixed 10%
	  value in previous versions: if lts is less than
	  (free + backup) * max‐lease‐misbalance percent, then the server
	  will skip balancing a given pool (it won’t bother moving any
	  leases, even if some leases "should" be moved).

	  The meaning of this value is somewhat overloaded, however, in that
	  it also governs the estimation of when to attempt to balance the
	  pool (an attempt which may then be skipped as described above).
	  The oldest leases in the free and backup states are examined.  The
	  time they have resided in their respective queues is used as an
	  estimate of how long it would probably take before the leases at
	  the top of the list are consumed (and thus, how long it would take
	  to use all leases in that state).  This time is multiplied
	  directly by the max‐lease‐misbalance percentage, and the result is
	  fit into the schedule if it falls within the configured
	  min‐balance and max‐balance values.  The scheduled pool check time
	  is only moved in a downwards direction; it is never increased.

	  Lastly, if the lts is more than double the skip threshold in the
	  negative direction, the local server will ’panic’ and transmit a
	  Failover protocol POOLREQ message, in the hopes that the remote
	  system will be woken up into action.
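To make the gate concrete, here is a rough sketch of the three outcomes described above: skip, balance, or panic.  The function name and return strings are hypothetical, the percentage of 15 in the usage lines is just an example value, and what happens when lts lands exactly on the threshold is an assumption, since the text does not say.

```python
def balance_action(lts: int, free: int, backup: int,
                   max_lease_misbalance: int) -> str:
    """Decide what a pool balance run does for one pool."""
    # Threshold: a percentage of all free state leases (free + backup).
    threshold = (free + backup) * max_lease_misbalance // 100
    if lts < -2 * threshold:
        # Badly short of leases locally: ask the peer to act.
        return "poolreq"
    if lts < threshold:
        # Not misbalanced enough to bother moving anything.
        return "skip"
    # Assumption: an lts exactly at the threshold is treated as balanceable.
    return "balance"


if __name__ == "__main__":
    # 200 free state leases at 15% gives a threshold of 30.
    print(balance_action(20, 120, 80, 15))
    print(balance_action(40, 140, 60, 15))
    print(balance_action(-70, 30, 170, 15))
```

In the three usage lines the primary holds 120, 140 and 30 free leases out of 200; only the second case is far enough out of balance to move leases, and the third is far enough the other way to send a POOLREQ.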

	  Once the lts value exceeds the  max‐lease‐misbalance  percentage  of
	  total  free state leases as described above, leases are moved to the
	  remote server.  This is done in two passes.

	  In the first pass, only leases whose most recent bound client  would
	  have  been  served by the remote server ‐ according to the Load Bal‐
	  ance Algorithm (see above split and hba configuration statements)  ‐
	  are  given  away to the peer.  This first pass will happily continue
	  to give away leases, decrementing the lts value  by  one  for  each,
	  until  the lts value has reached the negative of the total number of
	  leases multiplied by the max‐lease‐ownership percentage.  So  it  is
	  through  this  value  that  you can permit a small misbalance of the
	  lease pools ‐ for the purpose of giving the peer more than  a  50/50
	  share  of  leases  in  the  hopes  that their clients might some day
	  return and be allocated by the peer (operating normally).  This pro‐
	  cess  is referred to as ’MAC Address Affinity’, but this is somewhat
	  misnamed: it applies equally  to  DHCP  Client  Identifier  options.
	  Note  also  that  affinity  is applied to leases when they enter the
	  state be moved from free to backup if the secondary already has more
	  than its share.

	  The  second  pass  is  only  entered into if the first pass fails to
	  reduce the lts underneath the total number of free state leases mul‐
	  tiplied  by  the  max‐lease‐ownership percentage.  In this pass, the
	  oldest leases are given over to  the  peer  without  second  thought
	  about  the  Load Balance Algorithm, and this continues until the lts
	  falls under this value.  In this way, the  local  server  will  also
	  happily  keep  a  small percentage of the leases that would normally
	  load balance to itself.
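The two passes can be sketched roughly as follows.  This is a simplified model, not the server's implementation: the Lease class, its fields and the function name are invented for the example, the Load Balance Algorithm result is reduced to a precomputed flag, and walking the first pass oldest-first is an assumption (the text only specifies oldest-first for the second pass).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    address: str
    age: int             # seconds in its free-state queue (hypothetical field)
    peer_affinity: bool  # True if the Load Balance Algorithm maps the
                         # lease's last client to the peer


def pick_leases_to_move(leases, lts, total_free_state, max_lease_ownership):
    """Return (leases handed to the peer, remaining lts)."""
    # Permitted skew: a percentage of the total number of free state leases.
    skew = total_free_state * max_lease_ownership // 100
    oldest_first = sorted(leases, key=lambda lease: lease.age, reverse=True)
    moved = []

    # Pass 1: give away only leases with affinity to the peer, and keep
    # going until lts reaches the negative of the permitted skew.
    for lease in oldest_first:
        if lts <= -skew:
            break
        if lease.peer_affinity:
            moved.append(lease)
            lts -= 1

    # Pass 2: only does anything if pass 1 left lts at or above the skew.
    # Oldest leases go first, regardless of affinity.
    for lease in oldest_first:
        if lts < skew:
            break
        if lease not in moved:
            moved.append(lease)
            lts -= 1

    return moved, lts
```

Note how the same skew value is used in both directions: pass 1 may overshoot the 50/50 point by that much in the peer's favour, and pass 2 stops short of it by that much in the local server's favour.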

	  So, the max‐lease‐misbalance value acts as a behavioural gate.
	  Smaller values will cause more leases to transition states to
	  balance the pools over time; higher values will decrease the
	  amount of change (but may lead to pool starvation if there’s a run
	  on leases).

	  The max‐lease‐ownership value permits a small skew in the lease
	  balance, expressed as a percentage of the total number of free
	  state leases.

	  Finally, the min‐balance and max‐balance values make certain that
	  a scheduled rebalance event happens within a reasonable timeframe
	  (not to be thrown off by, for example, a 7 year old free lease).
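As a rough illustration of how the schedule estimate and the two limits interact, consider the sketch below.  It assumes that an estimate outside the min‐balance/max‐balance window is clamped to the nearest limit, which is one reading of the text rather than something it states outright.  The function and parameter names are hypothetical, and the percentage of 15 in the usage lines is just an example value.

```python
def next_balance_delay(oldest_lease_age: int, max_lease_misbalance: int,
                       min_balance: int = 60, max_balance: int = 3600,
                       current_delay=None) -> int:
    """Seconds until the next scheduled pool balance check."""
    # The age of the oldest free-state lease stands in for how long the
    # queue would take to drain; take the configured percentage of it.
    estimate = oldest_lease_age * max_lease_misbalance // 100
    # Assumption: clamp the estimate into [min_balance, max_balance].
    delay = max(min_balance, min(estimate, max_balance))
    # The scheduled time is only ever moved downwards, never increased.
    if current_delay is not None:
        delay = min(delay, current_delay)
    return delay


if __name__ == "__main__":
    seven_years = 7 * 365 * 86400
    print(next_balance_delay(seven_years, 15))
    print(next_balance_delay(4000, 15))
    print(next_balance_delay(4000, 15, current_delay=300))
```

Under these assumptions a 7 year old free lease yields the max‐balance limit of 3600 seconds rather than an estimate of months, and an already scheduled check at 300 seconds is not pushed back by a later 600 second estimate.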

	  Plausible values for the percentages lie between 0 and  100,  inclu‐
	  sive,  but  values  over  50  are indistinguishable from one another
	  (once lts exceeds 50% of the free  state  leases,  one  server  must
	  therefore have 100% of the leases in its respective free state).  It
	  is recommended to select a max‐lease‐ownership value that  is  lower
	  than  the  value  selected for the max‐lease‐misbalance value.  max‐
	  lease‐ownership defaults to 10, and max‐lease‐misbalance defaults to

	  Plausible  values  for  the  min‐balance  and max‐balance times also
	  range from 0 to (2^32)‐1 (or the limit of your local time_t  value),
	  but  default  to  values  60 and 3600 respectively (to place balance
	  events between 1 minute and 1 hour).

David W. Hankins	"If you don't do it right the first time,
Software Engineer		you'll just have to do it again."
Internet Systems Consortium, Inc.	-- Jack T. Hankins
