dhcpd failover with PXE boot

Eric Crist ecrist at claimlynx.com
Fri Oct 28 14:39:59 UTC 2011


Hello All,

We have a pair of boxes that manage much of our network core, including DHCPd, BIND and NFSd.  We are running ISC-DHCPd 3.1.3 and have the boxes configured for failover following the man page examples.  Within this configuration we have two subnets: a storage LAN and an access LAN.  On our network we have a series of machines, our web cluster for example, that use PXE to boot an image via NFS.  Only the primary box has access to the data store, and we use FreeBSD with CARP to fail over the NFS server address (with much other goo in devd.conf, etc.).

During boot, if the PRIMARY DHCP server replies to the DHCP request, the system boots without a problem.  If the SECONDARY DHCP server replies, the bootstrap image is loaded, but the initial NFS mount to obtain the kernel hangs.  This requires that we babysit the boot of our boxes, or shut down the secondary DHCP server during maintenance windows so everything starts up properly on the first go.

What might be our problem? Some of the basics are:

Our NFS server uses the CARP command address 10.2.0.1, with the two DHCP servers being 10.2.0.2 and 10.2.0.3.  We point option root-path and boot-server (code 66) to the 10.2.0.1 address.

failover peer "somethingserver" {
  secondary;     # (obviously one is primary, one is secondary)
  address 10.2.0.3;
  port 647;
  peer address 10.2.0.2;
  peer port 647;
  max-response-delay 30;
  max-unacked-updates 10;
  load balance max seconds 3;
}
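
For illustration, the root-path and option 66 settings described above would look roughly like the sketch below.  The netmask, address range and NFS export path are placeholders assumed for the example, not our actual values; tftp-server-name is the ISC name for option code 66.

subnet 10.2.0.0 netmask 255.255.255.0 {          # netmask is an assumption
  option root-path "10.2.0.1:/export/image";     # CARP address; path is a placeholder
  option tftp-server-name "10.2.0.1";            # option code 66
  pool {
    failover peer "somethingserver";
    deny dynamic bootp clients;
    range 10.2.0.100 10.2.0.200;                 # placeholder range
  }
}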

Thanks for your feedback.
---
Eric F Crist
System Administrator
ClaimLynx, Inc
(952) 593-5969 x2301





