David W. Hankins
David_Hankins at isc.org
Thu Feb 14 18:34:40 UTC 2008
On Mon, Feb 11, 2008 at 05:42:40PM -0500, Frank Sweetser wrote:
> By setting servers up to not require DHCP, this also makes your service
> dependencies simpler. There's nothing quite so much fun as creating a
> circular service dependency between two servers, where neither one may be
> turned on until the other is fully booted...
I've heard this from multiple people. It's not...entirely...wrong.
I think it is misguided by a lack of understanding that DHCP simply
needs to be implemented differently in such an environment, and further
confused since ISC DHCP is not IETF DHCP. How we did it is not the
only way it may be done.
That misunderstanding kind of frustrates me.
Let us characterize DHCP in this particular use case as a dynamic
process that reaches a fixed (or semi-fixed) end. This is a
redundant operation. Redundant operations introduce complexities;
more components of the system may fail ("unknown flaws"). The
conclusion is to remove the redundancy to avoid phone calls at 3 am.
Let us describe one other redundant operation in server farms: the
use of network interface speed/duplex autonegotiation. Your server is
not going to swap out its NIC for one that cannot do full duplex on
a reboot. Your switch is similarly not going to physically change on
a restart. So this wire-protocol dynamic process always reaches the
same conclusion: full speed, full duplex. It is therefore redundant,
and by the same argument, must be removed to keep "unknown flaws"
in the process from affecting service.
I certainly know of many networks whose server farms do not use DHCP.
I also know of farms that do use DHCP (and even dynamic DNS).
Although I do know of folks who disable ethernet link autoneg for
the reasons given, what troubles me is that most folks I know who
choose to disable DHCP do not choose to disable ethernet link auto-
negotiation. This means they give DHCP special consideration outside
the norm. It is somehow an extra special automated process (or at
least one whose parameters are not understood). The line drawn is
arbitrary and fuzzy - whatever the individual chooses to be or not be
an acceptable risk according to their own sense of comfort - rather
than clear and consistent - something drawn from a definition.
But, if a group has elected _arbitrarily_ not to use DHCP to configure
their servers, then the implementation of that election is rather
While we're on this topic however, we could discuss how one might
elect to use automation (for all the benefits it conveys) in the best
way - so as to minimize phone calls at 3 am.
Foremost and most obvious is to use failover or some other means to
get redundancy in your DHCP service itself. But this becomes optional
by the time you've reached the third point below. It actually does
not (or should not) matter if your DHCP service works or not, except
that you'd tend to prefer it did.
Second and equally obvious is to use long lease times, so any failure
in the DHCP service is unlikely or impossible to be noticed, except
by brand new servers that have never before been put online. Note
that valid lease times range from 1 second to 2^32-2, with a big
gap between 2^32-2 and 'infinity' (meaning the lease simply never
expires...at least until an operator resets it). Note that long lease
times do not necessarily require long renewal times, at least so far
as the protocol is concerned. ISC dhcpd could stand to let the renew
time be configurable. Note that the server farms I'm aware of use
90-120 day lease times.
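
To put numbers on the lease-time range, here is a small Python sketch.
The 32-bit lease field, the 0xffffffff "infinity" value, and the default
T1/T2 fractions (50% and 87.5% of the lease) come from RFC 2131; the
function names are made up for illustration and are not part of any
ISC tool.

```python
# Lease-time arithmetic for DHCPv4 (RFC 2131). Function names are illustrative.

INFINITE_LEASE = 0xFFFFFFFF            # 2^32-1: the lease never expires
MAX_FINITE_LEASE = INFINITE_LEASE - 1  # 2^32-2 seconds, the longest finite lease

SECONDS_PER_DAY = 24 * 60 * 60


def days_to_lease_seconds(days: int) -> int:
    """Convert a lease length in days to the seconds value sent on the wire."""
    seconds = days * SECONDS_PER_DAY
    if not 1 <= seconds <= MAX_FINITE_LEASE:
        raise ValueError("lease time out of range for a finite lease")
    return seconds


def default_timers(lease_seconds: int) -> tuple[int, int]:
    """Return the RFC 2131 default (T1, T2): 50% and 87.5% of the lease."""
    if not 1 <= lease_seconds <= MAX_FINITE_LEASE:
        raise ValueError("this sketch only computes timers for finite leases")
    return lease_seconds // 2, lease_seconds * 7 // 8


if __name__ == "__main__":
    for days in (90, 120):
        lease = days_to_lease_seconds(days)
        t1, t2 = default_timers(lease)
        print(days, "days:", lease, "s lease, T1 =", t1, "s, T2 =", t2, "s")
```

With these defaults a 90-day lease (7,776,000 seconds) would not be
renewed until day 45. A server that sends explicit T1/T2 values (DHCP
options 58 and 59) can shorten that without shortening the lease.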
Third and, it seems, completely non-obvious, is to vet the use of DHCP
client software which implements RFC2131 section 3.2, "reusing a
previously allocated network address", specifically reading between
the lines on how to optimize this process for a non-nomadic lifestyle.
This non-nomadic interpretation of this and section 3.7's SHOULD
(don't) lets a client _immediately_ use any previous valid lease upon
rebooting, although it probably SHOULD also attempt to contact a DHCP
server in parallel and reconfigure if necessary. This effectively
makes DHCP during the boot sequence a non-blocking operation. My
memory is that ISC dhclient is very optimized for the nomadic
lifestyle, such that it is not capable of operating in this server-
farm-desirable fashion*. Improvement would be trivial.
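
The non-blocking behavior described above could be sketched roughly as
follows. This is not how dhclient works; it is a minimal illustration
in Python, and every name in it (Lease, boot_configure, apply_config,
confirm_with_server) is hypothetical. The idea is only: apply a stored,
still-valid lease at once, then talk to the server in the background
and reconfigure if it answers with something different.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Lease:
    address: str
    expires_at: float  # absolute expiry, seconds since the epoch


def boot_configure(
    stored: Optional[Lease],
    apply_config: Callable[[Lease], None],
    confirm_with_server: Callable[[Optional[Lease]], Optional[Lease]],
    now: Callable[[], float] = time.time,
) -> threading.Thread:
    """Apply a still-valid stored lease at once, then confirm in the background.

    Returns the background thread so that callers can join it if they wish.
    """
    applied: Optional[Lease] = None
    if stored is not None and stored.expires_at > now():
        apply_config(stored)  # boot continues immediately on the old lease
        applied = stored

    def confirm() -> None:
        # May take a long time, or return None if no server ever answers.
        fresh = confirm_with_server(stored)
        if fresh is not None and fresh != applied:
            apply_config(fresh)  # reconfigure only if the answer differs

    worker = threading.Thread(target=confirm, daemon=True)
    worker.start()
    return worker
```

The boot sequence never waits on the worker thread, so a dead DHCP
service only matters to a machine with no valid stored lease.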
Fourth, use DHCP client identification to match the individual
service. In this way, the hard drive inside a server, with its
generated (RFC4361) or configured client id, consistently identifies
itself for resources like dynamic DNS, and its IPv4 addresses.
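
As an illustration of a generated client id, here is a short Python
sketch. It assumes the RFC 4361 layout (type octet 255, a 4-byte IAID,
then a DUID) and a DUID-LLT as defined for DHCPv6 (type 1, hardware
type, time in seconds since 2000-01-01 UTC, link-layer address). The
function names are hypothetical. The point is that the value is
generated once and stored on disk, so it follows the installed system
rather than whichever NIC happens to be fitted today.

```python
import struct

DUID_LLT = 1               # DUID type: link-layer address plus time
HW_TYPE_ETHERNET = 1       # ARP hardware type for Ethernet
CLIENT_ID_TYPE_DUID = 255  # RFC 4361 client-identifier type octet


def make_duid_llt(mac: bytes, seconds_since_2000: int) -> bytes:
    """Build a DUID-LLT from the MAC present when the id was first generated."""
    header = struct.pack(
        "!HHI", DUID_LLT, HW_TYPE_ETHERNET, seconds_since_2000 & 0xFFFFFFFF
    )
    return header + mac


def make_client_id(iaid: int, duid: bytes) -> bytes:
    """Build the DHCPv4 client-identifier option payload: 255, IAID, DUID."""
    return struct.pack("!BI", CLIENT_ID_TYPE_DUID, iaid) + duid


if __name__ == "__main__":
    # Example MAC (locally administered) and timestamp, chosen arbitrarily.
    duid = make_duid_llt(bytes.fromhex("020000000001"), 1)
    print(make_client_id(0, duid).hex())
```

The bytes would be written to stable storage on first boot and reused
on every later boot, even after a NIC swap.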
In this way DHCP can help an operator make large changes such as
network re-addressing or complete domain renaming (along with all the
little changes, such as nameserver, domain-search, or ntp changes)
without resorting to brute force and without introducing a blocking
event in the system startup sequence. The risks and rewards are
identical to those of ethernet autonegotiation: the software itself
can have a fault, just like the (possibly upgraded) firmware on either
side of your ethernet cable. That is always a risk, but in exchange
for trusting the code you gain the many advantages of automation.
* I think dhclient will enter REBOOT state and start sending requests,
without applying the most recent active lease to configuration. If
this times out, it will enter INIT state and try to DISCOVER. If
this times out, it will finally fork and continue to try to get a
lease, but it still never applies the active lease information. The
desirable outcome is to continue in REBOOTING state indefinitely,
and to consume the old configuration immediately and fork. But this
is not optimal in a nomadic system (such as a laptop or phone), in
which case you really don't want to tell the system you are
configured (and the user can get net) until you're done; the last
lease is very probably wrong. The only difficult point is how
dhclient could know whether it is nomadic or not.
Ash bugud-gul durbatuluk agh burzum-ishi krimpatul.
Why settle for the lesser evil? https://secure.isc.org/store/t-shirt/
David W. Hankins                   "If you don't do it right the first time,
Software Engineer                   you'll just have to do it again."
Internet Systems Consortium, Inc.                      -- Jack T. Hankins