Failover strangeness

Bruce Hudson Bruce.Hudson at Dal.Ca
Mon Oct 16 21:24:55 UTC 2006

>    I guess the frustrating part of this is that I have to take this 
> manual action for the failover to work correctly.  That's problematic.

    This is really the immediate effect of only having two servers. It
is impossible for a given server to tell the difference between the
"communications interrupted" state, in which the servers cannot talk to
each other but clients can still see both, and "partner down". The fact
that you can also have a partitioned network where some clients see one
and some see another simply adds to the problem. The only safe thing to
do is to assume the other server is still there until somebody, in this
case the system manager, breaks the tie.
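
    As a rough illustration of that tie-breaking rule, here is a minimal
Python sketch. It is a deliberate simplification and not the actual dhcpd
state machine: the state names follow the ones used above, while
next_state and the operator_says_peer_down flag are hypothetical names
made up for the example. The point is only that losing contact never
leads to "partner down" on its own.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    COMMUNICATIONS_INTERRUPTED = "communications-interrupted"
    PARTNER_DOWN = "partner-down"


def next_state(state, peer_reachable, operator_says_peer_down=False):
    """Simplified two-server model (hypothetical, for illustration only).

    A server that loses contact with its peer cannot tell a dead peer
    from a broken link, so it only enters PARTNER_DOWN when a human
    explicitly says the peer is down.
    """
    if peer_reachable:
        # Contact restored; the real protocol has recovery steps that
        # this sketch skips.
        return State.NORMAL
    if operator_says_peer_down or state is State.PARTNER_DOWN:
        # The tie was broken by the operator, now or earlier.
        return State.PARTNER_DOWN
    # No contact and no operator decision: assume the peer may be alive.
    return State.COMMUNICATIONS_INTERRUPTED
```

    In this model, next_state(State.NORMAL, peer_reachable=False) stays
in "communications interrupted" no matter how long the outage lasts.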

    The other answer (going back to the old days of VMS clusters) is to
have 3 or more servers running. Then you simply insist on a quorum of
"half+1" to continue operating. Servers not in the quorum stop and any
in the quorum can safely assume the others are down. However, I am not
sure the DHCP fail-over works with more than two servers.
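
    The "half+1" rule can be sketched in a few lines of Python. This is
only an illustration of the arithmetic, not anything DHCP fail-over is
known to implement; quorum_size and may_continue are hypothetical names.

```python
def quorum_size(total_servers):
    """Smallest group that is a strict majority: half (rounded down) plus one."""
    return total_servers // 2 + 1


def may_continue(total_servers, reachable_including_self):
    """A server keeps operating only if the group it can see, itself
    included, reaches the quorum."""
    return reachable_including_self >= quorum_size(total_servers)
```

    With three servers the quorum is two, so after a 2/1 split the pair
keeps running and the isolated server stops. With only two servers the
quorum is also two, so a server that loses its partner can never reach
it by itself, which is why somebody has to break the tie by hand.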

> Also: I'm very hesitant to write OMAPI documentation myself considering
> our long-term goal is to phase out OMAPI in favor of a more entrenched,
> operational, open management framework (candidates are things like
> dBus or IETF's zeroconf).

    Can you be more specific on this? We use OMAPI quite heavily to add
and remove clients from our DHCP servers so I am interested in what you
are looking to replace it with. I took a quick look at zeroconf and it
does not appear to be doing the same job at all. I am either missing some
part of it or in desperate need of a paradigm shift.

Bruce A. Hudson				| Bruce.Hudson at Dal.CA
UCIS, Networks and Systems		|
Dalhousie University			|
Halifax, Nova Scotia, Canada		| (902) 494-3405
