Multi-subnet/vlan and failover

Sat May 11 00:05:23 UTC 2013

So, yes, I did have a VLAN leak. [eek - not enough sleep, too little
thinking!]

But that's resolved now - thanks for the tip.
So, now I have failover working, as well as VLAN/Multi-segment. [Very
nice.]

I must say "Thanks!!" to all those who do the work on this product.
It's a core piece of virtually every network and, like most IT work,
you never get credit when it works unobtrusively without a bunch of
babying. But you can guarantee that when it doesn't work, people
haven't forgotten where to whine!

---
But the split values and lease-balancing are a topic I'd like to
discuss further...

I'm happy to start a new thread, but since we started discussing here,
I thought it might make sense to continue. Google should find it in a
search in any case...

---
So the relevant params for address recovery etc seem to be:
mclt - which is only _somewhat_ comprehensible to me.
[I see it's the maximum lease time for any lease when in partner-down
state (PDS) - but I don't understand what it has to do with recovery
of leases in PDS.]

But if I thought that was bad, I really don't grok:
max-lease-misbalance
max-lease-ownership
min-balance
max-balance

At least not really.
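
For anyone following along, here is a rough sketch of where those four
statements would sit in the primary's failover declaration. The peer
name and addresses are just the ones from my config quoted further
down. The values are what I believe the dhcpd.conf(5) man page gives
as defaults; treat them, and my one-line summaries in the comments, as
assumptions to check against the man page for your version.

```
failover peer "dhcp-failover" {
  primary;
  address 10.1.1.1;
  port 647;
  peer address 10.1.1.2;
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  split 128;

  # Assumed defaults, written out explicitly (verify in dhcpd.conf(5)):
  max-lease-misbalance 15;  # percent; imbalance that triggers a rebalance
  max-lease-ownership 10;   # percent; limit on leases a server holds on to
  min-balance 60;           # seconds; soonest a rebalance may be scheduled
  max-balance 3600;         # seconds; latest a rebalance may be scheduled
}
```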

---
Is there some layman, dumb-oaf version of what happens when one of the
partner servers runs out of leases? [Like Thag just stumbled into
your data center and was looking for a job configuring DHCP servers!?
:) ]

I've read the section several times, and really get fairly lost.

Here's how I understand it.
In short, as the master/peer hand out addresses, they split the
addresses 50/50. [with a few exceptions]
They then hand out addresses and try to balance the free address pool
between master and peer so the two stay roughly equal.

When the system detects that it may run out of addresses on either the
master or the peer [over X time-frame], it tries to re-balance the
free leases again to meet a 50/50 split [again with some exceptions
too complicated to finish explaining in the next few hours or so.]

Does this generally sound right?
---

But does mclt have anything to do with lease re-balancing? [The
description seems to indicate it does, but after reading it multiple
times, I don't really think it does.]

---
So, as a final thought: what kinds of situations would put you at risk
of having a wildly mis-balanced pool and running out of addresses on
the master or peer - where the system wouldn't "automagically"
re-balance to save itself?

What settings would help in this regard, and what values might one
pick?

I'd guess this discussion has occurred before, so I'm more than glad to
be pointed at a thread somewhere and do the slog to read it and see if
that helps.

Sorry for the long post and thanks in advance for your help!

-Greg

SC> No, regardless of the split the leases will still be shared 50/50 with
SC> both servers, so you could still run into an issue where the secondary
SC> runs out of addresses. When both servers are online and one is running
SC> low on leases they will rebalance the lease pool and share the
SC> remaining leases 50/50. (This bit really needs to be documented better
SC> as lots of people fall into that trap)

SC> 255 would make the primary respond to all requests when both systems
SC> are online. When the primary goes offline you will have a limited
SC> amount of time before the leases will be depleted, at which point you
SC> will need to tell the secondary that its partner is down and the
SC> secondary will then assume control of the full lease pools.
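
As a sketch of what that looks like in the primary's failover
declaration (only the split line matters here; the meanings in the
comments are taken from the replies in this thread, not something I've
verified independently):

```
failover peer "dhcp-failover" {
  primary;
  # address, port, peer address, peer port, mclt etc. unchanged from
  # the config quoted further down

  split 255;    # primary answers all requests while both peers are up
  # split 128;  # balanced: both servers respond to lease requests
  # split 0;    # secondary answers all lease requests
}
```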

SC> My general advice to anyone using DHCP failover is if either of the
SC> systems is going to be out for longer than the period of your smallest
SC> lease time then set the partner to be down as once that minimum lease
SC> time is up you will already have started eating into additional
SC> leases.

SC> On 10 May 2013 08:58, Gregory Sloop <gregs at sloop.net> wrote:
>> It might be, it is a test environment - but I didn't think I had
>> anything that whacked.
>>
>> I'll do some more testing the next chance I get. Any other ideas are
>> more than welcome.
>>
>> ---
>> As for split - I generally intend for all requests to be handled by
>> the primary and only fail to the peer. [Fail-over only, no
>> load-balance]
>>
>> I'm not sure if that's the best idea - but it seems more
>> straightforward. (Essentially my worry is if the blocks are split and
>> a peer goes down, could we run out of addresses in the block for the
>> "up" server before reclaiming them from the "down" server. I suspect
>> this worry is mostly because I don't fully grasp how it is handling
>> things, despite reading the docs - but not as carefully as I probably
>> need to do.)
>>
>> [So, I assume a split of 255 would then make it do what I want, having
>> all requests served by the primary - instead of load-balance, right?]
>>
>>
>> -Greg
>>
>>
>> SC> Sounds like you have a leak in your network and broadcast packets are
>> SC> leaking from one VLAN into another.
>>
>> SC> One other thing, is there a reason you are using "split 0;"? This
>> SC> would mean the secondary peer will answer all lease requests. For a
>> SC> balanced approach you should use 128 which will allow both DHCP
>> SC> servers to respond to lease requests.
>>
>> SC> On 10 May 2013 08:19, Gregory Sloop <gregs at sloop.net> wrote:
>>>> As a follow-up, because it may well impact the answer to my duplicate
>>>> DHCPOFFER issue, let me describe how the DHCP servers are connected in
>>>> relation to VLANS etc.
>>>>
>>>> The DHCP Servers are on VLAN1, say 10.1.1.11/10.1.1.12 [master/peer]
>>>>
>>>> The L3 switch is configured to forward dhcp sessions to 10.1.1.11 and
>>>> 10.1.1.12
>>>>
>>>> ---
>>>> The duplicate messages are seen on DHCP negotiations from VLAN3 [and, I assume VLAN2]
>>>>
>>>> But I have not tested VLAN1 or VLAN2 attached clients to see what
>>>> happens on those VLANs.
>>>>
>>>> TIA for any assistance!
>>>>
>>>> -Greg
>>>>
>>>> GS> @Kyle
>>>> GS> Yes, that's it exactly. Thanks!
>>>>
>>>> GS> ---
>>>> GS> I did find a post about putting it in a pool block after posting
>>>> GS> my query, just about the time you posted your response - but
>>>> GS> hadn't had a chance to test it - so that's great. It now works.
>>>>
>>>> GS> BUT...
>>>> GS> When I run it, I see odd stuff [running dhcpd in -d -f
>>>> GS> debug/foreground mode]...
>>>>
>>>> GS> ---
>>>> GS> I see a pair of DHCPDISCOVERs
>>>>
>>>> GS> One from ETH0 and the other from the IP/DHCP helper on the L3 switch.
>>>> GS> i.e.
>>>> GS> DHCPDISCOVER from so:me:ma:ca:dd:rs on eth0
>>>> GS> DHCPDISCOVER from so:me:ma:ca:dd:rs on 10.1.2.1
>>>> GS> [This second one is the layer 3 switch, which is forwarding the DHCP session to the DHCP server]
>>>>
>>>> GS> Then dhcpd makes two offers - one on 10.1.1.X and one on 10.1.2.X
>>>> GS> Since the station isn't on the 10.1.1.X VLAN and *is* on the 10.1.2.X
>>>> GS> VLAN it "accepts" the 10.1.2.X address and it "works."
>>>>
>>>> GS> But I'm sure it's not supposed to be this way.
>>>> GS> [And I'm pretty sure I'm doing something obvious and perhaps
>>>> GS> stupid, but I just don't know where to look or what to try.]
>>>>
>>>> GS> How do I go about making it only see the forwarded DHCP session
>>>> GS> and not the one on eth0 [or some other option I'm simply not aware of...]
>>>>
>>>> GS> ---
>>>>
>>>> GS> -Greg
>>>>
>>>>
>>>> GS> Are you looking for something like this?
>>>>
>>>> GS> subnet 172.21.27.0 netmask 255.255.255.0 {
>>>> GS>   option subnet-mask 255.255.255.0;
>>>> GS>   option broadcast-address 172.21.27.255;
>>>> GS>   option routers 172.21.27.1;
>>>> GS>   ddns-domainname "example.com.";
>>>> GS>   option domain-search "example.com";
>>>> GS>   pool {
>>>> GS>     failover peer "dhcp-failover";
>>>> GS>     range 172.21.27.5 172.21.27.254;
>>>> GS>   }
>>>> GS> }
>>>>
>>>>
>>>> GS> On Thu, May 9, 2013 at 8:08 PM, Gregory Sloop <gregs at sloop.net> wrote:
>>>> GS> So, I've done a fair bit of reading and searching - and this general
>>>> GS> template is what I thought would work, but it doesn't.
>>>>
>>>> GS> Let me post the dhcpd.conf file and then discuss what's wrong and ask
>>>> GS> for pointers.
>>>>
>>>> GS> ---
>>>> GS> authoritative;
>>>> GS> #ddns-update-style interim;
>>>> GS> ignore client-updates;
>>>> GS> #option host-name = config-option server.ddns-hostname;
>>>>
>>>> GS> #include "/etc/rndc.key";
>>>>
>>>> GS> option domain-name              "somedom.local";
>>>> GS> option domain-name-servers      10.1.1.190,10.1.2.1,10.1.1.17;
>>>> GS> option time-offset              -28800; # Pacific Standard Time (UTC-8)
>>>> GS> option ntp-servers              10.1.1.14;
>>>> GS> one-lease-per-client off;
>>>>
>>>> GS> #4 hour lease
>>>> GS> default-lease-time 14400;
>>>> GS> max-lease-time 14400;
>>>> GS> option ip-forwarding off;
>>>>
>>>> GS> failover peer "dhcp-failover" {
>>>> GS>   primary; # declare this to be the primary server
>>>> GS>   # Address of THIS DHCP server, or what address to listen ON
>>>> GS>   address 10.1.1.1;
>>>> GS>   port 647;
>>>> GS>   # Address of the DHCP fail-over peer.
>>>> GS>   peer address 10.1.1.2;
>>>> GS>   peer port 647;
>>>> GS>   max-response-delay 60;
>>>> GS>   max-unacked-updates 10;
>>>> GS>   #load balance max seconds 3;
>>>> GS>   mclt 3600;
>>>> GS>   split 0;
>>>> GS> }
>>>>
>>>> GS>     subnet 10.1.1.0 netmask 255.255.255.0 {
>>>> GS>         range 10.1.1.1 10.1.1.254;
>>>> GS>         option routers                  10.1.1.1;
>>>> GS>         option subnet-mask              255.255.255.0;
>>>> GS>         failover peer "dhcp-failover";
>>>> GS>     }
>>>>
>>>> GS>     subnet 10.1.2.0 netmask 255.255.255.0 {
>>>> GS>         range 10.1.2.1 10.1.2.254;
>>>> GS>         option routers                  10.1.2.1;
>>>> GS>         option subnet-mask              255.255.255.0;
>>>> GS>         failover peer "dhcp-failover";
>>>> GS>     }
>>>>
>>>> GS>     subnet 10.1.3.0 netmask 255.255.255.0 {
>>>> GS>         range 10.1.3.1 10.1.3.254;
>>>> GS>         option routers                  10.1.3.1;
>>>> GS>         option subnet-mask              255.255.255.0;
>>>> GS>         failover peer "dhcp-failover";
>>>> GS>     }
>>>>
>>>> GS> ---
>>>> GS> Now, I've disabled DDNS updates for simplicity's sake. Once I get the
>>>> GS> multi-subnet/VLAN setup and failover working I'll add that back.
>>>>
>>>> GS> Perhaps that impacts things somehow, so if you'll keep that in mind,
>>>> GS> I'd appreciate it.
>>>>
>>>> GS> So, when I try this config I get an error saying that a failover needs
>>>> GS> to be inside a shared network block.
>>>>
>>>> GS> But if I do that, I've been told [read] that the DHCP server won't
>>>> GS> know how to assign the different subnets. [This would apply to a
>>>> GS> network where I wanted to share all the 10.1.1.1-10.1.3.254 as a
>>>> GS> single pool/block and assign any station any IP in the whole block.]
>>>>
>>>> GS> But I have a L3 switch and I want these assigned to each VLAN.
>>>>
>>>> GS> ---
>>>> GS> So, I set up the conf file without a shared-network and it works fine
>>>> GS> with the L3 DHCP helper/proxy. Clients on VLAN1 get 10.1.1.0 blocks
>>>> GS> and VLAN2 get 10.1.2.0 blocks etc.
>>>>
>>>> GS> So, with the "failover" block commented out, it works charmingly! Very
>>>> GS> cool!
>>>>
>>>> GS> ---
>>>> GS> But I *also* want to use failover.
>>>>
>>>> GS> And when I put in a fail-over outside a shared-network, it complains
>>>> GS> that it must be inside a shared network.
>>>>
>>>> GS> So, how do I use fail-over AND maintain the subnet grouping above?
>>>>
>>>> GS> ---
>>>> GS> I'll keep reading, but I've tinkered with this quite a bit and for the
>>>> GS> life of me, I can't see how one would go about it.
>>>>
>>>> GS> -Greg
>>>>
>>>>
>>>>
>>>> --
>>>> Gregory Sloop, Principal: Sloop Network & Computer Consulting
>>>> Voice: 503.251.0452 x82
>>>> EMail: gregs at sloop.net
>>>> http://www.sloop.net
>>>> ---
>>>>
>>>> _______________________________________________
>>>> dhcp-users mailing list
>>>> dhcp-users at lists.isc.org
>>>> https://lists.isc.org/mailman/listinfo/dhcp-users

-- 
Gregory Sloop, Principal: Sloop Network & Computer Consulting
Voice: 503.251.0452 x82
EMail: gregs at sloop.net
http://www.sloop.net
---