Secondary server in failover fails to come out of recover state

Steven Carr sjcarr at gmail.com
Mon May 13 18:20:03 UTC 2013


So the only thing I can think of at this stage is that the secondary
is failing to synchronise after it comes back online, which is why
it's staying in recovery mode.

Could there be anything blocking port 520? Do these boxes have SELinux
running? When this happens again can you run a packet capture on the
primary and secondary and see what is being sent/received to port 520?
You could also try changing this port? (the ports which seem to
recommended these days are 647 for the primary and 847 for the
secondary)

Instead of issuing a restart, maybe do a shutdown, wait ~15 seconds
and then startup dhcpd, just incase something is holding open the port
so when DHCP comes back up it can't open the port, I would expect this
to be logged somewhere but if only the team that developed BIND
logging channels spent some time implementing detailed logging
channels on DHCPD...

HTH


On 13 May 2013 19:05, Oscar Ricardo Silva <osilva at scuff.cc.utexas.edu> wrote:
> Regarding what happens in the logs at 15:20, there's nothing on the primary
> other than incoming requests from clients.  On the secondary I saw:
>
>
> 15:15:47 secondary-dhcp dhcpd: failover peer dhcp: I move from recover to
> startup
> 15:15:47 secondary-dhcp dhcpd: failover peer dhcp: I move from startup to
> recover
> 15:15:47 secondary-dhcp dhcpd: Sent update request all message to dhcp
> 15:18:33 secondary-dhcp dhcpd: dhcp: ignored (recovering)
> 15:20:02 secondary-dhcp dhcpd: dhcp: ignored (recovering)
>
>
> meanwhile, requests were coming in from the clients. Here's a fuller picture
> of that section of the log where you'll see the secondary continues to note
> that it's recovering.
>
>
> 15:20:01 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.199.109 from
> 00:50:56:92:00:5d (NUR-022) via em2: not responding (recovering)
> 15:20:01 secondary-dhcp dhcpd: DHCPREQUEST for 192.16864.218 from
> 00:80:77:78:09:11 (BRN_780911) via em2: not responding (recovering)
> 15:20:01 secondary-dhcp dhcpd: DHCPINFORM from 192.168.186.103 via
> 192.168.166.250
> 15:20:01 secondary-dhcp dhcpd: DHCPACK to 192.168.186.103
> (00:1e:c9:37:43:1f) via em2
> 15:20:01 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.199.218 from
> 00:50:56:92:00:45 (LAW-KSK-006) via em2: not responding (recovering)
> 15:20:01 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.199.162 from
> 00:50:56:92:0d:b6 (FAC227-010) via em2: not responding (recovering)
> 15:20:01 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.199.128 from
> 00:50:56:92:03:1e (FAC-027) via em2: not responding (recovering)
> 15:20:02 secondary-dhcp dhcpd: DHCPINFORM from 192.168.194.238 via
> 192.168.194.129
> 15:20:02 secondary-dhcp dhcpd: DHCPACK to 192.168.194.238
> (00:18:8b:6e:32:0f) via em2
> 15:20:02 secondary-dhcp dhcpd: DHCPDISCOVER from 00:0b:db:59:f9:80 via
> 192.168.39.97: not responding (recovering)
> 15:20:02 secondary-dhcp dhcpd: dhcp: ignored (recovering)
> 15:20:02 secondary-dhcp dhcpd: DHCPDISCOVER from d0:d0:fd:2a:01:57
> (ATS-ACLP-UTDELL) via 192.168.226.1: not responding (recovering)
> 15:20:02 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.226.19 (172.16.200.2)
> from d0:d0:fd:2a:01:57 (ATS-ACLP-UTDELL) via 192.168.226.1: not responding
> (recovering)
> 15:20:02 secondary-dhcp dhcpd: DHCPDISCOVER from 00:0d:56:ee:13:94 via
> 192.168.180.1: not responding (recovering)
> 15:20:02 secondary-dhcp dhcpd: DHCPINFORM from 192.168.176.131 via
> 192.168.176.1
> 15:20:02 secondary-dhcp dhcpd: DHCPACK to 192.168.176.131
> (00:24:e8:fc:43:d2) via em2
> 15:20:02 secondary-dhcp dhcpd: DHCPINFORM from 192.168.222.121 via
> 192.168.82.129
> 15:20:02 secondary-dhcp dhcpd: DHCPACK to 192.168.222.121
> (00:23:ae:a1:69:38) via em2
> 15:20:02 secondary-dhcp dhcpd: DHCPINFORM from 192.168155.193 via
> 192.168155.129
> 15:20:02 secondary-dhcp dhcpd: DHCPACK to 192.168155.193 (00:26:6c:25:47:97)
> via em2
> 15:20:02 secondary-dhcp dhcpd: DHCPREQUEST for 192.16873.186 from
> 00:1f:5b:ea:97:36 via 192.16873.129: not responding (recovering)
> 15:20:02 secondary-dhcp dhcpd: BOOTREQUEST from 2c:41:38:7c:c6:e0 via
> 192.168.164.250: BOOTP from dynamic client and no dynamic leases
> 15:20:02 secondary-dhcp dhcpd: DHCPINFORM from 192.168.242.159 via
> 192.168.242.250
> 15:20:02 secondary-dhcp dhcpd: DHCPACK to 192.168.242.159
> (60:eb:69:32:93:2e) via em2
> 15:20:03 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.199.31 from
> 00:50:56:92:01:70 (FAC227-011) via em2: not responding (recovering)
> 15:20:03 secondary-dhcp dhcpd: DHCPINFORM from 192.168119.9 via
> 192.16896.129
> 15:20:03 secondary-dhcp dhcpd: DHCPACK to 192.168119.9 (d4:85:64:b5:46:f1)
> via em2
> 15:20:04 secondary-dhcp dhcpd: DHCPINFORM from 192.168.44.81 via
> 192.168.44.1
> 15:20:04 secondary-dhcp dhcpd: DHCPACK to 192.168.44.81 (00:23:ae:55:f4:86)
> via em2
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.199.220 from
> 00:50:56:92:00:9c (LAW-046) via em2: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 172.16.226.236 from
> 00:26:bb:5f:96:12 (fac-809982) via 192.168.165.129: not responding
> (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPDISCOVER from 78:45:c4:0b:43:e3
> (heco-pelab03) via 192.168.64.250: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPDISCOVER from 00:26:bb:5f:96:12 via
> 192.168.165.129: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPINFORM from 192.168.191.83 via
> 192.168.191.1
> 15:20:04 secondary-dhcp dhcpd: DHCPACK to 192.168.191.83 (d4:be:d9:c6:0c:f7)
> via em2
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.213.218 from
> 90:b1:1c:64:6d:d3 (CM-M3310-867716) via em2: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPDISCOVER from 78:45:c4:1e:55:03
> (HECO-PELAB08) via 192.168.64.250: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.169.111 from
> 00:50:56:92:0a:fb (CSE-LAB-008) via em2: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.198.201 from
> 00:1f:f3:45:36:7a (LA-ITS827888WD) via em2: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPINFORM from 192.168.186.43 via
> 192.168.166.250
> 15:20:04 secondary-dhcp dhcpd: DHCPACK to 192.168.186.43 (00:22:19:1c:c8:1b)
> via em2
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.16873.186 from
> 00:1f:5b:ea:97:36 via 192.16873.129: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.55.189 from
> 3c:07:54:1c:c8:09 via 192.168.55.250: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.199.120 from
> 00:50:56:92:0d:a6 (NUR-045) via em2: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.67.180 from
> 40:6c:8f:3f:84:50 via 192.168.67.250: not responding (recovering)
> 15:20:04 secondary-dhcp dhcpd: DHCPREQUEST for 192.168.67.180 from
> 40:6c:8f:3f:84:50 via 192.168.67.250: not responding (recovering)
> ...
> 15:29:54 secondary-dhcp dhcpd: DHCPDISCOVER from 00:0a:95:36:5c:93
> (RUNIN-TOP) via 192.168.230.1: not responding (recovering)
> 15:29:59 secondary-dhcp dhcpd: DHCPDISCOVER from 00:0a:95:35:f3:77 (Xserve)
> via 192.168.93.193: not responding (recovering)
>
>
>
>
> On 05/13/2013 12:43 PM, Oscar Ricardo Silva wrote:
>>
>> Right before 15:09 we made a configuration change on the primary, adding
>> a new network definition. In order for that new network to be active we
>> had to restart dhcpd on the primary.  The portion of the config file
>> holding all the network definitions was copied to the secondary. Once
>> both servers were online and "normal", dhcpd was restarted on the
>> secondary so it would then be able to provide services on that new
>> network.
>>
>> Maybe we're doing something wrong here but it's my understanding that
>> after a configuration change, dhcpd has to be restarted.
>>
>> Just in case it's a timing issue I've told coworkers to wait an
>> additional 10 minutes after the two servers go into "normal" before
>> restarting the primary.
>>
>>
>>
>>
>> On 05/11/2013 06:51 AM, Steven Carr wrote:
>>>
>>> What happens in the logs at 15:20 when the servers should then have
>>> passed the MCLT time and come out of recovery?
>>>
>>> Why put the system into partner-down at 15:14, what was the reasoning
>>> behind this given both servers were online?
>
> ...
>
> _______________________________________________
> dhcp-users mailing list
> dhcp-users at lists.isc.org
> https://lists.isc.org/mailman/listinfo/dhcp-users


More information about the dhcp-users mailing list