[Kea-users] Failed to Sync lease updates from primary to backup in Hot-standby mode in V1.5.0

luckydog xf luckydogxf at gmail.com
Wed Apr 8 03:44:23 UTC 2020


The  root cause is addressed.

2020-04-08 09:30:33.136 WARN  [kea-dhcp4.ha-hooks/29601]
HA_LEASE_UPDATE_FAILED [hwtype=1 00:15:65:98:c1:85],
cid=[01:00:15:65:98:c1:85], tid=0xe088765b: lease update to server2 (
http://172.16.232.20:8080/) failed: The address 172.20.131.221 does not
belong to subnet 172.20.131.0/25, subnet-id=18, error code 1
2020-04-08 09:30:33.529 INFO  [kea-dhcp4.leases/29601] DHCP4_LEASE_ALLOC
[hwtype=1 80:5e:c0:4c:28:52], cid=[01:80:5e:c0:4c:28:52], tid=0x45d82217:
lease 172.20.137.66 has been allocated
2020-04-08 09:30:33.530 WARN  [kea-dhcp4.ha-hooks/29601]
HA_LEASE_UPDATE_FAILED [hwtype=1 00:15:65:90:e9:10],
cid=[01:00:15:65:90:e9:10], tid=0x17253f72: lease update to server2 (
http://172.16.232.20:8080/) failed: The address 192.168.104.24 does not
belong to subnet 172.20.127.0/24, subnet-id=14, error code 1

Because Master/slave node get different Subnet  sequence   . So it makes
subnet id inconsistent.

Just make sure master/slave get the same sequence of subnet.

Just keep the similar config file of master and slave node.

On Tue, Apr 7, 2020 at 4:10 PM luckydog xf <luckydogxf at gmail.com> wrote:

> Sorry, it did not sync only once. It syned periodically.
>
> ###    [root at keaslave1 ~]# grep synchronization
>  /opt/kea-1.5.0/var/log/kea-dhcp4.log  ####
>
> 2020-04-07 15:43:36.667 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_START
> starting lease database synchronization with server1
> 2020-04-07 15:43:46.667 ERROR [kea-dhcp4.ha-hooks/2498] HA_SYNC_FAILED
> lease database synchronization with server1 failed: Connection timed out
> 2020-04-07 15:43:57.211 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_START
> starting lease database synchronization with server1
> 2020-04-07 15:43:57.266 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_SUCCESSFUL
> lease database synchronization with server1 completed successfully in
> 55.718 ms
> 2020-04-07 15:44:30.241 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_START
> starting lease database synchronization with server1
> 2020-04-07 15:44:40.241 ERROR [kea-dhcp4.ha-hooks/2498] HA_SYNC_FAILED
> lease database synchronization with server1 failed: Connection timed out
> 2020-04-07 15:44:50.255 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_START
> starting lease database synchronization with server1
> 2020-04-07 15:45:00.255 ERROR [kea-dhcp4.ha-hooks/2498] HA_SYNC_FAILED
> lease database synchronization with server1 failed: Connection timed out
> 2020-04-07 15:45:10.792 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_START
> starting lease database synchronization with server1
> 2020-04-07 15:45:10.848 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_SUCCESSFUL
> lease database synchronization with server1 completed successfully in
> 55.661 ms
> 2020-04-07 15:56:53.608 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_START
> starting lease database synchronization with server1
> 2020-04-07 15:56:53.664 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_SUCCESSFUL
> lease database synchronization with server1 completed successfully in
> 56.517 ms
> 2020-04-07 15:57:16.573 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_START
> starting lease database synchronization with server1
> 2020-04-07 15:57:16.630 INFO  [kea-dhcp4.ha-hooks/2498] HA_SYNC_SUCCESSFUL
> lease database synchronization with server1 completed successfully in
> 56.862 ms
> #
>
> The point of sync were 15:43, 15:44, 14:45( retried due to connection
> timeout) and 15:56/57(retried again).
>
> The interval is around 10 minutes, is is normal ?
>
> On Tue, Apr 7, 2020 at 2:42 PM luckydog xf <luckydogxf at gmail.com> wrote:
>
>> Thanks, I double check it.
>> I set both server's clock and restart services. Now it seems it begins to
>> sync lease data.
>>
>>
>> But in Standby node, it just synced  only once.
>> ------------------------
>> 2020-04-07 14:38:06.230 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_STATE_TRANSITION server transitions from WAITING to SYNCING state,
>> partner state is HOT-STANDBY
>> 2020-04-07 14:38:06.230 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_LEASE_UPDATES_DISABLED lease updates will not be sent to the partner
>> while in SYNCING state
>> 2020-04-07 14:38:06.230 INFO  [kea-dhcp4.ha-hooks/1754] HA_SYNC_START
>> starting lease database synchronization with server1
>> 2020-04-07 14:38:06.279 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_LEASES_SYNC_LEASE_PAGE_RECEIVED received 500 leases from server1
>> 2020-04-07 14:38:06.286 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_SYNC_SUCCESSFUL lease database synchronization with server1 completed
>> successfully in 55.856 ms
>> 2020-04-07 14:38:06.286 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_STATE_TRANSITION server transitions from SYNCING to READY state, partner
>> state is HOT-STANDBY
>> 2020-04-07 14:38:06.286 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_LEASE_UPDATES_DISABLED lease updates will not be sent to the partner
>> while in READY state
>> 2020-04-07 14:38:06.286 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_STATE_TRANSITION server transitions from READY to HOT-STANDBY state,
>> partner state is HOT-STANDBY
>> 2020-04-07 14:38:06.286 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_LEASE_UPDATES_ENABLED lease updates will be sent to the partner while in
>> HOT-STANDBY state
>> 2020-04-07 14:38:06.286 INFO  [kea-dhcp4.ha-hooks/1754]
>> HA_LOCAL_DHCP_ENABLE local DHCP service is enabled while the server2 is in
>> the HOT-STANDBY state
>> 2020-04-07 14:38:13.399 INFO  [kea-dhcp4.commands/1754] COMMAND_RECEIVED
>> Received command 'ha-heartbeat'
>> 2020-04-07 14:38:24.413 INFO  [kea-dhcp4.commands/1754] COMMAND_RECEIVED
>> Received command 'ha-heartbeat'
>> 2020-04-07 14:38:29.239 INFO  [kea-dhcp4.commands/1754] COMMAND_RECEIVED
>> Received command 'lease4-update'
>> 2020-04-07 14:38:29.239 ERROR [kea-dhcp4.callouts/1754]
>> HOOKS_CALLOUT_ERROR error returned by callout on hook 2 registered by
>> library with index $lease4_update (callout address 0x7faefd365ca0) (callout
>> duration 0.065 ms)
>> 2020-04-07 14:38:33.454 INFO  [kea-dhcp4.commands/1754] COMMAND_RECEIVED
>> Received command 'lease4-update'
>> 2020-04-07 14:38:33.454 ERROR [kea-dhcp4.callouts/1754]
>> HOOKS_CALLOUT_ERROR error returned by callout on hook 2 registered by
>> library with index $lease4_update (callout address 0x7faefd365ca0) (callout
>> duration 0.048 ms)
>> 2020-04-07 14:38:34.177 INFO  [kea-dhcp4.commands/1754] COMMAND_RECEIVED
>> Received command 'lease4-update'
>> 2020-04-07 14:38:34.177 ERROR [kea-dhcp4.callouts/1754]
>> HOOKS_CALLOUT_ERROR error returned by callout on hook 2 registered by
>> library with index $lease4_update (callout address 0x7faefd365ca0) (callout
>> duration 0.048 ms)
>> 2020-04-07 14:38:36.180 INFO  [kea-dhcp4.commands/1754] COMMAND_RECEIVED
>> Received command 'ha-heartbeat'
>> 2020-04-07 14:38:37.922 INFO  [kea-dhcp4.commands/1754] COMMAND_RECEIVED
>> Received command 'lease4-update'
>> 2020-04-07 14:38:37.922 ERROR [kea-dhcp4.callouts/1754]
>> HOOKS_CALLOUT_ERROR error returned by callout on hook 2 registered by
>> library with index $lease4_update (callout address 0x7faefd365ca0) (callout
>> duration 0.052 ms)
>> --------------------
>>
>> In primary node, it kept saying that lease data was synced.
>> ---------
>> 2020-04-07 14:39:46.012 INFO  [kea-dhcp4.ha-hooks/28440] HA_SYNC_START
>> starting lease database synchronization with server2
>> 2020-04-07 14:39:46.061 INFO  [kea-dhcp4.ha-hooks/28440]
>> HA_SYNC_SUCCESSFUL lease database synchronization with server2 completed
>> successfully in 48.493 ms
>> 2020-04-07 14:39:57.605 INFO  [kea-dhcp4.ha-hooks/28440] HA_SYNC_START
>> starting lease database synchronization with server2
>> 2020-04-07 14:39:57.658 INFO  [kea-dhcp4.ha-hooks/28440]
>> HA_SYNC_SUCCESSFUL lease database synchronization with server2 completed
>> successfully in 52.490 ms
>> 2020-04-07 14:40:20.289 INFO  [kea-dhcp4.ha-hooks/28440] HA_SYNC_START
>> starting lease database synchronization with server2
>> --------------
>>
>> It seems that slave node failed to sync later due to unkown reason.
>>
>> Here is part of my conf of both sides.
>>
>> ----------
>>  {
>>         "library": "/opt/kea-1.5.0/lib/hooks/libdhcp_stat_cmds.so"
>>     },
>>     {
>>         "library": "/opt/kea-1.5.0/lib/hooks/libdhcp_lease_cmds.so"
>>     },
>>     {
>>             "library": "/opt/kea-1.5.0/lib/hooks/libdhcp_ha.so",
>>             "parameters": {
>>                 "high-availability": [ {
>>                     "this-server-name": "server2",
>>                     "mode": "hot-standby",
>>                     "heartbeat-delay": 10000,
>>                     "max-response-delay": 10000,
>>                     "max-ack-delay": 5000,
>>                     "max-unacked-clients": 5,
>>
>>                     "send-lease-updates": true,
>>                     "sync-leases": true,
>>                     "sync-page-limit": 10000,
>>                     "sync-timeout": 30000,
>>
>>                     "peers": [
>>                         {
>>                             "name": "server1",
>>                             "url": "http://172.16.232.18:8080/",
>>                             "role": "primary",
>>                             "auto-failover": true
>>                         },
>>                         {
>>                             "name": "server2",
>>                             "url": "http://172.16.232.20:8080/",
>>                             "role": "standby",
>>                             "auto-failover": true
>>                         }
>>                     ]
>>                 } ]
>>             }
>>         }
>>     ],
>> ------
>>
>> Anything wrong ? Thanks,
>>
>>
>>
>>
>>
>>
>> On Fri, Apr 3, 2020 at 7:26 PM Marcin Siodelski <marcin at isc.org> wrote:
>>
>>> Hello,
>>>
>>> The issue you're pointing at was merely to improve the logging of the
>>> time skew, partner's time and the local server's time, so you could more
>>> easily diagnose why the server decided to terminate the HA function. As
>>> far as I remember, it doesn't functionally change the server's behavior.
>>> Therefore, I think it must be something else.
>>>
>>> If you could provide us with the logs of BOTH DHCP servers it could shed
>>> some more light on the cauase of your problem.
>>>
>>> Kind Regards,
>>>
>>> Marcin Siodelski
>>> Sr. Software Engineer
>>> ISC
>>>
>>> On 03/04/2020 07:21, luckydog xf wrote:
>>> > I corrected both server's datetime and restarted both service of two
>>> > servers.  But it still got the same error.
>>> >
>>> > https://gitlab.isc.org/isc-projects/kea/-/merge_requests/414
>>> >
>>> > Seem a bug of v1.5.0.
>>> >
>>> > On Thu, Apr 2, 2020 at 6:34 PM Marcin Siodelski <marcin at isc.org
>>> > <mailto:marcin at isc.org>> wrote:
>>> >
>>> >     Hello,
>>> >
>>> >     Thank you for your email. It would be useful to see the log from
>>> the
>>> >     primary server to see why it went to the "terminated" state. The
>>> standby
>>> >     server apparently transitioned to the "terminated" state seeing
>>> that the
>>> >     partner is in that state. Note that the server which transitioned
>>> to the
>>> >     "terminated" state (e.g. as a result of too high clock skew) will
>>> not
>>> >     transition out of this state automatically, even if the clocks get
>>> >     synchronized. It must be stopped and started again. Perhaps, the
>>> primary
>>> >     server wasn't restarted after syncing the clocks?
>>> >
>>> >     Marcin Siodelski
>>> >     Sr. Software Engineer
>>> >     ISC
>>> >
>>> >     On 02/04/2020 10:36, luckydog xf wrote:
>>> >     >   Hi, list,
>>> >     >
>>> >     >   I'm using Kea V1.5.0 and running two dhcp servers in
>>> Hot-standby
>>> >     mode.
>>> >     > My lease database is Memfile. I setup NTP for both servers.
>>> >     >
>>> >     >    Here is part of my configuration,
>>> >     > ===============
>>> >     >               "high-availability": [ {
>>> >     >                     "this-server-name": "server2",
>>> >     >                     "mode": "hot-standby",
>>> >     >                     "heartbeat-delay": 10000,
>>> >     >                     "max-response-delay": 10000,
>>> >     >                     "max-ack-delay": 5000,
>>> >     >                     "max-unacked-clients": 5,
>>> >     >
>>> >     >                     "send-lease-updates": true,
>>> >     >                     "sync-leases": true,
>>> >     >                     "sync-page-limit": 10000,
>>> >     >                     "sync-timeout": 30000,
>>> >     >
>>> >     >                     "peers": [
>>> >     >                         {
>>> >     >                             "name": "server1",
>>> >     >                             "url": "http://172.16.232.18:8080/",
>>> >     >                             "role": "primary",
>>> >     >                             "auto-failover": true
>>> >     >                         },
>>> >     >                         {
>>> >     >                             "name": "server2",
>>> >     >                             "url": "http://172.16.232.20:8080/",
>>> >     >                             "role": "standby",
>>> >     >                             "auto-failover": true
>>> >     >                         }
>>> >     > =============
>>> >     >
>>> >     > Yet  on standby server ,it says that,
>>> >     > ===============
>>> >     > 2020-04-02 11:08:19.269 INFO  [kea-dhcp4.ha-hooks/11468]
>>> >     > HA_LOCAL_DHCP_DISABLE local DHCP service is disabled while the
>>> server2
>>> >     > is in the WAITING state
>>> >     > 2020-04-02 11:08:19.269 INFO  [kea-dhcp4.ha-hooks/11468]
>>> >     > HA_SERVICE_STARTED started high availability service in
>>> >     hot-standby mode
>>> >     > as standby server
>>> >     > 2020-04-02 11:08:19.269 INFO  [kea-dhcp4.dhcp4/11468]
>>> >     DHCP4_STARTED Kea
>>> >     > DHCPv4 server version 1.5.0 started
>>> >     > 2020-04-02 11:08:31.283 INFO  [kea-dhcp4.ha-hooks/11468]
>>> >     > HA_STATE_TRANSITION server transitions from WAITING to TERMINATED
>>> >     state,
>>> >     > partner state is TERMINATED
>>> >     > 2020-04-02 11:08:31.283 INFO  [kea-dhcp4.ha-hooks/11468]
>>> >     > HA_LEASE_UPDATES_DISABLED lease updates will not be sent to the
>>> >     partner
>>> >     > while in TERMINATED state
>>> >     > 2020-04-02 11:08:31.283 INFO  [kea-dhcp4.ha-hooks/11468]
>>> >     > HA_LOCAL_DHCP_ENABLE local DHCP service is enabled while the
>>> >     server2 is
>>> >     > in the TERMINATED state
>>> >     > 2020-04-02 11:08:31.283 ERROR [kea-dhcp4.ha-hooks/11468]
>>> HA_TERMINATED
>>> >     > HA service terminated because of the unacceptable clock skew;
>>> fix the
>>> >     > problem and restart!
>>> >     >
>>> >     > ============
>>> >     >
>>> >     > I'm sure two severs get correct date time. So what's the reason ?
>>> >     >
>>> >     > Is it related
>>> >     > to https://gitlab.isc.org/isc-projects/kea/-/merge_requests/414
>>>  ?
>>> >     >
>>> >     > Thanks and take care.
>>> >     >
>>> >     > -hongquan
>>> >     >
>>> >     >
>>> >     > _______________________________________________
>>> >     > Kea-users mailing list
>>> >     > Kea-users at lists.isc.org <mailto:Kea-users at lists.isc.org>
>>> >     > https://lists.isc.org/mailman/listinfo/kea-users
>>> >     >
>>> >
>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.isc.org/pipermail/kea-users/attachments/20200408/5ed4b0f5/attachment.htm>


More information about the Kea-users mailing list