[stork-users] HA-status unavailable
Slawek Figiel
slawek at isc.org
Mon Feb 26 15:03:02 UTC 2024
Hello Andrey,
It seems your problem was already reported in
https://gitlab.isc.org/isc-projects/stork/-/issues/616 and
https://gitlab.isc.org/isc-projects/stork/-/issues/818 . The bug fix was
released in Stork 1.7.
How long ago did you configure Stork to monitor your HA pairs? Is it
possible that this happened before 1.7?
If you configured your Stork server on 1.7 or later, did you perform
any experiments with the HA configuration (e.g., changing the HA type,
swapping the primary and secondary servers, etc.) after registering
the machines in Stork?
I look forward to your feedback. I've opened
https://gitlab.isc.org/isc-projects/stork/-/issues/1320 to analyze whether
the problem was fixed properly in 1.7 and whether it can still occur in new
installations.
Currently, the available workaround for your problem is:
1. Unauthorize all Kea machines (there is the "Unauthorize" button in
the hamburger menu on the /machines page)
2. Restart Stork agents
3. Authorize them again
There is no need to touch any Kea components (Kea Control Agent or Kea
DHCP daemons).
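If it helps, here is a rough SQL sketch (read-only, using the table and
column names from your dump) that you can run before and after
re-authorizing the machines to confirm the cleanup:

stork=> SELECT ha_type, primary_id, count(*) FROM ha_service GROUP BY ha_type, primary_id;

In your current dump the dhcp4/dhcp6 entries with primary_id 11 and 12
each appear twice; once the machines are authorized again, every
combination should appear exactly once and ha_service should be back to
four rows in total.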
Regards,
Slawek
On 26/02/2024 15:31, Andrey Kostin wrote:
> Hi Slawek,
>
> Thanks for the swift response.
>
> Slawek Figiel wrote on 2024-02-23 16:30:
>> Hello Andrey,
>>
>> Hmm... I'm pretty sure I saw a similar problem a year or two ago... If
>> I remember correctly, that problem was caused by duplicated entries in
>> the `service` or `ha_service` table. Do you have access to the Stork
>> database? Could you check how many rows these tables have?
>>
>> The proper number is four rows in each table because you have two
>> DHCPv4 and two DHCPv6 HA pairs. I expect you will have two extra rows
>> in each table.
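>>
>> For example, a rough sketch of the checks I have in mind (plain row
>> counts against the Stork database):
>>
>> stork=> SELECT count(*) FROM service WHERE service_type = 'ha_dhcp';
>> stork=> SELECT count(*) FROM ha_service;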
>
> Looks like this is the case:
>
> stork=> select * from ha_service;
> id | service_id | ha_type | ha_mode | primary_id | secondary_id | primary_last_state | secondary_last_state | backup_id | primary_status_collected_at | secondary_status_collected_at | primary_last_scopes | secondary_last_scopes | primary_reachable | secondary_reachable | primary_last_failover_at | secondary_last_failover_at | primary_comm_interrupted | primary_connecting_clients | primary_unacked_clients | primary_unacked_clients_left | primary_analyzed_packets | secondary_comm_interrupted | secondary_connecting_clients | secondary_unacked_clients | secondary_unacked_clients_left | secondary_analyzed_packets
> 81 | 81 | dhcp6 | hot-standby | 12 | 24 | unavailable | hot-standby | | 2024-02-26 14:19:11.667682 | 2024-02-26 14:19:12.667682 | {} | {} | | t | | | f | | | | | f | | | |
> 82 | 82 | dhcp4 | hot-standby | 11 | 23 | unavailable | hot-standby | | 2024-02-26 14:19:12.66768 | 2024-02-26 14:19:12.66768 | {} | {} | | t | 2024-02-26 13:57:00.715839 | | f | | | | | f | | | |
> 83 | 83 | dhcp4 | hot-standby | 11 | | hot-standby | hot-standby | | 2024-02-26 14:19:12.693715 | 2024-02-26 14:19:12.693715 | {keadhcp01} | {} | t | t | 2024-02-26 13:31:35.404352 | | | | | | | f | | | |
> 84 | 84 | dhcp6 | hot-standby | 12 | | hot-standby | hot-standby | | 2024-02-26 14:19:12.693717 | 2024-02-26 14:19:12.693717 | {keadhcp01} | {} | t | t | | | | | | | | f | | | |
> 1 | 1 | dhcp4 | hot-standby | 2 | 4 | hot-standby | hot-standby | | 2024-02-26 14:19:12.644121 | 2024-02-26 14:19:12.644121 | {keadhcp01-van} | {} | t | t | | | f | | | | | f | | | |
> 2 | 2 | dhcp6 | hot-standby | 3 | 5 | hot-standby | hot-standby | | 2024-02-26 14:19:11.644121 | 2024-02-26 14:19:12.644121 | {keadhcp01-van} | {} | t | t | | 2023-12-12 21:17:40.271372 | f | | | | | f | | | |
> (6 rows)
>
> stork=> select * from service;
> id | name | created_at | service_type
> ----+--------------------+----------------------------+--------------
> 82 | service-0000000082 | 2024-02-24 07:50:39.760247 | ha_dhcp
> 81 | service-0000000081 | 2024-02-24 07:50:39.760247 | ha_dhcp
> 2 | service-0000000002 | 2023-10-10 21:15:21.000787 | ha_dhcp
> 1 | service-0000000001 | 2023-10-10 21:15:21.000787 | ha_dhcp
> 83 | service-0000000083 | 2024-02-24 07:56:26.606722 | ha_dhcp
> 84 | service-0000000084 | 2024-02-24 07:56:26.606722 | ha_dhcp
>
>
>>>> 4. Check the Stork server and Stork agent logs. The HA status is
>>>> fetched once per minute by default. Do you observe any errors here?
>>>
>>> No errors...
>>
>> Please look at the logs directly before the "Completed pulling DHCP
>> status from Kea apps:" message. It indicates when the HA status is
>> refreshed.
>
> These are the log records from the second just before that message:
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:1 RPS2:1}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:1 RPS2:1}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Connecting to existing agent" file=" agentcomm.go:205 " address="X.X.X.195:8080"
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:0 RPS2:0}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:0 RPS2:0}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Connecting to existing agent" file=" agentcomm.go:205 " address="X.X.X.4:8080"
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:0 RPS2:0}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:0 RPS2:0}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Connecting to existing agent" file=" agentcomm.go:205 " address="X.X.X.251:8080"
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:1 RPS2:1}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Updating KeaDHCPDaemonStats: {RPS1:1 RPS2:1}" file=" rps.go:297 "
> Feb 26 09:25:57 storkserver-radius stork-server[864416]: time="2024-02-26 09:25:57" level="info" msg="Completed pulling lease stats from Kea apps: 4/4 succeeded" file=" statspuller.go:71 "
>
>
>
>>> Kea version is 2.0.2 on all machines.
>>
>> It is pretty old. We don't test Stork with this version, but there
>> were no significant changes in the HA handling. Which Stork version do
>> you use?
>
> Just upgraded to 1.15, but this status was already there before the upgrade,
> when the versions were 1.12 and 1.13.
>>
>
> Kind regards,
> Andrey