[stork-users] HA-status unavailable
Slawek Figiel
slawek at isc.org
Fri Feb 23 21:30:44 UTC 2024
Hello Andrey,
Hmm... I'm pretty sure I saw a similar problem a year or two ago... If I
remember correctly, that problem was caused by duplicated entries in the
`service` or `ha_service` table. Do you have access to the Stork
database? Could you check how many rows these tables have?
The proper number is four rows in each table because you have two DHCPv4
and two DHCPv6 HA pairs. I expect you will have two extra rows in each
table.
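
A minimal sketch of that check in Python, assuming you already have a
DB-API 2.0 cursor connected to the Stork database (the connection setup
is not shown, and the helper names `count_rows` and `extra_rows` are
only illustrative). The table names and the expected count of four come
from the explanation above.

```python
# Hypothetical helpers; they work with any DB-API 2.0 cursor.
TABLES = ("service", "ha_service")  # table names mentioned above


def count_rows(cursor, tables=TABLES):
    """Return {table_name: row_count} for the given tables."""
    counts = {}
    for table in tables:
        # Table names come from the fixed tuple above, not from user input.
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cursor.fetchone()[0]
    return counts


def extra_rows(counts, expected=4):
    """Return the tables whose row count differs from the expected one."""
    return {table: n for table, n in counts.items() if n != expected}
```

If `extra_rows(count_rows(cursor))` returns an empty dict, both tables
hold the expected four rows; anything else points at the table to
inspect for duplicates.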
>> 4. Check the Stork server and Stork agent logs. The HA status is
>> fetched once per minute by default. Do you observe any errors here?
>
> No errors...
Please look at the logs directly before the "Completed pulling DHCP
status from Kea apps:" message. It indicates when the HA status is
refreshed.
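
One possible way to pull out those lines, sketched in Python. It assumes
each log entry sits on a single line in the real log (the wrapping seen
in emails comes from the mail client), and the function name
`lines_before_marker` and the default context size are only
illustrative.

```python
from collections import deque

# Message text quoted above; it marks the end of an HA status refresh.
MARKER = "Completed pulling DHCP status from Kea apps:"


def lines_before_marker(lines, context=5, marker=MARKER):
    """Yield, for each marker line, up to `context` lines preceding it."""
    recent = deque(maxlen=context)  # keeps only the newest `context` lines
    for line in lines:
        if marker in line:
            yield list(recent)
            recent.clear()  # start fresh for the next refresh cycle
        else:
            recent.append(line.rstrip("\n"))
```

Passing an open log file (or `sys.stdin`) as `lines` yields one list per
refresh, which can then be scanned for warnings or errors.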
> Kea version is 2.0.2 on all machines.
It is pretty old. We don't test Stork with this version, but there were
no significant changes in the HA handling. Which Stork version do you use?
Regards,
Slawek
On 23/02/2024 20:20, Andrey Kostin wrote:
> Thanks, Slawek, for the quick response.
>
> Slawek Figiel wrote on 2024-02-23 12:32:
>>
>> Please make the following checks:
>>
>> 1. Check the machine list (/machines) and the machine pages
>> (/machines/XX) related to the HA hosts. Does the UI report any errors?
>
> No errors, all 4 machines show green check marks for DHCPv4 DHCPv6 CA.
>
>> 2. Check the unauthorized machine list. Are there any machines listed?
>
> No, Unauthorized(0)
>
>> 3. Check the application pages for your HA peers. Does the UI report
>> any errors?
>
> Here is the services status from the Stork dashboard:
>
> Services Status
> Host           App Version  App Name            Daemon  Status  RPS (15min)  RPS (24h)  HA State     Detected Failure w/HA  Uptime
> keadhcp01-van  Kea 2.0.2    kea at X.X.X.X.194  dhcp6           1            1          hot-standby  2023-12-12 16:17:40    4 m 10 d 4 h 56 min 40 s
> keadhcp01-van  Kea 2.0.2    kea at X.X.X.X.194  dhcp4           1            1          hot-standby  never                  4 m 27 d 15 h 42 min 31 s
> keadhcp02-van  Kea 2.0.2    kea at X.X.X.X.195  dhcp4                                   hot-standby  never                  4 m 27 d 15 h 41 min 36 s
> keadhcp02-van  Kea 2.0.2    kea at X.X.X.X.195  dhcp6                                   hot-standby  never                  4 m 27 d 15 h 41 min 34 s
> keadhcp03      Kea 2.0.2    kea at X.X.X.X.4    dhcp6                                   hot-standby  never                  2 h 55 min 55 s
> keadhcp03      Kea 2.0.2    kea at X.X.X.X.4    dhcp4                                   hot-standby  2024-02-23 12:41:06    4 m 21 d 21 h 56 min 21 s
> keadhcp01      Kea 2.0.2    kea at X.X.X.X.251  dhcp4           1            1          unavailable  never                  9 m 27 d 21 h 55 min 25 s
> keadhcp01      Kea 2.0.2    kea at X.X.X.X.251  dhcp6           1            1          unavailable  never                  9 m 27 d 21 h 54 min 30 s
>
> If I click "hot-standby" for keadhcp03 (standby), I see "Control
> status: offline" for the remote server:
>
> High Availability
>
> Local server
> Status time: 2024-02-23 14:10:04
> Status checked: 24 seconds ago
> Role: standby
> Control status: online
> Heartbeat status: ok
> State: hot-standby
> Scopes served: none (standby server)
> Last in partner-down: never
> Unacked clients: n/a
> Connecting clients: n/a
> Analyzed packets: n/a
>
> Remote server Kea at X.X.X.251
> Status time: 2024-02-23 14:10:04
> Status checked: 24 seconds ago
> Role: primary
> Control status: offline
> Heartbeat status: ok
> State: unavailable
> Scopes served: none
> Last in partner-down: 2024-02-23 12:41:06
> Unacked clients: n/a
> Connecting clients: n/a
> Analyzed packets: n/a
>
> If I click "unavailable" for keadhcp01 (primary), it shows that
> everything is ok:
> High Availability
>
> Local server
> Status time: 2024-02-23 14:13:06
> Status checked: 27 seconds ago
> Role: primary
> Control status: online
> State: hot-standby
> Scopes served: keadhcp01
> Last in partner-down: 2024-02-23 14:03:02
>
> Remote server
> Status time: 2024-02-23 14:13:06
> Status checked: 27 seconds ago
> Role: standby
> Control status: online
> Heartbeat status: ok
> State: hot-standby
> Scopes served: none (standby server)
> Last in partner-down: never
> Unacked clients: n/a
> Connecting clients: n/a
> Analyzed packets: n/a
>
> An interesting observation here is that the [working] primary server
> doesn't show the standby's IP, while the [not working] standby shows
> the primary's IP.
>
>> 4. Check the Stork server and Stork agent logs. The HA status is
>> fetched once per minute by default. Do you observe any errors here?
>
> No errors...
>
> stork-server logs these messages regularly:
> Feb 22 16:21:41 storkserver-radius stork-server[864416]:
> time="2024-02-22 16:21:41" level="info" msg="Completed pulling hosts
> from Kea daemons" file=" host.go:109 " erred_count="0"
> skipped_count="0" success_count="8"
>
> Feb 22 16:26:17 storkserver-radius stork-server[864416]:
> time="2024-02-22 16:26:17" level="info" msg="Completed pulling
> information from machines: 4/4 succeeded" file=" statepuller.go:73 "
>
>
>> 5. Which Kea version do you use? Does the second HA pair use the same
>> Kea version?
>
> Kea version is 2.0.2 on all machines.
>
> I also tried to open a TCP connection to ctrl-agent port 8000 and
> stork-agent port 8080 from primary to standby and vice versa, and TCP
> works, so it's not a filtering issue.
>
> Kind regards,
> Andrey