[stork-users] HA-status unavailable

Slawek Figiel slawek at isc.org
Fri Feb 23 21:30:44 UTC 2024


Hello Andrey,

Hmm... I'm pretty sure I saw a similar problem a year or two ago... If I 
remember correctly, that problem was caused by duplicated entries in the 
`service` or `ha_service` table. Do you have access to the Stork 
database? Could you check how many rows these tables have?

The proper number is four rows in each table because you have two DHCPv4 
and two DHCPv6 HA pairs. I expect you will have two extra rows in each 
table.
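
If it helps, the row check could be scripted roughly like this. It is only
a sketch: it assumes nothing beyond the two table names above, takes any
Python DB-API cursor (for the real Stork PostgreSQL database you would
supply one from your own driver), and the in-memory SQLite tables with a
made-up `id` column are stand-ins so the snippet runs on its own.

```python
import sqlite3  # used only for the stand-in demo database below


def count_rows(cursor, tables=("service", "ha_service")):
    """Return {table: row_count} using any DB-API 2.0 cursor."""
    counts = {}
    for table in tables:
        # Table names come from the fixed tuple above, never from user input.
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cursor.fetchone()[0]
    return counts


def extra_rows(counts, expected=4):
    """Return {table: surplus} for tables holding more rows than expected."""
    return {t: n - expected for t, n in counts.items() if n > expected}


# Stand-in demo: hypothetical tables with a single made-up column.
demo = sqlite3.connect(":memory:")
demo_cur = demo.cursor()
for name, rows in (("service", 6), ("ha_service", 4)):
    demo_cur.execute(f"CREATE TABLE {name} (id INTEGER)")
    demo_cur.executemany(
        f"INSERT INTO {name} VALUES (?)", [(i,) for i in range(rows)]
    )

demo_counts = count_rows(demo_cur)
print(demo_counts)
print(extra_rows(demo_counts))
```

Any table reported by `extra_rows` holds more than the four rows expected
for this setup.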

 >> 4. Check the Stork server and Stork agent logs. The HA status is
 >> fetched once per minute by default. Do you observe any errors here?
 >
 > No errors...

Please look at the logs directly before the "Completed pulling DHCP 
status from Kea apps:" message. It indicates when the HA status is 
refreshed.
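
A small helper along these lines could pull out the lines preceding each
occurrence of that message. It is a sketch only: the function name and the
default of five context lines are my own choices, and it assumes each log
entry sits on a single line.

```python
from collections import deque

MARKER = "Completed pulling DHCP status from Kea apps:"


def lines_before_marker(lines, marker=MARKER, context=5):
    """Yield (preceding_lines, marker_line) for each line containing marker."""
    recent = deque(maxlen=context)  # rolling window of the last lines seen
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            yield list(recent), line
        recent.append(line)


# Placeholder input; real usage would pass an open log file instead.
sample = ["placeholder line 1", "placeholder line 2", "prefix " + MARKER]
for before, hit in lines_before_marker(sample, context=2):
    for entry in before:
        print("  " + entry)
    print("> " + hit)
```

Because the function accepts any iterable of lines, an open file object for
a saved copy of the server log can be passed to it directly.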

 > Kea version is 2.0.2 on all machines.

It is pretty old. We don't test Stork with this version, but there were 
no significant changes in the HA handling. Which Stork version do you use?

Regards,
Slawek



On 23/02/2024 20:20, Andrey Kostin wrote:
> Thanks, Slawek, for the quick response.
> 
> Slawek Figiel wrote on 2024-02-23 12:32:
>>
>> Please perform the checks below:
>>
>> 1. Check the machine list (/machines) and the machine pages
>> (/machines/XX) related to the HA hosts. Does the UI report any errors?
> 
> No errors; all 4 machines show green check marks for DHCPv4, DHCPv6, and CA.
> 
>> 2. Check the unauthorized machine list. Are there any machines listed?
> 
> No, Unauthorized(0)
> 
>> 3. Check the application pages for your HA peers. Does the UI report any 
>> errors?
> 
> Here is the services status from the Stork dashboard:
> 
> Services Status
> Host | App Version | App Name | Daemon | Status | RPS (15min) | RPS (24h) | HA State | Detected Failure w/HA | Uptime
> keadhcp01-van | Kea 2.0.2 | kea at X.X.X.X.194 | dhcp6 | | 1 | 1 | hot-standby | 2023-12-12 16:17:40 | 4 m 10 d 4 h 56 min 40 s
> keadhcp01-van | Kea 2.0.2 | kea at X.X.X.X.194 | dhcp4 | | 1 | 1 | hot-standby | never | 4 m 27 d 15 h 42 min 31 s
> keadhcp02-van | Kea 2.0.2 | kea at X.X.X.X.195 | dhcp4 | | | | hot-standby | never | 4 m 27 d 15 h 41 min 36 s
> keadhcp02-van | Kea 2.0.2 | kea at X.X.X.X.195 | dhcp6 | | | | hot-standby | never | 4 m 27 d 15 h 41 min 34 s
> keadhcp03 | Kea 2.0.2 | kea at X.X.X.X.4 | dhcp6 | | | | hot-standby | never | 2 h 55 min 55 s
> keadhcp03 | Kea 2.0.2 | kea at X.X.X.X.4 | dhcp4 | | | | hot-standby | 2024-02-23 12:41:06 | 4 m 21 d 21 h 56 min 21 s
> keadhcp01 | Kea 2.0.2 | kea at X.X.X.X.251 | dhcp4 | | 1 | 1 | unavailable | never | 9 m 27 d 21 h 55 min 25 s
> keadhcp01 | Kea 2.0.2 | kea at X.X.X.X.251 | dhcp6 | | 1 | 1 | unavailable | never | 9 m 27 d 21 h 54 min 30 s
> 
> If I click "hot-standby" for keadhcp03 (standby), I see "Control 
> status:    offline" for the remote server:
> 
> High Availability
> 
> Local server
> Status time:    2024-02-23 14:10:04
> Status checked:    24 seconds ago
> Role:    standby
> Control status:    online
> Heartbeat status:    ok
> State:    hot-standby
> Scopes served:    none (standby server)
> Last in partner-down:    never
> Unacked clients:    n/a
> Connecting clients:    n/a
> Analyzed packets:    n/a
> 
> Remote server Kea at X.X.X.251
> Status time:    2024-02-23 14:10:04
> Status checked:    24 seconds ago
> Role:    primary
> Control status:    offline
> Heartbeat status:    ok
> State:    unavailable
> Scopes served:    none
> Last in partner-down:    2024-02-23 12:41:06
> Unacked clients:    n/a
> Connecting clients:    n/a
> Analyzed packets:    n/a
> 
> If I click "unavailable" for keadhcp01 (primary), it shows that 
> everything is ok:
> High Availability
> 
> Local server
> Status time:    2024-02-23 14:13:06
> Status checked:    27 seconds ago
> Role:    primary
> Control status:    online
> State:    hot-standby
> Scopes served:    keadhcp01
> Last in partner-down:    2024-02-23 14:03:02
> 
> Remote server
> Status time:    2024-02-23 14:13:06
> Status checked:    27 seconds ago
> Role:    standby
> Control status:    online
> Heartbeat status:    ok
> State:    hot-standby
> Scopes served:    none (standby server)
> Last in partner-down:    never
> Unacked clients:    n/a
> Connecting clients:    n/a
> Analyzed packets:    n/a
> 
> An interesting observation here is that the [working] primary server 
> doesn't show the standby's IP, while the [not working] standby shows the 
> primary's IP.
> 
>> 4. Check the Stork server and Stork agent logs. The HA status is
>> fetched once per minute by default. Do you observe any errors here?
> 
> No errors...
> 
> stork-server logs these messages regularly:
> Feb 22 16:21:41 storkserver-radius stork-server[864416]: 
> time="2024-02-22 16:21:41" level="info" msg="Completed pulling hosts 
> from Kea daemons" file="             host.go:109  " erred_count="0" 
> skipped_count="0" success_count="8"
> 
> Feb 22 16:26:17 storkserver-radius stork-server[864416]: 
> time="2024-02-22 16:26:17" level="info" msg="Completed pulling 
> information from machines: 4/4 succeeded" file="      statepuller.go:73  "
> 
> 
>> 5. Which Kea version do you use? Does the second HA pair use the same
>> Kea version?
> 
> Kea version is 2.0.2 on all machines.
> 
> I also tried to open a TCP connection to ctrl-agent port 8000 and 
> stork-agent port 8080 from primary to standby and vice versa, and TCP 
> works, so filtering is not the cause.
> 
> Kind regards,
> Andrey

