[stork-users] Removing a node and re-adding it back causes a certificate error

Marek Hajduczenia mxhajduczenia at gmail.com
Wed May 8 14:21:17 UTC 2024


As far as 1, i.e., connectivity between hosts, is concerned: they are on the same VLAN (L2) segment and can reach each other without any problems. Notice the very small RTT proving the point: they are hosted on the same cluster and bridged locally within the host, hence the sub-0.1 ms RTT. No issues here, AFAIK.

 

root at server-kea-node1:/etc/kea# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

    inet6 ::1/128 scope host 

       valid_lft forever preferred_lft forever

2: enp6s18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether bc:24:11:fe:9b:ff brd ff:ff:ff:ff:ff:ff

    inet 172.17.129.130/25 brd 172.17.129.255 scope global enp6s18

       valid_lft forever preferred_lft forever

    inet6 2600:6ce4:0:42::130/64 scope global 

       valid_lft forever preferred_lft forever

    inet6 fe80::be24:11ff:fefe:9bff/64 scope link 

       valid_lft forever preferred_lft forever

root at server-kea-node1:/etc/kea# ping 172.17.129.133

PING 172.17.129.133 (172.17.129.133) 56(84) bytes of data.

64 bytes from 172.17.129.133: icmp_seq=1 ttl=64 time=0.093 ms

64 bytes from 172.17.129.133: icmp_seq=2 ttl=64 time=0.077 ms

64 bytes from 172.17.129.133: icmp_seq=3 ttl=64 time=0.080 ms

64 bytes from 172.17.129.133: icmp_seq=4 ttl=64 time=0.086 ms

64 bytes from 172.17.129.133: icmp_seq=5 ttl=64 time=0.084 ms

^C

--- 172.17.129.133 ping statistics ---

5 packets transmitted, 5 received, 0% packet loss, time 4083ms

rtt min/avg/max/mdev = 0.077/0.084/0.093/0.005 ms

 

root at server-kea-control:/etc/stork# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

       valid_lft forever preferred_lft forever

    inet6 ::1/128 scope host 

       valid_lft forever preferred_lft forever

2: enp6s18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000

    link/ether bc:24:11:94:c3:52 brd ff:ff:ff:ff:ff:ff

    inet 172.17.129.133/25 brd 172.17.129.255 scope global enp6s18

       valid_lft forever preferred_lft forever

    inet6 2600:6ce4:0:42::133/64 scope global 

       valid_lft forever preferred_lft forever

    inet6 fe80::be24:11ff:fe94:c352/64 scope link 

       valid_lft forever preferred_lft forever

root at server-kea-control:/etc/stork# ping 172.17.129.130

PING 172.17.129.130 (172.17.129.130) 56(84) bytes of data.

64 bytes from 172.17.129.130: icmp_seq=1 ttl=64 time=0.076 ms

64 bytes from 172.17.129.130: icmp_seq=2 ttl=64 time=0.086 ms

64 bytes from 172.17.129.130: icmp_seq=3 ttl=64 time=0.092 ms

64 bytes from 172.17.129.130: icmp_seq=4 ttl=64 time=0.074 ms

^C

--- 172.17.129.130 ping statistics ---

4 packets transmitted, 4 received, 0% packet loss, time 3052ms

rtt min/avg/max/mdev = 0.074/0.082/0.092/0.007 ms

 

Neither machine has a firewall running.
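
For completeness, a quick way to double-check the firewall state and the listening ports on either host (a sketch, assuming Ubuntu-style hosts with ufw, nftables, and iproute2 available; adjust to whatever tooling is actually installed):

# confirm no firewall rules are active (use whichever of these tools is present)
ufw status
nft list ruleset

# confirm the Stork agent is listening on its GRPC and metrics ports (defaults: 8080 and 9547)
ss -tlnp | grep -E '8080|9547'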

 

As far as 2 is concerned, I am able to curl all of the content, and wget returns a 200 status code as expected.

 

root at server-kea-control:/tmp# wget http://172.17.129.130:9547/metrics

--2024-05-08 14:18:08--  http://172.17.129.130:9547/metrics

Connecting to 172.17.129.130:9547... connected.

HTTP request sent, awaiting response... 200 OK

Length: unspecified [text/plain]

metrics: Read-only file system

 

Cannot write to ‘metrics’ (Read-only file system)
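
(The "Read-only file system" message is only wget failing to save the output file in the current directory; the 200 OK above is what matters. Writing to stdout avoids it, for example this sketch:)

wget -q -O - http://172.17.129.130:9547/metrics | head -n 5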

 

root at server-kea-control:/tmp# curl http://172.17.129.130:9547/metrics

# HELP kea_dhcp4_addresses_assigned_total Assigned addresses

# TYPE kea_dhcp4_addresses_assigned_total gauge

kea_dhcp4_addresses_assigned_total{subnet="1"} 2

kea_dhcp4_addresses_assigned_total{subnet="100"} 0

kea_dhcp4_addresses_assigned_total{subnet="2"} 17

kea_dhcp4_addresses_assigned_total{subnet="3"} 2

kea_dhcp4_addresses_assigned_total{subnet="300"} 0

kea_dhcp4_addresses_assigned_total{subnet="4"} 2

kea_dhcp4_addresses_assigned_total{subnet="400"} 117

kea_dhcp4_addresses_assigned_total{subnet="401"} 3

# HELP kea_dhcp4_addresses_declined_reclaimed_total Declined addresses that were reclaimed

# TYPE kea_dhcp4_addresses_declined_reclaimed_total gauge

kea_dhcp4_addresses_declined_reclaimed_total{subnet="1"} 0

kea_dhcp4_addresses_declined_reclaimed_total{subnet="100"} 0

kea_dhcp4_addresses_declined_reclaimed_total{subnet="2"} 0

kea_dhcp4_addresses_declined_reclaimed_total{subnet="3"} 0

kea_dhcp4_addresses_declined_reclaimed_total{subnet="300"} 0

kea_dhcp4_addresses_declined_reclaimed_total{subnet="4"} 0

kea_dhcp4_addresses_declined_reclaimed_total{subnet="401"} 0

# HELP kea_dhcp4_addresses_declined_total Declined counts

# TYPE kea_dhcp4_addresses_declined_total gauge

kea_dhcp4_addresses_declined_total{subnet="1"} 0

kea_dhcp4_addresses_declined_total{subnet="100"} 0

kea_dhcp4_addresses_declined_total{subnet="2"} 0

kea_dhcp4_addresses_declined_total{subnet="3"} 0

kea_dhcp4_addresses_declined_total{subnet="300"} 0

kea_dhcp4_addresses_declined_total{subnet="4"} 0

kea_dhcp4_addresses_declined_total{subnet="401"} 0

# HELP kea_dhcp4_addresses_reclaimed_total Expired addresses that were reclaimed

# TYPE kea_dhcp4_addresses_reclaimed_total gauge

kea_dhcp4_addresses_reclaimed_total{subnet="1"} 0

kea_dhcp4_addresses_reclaimed_total{subnet="100"} 0

kea_dhcp4_addresses_reclaimed_total{subnet="2"} 0

kea_dhcp4_addresses_reclaimed_total{subnet="3"} 0

kea_dhcp4_addresses_reclaimed_total{subnet="300"} 0

kea_dhcp4_addresses_reclaimed_total{subnet="4"} 0

kea_dhcp4_addresses_reclaimed_total{subnet="401"} 24

# HELP kea_dhcp4_addresses_total Size of subnet address pool

# TYPE kea_dhcp4_addresses_total gauge

kea_dhcp4_addresses_total{subnet="1"} 13

kea_dhcp4_addresses_total{subnet="100"} 60

kea_dhcp4_addresses_total{subnet="2"} 92

kea_dhcp4_addresses_total{subnet="3"} 61

kea_dhcp4_addresses_total{subnet="300"} 11

kea_dhcp4_addresses_total{subnet="4"} 8

kea_dhcp4_addresses_total{subnet="401"} 13

# HELP kea_dhcp4_cumulative_addresses_assigned_total Cumulative number of assigned addresses since server startup

# TYPE kea_dhcp4_cumulative_addresses_assigned_total gauge

kea_dhcp4_cumulative_addresses_assigned_total{subnet="1"} 0

kea_dhcp4_cumulative_addresses_assigned_total{subnet="100"} 0

kea_dhcp4_cumulative_addresses_assigned_total{subnet="2"} 0

kea_dhcp4_cumulative_addresses_assigned_total{subnet="3"} 0

kea_dhcp4_cumulative_addresses_assigned_total{subnet="300"} 0

kea_dhcp4_cumulative_addresses_assigned_total{subnet="4"} 0

kea_dhcp4_cumulative_addresses_assigned_total{subnet="401"} 24

# HELP kea_dhcp4_global4_addresses_declined_reclaimed_total Declined addresses that were reclaimed for all subnets

# TYPE kea_dhcp4_global4_addresses_declined_reclaimed_total gauge

kea_dhcp4_global4_addresses_declined_reclaimed_total 0

# HELP kea_dhcp4_global4_addresses_declined_total Declined counts from all subnets

# TYPE kea_dhcp4_global4_addresses_declined_total gauge

kea_dhcp4_global4_addresses_declined_total 0

# HELP kea_dhcp4_global4_addresses_reclaimed_total Expired addresses that were reclaimed for all subnets

# TYPE kea_dhcp4_global4_addresses_reclaimed_total gauge

kea_dhcp4_global4_addresses_reclaimed_total 24

# HELP kea_dhcp4_global4_cumulative_addresses_assigned_total Cumulative number of assigned addresses since server startup from all subnets

# TYPE kea_dhcp4_global4_cumulative_addresses_assigned_total gauge

kea_dhcp4_global4_cumulative_addresses_assigned_total 24

# HELP kea_dhcp4_packets_received_total Packets received

# TYPE kea_dhcp4_packets_received_total gauge

kea_dhcp4_packets_received_total{operation="ack"} 0

kea_dhcp4_packets_received_total{operation="decline"} 0

kea_dhcp4_packets_received_total{operation="discover"} 21850

kea_dhcp4_packets_received_total{operation="drop"} 0

kea_dhcp4_packets_received_total{operation="inform"} 0

kea_dhcp4_packets_received_total{operation="nak"} 0

kea_dhcp4_packets_received_total{operation="offer"} 0

kea_dhcp4_packets_received_total{operation="parse-failed"} 0

kea_dhcp4_packets_received_total{operation="release"} 24

kea_dhcp4_packets_received_total{operation="request"} 266

kea_dhcp4_packets_received_total{operation="unknown"} 0

# HELP kea_dhcp4_packets_sent_total Packets sent

# TYPE kea_dhcp4_packets_sent_total gauge

kea_dhcp4_packets_sent_total{operation="ack"} 72

kea_dhcp4_packets_sent_total{operation="nak"} 194

kea_dhcp4_packets_sent_total{operation="offer"} 304

# HELP kea_dhcp6_global6_addresses_declined_reclaimed_total Declined addresses that were reclaimed for all subnets

# TYPE kea_dhcp6_global6_addresses_declined_reclaimed_total gauge

kea_dhcp6_global6_addresses_declined_reclaimed_total 0

# HELP kea_dhcp6_global6_addresses_declined_total Declined counts from all subnets

# TYPE kea_dhcp6_global6_addresses_declined_total gauge

kea_dhcp6_global6_addresses_declined_total 0

# HELP kea_dhcp6_global6_addresses_reclaimed_total Expired addresses that were reclaimed for all subnets

# TYPE kea_dhcp6_global6_addresses_reclaimed_total gauge

kea_dhcp6_global6_addresses_reclaimed_total 0

# HELP kea_dhcp6_global6_cumulative_nas_assigned_total Cumulative number of assigned NA addresses since server startup from all subnets

# TYPE kea_dhcp6_global6_cumulative_nas_assigned_total gauge

kea_dhcp6_global6_cumulative_nas_assigned_total 0

# HELP kea_dhcp6_global6_cumulative_pds_assigned_total Cumulative number of assigned PD prefixes since server startup

# TYPE kea_dhcp6_global6_cumulative_pds_assigned_total gauge

kea_dhcp6_global6_cumulative_pds_assigned_total 0

 

Please note that the Stork server (.133) was able to reach this Stork agent (.130) before the agent machine had a kernel failure and had to be reinstalled. After I removed the previous registration of the .130 agent from the .133 server, I am unable to re-add the newly installed .130 agent with the same IP address as before.
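
In case it helps pinpoint where the stale certificate lives, one check worth trying (a sketch, assuming the default agent GRPC port 8080 and the default certificate path mentioned earlier in this thread) is to compare the certificate the agent actually presents on its GRPC port with the one stored on the agent's disk:

# from the Stork server (.133): show the certificate presented by the agent on its GRPC port
openssl s_client -connect 172.17.129.130:8080 -showcerts </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -dates -fingerprint

# on the agent (.130): fingerprint of the freshly generated agent certificate
openssl x509 -in /var/lib/stork-agent/certs/cert.pem -noout -subject -dates -fingerprint

If the fingerprints differ, the running agent is still serving an old certificate (e.g. it was not restarted after re-registration); if they match, the mismatch is more likely on the server side (the CA/certificate material the server keeps in its database).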

 

Regards

 

Marek

 

-----Original Message-----
From: Slawek Figiel <slawek at isc.org> 
Sent: Wednesday, May 8, 2024 7:44 AM
To: Marek Hajduczenia <mxhajduczenia at gmail.com>
Cc: stork-users at lists.isc.org
Subject: Re: [stork-users] Removing a node and re-adding it back causes a certificate error

 

Marek,

 

your logs show the connection from the Stork agent (172.17.129.130) to the Stork server (172.17.129.133) is established properly. The problem is the Stork server (172.17.129.133) cannot reach the Stork agent (172.17.129.130).

 

Please, try to perform the below tests:

 

1. From the 172.17.129.133 host ping the 172.17.129.130 host. Does it work?

2. From the 172.17.129.133 host, open/fetch http://172.17.129.130:9547/metrics. Does it return an HTTP 200 OK status and some metrics? If you specified the "--listen-stork-only" flag (or the "STORK_AGENT_LISTEN_STORK_ONLY" environment variable), remove it temporarily.

 

Please verify the ports opened by your containers/VMs (these are the default values; adjust them if you specified custom ones in the configuration):

 

- Stork server: 8080 (HTTP)

- Stork agent: 8080 (GRPC), 9547 (HTTP)

 

Regards,

Slawek

 

On 07/05/2024 18:47, Marek Hajduczenia wrote:

> I did go with the recommendation and even though I am 100% sure I have 

> IP reachability, the registration process with server token fails.

> 

> root at server-kea-node1:/home/ace# ping 172.17.129.133

> PING 172.17.129.133 (172.17.129.133) 56(84) bytes of data.

> 64 bytes from 172.17.129.133: icmp_seq=1 ttl=64 time=0.074 ms

> 64 bytes from 172.17.129.133: icmp_seq=2 ttl=64 time=0.063 ms

> 64 bytes from 172.17.129.133: icmp_seq=3 ttl=64 time=0.147 ms

> ^C

> --- 172.17.129.133 ping statistics ---

> 3 packets transmitted, 3 received, 0% packet loss, time 2054ms

> rtt min/avg/max/mdev = 0.063/0.094/0.147/0.037 ms

> root at server-kea-node1:/home/ace# sudo su stork-agent -s /bin/sh -c 'stork-agent register --server-url http://172.17.129.133:8080'

>  >>>> Server access token (optional):

>  >>>> IP address or FQDN of the host with Stork Agent (for the Stork Server connection) [server-kea-node1]: 172.17.129.130

>  >>>> Port number that Stork Agent will listen on [8080]:

> INFO[2024-05-07 16:44:26]         register.go:84    Forced agent 

> certificates regeneration.

> INFO[2024-05-07 16:44:26]         register.go:406   

> ======================================================================

> =======

> INFO[2024-05-07 16:44:26]         register.go:407   AGENT TOKEN: 

> E9EE6D836E249B0E9A8898E638DECFCAD35A6577A70672E8F639D4A46CEBC211

> INFO[2024-05-07 16:44:26]         register.go:408   

> ======================================================================

> =======

> INFO[2024-05-07 16:44:26]         register.go:413   Machine will be 

> automatically registered using the server token

> INFO[2024-05-07 16:44:26]         register.go:414   Agent token is 

> printed above for informational purposes only

> INFO[2024-05-07 16:44:26]         register.go:415   User does not need 

> to copy or verify the agent token during registration via the server 

> token

> INFO[2024-05-07 16:44:26]         register.go:416   It will be sent to 

> the server but it is not directly used in this type of machine 

> registration

> INFO[2024-05-07 16:44:26]         register.go:425   Try to register 

> agent in Stork Server

> INFO[2024-05-07 16:44:26]         register.go:262   Machine registered

> INFO[2024-05-07 16:44:26]         register.go:283   Stored 

> agent-signed cert and CA cert

> ERRO[2024-05-07 16:44:26]         register.go:454   Retrying ping 1/3 

> due to error                error="problem pinging machine: Cannot 

> ping machine"

> ERRO[2024-05-07 16:44:28]         register.go:454   Retrying ping 2/3 

> due to error                error="problem pinging machine: Cannot 

> ping machine"

> ERRO[2024-05-07 16:44:32]         register.go:459   Cannot ping 

> machine

>                            error="problem pinging machine: Cannot ping 

> machine"

> FATA[2024-05-07 16:44:32]             main.go:217   Registration 

> failed

> 

> I did try to add the --server-token flag but the net result is the 

> same

> 

> root at server-kea-node1:/home/ace# sudo su stork-agent -s /bin/sh -c 'stork-agent register --server-url http://172.17.129.133:8080 --server-token OQYuMxkWmc3dySolt6uytLY4NrSkLWpo'

>  >>>> IP address or FQDN of the host with Stork Agent (for the Stork Server connection) [server-kea-node1]: 172.17.129.130

>  >>>> Port number that Stork Agent will listen on [8080]:

> INFO[2024-05-07 16:46:52]         register.go:84    Forced agent 

> certificates regeneration.

> INFO[2024-05-07 16:46:52]         register.go:406   

> ======================================================================

> =======

> INFO[2024-05-07 16:46:52]         register.go:407   AGENT TOKEN: 

> D43AA9AA37F03B1D24A0ADC9CB23E4137FCC284429A1CC87AE397CC78E3DE4FC

> INFO[2024-05-07 16:46:52]         register.go:408   

> ======================================================================

> =======

> INFO[2024-05-07 16:46:52]         register.go:413   Machine will be 

> automatically registered using the server token

> INFO[2024-05-07 16:46:52]         register.go:414   Agent token is 

> printed above for informational purposes only

> INFO[2024-05-07 16:46:52]         register.go:415   User does not need 

> to copy or verify the agent token during registration via the server 

> token

> INFO[2024-05-07 16:46:52]         register.go:416   It will be sent to 

> the server but it is not directly used in this type of machine 

> registration

> INFO[2024-05-07 16:46:52]         register.go:425   Try to register 

> agent in Stork Server

> INFO[2024-05-07 16:46:52]         register.go:262   Machine registered

> INFO[2024-05-07 16:46:52]         register.go:283   Stored 

> agent-signed cert and CA cert

> ERRO[2024-05-07 16:46:52]         register.go:454   Retrying ping 1/3 

> due to error                error="problem pinging machine: Cannot 

> ping machine"

> ERRO[2024-05-07 16:46:54]         register.go:454   Retrying ping 2/3 

> due to error                error="problem pinging machine: Cannot 

> ping machine"

> ERRO[2024-05-07 16:46:58]         register.go:459   Cannot ping 

> machine

>                            error="problem pinging machine: Cannot ping 

> machine"

> FATA[2024-05-07 16:46:58]             main.go:217   Registration 

> failed

> 

> Regards

> 

> Marek

> 

> On Tue, May 7, 2024 at 10:38 AM Slawek Figiel <slawek at isc.org> wrote:

> 

>     Marek,

> 

>     it is interesting case. But don't worry I'm sure we will find the cause

>     of the problem soon.

> 

>     I see you performed the manual registration using the "register"

>     command. Could you use this command again, but this time provide the

>     `--server-token` flag? Your server token is on the machines page.

> 

>     An additional check is performed when the `--server-token` flag is

>     used.

>     After the successful registration, the server sends the Ping request

>     over the GRPC protocol to the agent. It verifies whether the provided

>     agent host is accessible from the server machine.

> 

>     If the operation fails, you must check your network configuration and

>     the IP address provided as the agent host.

> 

>     I'm waiting for your feedback.

> 

>     Regards,

>     Slawek

> 

>     On 07/05/2024 18:25, Marek Hajduczenia wrote:

>      > Inline, please, with [mh0507] tags

>      >

>      > -----Original Message-----

>      > From: Slawek Figiel <slawek at isc.org>

>      > Sent: Tuesday, May 7, 2024 10:21 AM

>      > To: Marek Hajduczenia <mxhajduczenia at gmail.com>

>      > Cc: stork-users at lists.isc.org

>      > Subject: Re: [stork-users] Removing a node and re-adding it back causes a certificate error

>      >

>      > Marek,

>      >

>      >   >     That has not solved my problem. I went through the

>     following process

>      >   >

>      >   >     1. Remove the previous registration for .130 machine at

>     Stork GUI

>      >   >     (Action > Remove)

>      >   >     2. Remove all content from /var/lib/stork-agen/certs and

>      >   >     /var/lib/stork-agen/tokens

>      >   >     3. Re-run registration

>      >

>      > Did you re-authorize the machine? (Machines => Unauthorized =>

>     Click the Authorize button). I suppose yes but I would like to

>     double-check.

>      >

>      > [mh0507] Correct, I did re-authorize the machine, that is part of

>     the standard work flow covered in the documentation for Stork.

>      >

>      >   > I am back where I was

>      >

>      > Hmm... Could you verify if the Stork server and Stork agent

>     versions are the same? You can check them by `stork-server

>     --version` and `stork-agent --version` commands.

>      >

>      > [mh0507] As requested, they are both on 1.16.0 as shown below.

>      >

>      > root at server-kea-control:/etc/stork# stork-server --version

>      > 1.16.0

>      >

>      > root at server-kea-node1:/var/lib/stork-agent# stork-agent --version

>      > 1.16.0

>      >

>      > Slawek

>      >

>      > On 07/05/2024 16:23, Marek Hajduczenia wrote:

>      >> The certs have been regenerated on the node, for what it is worth

>      >>

>      >> root at server-kea-node1:/var/lib/stork-agent# ls -lah certs/ total 20K

>      >> drwx------ 2 stork-agent root        4.0K May  7 11:47 .

>      >> drwxr-xr-x 4 stork-agent root        4.0K May  6 19:08 ..

>      >> -rw------- 1 stork-agent stork-agent  664 May  7 11:47 ca.pem

>      >> -rw------- 1 stork-agent stork-agent  656 May  7 11:47 cert.pem

>      >> -rw------- 1 stork-agent stork-agent  241 May  7 11:47 key.pem

>      >>

>      >> but it seems that the Stork Server side is holding onto old

>     certs? Not

>      >> sure where they would be stored - likely in the backend DB, but I do

>      >> not want to delete things at random.

>      >>

>      >> Regards

>      >>

>      >> Marek

>      >>

>      >> On Tue, May 7, 2024 at 5:56 AM Marek Hajduczenia <mxhajduczenia at gmail.com> wrote:

>      >>

>      >>      Hi Slawek,

>      >>

>      >>      That has not solved my problem. I went through the following

>      >> process

>      >>

>      >>      1. Remove the previous registration for .130 machine at

>     Stork GUI

>      >>      (Action > Remove)

>      >>      2. Remove all content from /var/lib/stork-agen/certs and

>      >>      /var/lib/stork-agen/tokens

>      >>      3. Re-run registration

>      >>

>      >>      root at server-kea-node1:/var/lib/stork-agent/tokens# sudo su stork-agent -s /bin/sh -c 'stork-agent register --server-url http://172.17.129.251:8080'

>      >>       >>>> Server access token (optional):

>      >>       >>>> IP address or FQDN of the host with Stork Agent (for

>     the Stork

>      >>      Server connection) [server-kea-node1]: 172.17.129.130

>      >>       >>>> Port number that Stork Agent will listen on [8080]:

>      >>      INFO[2024-05-07 11:47:14]         register.go:81    There

>     are no

>      >>      agent certificates - they will be generated.

>      >>      INFO[2024-05-07 11:47:14]         register.go:406

>      >>     

>     =============================================================================

>      >>      INFO[2024-05-07 11:47:14]         register.go:407   AGENT

>     TOKEN:

>      >>     

>     B777710F0547C3EA237002537E4B18202F888F4D0F6C2C00BA105167DE1688CE

>      >>      INFO[2024-05-07 11:47:14]         register.go:408

>      >>     

>     =============================================================================

>      >>      INFO[2024-05-07 11:47:14]         register.go:411 

>       Authorize the

>      >>      machine in the Stork web UI

>      >>      INFO[2024-05-07 11:47:14]         register.go:425   Try to

>     register

>      >>      agent in Stork Server

>      >>      INFO[2024-05-07 11:47:14]         register.go:262   Machine

>     registered

>      >>      INFO[2024-05-07 11:47:14]         register.go:283   Stored

>      >>      agent-signed cert and CA cert

>      >>      INFO[2024-05-07 11:47:14]             main.go:215 

>       Registration

>      >>      completed successfully

>      >>

>      >>      4. I am back where I was

>      >>

>      >>      image.png

>      >>

>      >>      I did restart the local Stork agent but that did not change

>      >> anything

>      >>

>      >>      root at server-kea-node1:/var/lib/stork-agent/tokens# service

>      >>      isc-kea-ctrl-agent restart

>      >>      root at server-kea-node1:/var/lib/stork-agent/tokens# service

>      >>      isc-kea-ctrl-agent status

>      >>      ● isc-kea-ctrl-agent.service - Kea Control Agent

>      >>            Loaded: loaded

>      >>      (/lib/systemd/system/isc-kea-ctrl-agent.service; enabled;

>     vendor

>      >>      preset: enabled)

>      >>            Active: active (running) since Tue 2024-05-07

>     11:50:16 UTC; 3s ago

>      >>              Docs: man:kea-ctrl-agent(8)

>      >>          Main PID: 10543 (kea-ctrl-agent)

>      >>             Tasks: 5 (limit: 9343)

>      >>            Memory: 1.4M

>      >>               CPU: 7ms

>      >>            CGroup: /system.slice/isc-kea-ctrl-agent.service

>      >>                    └─10543 /usr/sbin/kea-ctrl-agent -c

>      >>      /etc/kea/kea-ctrl-agent.conf

>      >>

>      >>      May 07 11:50:16 server-kea-node1 systemd[1]:

>      >>      isc-kea-ctrl-agent.service: Deactivated successfully.

>      >>      May 07 11:50:16 server-kea-node1 systemd[1]: Stopped Kea

>     Control Agent.

>      >>      May 07 11:50:16 server-kea-node1 systemd[1]:

>      >>      isc-kea-ctrl-agent.service: Consumed 48.595s CPU time.

>      >>      May 07 11:50:16 server-kea-node1 systemd[1]: Started Kea

>     Control Agent.

>      >>

>      >>      For what is worth, the message in the logs has changed

>      >>

>      >>      May  7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="info" msg="Completed pulling lease stats from Kea apps: 0/1 succeeded" file="      statspuller.go:71   "

>      >>      May  7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file="         manager.go:124  " agent="172.17.129.130:8080"

>      >>      May  7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="Failed to get state from the Stork agent; the agent is still not responding" file="            grpcli.go:326  " agent="172.17.129.130:8080"

>      >>      May  7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="failed to get state from agent 172.17.129.130:8080: grpc manager is unable to re-establish connection with the agent 172.17.129.130:8080: rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file="        statepuller.go:247  "

>      >>

>      >>      Not sure whether it for the better or worse

>      >>

>      >>      Regards

>      >>

>      >>      Marek

>      >>

>      >>      On Tue, May 7, 2024 at 4:06 AM Slawek Figiel <slawek at isc.org> wrote:

>      >>

>      >>          Hello Marek!

>      >>

>      >>          Stork server reports that the agent introduced itself

>     with a "bad

>      >>          certificate." Several reasons may cause it. I think you

>     should

>      >>          remove

>      >>          the existing cert files and re-register the agent.

>     Please do the

>      >>          following steps:

>      >>

>      >>          1. On the agent machine, remove the files in the

>      >>          `/var/lib/stork-agent`

>      >>          directory (you need to remove all files from the

>     `certs` and

>      >>          `tokens`

>      >>          subdirectories)

>      >>          2. If you manually registered the agent (by the `register`

>      >>          command, you

>      >>          need to call it again and restart the agent. If you

>     used the

>      >>          self-registration flow, just restart the agent.

>      >>          3. Open the Stork UI, go to the machines list, switch

>     to the

>      >>          "Unauthorized" tab, and re-authorize the agent.

>      >>

>      >>          I hope it'll solve your problem.

>      >>          Don't hesitate to ask for more details if you have any

>     questions.

>      >>

>      >>          Regards,

>      >>          Slawek Figiel

>      >>

>      >>          On 07/05/2024 00:05, mxhajduczenia at gmail.com wrote:

>      >>           > Dear Forum,

>      >>           >

>      >>           > I had two nodes added to Stork: .130 and .131 and

>     they were

>      >>          working

>      >>           > correctly. Node .130 had a kernel failure due to

>     changes I

>      >>          was trying to

>      >>           > apply and I did not make a copy, unfortunately. Long

>     story

>      >>          short, I had

>      >>           > to re-install node .130, and then I wanted to add it

>     back to

>      >>          Stork

>      >>           >

>      >>           > No matter what I do, I am getting the error shown above,

>      >>          i.e., Cannot

>      >>           > get state of machine.

>      >>           >

>      >>           > Syslog review shows only one error message following two

>      >>          warning messages.

>      >>           >

>      >>           > May  6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06 21:58:38" level="warning" msg="rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file="         manager.go:124  " agent="172.17.129.130:8080"

>      >>           >

>      >>           > May  6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06 21:58:38" level="warning" msg="Failed to get state from the Stork agent; the agent is still not responding" file="            grpcli.go:326  " agent="172.17.129.130:8080"

>      >>           >

>      >>           > May  6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06 21:58:38" level="warning" msg="failed to get state from agent 172.17.129.130:8080: grpc manager is unable to re-establish connection with the agent 172.17.129.130:8080: rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file="        statepuller.go:247  "

>      >>           >

>      >>           > I suspect that the TLS certificate does not get cleared when the machine is removed and a machine with the same IP address is re-added.

>      >>           >

>      >>           > I did not find a remedy for it for now and I do not

>     fancy a

>      >>          complete

>      >>           > re-install of Stork if I can avoid it.

>      >>           >

>      >>           > Any suggestions on how to fix it?

>      >>           >

>      >>           > Regards

>      >>           >

>      >>           > Marek

>      >>           >

>      >>           >

>      >>          --

>      >>          Stork-users mailing list

>      >>          Stork-users at lists.isc.org

>      >>          https://lists.isc.org/mailman/listinfo/stork-users

>      >>

>      >

> 
