[stork-users] Removing a node and re-adding it back causes a certificate error
Marek Hajduczenia
mxhajduczenia at gmail.com
Tue May 7 16:47:18 UTC 2024
I did go with the recommendation and even though I am 100% sure I have IP
reachability, the registration process with server token fails.
root@server-kea-node1:/home/ace# ping 172.17.129.133
PING 172.17.129.133 (172.17.129.133) 56(84) bytes of data.
64 bytes from 172.17.129.133: icmp_seq=1 ttl=64 time=0.074 ms
64 bytes from 172.17.129.133: icmp_seq=2 ttl=64 time=0.063 ms
64 bytes from 172.17.129.133: icmp_seq=3 ttl=64 time=0.147 ms
^C
--- 172.17.129.133 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2054ms
rtt min/avg/max/mdev = 0.063/0.094/0.147/0.037 ms
root@server-kea-node1:/home/ace# sudo su stork-agent -s /bin/sh -c 'stork-agent register --server-url http://172.17.129.133:8080'
>>>> Server access token (optional):
>>>> IP address or FQDN of the host with Stork Agent (for the Stork Server connection) [server-kea-node1]: 172.17.129.130
>>>> Port number that Stork Agent will listen on [8080]:
INFO[2024-05-07 16:44:26] register.go:84 Forced agent certificates regeneration.
INFO[2024-05-07 16:44:26] register.go:406 =============================================================================
INFO[2024-05-07 16:44:26] register.go:407 AGENT TOKEN: E9EE6D836E249B0E9A8898E638DECFCAD35A6577A70672E8F639D4A46CEBC211
INFO[2024-05-07 16:44:26] register.go:408 =============================================================================
INFO[2024-05-07 16:44:26] register.go:413 Machine will be automatically registered using the server token
INFO[2024-05-07 16:44:26] register.go:414 Agent token is printed above for informational purposes only
INFO[2024-05-07 16:44:26] register.go:415 User does not need to copy or verify the agent token during registration via the server token
INFO[2024-05-07 16:44:26] register.go:416 It will be sent to the server but it is not directly used in this type of machine registration
INFO[2024-05-07 16:44:26] register.go:425 Try to register agent in Stork Server
INFO[2024-05-07 16:44:26] register.go:262 Machine registered
INFO[2024-05-07 16:44:26] register.go:283 Stored agent-signed cert and CA cert
ERRO[2024-05-07 16:44:26] register.go:454 Retrying ping 1/3 due to error error="problem pinging machine: Cannot ping machine"
ERRO[2024-05-07 16:44:28] register.go:454 Retrying ping 2/3 due to error error="problem pinging machine: Cannot ping machine"
ERRO[2024-05-07 16:44:32] register.go:459 Cannot ping machine error="problem pinging machine: Cannot ping machine"
FATA[2024-05-07 16:44:32] main.go:217 Registration failed
I also tried adding the --server-token flag, but the net result is the same:
root@server-kea-node1:/home/ace# sudo su stork-agent -s /bin/sh -c 'stork-agent register --server-url http://172.17.129.133:8080 --server-token OQYuMxkWmc3dySolt6uytLY4NrSkLWpo'
>>>> IP address or FQDN of the host with Stork Agent (for the Stork Server connection) [server-kea-node1]: 172.17.129.130
>>>> Port number that Stork Agent will listen on [8080]:
INFO[2024-05-07 16:46:52] register.go:84 Forced agent certificates regeneration.
INFO[2024-05-07 16:46:52] register.go:406 =============================================================================
INFO[2024-05-07 16:46:52] register.go:407 AGENT TOKEN: D43AA9AA37F03B1D24A0ADC9CB23E4137FCC284429A1CC87AE397CC78E3DE4FC
INFO[2024-05-07 16:46:52] register.go:408 =============================================================================
INFO[2024-05-07 16:46:52] register.go:413 Machine will be automatically registered using the server token
INFO[2024-05-07 16:46:52] register.go:414 Agent token is printed above for informational purposes only
INFO[2024-05-07 16:46:52] register.go:415 User does not need to copy or verify the agent token during registration via the server token
INFO[2024-05-07 16:46:52] register.go:416 It will be sent to the server but it is not directly used in this type of machine registration
INFO[2024-05-07 16:46:52] register.go:425 Try to register agent in Stork Server
INFO[2024-05-07 16:46:52] register.go:262 Machine registered
INFO[2024-05-07 16:46:52] register.go:283 Stored agent-signed cert and CA cert
ERRO[2024-05-07 16:46:52] register.go:454 Retrying ping 1/3 due to error error="problem pinging machine: Cannot ping machine"
ERRO[2024-05-07 16:46:54] register.go:454 Retrying ping 2/3 due to error error="problem pinging machine: Cannot ping machine"
ERRO[2024-05-07 16:46:58] register.go:459 Cannot ping machine error="problem pinging machine: Cannot ping machine"
FATA[2024-05-07 16:46:58] main.go:217 Registration failed
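
ICMP ping only shows that the two hosts can reach each other at the IP level. The check that fails here is the server's ping to the agent over gRPC, which needs a TCP connection from the server machine to the agent host and port given at the prompts (172.17.129.130, port 8080), with the agent actually listening there. A minimal Python sketch of a separate TCP check follows; the helper name `tcp_port_open` is hypothetical and not part of Stork, and the check only confirms that a TCP connection can be opened, so it says nothing about TLS or gRPC:

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run on the Stork server host, `tcp_port_open("172.17.129.130", 8080)` returning False would point at a firewall rule, a wrong address, or an agent that is not listening, rather than at the certificates.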
Regards
Marek
On Tue, May 7, 2024 at 10:38 AM Slawek Figiel <slawek at isc.org> wrote:
> Marek,
>
> this is an interesting case. But don't worry, I'm sure we will find the cause
> of the problem soon.
>
> I see you performed the manual registration using the "register"
> command. Could you use this command again, but this time provide the
> `--server-token` flag? Your server token is on the machines page.
>
> An additional check is performed when the `--server-token` flag is used.
> After a successful registration, the server sends a Ping request
> over the gRPC protocol to the agent. It verifies whether the provided
> agent host is accessible from the server machine.
>
> If the operation fails, you must check your network configuration and
> the IP address provided as the agent host.
>
> I'm waiting for your feedback.
>
> Regards,
> Slawek
>
> On 07/05/2024 18:25, Marek Hajduczenia wrote:
> > Inline, please, with [mh0507] tags
> >
> > -----Original Message-----
> > From: Slawek Figiel <slawek at isc.org>
> > Sent: Tuesday, May 7, 2024 10:21 AM
> > To: Marek Hajduczenia <mxhajduczenia at gmail.com>
> > Cc: stork-users at lists.isc.org
> > Subject: Re: [stork-users] Removing a node and re-adding it back causes a certificate error
> >
> > Marek,
> >
> > > That has not solved my problem. I went through the following
> > > process:
> > >
> > > 1. Remove the previous registration for the .130 machine in the Stork GUI
> > > (Action > Remove)
> > > 2. Remove all content from /var/lib/stork-agent/certs and
> > > /var/lib/stork-agent/tokens
> > > 3. Re-run registration
> >
> > Did you re-authorize the machine? (Machines => Unauthorized => Click the
> > Authorize button). I suppose yes but I would like to double-check.
> >
> > [mh0507] Correct, I did re-authorize the machine; that is part of the
> > standard workflow covered in the documentation for Stork.
> >
> > > I am back where I was
> >
> > Hmm... Could you verify if the Stork server and Stork agent versions are
> > the same? You can check them with the `stork-server --version` and
> > `stork-agent --version` commands.
> >
> > [mh0507] As requested, they are both on 1.16.0 as shown below.
> >
> > root@server-kea-control:/etc/stork# stork-server --version
> > 1.16.0
> >
> > root@server-kea-node1:/var/lib/stork-agent# stork-agent --version
> > 1.16.0
> >
> > Slawek
> >
> > On 07/05/2024 16:23, Marek Hajduczenia wrote:
> >> The certs have been regenerated on the node, for what it is worth
> >>
> >> root@server-kea-node1:/var/lib/stork-agent# ls -lah certs/
> >> total 20K
> >> drwx------ 2 stork-agent root 4.0K May 7 11:47 .
> >> drwxr-xr-x 4 stork-agent root 4.0K May 6 19:08 ..
> >> -rw------- 1 stork-agent stork-agent 664 May 7 11:47 ca.pem
> >> -rw------- 1 stork-agent stork-agent 656 May 7 11:47 cert.pem
> >> -rw------- 1 stork-agent stork-agent 241 May 7 11:47 key.pem
> >>
> >> but it seems that the Stork Server side is holding onto old certs? Not
> >> sure where they would be stored - likely in the backend DB, but I do
> >> not want to delete things at random.
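
One way to test the stale-certificate suspicion without deleting anything is to compare fingerprints of the CA certificate stored by the agent and the CA certificate the server currently uses. A minimal Python sketch, assuming the server-side CA certificate has been exported to a PEM file by some means (the file name `server-ca.pem` in the note below is hypothetical, as are the helper names), and that each file holds a single certificate:

```python
import hashlib
import ssl


def pem_sha256_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 hex digest of the DER form of one PEM certificate."""
    # PEM_cert_to_DER_cert strips the BEGIN/END lines and base64-decodes the body.
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    return hashlib.sha256(der).hexdigest()


def same_certificate(path_a: str, path_b: str) -> bool:
    """Compare two single-certificate PEM files by fingerprint."""
    with open(path_a) as fa, open(path_b) as fb:
        return pem_sha256_fingerprint(fa.read()) == pem_sha256_fingerprint(fb.read())
```

If `same_certificate("/var/lib/stork-agent/certs/ca.pem", "server-ca.pem")` returns False, the agent and the server disagree about the CA, which would be consistent with the "bad certificate" message. The agent's files are readable only by the stork-agent user, so the script has to run as that user or as root.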
> >>
> >> Regards
> >>
> >> Marek
> >>
> >> On Tue, May 7, 2024 at 5:56 AM Marek Hajduczenia
> >> <mxhajduczenia at gmail.com> wrote:
> >>
> >> Hi Slawek,
> >>
> >> That has not solved my problem. I went through the following
> >> process:
> >>
> >> 1. Remove the previous registration for the .130 machine in the Stork GUI
> >> (Action > Remove)
> >> 2. Remove all content from /var/lib/stork-agent/certs and
> >> /var/lib/stork-agent/tokens
> >> 3. Re-run registration
> >>
> >> root@server-kea-node1:/var/lib/stork-agent/tokens# sudo su stork-agent -s /bin/sh -c 'stork-agent register --server-url http://172.17.129.251:8080'
> >> >>>> Server access token (optional):
> >> >>>> IP address or FQDN of the host with Stork Agent (for the Stork Server connection) [server-kea-node1]: 172.17.129.130
> >> >>>> Port number that Stork Agent will listen on [8080]:
> >> INFO[2024-05-07 11:47:14] register.go:81 There are no agent certificates - they will be generated.
> >> INFO[2024-05-07 11:47:14] register.go:406 =============================================================================
> >> INFO[2024-05-07 11:47:14] register.go:407 AGENT TOKEN: B777710F0547C3EA237002537E4B18202F888F4D0F6C2C00BA105167DE1688CE
> >> INFO[2024-05-07 11:47:14] register.go:408 =============================================================================
> >> INFO[2024-05-07 11:47:14] register.go:411 Authorize the machine in the Stork web UI
> >> INFO[2024-05-07 11:47:14] register.go:425 Try to register agent in Stork Server
> >> INFO[2024-05-07 11:47:14] register.go:262 Machine registered
> >> INFO[2024-05-07 11:47:14] register.go:283 Stored agent-signed cert and CA cert
> >> INFO[2024-05-07 11:47:14] main.go:215 Registration completed successfully
> >>
> >> 4. I am back where I was
> >>
> >> [screenshot: image.png]
> >>
> >> I did restart the local Stork agent but that did not change
> >> anything
> >>
> >> root@server-kea-node1:/var/lib/stork-agent/tokens# service isc-kea-ctrl-agent restart
> >> root@server-kea-node1:/var/lib/stork-agent/tokens# service isc-kea-ctrl-agent status
> >> ● isc-kea-ctrl-agent.service - Kea Control Agent
> >> Loaded: loaded (/lib/systemd/system/isc-kea-ctrl-agent.service; enabled; vendor preset: enabled)
> >> Active: active (running) since Tue 2024-05-07 11:50:16 UTC; 3s ago
> >> Docs: man:kea-ctrl-agent(8)
> >> Main PID: 10543 (kea-ctrl-agent)
> >> Tasks: 5 (limit: 9343)
> >> Memory: 1.4M
> >> CPU: 7ms
> >> CGroup: /system.slice/isc-kea-ctrl-agent.service
> >> └─10543 /usr/sbin/kea-ctrl-agent -c /etc/kea/kea-ctrl-agent.conf
> >>
> >> May 07 11:50:16 server-kea-node1 systemd[1]: isc-kea-ctrl-agent.service: Deactivated successfully.
> >> May 07 11:50:16 server-kea-node1 systemd[1]: Stopped Kea Control Agent.
> >> May 07 11:50:16 server-kea-node1 systemd[1]: isc-kea-ctrl-agent.service: Consumed 48.595s CPU time.
> >> May 07 11:50:16 server-kea-node1 systemd[1]: Started Kea Control Agent.
> >>
> >> For what it is worth, the message in the logs has changed:
> >>
> >> May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="info" msg="Completed pulling lease stats from Kea apps: 0/1 succeeded" file=" statspuller.go:71 "
> >> May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file=" manager.go:124 " agent="172.17.129.130:8080"
> >> May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="Failed to get state from the Stork agent; the agent is still not responding" file=" grpcli.go:326 " agent="172.17.129.130:8080"
> >> May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="failed to get state from agent 172.17.129.130:8080: grpc manager is unable to re-establish connection with the agent 172.17.129.130:8080: rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file=" statepuller.go:247 "
> >>
> >> Not sure whether it is for the better or worse
> >>
> >> Regards
> >>
> >> Marek
> >>
> >> On Tue, May 7, 2024 at 4:06 AM Slawek Figiel <slawek at isc.org> wrote:
> >>
> >> Hello Marek!
> >>
> >> Stork server reports that the agent introduced itself with a "bad
> >> certificate." Several reasons may cause it. I think you should remove
> >> the existing cert files and re-register the agent. Please do the
> >> following steps:
> >>
> >> 1. On the agent machine, remove the files in the `/var/lib/stork-agent`
> >> directory (you need to remove all files from the `certs` and `tokens`
> >> subdirectories).
> >> 2. If you manually registered the agent (by the `register` command), you
> >> need to call it again and restart the agent. If you used the
> >> self-registration flow, just restart the agent.
> >> 3. Open the Stork UI, go to the machines list, switch to the
> >> "Unauthorized" tab, and re-authorize the agent.
> >>
> >> I hope it'll solve your problem.
> >> Don't hesitate to ask for more details if you have any questions.
> >>
> >> Regards,
> >> Slawek Figiel
> >>
> >> On 07/05/2024 00:05, mxhajduczenia at gmail.com wrote:
> >> > Dear Forum,
> >> >
> >> > I had two nodes added to Stork, .130 and .131, and they were working
> >> > correctly. Node .130 had a kernel failure due to changes I was trying to
> >> > apply and I did not make a copy, unfortunately. Long story short, I had
> >> > to re-install node .130, and then I wanted to add it back to Stork.
> >> >
> >> > No matter what I do, I am getting the error shown above, i.e., Cannot
> >> > get state of machine.
> >> >
> >> > Syslog review shows only one error message following two warning messages.
> >> >
> >> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06 21:58:38" level="warning" msg="rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file=" manager.go:124 " agent="172.17.129.130:8080"
> >> >
> >> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06 21:58:38" level="warning" msg="Failed to get state from the Stork agent; the agent is still not responding" file=" grpcli.go:326 " agent="172.17.129.130:8080"
> >> >
> >> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06 21:58:38" level="warning" msg="failed to get state from agent 172.17.129.130:8080: grpc manager is unable to re-establish connection with the agent 172.17.129.130:8080: rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file=" statepuller.go:247 "
> >> >
> >> > I suspect that the TLS certificate does not get cleared when the machine
> >> > is removed and a machine with the same IP address is re-added.
> >> >
> >> > I have not found a remedy so far, and I do not fancy a complete
> >> > re-install of Stork if I can avoid it.
> >> >
> >> > Any suggestions on how to fix it?
> >> >
> >> > Regards
> >> >
> >> > Marek
> >> >
> >> >
> >> --
> >> Stork-users mailing list
> >> Stork-users at lists.isc.org
> >> https://lists.isc.org/mailman/listinfo/stork-users
> >>
> >
>