[stork-users] Removing a node and re-adding it back causes a certificate error
Marek Hajduczenia
mxhajduczenia at gmail.com
Tue May 7 11:56:05 UTC 2024
Hi Slawek,
That has not solved my problem. I went through the following process:
1. Remove the previous registration for .130 machine at Stork GUI (Action >
Remove)
2. Remove all content from /var/lib/stork-agent/certs and
/var/lib/stork-agent/tokens
3. Re-run registration
root@server-kea-node1:/var/lib/stork-agent/tokens# sudo su stork-agent -s /bin/sh -c 'stork-agent register --server-url http://172.17.129.251:8080'
>>>> Server access token (optional):
>>>> IP address or FQDN of the host with Stork Agent (for the Stork Server connection) [server-kea-node1]: 172.17.129.130
>>>> Port number that Stork Agent will listen on [8080]:
INFO[2024-05-07 11:47:14] register.go:81 There are no agent certificates - they will be generated.
INFO[2024-05-07 11:47:14] register.go:406 =============================================================================
INFO[2024-05-07 11:47:14] register.go:407 AGENT TOKEN: B777710F0547C3EA237002537E4B18202F888F4D0F6C2C00BA105167DE1688CE
INFO[2024-05-07 11:47:14] register.go:408 =============================================================================
INFO[2024-05-07 11:47:14] register.go:411 Authorize the machine in the Stork web UI
INFO[2024-05-07 11:47:14] register.go:425 Try to register agent in Stork Server
INFO[2024-05-07 11:47:14] register.go:262 Machine registered
INFO[2024-05-07 11:47:14] register.go:283 Stored agent-signed cert and CA cert
INFO[2024-05-07 11:47:14] main.go:215 Registration completed successfully
4. I am back where I was
[image: image.png]
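
For reference, step 2 above could be scripted along these lines. This is only a minimal sketch in Python and not part of Stork: the function name `clear_agent_credentials` is hypothetical, and it assumes the agent keeps its state under /var/lib/stork-agent in the `certs` and `tokens` subdirectories, as described in this thread. It empties both subdirectories but leaves the directories themselves in place, so their ownership and permissions are preserved for the next registration.

```python
import shutil
from pathlib import Path


def clear_agent_credentials(base_dir="/var/lib/stork-agent",
                            subdirs=("certs", "tokens")):
    """Remove every entry inside base_dir/<subdir>, keeping the directories.

    Returns the list of removed paths so the caller can log them.
    The default paths are assumptions taken from this thread.
    """
    removed = []
    for name in subdirs:
        directory = Path(base_dir) / name
        if not directory.is_dir():
            # Nothing to clean if the subdirectory does not exist.
            continue
        for entry in sorted(directory.iterdir()):
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(str(entry))
    return removed
```

It would have to be run as root or as the stork-agent user, for example `for path in clear_agent_credentials(): print("removed", path)`, followed by re-running the registration from step 3.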
I did restart the local Stork agent but that did not change anything
root@server-kea-node1:/var/lib/stork-agent/tokens# service isc-kea-ctrl-agent restart
root@server-kea-node1:/var/lib/stork-agent/tokens# service isc-kea-ctrl-agent status
● isc-kea-ctrl-agent.service - Kea Control Agent
Loaded: loaded (/lib/systemd/system/isc-kea-ctrl-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-05-07 11:50:16 UTC; 3s ago
Docs: man:kea-ctrl-agent(8)
Main PID: 10543 (kea-ctrl-agent)
Tasks: 5 (limit: 9343)
Memory: 1.4M
CPU: 7ms
CGroup: /system.slice/isc-kea-ctrl-agent.service
└─10543 /usr/sbin/kea-ctrl-agent -c /etc/kea/kea-ctrl-agent.conf
May 07 11:50:16 server-kea-node1 systemd[1]: isc-kea-ctrl-agent.service: Deactivated successfully.
May 07 11:50:16 server-kea-node1 systemd[1]: Stopped Kea Control Agent.
May 07 11:50:16 server-kea-node1 systemd[1]: isc-kea-ctrl-agent.service: Consumed 48.595s CPU time.
May 07 11:50:16 server-kea-node1 systemd[1]: Started Kea Control Agent.
For what it is worth, the message in the logs has changed:
May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="info" msg="Completed pulling lease stats from Kea apps: 0/1 succeeded" file=" statspuller.go:71 "
May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file=" manager.go:124 " agent="172.17.129.130:8080"
May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="Failed to get state from the Stork agent; the agent is still not responding" file=" grpcli.go:326 " agent="172.17.129.130:8080"
May 7 11:54:39 server-kea-control stork-server[719]: time="2024-05-07 11:54:39" level="warning" msg="failed to get state from agent 172.17.129.130:8080: grpc manager is unable to re-establish connection with the agent 172.17.129.130:8080: rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: remote error: tls: bad certificate\"" file=" statepuller.go:247 "
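
To see at a glance which agents the server rejects, the syslog entries above can be filtered for the TLS error. The following is a rough Python sketch, not a Stork tool: the function name `bad_certificate_agents` is hypothetical, and it assumes each log entry sits on a single line and names the agent as an IPv4 `address:port` pair, as in the entries shown here.

```python
import re
from collections import Counter

# IPv4 address followed by a port, e.g. 172.17.129.130:8080
ADDRESS = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}:\d+)\b")


def bad_certificate_agents(log_lines):
    """Count 'tls: bad certificate' entries per agent address.

    Lines without that error text are ignored; for matching lines the
    first address:port found in the line is taken as the agent.
    """
    counts = Counter()
    for line in log_lines:
        if "tls: bad certificate" not in line:
            continue
        match = ADDRESS.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It could be fed from the server's syslog, for example `with open("/var/log/syslog") as log: print(bad_certificate_agents(log))`; the log path is an assumption and depends on the distribution.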
Not sure whether it is for the better or worse.
Regards
Marek
On Tue, May 7, 2024 at 4:06 AM Slawek Figiel <slawek at isc.org> wrote:
> Hello Marek!
>
> Stork server reports that the agent introduced itself with a "bad
> certificate." Several reasons may cause it. I think you should remove
> the existing cert files and re-register the agent. Please do the
> following steps:
>
> 1. On the agent machine, remove the files in the `/var/lib/stork-agent`
> directory (you need to remove all files from the `certs` and `tokens`
> subdirectories)
> 2. If you manually registered the agent (by the `register` command), you
> need to call it again and restart the agent. If you used the
> self-registration flow, just restart the agent.
> 3. Open the Stork UI, go to the machines list, switch to the
> "Unauthorized" tab, and re-authorize the agent.
>
> I hope it'll solve your problem.
> Don't hesitate to ask for more details if you have any questions.
>
> Regards,
> Slawek Figiel
>
> On 07/05/2024 00:05, mxhajduczenia at gmail.com wrote:
> > Dear Forum,
> >
> > I had two nodes added to Stork: .130 and .131 and they were working
> > correctly. Node .130 had a kernel failure due to changes I was trying to
> > apply and I did not make a copy, unfortunately. Long story short, I had
> > to re-install node .130, and then I wanted to add it back to Stork
> >
> > No matter what I do, I am getting the error shown above, i.e., Cannot
> > get state of machine.
> >
> > Syslog review shows only one error message following two warning
> > messages.
> >
> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06
> > 21:58:38" level="warning" msg="rpc error: code = Unavailable desc =
> > connection error: desc = \"error reading server preface: remote error:
> > tls: bad certificate\"" file=" manager.go:124 "
> > agent="172.17.129.130:8080"
> >
> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06
> > 21:58:38" level="warning" msg="Failed to get state from the Stork agent;
> > the agent is still not responding" file=" grpcli.go:326 "
> > agent="172.17.129.130:8080"
> >
> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06
> > 21:58:38" level="warning" msg="failed to get state from agent
> > 172.17.129.130:8080: grpc manager is unable to re-establish connection
> > with the agent 172.17.129.130:8080: rpc error: code = Unavailable desc =
> > connection error: desc = \"error reading server preface: remote error:
> > tls: bad certificate\"" file=" statepuller.go:247 "
> >
> > I suspect that the TLS certificate does not get cleared when the machine
> > is removed and a machine with the same IP address is re-added.
> >
> > I did not find a remedy for it for now and I do not fancy a complete
> > re-install of Stork if I can avoid it.
> >
> > Any suggestions on how to fix it?
> >
> > Regards
> >
> > Marek
> >
> >
> --
> Stork-users mailing list
> Stork-users at lists.isc.org
> https://lists.isc.org/mailman/listinfo/stork-users
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 76255 bytes
Desc: not available
URL: <https://lists.isc.org/pipermail/stork-users/attachments/20240507/85490422/attachment-0001.png>