[stork-users] Removing a node and re-adding it back causes a certificate error
Slawek Figiel
slawek at isc.org
Tue May 7 16:38:10 UTC 2024
Marek,
It is an interesting case, but don't worry; I'm sure we will find the
cause of the problem soon.
I see you performed the manual registration using the "register"
command. Could you run this command again, but this time provide the
`--server-token` flag? Your server token is shown on the machines page.
An additional check is performed when the `--server-token` flag is used:
after a successful registration, the server sends a Ping request to the
agent over the gRPC protocol, which verifies that the provided agent
host is reachable from the server machine. If that check fails, please
review your network configuration and the IP address you provided as
the agent host.
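For example, reusing the values from your previous run (the token value
below is only a placeholder; paste the one shown on the machines page):

sudo su stork-agent -s /bin/sh -c 'stork-agent register \
    --server-url http://172.17.129.251:8080 --server-token <paste-token-here>'

The agent host and port are prompted for interactively, as before. If the
Ping check still fails afterwards, a quick reachability test from the
server machine, for example `nc -vz 172.17.129.130 8080` (assuming netcat
is installed), helps to tell a network problem from a certificate problem.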
I look forward to your feedback.
Regards,
Slawek
On 07/05/2024 18:25, Marek Hajduczenia wrote:
> Inline, please, with [mh0507] tags
>
> -----Original Message-----
> From: Slawek Figiel <slawek at isc.org>
> Sent: Tuesday, May 7, 2024 10:21 AM
> To: Marek Hajduczenia <mxhajduczenia at gmail.com>
> Cc: stork-users at lists.isc.org
> Subject: Re: [stork-users] Removing a node and re-adding it back causes a certificate error
>
> Marek,
>
> > That has not solved my problem. I went through the following process
> >
> > 1. Remove the previous registration for .130 machine at Stork GUI
> > (Action > Remove)
> > 2. Remove all content from /var/lib/stork-agent/certs and
> > /var/lib/stork-agent/tokens
> > 3. Re-run registration
>
> Did you re-authorize the machine? (Machines => Unauthorized => click the Authorize button). I suppose yes, but I would like to double-check.
>
> [mh0507] Correct, I did re-authorize the machine; that is part of the standard workflow covered in the Stork documentation.
>
> > I am back where I was
>
> Hmm... Could you verify if the Stork server and Stork agent versions are the same? You can check them by `stork-server --version` and `stork-agent --version` commands.
>
> [mh0507] As requested, they are both on 1.16.0 as shown below.
>
> root at server-kea-control:/etc/stork# stork-server --version
> 1.16.0
>
> root at server-kea-node1:/var/lib/stork-agent# stork-agent --version
> 1.16.0
>
> Slawek
>
> On 07/05/2024 16:23, Marek Hajduczenia wrote:
>> The certs have been regenerated on the node, for what it is worth
>>
>> root at server-kea-node1:/var/lib/stork-agent# ls -lah certs/
>> total 20K
>> drwx------ 2 stork-agent root 4.0K May 7 11:47 .
>> drwxr-xr-x 4 stork-agent root 4.0K May 6 19:08 ..
>> -rw------- 1 stork-agent stork-agent 664 May 7 11:47 ca.pem
>> -rw------- 1 stork-agent stork-agent 656 May 7 11:47 cert.pem
>> -rw------- 1 stork-agent stork-agent 241 May 7 11:47 key.pem
>>
>> but it seems that the Stork Server side is holding onto old certs? Not
>> sure where they would be stored - likely in the backend DB, but I do
>> not want to delete things at random.
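>>
>> If it helps, the dates and fingerprint of the regenerated leaf
>> certificate can be double-checked with plain openssl (nothing
>> Stork-specific is assumed here):
>>
>> openssl x509 -in /var/lib/stork-agent/certs/cert.pem -noout \
>>     -subject -dates -fingerprint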
>>
>> Regards
>>
>> Marek
>>
>> On Tue, May 7, 2024 at 5:56 AM Marek Hajduczenia
>> <mxhajduczenia at gmail.com> wrote:
>>
>> Hi Slawek,
>>
>> That has not solved my problem. I went through the following
>> process
>>
>> 1. Remove the previous registration for .130 machine at Stork GUI
>> (Action > Remove)
>> 2. Remove all content from /var/lib/stork-agent/certs and
>> /var/lib/stork-agent/tokens
>> 3. Re-run registration
>>
>> root at server-kea-node1:/var/lib/stork-agent/tokens# sudo su
>> stork-agent -s /bin/sh -c 'stork-agent register --server-url
>> http://172.17.129.251:8080'
>> >>>> Server access token (optional):
>> >>>> IP address or FQDN of the host with Stork Agent (for the Stork
>> Server connection) [server-kea-node1]: 172.17.129.130
>> >>>> Port number that Stork Agent will listen on [8080]:
>> INFO[2024-05-07 11:47:14] register.go:81 There are no
>> agent certificates - they will be generated.
>> INFO[2024-05-07 11:47:14] register.go:406
>> =============================================================================
>> INFO[2024-05-07 11:47:14] register.go:407 AGENT TOKEN:
>> B777710F0547C3EA237002537E4B18202F888F4D0F6C2C00BA105167DE1688CE
>> INFO[2024-05-07 11:47:14] register.go:408
>> =============================================================================
>> INFO[2024-05-07 11:47:14] register.go:411 Authorize the
>> machine in the Stork web UI
>> INFO[2024-05-07 11:47:14] register.go:425 Try to register
>> agent in Stork Server
>> INFO[2024-05-07 11:47:14] register.go:262 Machine registered
>> INFO[2024-05-07 11:47:14] register.go:283 Stored
>> agent-signed cert and CA cert
>> INFO[2024-05-07 11:47:14] main.go:215 Registration
>> completed successfully
>>
>> 4. I am back where I was
>>
>> [screenshot: image.png, not included in the archive]
>>
>> I did restart the local Stork agent but that did not change
>> anything
>>
>> root at server-kea-node1:/var/lib/stork-agent/tokens# service
>> isc-kea-ctrl-agent restart
>> root at server-kea-node1:/var/lib/stork-agent/tokens# service
>> isc-kea-ctrl-agent status
>> ● isc-kea-ctrl-agent.service - Kea Control Agent
>> Loaded: loaded
>> (/lib/systemd/system/isc-kea-ctrl-agent.service; enabled; vendor
>> preset: enabled)
>> Active: active (running) since Tue 2024-05-07 11:50:16 UTC; 3s ago
>> Docs: man:kea-ctrl-agent(8)
>> Main PID: 10543 (kea-ctrl-agent)
>> Tasks: 5 (limit: 9343)
>> Memory: 1.4M
>> CPU: 7ms
>> CGroup: /system.slice/isc-kea-ctrl-agent.service
>> └─10543 /usr/sbin/kea-ctrl-agent -c
>> /etc/kea/kea-ctrl-agent.conf
>>
>> May 07 11:50:16 server-kea-node1 systemd[1]:
>> isc-kea-ctrl-agent.service: Deactivated successfully.
>> May 07 11:50:16 server-kea-node1 systemd[1]: Stopped Kea Control Agent.
>> May 07 11:50:16 server-kea-node1 systemd[1]:
>> isc-kea-ctrl-agent.service: Consumed 48.595s CPU time.
>> May 07 11:50:16 server-kea-node1 systemd[1]: Started Kea Control Agent.
>>
>> For what it's worth, the message in the logs has changed
>>
>> May 7 11:54:39 server-kea-control stork-server[719]:
>> time="2024-05-07 11:54:39" level="info" msg="Completed pulling lease
>> stats from Kea apps: 0/1 succeeded" file=" statspuller.go:71 "
>> May 7 11:54:39 server-kea-control stork-server[719]:
>> time="2024-05-07 11:54:39" level="warning" msg="rpc error: code =
>> Unavailable desc = connection error: desc = \"error reading server
>> preface: remote error: tls: bad certificate\"" file="
>> manager.go:124 " agent="172.17.129.130:8080"
>> May 7 11:54:39 server-kea-control stork-server[719]:
>> time="2024-05-07 11:54:39" level="warning" msg="Failed to get state
>> from the Stork agent; the agent is still not responding" file="
>> grpcli.go:326 " agent="172.17.129.130:8080"
>> May 7 11:54:39 server-kea-control stork-server[719]:
>> time="2024-05-07 11:54:39" level="warning" msg="failed to get state
>> from agent 172.17.129.130:8080: grpc manager is unable to
>> re-establish connection with the agent 172.17.129.130:8080: rpc error: code =
>> Unavailable desc = connection error: desc = \"error reading server
>> preface: remote error: tls: bad certificate\"" file="
>> statepuller.go:247 "
>>
>> Not sure whether it is for the better or worse
>>
>> Regards
>>
>> Marek
>>
>> On Tue, May 7, 2024 at 4:06 AM Slawek Figiel <slawek at isc.org> wrote:
>>
>> Hello Marek!
>>
>> The Stork server reports that the agent introduced itself with a "bad
>> certificate." Several things may cause this. I think you should remove
>> the existing cert files and re-register the agent. Please follow these
>> steps:
>>
>> 1. On the agent machine, remove the files in the `/var/lib/stork-agent`
>> directory (you need to remove all files from the `certs` and `tokens`
>> subdirectories).
>> 2. If you manually registered the agent (by the `register` command),
>> you need to call it again and restart the agent; see the example
>> commands after this list. If you used the self-registration flow, just
>> restart the agent.
>> 3. Open the Stork UI, go to the machines list, switch to the
>> "Unauthorized" tab, and re-authorize the agent.
>>
>> I hope it'll solve your problem.
>> Don't hesitate to ask for more details if you have any questions.
>>
>> Regards,
>> Slawek Figiel
>>
>> On 07/05/2024 00:05, mxhajduczenia at gmail.com wrote:
>> > Dear Forum,
>> >
>> > I had two nodes added to Stork: .130 and .131 and they were working
>> > correctly. Node .130 had a kernel failure due to changes I was trying
>> > to apply and I did not make a copy, unfortunately. Long story short,
>> > I had to re-install node .130, and then I wanted to add it back to
>> > Stork.
>> >
>> > No matter what I do, I am getting the error shown above, i.e., Cannot
>> > get state of machine.
>> >
>> > Syslog review shows only one error message following two warning
>> > messages.
>> >
>> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06
>> > 21:58:38" level="warning" msg="rpc error: code = Unavailable desc =
>> > connection error: desc = \"error reading server preface: remote error:
>> > tls: bad certificate\"" file=" manager.go:124 "
>> > agent="172.17.129.130:8080"
>> >
>> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06
>> > 21:58:38" level="warning" msg="Failed to get state from the Stork agent;
>> > the agent is still not responding" file=" grpcli.go:326 "
>> > agent="172.17.129.130:8080"
>> >
>> > May 6 21:58:38 server-kea-control stork-server[719]: time="2024-05-06
>> > 21:58:38" level="warning" msg="failed to get state from agent
>> > 172.17.129.130:8080: grpc manager is unable to re-establish connection
>> > with the agent 172.17.129.130:8080: rpc error: code = Unavailable desc =
>> > connection error: desc = \"error reading server preface: remote error:
>> > tls: bad certificate\"" file=" statepuller.go:247 "
>> >
>> > I suspect that the TLS certificate does not get cleared when the
>> > machine is removed and a machine with the same IP address is re-added.
>> >
>> > I have not found a remedy for it so far, and I do not fancy a complete
>> > re-install of Stork if I can avoid it.
>> >
>> > Any suggestions on how to fix it?
>> >
>> > Regards
>> >
>> > Marek
>> >
>> >
>> --
>> Stork-users mailing list
>> Stork-users at lists.isc.org
>> https://lists.isc.org/mailman/listinfo/stork-users
>>
>