Our Vault HA cluster runs on EC2 instances and is deployed via a CI/CD pipeline. When we spin up a new cluster (regardless of node count), all nodes auto-unseal as expected, and any nodes we later add to the cluster auto-unseal as well. However, if we spin up a cluster (again, regardless of node count), initialize Vault, and all nodes unseal as expected, then import a Raft snapshot, the existing nodes stay healthy and all serve the restored Raft data, but any new node added to that cluster afterwards fails to auto-unseal. The only difference between these scenarios is the Raft snapshot import. The config files are identical across all nodes (apart from node-specific parameters); a representative sketch is included below.
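For reference, each node's config has roughly the shape sketched here. The values are placeholders rather than our exact file: hostnames, paths, and certificate names are illustrative, and the seal stanza is shown as awskms for illustration.

# representative node config (placeholder values)
ui           = true
api_addr     = "https://172.18.99.123:8200"
cluster_addr = "https://172.18.99.123:8201"

listener "tcp" {
  address         = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_cert_file   = "/etc/vault.d/tls/vault.crt"
  tls_key_file    = "/etc/vault.d/tls/vault.key"
}

# auto-unseal (shown as AWS KMS for illustration)
seal "awskms" {
  region     = "us-west-2"
  kms_key_id = "<kms-key-id>"
}

storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-node-1"

  retry_join {
    leader_api_addr       = "https://172.18.99.123:8200"
    leader_ca_cert_file   = "/etc/vault.d/tls/ca.crt"
    leader_tls_servername = "vault.example.internal"
  }
}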
These are the errors from the leader node and from the newly added node that fails to unseal.
# error log from leader node
Dec 17 21:48:56 ip-172-18-99-123.us-west-2.compute.internal vault[6301]: {"@level":"error","@message":"failed to heartbeat to","@module":"storage.raft","@timestamp":"2024-12-17T21:48:56.170451Z","backoff time":500000000,"error":"remote error: tls: unrecognized name","peer":"172.18.99.242:8201"}
# error log from new node
Dec 17 21:47:16 ip-172-18-99-242.us-west-2.compute.internal vault[5756]: {"@level":"info","@message":"http: TLS handshake error from 172.18.99.235:9875: EOF","@timestamp":"2024-12-17T21:47:16.210103Z"}
Dec 17 21:47:20 ip-172-18-99-242.us-west-2.compute.internal vault[5756]: {"@level":"info","@message":"stored unseal keys supported, attempting fetch","@module":"core","@timestamp":"2024-12-17T21:47:20.204400Z"}
Dec 17 21:47:20 ip-172-18-99-242.us-west-2.compute.internal vault[5756]: {"@level":"warn","@message":"failed to unseal core","@timestamp":"2024-12-17T21:47:20.205231Z","error":"stored unseal keys are supported, but none were found"}