Vault upgrade failing for 1.6.5

I am trying to upgrade Vault to 1.6.5. My current version is 1.5.6. I have a three-node HA configuration using Raft.

Steps performed

  1. Node - 1 - upgrade to 1.6.5
  2. vault unseal done
  3. vault operator raft join (this is failing with the following error):

Error joining the node to the Raft cluster: Error making API request.
Code: 500. Errors:

  • failed to join raft cluster: failed to join any raft leader node

When I revert the Vault version to 1.5.6 this is not an issue and things work fine. Any pointers would be helpful here. We are facing this issue upgrading to 1.6.5.

What's your upgrade plan/order?
You should upgrade the 2 standbys first and bring them back online, then step down the active and finally upgrade that.

If both updated instances are not part of the cluster, will they join / become active once I shut down the master node?

I can see an error in the master node logs when I update the standby nodes:
[ERROR] ha.raft: failed to heartbeat to: peer=a.vault.xxxxxxx:8201 error="remote error: tls: internal error"

No. Those nodes aren’t part of a cluster so how would they know to do anything? Maybe I misunderstand here… but if you have a 3 node cluster that means all those nodes should show up with list-peers

On the upgraded node, if we run vault operator raft list-peers we see this error:

Error reading the raft cluster configuration: Error making API request.
URL: GET https://b.vault.XXXXXX:8200/v1/sys/storage/raft/configuration
Code: 500. Errors:

  • local node not active but active cluster node not found

With vault operator raft join from the updated node I see the following error:
Error joining the node to the Raft cluster: Error making API request.

URL: POST https://b.vault.XXXXXXX:8200/v1/sys/storage/raft/join

Code: 500. Errors:

  • failed to join raft cluster: failed to join any raft leader node

At the same time we see the following error on the master node:
Jun 10 19:22:54 hashicorp-vault-*** vault[2197]: 2021-06-10T19:22:54.565Z [ERROR] ha.raft: failed to appendEntries to: peer="{Voter b.vault.XXXXX b.vault.XXXXX:8201}" error="remote error: tls: internal error".
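
The upgrade order suggested in the reply above (standbys first, then step down the active node and upgrade it last), as an added example that is not part of the original thread. It assumes the peer list comes from `vault operator raft list-peers -format=json`; the node names, addresses and the `plan_upgrade` helper are hypothetical.

```python
import json

# Constructed sample in the shape of `vault operator raft list-peers -format=json`
# (data.config.servers). Node names and addresses are placeholders.
SAMPLE_PEERS = json.dumps({"data": {"config": {"servers": [
    {"node_id": "vault-a", "address": "a.vault.example:8201", "leader": True, "voter": True},
    {"node_id": "vault-b", "address": "b.vault.example:8201", "leader": False, "voter": True},
    {"node_id": "vault-c", "address": "c.vault.example:8201", "leader": False, "voter": True},
]}}})


def plan_upgrade(peers_json, expected_nodes):
    """Return ordered (action, node_id) steps: standbys first, active node last."""
    servers = json.loads(peers_json)["data"]["config"]["servers"]
    by_id = {s["node_id"]: s for s in servers}

    # Every node must already be listed as a raft peer before anything is
    # upgraded; refuse to plan if one of the expected nodes is absent.
    missing = sorted(set(expected_nodes) - set(by_id))
    if missing:
        raise ValueError(f"not in raft configuration: {', '.join(missing)}")

    leaders = [s["node_id"] for s in servers if s.get("leader")]
    if len(leaders) != 1:
        raise ValueError(f"expected exactly one active node, found {len(leaders)}")
    active = leaders[0]

    steps = []
    for node in sorted(n for n in by_id if n != active):
        steps.append(("upgrade", node))
        # Bring the standby back online and confirm it is in list-peers again
        steps.append(("confirm-peer", node))
    steps.append(("step-down", active))  # vault operator step-down
    steps.append(("upgrade", active))
    return steps


if __name__ == "__main__":
    nodes = ["vault-a", "vault-b", "vault-c"]
    for action, node in plan_upgrade(SAMPLE_PEERS, nodes):
        print(f"{action:13} {node}")
```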