Raft: how to restore from 3-node HA cluster to 1-node DR instance

Hi,

I am trying to take a snapshot of a live 3-node Vault cluster with Raft storage, and restore it onto a single DR node on a different IP address. It’s in a different data centre, and the data changes only rarely, so a static snapshot is fine.
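(For reference, the snapshot itself is taken with the standard raft snapshot save command against the active node; something like the following, assuming the main cluster uses the same 18200 API port as the DR config shown below:)

export VAULT_ADDR=https://10.0.0.104:18200
vault operator raft snapshot save 202102041017.snap    # needs a token with access to sys/storage/raft/snapshot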

However, I have got stuck getting the DR instance to come up on its new IP address after restoring the snapshot, which still contains the old Raft peer IPs. I’ve been through some forum posts, which got me this far:

The test environment is all built in VMs. My main cluster is on 10.0.0.104 / 108 / 109, and the DR node is 192.0.2.51. Steps done so far:

  • Take snapshot on main active node 10.0.0.104 (node_id “vault-dev1”)
  • Install Vault on the DR node, with a new vault-conf.hcl containing its own IP address:
storage "raft" {
  path = "/opt/vault-dev/data"
  node_id = "vault-dev1"
}
cluster_addr = "https://192.0.2.51:18201"
api_addr = "https://192.0.2.51:18200"
disable_mlock = "true"
ui = "true"

listener "tcp" {
  address = "192.0.2.51:18200"
  tls_min_version = "tls10"
  tls_cert_file = "/opt/vault-dev/certificates/vault-dev1.cert"
  tls_key_file = "/opt/vault-dev/certificates/vault-dev1.key"
}
  • To restore the snapshot, I need Vault to be running, initialized and unsealed (with temporary keys):
[root@drvault vault-dev]# /opt/vault-dev/vault operator raft snapshot restore ~/202102041017.snap
Error installing the snapshot: Post "https://127.0.0.1:8200/v1/sys/storage/raft/snapshot": dial tcp 127.0.0.1:8200: connect: connection refused
[root@drvault vault-dev]# systemctl start vault-dev
[root@drvault vault-dev]# export VAULT_ADDR=https://192.0.2.51:18200
[root@drvault vault-dev]# /opt/vault-dev/vault operator raft snapshot restore ~/202102041017.snap
Error installing the snapshot: Post "https://192.0.2.51:18200/v1/sys/storage/raft/snapshot": x509: certificate is valid for 10.0.0.104, not 192.0.2.51
[root@drvault vault-dev]# export VAULT_SKIP_VERIFY=1
[root@drvault vault-dev]# /opt/vault-dev/vault operator raft snapshot restore ~/202102041017.snap
Error installing the snapshot: Error making API request.

URL: POST https://192.0.2.51:18200/v1/sys/storage/raft/snapshot
Code: 503. Errors:

* Vault is sealed
[root@drvault vault-dev]# /opt/vault-dev/vault operator init -key-shares=5 -key-threshold=2
... note the results
[root@drvault vault-dev]# /opt/vault-dev/vault operator unseal
Unseal Key (will be hidden):
...
[root@drvault vault-dev]# /opt/vault-dev/vault operator unseal
Unseal Key (will be hidden):
...
[root@drvault vault-dev]# /opt/vault-dev/vault operator raft snapshot restore ~/202102041017.snap
Error installing the snapshot: Error making API request.

URL: POST https://192.0.2.51:18200/v1/sys/storage/raft/snapshot
Code: 400. Errors:

* missing client token
[root@drvault vault-dev]# /opt/vault-dev/vault login
Token (will be hidden):
Success! You are now authenticated.
...
[root@drvault vault-dev]# /opt/vault-dev/vault operator raft snapshot restore ~/202102041017.snap
Error installing the snapshot: Error making API request.

URL: POST https://192.0.2.51:18200/v1/sys/storage/raft/snapshot
Code: 400. Errors:

* could not verify hash file, possibly the snapshot is using a different set of unseal keys; use the snapshot-force API to bypass this check
[root@drvault vault-dev]# /opt/vault-dev/vault operator raft snapshot restore --force ~/202102041017.snap
[root@drvault vault-dev]# 
  • So far, so good. Next, create a peers.json file in the raft data directory:
[
  {
    "id": "vault1-dev",
    "address": "192.0.2.51:18201",
    "non_voter": false
  }
]

and restart the server. However, when I do, I find that Vault still tries to contact the original node 10.0.0.104, even though it has clearly picked up peers.json:

Feb 04 11:11:56 drvault systemd[1]: Started Vault secret store.
Feb 04 11:11:56 drvault vault[1043]: ==> Vault server configuration:
Feb 04 11:11:56 drvault vault[1043]: Api Address: https://192.0.2.51:18200
Feb 04 11:11:56 drvault vault[1043]: Cgo: disabled
Feb 04 11:11:56 drvault vault[1043]: Cluster Address: https://192.0.2.51:18201
Feb 04 11:11:56 drvault vault[1043]: Go Version: go1.15.7
Feb 04 11:11:56 drvault vault[1043]: Listener 1: tcp (addr: "192.0.2.51:18200", cluster address: "192.0.2.51:18201", max_request_duration: "1m30s", max_request_size: "33554432", tls: "enabled")
Feb 04 11:11:56 drvault vault[1043]: Log Level: info
Feb 04 11:11:56 drvault vault[1043]: Mlock: supported: true, enabled: false
Feb 04 11:11:56 drvault vault[1043]: Recovery Mode: false
Feb 04 11:11:56 drvault vault[1043]: Storage: raft (HA available)
Feb 04 11:11:56 drvault vault[1043]: Version: Vault v1.6.2
Feb 04 11:11:56 drvault vault[1043]: Version Sha: be65a227ef2e80f8588b3b13584b5c0d9238c1d7
Feb 04 11:11:56 drvault vault[1043]: ==> Vault server started! Log data will stream in below:
Feb 04 11:11:56 drvault vault[1043]: 2021-02-04T11:11:56.316Z [INFO]  proxy environment: http_proxy= https_proxy= no_proxy=
Feb 04 11:11:56 drvault vault[1043]: 2021-02-04T11:11:56.350Z [INFO]  storage.raft.snapshot: reaping snapshot: path=/opt/vault-dev/data/raft/snapshots/3-5057-1612436954856
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.358Z [INFO]  core.cluster-listener.tcp: starting listener: listener_address=192.0.2.51:18201
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.358Z [INFO]  core.cluster-listener: serving cluster requests: cluster_listen_address=192.0.2.51:18201
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.362Z [INFO]  storage.raft: raft recovery initiated: recovery_file=peers.json
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.367Z [INFO]  storage.raft: raft recovery found new config: config="{[{Voter vault1-dev 192.0.2.51:18201}]}"
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.390Z [INFO]  storage.raft: raft recovery deleted peers.json
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.396Z [INFO]  storage.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:vault1-dev Address:192.0.2.51:18201}]"
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.398Z [INFO]  core: vault is unsealed
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.398Z [INFO]  core: entering standby mode
Feb 04 11:15:04 drvault vault[1043]: 2021-02-04T11:15:04.398Z [INFO]  storage.raft: entering follower state: follower="Node at 192.0.2.51:18201 [Follower]" leader=
Feb 04 11:15:13 drvault vault[1043]: 2021-02-04T11:15:13.682Z [WARN]  storage.raft: not part of stable configuration, aborting election
Feb 04 11:15:15 drvault vault[1043]: 2021-02-04T11:15:15.906Z [ERROR] core: error during forwarded RPC request: error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.0.0.104:18201: connect: no route to host""
Feb 04 11:15:15 drvault vault[1043]: 2021-02-04T11:15:15.907Z [ERROR] core: forward request error: error="error during forwarding RPC request"
Feb 04 11:17:30 drvault vault[1043]: 2021-02-04T11:17:30.298Z [ERROR] storage.raft: failed to take snapshot: error="nothing new to snapshot"

As a result, the raft cluster doesn’t come up, still thinking it needs to talk to 10.0.0.104:

[root@drvault vault-dev]# /opt/vault-dev/vault operator raft list-peers
Error reading the raft cluster configuration: Get "https://10.0.0.104:18200/v1/sys/storage/raft/configuration": dial tcp 10.0.0.104:18200: connect: no route to host
[root@drvault vault-dev]# /opt/vault-dev/vault status
Key                     Value
---                     -----
Seal Type               shamir
Initialized             true
Sealed                  false
Total Shares            5
Threshold               2
Version                 1.6.2
Storage Type            raft
Cluster Name            vault-cluster-8cdcefbe
Cluster ID              4dd2d930-121b-c897-57c2-7f4cfe983099
HA Enabled              true
HA Cluster              https://10.0.0.104:18201
HA Mode                 standby
Active Node Address     https://10.0.0.104:18200
Raft Committed Index    5060
Raft Applied Index      5060

It seems like I need to perform the “recover from permanently lost quorum” process whilst also changing the stored IP address of the peer.
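For reference, the documented lost-quorum recovery is essentially the peers.json approach I tried above: drop a peers.json into the raft directory and restart. For this layout it would look roughly like the sketch below; note the docs say the id must be the node’s own raft node ID, i.e. the node_id from the server’s HCL config.

# write peers.json into the raft subdirectory of the storage path
cat > /opt/vault-dev/data/raft/peers.json <<'EOF'
[
  {
    "id": "vault-dev1",
    "address": "192.0.2.51:18201",
    "non_voter": false
  }
]
EOF
chown vault:vault /opt/vault-dev/data/raft/peers.json   # assuming the service runs as the "vault" user
systemctl restart vault-dev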

I did come across recovery mode, but couldn’t find any examples of how to use it, in particular which commands make use of /sys/raw. When running the server in recovery mode, all the commands I tried, including vault operator raft snapshot restore, give a 404 error. In any case, I’d prefer to perform disaster recovery using just the standard unseal keys, rather than relying on access to a recovery token that could have been misplaced.

Any clues as to where to go next?

Thanks in advance!

I managed an ugly workaround to reinitialize raft: migrate from raft to filesystem, and then migrate from filesystem back to raft. Details here.
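In outline (see the linked post for the actual steps), that means stopping Vault and running vault operator migrate twice with a pair of small configs. This is only a sketch; the file-backend path here is made up:

# raft -> file direction; the second run swaps storage_source and storage_destination
cat > /tmp/raft-to-file.hcl <<'EOF'
storage_source "raft" {
  path = "/opt/vault-dev/data"
}
storage_destination "file" {
  path = "/opt/vault-dev/data-file"
}
cluster_addr = "https://192.0.2.51:18201"
EOF
/opt/vault-dev/vault operator migrate -config=/tmp/raft-to-file.hcl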

However, I’d really like to find a cleaner and safer way of doing this.

I thought I found something here:

Cluster reset: When a node is brought up in recovery mode, it resets the list of cluster members. This means that when resuming normal operations, each node will need to rejoin the cluster.

Sounds like just what I want. However, I can’t get it to work. If I restart Vault in recovery mode:

[root@drvault vault-dev]# sudo -u vault /opt/vault-dev/vault server -config=/opt/vault-dev/vault-conf.hcl -recovery
==> Vault server configuration:

               Seal Type: shamir
         Cluster Address: https://192.0.2.51:18201
              Go Version: go1.15.7
               Log Level: info
           Recovery Mode: true
                 Storage: raft
                 Version: Vault v1.6.2
             Version Sha: be65a227ef2e80f8588b3b13584b5c0d9238c1d7

==> Vault server started! Log data will stream in below:

2021-02-04T13:56:55.344Z [INFO]  proxy environment: http_proxy= https_proxy= no_proxy=

As expected, commands like “vault operator raft list-peers” and “vault operator unseal” give a 404, so there’s little I can do. I could get a recovery token and do /sys/raw operations, but I have no need for this.

I then restart into normal mode (with or without peers.json):

2021-02-04T13:59:58.178Z [INFO]  core.cluster-listener.tcp: starting listener: listener_address=192.0.2.51:18201
2021-02-04T13:59:58.179Z [INFO]  core.cluster-listener: serving cluster requests: cluster_listen_address=192.0.2.51:18201
2021-02-04T13:59:58.180Z [INFO]  storage.raft: raft recovery initiated: recovery_file=peers.json
2021-02-04T13:59:58.181Z [INFO]  storage.raft: raft recovery found new config: config="{[{Voter vault1-dev 192.0.2.51:18201}]}"
2021-02-04T13:59:58.224Z [INFO]  storage.raft: raft recovery deleted peers.json
2021-02-04T13:59:58.230Z [INFO]  storage.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:vault1-dev Address:192.0.2.51:18201}]"
2021-02-04T13:59:58.231Z [INFO]  core: vault is unsealed
2021-02-04T13:59:58.232Z [INFO]  core: entering standby mode
2021-02-04T13:59:58.232Z [INFO]  storage.raft: entering follower state: follower="Node at 192.0.2.51:18201 [Follower]" leader=
2021-02-04T14:00:04.414Z [WARN]  storage.raft: not part of stable configuration, aborting election

“vault status” says it’s a standby node with no active node:

HA Enabled              true
HA Cluster              n/a
HA Mode                 standby
Active Node Address     <none>

As a result, any operation like operator raft list-peers fails:

[root@drvault ~]# /opt/vault-dev/vault operator raft list-peers
Error reading the raft cluster configuration: Error making API request.

URL: GET https://192.0.2.51:18200/v1/sys/storage/raft/configuration
Code: 500. Errors:

* local node not active but active cluster node not found

Attempting to have the node join itself reports success but doesn’t actually help:

[root@drvault ~]# /opt/vault-dev/vault operator raft join https://192.0.2.51:18200
Key       Value
---       -----
Joined    true
[root@drvault ~]# /opt/vault-dev/vault operator raft list-peers
Error reading the raft cluster configuration: Error making API request.

URL: GET https://192.0.2.51:18200/v1/sys/storage/raft/configuration
Code: 500. Errors:

* local node not active but active cluster node not found

The documentation says:

In order to bring the Vault server up reliably, using any node’s raft data, recovery mode Vault automatically resizes the cluster to size 1. This means that after having used recovery mode, part of the procedure for returning to active service must include rejoining the raft cluster.

But it doesn’t say how to rejoin the cluster when recovery mode wiped the raft config. I wonder if it’s because the IP address of this node doesn’t match any of the original nodes, or because I’ve done something wrong here.
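My assumption is that, for a normal multi-node cluster, “rejoining” just means running the usual join and unseal on each of the other nodes once the recovered node is active again, e.g.:

# on each of the other nodes (placeholder address below), once the recovered node is unsealed and active
vault operator raft join https://<recovered-node-api-addr>:18200
vault operator unseal

In my single-node case there is nothing else to rejoin, so that doesn’t obviously help here.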

I note that in this tutorial, Active Node Address <none> is shown when you have an HA setup with less than a quorum. But I already tried the process to resolve that (with peers.json), to no avail.

Right, I think I solved it!

Step 1: start up vault with -recovery flag. For my system that was:

sudo -u vault /opt/vault-dev/vault server -config=/opt/vault-dev/vault-conf.hcl -recovery

Step 2: unseal via the API (I couldn’t find any option in the vault CLI to do it)

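# start a recovery-token generation attempt and pull the nonce out of the JSON response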
nonce=$(expr "$(curl -XPUT -k $VAULT_ADDR/v1/sys/generate-recovery-token/attempt)" : '.*"nonce":"\([^"]*\)".*')

# repeat as many times as required:
read v; curl -XPOST -k --data-binary "{\"key\":\"$v\",\"nonce\":\"$nonce\"}" $VAULT_ADDR/v1/sys/generate-recovery-token/update
# paste in key share

After a few seconds, the node elects itself Raft leader:

2021-02-04T15:02:18.556Z [INFO]  storage.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:vault-dev1 Address:192.0.2.51:18201}]"
2021-02-04T15:02:18.556Z [INFO]  storage.raft: entering follower state: follower="Node at vault-dev1 [Follower]" leader=
2021-02-04T15:02:27.096Z [WARN]  storage.raft: heartbeat timeout reached, starting election: last-leader=
2021-02-04T15:02:27.096Z [INFO]  storage.raft: entering candidate state: node="Node at vault-dev1 [Candidate]" term=8
2021-02-04T15:02:27.113Z [INFO]  storage.raft: election won: tally=1
2021-02-04T15:02:27.113Z [INFO]  storage.raft: entering leader state: leader="Node at vault-dev1 [Leader]"
2021-02-04T15:02:27.115Z [INFO]  core: recovery operation token generation finished: nonce=eb1dca12-78aa-9075-67ec-241794827d5c

Step 3: Stop vault with Ctrl-C (sends SIGINT).

Step 4: Restart vault in normal mode.

Step 5: Unseal, and check status.
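In terms of commands, steps 4 and 5 on my setup amount to roughly:

systemctl start vault-dev
export VAULT_ADDR=https://192.0.2.51:18200
export VAULT_SKIP_VERIFY=1                # the TLS cert still doesn't cover 192.0.2.51 (see the x509 error earlier)
/opt/vault-dev/vault operator unseal      # repeat with the original cluster's unseal keys until the threshold is met
/opt/vault-dev/vault status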

For a few seconds it was trying to use the old cluster address:

[root@drvault ~]# /opt/vault-dev/vault operator raft list-peers
Error reading the raft cluster configuration: Get "https://10.0.0.104:18200/v1/sys/storage/raft/configuration": dial tcp 10.0.0.104:18200: connect: no route to host

But it soon sorted itself out:

[root@drvault ~]# /opt/vault-dev/vault operator raft list-peers
Error reading the raft cluster configuration: Error making API request.

URL: GET https://192.0.2.51:18200/v1/sys/storage/raft/configuration
Code: 403. Errors:

* permission denied

Then after vault login it was working properly:

[root@drvault ~]# /opt/vault-dev/vault operator raft list-peers
Node          Address             State     Voter
----          -------             -----     -----
vault-dev1    192.0.2.51:18201    leader    true
[root@drvault ~]#

Phew!

Aside: when you have entered the last key share for recovery mode, the response includes an encoded_token:

..."complete":true,"encoded_token":"XXXX"...

However, I was unable to use the encoded_token to authenticate to /sys/raw, e.g.:

curl -v -k -H "X-Vault-Token: $RECOVERY_TOKEN" -XLIST $VAULT_ADDR/v1/sys/raw/logical
...
HTTP/1.1 403 Forbidden

But that doesn’t really matter, as unsealing was sufficient to fix raft.
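If I had to guess, the encoded_token needs to be decoded with the OTP returned by the initial attempt call before it can be used, the same way a generate-root token does; something like the following (untested):

# OTP comes from the generate-recovery-token/attempt response; this is an untested guess
/opt/vault-dev/vault operator generate-root -decode="$ENCODED_TOKEN" -otp="$OTP"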
