Enable Replication Primary on GCS backend

tal-ayalon · March 31, 2022, 9:54am

Hi,
I’m trying to make my new Vault Enterprise cluster, which uses a GCS bucket for storage, the DR replication primary, but I’m getting the following error:
/ $ vault write -f sys/replication/dr/primary/enable
Error writing data to sys/replication/dr/primary/enable: Error making API request.

URL: PUT http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
Code: 400. Errors:

WAL not found or clustering disabled; replication is disabled

Why can’t a Vault with a GCS backend be used as a primary?

aram · March 31, 2022, 7:49pm

Is this node already the leader in the cluster?

What’s the output of:

curl \
    http://127.0.0.1:8200/v1/sys/replication/dr/status

tal-ayalon · April 3, 2022, 9:42am

These are the details of my Vault:

$ vault status
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   false
Total Recovery Shares    5
Threshold                3
Version                  1.9.3+ent
Storage Type             gcs
Cluster Name             vault-cluster-792cd19e
Cluster ID               9e1c8635-1aa0-1132-de19-a499586f04fd
HA Enabled               true
HA Cluster               https://vault-1.vault-internal:8201
HA Mode                  standby
Active Node Address      http://10.188.3.47:8200

$ curl http://127.0.0.1:8200/v1/sys/replication/dr/status
{"request_id":"a4ede4ca-5374-d897-11ec-b236c81f7dfc","lease_id":"","renewable":false,"lease_duration":0,"data":{"mode":"unsupported"},"wrap_info":null,"warnings":null,"auth":null}

The response reports “mode”: “unsupported”.
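When comparing clusters, the mode can be pulled out of that status response without extra tools. This is only a rough sketch: replication_mode is a made-up helper name, and it assumes the response carries a single "mode" field, as the dr/status output above does.

```shell
# Hypothetical helper (not part of Vault): print the "mode" field from a
# sys/replication/dr/status JSON response read on stdin.
replication_mode() {
  sed -n 's/.*"mode"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Trimmed-down sample of the response shown in this thread.
sample='{"data":{"mode":"unsupported"}}'
printf '%s\n' "$sample" | replication_mode   # prints: unsupported

# Against a live node (address as used in this thread):
# curl -s http://127.0.0.1:8200/v1/sys/replication/dr/status | replication_mode
```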

tal-ayalon · April 3, 2022, 9:45am

If I run the same curl command against a Raft-backed cluster, the mode is disabled.

aram · April 3, 2022, 9:58am

Okay, you probably don’t have ha_enabled in your config.

You may also want to check your cluster and make sure it is valid.

tal-ayalon · April 3, 2022, 10:09am

This is the configuration I have (GCS backend cluster):

/vault/config $ cat extraconfig-from-values.hcl
disable_mlock = true
ui = true
listener "tcp" {
  tls_disable = 1
  address = "[::]:8200"
  cluster_address = "[::]:8201"
}
storage "gcs" {
  bucket = "some-bucket"
  ha_enabled = "true"
}
seal "gcpckms" {         
  project     = "project"
  region      = "global"
  key_ring    = "some-key"
  crypto_key  = "some-crypto"
}

I do have ha_enabled set. I have 3 pods, with one primary and 2 standbys, so HA is enabled.

On the Raft cluster, I can enable DR replication, and then the status is enabled.
Do you have any other suggestions?

aram · April 3, 2022, 10:39am

Are you exposing the license from a Kubernetes secret as the VAULT_LICENSE env variable?

Also, does curl -X PUT -H "X-Vault-Token: $(vault print token)" https://vault.basement.lab/v1/sys/replication/dr/primary/enable return a valid response?

tal-ayalon · April 3, 2022, 10:59am

I’m using Helm to deploy Vault.
The env var that Vault uses is VAULT_LICENSE_PATH.

Running the command you suggested on GCS cluster:

$ curl -X PUT -H "X-Vault-Token: $(vault print token)" http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
{"errors":["WAL not found or clustering disabled; replication is disabled"]}

Running the command you suggested on Raft cluster:

$ curl -X PUT -H "X-Vault-Token: $(vault print token)" http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
{"request_id":"2b837a13-3a1a-6e64-67a4-8f8beae47cf4","lease_id":"","renewable":false,"lease_duration":0,"data":null,"wrap_info":null,"warnings":["This cluster is being enabled as a primary for replication. Vault will be unavailable for a brief period and will resume service shortly."],"auth":null}

maxb · April 3, 2022, 11:16am

This is a bit of a long shot, but you don’t happen to have GOOGLE_STORAGE_HA_ENABLED=false set in the environment, do you?
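One quick way to check is from a shell inside the Vault container. This is just a sketch: check_gcs_ha_env is a made-up helper name, and it only inspects the environment of the shell it runs in, which is assumed to match what the Vault process sees.

```shell
# Hypothetical helper: report whether GOOGLE_STORAGE_HA_ENABLED is present
# in the current environment, and with what value.
check_gcs_ha_env() {
  # ${VAR+x} expands to "x" when VAR is set, even if it is set to an empty string.
  if [ -n "${GOOGLE_STORAGE_HA_ENABLED+x}" ]; then
    echo "GOOGLE_STORAGE_HA_ENABLED=${GOOGLE_STORAGE_HA_ENABLED}"
  else
    echo "GOOGLE_STORAGE_HA_ENABLED is not set"
  fi
}

check_gcs_ha_env
```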

Also, since you’re using Vault Enterprise, try opening a ticket at support.hashicorp.com asking them to explain exactly what “WAL not found or clustering disabled; replication is disabled” means.

(If you manage to find out, please come back here and let us know, I’m curious now!)

aram · April 3, 2022, 11:21am

Okay, let’s try two different checks:

Run vault status against each of the nodes; is it possible that they’re up but not unsealed?
Can you run vault operator raft list-peers from outside the cluster, against the cluster address instead of 127.0.0.1?

Lastly, set log_level = "trace" in the config, restart the pod, re-enable the replication, and post the log output from all of the nodes. Keep the logs handy, as that’s what support is going to ask for if you do need to open a ticket.
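To make the per-node vault status check easier to compare, something like the following could condense each node's output to one line. It is a sketch only: summarize_status is a made-up helper, it assumes the default table output of vault status, and the node addresses in the commented loop are guesses based on the usual Helm naming, not taken from this cluster.

```shell
# Hypothetical helper: reduce `vault status` table output (read on stdin)
# to a single line with the seal state and HA mode.
summarize_status() {
  awk '$1 == "Sealed"              { sealed = $2 }
       $1 == "HA" && $2 == "Mode"  { mode = $3 }
       END { printf "sealed=%s ha_mode=%s\n", sealed, mode }'
}

# Example loop; the addresses below are assumptions, adjust to your pods:
# for addr in http://vault-0.vault-internal:8200 \
#             http://vault-1.vault-internal:8200 \
#             http://vault-2.vault-internal:8200; do
#   printf '%s: ' "$addr"
#   VAULT_ADDR="$addr" vault status | summarize_status
# done
```

A node that reports sealed=true, or one with an empty ha_mode, would be the first place to look.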

tal-ayalon · April 3, 2022, 2:12pm

GOOGLE_STORAGE_HA_ENABLED isn’t set in my environment.
I’ve opened a ticket and will update here.

tal-ayalon · April 3, 2022, 2:13pm

Hi,

The cluster is functioning properly; there isn’t a problem with it. You can create a Vault using the GCS backend and get the same error.
Raft is not related to the issue. My problem is with GCS.
