Enable Replication Primary on GCS backend

Hi,
I’m trying to make my new Vault Enterprise cluster, which uses a GCS bucket for storage, the primary for replication, but I’m getting the following error:
/ $ vault write -f sys/replication/dr/primary/enable
Error writing data to sys/replication/dr/primary/enable: Error making API request.

URL: PUT http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
Code: 400. Errors:

  • WAL not found or clustering disabled; replication is disabled

Why can’t a Vault with a GCS backend be used as a primary?

Is this node already the leader in the cluster?

What’s the output of:

curl \
    http://127.0.0.1:8200/v1/sys/replication/dr/status

These are the details of my Vault:

$ vault status
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   false
Total Recovery Shares    5
Threshold                3
Version                  1.9.3+ent
Storage Type             gcs
Cluster Name             vault-cluster-792cd19e
Cluster ID               9e1c8635-1aa0-1132-de19-a499586f04fd
HA Enabled               true
HA Cluster               https://vault-1.vault-internal:8201
HA Mode                  standby
Active Node Address      http://10.188.3.47:8200
$ curl http://127.0.0.1:8200/v1/sys/replication/dr/status
{"request_id":"a4ede4ca-5374-d897-11ec-b236c81f7dfc","lease_id":"","renewable":false,"lease_duration":0,"data":{"mode":"unsupported"},"wrap_info":null,"warnings":null,"auth":null}

The response contains “mode”: “unsupported”.

If I run the same curl command against a cluster with a Raft backend, the mode is disabled.
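
To compare the two clusters quickly, the mode can be pulled straight out of the status response. This is only a minimal sketch, assuming a POSIX shell with sed and a response that carries a single "mode" field on one line, as in the output above. The name dr_mode is made up for illustration.

```shell
# dr_mode: hypothetical helper. Reads a replication status JSON document on
# stdin and prints the value of the "mode" field (one result at most).
dr_mode() {
  sed -n 's/.*"mode"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (address taken from the commands in this thread):
#   curl -s http://127.0.0.1:8200/v1/sys/replication/dr/status | dr_mode
```

If jq is available in the pod, it would be a more robust way to read the field than sed.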

Okay, you probably don’t have ha_enabled in your config.

You may also want to check that you have a valid cluster.

This is the configuration I have (GCS backend cluster):

/vault/config $ cat extraconfig-from-values.hcl
disable_mlock = true
ui = true
listener "tcp" {
  tls_disable = 1
  address = "[::]:8200"
  cluster_address = "[::]:8201"
}
storage "gcs" {
  bucket = "some-bucket"
  ha_enabled = "true"
}
seal "gcpckms" {         
  project     = "project"
  region      = "global"
  key_ring    = "some-key"
  crypto_key  = "some-crypto"
}

I have ha_enabled set. I have 3 pods, with one primary and 2 standbys, so I do have ha_enabled.

On the Raft cluster, I can enable DR replication, and the status then shows it as enabled.
Do you have any other suggestions?

Are you exposing the license as the VAULT_LICENSE environment variable via a Kubernetes secret?

Also, does curl -X PUT -H "X-Vault-Token: $(vault print token)" https://vault.basement.lab/v1/sys/replication/dr/primary/enable return a valid response?

I’m using Helm in order to deploy Vault.
The env var that Vault uses is: VAULT_LICENSE_PATH

Running the command you suggested on GCS cluster:

$ curl -X PUT -H "X-Vault-Token: $(vault print token)" http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
{"errors":["WAL not found or clustering disabled; replication is disabled"]}

Running the command you suggested on Raft cluster:

$ curl -X PUT -H "X-Vault-Token: $(vault print token)" http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
{"request_id":"2b837a13-3a1a-6e64-67a4-8f8beae47cf4","lease_id":"","renewable":false,"lease_duration":0,"data":null,"wrap_info":null,"warnings":["This cluster is being enabled as a primary for replication. Vault will be unavailable for a brief period and will resume service shortly."],"auth":null}

This is a bit of a long shot, but you don’t happen to have GOOGLE_STORAGE_HA_ENABLED=false set in the environment, do you? 🙂

Also, since you’re using Vault Enterprise, try opening a ticket at support.hashicorp.com asking them to explain exactly what “WAL not found or clustering disabled; replication is disabled” means.

(If you manage to find out, please come back here and let us know, I’m curious now!)

Okay, let’s try two different commands:

  1. Run vault status against each of the nodes. Is it possible that they’re up but not unsealed?
  2. Can you run a check from outside the cluster, and run vault operator raft list-peers against the cluster instead of against 127.0.0.1?
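
The first check could be scripted roughly as below. This is a sketch under a few assumptions: the pods are named vault-0, vault-1 and vault-2 (guessed from vault-1.vault-internal in the status output), kubectl is already pointed at the right namespace, and kubectl exec passes the exit code of vault status through (0 for unsealed, 2 for sealed, anything else for an error). The name seal_report is hypothetical.

```shell
# seal_report: hypothetical helper. Runs `vault status` inside each named pod
# and reports the seal state based on the command's exit code.
seal_report() {
  for pod in "$@"; do
    rc=0
    # Discard the status table; only the exit code is used here.
    kubectl exec "$pod" -- vault status >/dev/null 2>&1 || rc=$?
    case "$rc" in
      0) echo "$pod: unsealed" ;;
      2) echo "$pod: sealed" ;;
      *) echo "$pod: error" ;;
    esac
  done
}

# Usage (pod names are an assumption):
#   seal_report vault-0 vault-1 vault-2
```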

Lastly, set log_level = trace in the config, restart the pod, re-enable replication, and post the log output from all of the nodes. I’d keep the logs handy, as that’s what support is going to ask for if you do need to open a ticket.
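
Collecting the logs from every node might look something like this. It is a sketch only, assuming the pods are reachable with kubectl in the current namespace and are named vault-0, vault-1 and vault-2. The function name collect_logs and the output directory are made up for illustration.

```shell
# collect_logs: hypothetical helper. Saves the logs of each named pod to
# <dir>/<pod>.log so they can be attached to a support ticket.
collect_logs() {
  dir=$1
  shift
  mkdir -p "$dir"
  for pod in "$@"; do
    kubectl logs "$pod" > "$dir/$pod.log"
  done
}

# Usage (directory and pod names are assumptions):
#   collect_logs ./vault-logs vault-0 vault-1 vault-2
```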

GOOGLE_STORAGE_HA_ENABLED isn’t set in my environment.
I’ve opened a ticket and will update here.

Hi,

  1. The cluster is functioning properly; there isn’t a problem with it. You can create a Vault with a GCS backend and get the same error.
  2. Raft is not related to the issue. The problem is with GCS.