Hi,
I’m trying to make my new Vault Enterprise cluster, which is backed by a GCS bucket, the DR replication primary, but I’m getting the following error:
/ $ vault write -f sys/replication/dr/primary/enable
Error writing data to sys/replication/dr/primary/enable: Error making API request.
$ vault status
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   false
Total Recovery Shares    5
Threshold                3
Version                  1.9.3+ent
Storage Type             gcs
Cluster Name             vault-cluster-792cd19e
Cluster ID               9e1c8635-1aa0-1132-de19-a499586f04fd
HA Enabled               true
HA Cluster               https://vault-1.vault-internal:8201
HA Mode                  standby
Active Node Address      http://10.188.3.47:8200
Are you exposing the license via a VAULT_LICENSE Kubernetes secret as an environment variable?
Also, does curl -X PUT -H "X-Vault-Token: $(vault print token)" https://vault.basement.lab/v1/sys/replication/dr/primary/enable return a valid response?
I’m using Helm to deploy Vault.
The env var that Vault uses is VAULT_LICENSE_PATH.
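For reference, one way to wire that up with the hashicorp/vault Helm chart is to mount the license from a Kubernetes secret and point VAULT_LICENSE_PATH at the mounted file. A minimal sketch of the values — the secret name "vault-license" and key "license.hclic" are assumptions, not something from this thread:

```yaml
# values.yaml fragment for the hashicorp/vault Helm chart (sketch).
server:
  # Mount the license secret into the server pods.
  volumes:
    - name: vault-license
      secret:
        secretName: vault-license   # assumed secret name
  volumeMounts:
    - name: vault-license
      mountPath: /vault/license
      readOnly: true
  # Tell Vault where to find the license file.
  extraEnvironmentVars:
    VAULT_LICENSE_PATH: /vault/license/license.hclic   # assumed key name
```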
Running the command you suggested on the GCS cluster:
$ curl -X PUT -H "X-Vault-Token: $(vault print token)" http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
{"errors":["WAL not found or clustering disabled; replication is disabled"]}
Running the command you suggested on the Raft cluster:
$ curl -X PUT -H "X-Vault-Token: $(vault print token)" http://127.0.0.1:8200/v1/sys/replication/dr/primary/enable
{"request_id":"2b837a13-3a1a-6e64-67a4-8f8beae47cf4","lease_id":"","renewable":false,"lease_duration":0,"data":null,"wrap_info":null,"warnings":["This cluster is being enabled as a primary for replication. Vault will be unavailable for a brief period and will resume service shortly."],"auth":null}
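For what it’s worth, after that call succeeds you can double-check that the cluster really came up as a DR primary. A fragment to run against the live cluster (it assumes VAULT_ADDR and VAULT_TOKEN are set, and that jq is installed):

```shell
# Query DR replication status and pull out the mode field.
vault read -format=json sys/replication/dr/status | jq -r '.data.mode'
# A DR primary should report "primary"; a cluster with replication
# off reports "disabled" -- which is what I'd expect your GCS cluster to say.
```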
This is a bit of a long shot, but you don’t happen to have GOOGLE_STORAGE_HA_ENABLED=false set in the environment, do you?
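If memory serves, HA on the GCS backend is opt-in: it has to be switched on either via that env var or via ha_enabled in the storage stanza. A minimal sketch, with the bucket name made up for illustration:

```hcl
storage "gcs" {
  bucket     = "my-vault-bucket"  # assumed bucket name
  ha_enabled = "true"             # HA is off by default for the gcs backend
}
```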
Also, since you’re using Vault Enterprise, try opening a ticket at support.hashicorp.com asking them to explain exactly what “WAL not found or clustering disabled; replication is disabled” means.
(If you manage to find out, please come back here and let us know, I’m curious now!)
Can you run vault status against each of the nodes? It’s possible that they’re up but not unsealed.
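A quick way to check every node at once, assuming the default vault-0/vault-1/vault-2 pod names from the Helm chart (adjust to your StatefulSet):

```shell
# Print the seal status of each Vault pod in turn.
for pod in vault-0 vault-1 vault-2; do
  echo "--- $pod ---"
  kubectl exec "$pod" -- vault status
done
```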
Can you also do a check from outside the cluster and run vault operator raft list-peers against the cluster address instead of against 127.0.0.1?
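Something like this, reusing the external URL from your earlier curl example (a sketch; it assumes that hostname resolves from wherever you run it):

```shell
# From a machine outside the cluster, ask the cluster for its Raft peer set.
VAULT_ADDR=https://vault.basement.lab vault operator raft list-peers
```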
Lastly, set log_level = "trace" in the server config, restart the pods, re-attempt enabling replication, and post the log output from all of the nodes. I’d keep the logs handy, as that’s what support is going to ask for in case you do need to open a ticket.
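Once trace logging is on and you’ve re-run the enable call, something like this collects the logs from every pod (again assuming the default vault-0/1/2 pod names):

```shell
# Dump each pod's logs to a local file, ready to attach to a support ticket.
for pod in vault-0 vault-1 vault-2; do
  kubectl logs "$pod" > "$pod.log"
done
```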