I have set up four Vault instances behind a load balancer. I'm seeing some very weird behaviour when sealing/unsealing, and I also have questions about what api_addr
should be set to. The Vault hosts are virtual machines running Flatcar 3815 with the v1.15.5 Vault Docker image. In a Kubernetes cluster I have then set up an EndpointSlice referencing the IP of each Vault instance, a Service, and an NGINX Ingress.
When I run VAULT_ADDR=https://vault.kube.internal.company.net vault operator seal
I sometimes get an error that I hit a standby node, and sometimes a message that Vault sealed. I guess this is consistent with vault#6161? However, if I try to read from or write to Vault after a successful seal, I am sometimes able to, and sometimes I get an error that it's sealed. When I run vault status
against the same load balancer address I sometimes see Sealed: true
and sometimes not. The leader also seems to change whenever I seal. To me it looks like I need to run the seal command on every node…? But only when it's the leader? Same goes for unsealing. This feels very buggy: the load balancer might route me to a different node after I enter an unseal key. Why does the leader change when we seal?
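If sealing really has to be done per node, I assume it would look something like this, bypassing the load balancer entirely (node hostnames as in my configs further down; presumably the standbys would just return the same "cannot seal when in standby mode" error as in the transcript below):

# Assumption: the vault-N hostnames are reachable from the machine running the CLI.
for node in vault-0 vault-1 vault-2 vault-3; do
  VAULT_ADDR="https://${node}.company.lan:8200" vault operator seal
done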
The Vault HA docs say that api_addr
should be set to the load balancer address if the Vault nodes are not meant to be reached directly. I can't see any difference in behaviour between setting it to the load balancer address and setting it to each node's own hostname.
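To be concrete, these are the two api_addr variants I have tried (vault-0 shown; cluster_addr stays on the node's own hostname in both cases, and the exact load balancer value is just my reading of what the docs intend, i.e. the address clients use):

Variant 1, the node's own hostname (this is what the full config further down uses):
  "api_addr": "https://vault-0.company.lan:8200",
Variant 2, the load balancer address:
  "api_addr": "https://vault.kube.internal.company.net",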
Example of writing after sealing:
> vault operator seal
Error sealing: Error making API request.
URL: PUT https://vault.kube.internal.company.net/v1/sys/seal
Code: 500. Errors:
* vault cannot seal when in standby mode; please restart instead
> vault operator seal
Success! Vault is sealed.
> vault write cubbyhole/eol data=hei
Success! Data written to: cubbyhole/eol
> vault write cubbyhole/eol data=hei
Success! Data written to: cubbyhole/eol
... repeats 4 more times
> vault write cubbyhole/eol data=hei
Error writing data to cubbyhole/eol: Error making API request.
URL: PUT https://vault.kube.internal.company.net/v1/cubbyhole/eol
Code: 503. Errors:
* Vault is sealed
... a few minutes later I rerun
> vault write cubbyhole/eol data=hei
Success! Data written to: cubbyhole/eol
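A cheap way to see what kind of backend a request through the load balancer actually lands on is the sys/health endpoint; if I read the API docs right, its HTTP status code encodes the node state by default (200 active, 429 unsealed standby, 503 sealed):

# -k only skips TLS verification for brevity; a proper check would use --cacert
for i in 1 2 3 4 5; do
  curl -sk -o /dev/null -w '%{http_code}\n' https://vault.kube.internal.company.net/v1/sys/health
done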
Running vault status twice in a row against the load balancer address, I get:
> vault status
Key Value
--- -----
Seal Type shamir
Initialized true
Sealed true # <<< sealed is true
Total Shares 5
Threshold 2
Unseal Progress 0/2
Unseal Nonce n/a
Version 1.15.5
Build Date 2024-01-26T14:53:40Z
Storage Type raft
HA Enabled true
Raft Committed Index 1489
Raft Applied Index 1489
> vault status
Key Value
--- -----
Seal Type shamir
Initialized true
Sealed false # <<< sealed is false
Total Shares 5
Threshold 2
Version 1.15.5
Build Date 2024-01-26T14:53:40Z
Storage Type raft
Cluster Name vault-cluster-387fea85
Cluster ID 7ede3598-b2cc-c4b3-ef60-4026bc4100f8
HA Enabled true
HA Cluster n/a
HA Mode standby
Active Node Address <none>
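I suppose the only way to get a consistent picture is to ask each node directly instead of going through the load balancer, roughly like this (again assuming the node hostnames are reachable from where I run the CLI):

for node in vault-0 vault-1 vault-2 vault-3; do
  echo "== ${node} =="
  VAULT_ADDR="https://${node}.company.lan:8200" vault status
done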
This is the vault-0 config; all other configs follow the same pattern. retry_join is set to join every node except itself.
{
  "api_addr": "https://vault-0.company.lan:8200",
  "cluster_addr": "https://vault-0.company.lan:8201",
  "disable_mlock": true,
  "listener": {
    "tcp": {
      "address": "0.0.0.0:8200",
      "cluster_address": "0.0.0.0:8201",
      "tls_cert_file": "/vault/certs/cert.pem",
      "tls_client_ca_file": "/vault/certs/CA_cert.pem",
      "tls_key_file": "/vault/certs/key.pem"
    }
  },
  "storage": {
    "raft": {
      "node_id": "raft_node_vault-0",
      "path": "/vault",
      "retry_join": [
        {
          "leader_api_addr": "https://vault-1.company.lan:8200",
          "leader_ca_cert": "[REDACTED]",
          "leader_tls_servername": "vault-1.company.lan"
        },
        {
          "leader_api_addr": "https://vault-2.company.lan:8200",
          "leader_ca_cert": "[REDACTED]",
          "leader_tls_servername": "vault-2.company.lan"
        },
        {
          "leader_api_addr": "https://vault-3.company.lan:8200",
          "leader_ca_cert": "[REDACTED]",
          "leader_tls_servername": "vault-3.company.lan"
        }
      ]
    }
  },
  "ui": true
}
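For watching the leader move around while sealing/unsealing, I assume the raft peer listing is the right tool (it needs a valid token, and I'm assuming a standby forwards the request to the active node):

# the State column shows which node is currently leader vs follower
VAULT_ADDR=https://vault-0.company.lan:8200 vault operator raft list-peers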
endpointslice:
addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
  - addresses:
      - 10.80.20.101 # vault-0.company.lan
    conditions: {}
  - addresses:
      - 10.80.20.100 # vault-1.company.lan
    conditions: {}
  - addresses:
      - 10.80.20.99 # vault-2.company.lan
    conditions: {}
  - addresses:
      - 10.80.20.102 # vault-3.company.lan
    conditions: {}
kind: EndpointSlice
metadata:
  labels:
    endpointslice.kubernetes.io/managed-by: terraform
    kubernetes.io/service-name: vault
  name: vault-1
  namespace: vault
ports:
  - appProtocol: http
    name: ""
    port: 8200
    protocol: TCP
service:
apiVersion: v1
kind: Service
metadata:
  name: vault
  namespace: vault
spec:
  clusterIP: 10.43.77.103
  clusterIPs:
    - 10.43.77.103
  ports:
    - port: 8200
      protocol: TCP
      targetPort: 8200
  type: ClusterIP
ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
  name: vault
  namespace: vault
spec:
  rules:
    - host: vault.kube.internal.company.net
      http:
        paths:
          - backend:
              service:
                name: vault
                port:
                  number: 8200
            path: /
            pathType: Prefix
  tls:
    - hosts:
        - vault.kube.internal.company.net
      secretName: kube-internal-company-net-wildcard