Vault HA Integrated Storage mTLS Certificates

Hi, I'm setting up Vault HA using Integrated Storage, and I'm confused.

What is considered best practice? Is it the following config, where each node has its own cert bundle:

storage "raft" {
  path    = "/Users/foo/raft/"
  node_id = "node1"

  retry_join {
    leader_api_addr = "http://127.0.0.2:8200"
    leader_ca_cert_file = "/path/to/ca1"
    leader_client_cert_file = "/path/to/client/cert1"
    leader_client_key_file = "/path/to/client/key1"
  }
  retry_join {
    leader_api_addr = "http://127.0.0.3:8200"
    leader_ca_cert_file = "/path/to/ca2"
    leader_client_cert_file = "/path/to/client/cert2"
    leader_client_key_file = "/path/to/client/key2"
  }
  retry_join {
    leader_api_addr = "http://127.0.0.4:8200"
    leader_ca_cert_file = "/path/to/ca3"
    leader_client_cert_file = "/path/to/client/cert3"
    leader_client_key_file = "/path/to/client/key3"
  }
  retry_join {
    auto_join = "provider=aws region=eu-west-1 tag_key=vault tag_value=... access_key_id=... secret_access_key=..."
  }
}

I also see many examples that look like this, where every node uses the same cert files:

listener "tcp" {
  tls_disable = 0
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  tls_client_ca_file = "/vault/certs/server.ca.pem"
  tls_cert_file = "/vault/certs/vault.crt"
  tls_key_file = "/vault/certs/vault.key"
  tls_require_and_verify_client_cert = true
} 

storage "raft" {
  path = "/vault/data"

  retry_join {
    leader_api_addr = "https://vault-0.vault-internal:8200/"
    leader_ca_cert_file = "/vault/certs/server.ca.pem"
    leader_client_cert_file = "/vault/certs/vault.crt"
    leader_client_key_file = "/vault/certs/vault.key"
  }
  retry_join {
    leader_api_addr = "https://vault-1.vault-internal:8200/"
    leader_ca_cert_file = "/vault/certs/server.ca.pem"
    leader_client_cert_file = "/vault/certs/vault.crt"
    leader_client_key_file = "/vault/certs/vault.key"
  }
  retry_join {
    leader_api_addr = "https://vault-2.vault-internal:8200/"
    leader_ca_cert_file = "/vault/certs/server.ca.pem"
    leader_client_cert_file = "/vault/certs/vault.crt"
    leader_client_key_file = "/vault/certs/vault.key"
  }
}

Which option should be preferred?
How exactly does the second example work? Do all nodes use the same client cert and key?

Unless you absolutely need IPv6, I would stick with IPv4 (the first example). The second example is really the same configuration, just listening on IPv6.

Yes, you point every node at the same cert … vault.foo.com is the VIP, and vault-node-{0,1,2,3,4}.foo.com are the nodes. Clients use vault.foo.com to reach the cluster, so they always see the same certificate.

Do remember that the cert has a SAN (Subject Alternative Name) list, which needs to include all of the nodes: vault.foo.com, every vault-node-{0,1,2,3,4}.foo.com name, and 127.0.0.1.
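To make the SAN requirement concrete, here is a minimal Python sketch that builds the list of names a shared cert would need and reports any that are missing. It assumes the naming used in this thread (vault.foo.com as the VIP, vault-node-N.foo.com for the nodes); the helper names `expected_sans` and `missing_sans` are hypothetical, and `cert_sans` is a hand-written stand-in for the SAN list you would read from your real certificate.

```python
import ipaddress


def expected_sans(vip, node_names, extra_ips=("127.0.0.1",)):
    """Return every DNS name and IP the shared cert should list as a SAN."""
    # ip_address() raises ValueError on a malformed IP, catching typos early.
    ips = [str(ipaddress.ip_address(ip)) for ip in extra_ips]
    return [vip, *node_names, *ips]


def missing_sans(required, present):
    """Return the required names that are absent from the cert's SAN list."""
    # DNS names are case-insensitive, so compare in lower case.
    present_lower = {name.lower() for name in present}
    return [name for name in required if name.lower() not in present_lower]


# Assumed naming: five nodes plus the VIP, as described in this thread.
nodes = [f"vault-node-{i}.foo.com" for i in range(5)]
required = expected_sans("vault.foo.com", nodes)

# Stand-in for the SANs actually present in the issued certificate.
cert_sans = ["vault.foo.com", *nodes, "127.0.0.1"]

print(missing_sans(required, cert_sans))
# → []

# A sixth node that was not planned for is not covered by the cert.
print(missing_sans(required + ["vault-node-5.foo.com"], cert_sans))
# → ['vault-node-5.foo.com']
```

The second call shows why the node list has to be planned up front: any name that is not already in the SAN list means reissuing the cert and redistributing it to every node.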

Hi aram,

Thanks for the answer. Now I understand how one cert can work for all the nodes using SANs.

But that made me think: what if I add a node to the cluster, or a node gets redeployed with a new IP/DNS name? Would I then have to change the cert on all nodes?

Is listing all the nodes' DNS names in the SANs of one cert considered best practice?

Yes, you do have to plan ahead. Yes, all of your nodes (their names, and their service names if you are on Kubernetes) need to be in your SAN list.

The most likely scenario is that you'll start 5 nodes (the commonly recommended size for a Raft cluster; 3 is the minimum that tolerates a node failure) and just leave it alone at that point.

If you actually do need to scale, the solution is better disks (i.e. solid state; Raft is disk I/O bound, whereas Consul is memory bound) rather than more nodes.

Adding nodes to a Vault cluster does NOT scale it horizontally. The way you scale is by adding Performance Replica clusters, and each of those is a whole separate cluster.