Vault Deployment Architecture | which option is better?

We are planning to set up Vault in our infrastructure.
We already have 5 Kubernetes clusters, one per environment (dev, staging, training, production and one more). Which of the following approaches is best? Every approach has pros and cons, so it is hard to decide. Please feel free to propose any better approach that is not in the list below.

1- Set up 5 Vaults (one Vault in every Kubernetes cluster).
Pros:

  • Don’t have to set up strict policies to secure the secrets of every environment.
  • Controlling authentication and authorisation is easy.
  • Any configuration change or version update can be tested first in dev, followed by staging and production.
  • Easy installation. The same Helm chart will be deployed using environment-specific values files.

Cons:

  • Managing 5 Vaults is not convenient. Whenever we add or update a secret, we need to make sure which Vault we are interacting with.
  • The team will have to use many URLs to manage secrets (we are planning to use Vault for team password sharing as well).
  • We need to add a set of nodes to every Kubernetes cluster because Vault will not share nodes with other k8s workloads.

2- Set up two Vaults on separate, dedicated Kubernetes clusters (one for production and the other for all non-production).

Pros:

  • Managing 2 Vaults is easy compared to 5 Vaults.

Cons:

  • We will have to manage two additional Kubernetes clusters, which seems inconvenient.

3- Set up 2 Vaults on VMs (one for production and the other for all non-production).

  • We are also thinking of avoiding dedicated Kubernetes clusters, because each cluster would only be running Vault pods, so there is no point in installing and configuring Kubernetes clusters just for Vault.
  • There would be extra administration to set up and maintain the Kubernetes clusters themselves.
  • In Kubernetes, we have to set up multiple masters to make the cluster highly available, so at least those master nodes can be avoided by using VM-based Vault.

Thanks,

Arif

Set up Vault with Consul and the active node will be load-balanced automatically. If you don’t want to use Consul, put any load balancer in front and have it query the system health API. Don’t overcomplicate it.
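
To make the load balancer suggestion concrete, here is a minimal sketch of how a health check could interpret Vault's `/v1/sys/health` endpoint. The status codes are the endpoint's documented defaults; the function names and the node names in the usage note are hypothetical, and `probe` assumes the nodes are reachable over plain HTTP(S) without custom TLS settings.

```python
import urllib.error
import urllib.request

# Default status codes returned by Vault's /v1/sys/health endpoint.
HEALTH_STATES = {
    200: "active",
    429: "standby",
    472: "dr-secondary",
    473: "performance-standby",
    501: "not-initialized",
    503: "sealed",
}


def classify_health(status_code):
    """Map a /v1/sys/health status code to a node state."""
    return HEALTH_STATES.get(status_code, "unknown")


def pick_active(status_by_node):
    """Return the nodes a load balancer should send traffic to."""
    return [
        node
        for node, code in status_by_node.items()
        if classify_health(code) == "active"
    ]


def probe(base_url, timeout=2.0):
    """Fetch the health status code of one node (hypothetical helper)."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/sys/health", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx codes such as 429 (standby) arrive as HTTPError.
        return err.code
```

For example, `pick_active({"vault-0": 429, "vault-1": 200, "vault-2": 429})` keeps only `vault-1`, which is what a load balancer health check configured to accept only HTTP 200 would do.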

When I say “managing 5 vaults” I mean 5 Vault installations, not just 5 servers, because the first option is to set up a separate Vault for every k8s cluster.

What do you suggest: should there be one Vault per environment, or two (prod and non-prod)?

Generally speaking, you’d want to have as few Vault clusters to manage as possible, which probably means a single HCP Vault as a Service (that offers Enterprise namespaces).

Now, if that’s not an option, you may want to go with 2 Vault installations, one for prod and another for non-prod, but that comes with challenges too. Because the secrets will be organized in different ways across installations, you can’t guarantee, for instance, that the same policy that works in non-prod will work for prod.
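
As a rough illustration of why a policy may not carry over, suppose (purely as an assumption) that the shared non-prod Vault prefixes KV v2 paths with the environment name while the prod Vault does not. The sketch below renders a read-only policy from one template; the mount name `secret`, the app name and the path layout are all hypothetical.

```python
from typing import Optional


def render_read_policy(app: str, env_prefix: Optional[str] = None, mount: str = "secret") -> str:
    """Render a read-only Vault policy for one app's KV v2 secrets.

    env_prefix is set for a shared non-prod Vault (assumed layout) and
    left empty for a dedicated prod Vault.
    """
    parts = [mount, "data"]
    if env_prefix:
        parts.append(env_prefix)
    parts.extend([app, "*"])
    path = "/".join(parts)
    return f'path "{path}" {{\n  capabilities = ["read"]\n}}\n'
```

Here `render_read_policy("billing", "staging")` and `render_read_policy("billing")` grant access to different paths, so testing the first in non-prod says little about the second in prod unless both are generated from the same template.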

The only real way around this is to actually have one Vault per environment, though that’s a lot of work, even if you deploy and manage them through GitOps. The best thing I could think of that would really help in that scenario is some sorta Vault “federation”, which I don’t think exists. So your best alternative, I’d say, is to automate as much as possible of your deployments and configuration.

Thank you so much for your response. Could you please share your thoughts on the points below as well? It will help me make a decision:

  • If I set up one Vault per environment (in Kubernetes), there are some secrets that do not belong to any specific environment, or that will be consumed by all the environments (for example, a GitLab token, some Grafana passwords, etc.). It is hard to decide which Vault instance should store those types of secrets.

  • If I set up Vault in Kubernetes (as all our environments are already in k8s), it does not make technical or logical sense to use Vault to set up and configure the k8s cluster itself (a kind of chicken-and-egg issue). Vault should exist before the Kubernetes cluster.

Your response will be helpful.

Thanks again,

Arif

Hi @arif.hussain,

I guess to address both of your concerns above, you’d have to build a dependency graph and figure out your chicken and egg situation first. There is no one-size-fits-all answer for this.

At my company we use identity-based policies whenever possible, so we don’t have to rely on Vault for most of the initial deployments (think GitHub OIDC provider + Atlantis + Terraform).

Vault is mainly used for database dynamic secrets and API integrations, in which case it can be deployed after the clusters are up and running, no problem about that.

More specifically about placement of secrets, my rule of thumb is: never share secrets across environments to limit the blast radius, but if you absolutely have to, the environment of highest criticality that uses the secret defines its placement i.e., if a secret is shared across all environments including production then it should be in the production Vault (though you may need to create extra fine-grained policies to allow clients from higher-risk environments to access those secrets).
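
That rule of thumb can be written down as a small sketch. The criticality ranking below is an assumption for illustration only; the environment names come from this thread, and the numeric order is hypothetical.

```python
# Hypothetical ranking: a higher number means higher criticality.
CRITICALITY = {"dev": 0, "training": 1, "staging": 2, "production": 3}


def placement(consumers):
    """Return the environment whose Vault should hold a shared secret.

    consumers is an iterable of environment names that use the secret.
    """
    consumers = list(consumers)
    if not consumers:
        raise ValueError("a secret needs at least one consuming environment")
    unknown = [env for env in consumers if env not in CRITICALITY]
    if unknown:
        raise ValueError(f"unknown environments: {unknown}")
    # The most critical consumer decides where the secret lives.
    return max(consumers, key=lambda env: CRITICALITY[env])
```

A GitLab token used by dev, staging and production would therefore land in the production Vault, while a secret shared only by dev and training would land in the training Vault.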

P.S.: I’m curious as to how one would use Vault to set up k8s itself.

Sorry for the late response.
To answer the question “How would Vault be used to set up a Kubernetes cluster?”:

  • We have a GitLab pipeline with a number of jobs for Helm deployments in Kubernetes.
  • Kubernetes cluster initialization is manual. Once a Kubernetes cluster is initialized and up and running, the GitLab pipeline takes care of all Helm deployments.
  • The GitLab pipeline deploys MetalLB, the NGINX ingress controller, RabbitMQ, MinIO and many more… The same pipeline will also install and deploy Vault in Kubernetes.

Now, let’s take the example of the RabbitMQ and MinIO deployments. When GitLab installs and configures MinIO and RabbitMQ, it also configures their admin users and passwords. In my case, the job will get the secrets from Vault, but Vault is being set up by the same pipeline (a separate job for Vault in the same pipeline).

I will run the Vault job first and all other jobs after the Vault setup. The problem will only occur when we set up a Kubernetes cluster for the first time; afterwards, the Vault secrets will be available to the other jobs.
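
The ordering I have in mind can be sketched as a small dependency graph. This is only an illustration using Python's standard `graphlib`; the job names mirror the components above, and which jobs actually need Vault secrets (here only RabbitMQ and MinIO) is an assumption.

```python
from graphlib import CycleError, TopologicalSorter

# Each job maps to the set of jobs that must finish before it starts.
JOBS = {
    "vault": set(),
    "metallb": set(),
    "nginx-ingress": set(),
    "rabbitmq": {"vault"},  # reads its admin password from Vault
    "minio": {"vault"},     # reads its admin password from Vault
}


def job_order(deps):
    """Return one valid execution order for the pipeline jobs."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """Report whether the jobs contain a chicken-and-egg dependency."""
    try:
        job_order(deps)
    except CycleError:
        return True
    return False
```

With this graph, `job_order(JOBS)` always places `vault` before `rabbitmq` and `minio`. If Vault itself ever needed something that one of those jobs provides, for example `{**JOBS, "vault": {"minio"}}`, `has_cycle` would return `True`, which is exactly the first-run problem.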

If you have any ideas to make this design better, I would love to hear them.

I hope I made it clear.

Thanks,