Best practice for managing policies/auth-methods in an HA setup

pat-s · January 17, 2021, 10:21am

We’ve set up an HA Vault deployment with two instances (and an HA Consul backend).

While the communication between the vault instances and their backends seems fine (if one goes offline, the other one becomes active and has all secrets available), I am wondering how to best manage the auth methods setup in the different instances.

AFAICS only the KV store is synced via Consul, and all policy/auth method setup is unique to each Vault instance.

This means that each change to a policy/auth setup needs to be applied manually in all vault instances.
Given that only one instance is active at a time, activating and deactivating instances for a small change is tedious.

Is there a better way that I am overlooking?

stuart-c · January 17, 2021, 11:00am

Everything is stored in the configured backend, so failure of one node should be pretty seamless. You might lose things which are not yet committed to Consul. Presumably you have a separate Consul cluster, as you have only 2 Vault servers, so you don’t have a Consul failure issue. You can also get a small amount of downtime, depending on which method you use to point traffic at the active node.

pat-s · January 17, 2021, 11:18am

Thanks for the quick reply @stuart-c.

Unfortunately I can’t follow your reply fully.

We have two vault instances which are connected via a consul-backend cluster (3 servers, 2 agents).

If one vault instance goes down, the KV store is available on the other instance (which is fine).
However, policies and access methods from instance1 are not available on instance2.
Do I understand you correctly that policies and access methods should also be synced between the instances (and their Consul backends)?

Apologies if the lack of understanding is on my side.

stuart-c · January 17, 2021, 12:09pm

There is no syncing between instances of Vault itself - they never communicate directly (with the exception of optionally having standbys talking to the primary if you send a query to one instead of the primary).

All configuration, secrets (both KV and dynamic secrets such as database or PKI), leases and tokens are stored in the backend, which in your case is Consul (but could be MySQL, DynamoDB, etc.).

Are you saying that for some reason you aren’t seeing that with your setup?

How do you have things configured? You have 3 separate Consul servers correctly clustered and then each Vault server runs an agent (which the Vault server talks to via localhost)?

How are you unsealing your two Vault instances?

Could you post the configuration for the two Vault instances?
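
One way to test the point above, that policies live in the shared backend rather than in each instance, is to list them through both nodes and compare. This is only a rough sketch: the node addresses and the function names are placeholders rather than anything from this thread, and it assumes a valid token is already set in the environment. Keep in mind that a standby normally forwards requests to the active node, so the comparison tells you most after a failover.

```shell
# Hypothetical node addresses; replace with your own.
NODE1="${NODE1:-https://vault1.example.com:8200}"
NODE2="${NODE2:-https://vault2.example.com:8200}"

# List policy names as seen through the node at address $1.
policies_via() {
  VAULT_ADDR="$1" vault policy list
}

# Succeeds only if both nodes report the same set of policies.
same_policies() {
  [ "$(policies_via "$NODE1" | sort)" = "$(policies_via "$NODE2" | sort)" ]
}
```

Running `same_policies && echo "policies match"` after defining these gives a quick yes/no. The same pattern should work for auth methods by swapping in `vault auth list`.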

aram535 · January 17, 2021, 3:34pm

Sounds like something is not configured correctly. Two nodes in the same cluster would have identical information; one would be ‘read-only’ and the other ‘active’. (Note that HashiCorp recommends at least 3 nodes in an HA setup.) The read-only node will become ‘active’ if the active node stops responding. Please post your Consul and Vault configurations.

Check that the consul servers are all linked and ready, run this on any node:
consul operator raft list-peers

Check the status of the vault servers, run this on each node:
vault status | grep -i "HA mode"
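
If you want just the mode value for scripting, a small helper along these lines might do. It is a sketch, not an official tool: the function name `ha_mode` is made up here, and it assumes the status table contains a row starting with `HA Mode` followed by the value.

```shell
# Read `vault status` output on stdin and print the value of the "HA Mode" row.
ha_mode() {
  awk '/^HA Mode/ { print $3 }'
}
```

Usage would be `vault status | ha_mode` on each node. In a correctly clustered pair you would expect one node to report `active` and the other `standby`.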

pat-s · February 9, 2021, 7:36am

Hi guys,

Sorry for the late response. Just wanted to mention that we’re still trying to get this setup running, and I’ll report back when I next revisit it. There was/is lots of other stuff to do. Thanks for your input, appreciated!
