I’m new to Vault and Consul and I’ve been studying the best way to implement this infrastructure. I’ve already built some tests with Vault using Consul as storage, but I have a couple of doubts about the best way to set it up. Specifically, I’m deciding between the following options:
Using 3 Consul servers and 2 Vault servers with a Consul client agent installed on each Vault server. I had some issues connecting the clients to the servers, usually errors reaching the IP and port of the different Consul members. Sometimes it works and sometimes it doesn’t, which was a little weird, so honestly I didn’t like it much. I followed this tutorial: Vault High Availability with Consul | Vault - HashiCorp Learn. I didn’t trust it much because part of the config was deprecated, so it may be an old way to implement HA, but I’m not sure.
The other way I tried, and the one I like more, is creating 3 Consul servers and 2 Vault servers, with Vault connecting to Consul using an ACL token, so it wasn’t necessary to install Consul on the Vault servers. I think this way is cleaner and easier, but I’ve read that connecting Vault directly to Consul is bad practice, and I’m not sure whether "connecting directly" means using an ACL token. I don’t know if this is bad practice or if it’s a good way to do it that won’t cause problems in the future.
Really, if you’re setting up a new Vault these days, and you’re not already an experienced Consul sysadmin, you should choose Vault’s Integrated Storage (Raft), not Consul.
That way you only have to administer one set of servers.
2 Vault servers is not a sensible number to have. You need to have 3, in order to be able to tolerate 1 failed server. If you only have 2, you might as well just have 1.
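To make the arithmetic concrete, here is a minimal sketch, assuming a Raft-style majority quorum (as used by Integrated Storage and by Consul servers). The function names are made up for illustration and are not part of any Vault or Consul API.

```python
def quorum(servers: int) -> int:
    """Smallest majority of a cluster with `servers` voting members."""
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return servers - quorum(servers)


# Print cluster size, quorum size and tolerated failures side by side.
for n in range(1, 6):
    print(f"servers={n} quorum={quorum(n)} tolerated={failures_tolerated(n)}")
```

Under that assumption, 2 servers tolerate no failures, the same as 1, and 4 servers tolerate only 1, the same as 3.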
With such a vague description of a problem, no-one can help you.
Yes, because Vault really does not like it if it cannot talk to its configured Consul, so by having a local Consul agent, you ensure it can always reach the localhost Consul, which will take care of routing traffic to the correct cluster nodes.
You need an ACL token for both of the architectures you discuss, not just the 2nd.
Oh boy … there are a lot of bad assumptions in your question.
HashiCorp uses the Raft protocol to enable HA in their products. Raft wants an odd number of servers, because an even number adds no extra failure tolerance over the odd number just below it, so 2 is a bad number of instances. You always want to have 1, 3, 5, etc. This applies to both Vault and Consul.
You didn’t include enough information about the network issues for anyone to help, but my guess is that your even number of instances is part of the problem. Using Consul agents on each Vault node means that Vault connects locally to those agents, and the agents figure out how and where to connect to the Consul servers. So if you’re having connectivity issues, it’s most likely a Consul datacenter HA issue.
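One way to narrow this down is to ask the local agent what it sees. The sketch below is only an illustration: it assumes the agent listens on Consul’s default HTTP address 127.0.0.1:8500 and uses the /v1/status/leader and /v1/status/peers endpoints of the Consul HTTP API. The helper names are hypothetical, and the token argument is only needed if your setup requires one.

```python
import json
import urllib.request


def status_request(path, addr="http://127.0.0.1:8500", token=None):
    """Build a request for a Consul status endpoint on the local agent.

    `addr` is assumed to be the agent's default HTTP address; adjust it
    if your agent is configured differently.
    """
    headers = {"X-Consul-Token": token} if token else {}
    return urllib.request.Request(f"{addr}/v1/status/{path}", headers=headers)


def fetch_status(path, **kwargs):
    """Send the request and decode the JSON body returned by the agent."""
    with urllib.request.urlopen(status_request(path, **kwargs), timeout=5) as resp:
        return json.load(resp)
```

Running fetch_status("leader") and fetch_status("peers") on each Vault node shows whether the local agent currently sees a leader and which servers are in the Raft peer set. An empty leader or a connection error points at the Consul side rather than at Vault.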
That Learn article is valid, although a bit dated. It’s just for learning, not for deploying into an environment for real use. Use the Reference Architecture for an actual deployment.
It is 100% necessary to install Consul (as an agent) on the Vault instances. Skipping it isn’t “cleaner”, whatever that means.