I’m new to vault and I’m trying to set up a cluster of 3 vault nodes using raft as storage. I’ve messed with the config quite a bit but can’t seem to get the nodes to “peer” each other.
Vault configuration /etc/vault.d/vault.hcl
storage "raft" {
  path    = "/opt/raft"
  node_id = "raft_node_<node # (1, 2, or 3)>"
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = true
}

api_addr     = "http://<IP address of this node>:8200"
cluster_addr = "https://<IP address of this node>:8201"
ui           = true
1. Start the server on each of the 3 nodes
   sudo vault server -config=/etc/vault.d/vault.hcl -log-level=trace
2. Set vault address on each node to be node 1’s IP:8200
   export VAULT_ADDR=http://<node 1's IP>:8200
3. Init vault cluster on node 1
   vault operator init
4. Try to join node 1 from nodes 2 and 3
   vault operator raft join "http://<node 1's IP>:8200"
   Note: This command pretty much always returns “Joined true” even if I plug in some random value like vault operator raft join asdfasdf
5. Run vault operator unseal once on node 2 and 3 (threshold is 3)
6. Do step 4 again just in case
7. Do the final unseal on node 1
8. Set VAULT_TOKEN and run vault operator raft list-peers
Actual output:
Node           Address               State       Voter
----           -------               -----       -----
raft_node_1    <Node 1's IP>:8201    leader      true
Expected output:
Node           Address               State       Voter
----           -------               -----       -----
raft_node_1    <Node 1's IP>:8201    leader      true
raft_node_2    <Node 2's IP>:8201    follower    true
raft_node_3    <Node 3's IP>:8201    follower    true
Also I’m able to store secrets on one node and pull them down on another node even though they’re not listed as peers.
These are the docs I’ve been following
Is anyone able to see where I’m going wrong here?
Hello,
Yes, you are correct: Joined true does not indicate that the node joined successfully. A successful vault operator unseal ... on that node is what indicates a successful join to the cluster.
I would suggest setting your log-level to TRACE and observing the messages in the operational logs at the time you run raft join... on node 2 and node 3; this should reveal the reason the nodes are not joining.
Martin
Okay that’s what I thought.
I ran all the servers with log-level=trace and got no output on nodes 2 and 3 when unsealing; their logs did not change at all after being started. Here is the output from node 1 during the unseal process:
Hello,
The log shared from node 1 does not seem to provide any useful information.
Can you share the TRACE logs from node 2 and node 3 from when you execute the raft join... command?
Martin
Yeah sure
On the left are the trace server logs of node 2 and on the right are the commands I ran on node 2. The logs did not change at all from when I started the server.
Here’s the return from raft list-peers after trying to join the cluster
Hello,
I can see that node 2 still says “Sealed: true” after you do “vault operator unseal”. A raft node is successfully joined to the cluster when it is unsealed correctly after joining.
What kind of “seal” stanza do you use? If auto-unseal is being used, are the unseal keys the same on node 1 and node 2?
Martin
Hey Martin,
So I have no seal stanza definition so I’m using the default Shamir seal. When I init on vault node 1 I just copy those unseal keys to the other nodes.
The vault still says sealed after I ran vault operator unseal because the unseal progress was only at 2/3. After I unsealed on this node I finished up on the rest and later ran the vault operator raft list-peers command you see in the last screenshot.
Hello,
Vault needs to unseal successfully in order to join the cluster. I can see that you have 3 Shamir keys; what do you see when you enter all 3 keys? Here are example steps:
- Execute vault operator raft join... on node 2 to join it to node 1
- Execute vault operator unseal UNSEAL_KEY 3 (three) times, using a different unseal key each time.
Do you see an error during the unseal process?
Martin
Hello again Martin!
Sorry I was out all weekend but here is the log output from completing these steps:
- Start server in trace mode on all three vault servers
- export VAULT_ADDR=<node 1's IP>:8200 on all three servers
- vault operator init on server 1
- vault operator raft join "http://<node 1's IP>:8200" on node 2
- I then ran vault operator unseal 3 times on node 2 (with different unseal keys each time)
- At this point the unseal process was done so I set VAULT_TOKEN on node 2 and ran vault operator raft list-peers and got the same result, only the leader (node 1) was listed
The very first line of this log output that says “core: pre-seal teardown complete” is the very last line of the output generated by vault operator init; the rest of the output is from unsealing the vault.
Note: I don’t see output in any of the servers when I run vault operator raft join "http://<node 1's IP>:8200"
Thanks again, Carter
Hello,
Running export VAULT_ADDR=<node 1's IP>:8200 on all three servers means that all of your commands are executed only against node 1.
On node 2 and node 3, point the CLI at the local server instead: export VAULT_ADDR=http://localhost:8200 on each of them (http:// because TLS is disabled on your listener).
The VAULT_ADDR variable specifies which host your CLI commands are executed against; more info here.
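To make that concrete, here is a rough sketch of what would run on node 2 (and again on node 3). It assumes the listener from the config at the top of the thread (plain HTTP on port 8200) and the unseal threshold of 3 mentioned earlier. The function name join_and_unseal_local is made up for illustration and is not part of Vault, and 127.0.0.1 is an assumed loopback address for the local server.

```shell
#!/bin/sh
# Sketch only: run this on node 2 and node 3, not on node 1.
# join_and_unseal_local is a hypothetical helper name, not a Vault command.
join_and_unseal_local() {
    leader_api_addr="$1"   # e.g. http://<node 1's IP>:8200
    threshold="${2:-3}"    # key shares required (3 in this thread)

    # Point the CLI at the Vault server running on THIS machine.
    export VAULT_ADDR="http://127.0.0.1:8200"

    # Ask the local node to join the raft cluster led by node 1.
    vault operator raft join "$leader_api_addr" || return 1

    # Each node runs its own full unseal; the command prompts for a key share.
    i=0
    while [ "$i" -lt "$threshold" ]; do
        vault operator unseal || return 1
        i=$((i + 1))
    done

    # Show the local node's seal state after the unseal attempts.
    vault status
}
```

It would be called as join_and_unseal_local "http://<node 1's IP>:8200", entering a different unseal key at each prompt. Once that has been done on both followers, vault operator raft list-peers should be the way to confirm that they show up next to the leader.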
Hi Martin!
It’s working! I looked through all sorts of docs but I guess I never understood that the VAULT_ADDR var was supposed to be the local machine’s IP.
I also thought that each node only had to contribute to a single unseal process (each node did it at least one time) and didn’t understand that each node had to do its own entire unseal process (me having the wrong VAULT_ADDR definitely didn’t help with this either lol).
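Since every node keeps its own seal state, one way to double-check this is to ask each server for its status explicitly rather than relying on whatever VAULT_ADDR happens to be set to. This is only a sketch, assuming plain HTTP on port 8200 as in the config above; status_of_each_node is a made-up helper name, and the node addresses are passed in as arguments.

```shell
#!/bin/sh
# Sketch only: status_of_each_node is a hypothetical helper, not a Vault command.
# It queries every address it is given, so each node reports its own seal state.
status_of_each_node() {
    for addr in "$@"; do
        echo "== $addr"
        # -address overrides VAULT_ADDR for this one call.
        # vault status exits non-zero for a sealed node, so do not abort on it.
        vault status -address="$addr" || true
    done
}
```

Called as status_of_each_node "http://<node 1's IP>:8200" "http://<node 2's IP>:8200" "http://<node 3's IP>:8200", it prints one status block per node, which makes it easy to spot a node that is still sealed.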
Thank you so much for all your help!!
- Carter
Hello,
I’m glad I was able to help. Wish you all the best!
Martin