HCL and TLS config versus "connection refused" logs

I would like to submit my HCL configuration to you because I think I have a communication problem that is probably related to my certificates.

I am using Vault OSS 1.12.6.

Vault works well, but I often see this error in my logs:

[ERROR] storage.raft: failed to appendEntries to: peer="{Voter nodex abjd081x.myenterprise.be:8201}" error="dial tcp connect: connection refused"
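To see whether the failures always hit the same peer, the log can be tallied per peer. The following is only a minimal sketch: the regular expression is written against the single log line quoted above, so other Vault versions or log formats may need a different pattern, and the function name `count_failures` is made up for this example.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted above (an assumption about the
# general format). Accepts straight or curly quotes, since pasted logs may
# contain either.
PATTERN = re.compile(
    r'failed to appendEntries to: peer=["“]\{(?P<suffrage>\w+) '
    r'(?P<node_id>\S+) (?P<addr>[^}]+)\}["”]'
    r'\s+error=["“](?P<error>[^"”]+)["”]'
)


def count_failures(lines):
    """Count appendEntries failures per (node_id, address, error)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match["node_id"], match["addr"], match["error"])] += 1
    return counts
```

Feeding it the lines of the leader's log file (for example `count_failures(open(path))`, where `path` is wherever your Vault log is written) shows whether one follower or all of them are refusing connections.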

I would like your opinion on the HCL configuration, to detect any anomalies.

My cluster is composed of the following 3 virtual machines:

node1: abjd0812.myenterprise.be
node2: abjd0813.myenterprise.be
node3: abjd0814.myenterprise.be

Client access to Vault (north-south communication) goes through an F5 at https://vault.myenterprise.be, which forwards port 443 to port 8200 (with health check: /v1/sys/health).
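Since the F5 decides where to send traffic based on /v1/sys/health, it helps to know what each node answers there. Below is a small sketch that queries the endpoint and maps the default status codes documented for Vault's health API; the helper names are invented for this example, and TLS verification uses the system trust store, which is an assumption about your CA setup.

```python
import urllib.error
import urllib.request

# Default status codes of Vault's /v1/sys/health endpoint.
HEALTH_MEANINGS = {
    200: "initialized, unsealed and active",
    429: "unsealed and standby",
    472: "disaster recovery secondary (Enterprise)",
    473: "performance standby (Enterprise)",
    501: "not initialized",
    503: "sealed",
}


def describe_health_status(code):
    """Translate a health status code into a human-readable state."""
    return HEALTH_MEANINGS.get(code, f"unexpected status {code}")


def check_health(base_url, timeout=5.0):
    """Query <base_url>/v1/sys/health and return (status code, meaning)."""
    url = base_url.rstrip("/") + "/v1/sys/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            code = resp.status
    except urllib.error.HTTPError as exc:
        # urllib raises HTTPError for non-2xx answers such as 429 or 503.
        code = exc.code
    return code, describe_health_status(code)
```

For example, `check_health("https://abjd0812.myenterprise.be:8200")` called against each node in turn would show which one is active. A health monitor that only accepts 200 will send traffic to the active node alone, because standby nodes answer 429 by default.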

Listener 8200 handles north-south communication (clients to Vault).
The vault.myenterprise.be.pem certificate contains only CN=vault.myenterprise.be and SAN=vault.myenterprise.be.

Listener 9200 handles east-west communication (between nodes).
The vault.cluster.myenterprise.be.pem certificate contains these SANs:
SAN1: abjd0812.myenterprise.be
SAN2: abjd0813.myenterprise.be
SAN3: abjd0814.myenterprise.be
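A quick way to confirm that a certificate really covers every node name is to compare its DNS SANs against the host list. This is a rough sketch using only the Python standard library; the function names and the `cafile` parameter are placeholders of mine, and wildcard SANs are deliberately not handled.

```python
import socket
import ssl


def dns_sans(peercert):
    """Extract DNS subjectAltName entries from a getpeercert() dict."""
    entries = peercert.get("subjectAltName", ())
    return {value.lower() for kind, value in entries if kind == "DNS"}


def uncovered_hosts(peercert, hostnames):
    """Return the hostnames with no exact DNS SAN match (no wildcard support)."""
    sans = dns_sans(peercert)
    return sorted(h for h in hostnames if h.lower() not in sans)


def fetch_peercert(host, port, cafile, timeout=5.0):
    """Fetch the certificate served on host:port, verified against cafile.

    A failed handshake (ssl.SSLCertVerificationError) is itself a sign that
    the certificate does not match the name that was dialled.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling `uncovered_hosts(fetch_peercert("abjd0812.myenterprise.be", 9200, cafile), nodes)` with the three node names should return an empty list if the SANs are as described above.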

HCL configuration file of a node:

storage "raft" {
  path = "/raft_data"
  node_id = "node1"
}

listener "tcp" {
  address = "abjd0812.myenterprise.be:9200"
  cluster_address = "abjd0812.myenterprise.be:9201"
  tls_disable = 0
  tls_cert_file = "/etc/ssl/certs/vault.cluster.myenterprise.be.pem"
  tls_key_file = "/etc/ssl/private/vault.cluster.myenterprise.be.key"
  tls_disable_client_certs = "true"
  tls_min_version = "tls10"
}

listener "tcp" {
  address = "abjd0812.myenterprise.be:8200"
  tls_disable = 0
  tls_cert_file = "/etc/ssl/certs/vault.myenterprise.be.pem"
  tls_key_file = "/etc/ssl/private/vault.myenterprise.be.key"
  tls_disable_client_certs = "true"
  tls_min_version = "tls10"
}

cluster_addr = "https://abjd0812.myenterprise.be:8201"
api_addr = "https://abjd0812.myenterprise.be:9200"
log_level = "Trace"

What do you think about it?

Should I change the URL https://abjd0812.myenterprise.be:8201 to https://vault.myenterprise.be:8201?

  1. Welcome to the forum - please reformat your message

  2. It seems surprising to me that you have two listener "tcp" blocks. Yes, I do see you’re using that to serve different TLS certificates on port 8200 vs. 9200, but Vault nodes do not talk to each other on their API port, with the one narrow exception of a node’s initial join to a Raft cluster.

The actual “east-west” traffic in normal operation flows on port 8201.

No, you must absolutely not change it to https://vault.myenterprise.be:8201, as the Vault cluster_addr must be unique per node and route directly to that particular node.
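To make that rule concrete, here is a small sanity check one could run over the cluster_addr values of all nodes. It is just an illustration: `check_cluster_addrs` is a made-up helper, and the values for node2 and node3 in the usage note are assumed by analogy with the node1 configuration shown earlier.

```python
from urllib.parse import urlsplit


def check_cluster_addrs(cluster_addrs, shared_hostnames):
    """Return a list of problems found in per-node cluster_addr values.

    cluster_addrs: dict mapping node_id -> cluster_addr URL.
    shared_hostnames: names that resolve to a load balancer, not to one node.
    """
    problems = []
    seen = {}
    shared = {name.lower() for name in shared_hostnames}
    for node_id, addr in cluster_addrs.items():
        parts = urlsplit(addr)
        host = (parts.hostname or "").lower()
        key = (host, parts.port)
        if host in shared:
            problems.append(f"{node_id}: {host} is a shared name, not a node")
        if key in seen:
            problems.append(f"{node_id}: same cluster_addr as {seen[key]}")
        else:
            seen[key] = node_id
    return problems
```

With `{"node1": "https://abjd0812.myenterprise.be:8201", "node2": "https://abjd0813.myenterprise.be:8201", "node3": "https://abjd0814.myenterprise.be:8201"}` and `{"vault.myenterprise.be"}` as the shared names, the result is an empty list. Pointing every node at https://vault.myenterprise.be:8201 instead gets flagged on both counts.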

Is the listed IP address correct for your nodes?

Are there firewalls that could be blocking connections between nodes?

Is the Vault server process on the destination node actually running at these times? If so, what is the destination Vault logging around this time?

I didn’t want the machine names to appear in the certificate exposed to clients (on the north-south stream) so that’s why I added the second listener.

No, it’s just an example.

Hmm, there’s no firewall on the nodes’ operating systems; they are in the same EPG (Cisco ACI) and the same datacenter.

I’m sorry, but I don’t understand what you mean. Can you rephrase, please?

You have a clear error message that explicitly says the Vault leader node encountered a TCP connection refused error when trying to send changes to a follower node.

Now it is necessary to figure out why.

Connection refused usually means that the receiving node isn’t even running, or isn’t listening on that port.

So, I’m suggesting that if the sending node is getting connection refused, you should also be looking at the logs of the receiving node, to figure out what it was doing during that time.
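While going through those logs, a plain TCP probe from the leader towards each follower's cluster port can tell a refused connection (nothing listening) apart from a timeout (usually packets being dropped on the way). A minimal sketch, with `probe` being a name chosen for this example:

```python
import socket


def probe(host, port, timeout=3.0):
    """Attempt a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on this port.
        return "refused"
    except socket.timeout:
        # No answer at all; typically filtering or routing, not a stopped process.
        return "timeout"
    except socket.gaierror as exc:
        return f"dns error: {exc}"
    except OSError as exc:
        return f"error: {exc}"
```

Running `probe(node, 8201)` for each of the three node names, repeatedly and around the times the error appears, should show whether the port is only intermittently closed, for instance while Vault on that node is restarting or sealed.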