Vault joins as non-voter in a Raft cluster

We are using the version 1.14 Docker image (we also tried 1.13 and 1.12) with Docker Compose to set up a Raft Vault cluster. I should mention that we are hosting our two Vaults on two separate hosts behind a Traefik reverse proxy. Our config.hcl looks like this:

listener "tcp" {
    address = "0.0.0.0:8200"
    cluster_addr = "127.0.0.1:8201"

    tls_disable = 1
}

storage "raft" {
    path = "/vault/data"
    node_id = "node1" # node2 for vault 2
    
    retry_join {
        leader_api_addr = "https://hashi-vault-1.domain.tld"
    }

    retry_join {
        leader_api_addr = "https://hashi-vault-2.domain.tld"
    }

}

disable_mlock = true
api_addr = "https://hashi-vault-1.domain.tld:443" # vault-2 for second vault
cluster_addr = "https://hashi-vault-1.domain.tld/cluster:443" # vault-2 for second vault
ui = true

And our docker-compose.yml is the following:

version: '3.7'

networks:
  "{{ traefik_network }}":
    name: "{{ traefik_network }}"
    external: true

services:
  vault:
    container_name: "vault_1" # vault_2 for second vault
    restart: unless-stopped
    image: "hashicorp/vault:1.14"
    volumes:
      - "{{ vault_project_dir }}/config:/vault/config"
      - "{{ vault_project_dir }}/data:/vault/data:z"
    command: "server"
    expose:
      - 8200
      - 8201
    cap_add:
      - IPC_LOCK
    labels:
      - traefik.enable=true

      # Port 8200
      - traefik.http.routers.vault_api.rule=Host(`hashi-vault-1.domain.tld`)
      - traefik.http.routers.vault_api.entrypoints={{ traefik_entrypoint }}
      - traefik.http.routers.vault_api.tls=true
      - traefik.http.routers.vault_api.service=vault_api

      - traefik.http.services.vault_api.loadbalancer.server.port=8200

      # Port 8201
      - traefik.http.routers.vault_cluster.rule=Host(`hashi-vault-1.domain.tld`) && Path(`/cluster`)
      - traefik.http.routers.vault_cluster.entrypoints={{ traefik_entrypoint }}
      - traefik.http.routers.vault_cluster.tls=true
      - traefik.http.routers.vault_cluster.service=vault_cluster

      - traefik.http.services.vault_cluster.loadbalancer.server.port=8201

    networks:
      - "{{ traefik_network }}"

We start both vaults with exactly this configuration on both hosts, with the exception of the vault instance number (1 or 2).

What works:

  • the vaults start and are reachable over the defined addresses
  • we can initialize the first vault via UI (by using shamir)
  • we reach the second vault via UI and after unsealing it is shown in the UI of the first vault (which is the leader)

What does not work:

  • after unsealing the second vault the UI of this vault is stuck in the unsealing step
  • the second vault is also only shown as a non-voter in the UI of the first vault
  • when we kill the first vault, we can’t access any vault as no vault becomes the leader

Error messages from the logs (Vault 1):

  • storage.raft: failed to appendEntries to: peer="{Nonvoter node1 hashi-vault-1.domain.tld}" error="dial tcp: address hashi-vault-1.domain.tld: missing port in address"
  • storage.raft: failed to heartbeat to: peer=hashi-vault-1.domain.tld backoff time=1.28s error="dial tcp: address hashi-vault-1.domain.tld: missing port in address"

Error messages from the logs (Vault 2):

  • core: failed to retry join raft cluster: retry=2s err="waiting for unseal keys to be supplied"
  • core: failed to retry join raft cluster: retry=2s err="failed to send answer to raft leader node: error bootstrapping cluster: cluster already has state"
  • core: failed to get raft challenge: leader_addr=https://hashi-vault-1.domain.tld error="error during raft bootstrap init call: context deadline exceeded"

Vault 2 also has this info which could be interesting:

  • [INFO] core: security barrier not initialized

Any help is appreciated! Please don’t hesitate to ask for further information (logs, system configuration, etc.).

If you are using raft as your storage you should run an odd number of servers in the cluster, so 1, 3 or 5. With 2 servers, losing either one costs you quorum, so you gain no fault tolerance.

Binding the listener’s cluster address to 127.0.0.1:8201 is an immediate red flag… the entire purpose of the cluster listener is inter-node communication, so binding it to localhost cannot be correct.

Setting the node_id is a bit of a yellow flag. There is no good reason to set this. Leave it unset, and vault will generate and store a UUID instead, which insulates you from mistakes managing the node ID. (But don’t change it for an existing cluster. Do it when recreating a cluster.)

The api_addr is incorrect. Your listener is on port 8200, not 443. Also, you have tls_disable set, so this needs to be http, not https.

The top-level cluster_addr is incorrect. Your cluster listener is on port 8201, not 443, and the /cluster URL path is also incorrect and should be deleted. (This one stays https though.)

The join problems are probably due to the bad cluster_addr settings.

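That no Vault becomes the leader after you kill the first one is expected. In a 2-node Raft cluster, both nodes must be up for the cluster to be functional, because a majority of 2 voters is 2. This is normal for any consensus/quorum system. 3 nodes are required to tolerate a node failure.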
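
As an illustration of a three-node layout, here is a minimal sketch of the storage stanza. It assumes a third host named hashi-vault-3.domain.tld, which is a hypothetical name following the pattern of the two hosts in the question. node_id is left unset, as suggested above.

```hcl
storage "raft" {
    path = "/vault/data"

    # One retry_join block per node that might currently be the leader.
    retry_join {
        leader_api_addr = "https://hashi-vault-1.domain.tld"
    }

    retry_join {
        leader_api_addr = "https://hashi-vault-2.domain.tld"
    }

    # Hypothetical third node, not part of the setup in the question.
    retry_join {
        leader_api_addr = "https://hashi-vault-3.domain.tld"
    }
}
```

As far as I know, the same stanza can be used on every node: once one node has been initialized and becomes the leader, the others use these addresses to find it and join.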

Ok, that explains a lot… Thank you!

Thanks for the detailed answer! The first two points make sense, we will change that. Regarding the ports however we are not 100% sure. We did that because we’re using traefik as a reverse proxy and traefik doesn’t allow ports 8000 and 8001, so we did this as a fix. Do you think there is a better way? Thanks again for your answer, that’s really helpful!

So much to unpack here…

  • 8000 and 8001 are irrelevant to Vault. Vault’s ports are 8200 and 8201.

  • Although I am not familiar with Traefik, it seems implausible that a general purpose piece of software would forbid specific port numbers.

  • Vault’s internal cluster communication on port 8201 should go nowhere near any proxies - it is strictly from one Vault node directly to another Vault node.

  • When Vault is run behind a reverse proxy, it is appropriate to set api_addr to the address of the reverse proxy, which may involve a different port number - see the "High Availability" page of the Vault documentation on HashiCorp Developer.
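
Putting these points together, a minimal sketch of the addresses for node 1 could look like the following. It assumes that Traefik keeps serving the API on the default HTTPS port, and that port 8201 of the container is published directly on the host and reachable from the other nodes (both are assumptions about your setup, not something taken from your files). Note that the Vault documentation spells the listener option cluster_address.

```hcl
listener "tcp" {
    address         = "0.0.0.0:8200"
    cluster_address = "0.0.0.0:8201" # bind on all interfaces, not localhost
    tls_disable     = 1              # TLS for the API is terminated by the proxy
}

# Address advertised for client redirection: the reverse proxy in front of port 8200.
api_addr = "https://hashi-vault-1.domain.tld"

# Address the other Vault nodes dial directly: host and port only,
# no URL path, and not routed through the proxy.
cluster_addr = "https://hashi-vault-1.domain.tld:8201"
```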

Update:

We are now using three vaults instead of just two. We also updated the config.hcl and the docker-compose.yml.

This is our current config.hcl:

listener "tcp" {
    address = "0.0.0.0:8200"
    cluster_addr = "0.0.0.0:8201" # we changed that from localhost to 0.0.0.0

    tls_disable = 1
}

storage "raft" {
    path = "/vault/data"
    # removed node_id here
    
    retry_join {
        leader_api_addr = "https://hashi-vault-1.domain.tld"
    }

    retry_join {
        leader_api_addr = "https://hashi-vault-2.domain.tld"
    }

}

disable_mlock = true
api_addr = "https://hashi-vault-1.domain.tld" # removed :443 here
cluster_addr = "https://hashi-vault-1.domain.tld/cluster" # removed :443 here
ui = true

And this is our current docker-compose.yml:

version: '3.7'

networks:
  "{{ traefik_network }}":
    name: "{{ traefik_network }}"
    external: true

services:
  vault:
    container_name: "vault_1"
    restart: unless-stopped
    image: "hashicorp/vault:1.14"
    volumes:
      - "{{ vault_project_dir }}/config:/vault/config"
      - "{{ vault_project_dir }}/data:/vault/data:z"
    command: "server"
    expose:
      - 8200
      - 8201
    cap_add:
      - IPC_LOCK
    labels:
      - traefik.enable=true

      # Port 8200
      - traefik.http.routers.vault_api.rule=Host(`hashi-vault-1.domain.tld`)
      - traefik.http.routers.vault_api.entrypoints={{ traefik_entrypoint }}
      - traefik.http.routers.vault_api.tls=true
      - traefik.http.routers.vault_api.service=vault_api

      - traefik.http.services.vault_api.loadbalancer.server.port=8200

      # Port 8201
      - traefik.http.routers.vault_cluster.rule=Host(`hashi-vault-1.domain.tld`) && PathPrefix(`/cluster/`)
      - traefik.http.routers.vault_cluster.entrypoints={{ traefik_entrypoint }}
      - traefik.http.routers.vault_cluster.tls=true
      - traefik.http.routers.vault_cluster.service=vault_cluster
      - traefik.http.routers.vault_cluster.middlewares=vault_cluster_strip

      - traefik.http.middlewares.vault_cluster_strip.stripprefix.prefixes=/cluster
      - traefik.http.middlewares.vault_cluster_strip.stripprefix.forceSlash=false

      - traefik.http.services.vault_cluster.loadbalancer.server.port=8201

    networks:
      - "{{ traefik_network }}"

The first vault we start becomes the leader, but the following vaults join as non-voters as you can see in this screenshot:

And when we try to access hashi-vault-1.domain.tld/cluster (for all vaults) we get the following error, which is shown as a 404 error in the console:

[image2: screenshot of the 404 error, not included]

What is supposed to happen when trying to access the address we set as the cluster address?

We still get the same errors. This one in particular makes us wonder whether it hints at the problem we’re having:
storage.raft: failed to appendEntries to: peer="{Nonvoter node1 hashi-vault-1.domain.tld}" error="dial tcp: address hashi-vault-1.domain.tld: missing port in address"

We also get the following warning:

[WARN]  storage.raft: heartbeat timeout reached, not part of a stable configuration or a non-voter, not triggering a leader election

You have set the cluster_addrs incorrectly so it is unsurprising that the clustering isn’t working.

Refer to my earlier response:

Thanks for your response. We are almost there, our cluster is now working but we have one question left:

Why do we need to use https for the cluster_addr? I thought that if we disable TLS, we wouldn’t need https. We don’t have any certs so how would that work?

Thanks again for your previous help!

Because Vault always uses HTTPS for its internal clustering traffic - no exceptions. tls_disable does not apply to it. Vault generates its own internal certificates, which the user is not allowed to override and is not expected to ever interact with.
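
To illustrate the difference, here is a sketch of a node that is reached directly, without a reverse proxy (an assumption that differs from the setup in this thread). tls_disable only switches the API listener to plain HTTP, so only api_addr changes its scheme:

```hcl
listener "tcp" {
    address     = "0.0.0.0:8200"
    tls_disable = 1 # affects the API port (8200) only
}

# Plain HTTP, matching tls_disable on the API listener.
api_addr = "http://hashi-vault-1.domain.tld:8200"

# Always https: Vault manages the certificates for this port itself.
cluster_addr = "https://hashi-vault-1.domain.tld:8201"
```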

Ok that makes sense, thanks a lot!