Hello Hashi Community,
I have recently begun mapping out a single-VM deployment of our in-house application, which has been running as distributed microservices on separate VMs using Consul, Nomad, and Vault to manage KV, scheduling, and secrets, respectively.
As a POC I’m attempting to set up the Consul and Nomad framework using Docker images of each, to facilitate multi-server quorum on the same machine in lieu of having multiple VMs to work with.
I was able to create the Consul cluster without issues using Docker Compose and the default docker bridge network.
The difficulty I’m having is that I can’t seem to get Nomad to cluster via Consul (auto-join) on the same bridge network, and I feel that I’ve exhausted all advertise and bind options (starting with the defaults).
Fingerprinting seems to be working, and Nomad registers with Consul, but it never passes its health check and the servers ultimately cannot elect a leader.
The configuration I’ve been trying is 3 Consul servers, plus 1 Consul client for each of the 3 Nomad servers (which also run as Nomad clients).
Here are my Nomad configs -

Docker Compose (just the Nomad section) is below. Please let me know if I can supply any other configuration info or details. Thank you all in advance for any input!
nomad-server-1:
  image: vptech/nomad:1.0.4
  container_name: nomad-server1
  command: /bin/nomad agent -config=/server1.hcl -config=/client.hcl -config=/base.hcl
  environment:
    NOMAD_RUN_ROOT: 1
  ports:
    - 4646:4646
    - 4647:4647
    - 4648:4648
  restart: always
  privileged: true
  cap_add:
    - SYS_ADMIN
    - NET_ADMIN
    - CHOWN
    - DAC_OVERRIDE
    - FSETID
    - FOWNER
    - MKNOD
    - NET_RAW
    - SETGID
    - SETUID
    - SETFCAP
    - SETPCAP
    - NET_BIND_SERVICE
    - SYS_CHROOT
    - KILL
    - AUDIT_WRITE
    - IPC_LOCK
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ./nomad/base.hcl:/base.hcl
    - ./nomad/client.hcl:/client.hcl
    - ./nomad/server1.hcl:/server1.hcl
    - /tmp:/tmp

nomad-server-2:
  image: vptech/nomad:1.0.4
  container_name: nomad-server2
  command: /bin/nomad agent -config=/server2.hcl -config=/client.hcl -config=/base.hcl
  environment:
    NOMAD_RUN_ROOT: 1
  ports:
    - 5646:5646
    - 5647:5647
    - 5648:5648
  restart: always
  privileged: true
  cap_add:
    - SYS_ADMIN
    - NET_ADMIN
    - CHOWN
    - DAC_OVERRIDE
    - FSETID
    - FOWNER
    - MKNOD
    - NET_RAW
    - SETGID
    - SETUID
    - SETFCAP
    - SETPCAP
    - NET_BIND_SERVICE
    - SYS_CHROOT
    - KILL
    - AUDIT_WRITE
    - IPC_LOCK
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ./nomad/base.hcl:/base.hcl
    - ./nomad/client.hcl:/client.hcl
    - ./nomad/server2.hcl:/server2.hcl
    - /tmp:/tmp

nomad-server-3:
  image: vptech/nomad:1.0.4
  container_name: nomad-server3
  command: /bin/nomad agent -config=/server3.hcl -config=/client.hcl -config=/base.hcl
  environment:
    NOMAD_RUN_ROOT: 1
  ports:
    - 6646:6646
    - 6647:6647
    - 6648:6648
  restart: always
  privileged: true
  cap_add:
    - SYS_ADMIN
    - NET_ADMIN
    - CHOWN
    - DAC_OVERRIDE
    - FSETID
    - FOWNER
    - MKNOD
    - NET_RAW
    - SETGID
    - SETUID
    - SETFCAP
    - SETPCAP
    - NET_BIND_SERVICE
    - SYS_CHROOT
    - KILL
    - AUDIT_WRITE
    - IPC_LOCK
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ./nomad/base.hcl:/base.hcl
    - ./nomad/client.hcl:/client.hcl
    - ./nomad/server3.hcl:/server3.hcl
    - /tmp:/tmp
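For completeness, the advertise/bind settings I have been iterating on in base.hcl look roughly like the sketch below; the addresses and the Consul hostname here are placeholders rather than my exact file:

bind_addr = "0.0.0.0"

# advertise is what ends up registered in Consul, so it has to be an
# address the Consul agents and the other Nomad containers can reach
advertise {
  http = "{{ GetPrivateIP }}"
  rpc  = "{{ GetPrivateIP }}"
  serf = "{{ GetPrivateIP }}"
}

consul {
  # placeholder hostname for the Consul client on the same bridge network
  address          = "consul-client-1:8500"
  auto_advertise   = true
  server_auto_join = true
  client_auto_join = true
}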
Hello.
Here is an example docker-compose I’ve used for testing basic Consul/Nomad clustering (servers only). Maybe you can customize it for your use, or at least see the configuration used for clustering and networking.
In this file the clustering and health checks all work, which is what I take it you were asking for.
It shouldn’t be difficult to modify to suit your needs (there is also an nginx proxy in front of the API endpoints in this file, but that isn’t required in any way).
FILE: ./docker-compose.yml
# 3-Node Cluster Example

########################
## Networks
########################
networks:
  hashicluster-multi:

########################
## Volumes
########################
volumes:
  consul-data-1:
  consul-data-2:
  consul-data-3:
  nomad-data-1:
  nomad-data-2:
  nomad-data-3:

########################
## Service Defaults
########################
x-hashicluster-services:
  consul-server: &consul-server
    restart: unless-stopped
    image: hashicorp/consul:latest
    networks:
      - hashicluster-multi
    command:
      - agent
      - -server
      - -bootstrap-expect=3
      - -client={{ GetPrivateIP }}
      - -retry-join=consul-1
      - -retry-join=consul-2
      - -retry-join=consul-3
      - -ui
  nomad-server: &nomad-server
    restart: unless-stopped
    image: hashicorp/nomad:latest
    environment:
      NOMAD_SKIP_DOCKER_IMAGE_WARN: 'yesplease'
    networks:
      - hashicluster-multi
    command:
      - agent
      - -server
      - -data-dir=/nomad/data
      - -bind={{ GetPrivateIP }}
      - -bootstrap-expect=3
      #- -retry-join=nomad-1
      #- -retry-join=nomad-2
      #- -retry-join=nomad-3
      - -consul-address=consul-1:8500
      - -consul-address=consul-2:8500
      - -consul-address=consul-3:8500
      - -consul-auto-advertise
      - -consul-client-auto-join

services:
  ########################
  ## Proxy
  ########################
  proxy:
    image: nginx:alpine
    restart: unless-stopped
    ports:
      - "8500:8500" # Consul API
      - "4646:4646" # Nomad API
    networks:
      - hashicluster-multi
    volumes:
      - ./files/nginx-multi.conf:/etc/nginx/nginx.conf

  ########################
  ## Consul Servers
  ########################
  consul-1:
    <<: *consul-server
    hostname: consul-1
    volumes:
      - consul-data-1:/consul/data
  consul-2:
    <<: *consul-server
    hostname: consul-2
    volumes:
      - consul-data-2:/consul/data
  consul-3:
    <<: *consul-server
    hostname: consul-3
    volumes:
      - consul-data-3:/consul/data

  ########################
  ## Nomad Servers
  ########################
  nomad-1:
    <<: *nomad-server
    hostname: nomad-1
    volumes:
      - nomad-data-1:/nomad/data
  nomad-2:
    <<: *nomad-server
    hostname: nomad-2
    volumes:
      - nomad-data-2:/nomad/data
  nomad-3:
    <<: *nomad-server
    hostname: nomad-3
    volumes:
      - nomad-data-3:/nomad/data
FILE: ./files/nginx-multi.conf
events {
    worker_connections 1024;
}

http {
    resolver 127.0.0.11 valid=30s;  # Docker's internal DNS resolver

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    add_header X-Content-Type-Options nosniff;
    add_header X-Frame-Options DENY;
    add_header X-XSS-Protection "1; mode=block";

    ########################
    ## Upstreams
    ########################
    upstream nomad_4646_backend {
        ip_hash;  # Ensure connections are always proxied to the same server node when possible.
        server nomad-1:4646 fail_timeout=15s max_fails=10;
        server nomad-2:4646 fail_timeout=15s max_fails=10;
        server nomad-3:4646 fail_timeout=15s max_fails=10;
    }

    upstream consul_8500_backend {
        ip_hash;  # Ensure connections are always proxied to the same server node when possible.
        server consul-1:8500 fail_timeout=15s max_fails=10;
        server consul-2:8500 fail_timeout=15s max_fails=10;
        server consul-3:8500 fail_timeout=15s max_fails=10;
    }

    ########################
    ## Consul Servers
    ########################
    server {
        listen *:8500;

        location / {
            proxy_pass http://consul_8500_backend;
        }

        #health_check interval=30s fails=3 passes=2;
    }

    ########################
    ## Nomad Servers
    ########################
    server {
        listen *:4646;

        location / {
            proxy_pass http://nomad_4646_backend;

            # Nomad blocking queries will remain open for a default of 5 minutes.
            # Increase the proxy timeout to accommodate this timeout with an
            # additional grace period.
            proxy_read_timeout 319s;

            # Nomad log streaming uses streaming HTTP requests. In order to
            # synchronously stream logs from Nomad to NGINX to the browser,
            # proxy buffering needs to be turned off.
            proxy_buffering off;

            # The Upgrade and Connection headers are used to establish
            # a WebSockets connection.
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";

            # The default Origin header will be the proxy address, which
            # will be rejected by Nomad. It must be rewritten to be the
            # host address instead.
            proxy_set_header Origin "${scheme}://${proxy_host}";
        }

        #health_check interval=30s fails=3 passes=2;
    }
}
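Once the stack is up, you can sanity-check quorum from the host by hitting the standard status endpoints through the proxy (ports as published in the compose file above), for example:

# bring everything up
docker compose up -d

# Consul: should return a leader address and three raft peers
curl http://localhost:8500/v1/status/leader
curl http://localhost:8500/v1/status/peers

# Nomad: same checks, through the proxy on 4646
curl http://localhost:4646/v1/status/leader
curl http://localhost:4646/v1/status/peers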
On a side note: this post was a bit old. I’ll leave the reply anyway … xD
Honestly, I think just running two VMs on your machine would save you quite a bit of time and frustration.
I have one ‘arbiter’ VM running on my NAS with just Nomad and Consul. 1GB of RAM is totally fine; with a bit of tuning it should probably fit into 512MB.
Regarding your PoC, I would set up two VMs with Nomad and Consul, and assign the datacenter “unused” in the Nomad config.
Then set up Nomad and Consul on your local machine with, say, “dc1” as the Nomad datacenter.
This way you can make sure that your jobs for “dc1” are scheduled on your local machine and not on the arbiter nodes.
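A rough sketch of what I mean (the job and image names are just examples):

# Nomad agent config on the arbiter VM(s)
datacenter = "unused"

# Nomad agent config on your local machine
datacenter = "dc1"

# Job spec: pinned to dc1, so it never lands on the arbiter
job "example" {
  datacenters = ["dc1"]

  group "app" {
    task "app" {
      driver = "docker"

      config {
        image = "nginx:alpine"
      }
    }
  }
}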