Hey!
I am trying to set up Consul Connect on one machine with a simple database and an application that just waits for the database to come up.
It's not a special case; I just want a proof of concept for Consul Connect so that I can later build and deploy applications with it.
Background
I have one node (a vserver) to test this out, so there is no real data on it.
Node Configuration
My consul configuration file looks like this:
# /etc/consul.d/consul.hcl
data_dir = "/opt/consul"

ui_config {
  enabled = true
}

server           = true
bootstrap_expect = 1
retry_join       = ["168.119.124.210"]

acl {
  enabled = false
}

connect {
  enabled = true
}

bind_addr      = "168.119.124.210"
advertise_addr = "168.119.124.210"
client_addr    = "0.0.0.0"

ports {
  http = 8500
  grpc = 8502
}
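For reference, this is roughly how I sanity-check the Consul agent after starting it (assuming the HTTP API is reachable on 127.0.0.1:8500, which should be the case with client_addr = 0.0.0.0):

$ consul members
$ curl -s http://127.0.0.1:8500/v1/connect/ca/roots   # non-empty roots should mean the Connect CA is up
$ ss -lntp | grep -E ':8500|:8502'                    # 8502 is the gRPC/xDS port the Envoy sidecars use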
My nomad configuration file looks like this:
# /etc/nomad.d/nomad.hcl
data_dir  = "/opt/nomad/data"
bind_addr = "168.119.124.210"

advertise {
  # Defaults to the first private IP address.
  http = "168.119.124.210"
  rpc  = "168.119.124.210"
  serf = "168.119.124.210" # non-default ports may be specified
}

server {
  # license_path is required for Nomad Enterprise as of Nomad v1.1.1+
  #license_path = "/etc/nomad.d/license.hclic"
  enabled          = true
  bootstrap_expect = 1
}

client {
  enabled = true
}

acl {
  enabled = true
}

vault {
  enabled = true
  address = "http://127.0.0.1:8200"

  default_identity {
    aud = ["nomad"]
    ttl = "1h"
  }

  jwt_auth_backend_path = "nomad"
}

consul {
  address = "127.0.0.1:8500"
}
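And this is how I check that the Nomad agent actually talks to the local Consul agent (nomad and nomad-client are the services Nomad registers by itself, as far as I know; my job services should show up in the catalog later as well):

$ nomad node status -self
$ consul catalog services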
Nomad Job Configuration
I have two Nomad jobs that are deployed via Terraform:
- customer-api.hcl
# customer-api.hcl
job "customer-api" {
  type      = "service"
  namespace = "${namespace}"

  group "api" {
    network {
      mode = "bridge"
    }

    service {
      name = "${namespace}-api"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "${namespace}-db"
              local_bind_port  = 5432
            }
          }
        }
      }
    }

    task "api" {
      driver = "docker"

      vault {
        policies = ["${namespace}"]
        role     = "${namespace}"
      }

      template {
        destination = "secrets/env"
        env         = true
        data        = <<EOH
PGUSER={{ with secret "${secrets_path}/database" }}{{ .Data.data.username }}{{ end }}
PGPASSWORD={{ with secret "${secrets_path}/database" }}{{ .Data.data.password }}{{ end }}
PGDATABASE=postgres
PGHOST=localhost
PGPORT=5432
EOH
      }

      config {
        image   = "postgres:16-alpine"
        command = "sh"
        args = [
          "-ec",
          "until pg_isready -h \"$PGHOST\" -p \"$PGPORT\" -U \"$PGUSER\"; do echo waiting for db; sleep 1; done; psql -h \"$PGHOST\" -p \"$PGPORT\" -U \"$PGUSER\" \"$PGDATABASE\" -c 'select now();'; sleep 3600"
        ]
      }

      resources {
        cpu    = 200
        memory = 256
      }
    }
  }
}
# Please note that the variables in this specific configuration look like this:
namespace: customer-one
secrets_path: customers/customer-one
- customer-db.hcl
job "customer-db" {
type = "service"
namespace = "${namespace}"
group "db" {
network {
mode = "bridge"
port "db" {
to = 5432
}
}
service {
name = "${namespace}-db"
port = "db"
connect {
sidecar_service {
}
}
check {
type = "script"
task = "db"
command = "sh"
args = ["-ec", "pg_isready -h 127.0.0.1 -p 5432"]
interval = "30s"
timeout = "2s"
}
}
task "db" {
driver = "docker"
vault {
policies = ["${namespace}"]
role = "${namespace}"
}
template {
destination = "secrets/postgres.env"
env = true
data = <<EOF
{{ with secret "${secrets_path}/database" }}
POSTGRES_LISTEN_ADDRESSES=*
POSTGRES_DB=test
POSTGRES_USER={{ .Data.data.username }}
POSTGRES_PASSWORD={{ .Data.data.password }}
{{ end }}
EOF
}
config {
image = "postgres:17"
ports = ["db"]
}
resources {
cpu = 500
memory = 512
}
}
restart {
attempts = 10
interval = "5m"
delay = "25s"
mode = "delay"
}
}
update {
max_parallel = 1
min_healthy_time = "5s"
healthy_deadline = "3m"
auto_revert = false
canary = 0
}
}
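After deploying these two jobs with namespace = customer-one, I would expect the Consul catalog to contain the two services plus their Connect sidecars, roughly like this (names derived from the templated service names above; the -sidecar-proxy suffix is what Consul normally uses for sidecar registrations):

$ consul catalog services
customer-one-api
customer-one-api-sidecar-proxy
customer-one-db
customer-one-db-sidecar-proxy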
I have the CNI Plugins installed.
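I checked them like this, assuming Nomad's default cni_path of /opt/cni/bin; the bridge, firewall and portmap plugins are the ones Nomad's bridge networking relies on, as far as I understand:

$ ls /opt/cni/bin
$ nomad node status -self -verbose | grep -i cni   # newer Nomad versions seem to fingerprint the plugins as node attributes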
Network background
- There is a Docker bridge that apparently is used more heavily than the Nomad virtual interface (output of ip addr below):
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 92:00:06:3d:b9:b2 brd ff:ff:ff:ff:ff:ff
altname enp1s0
inet 168.119.124.210/32 brd 168.119.124.210 scope global dynamic eth0
valid_lft 78223sec preferred_lft 78223sec
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 2e:7e:39:25:ec:26 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::2c7e:39ff:fe25:ec26/64 scope link
valid_lft forever preferred_lft forever
6: br-bd1cb1ebe544: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 66:03:d8:e7:81:a4 brd ff:ff:ff:ff:ff:ff
inet 172.18.0.1/16 brd 172.18.255.255 scope global br-bd1cb1ebe544
valid_lft forever preferred_lft forever
inet6 fe80::6403:d8ff:fee7:81a4/64 scope link
valid_lft forever preferred_lft forever
7: vethafecdc9@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-bd1cb1ebe544 state UP group default
link/ether 52:c3:c7:c3:ef:5a brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::50c3:c7ff:fec3:ef5a/64 scope link
valid_lft forever preferred_lft forever
8: veth5cdaf4c@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-bd1cb1ebe544 state UP group default
link/ether b6:97:c8:f9:5a:2f brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::b497:c8ff:fef9:5a2f/64 scope link
valid_lft forever preferred_lft forever
9: vethff1b435@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-bd1cb1ebe544 state UP group default
link/ether 62:1a:77:76:98:ac brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet6 fe80::601a:77ff:fe76:98ac/64 scope link
valid_lft forever preferred_lft forever
10: nomad: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 6a:9f:d3:98:5f:0d brd ff:ff:ff:ff:ff:ff
inet 172.26.64.1/20 brd 172.26.79.255 scope global nomad
valid_lft forever preferred_lft forever
inet6 fe80::689f:d3ff:fe98:5f0d/64 scope link
valid_lft forever preferred_lft forever
21: vethbd3fc599@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master nomad state UP group default
link/ether 2a:0c:89:05:5e:0d brd ff:ff:ff:ff:ff:ff link-netnsid 3
inet6 fe80::280c:89ff:fe05:5e0d/64 scope link
valid_lft forever preferred_lft forever
22: veth3b215a7b@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master nomad state UP group default
link/ether be:f8:fc:aa:95:11 brd ff:ff:ff:ff:ff:ff link-netnsid 4
inet6 fe80::bcf8:fcff:feaa:9511/64 scope link
valid_lft forever preferred_lft forever
23: veth1e27da5b@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master nomad state UP group default
link/ether f2:e8:a1:5f:c3:67 brd ff:ff:ff:ff:ff:ff link-netnsid 5
inet6 fe80::f0e8:a1ff:fe5f:c367/64 scope link
valid_lft forever preferred_lft forever
24: vetha421be1f@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master nomad state UP group default
link/ether 16:84:5e:9c:f0:d3 brd ff:ff:ff:ff:ff:ff link-netnsid 6
inet6 fe80::1484:5eff:fe9c:f0d3/64 scope link
valid_lft forever preferred_lft forever
28: veth15cc5136@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master nomad state UP group default
link/ether 72:ed:22:a4:98:6e brd ff:ff:ff:ff:ff:ff link-netnsid 7
inet6 fe80::70ed:22ff:fea4:986e/64 scope link
valid_lft forever preferred_lft forever
IPTables
$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy DROP)
target prot opt source destination
CNI-FORWARD all -- anywhere anywhere /* CNI firewall plugin rules */
DOCKER-USER all -- anywhere anywhere
DOCKER-FORWARD all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain CNI-FORWARD (1 references)
target prot opt source destination
NOMAD-ADMIN all -- anywhere anywhere /* CNI firewall plugin admin overrides */
ACCEPT all -- anywhere 172.26.64.12 ctstate RELATED,ESTABLISHED
ACCEPT all -- 172.26.64.12 anywhere
ACCEPT all -- anywhere 172.26.64.13 ctstate RELATED,ESTABLISHED
ACCEPT all -- 172.26.64.13 anywhere
ACCEPT all -- anywhere 172.26.64.15 ctstate RELATED,ESTABLISHED
ACCEPT all -- 172.26.64.15 anywhere
ACCEPT all -- anywhere 172.26.64.20 ctstate RELATED,ESTABLISHED
ACCEPT all -- 172.26.64.20 anywhere
ACCEPT all -- anywhere 172.26.64.21 ctstate RELATED,ESTABLISHED
ACCEPT all -- 172.26.64.21 anywhere
Chain DOCKER (2 references)
target prot opt source destination
ACCEPT tcp -- anywhere 172.18.0.4 tcp dpt:http-alt
ACCEPT tcp -- anywhere 172.18.0.4 tcp dpt:3000
DROP all -- anywhere anywhere
DROP all -- anywhere anywhere
Chain DOCKER-BRIDGE (1 references)
target prot opt source destination
DOCKER all -- anywhere anywhere
DOCKER all -- anywhere anywhere
Chain DOCKER-CT (1 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
Chain DOCKER-FORWARD (1 references)
target prot opt source destination
DOCKER-CT all -- anywhere anywhere
DOCKER-INTERNAL all -- anywhere anywhere
DOCKER-BRIDGE all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain DOCKER-INTERNAL (1 references)
target prot opt source destination
Chain DOCKER-USER (1 references)
target prot opt source destination
Chain NOMAD-ADMIN (1 references)
target prot opt source destination
ACCEPT all -- anywhere 172.26.64.0/20
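If it helps, I can also dump the relevant chains with packet counters and numeric addresses, which I find easier to read:

$ iptables -L CNI-FORWARD -v -n
$ iptables -t nat -L -v -n | grep -iE 'cni|172\.26'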
In the current example the Postgres container runs on 172.18.0.3 and is reachable from the host.
$ ip route
default via 172.31.1.1 dev eth0
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.0.0/16 dev br-bd1cb1ebe544 proto kernel scope link src 172.18.0.1
172.26.64.0/20 dev nomad proto kernel scope link src 172.26.64.1
172.31.1.1 dev eth0 scope link
So it looks like the traffic is routed through the Docker bridge, although I expected it not to go through there.
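To see which bridge an allocation actually landed on, I look at its network info; a 172.26.64.x address would mean the nomad bridge, 172.17.x/172.18.x one of the Docker bridges (alloc IDs come from the job status output):

$ nomad job status customer-db
$ nomad alloc status <alloc-id> | grep -i -A 5 'allocation addresses'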
Problem
The api container cannot reach the database container.
The sidecar proxy container that sits right next to the database can run psql and gets a response.
The sidecar proxy container right next to the application cannot run psql; it aborts with a connection reset error.
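For completeness, this is roughly what I run to compare the two sides (the alloc ID placeholder has to be filled in from nomad job status; intentions should default to allow while ACLs are disabled, but I check anyway):

$ curl -s 'http://127.0.0.1:8500/v1/health/connect/customer-one-db?passing'        # upstream health from Connect's point of view
$ consul intention check customer-one-api customer-one-db
$ nomad alloc exec -task api <api-alloc-id> sh -c 'pg_isready -h 127.0.0.1 -p 5432'  # from inside the api allocation against the local upstream bind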
What I’ve tried
- I have already changed bind_addr in the node configuration: it was originally 127.0.0.1 and is now the public IP of the virtual server.
- Various debugging that I cannot clearly summarize: I checked the IP interfaces but didn't understand what was wrong.
I would love to get some help with this, because after three days of debugging I have run out of ideas and creativity.