Hi there,
I am trying to get nomad and consul connect running on macOS, using rootless podman. The containers spin up, but whatever nomad starts for the side-cars seems to completely break nomad's ability to ‘reach’ out, so it is unable to pull down the envoy images.
2022-11-12T07:07:45.770Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=74c46165-f80f-bc79-2852-c2c228c3c4a2 task=connect-proxy-count-api error="rpc error: code = Unknown desc = failed to create image: docker.io/envoyproxy/envoy:v1.21.1: failed to start task, unable to pull image docker.io/envoyproxy/envoy:v1.21.1 : Error reading response: context deadline exceeded (Client.Timeout or context cancellation while reading body)"
The podman service is printing:
time="2022-11-12T07:07:45Z" level=warning msg="Failed, retrying in 1s ... (1/3). Error: initializing source docker://envoyproxy/envoy:v1.21.1: pinging container registry registry-1.docker.io: Get \"https://registry-1.docker.io/v2/\": dial tcp 3.216.34.172:443: i/o timeout"
What’s curious is that everything is fine until you give it a job. Then a new route (the 172.26.64.0/20 entry on the nomad device) appears inside the container running nomad:
ip route
default via 10.89.0.1 dev eth0
10.89.0.0/24 dev eth0 scope link src 10.89.0.91
172.26.64.0/20 dev nomad scope link src 172.26.64.1
After that, all outbound network connectivity is gone. Even a simple wget google.com times out.
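As a sanity check on the routing table itself, the lookup can be modelled as a longest-prefix match. This is only a minimal sketch, assuming the three routes shown above are the whole table; `ROUTES` and `pick_route` are hypothetical names made up for illustration. It suggests the registry address from the podman log would still be sent out via eth0, so the new route on its own may not explain the timeouts, and firewall or NAT rules added alongside the bridge might be worth a look.

```python
import ipaddress

# Routes copied from the `ip route` output above, as (destination, device).
# Assumption: this is the complete routing table inside the container.
ROUTES = [
    ("0.0.0.0/0", "eth0"),        # default via 10.89.0.1
    ("10.89.0.0/24", "eth0"),
    ("172.26.64.0/20", "nomad"),  # appears once a job is submitted
]


def pick_route(dest, routes=ROUTES):
    """Return the (network, device) pair chosen by longest-prefix match."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(net), dev)
        for net, dev in routes
        if addr in ipaddress.ip_network(net)
    ]
    # The most specific (longest prefix) matching route wins.
    return max(matches, key=lambda match: match[0].prefixlen)


if __name__ == "__main__":
    # Registry IP from the podman log, the gateway, and a bridge address.
    for dest in ("3.216.34.172", "10.89.0.1", "172.26.64.5"):
        net, dev = pick_route(dest)
        print(f"{dest} -> {net} dev {dev}")
```

Only addresses inside 172.26.64.0/20 are matched by the nomad route in this model; everything else falls through to the default route.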
The code is available at ~btrepp/environment: common/nomad.mk - sourcehut git
and should fire everything up with ‘make’, assuming rootless podman is set up and br-netfilter is enabled as per the consul connect guide.
Does anyone have an idea what is going on? I suspect nomad is creating networks to ensure everything goes through the side-cars, but it is also doing that to its own traffic. Is there any way to tell it to exclude itself from what it is filtering?
nomad.log.txt (51.0 KB)