Ingress connection dropped

Hi! I have this job, which is fairly simple. Everything works fine except the ingress.
The check passes, psql works within the container, and the logs look fine, but I cannot connect to it from the outside.
When I'm on the host that's running the container (or anywhere else, really), psql just says:

# psql -h postgresql.ingress.dc1.consul -U postgres
psql: error: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
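For comparison, psql responds fine from inside the container; a check along these lines (the alloc ID is just illustrative) is enough to confirm that:

# nomad alloc exec -task postgresql 4d2e9f7a psql -U postgres -c 'SELECT 1'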

When I try telnet from the outside, I get something similar:

# telnet postgresql.ingress.dc1.consul 5432
Trying xxx.xxx.xxx.xxx...
Connected to postgresql.ingress.dc1.consul.
Escape character is '^]'.
Connection closed by foreign host.

In Consul, everything is green. Intentions are also set to allow.
Nomad is also green everywhere, so I have no idea where to look.
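For reference, an allow intention between the gateway and the service can be created from the CLI with something like:

# consul intention create -allow postgresql-ingress postgresql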

job "postgresql" {
    datacenters = ["dc1"]
    type = "service"
    group "postgresql-ingress" {
        network {
            mode = "bridge"
            port "inbound" {
                static = 5432
                to = 5432
            }
        }
        service {
            name = "postgresql-ingress"
            port = "5432"
            connect {
                gateway {
                    proxy {
                    }
                    ingress {
                        listener {
                            port = 5432
                            protocol = "tcp"
                            service {
                                name = "postgresql"
                            }
                        }
                    }
                }
            }
        }
    }
    group "postgresql" {
        network {
            mode = "bridge"
        }
        service {
            name = "postgresql"
            tags = ["db"]
            port = "5432"
            check {
                type = "script"
                name = "Check PG ready"
                command = "su"
                args = ["postgres", "-c", "/usr/local/bin/pg_isready"]
                interval = "10s"
                timeout = "1s"
                task = "postgresql"
            }
            connect {
                sidecar_service {}
            }
        }
        task "postgresql" {
            resources {
                cpu = 500
                memory = 500
            }
            template {
                data = <<EOH
                PGDATA="/var/lib/postgresql/data"
                POSTGRES_HOST_AUTH_METHOD=trust
                EOH
                destination = "secrets/file.env"
                env = true
            }
            driver = "docker"
            config {
                image = "postgres:12-alpine"
            }
            volume_mount {
                volume = "data"
                destination = "/var/lib/postgresql/data"
            }
        }
        volume "data" {
            type = "host"
            source = "postgresql"
            read_only = false
        }
    }
}

I’m running:
Consul v1.10.1 - 3x server&agent
Nomad v1.1.3 (8c0c8140997329136971e66e4c2337dfcf932692) - 3x server&agent
All with SSL.

Thanks for any help!

Hi @DejfCold . Thanks for using Nomad.

I’m not entirely certain I understand your situation yet. If anything I say next seems to not be relevant, please correct my misperceptions.

It seems like you are trying to connect to the postgresql server using an Ingress gateway, from outside your Nomad cluster. If that is the case, I’d recommend looking into setting up a load balancer. Examples of how to do this for a few popular tools can be found at this link on our Learn site.

From what I am seeing in the Nomad Connect gateway docs, "Ingress Gateways are generally intended for enabling access into a Consul service mesh from within the same network. For public ingress products like NGINX provide more suitable features."

Does that help at all?

Hi @DerekStrickland. Thanks for the reply.
I did not get to the point of setting up a load balancer, as that's another thing I could set up wrong, and then I wouldn't know what to blame.

I was trying to connect to the postgresql server (which is in the service mesh) from the hosts running both Consul and Nomad. I have 3 VMs, all running Consul and Nomad in server & agent configuration, and I test stuff there. Even though I was on their hosts, I guess I was outside of the Nomad cluster? But you could say I was the load balancer.

Anyway, I just tried disabling the firewall, which I didn't like doing, but I wanted to try it anyway. I stopped the deployment, ran nomad system gc and nomad system reconcile summaries, waited a bit, re-ran the job, waited a little again, and it started to work.
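For reference, the sequence was roughly this (the job file name is just an example):

# nomad stop postgresql
# nomad system gc
# nomad system reconcile summaries
# nomad run postgresql.nomad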

So the good thing is there's no problem with Nomad or Consul. The bad thing is I'll have to figure out how to set up the firewall so it doesn't have to be disabled. On all 3 VMs I had each other's IPs added to the trusted zone. I've noticed there's a nomad interface, maybe it has something to do with that? I don't remember reading about any firewall rules in the docs, maybe some iptables settings during install. I'll have to dig around for that.
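The trusted-zone entries for the other VMs look roughly like this (IPs are illustrative):

# firewall-cmd --permanent --zone=trusted --add-source=192.168.56.11
# firewall-cmd --permanent --zone=trusted --add-source=192.168.56.12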

So, I found the problem/solution.
Just as Docker creates its own firewall zone when installed (shown below), the nomad bridge interface also needs to be allowed through the firewall. Docker's zone looks like this:

# firewall-cmd --zone=docker --list-all
docker (active)
  target: ACCEPT
  icmp-block-inversion: no
  interfaces: docker0
  sources: 
  services: 
  ports: 
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules:

I’ve added the nomad interface to my trusted zone with:

# firewall-cmd --permanent --zone=trusted --add-interface=nomad
success
# firewall-cmd --zone=trusted --add-interface=nomad
success

Then I restarted all the allocations (a simple in-place restart doesn't work) and it started working even with the firewall up.
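By "restart" I mean getting fresh allocations, e.g. by stopping each one so Nomad reschedules it (alloc ID illustrative), rather than nomad alloc restart, which only restarts the tasks in place:

# nomad alloc stop 4d2e9f7a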

Great! Glad you found a solution!