Consul Connect Connection Reset By Peer Errors

Curious if anyone else has this working properly. I’m testing a very simple setup with Consul Connect and Nomad integration. I’ve got a simple redis job with a sidecar proxy enabled, and another job running an ubuntu container that has the redis service as an upstream connection. The two jobs are up and running fine, but when I run “redis-cli -h 127.0.0.1 -p 9595 ping” (9595 being the upstream’s local bind port), I get a connection reset by peer instead of the PONG reply from redis. I’ve also tried connecting to a PostgreSQL upstream, but “psql -h 127.0.0.1 -p 9596 -U userA” gets a connection reset error as well.

Hi @dlightsey sorry you’re having trouble getting this working. Can you post your job files and Nomad & Consul config files? If you try running the countdash example, does that work?

Hi, thanks for the reply. Yes, I started with the countdash example and have that working perfectly. It’s only when I tried setting up redis and PostgreSQL that I started having this connection reset issue. My infra setup is 6 VMs, 3 acting as servers and 3 as clients. Every node runs both Nomad and Consul, in server mode on the servers and in client mode on the clients. I’ve uploaded my configs.

netshoot.txt (1.6 KB) redis.txt (1.6 KB) consul-client.txt (717 Bytes) nomad-client.txt (1.0 KB) consul-server.txt (722 Bytes) nomad-server.txt (762 Bytes)

I’m guessing you see the same problem either way, but FYI Consul does not yet support Envoy v1.15 as a Connect proxy. That’s coming in Consul 1.9.

I created this example job file that I think can be used as a starting point for what you’re trying to do. With this I can exec into the “wait” task and contact redis through the Connect plumbing.

job "rediscon" {
  datacenters = ["dc1"]

  group "cache" {
    network {
      mode = "bridge"
      port "db" {
        to = 6379
      }
    }

    service {
      name = "redis"
      port = "6379"
      connect {
        sidecar_service {}
      }
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
      }
    }
  }

  group "poke" {
    network {
      mode = "bridge"
    }

    service {
      name = "poker"
      port = "9999" # irrelevant 

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "redis"

              # should be able to exec into this task and
              # contact redis on 127.0.0.1:6379, handled
              # by Connect 
              local_bind_port = 6379
            }
          }
        }
      }
    }

    task "wait" {
      driver = "exec"

      config {
        command = "/bin/sleep"
        args    = ["10000"]
      }
    }
  }
}

Hi @shoenig, thanks for all your help. After several permutations between what you provided and my own configs, I found the needle in the haystack as to why my configs weren’t working but yours do. Not sure if it is a bug, but I can now reproduce the issue reliably. It turns out that the port parameter in the service stanza does not like labels, and only works when I use a numeric value. It works 100% of the time when I use port = "6379", and fails 100% of the time when I use port = "db".

I’m back up and running on our POC now. Thanks a million!

Just to recap,

This works:
    service {
      name = "redis"
      port = "6379"
      connect {
        sidecar_service {}
      }
    }

This does not work:
    service {
      name = "redis"
      port = "db"
      connect {
        sidecar_service {}
      }
    }

I think I have observed something related …
I suspect something was missed when the service stanza was moved out of the task stanza and into the group stanza.

The difference between port and ports has me confused as well.
(I’m focusing on getting some simple things working before tackling what’s up with that.)

I know I am being vague … but I want to look deeper into the newly generated example job, as that is my starting point for new things I am trying out.

Unfortunately, this is causing a new issue for me. Because I can’t use the port labels from the network stanza, I can’t find a way to advertise the service using its dynamic port, which makes it hard to get haproxy to find the backend services.

This should probably be submitted as a bug? Unless there is a workaround.

As a workaround, I’ve decided to create sidecar proxies for all of the services that my haproxy LB needs to route ingress traffic to. In each backend stanza in haproxy, I then define the server as localhost plus the upstream’s local bind port, and that seems to be working pretty nicely. I now have all the services communicating internally across the sidecars, with an haproxy LB set up to route traffic into the stack. Not 100% clean, but it works fine for now.
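
For anyone trying the same thing, here is a rough sketch of how that workaround might look in a job file. This is not the poster's actual config: the job name "ingress", the upstream service "web", the ports 8080 and 9001, and the haproxy image tag are all made-up placeholders, and "web" is assumed to be a separate Connect-enabled service. The haproxy backend simply points at 127.0.0.1 and the upstream's local bind port.

```hcl
# Hypothetical sketch of the sidecar-upstream workaround.
# All names, ports, and the image tag are illustrative assumptions.
job "ingress" {
  datacenters = ["dc1"]

  group "lb" {
    network {
      mode = "bridge"

      # Static host port for traffic entering the stack.
      port "http" {
        static = 8080
        to     = 8080
      }
    }

    service {
      name = "haproxy"
      port = "8080" # numeric value, per the finding earlier in this thread

      connect {
        sidecar_service {
          proxy {
            # One upstreams block per backend service haproxy must reach.
            upstreams {
              destination_name = "web" # assumed Connect-enabled service
              local_bind_port  = 9001
            }
          }
        }
      }
    }

    task "haproxy" {
      driver = "docker"

      config {
        image   = "haproxy:2.2"
        volumes = ["local/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg"]
      }

      # Render the haproxy config into the task's local/ directory.
      template {
        destination = "local/haproxy.cfg"
        data        = <<EOF
defaults
  mode http
  timeout connect 5s
  timeout client  30s
  timeout server  30s

frontend http_in
  bind *:8080
  default_backend web

backend web
  # localhost plus the upstream's local_bind_port from above
  server web1 127.0.0.1:9001
EOF
      }
    }
  }
}
```

With this layout haproxy never needs to know the backend's dynamic port, since the sidecar takes care of finding and reaching the destination service.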

@dlightsey I’ve run into the same problem. Do you mind sharing the changes you made for HAProxy? Thanks.

Anecdata: dynamic ports cause trouble only for Redis, not for MinIO. Does the Redis protocol (versus HTTP for MinIO) have anything to do with this?

It turned out that setting address_mode = "alloc" on the service worked for me (the default is "auto"). E.g.

    network {
      mode = "bridge"
      port "gotenberg" {
        to = 3000
      }
    }

    service {
      name = "gotenberg"
      port = "gotenberg"
      address_mode = "alloc"
      connect {
        sidecar_service {}
      }
    }
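
Applied to the redis group from earlier in the thread, the same change would presumably look like the sketch below. It has not been tested against that setup. It assumes that with address_mode = "alloc" the "db" label resolves to the port inside the network namespace (the "to" value) rather than the dynamically mapped host port.

```hcl
# Untested sketch: the redis group from earlier in the thread, using the
# "db" port label together with address_mode = "alloc".
job "rediscon" {
  datacenters = ["dc1"]

  group "cache" {
    network {
      mode = "bridge"
      port "db" {
        to = 6379
      }
    }

    service {
      name         = "redis"
      port         = "db"    # label instead of the literal "6379"
      address_mode = "alloc" # default is "auto"

      connect {
        sidecar_service {}
      }
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
      }
    }
  }
}
```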

Thank you for the address_mode = "alloc" tip, it fixed my traefik/whoami test job. This should probably make its way into the docs somewhere visible, or be considered a bug?

Oh my god, this definitely must be emphasized more in the documentation. I lost half a day before I finally got this to work.
