Nomad and Consul Connect with Expose

I think that this is more an issue with Nomad than Consul, but I’m not certain.

I’ve tried following the steps shown in this blog to add a service to the Consul Connect mesh while exposing a single path for metrics to other hosts.

Where it seems to fail is that I can curl all paths on the Nomad bind port with no issues, and I see no indication in the response headers that the requests are going through the proxy. That may be intended, which is why I generally set the host network for my ports to bind only to lo, so that they are private and can only be reached through Consul Connect.
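For reference, the host_network those ports bind to is defined in my Nomad client config roughly like this (lo being the loopback interface):

client {
  host_network "loopback" {
    interface = "lo"
  }
}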

The issue when I do this is that even with a path exposed via the connect stanza, that path isn’t accessible. Nomad 1.4 doesn’t seem to show me the port bindings for tasks, so I can’t see the ports there, but in Consul I can see that the main service is bound to a port on my loopback interface and that the Connect sidecar is bound to another. If I curl the sidecar port with the /metrics path, I get an empty reply from the server.

The Consul documentation says the expose functionality is designed for something like metrics scraping, but there doesn’t seem to be a convenient way to use it for that. Even if I manage to get the sidecar to respond on the /metrics path, I’m unable to set a meta tag on the Consul service indicating the sidecar host and port.

How are these capabilities supposed to function?

This is my job configuration:

variable "config_data" {
  type = string
  description = "Plain text config file for blocky"
}

job "blocky" {
  datacenters = ["dc1"]
  type = "system"
  priority = 100

  update {
    max_parallel = 1
    auto_revert = true
  }

  group "blocky" {

    network {
      mode = "bridge"

      port "dns" {
        static = "53"
      }

      port "api" {
        host_network = "loopback"
        to = "4000"
      }
    }

    service {
      name = "blocky-dns"
      port = "dns"
    }

    service {
      name = "blocky-api"
      port = "api"

      meta {
        metrics_addr = "${NOMAD_ADDR_api}"
      }

      tags = [
        "traefik.enable=true",
      ]

      connect {
        sidecar_service {
          proxy {
            local_service_port = 400

            expose {
              path {
                path = "/metrics"
                protocol = "http"
                local_path_port = 4000
                listener_port = "api"
              }
            }

            upstreams {
              destination_name = "redis"
              local_bind_port = 6379
            }
          }
        }

        sidecar_task {
          resources {
            cpu    = 50
            memory = 20
            memory_max = 50
          }
        }
      }

      check {
        name     = "api-health"
        port     = "api"
        type     = "http"
        path     = "/"
        interval = "10s"
        timeout  = "3s"
      }
    }

    task "blocky" {
      driver = "docker"

      config {
        image = "ghcr.io/0xerr0r/blocky"
        ports = ["dns", "api"]

        mount {
          type = "bind"
          target = "/app/config.yml"
          source = "app/config.yml"
        }
      }

      resources {
        cpu = 50
        memory = 50
        memory_max = 100
      }

      template {
        data = var.config_data
        destination = "app/config.yml"
        splay = "1m"
      }
    }
  }
}

Hi @ViViDboarder! I’ve extracted some relevant bits from your jobspec below.

network {
  port "api" {
    host_network = "loopback"
    to           = "4000"
  }
}

service {
  connect {
    sidecar_service {
      proxy {
        local_service_port = 400

        expose {
          path {
            path            = "/metrics"
            protocol        = "http"
            local_path_port = 4000
            listener_port   = "api"
          }
        }
      }
    }
  }
}

So what we’ve got here is that the listener you’re exposing for the sidecar is listening on port 4000 inside the network namespace on localhost, and on port 4000 via the api port, which you’ve also bound to the loopback interface. It’s a little unclear what your intent here is:

  • If you want to be able to scrape metrics from outside the service mesh and from another host, you need to bind the api port to a host network that’s reachable from those hosts (see the sketch after this list).
  • If you want to be able to scrape metrics from outside the service mesh but from the same host, you’ll need to use the port that’s been advertised for api; you can find that in Consul. (Or you could add a static field to the network block.)
  • If you’re looking to scrape metrics from inside the service mesh, you probably don’t want the expose block there at all.
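For the first two options, the network block would change along these lines (a rough sketch; how host networks are set up on your clients is an assumption on my part):

network {
  mode = "bridge"

  # Option 1: drop host_network (or point it at a routable host network)
  # so the api port is reachable from other hosts.
  port "api" {
    to = "4000"
  }

  # Option 2: keep the port on loopback, but give it a static value so you
  # know what to curl from the same host without looking it up in Consul.
  # port "api" {
  #   host_network = "loopback"
  #   static       = "4000"
  #   to           = "4000"
  # }
}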

(Also, I’m assuming that local_service_port = 400 is a copy-and-paste error and that you’re actually using 4000. If not, you should fix that.)

Thanks. I’m looking to scrape metrics from outside the service mesh, from another host, while preventing that host from accessing any path other than /metrics.

If I change the api port to bind to the host network, I’m indeed able to access /metrics from other hosts, but I can access all the other paths as well.

It’s unclear to me what the expose stanza really does in this case, because it’s something registered with Consul, yet Nomad apparently binds the task directly to the host interface, and my curl requests never seem to pass through Consul Connect at all to enforce mTLS or the exposed path. This is why I’ve been binding to loopback: to avoid stepping around Consul Connect.

Hi @ViViDboarder, it seems like expose is intended to do what you’re looking for: expose only the indicated path (e.g. /metrics) of a service. It works by having Consul configure the Envoy sidecar to create a new listener, bound to expose.path.listener_port, which returns a 404 for every path except the indicated one.

Working backwards, are you able to get the expose.path example (the 2nd code block on the page) from expose Stanza - Job Specification | Nomad | HashiCorp Developer to work as intended? If not, drawing from the example, what specifically is not working the way you expect?
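If it helps, adapting that example to your job would look roughly like the sketch below. The metrics port label and its dynamic mapping are my additions, not something taken from your jobspec:

network {
  mode = "bridge"

  # Private application port: only reachable inside the namespace / over the mesh.
  port "api" {
    host_network = "loopback"
    to           = "4000"
  }

  # Dedicated listener port for the exposed path, on a routable host network.
  port "metrics" {
    to = -1
  }
}

service {
  name = "blocky-api"
  port = "api"

  connect {
    sidecar_service {
      proxy {
        local_service_port = 4000

        expose {
          path {
            path            = "/metrics"
            protocol        = "http"
            local_path_port = 4000
            listener_port   = "metrics"  # Envoy answers here and 404s everything except /metrics
          }
        }
      }
    }
  }
}

That way the exposed Envoy listener is what other hosts reach, while the api port itself stays on loopback.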

When I try to curl the port that the sidecar is bound to (e.g. curl http://192.168.2.101:31940/metrics), I’m unable to get any response for any path. I get a connection error: curl: (56) Recv failure: Connection reset by peer.

If I instead use the other interface and port, the one bound to the service, all paths respond.

Does anyone have any advice on this? I’m beginning to think that this is a bug that I should report on GitHub. I’m not sure if it’s with Nomad or Consul though; probably Consul, since I can see that the expose path is registered.