Fabio creates routes with container-internal IPs for a Nomad job

I am testing a POC with a Nomad cluster (3 servers and 3 clients), a Consul cluster (3 servers), and Fabio running as a systemd service on the Nomad clients. A Consul agent runs on all 9 machines.
Everything looks good configuration-wise, and all the services report healthy in Consul.
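
For reference, each Fabio instance just points at the local Consul agent; a minimal fabio.properties along these lines is assumed (ports are Fabio's defaults, the values are illustrative rather than the exact config used here):

# illustrative only: local Consul agent, Fabio's default proxy and UI ports
registry.consul.addr = 127.0.0.1:8500
proxy.addr           = :9999
ui.addr              = :9998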

I’m running into a problem when I run a job with a container that exposes port 8080. The job runs fine; however, Fabio builds its routing table using the container-internal IPs and the exposed port instead of the actual host IPs and Nomad’s dynamic port range.

This is what I see in Fabio (the IPs below are from the container-specific network):

1              tomcat-test        /tomcat-test      http://10.88.0.15:8080/ strip=/tomcat-test          33.33%
2              tomcat-test        /tomcat-test      http://10.88.0.13:8080/ strip=/tomcat-test          33.33%
3              tomcat-test        /tomcat-test      http://10.88.0.12:8080/ strip=/tomcat-test          33.33%

Whereas the app instances show up like this in Nomad (the actual host IPs and Nomad dynamic ports):

Name    Host Address          Mapped Port
http    10.201.2.203:21246    8080
http    10.201.2.204:22048    8080
http    10.201.2.204:28093    8080

All Nomad clients running the Fabio service sit behind an external load balancer, with one node active in the pool at any time. When I access the service via its urlprefix externally, the request lands on whichever node is currently active. If Fabio on that node happens to route the request to the container IP running on that same node, the service works fine. But Fabio round-robins, so on the next attempt the same node forwards to a container IP on a different node; since that IP is unreachable from the node serving the request, I get ‘Page not working’. To get back to the site/service, I have to hit refresh about three times, until the round robin lands on the container IP that is local to the node serving the load balancer pool, and then I get the page again.

Here’s my job:

job "tomcat-test" {
  datacenters = ["dev-dc"]
  type        = "service"

  update {
    max_parallel     = 1
    min_healthy_time = "30s"
    healthy_deadline = "5m"
    auto_revert      = false
    canary           = 3
    health_check     = "checks"
  }
  
  group "test-group" {
    count = 3

  network {
        port "http" {
          to = 8080 #Port exposed in container
        }
  }

  task "tomcat-test" {
    driver = "podman"
    config {
      image = "nexus.mydomain.com:8081/my-ubi/rhel8-tomcat9:latest"
      auth {
        username = "nomad-user"
        password = "N0madUs3r"
      }
          ports = ["http"]
     }

    service {
      name = "tomcat-test"
          port = "http"
      tags = [
        "urlprefix-/tomcat-test strip=/tomcat-test",
      ]
      check {
        type     = "http"
        path     = "/"
        interval = "2s"
        timeout  = "2s"
      }
    }
   }
 }
}

How can I make Fabio advertise or register the app service with the host IPs and Nomad’s dynamic ports?

I’m curious: if you run the job I’ve attached, do you see the behavior you want (just on port 9090 in this case)?

job "fake-service-job" {
  region      = "global"
  datacenters = ["[[ .cluster_name ]]"]
  type        = "service"

  update {
    stagger      = "10s"
    max_parallel = 1
  }

  group "fake-service-api-group" {
    count = 3

    network {
        mode  = "bridge"
        port "http" {
            to = "9090"
        }
    }

    service {
      name = "fake-service-api"
      port = "http" #  advertise the dynamic port
      tags = ["urlprefix-/fake-service-api"]
      #  Add a health check to make Fabio proxy traffic to it.
      check {
          name = "fake-service-api-check"
          type = "http"
          port = "http"
          path = "/"
          interval = "5s"
          timeout  = "2s"
      }
    }

    task "fake-service-api" {
        driver = "docker"

        config {
            image = "nicholasjackson/fake-service:v0.12.0"
        }

        env {
            LISTEN_ADDR = "0.0.0.0:9090"
            NAME = "${NOMAD_TASK_NAME}"
            MESSAGE = "Hello from ${NOMAD_TASK_NAME}:${NOMAD_ALLOC_ID} HostIp: ${attr.unique.network.ip-address}"
            HTTP_CLIENT_KEEP_ALIVES = "false"
        }
    }
  }
}
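
One note on the job above: [[ .cluster_name ]] is a templating placeholder, so you’d substitute your own datacenter name there before running it with nomad job run.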

I suspect you may need to move your service stanza up to the group level.
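
Roughly, that change would look like this for your job (just a sketch of the relevant stanzas, not a full tested jobspec):

group "test-group" {
  count = 3

  network {
    port "http" {
      to = 8080
    }
  }

  # service moved up from the task to the group level
  service {
    name = "tomcat-test"
    port = "http" # advertises the dynamic host port rather than the container address
    tags = ["urlprefix-/tomcat-test strip=/tomcat-test"]
    # check { ... } as before
  }

  task "tomcat-test" {
    driver = "podman"
    # config { ... } as before, just without the service stanza inside it
  }
}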

Thank you very much for the response. I made changes to my job file based on yours.

job "tomcat-test" {
  datacenters = ["dev"]
  type        = "service"

  update {
    stagger      = "10s"
    max_parallel = 1
  }
  
  group "test-group" {
    count = 3

  network {
    port "http" {
      to = 8080
    }
  }

  service {
    name = "tomcat-test"
    port = "http"
    tags = [
      "urlprefix-/tomcat-test strip=/tomcat-test",
    ]
    check {
      type     = "http"
	  port     = "http"
      path     = "/"
      interval = "2s"
      timeout  = "2s"
    }
  }
  task "tomcat-test" {
    driver = "podman"
    config {
      image = "nexus.mydomain.com:8081/my-ubi/rhel8-tomcat9:latest"
      auth {
        username = "nomad-user"
        password = "N0madUs3r"
      }
	  ports = ["http"]
     }
   }
  }
}

And Fabio now registers the services with the correct host IPs and Nomad’s dynamic ports.
Here’s how things look in Consul:

_nomad-task-24675b72-def3-4e4e-2644-a2a7ce6d98db-group-test-group-tomcat-test-http
Registered via Nomad	All service checks passing	All node checks passing	nomadclient01	10.201.2.203:25447	urlprefix-/tomcat-test strip=/tomcat-test
_nomad-task-790e103b-6f05-05c4-09ca-cdb121abb6d2-group-test-group-tomcat-test-http
Registered via Nomad	All service checks passing	All node checks passing	nomadclient02	10.201.2.204:20881	urlprefix-/tomcat-test strip=/tomcat-test
_nomad-task-dfd8160c-4305-239a-ae7b-df5cc2aeeffc-group-test-group-tomcat-test-http
Registered via Nomad	All service checks passing	All node checks passing	nomadclient03	10.201.2.205:23116	urlprefix-/tomcat-test strip=/tomcat-test

And here’s what Fabio shows:

1	tomcat-test	/tomcat-test	http://10.201.2.205:23116/	strip=/tomcat-test	33.33%
2	tomcat-test	/tomcat-test	http://10.201.2.204:20881/	strip=/tomcat-test	33.33%
3	tomcat-test	/tomcat-test	http://10.201.2.203:25447/	strip=/tomcat-test	33.33%
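
As a quick sanity check from outside the cluster, repeated requests through the load balancer (the hostname below is a placeholder; 9999 is Fabio’s default proxy port) should now succeed on every attempt rather than one in three:

curl -s -o /dev/null -w "%{http_code}\n" http://lb.mydomain.com:9999/tomcat-test/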

A happy place! :smile: :vulcan_salute:
Thank you @idrennanvmware very much once again!! :pray: :pray:

Happy to help. Glad it worked!
