Nomad Java Driver Bridge Networking UnknownHostException

When I run the Java Spring application directly on the Nomad client and open the address in a browser, it works fine. When I run it via the Java driver with bridge networking, I get the following exception. I did not specify any DNS configuration in the job spec, so I’m curious why the isolation layer complains about the server hostname when the host itself doesn’t. What do I need to do to get around this?

java.net.UnknownHostException: nomad-client-0: nomad-client-0: Temporary failure in name resolution

Job spec:

job "backend-java" {
  datacenters = ["us-west-2"]

  group "java" {
    network {
      mode = "bridge"
      port "http" {
        to = 8080
      }
    }

    service {
      name = "backend-java"
      port = "8080"

      connect {
        sidecar_service {}
      }
    }
    
    task "java" {
      driver = "java"

      config {
        jar_path    = "local/repo/complete/build/libs/rest-service-0.0.1-SNAPSHOT.jar"
        jvm_options = ["-Xmx300m", "-Xms100m"]
        args = ["--server.port=${NOMAD_PORT_http}"]
      }

      artifact {
        source = "git::https://github.com/romasi-projects/backend-java"
        destination = "local/repo"
      }

      resources {
        cpu = 300
        memory = 300
      }
    }
  }
}

Pinging from within the “container” hits the same error. Seems like DNS isn’t being routed to the host, which should be the default per the docs.

ubuntu@ip-10-0-10-163:~$ nomad alloc exec -i -t -task java 51eb18c6 /bin/bash
nobody@ip-10-0-10-163:/$ ping www.google.com
ping: www.google.com: Temporary failure in name resolution

And the /etc/resolv.conf files match between the host and “container”. How has no one else hit this already? The job spec is pretty basic.

nameserver 127.0.0.53
options edns0 trust-ad
search us-west-2.compute.internal

Adding the DNS block below seems to resolve the “Temporary failure in name resolution” issue, but it now raises a different error instead: java.net.UnknownHostException: ip-10-0-10-163: ip-10-0-10-163: Name or service not known. I can resolve external hostnames now, just not the local one…

group "java" {
  network {
    mode = "bridge"
    port "http" {
      to = 8080
    }
    dns {
      servers = []
      options = []
      searches = []
    }
  }

The above DNS block sets the /etc/resolv.conf to the following.

nameserver 127.0.0.1
nameserver 10.0.0.2
search us-west-2.compute.internal

The Java application is failing to resolve the local host in the code snippet below. The /etc/hosts and /etc/resolv.conf files on the host and in the “container” (cgroup + namespaces) match.

InetAddress ip = InetAddress.getLocalHost();
int serverPort = serverProperties.getPort();
return new Greeting(String.format(template, ip.getHostAddress() + ":" + String.valueOf(serverPort)));
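One way to keep this endpoint from failing while the hostname does not resolve is to fall back to enumerating the network interfaces, which needs no name resolution (roughly what hostname -I does). The following is a minimal sketch, not the application's actual code: the `LocalAddress` class and its `resolve()` method are hypothetical names, and it has not been tested under the Nomad Java driver.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.Collections;
import java.util.Enumeration;

// Hypothetical helper, not part of the original application.
final class LocalAddress {
    private LocalAddress() {}

    /** Returns an address for this machine without requiring its hostname to resolve. */
    static InetAddress resolve() {
        try {
            // Looks up the machine's own hostname; this is the call that throws in the allocation.
            return InetAddress.getLocalHost();
        } catch (UnknownHostException e) {
            // Hostname is not resolvable; fall through to interface enumeration.
        }
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            while (nics != null && nics.hasMoreElements()) {
                NetworkInterface nic = nics.nextElement();
                if (!nic.isUp() || nic.isLoopback()) {
                    continue;
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    // Skip loopback and link-local addresses; take the first remaining one.
                    if (!addr.isLoopbackAddress() && !addr.isLinkLocalAddress()) {
                        return addr;
                    }
                }
            }
        } catch (SocketException e) {
            // Interface information is unavailable; use the loopback address below.
        }
        return InetAddress.getLoopbackAddress();
    }
}
```

With this in place, the snippet above would call `LocalAddress.resolve()` instead of `InetAddress.getLocalHost()`. It only works around the symptom in the application; it does not explain why the hostname fails to resolve inside the allocation.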

Some hostname commands give the same error, while others (e.g. hostname -I) don’t.

nobody@ip-10-0-10-163:/$ hostname -i
hostname: Name or service not known

nobody@ip-10-0-10-163:/$ hostname -f
hostname: Name or service not known

Very confused why this works fine on the host but not in the “container”. Trying to get a PoC of Consul and Nomad working for my company but it is getting snagged on seemingly trivial problems =(.

Any help would be appreciated. Thanks.

Bumping this thread, would love to know if anyone has found a solution to this. It is indeed quite surprising that this has not been discussed before.

A very simple use case that I am struggling with…

I’m running Nomad in AWS, and my Spring application running on the Java driver is not able to reach RDS using its hostname. If I set the network mode to host and don’t use a sidecar, things work fine.

I can also see that a docker container running in bridge mode with a sidecar proxy has no problems resolving the hostname for RDS on the same client.

Any workaround would be appreciated!

One suggestion:
you could run it with the raw_exec driver instead of the java driver.
A small script rendered with the template block could be the stub that starts the application.
Health checks, etc. would then be much simpler, since everything runs on the machine's OS itself.

Hi @romasi :wave:

I’m not an expert on AWS networking, but I think you will need to specify the AWS internal DNS IP in your job.

Here’s the job that I used for testing:

job "petclinic" {
  datacenters = ["dc1"]

  group "petclinic" {
    network {
      mode = "bridge"

      dns {
        servers = ["169.254.169.253"]
      }

      port "http" {}
    }

    task "petclinic" {
      driver = "java"

      config {
        jar_path    = "local/spring-petclinic-1.0.jar"
        jvm_options = ["-Xmx512m", "-Xms256m", "-Dserver.port=${NOMAD_PORT_http}"]
      }

      artifact {
        source      = "https://github.com/lgfa29/spring-petclinic/releases/download/v1.0/spring-petclinic-1.0.jar"
        destination = "local"
      }

      resources {
        memory = 512
      }
    }
  }
}

I’m not sure if the IP 169.254.169.253 is the same in all regions, so you might need to check the AWS documentation for more details.

@animeshjain,

I didn’t test this with RDS, but I think it would work as well. I was able to curl internal hostnames and IPs without problems.
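To check resolution from the JVM's point of view rather than with curl, a small lookup tool along these lines could be run inside the allocation. This is only a sketch: `DnsCheck` is a hypothetical name, and the hostnames you pass to it are whatever you want to test.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic, not something taken from the job above.
final class DnsCheck {
    private DnsCheck() {}

    /** Returns the addresses a host resolves to, or a failure message if it does not resolve. */
    static String lookup(String host) {
        try {
            StringBuilder sb = new StringBuilder(host).append(" ->");
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                sb.append(' ').append(addr.getHostAddress());
            }
            return sb.toString();
        } catch (UnknownHostException e) {
            return host + " -> FAILED: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Print one line per hostname given on the command line.
        for (String host : args) {
            System.out.println(lookup(host));
        }
    }
}
```

On Java 11 or later it can be started straight from source, for example `java DnsCheck.java ip-10-0-10-163 www.google.com`, which shows whether the local hostname and an external name behave differently.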

Give it a try and let me know how it goes :slightly_smiling_face: