Unable to deploy applications with waypoint runner when Nomad Server has TLS endpoint

Is it possible to have Waypoint on-demand runners connect to Nomad over TLS?

When running waypoint up I get "! operation canceled", and the logs show that the Waypoint runner or server is unable to connect to the Nomad server:

error starting task: job_id=01FXGJ8J06T30PPNHH905NQXCY job_op=*gen.Job_StartTask error="rpc error: code = Unknown desc = Put "https://nomad.service.brain.consul:4646/v1/jobs?namespace=default&region=eu-west-1": x509: certificate signed by unknown authority"

I didn’t find anything about this topic in the documentation.

For more context, please see the GitHub issue "unable to deploy applications with waypoint runner when Nomad Server has TLS endpoint" (Issue #3074, hashicorp/waypoint).

The underlying issue here is that your Nomad server certificate is signed by a CA that is unknown to your Waypoint runner. The correct way to solve unknown CA issues is to always sign certificates with a CA known to all of your infrastructure. How to do that, and how to get a known CA onto your Waypoint runner(s), is outside the scope of what I want to cover here.

I’m going to make some simplifying assumptions to help with your immediate problem.

  1. You’re running Nomad/Waypoint Runners in an environment you trust
  2. That environment is not production
  3. You will follow these directions at your own risk, knowing that skipping TLS verification is not acceptable in production.

Below is an example Waypoint Runner Nomad jobspec you can use. The key elements are the environment variables WAYPOINT_SERVER_TLS_SKIP_VERIFY and NOMAD_SKIP_VERIFY. Setting these to true means that the Waypoint runner will not attempt to verify the certificate chain of trust for the Waypoint server or for Nomad.
You may currently install your Waypoint runner with waypoint install. I personally find that approach too rigid, which is why I deploy a custom Nomad jobspec instead.

Nomad Jobspec

job "waypoint-runner" {
  datacenters = ["dc1"]

  type = "service"

  group "service" {
    count = 1

    task "waypoint-runner" {
      driver = "docker"

      env {
        NOMAD_ADDR = "https://nomad.service.consul:4646"
        NOMAD_SKIP_VERIFY = "true"
        NOMAD_TOKEN = "<NOMAD TOKEN>"
        WAYPOINT_SERVER_ADDR = "waypoint-server.service.consul:9701"
        WAYPOINT_SERVER_TLS = "true"
        WAYPOINT_SERVER_TLS_SKIP_VERIFY = "true"
        WAYPOINT_SERVER_TOKEN = "<WAYPOINT SERVER TOKEN>"
      }

      config {
        image = "hashicorp/waypoint"

        # allow images to be pulled from public repo
        auth_soft_fail = "true"

        args = [
          "runner",
          "agent",
        ]
      }

      resources {
        cpu    = 256 # MHz
        memory = 1024 # MB
      }
    }
  }
}

The above jobspec uses the public hashicorp/waypoint image. If you’d like to solve this problem correctly, you could build your own Docker image that adds your cluster’s CA certificate to the public image, and reference that custom image in the jobspec. This approach is preferable to skipping TLS verification.
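
As a rough sketch of what the custom-image approach might look like in the jobspec: the image name registry.example.com/waypoint-with-ca and the certificate path /etc/ssl/certs/cluster-ca.pem are hypothetical placeholders, not anything Waypoint ships. NOMAD_CACERT is the environment variable that Nomad's own API client reads for a CA bundle. I'm assuming the runner honors it the same way it honors NOMAD_SKIP_VERIFY, so please verify that in your environment.

```hcl
job "waypoint-runner" {
  datacenters = ["dc1"]
  type        = "service"

  group "service" {
    count = 1

    task "waypoint-runner" {
      driver = "docker"

      env {
        NOMAD_ADDR = "https://nomad.service.consul:4646"
        # Assumption: path where the custom image stores the cluster CA (hypothetical).
        # This replaces NOMAD_SKIP_VERIFY = "true".
        NOMAD_CACERT = "/etc/ssl/certs/cluster-ca.pem"
        NOMAD_TOKEN  = "<NOMAD TOKEN>"

        # Unchanged from the jobspec above. Whether the skip-verify setting
        # can also be dropped depends on which CA signed the Waypoint server
        # certificate and on the trust store inside your image.
        WAYPOINT_SERVER_ADDR            = "waypoint-server.service.consul:9701"
        WAYPOINT_SERVER_TLS             = "true"
        WAYPOINT_SERVER_TLS_SKIP_VERIFY = "true"
        WAYPOINT_SERVER_TOKEN           = "<WAYPOINT SERVER TOKEN>"
      }

      config {
        # Hypothetical custom image built from hashicorp/waypoint plus your CA.
        image = "registry.example.com/waypoint-with-ca"

        args = [
          "runner",
          "agent",
        ]
      }

      resources {
        cpu    = 256  # MHz
        memory = 1024 # MB
      }
    }
  }
}
```

The only functional differences from the first jobspec are the image and swapping NOMAD_SKIP_VERIFY for NOMAD_CACERT, so the connection to Nomad is verified against your own CA instead of not being verified at all.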
