Waypoint-Nomad deployment using "nomad-jobspec"

Hoping someone can point me in the right direction here. I'm attempting to set up Waypoint with Nomad.

Can't get Waypoint to deploy into a local instance of Nomad.

Waypoint is running and can be accessed from the local IP. Running waypoint init from the project source directory executes just fine and inserts a new project and project application. After setting up variables inside of Waypoint and attempting to run waypoint up, the error message below is received.

! operation canceled

Here is the waypoint.hcl file attempting to run:


project = "Bad Apps Development"

app "notification" {

    build {

        use "docker" {
            dockerfile = "./docker/Dockerfile"
            no_cache = true
        }

        registry {
            use "docker" {
                image = "badappsdev/notification"
                tag = "latest"
                local = false

                auth {
                    username = "badappsdev"
                    password = var.docker_password
                }
            }
        }
    }

    deploy {
        use "nomad-jobspec" {
            jobspec = templatefile("${path.app}/nomad/notification.nomad.tpl")
        }
    }
}

variable "docker_password" {
    type = string
    default = "password"
    sensitive = true
}

variable "db_dsn" {
    type = string
    default = "localhost:4406"
    sensitive = true
}

variable "send_grid_api_key" {
    type = string
    default = "key"
    sensitive = true
}

variable "github_token" {
    type = string
    default = "token"
    sensitive = true
}

Any suggestions or help is greatly appreciated!

Hey @trevatk - When I see an error like the one you shared, I suspect the Waypoint server may have panicked. I recommend digging into the server logs, as you might see a stack trace there from when the operation failed. If you find one, can you please share it here or on our GitHub repo so the team can take a look? Thank you!

@briancain

Attached are the Waypoint server logs from after attempting to run waypoint up. It looks like Waypoint is unable to connect to Nomad… But that doesn’t make any sense, since the Waypoint server was installed through Nomad. Everything was able to communicate when Waypoint was installed.

2023-03-01T23:36:26.757Z [DEBUG] waypoint.server.grpc: job stream recv loop exiting due to error: job_id=01GTFRHHST43Y8YR68ZCVZ36SD runner_id=static error="Put \"http://localhost:4646/v1/jobs?namespace=default&region=global\": dial tcp 127.0.0.1:4646: connect: connection refused"
2023-03-01T23:36:26.759Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/RunnerJobStream response: error=<nil> duration=3.316449016s
2023-03-01T23:36:26.763Z [DEBUG] waypoint.server.grpc: job state change: job_id=01GTFRHHSSRSVB5J1QEESJHH4S state=ERROR
2023-03-01T23:36:26.764Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetJobStream response: error=<nil> duration=67.683812ms
2023-03-01T23:36:26.764Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetVersionInfo request
2023-03-01T23:36:26.764Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetVersionInfo response: error=<nil> duration="172.889µs"
2023-03-01T23:36:26.764Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/RunnerJobStream request
2023-03-01T23:36:26.765Z [INFO]  waypoint.server.grpc: waiting for job assignment: runner_id=static
2023-03-01T23:36:26.766Z [DEBUG] waypoint.server.grpc: sending job assignment to runner: job_id=01GTFRHHST50E27N98XRRDAFEW runner_id=static
2023-03-01T23:36:26.767Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetJob request
2023-03-01T23:36:26.767Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetJob response: error=<nil> duration="146.76µs"
2023-03-01T23:36:26.768Z [DEBUG] waypoint.server: heartbeat timer set: job=01GTFRHHST50E27N98XRRDAFEW timeout=2m0s
2023-03-01T23:36:26.773Z [DEBUG] waypoint.server.grpc: job stream recv loop exiting due to completion: job_id=01GTFRHHST50E27N98XRRDAFEW runner_id=static
2023-03-01T23:36:26.789Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/RunnerJobStream response: error=<nil> duration=33.374550602s
2023-03-01T23:36:26.790Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/RunnerJobStream request
2023-03-01T23:36:26.790Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetVersionInfo request
2023-03-01T23:36:26.791Z [INFO]  waypoint.server.grpc: waiting for job assignment: runner_id=static
2023-03-01T23:36:26.791Z [INFO]  waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetVersionInfo response: error=<nil> duration="326.548µs"

It might be helpful to post the nomad-jobspec, so here it is:


job "notification.badappsdevelopment.com" {
  
  datacenters = ["dc1"]
  group "badappsdevelopment" {
    
    count = 1
    network {
      port "http" {
        static = 11030
      }
      port "grpc" {
        static = 50061
      }
    }

    task "server" {
      driver = "docker"
      config {
        image = "badappsdev/notification:latest"
        ports = ["http", "grpc"]
        auth {
          username = "badappsdev"
          password = var.docker_password
        }
      }

      service {
        name = "notification-badappsdevelopment-com"
        tags = [
          "traefik",
          "traefik.enable=true"
        ]
        provider = "consul"
        port = "http"
        meta {
          meta = "notification micro service"
        }

        check {
          type = "http"
          name = "notification_health"
          port = "http"
          path = "/api/v1/health"
          interval = "30s"
          timeout = "5s"
        }
      }

      env {
        HTTP_SERVER_HOST = "0.0.0.0"
        HTTP_SERVER_PORT = 11030
        GRPC_SERVER_HOST = "0.0.0.0"
        GRPC_SERVER_PORT = 50061
        PORT = 11030
        DB_DSN = var.db_dsn
        SEND_GRID_API_KEY = var.send_grid_api_key
      }

      resources {
        
        cpu = 500
        memory = 128
      }

    }
  }
}

variable "docker_password" {
  type = string
  default = "docker"
}

variable "db_dsn" {
  type = string
  default = "localhost:3306"
}

variable "send_grid_api_key" {
  type = string
  default = "key"
}

Hey @trevatk, I think this could be happening because of what’s indicated in the very first log line above. Near the end of that line, Waypoint is trying to reach a Nomad API endpoint at localhost:4646, and the connection is refused. Since the runner was installed through Nomad, it is likely running inside a container, where localhost refers to the container itself rather than the Nomad host. If your Nomad cluster isn’t accessible at that address from Waypoint, it won’t be able to make the connection. I recommend using runner configuration to set the Nomad address:

$ waypoint config set -runner NOMAD_ADDR=http://nomad.service.consul:4646

I used http://nomad.service.consul:4646 as an example address here. In its place, you should enter a Nomad address that is accessible from your Waypoint runners.
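Before setting the address, it can help to confirm that the Nomad HTTP API actually answers there. The following is a rough sketch, not an official tool: check_nomad is a made-up helper name, it assumes curl is installed, and it queries Nomad's /v1/status/leader endpoint. For the result to mean anything, it should be run from somewhere that shares the runner's network view rather than from your workstation.

```shell
#!/bin/sh
# check_nomad is a hypothetical helper (not part of Waypoint or Nomad).
# It succeeds only if the Nomad HTTP API answers at the given address.
check_nomad() {
    addr="$1"
    # -f: fail on HTTP errors; -sS: quiet, but still print errors;
    # --max-time: give up after 5 seconds instead of hanging.
    curl -fsS --max-time 5 "${addr}/v1/status/leader" >/dev/null
}

# Uses NOMAD_ADDR if set; otherwise falls back to the address from the log.
target="${NOMAD_ADDR:-http://127.0.0.1:4646}"

if check_nomad "$target"; then
    echo "Nomad API reachable at $target"
else
    echo "Nomad API NOT reachable at $target"
fi
```

If the check fails for localhost but succeeds for the host's routable IP or a Consul DNS name, that address is a reasonable candidate for NOMAD_ADDR.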

In addition to that, I’d also recommend setting the Nomad address in the runner profile that launches on-demand runners, through the configuration of the Nomad task launcher plugin. You can set this using the -plugin-config flag on the waypoint runner profile set command.
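As a rough illustration of what such a plugin configuration file might contain, here is a minimal sketch. It assumes the Nomad task launcher plugin accepts a nomad_host setting; the file name nomad-plugin.hcl and the address are placeholders, so please verify the option name against the plugin documentation for your Waypoint version.

```hcl
# nomad-plugin.hcl (hypothetical file name)
# Assumption: the Nomad task launcher plugin reads nomad_host as the
# address used to launch on-demand runners. Replace the value with an
# address that is reachable from your Waypoint runners.
nomad_host = "http://nomad.service.consul:4646"
```

The path to this file would then be passed to waypoint runner profile set through the -plugin-config flag mentioned above.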