I am setting up a Nomad/Terraform cluster (along with Consul and Vault). I recently learned about Waypoint. I have been reading the docs and playing with installing it. When I run the install command, it appears to install itself as a typical Nomad job in my cluster.
I am wondering if I can just write a Nomad job (or nomad-pack) and use that to do an automated nomad job run (or nomad-pack run) to install Waypoint. It appears to be able to run as just another service on Nomad along with all my other services/applications.
I decided to just dive in and see if it would work. I was able to set up a nomad-pack for the Waypoint server. It runs just fine on my Nomad cluster. However, once it is running it wants me to bootstrap it. So now I am trying to figure out the best way to automate the bootstrapping command:
waypoint server bootstrap -server-addr=[::]:9701 -server-tls-skip-verify
I could build my own Docker image using the current image, but I would rather not take on that dependency. Perhaps an RPC from a sidecar task…
I had the idea that maybe I could just do a waypoint install in a nomad-pack script:
./waypoint install \
  -platform=nomad \
  -nomad-dc=default \
  -nomad-consul-datacenter=my-consul \
  -accept-tos \
  -nomad-host-volume=waypoint-volume
This was the result:
-> Initializing Nomad client...
-> Checking for existing Waypoint server...
! Error installing server into nomad: Unexpected response code: 403 (Permission denied)
The waypoint install method was doomed from the beginning.
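A 403 from Nomad generally means the API request carried no ACL token, or one without sufficient permissions. A minimal pre-flight sketch, assuming the installer's Nomad client reads the standard NOMAD_ADDR and NOMAD_TOKEN environment variables (the helper name check_nomad_env is hypothetical and not part of Waypoint or Nomad):

```shell
#!/bin/sh
# Hypothetical pre-flight check: report which Nomad credentials are unset
# before attempting an install from inside a task.
check_nomad_env() {
  missing=""
  for name in NOMAD_ADDR NOMAD_TOKEN; do
    # POSIX-compatible indirect lookup of the variable called $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Usage (not executed here; flags copied from the command above):
#   check_nomad_env && ./waypoint install -platform=nomad -accept-tos ...
```

Even with both variables set, the token still needs a policy that allows submitting jobs, so this only catches the simplest failure.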
So I went back to manually running the server and trying to bootstrap it. Here is the task portion of my current nomad-pack:
task "test" {
  driver = "exec"
  config {
    command = "/bin/sh"
    args = ["${NOMAD_TASK_DIR}/waypoint.sh"]
  }
  restart {
    attempts = 0
  }
  volume_mount {
    volume = "waypoint-volume"
    destination = "/data"
  }
  resources {
    cpu = 200
    memory = 600
  }
  template {
    destination = "${NOMAD_TASK_DIR}/waypoint.sh"
    perms = "644"
    data = <<EOF
#!/bin/sh
curl -Ls https://releases.hashicorp.com/waypoint/[[ .waypoint_server.version ]]/waypoint_[[ .waypoint_server.version ]]_linux_amd64.zip -o waypoint.zip
unzip -n waypoint.zip
./waypoint server run -accept-tos -vv -db=/alloc/data/data.db -listen-grpc=0.0.0.0:{{ env "NOMAD_PORT_server" }} -listen-http=0.0.0.0:{{ env "NOMAD_PORT_ui" }} &
./waypoint server bootstrap -server-addr=[::]:{{ env "NOMAD_PORT_server" }} -server-tls-skip-verify
EOF
  }
}
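The script in this task starts the server in the background and calls bootstrap on the very next line, so the bootstrap call can race the gRPC listener. A minimal sketch of a retry helper that could wrap the bootstrap call; the function name retry and its arguments are illustrative and not part of Waypoint:

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage inside waypoint.sh (not executed here):
#   ./waypoint server run ... &
#   retry 30 1 ./waypoint server bootstrap -server-addr=... -server-tls-skip-verify
```

One caveat: bootstrap is meant to succeed only once, so if a call succeeds on the server but the client fails to read the token, further retries would presumably keep failing.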
After trying to run this in Nomad, the stdout log file shows this:
Archive: waypoint.zip
inflating: waypoint
» Server configuration:
DB Path: /alloc/data/data.db
gRPC Address: [::]:9701
HTTP Address: [::]:9702
Auth Required: yes
Browser UI Enabled: yes
URL Service: api.waypoint.run:443 (account: guest)
» Server requires bootstrapping!
New servers must be bootstrapped to retrieve the initial auth token for
connections. To bootstrap this server, run the following command in your
terminal once the server is up and running.
waypoint server bootstrap -server-addr=[::]:9701 -server-tls-skip-verify
This command will bootstrap the server and setup a CLI context.
» Server logs:
BCkP8cw7qjruUDFLokWxZoriTgRMbuxjcdcH7kGSJ4eeTmCD2LjoQqq62uCaxZzv7Xs936pE4hzrcfpxVDVhDZJikbYqSs5m4kUMSemuDLrJwJMik2UjTpWYSgz7cr6x1nBsAMbPKcAnycvgG
The stderr log file shows this:
2021-12-31T15:16:06.570Z [INFO] waypoint: waypoint version: full_string="v0.6.3 (bd303e12)" version=v0.6.3 prerelease="" metadata="" revision=bd303e12
2021-12-31T15:16:06.572Z [DEBUG] waypoint: home configuration directory: path=/nonexistent/.config/waypoint
2021-12-31T15:16:06.572Z [INFO] waypoint.server: opening DB: path=/alloc/data/data.db
2021-12-31T15:16:06.575Z [DEBUG] waypoint.server.singleprocess: checking if DB restore is requested
2021-12-31T15:16:06.575Z [DEBUG] waypoint.server.singleprocess: no restore file found, no DB restore requested
2021-12-31T15:16:06.582Z [DEBUG] waypoint.server.singleprocess.url_service: API token not set in config, initializing guest account
2021-12-31T15:16:06.582Z [DEBUG] waypoint.server.singleprocess.url_service: connecting to URL service to retrieve guest token: addr=api.waypoint.run:443 tls=true
2021-12-31T15:16:06.583Z [DEBUG] waypoint.server.singleprocess.url_service: waiting on server connection state to become ready
2021-12-31T15:16:06.706Z [DEBUG] waypoint.server.singleprocess.url_service: connection is ready
2021-12-31T15:16:06.826Z [DEBUG] waypoint.server.singleprocess.url_service: connection is ready
2021-12-31T15:16:06.826Z [INFO] waypoint.server.singleprocess.url_service: URL service client successfully initialized
2021-12-31T15:16:06.826Z [DEBUG] waypoint.server.grpc: starting listener: addr=0.0.0.0:9701
2021-12-31T15:16:06.826Z [INFO] waypoint.server.singleprocess.poll_queuer.application_statusreport: starting
2021-12-31T15:16:06.827Z [INFO] waypoint.server.singleprocess.prune: starting
2021-12-31T15:16:06.826Z [INFO] waypoint.server.singleprocess.poll_queuer.project: starting
2021-12-31T15:16:06.827Z [INFO] waypoint.server.grpc: TLS cert wasn't specified, a self-signed certificate will be created
2021-12-31T15:16:06.861Z [INFO] waypoint.server.grpc: listener is wrapped with TLS
2021-12-31T15:16:06.861Z [DEBUG] waypoint.server.http: starting listener: addr=0.0.0.0:9702
2021-12-31T15:16:06.862Z [INFO] waypoint.server.http: TLS cert wasn't specified, a self-signed certificate will be created
2021-12-31T15:16:06.908Z [INFO] waypoint.server.http: listener is wrapped with TLS
2021-12-31T15:16:06.908Z [INFO] waypoint.server: starting built-in server: addr=[::]:9701
2021-12-31T15:16:06.908Z [INFO] waypoint.server.http: starting HTTP server: ln=[::]:9702 addr=[::]:9702
2021-12-31T15:16:06.908Z [INFO] waypoint.server.grpc: starting gRPC server: addr=[::]:9701
2021-12-31T15:16:07.605Z [INFO] waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetVersionInfo request
2021-12-31T15:16:07.605Z [INFO] waypoint.server.grpc: /hashicorp.waypoint.Waypoint/GetVersionInfo response: error=<nil> duration=59.838µs
2021-12-31T15:16:07.616Z [INFO] waypoint.server.grpc: /hashicorp.waypoint.Waypoint/BootstrapToken request
2021-12-31T15:16:07.620Z [INFO] waypoint.server.grpc: /hashicorp.waypoint.Waypoint/BootstrapToken response: error=<nil> duration=4.407082ms
I am making progress. I now have the Waypoint server installed automatically with a nomad-pack, in a remote Nomad cluster. It also runs the bootstrap automatically. In order to test that it is working remotely, I am setting up a context on my local machine:
$ waypoint context create \
    -server-addr=waypoint.mydomain.com:443 \
    -server-require-auth \
    -server-auth-token=$WAYPOINT_TOKEN \
    waypoint-test
Then trying to get the status:
$ waypoint status -vvv
2022-01-04T17:24:41.360-0700 [INFO] waypoint: waypoint version: full_string="v0.6.3 (bd303e12)" version=v0.6.3 prerelease="" metadata="" revision=bd303e12
2022-01-04T17:24:41.360-0700 [TRACE] waypoint: starting interrupt listener for context cancellation
2022-01-04T17:24:41.360-0700 [TRACE] waypoint: interrupt listener goroutine started
2022-01-04T17:24:41.360-0700 [DEBUG] waypoint: home configuration directory: path=/Users/sunsparc/Library/Preferences/waypoint
2022-01-04T17:24:41.361-0700 [TRACE] waypoint: no API client provided, initializing connection if possible
2022-01-04T17:24:41.361-0700 [TRACE] waypoint.server: WithLocal set, server credentials optional
2022-01-04T17:24:41.361-0700 [INFO] waypoint.server: attempting to source credentials and connect
2022-01-04T17:24:41.362-0700 [DEBUG] waypoint.serverclient: connection information: address=waypoint.mydomain.com:443 tls=true tls_skip_verify=false send_auth=true has_token=true
2022-01-04T17:24:41.749-0700 [DEBUG] waypoint.server: connection established with sourced credentials
2022-01-04T17:24:41.749-0700 [TRACE] waypoint: requesting version info from server
2022-01-04T17:24:41.801-0700 [ERROR] waypoint: failed to create client: error="rpc error: code = Unavailable desc = stream terminated by RST_STREAM with error code: REFUSED_STREAM"
! failed to create client: rpc error: code = Unavailable desc = stream terminated by RST_STREAM with error code: REFUSED_STREAM
2022-01-04T17:24:41.802-0700 [TRACE] waypoint: stopping signal listeners and cancelling the context
According to the debug logs, the connection is actually being established.
I am not sure this is a bug, so I did not want to post it as a Waypoint issue. My guess is that I have something configured incorrectly. The error appears to be rpc related. A search mentions the same error in connection with grpc.
I am not sure where to look. Any ideas?
Hi @SunSparc!
I’m guessing the issue here is getting the address/dns name of the server out of the nomad cluster after it starts. For a more stable setup, you likely need to use at minimum the consul service DNS name to contact the server. For a quick and unstable setup, you can use the direct allocation address.
In your last example, you have waypoint.mydomain.com:443 as the server address. That is where you would put the Consul service DNS name for the Waypoint server. Note that for the Consul service DNS name to resolve, your client must also have Consul set up as a valid DNS server.
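A minimal sketch of building that address, assuming Consul's default DNS domain (consul), a service registered as waypoint-server, and gRPC on port 9701; the helper name consul_service_host is hypothetical:

```shell
#!/bin/sh
# consul_service_host SERVICE [DOMAIN]: print the Consul DNS name for a
# service, using the default "consul" domain unless another is given.
consul_service_host() {
  service=$1
  domain=${2:-consul}
  echo "${service}.service.${domain}"
}

# Usage (not executed here). The first line checks that the name resolves
# through a local Consul agent on its default DNS port, 8600:
#   dig @127.0.0.1 -p 8600 "$(consul_service_host waypoint-server)" SRV
#   waypoint context create \
#     -server-addr="$(consul_service_host waypoint-server):9701" ...
```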
@evanphx, yes, that was the problem. I had Waypoint installed behind a proxy (HAProxy), and I was not correctly passing some of the configuration that the Waypoint server requires. I decided to skip the proxy and set up Waypoint to be accessed directly, and it is now working.
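The thread does not record which HAProxy settings were missing. Since gRPC needs HTTP/2 end to end, one quick check when a proxy is involved is whether the TLS frontend negotiates h2 via ALPN. A minimal sketch, assuming the "ALPN protocol:" line printed by recent versions of openssl s_client; the helper name alpn_protocol is hypothetical:

```shell
#!/bin/sh
# alpn_protocol: read `openssl s_client` output on stdin and print the
# negotiated ALPN protocol, or "none" if no such line is present.
alpn_protocol() {
  awk -F': ' '
    /^ALPN protocol:/ { print $2; found = 1 }
    END { if (!found) print "none" }
  '
}

# Usage (not executed here):
#   openssl s_client -alpn h2 -connect waypoint.mydomain.com:443 \
#     </dev/null 2>/dev/null | alpn_protocol
```

A result of h2 only shows that the frontend speaks HTTP/2; the proxy's connection to the Waypoint backend would need to be checked separately.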
The next step in automating the Waypoint installation in my Nomad cluster is figuring out how to define “Waypoint Projects” for all of the repositories that my companies have.
I already have the waypoint server bootstrap command running as a poststart task on the Waypoint job in Nomad. Perhaps I can just expand the script to loop through a set of pre-defined waypoint.hcl files after the bootstrap is complete. Seems a bit kludgy to do it this way. Any other ideas are welcome.
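A minimal sketch of that loop, assuming one .hcl file per project in a single directory and that the file name (minus the extension) doubles as the project name. WAYPOINT_BIN is a hypothetical override so the loop can be dry-run with echo:

```shell
#!/bin/sh
# apply_projects DIR: run `waypoint project apply` once per *.hcl file in
# DIR, using the file name without its extension as the project name.
apply_projects() {
  dir=$1
  for file in "$dir"/*.hcl; do
    # With no matches the glob stays literal, so skip non-existent paths.
    [ -e "$file" ] || continue
    project=$(basename "$file" .hcl)
    "${WAYPOINT_BIN:-waypoint}" project apply \
      -waypoint-hcl="$file" \
      "$project" || return 1
  done
}

# Dry run (not executed here):
#   WAYPOINT_BIN=echo apply_projects /path/to/hcl-files
```

Per-project settings such as a git URL or SSH key would still need to come from somewhere, for example a small metadata file next to each .hcl file.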
@SunSparc Any chance you have this documented/committed somewhere that you are willing to share? Was looking to do the same thing but not much found on the web about this.
The project has been shelved for a while. However, this is a nomad-pack template that I was testing when I was last working on the project. No guarantees. Hopefully it provides you some help.
job "waypoint" {
  datacenters = ["default"]
  region = [[ .my.facility | quote ]]
  priority = [[ .my.priority ]]
  update {
    max_parallel = 1
    min_healthy_time = "5s"
    healthy_deadline = "120s"
    auto_revert = false
    auto_promote = false
    canary = 0
  }
  group "primary" {
    count = [[ .my.instances ]]
    restart {
      attempts = 2
    }
    reschedule {
      unlimited = true
    }
    network {
      port "server" {
        host_network = "public"
        static = 9701
      }
      port "ui" {
        host_network = "public"
        static = 9702
      }
    }
    volume "waypoint-volume" {
      type = "host"
      source = "cache"
    }
    task "server" {
      driver = "exec"
      config {
        command = "/bin/bash"
        args = ["${NOMAD_TASK_DIR}/waypoint_server_start.sh"]
      }
      service {
        name = "waypoint-server"
        port = "server"
        tags = ["waypoint", "server"]
        check {
          name = "alive"
          type = "tcp"
          interval = "5s"
          timeout = "10s"
        }
      }
      service {
        name = "waypoint-ui"
        port = "ui"
        tags = ["waypoint", "ui"]
      }
      template {
        destination = "${NOMAD_TASK_DIR}/waypoint_server_start.sh"
        perms = "644"
        data = <<EOF
#!/bin/bash
set -eux
${NOMAD_ALLOC_DIR}/waypoint server run \
  -accept-tos \
  -vvv \
  -db=/alloc/data/data.db \
  -listen-grpc=:${NOMAD_PORT_server} \
  -listen-http=:${NOMAD_PORT_ui} \
  -advertise-tls=true \
  -advertise-tls-skip-verify=true \
  -advertise-addr=[[ .my.server_address ]] \
  -tls-cert-file=${NOMAD_SECRETS_DIR}/ecosystem.pem \
  -tls-key-file=${NOMAD_SECRETS_DIR}/ecosystem.pem
EOF
      }
      volume_mount {
        volume = "waypoint-volume"
        destination = "/data" # ${NOMAD_ALLOC_DIR}/data/
        read_only = false
      }
      resources {
        cpu = [[ .my.server_resources.cpu ]]
        memory = [[ .my.server_resources.memory ]]
      }
      artifact {
        source = "https://releases.hashicorp.com/waypoint/[[ .my.server_version ]]/waypoint_[[ .my.server_version ]]_linux_amd64.zip"
        destination = "${NOMAD_ALLOC_DIR}/waypoint"
        mode = "file"
      }
      template {
        destination = "${NOMAD_SECRETS_DIR}/ecosystem.pem"
        change_mode = "restart"
        data = <<-EOT
{{- with secret "vault/path/to/encryption-certificate" -}}
{{- .Data.data.private_key -}} {{- (printf "\n") -}}
{{- .Data.data.certificate -}} {{- (printf "\n") -}}
{{- .Data.data.issuing_ca -}} {{- (printf "\n") -}}
{{- range .Data.data.ca_chain -}}
{{- . -}} {{- (printf "\n") -}}
{{- end -}}
{{- end -}}
EOT
      }
      vault {
        policies = [
          [[- range .my.server_vault_policies ]]
          "[[ $.my.ecosystem ]]-nomad-job-[[ . ]]",
          [[- end ]]
        ]
      }
    }
    task "runner" {
      driver = "raw_exec"
      lifecycle {
        hook = "poststart"
        sidecar = true
      }
      config {
        command = "/bin/bash"
        args = ["${NOMAD_TASK_DIR}/waypoint_runner.sh"]
      }
      template {
        destination = "${NOMAD_TASK_DIR}/waypoint_runner.sh"
        perms = "644"
        data = <<EOF
#!/bin/bash
set -eux
mkdir -p workspace
export WORKDIR=$(pwd)
echo "workdir: $WORKDIR"
until [ -s ${NOMAD_ALLOC_DIR}/bootstrap.token ]; do
  sleep 1
done
# manually run a static runner
export WAYPOINT_SERVER_ADDR="localhost:{{ env "NOMAD_PORT_server" }}"
export WAYPOINT_SERVER_TLS=true
export WAYPOINT_SERVER_TLS_SKIP_VERIFY=true
export WAYPOINT_SERVER_TOKEN=$(<{{env "NOMAD_ALLOC_DIR"}}/bootstrap.token)
#export WAYPOINT_LOG_LEVEL=trace # trace, debug, info, warn, error
${NOMAD_ALLOC_DIR}/waypoint runner agent
EOF
      }
    }
    task "projects" {
      driver = "exec"
      lifecycle {
        hook = "poststart"
        sidecar = false
      }
      config {
        command = "/bin/bash"
        args = ["${NOMAD_TASK_DIR}/waypoint_projects.sh"]
      }
      template {
        destination = "${NOMAD_TASK_DIR}/waypoint_projects.sh"
        perms = "644"
        data = <<EOF
#!/bin/bash
set -eux
until [ -s ${NOMAD_ALLOC_DIR}/bootstrap.token ]; do
  sleep 1
done
echo "creating context..."
${NOMAD_ALLOC_DIR}/waypoint context create \
  -server-addr="localhost:{{ env "NOMAD_PORT_server" }}" \
  -server-require-auth \
  -server-auth-token=$(<{{env "NOMAD_ALLOC_DIR"}}/bootstrap.token) \
  -server-tls \
  -server-tls-skip-verify \
  default-context
echo "waypoint context list..."
${NOMAD_ALLOC_DIR}/waypoint context list
echo "waypoint status..."
${NOMAD_ALLOC_DIR}/waypoint status
while [ "$(${NOMAD_ALLOC_DIR}/waypoint runner list)" = "No runners found" ]; do
  echo "waiting for runner"
  sleep 1
done
${NOMAD_ALLOC_DIR}/waypoint runner list
echo "waypoint project list..."
${NOMAD_ALLOC_DIR}/waypoint project list
#WAYPOINT_SERVER_TOKEN
echo "project applying..."
${NOMAD_ALLOC_DIR}/waypoint project apply \
  -poll \
  -poll-interval="30s" \
  -data-source=git \
  -git-auth-type=ssh \
  -git-private-key-path=${NOMAD_SECRETS_DIR}/repository-key \
  -git-url=git@private-repository:username/test-project-for-waypoint.git \
  -waypoint-hcl="${NOMAD_ALLOC_DIR}/waypoint-hcl-files/test-project-file.hcl" \
  test-project-for-waypoint
echo "waypoint project list..."
${NOMAD_ALLOC_DIR}/waypoint project list
echo "waypoint project status..."
${NOMAD_ALLOC_DIR}/waypoint status -project=test-project-for-waypoint
EOF
      }
      restart {
        attempts = 0
      }
      template {
        destination = "${NOMAD_ALLOC_DIR}/waypoint-hcl-files/test-project-file.hcl"
        data = <<EOF
app "test-project-for-waypoint" {
  build {
    use "docker" {}
  }
  deploy {
    use "docker" {}
  }
}
EOF
      }
      template {
        destination = "${NOMAD_SECRETS_DIR}/repository-key"
        data = <<EOF
{{- with secret "vault/path/to/repository-key"}}{{.Data.data.private_key}}{{end}}
EOF
      }
      vault {
        policies = [
          [[- range .my.server_vault_policies ]]
          "[[ $.my.ecosystem ]]-nomad-job-[[ . ]]",
          [[- end ]]
        ]
      }
    }
    task "bootstrap" {
      driver = "exec"
      lifecycle {
        hook = "poststart"
        sidecar = false
      }
      config {
        command = "/bin/bash"
        args = ["${NOMAD_TASK_DIR}/waypoint_bootstrap.sh"]
      }
      template {
        destination = "${NOMAD_TASK_DIR}/waypoint_bootstrap.sh"
        perms = "644"
        data = <<EOF
#!/bin/bash
set -eu
{{env "NOMAD_ALLOC_DIR" }}/waypoint server bootstrap -server-addr=:{{ env "NOMAD_PORT_server" }} -server-tls-skip-verify > {{ env "NOMAD_ALLOC_DIR" }}/bootstrap.token
EOF
      }
    }
  }
}