Waypoint deploy exits with non-descriptive error - ! error reading from server: EOF

christian.mena · August 3, 2022, 9:39pm

Hi,

I’m having an issue deploying Nomad jobs that have the scheduler type ‘batch’. We’re using Nomad periodic jobs to run scheduled tasks, and we successfully deployed them to our QA and staging environments about 2 months ago. We then tried deploying to our production environment about 1 month ago and got the error from the title: ! error reading from server: EOF. No other message is printed, so it’s been hard to track down the issue.

I’ve tried simplifying both the waypoint.hcl and the Nomad config, but it still fails. Not sure if it’s useful to note, but we can deploy directly to Nomad; this only fails with waypoint up or waypoint deploy.

Let me know if I can provide more information. I didn’t include either the waypoint.hcl or the Nomad spec, since even a barebones example file doesn’t work. I’m looking for a solution or a nudge in the right direction, since the error printed doesn’t give me much (I’ve tried -vvv too).

dcanadillas1 · May 31, 2023, 9:26pm

Check the logs of the Nomad client where the Waypoint runner (and its task jobs) is running, and make sure nothing was OOM killed. That was my case (I also had the ! error reading from server: EOF), so it was a memory issue in the runner. The solution was to increase the memory for the runner job.
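A quick way to run that check from the CLI is to look at the task events of the runner allocation. This is only a sketch: the job name waypoint-runner is an assumption about how the runner was installed, the helper names are made up for illustration, and the exact wording of the OOM event can differ by task driver and Nomad version.

```shell
#!/bin/sh
# has_oom_kill: reads `nomad alloc status` output on stdin and
# returns 0 if it mentions an OOM kill (e.g. Exit Message: "OOM Killed").
has_oom_kill() {
  grep -Eqi 'oom[ -]?kill'
}

# check_runner_oom ALLOC_ID: inspect the recent task events of one allocation.
check_runner_oom() {
  if nomad alloc status "$1" | has_oom_kill; then
    echo "allocation $1: task was OOM killed; consider raising the runner memory"
  else
    echo "allocation $1: no OOM kill found in recent events"
  fi
}

# Usage (job name is an assumption; list the allocations first, then check one):
#   nomad job status waypoint-runner
#   check_runner_oom <alloc-id>
```

An exit code of 137 in the Terminated event is another hint that the task was killed rather than exiting on its own.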

You can simply install with the following parameters:

waypoint server install -platform nomad \
-nomad-host "http://$(hostname -i):4646" \
-nomad-host-volume waypoint \
-nomad-runner-host-volume waypoint-runner \
-nomad-runner-memory 800 \
-accept-tos \
-vvv

In my case it worked to set the runner memory to 800 MB (the default is 600), but I suppose the right value will depend on the builds your runner does.

You can probably also configure a runner profile instead, passing the following configuration through the -plugin-config parameter, so you don’t need to reinstall the runner:

{
        "datacenter": "dc1",
        "namespace": "default",
        "nomad_host": "$NOMAD_ADDR",
        "region": "global",
        "resources_cpu": "200",
        "resources_memory": "800"
}
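As a rough sketch of how that could be applied (I have not verified this against every Waypoint version): write the JSON to a file with the Nomad address filled in, then point waypoint runner profile set at it. The profile name nomad-800mb, the file name, the helper function and the fallback address are all made up for the example.

```shell
#!/bin/sh
# write_plugin_config PATH: write the plugin config to PATH,
# expanding NOMAD_ADDR when the file is written.
write_plugin_config() {
  # Falls back to a local Nomad agent address if NOMAD_ADDR is unset (assumption).
  cat > "$1" <<EOF
{
  "datacenter": "dc1",
  "namespace": "default",
  "nomad_host": "${NOMAD_ADDR:-http://127.0.0.1:4646}",
  "region": "global",
  "resources_cpu": "200",
  "resources_memory": "800"
}
EOF
}

write_plugin_config ./nomad-runner-config.json

# Apply the file to a runner profile; skipped if the waypoint CLI is not installed.
if command -v waypoint >/dev/null 2>&1; then
  waypoint runner profile set -name=nomad-800mb -plugin-type=nomad \
    -plugin-config=./nomad-runner-config.json
fi
```

Writing the file through a heredoc matters here because a literal "$NOMAD_ADDR" inside a JSON file is not expanded by anything; the address has to be substituted before Waypoint reads it.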

I know this is an old thread but I was having the same error and issue during the last week. This solved it in my case.

Hope it helps.

cassie.coyle · June 7, 2023, 4:54pm

@dcanadillas1, if you encounter this again, would you mind adding the output from waypoint job get-stream <id>? There are some docs on it. This is definitely a hard-to-catch issue with memory on the runners. Thanks for posting your solution! There is also some additional documentation on the watchJob.
