Nomad - Persist Docker Container When Host Loses Connectivity With Server

Hello,

I am new to Hashistack and I am doing a Proof of Concept for the company I work for.

The scenario is:

Hashistack → Consul + Nomad + Vault

  1. Nomad Server on AWS instance
  2. A few clients in different availability zones, not necessarily on AWS, communicating with the server over the WAN.

I need to make it work in a way that:

  1. I can run Docker containers on the clients remotely through the Nomad server, so I can update tags/images and redeploy (a minimal job sketch is included after this list).
  2. When a client loses internet connectivity, the container must keep running on that client, and when WAN connectivity comes back, if that container is still running, nothing should change.
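For context, the jobs I'm deploying look roughly like this minimal sketch (the job/group/task names and the image are just placeholders):

```hcl
job "edge-app" {
  datacenters = ["dc1"]
  type        = "service"

  group "app" {
    task "web" {
      # Run the workload as a Docker container managed by Nomad.
      driver = "docker"

      config {
        image = "nginx:latest"
      }
    }
  }
}
```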

I already have the whole environment set up and running, but the issue is: whenever a client node loses WAN connectivity and comes back online, Nomad starts a new container and kills the one that was previously running.

Also, is there a way to run containers with Nomad without them being a service? I mean, have the container run on the client node, but if we stop the job on the server, the container keeps running on the client?

Thank you all!

Updating this topic…

I’ve already tested running Docker via shell scripts with the raw_exec driver, and it works fine. But using the docker driver would be best, because then we could manage the allocation/container from the Nomad cluster.
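For reference, the raw_exec workaround I tested looks roughly like this (the job name and wrapper script path are hypothetical placeholders):

```hcl
job "edge-app-shell" {
  datacenters = ["dc1"]

  group "app" {
    task "run-container" {
      # raw_exec runs an arbitrary command on the client host.
      driver = "raw_exec"

      config {
        # Hypothetical wrapper script that calls `docker run` itself,
        # so Nomad only tracks the script, not the container.
        command = "/usr/local/bin/run-container.sh"
      }
    }
  }
}
```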

Anyone?

Hi @luizhfff :wave:

In general, Nomad clients will never stop a container unless specifically told to do so by a server. You can change this behavior by setting stop_after_client_disconnect, but that seems to be the opposite of what you want.
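Just for reference, stop_after_client_disconnect lives at the group level of the job file; a sketch with illustrative names and value is below. In your case you would simply leave it unset:

```hcl
group "app" {
  # If set, allocations on a disconnected client are stopped after this
  # duration. Leaving it unset (the default) keeps them running locally,
  # which is the behaviour you want here.
  stop_after_client_disconnect = "1h"

  task "web" {
    driver = "docker"

    config {
      image = "nginx:latest"
    }
  }
}
```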

This could be caused by the client missing a heartbeat against the server, so the server thinks the client is gone and reschedules the container somewhere else.

You can give clients additional heartbeat time by configuring heartbeat_grace in the server configuration.

Try setting it to a higher value and see if that helps.
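Something along these lines in the server agent configuration (the "5m" value is just an example to tune against how long your WAN outages last):

```hcl
server {
  enabled = true

  # Extra time the server waits for a client heartbeat before marking
  # the node as down and rescheduling its allocations. The default is
  # 10s, which is easy to exceed over a flaky WAN link.
  heartbeat_grace = "5m"
}
```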

Not with the Docker driver. The idea of having jobs register against the servers is so that there’s a consistent view of your cluster. Leaving containers behind like this would prevent Nomad from doing proper scheduling.


@lgfa29, thank you very much for the help. I have tested your suggestion to configure heartbeat_grace on the server, and that unexpected behaviour is now gone.

I had tested deploying containers via shell scripts with the raw_exec driver, but we would lose the ability to manage the allocations from the server, which is a great feature to have.

Best regards from a Brazilian fellow. :slight_smile:


Glad it worked!

Enjoy Nomad :grinning_face_with_smiling_eyes: