Nomad - Persist Docker Container When Host Loses Connectivity With Server

Hello,

I am new to Hashistack and I am doing a Proof of Concept for the company I work for.

The scenario is:

Hashistack → Consul + Nomad + Vault

  1. Nomad Server on AWS instance
  2. A few clients in different availability zones, not necessarily on AWS, communicating with the server over the WAN.

I need to make it work in a way that:

  1. I can run Docker containers on the clients remotely through the Nomad server, so I can update tags/images and redeploy (a minimal job sketch is included after this list).
  2. When a client loses internet connectivity, the container must keep running on that client, and when WAN connectivity comes back, if that container is still running, nothing should change.
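For context, the jobs I'm deploying look roughly like this minimal sketch (the job/group/task names and the image are just placeholders):

```hcl
job "edge-app" {
  datacenters = ["dc1"]
  type        = "service"

  group "app" {
    task "web" {
      # Run the workload as a Docker container managed by Nomad.
      driver = "docker"

      config {
        image = "nginx:latest"
      }
    }
  }
}
```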

I already have the whole environment set up and running, but the issue is: whenever a client node loses WAN connectivity and comes back online, Nomad starts a new container and kills the one that was previously running.

Also, is there a way to run containers with Nomad without them being a service? I mean, have the container run on the client node, but if we stop the job on the server, the container keeps running on the client?

Thank you all!

Updating this topic…

I’ve already tested running Docker via shell scripts with the raw_exec driver, and it works fine. But using the docker driver would be best, because then we could manage the allocation/container from the Nomad cluster.
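For reference, the raw_exec workaround I tested looks roughly like this (the job name and wrapper script path are hypothetical placeholders):

```hcl
job "edge-app-shell" {
  datacenters = ["dc1"]

  group "app" {
    task "run-container" {
      # raw_exec runs an arbitrary command on the client host.
      driver = "raw_exec"

      config {
        # Hypothetical wrapper script that calls `docker run` itself,
        # so Nomad only tracks the script, not the container.
        command = "/usr/local/bin/run-container.sh"
      }
    }
  }
}
```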

Anyone?

Hi @luizhfff :wave:

In general, Nomad clients will never stop a container unless specifically told to do so by a server. You can change this behavior by setting stop_after_client_disconnect, but that seems to be the opposite of what you want.
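Just for reference, stop_after_client_disconnect lives at the group level of the job file; a sketch with illustrative names and value is below. In your case you would simply leave it unset:

```hcl
group "app" {
  # If set, allocations on a disconnected client are stopped after this
  # duration. Leaving it unset (the default) keeps them running locally,
  # which is the behaviour you want here.
  stop_after_client_disconnect = "1h"

  task "web" {
    driver = "docker"

    config {
      image = "nginx:latest"
    }
  }
}
```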

This could be caused by the client missing a heartbeat against the server, so the server thinks the client is gone and reschedules the container somewhere else.

You can give clients additional heartbeat time by configuring heartbeat_grace in the server configuration.

Try setting it to a higher value and see if that helps.
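Something along these lines in the server agent configuration (the "5m" value is just an example to tune against how long your WAN outages last):

```hcl
server {
  enabled = true

  # Extra time the server waits for a client heartbeat before marking
  # the node as down and rescheduling its allocations. The default is
  # 10s, which is easy to exceed over a flaky WAN link.
  heartbeat_grace = "5m"
}
```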

Not with the Docker driver. The idea of having jobs register against the servers is so that there’s a consistent view of your cluster. Leaving containers behind like this would prevent Nomad from doing proper scheduling.


@lgfa29, thank you very much for the help. I have tested your suggestion to configure heartbeat_grace on the server, and that unexpected behaviour is now gone.

I had tested deploying containers via shell scripts with the raw_exec driver, but we would lose the ability to manage the allocations from the server, which is a great feature to have.

Best regards from a Brazilian fellow. :slight_smile:


Glad it worked!

Enjoy Nomad :grinning_face_with_smiling_eyes: