I have a Nomad cluster running and want to deploy a job that holds open TCP connections for long periods of time. When I update the job with a new version, I want the now “old” allocations to receive a signal to shut down (SIGHUP or something like that), so the service knows it can safely exit once all of its TCP connections are gone. I do not want Nomad to kill any of those old allocations, no matter how long it takes for their connections to drain naturally. It could take a week for all connections to drain, and I am OK with that. If the node running the old allocations suddenly shuts down, I would not want Nomad to restart them, only the new ones.
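For reference, here is a minimal sketch of roughly what I have been looking at (job, group, task names, and the image are placeholders). My understanding is that `kill_signal` and `kill_timeout` are the closest knobs, but `kill_timeout` appears to be capped by the client’s `max_kill_timeout`, so it can’t stretch anywhere near a week; please correct me if I’ve misread that.

```hcl
# Minimal sketch; names and image are placeholders.
job "tcp-holder" {
  datacenters = ["dc1"]

  group "server" {
    task "listener" {
      driver = "docker"

      config {
        image = "example/tcp-holder:latest" # placeholder image
      }

      # Ask Nomad to send SIGHUP instead of the default signal when it
      # wants the task to stop, so the service can stop accepting new
      # connections and drain the existing ones.
      kill_signal = "SIGHUP"

      # As far as I can tell this is bounded by the client's
      # max_kill_timeout, so it cannot be anything like "a week".
      kill_timeout = "30s"
    }
  }
}
```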
I could not easily find how to fit an update strategy like this into a job’s “update” configuration. It seems like everything expects concrete time limits on updates, and the new deployment won’t be marked healthy until the old one is fully shut down. Is my understanding accurate, or is there a way to fit the scheme above into the current job specification? If so, I would appreciate a sample. The only alternative I came up with was to create a brand-new job for each deployment and effectively never update a job, relying on templating and external tooling to generate job names and so on. That idea felt somewhat gross, but I’d feel better about it if that were the expected pattern for a workload like this.
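To make that alternative concrete, this is the kind of thing I had in mind: a version suffix stamped into the job name by external templating before `nomad job run`, so every deployment is a brand-new job and nothing is ever “updated”. Old jobs would then be stopped by hand (or by a small cleanup script) once their connections finally drain. All names here are hypothetical.

```hcl
# Hypothetical sketch of the "new job per deployment" idea. The v42
# suffix would be injected by external templating before running
# `nomad job run`; the next deployment would be tcp-holder-v43, etc.
job "tcp-holder-v42" {
  datacenters = ["dc1"]

  group "server" {
    count = 3

    task "listener" {
      driver = "docker"

      config {
        image = "example/tcp-holder:v42" # placeholder image
      }
    }
  }
}
```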
Thanks!