RecoverTask for custom Driver

Hi, I feel like the documentation for the RecoverTask interface method on the Nomad docs site is pretty scarce.

For example, I saw in hashicorp/nomad issue #10449 ("When Nomad is restarted, the successful Job will also perform the Recover task") that sometimes RecoverTask is not called when a task is rescheduled.

The RecoverTask operation is how the Nomad client syncs its local state store with the actual state of the tasks its task drivers are running. For example, with a Docker task, the client has the Docker container ID and asks dockerd for a handle to the container with that ID.
Normally, when the Nomad client stops, its tasks keep running. After some time, the Nomad server will declare those tasks "lost" and reschedule them.

Can we override this behavior with a custom reschedule stanza? What happens if the reschedule is set to never reschedule: are the tasks still marked as lost or will the server wait indefinitely for the client to come back online to recover the task?

Do you guys have some overview diagram of the whole lifecycle of a task and which factors influence it? I feel like I can't really test my custom Nomad driver, since I don't know how calls to the driver are made based on the lifecycle sync between server and client.

My custom nomad driver has really sticky jobs that can’t be moved that easily to another client. However I want to provide failure recovery in case the client crashes.

Would really appreciate if someone could provide more information on that topic.

Hi @gtestault! There’s kind of a few different questions here, so I’ll try to break it down into parts.

Recover Task

The docs for this are definitely not great, because they're missing the fact that the RecoverTask operation happens on startup of the client node. So the order of operations is:

  • Client is running
  • Client receives a new allocation
    • Client starts up an alloc_runner
    • Client starts a docker container (or whatever)
    • Client persists the state of the alloc_runner to its local state store.
  • Client is shut down (ex. for upgrade)
    • This shuts down the alloc_runner in the agent
    • The docker container is left running!
  • Client restarts
    • Client calls RecoverTask on all the tasks in the state store so that it can reattach an alloc_runner.
    • Client checks in with server and possibly stops those allocs because they were marked “lost” and were rescheduled by the server while the client was shut down.

Lost Allocs Handling

Can we override this behavior with a custom reschedule stanza? What happens if the reschedule is set to never reschedule: are the tasks still marked as lost or will the server wait indefinitely for the client to come back online to recover the task?

You can’t quite set reschedule to never reschedule because zero values will give you the defaults. As a workaround you could probably set jobspec metadata that prevents the allocation from being scheduled on any other node, but that’s admittedly not a good workaround.
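For reference, this is roughly what such an attempt would look like in a jobspec (a sketch only; as noted above, the zero values here are interpreted as "use the default" rather than "disable rescheduling"):

```hcl
job "sticky-example" {
  group "workload" {
    # Intent: never reschedule. In practice these zero values
    # fall back to the defaults instead of disabling rescheduling.
    reschedule {
      attempts  = 0
      unlimited = false
    }
  }
}
```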

My custom nomad driver has really sticky jobs that can’t be moved that easily to another client. However I want to provide failure recovery in case the client crashes.

Stepping back a moment, there’s no way for the server to determine whether a client has crashed or is simply unreachable. So the reschedule behavior of the scheduler on the server is intentionally split from the client behavior for RecoverTask.

That being said, @DerekStrickland is in the middle of a new project to change how lost allocs are handled by default, which might include giving us an option to never reschedule, just as you're asking. Pinging him here so that he can pop in and ask any questions he might have about your use case.

Thanks for all the info! The RecoverTask interface is much clearer for me now.