Hi,
I have a single Nomad client, which is a macOS machine. I submitted a “batch” job that downloads and runs a binary with some arguments.
Once the batch job finishes, it goes into the completed state. Then my Nomad client (the Mac) goes into screen lock, which puts the client into the “disconnected” state, as shown below. This is because my job sets the following parameter: max_client_disconnect = "1h"
What happens next is that the job gets scheduled again, which I don’t want since it had already completed. We can see that it goes into pending, as seen below.
Once I unlock my Mac again, i.e. the client becomes Ready, the pending job runs successfully. This means the job runs a second time even though it had already completed successfully.
One can confirm this by checking the 2 allocations listed below: one is from 25 mins ago while the other is from just a few seconds ago, and both are completed.
How can I stop this behavior? I am already using the following block for my group:
max_client_disconnect = "1h"
prevent_reschedule_on_lost = true
reschedule {
  attempts = 0
  unlimited = false
}
restart {
  attempts = 0
  mode = "fail"
}
but it seems to have no effect whatsoever.
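For reference, here is a minimal sketch of a batch job with the group settings above in place. The actual job file is not shown in this post, so the job, group, and task names, the raw_exec driver, the artifact URL, the binary name, and the arguments are all hypothetical placeholders. Only the group-level parameters and the reschedule and restart blocks come from the real job.

```hcl
# Sketch only. Everything marked "placeholder" or "assumption" is hypothetical.
job "run-binary" {                 # placeholder job name
  type = "batch"

  group "main" {                   # placeholder group name
    # Group-level settings from the post.
    max_client_disconnect      = "1h"
    prevent_reschedule_on_lost = true

    reschedule {
      attempts  = 0
      unlimited = false
    }

    restart {
      attempts = 0
      mode     = "fail"
    }

    task "run" {                   # placeholder task name
      driver = "raw_exec"          # assumption: the post does not name the driver

      # Downloads the binary into the task's local/ directory.
      artifact {
        source = "https://example.com/my-binary"   # placeholder URL
      }

      config {
        command = "local/my-binary"                # placeholder binary name
        args    = ["--some-flag"]                  # placeholder arguments
      }
    }
  }
}
```

With this layout the expectation is one completed allocation per submission. The behavior described above produces a second one after the client reconnects.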
TL;DR: a batch job that has already completed is run again if the Nomad client goes down (say, screen lock) for a while and then comes back up. I don’t want this.