Hi.
We have 6 nodes, all running Nomad 1.3.1 in client mode. All of these nodes are eligible to run a particular system job, and yesterday all 6 were running the job as expected. Overnight, two of these nodes ran out of disk space and went down. I’ve since fixed this problem (both nodes now have ~90% disk space free), but Nomad isn’t recreating the failed system job allocations. If I go into the UI and look at the topology, Nomad sees these two clients as empty, and the system job only has 4 allocations running. So all of this state is correct, but why isn’t Nomad rescheduling the system job onto the two recovered nodes?