System job with constraints fails to plan

Nomad v1.2.6 has the problem described below, while Nomad v1.1.5 works as expected.

Possibly related:

Expected

The setup is a set of nodes with ${node.class} = worker, plus a few other nodes in the cluster. All the worker nodes should run the worker task; all other nodes should not. So a job with type = "system" is used, and the following constraint is added to the worker group:

constraint {
  attribute = "${node.class}"
  operator  = "="
  value     = "worker"
}
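
For context, a minimal sketch of a complete job file for this kind of setup could look like the following. Only the job type, the group name "worker", and the constraint come from the description above; the job name, datacenter, task name, driver, and image are placeholders assumed for illustration.

```hcl
# Sketch only: job name, datacenter, task name, driver, and image are
# hypothetical placeholders, not values from the actual cluster.
job "worker" {
  datacenters = ["dc1"]
  type        = "system" # system jobs are placed on every eligible node

  group "worker" {
    # Limit placement to nodes whose node class is "worker".
    constraint {
      attribute = "${node.class}"
      operator  = "="
      value     = "worker"
    }

    task "worker" {
      driver = "docker"

      config {
        image = "example/worker:latest"
      }
    }
  }
}
```

With a file like this, nodes of any other class (such as "entry" in the plan output below) are expected to be filtered out by the constraint.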

Observed

This works sometimes, in particular when there are no allocations on the cluster. But running nomad job plan after allocations are running displays the following warning:

Scheduler dry-run:
- WARNING: Failed to place allocations on all nodes.
  Task Group "worker" (failed to place 1 allocation):
    * Class "entry": 1 nodes excluded by filter
    * Constraint "${node.class} = worker": 1 nodes excluded by filter

This should not be a warning, since the allocations match the job definition once the constraints are taken into account.
nomad job run produces the desired state, and the job state is displayed as “not scheduled” on all non-worker nodes.
Removing the constraint makes the warning go away, but obviously schedules the worker task on non-worker nodes, which is unwanted.


The only workaround seems to be to ignore the warnings, which defeats the purpose of nomad job plan, or to create an entirely separate cluster for the workers.

Hi @lucas, it sounds like this may be a regression. Do you mind submitting all this as a new issue?

Issue submitted: System job with constrains fails to plan · Issue #12748 · hashicorp/nomad · GitHub