I’ve read TFM and a thread that’s similar to my question, but it’s not quite the same.
```hcl
job "batch" {
  datacenters = ["dc1"]
  type        = "batch"

  parameterized {
    ...
  }

  group "batch_group" {
    count = 1

    task "batch_task" {
      ...
    }
  }
}
```
Each task consists of a shell script that takes an input parameter. If I set `spread` at the job level, there is only one `batch_group`, so Nomad has nothing to spread. I have two clients, but with `count = 1` every task submitted to this job ends up on the same Nomad client.
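The job-level `spread` I tried looks roughly like this (a sketch; the attribute and weight follow the usual pattern for spreading over distinct nodes):

```hcl
spread {
  # Spread allocations across distinct client nodes.
  attribute = "${node.unique.id}"
  weight    = 100
}
```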
If I set `group.count = 2` and put `spread` at that level, every dispatch results in two groups, one on each client; but because each (parameterized) dispatch carries just one task, one of the `batch_group`s ends up without any work. That doesn't bother me much in itself, but it does allocate resources unnecessarily and pollutes the logs.
I thought about creating two batch jobs, each with one batch group, and forcing each group onto a different client, but that would make it hard to split submissions evenly, and I'd have to change my scripts as well (to divide tasks into two "queues").
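For reference, forcing each of those two jobs onto its own client would look something like this (a sketch; `client-1` is a hypothetical node name, not one from my cluster):

```hcl
constraint {
  # Pin this job's group to one specific client node.
  attribute = "${node.unique.name}"
  value     = "client-1"
}
```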
Is there a better way to spread parameterized tasks around the cluster, and how? I've been wondering whether I should try `group.count = 2` with two tasks per group, but that would also mean rewriting my script to fetch two parameters at once.