Dependency/order of groups

Hi,
I’m trying to use Nomad to perform batch processing jobs. In these kinds of jobs, there is a “process” phase, where you would run 10-100 indexes on a set of data, followed by a “reduce” phase, where you aggregate the data and summarize it.

Ideally, I would use groups for that. I’d declare a process group (with count=n) and a single reduce group (with count=1), which would summarize the calculation.

The problem is that I cannot enforce an order/dependency so that the reduce group always runs after the process group.

Is there a way to do it, or is there another composition I can use to make it work?

Hi @yfarmad, Nomad itself does not do ordered jobs/groups. For these use cases I think it’s common to use something like Apache Airflow to issue parameterized jobs.

Thank you @shoenig for the tip.
I looked a bit at Airflow, but my main goal is to distribute batch processing among 10-100 nodes, and it seems that Airflow is less about that (correct me if I’m wrong).

It seems Airflow is more of a sequential data flow. Though Airflow does support parallelism, it’s less accessible. I cannot just mark a task with count=25 to make it run distributed as easily as Nomad does.

Hello! @yfarmad, you are right that Airflow is more for directed acyclic graph ordering of processes. What you are looking for is two sequential phases (map -> reduce).

You can do phase ordering within a task group with the task lifecycle stanza. Add the stanza lifecycle { hook = "prestart" } to the map task, and it will run before the reduce task. All the prestart tasks will run and complete before the main tasks (any task without a lifecycle stanza) are started.
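To make the suggestion above concrete, here is a minimal sketch of a batch job with a prestart map task and a main reduce task. The job, group, and task names, the datacenter, the exec driver, and the script paths are all placeholders I made up for illustration; only the lifecycle stanza itself comes from the advice above.

```hcl
# Sketch only: names, driver, and commands are hypothetical placeholders.
job "map-reduce" {
  datacenters = ["dc1"]
  type        = "batch"

  group "pipeline" {
    # The prestart hook makes this task run, and finish, before any main task.
    task "map" {
      driver = "exec"

      lifecycle {
        hook    = "prestart"
        sidecar = false # run to completion rather than alongside the main task
      }

      config {
        command = "/usr/local/bin/map.sh" # hypothetical script
      }
    }

    # No lifecycle stanza, so this is a main task and starts after "map" completes.
    task "reduce" {
      driver = "exec"

      config {
        command = "/usr/local/bin/reduce.sh" # hypothetical script
      }
    }
  }
}
```

One thing to keep in mind: count is set on the group, and the lifecycle ordering applies within each allocation of that group. So with count > 1, every allocation would run its own map task followed by its own reduce task, which may or may not fit a layout of n process tasks feeding a single reduce.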

Quick question about the architecture of the data: are the map & reduce steps downloading data to the task group allocation before processing? In that case, it would make more sense to combine the map & reduce steps into a single task group so they share a filesystem on the same node. Then you don’t have to send data over the network between the map & reduce steps.
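As a rough illustration of the shared filesystem point, tasks in the same group can exchange files through the allocation's shared directory, which Nomad exposes to each task as NOMAD_ALLOC_DIR. The fragment below is only a sketch: the group and task names, the exec driver, the shell commands, and the file names are assumptions for the example, not something taken from the job being discussed.

```hcl
# Fragment of a job file; names, commands, and file paths are hypothetical.
group "pipeline" {
  task "map" {
    driver = "exec"

    lifecycle {
      hook = "prestart"
    }

    config {
      command = "/bin/sh"
      # Write a partial result into the allocation's shared data directory.
      args = ["-c", "echo partial-result > ${NOMAD_ALLOC_DIR}/data/part-0"]
    }
  }

  task "reduce" {
    driver = "exec"

    config {
      command = "/bin/sh"
      # Read what the map task left behind, on the same node, without a network hop.
      args = ["-c", "cat ${NOMAD_ALLOC_DIR}/data/part-*"]
    }
  }
}
```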

Wow, that’s exactly the answer I was waiting for.
I can’t tell you how much I appreciate your help.

And regarding the data transfer, it’s different from what you wrote, but it’s good to know anyway.

10x