How to loop across tasks in a Nomad group?

Metadata
  • Nomad v1.7.2
  • OS: Windows
  • Scheduler: service
  • Driver: exec / raw_exec

Architecture

I have 10 Windows servers running Nomad client nodes.

I want to run 3 instances of a stateful service using Nomad. Each instance (01, 02, 03) writes checkpoints and data to the filesystem (e.g. /tmp/instance01, /tmp/instance02, /tmp/instance03). When an instance restarts, it continues from the latest checkpoint. Each instance can be allocated to any host. However, a replacement instance must be configured to use the same directory as the failed instance it replaces.

So basically:

  • 01 → /tmp/instance01
  • 02 → /tmp/instance02
  • 03 → /tmp/instance03

For simplicity, assume these 3 directories are created on a NAS, and the NAS is mounted on all servers running a Nomad client node. Also assume all groups / tasks have RW access to these 3 directories.

Issue

There are a few ways I can configure the directory that the service uses to read/write state data:

  • Command Line Argument
  • Configuration File, via environment variable
  • Template block, via Nomad template interpolation
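
To make the three options concrete, here is a minimal sketch of a single task that sets the directory in all three ways with a fixed value. The environment variable name APP_WORKING_DIR and the working_directory config key are hypothetical, and the inline template stands in for the real config.yaml.tpl; the command and flags are taken from the job spec at the end of this post.

```hcl
task "app-write" {
  driver = "exec"

  # Option 2: environment variable that the app or its config file reads.
  # APP_WORKING_DIR is a hypothetical name.
  env {
    APP_WORKING_DIR = "/tmp/app01"
  }

  config {
    command = "/usr/bin/app"
    args = [
      "-config.file=local/app/config.yaml",
      # Option 1: command-line argument.
      "-working-directory=/tmp/app01",
    ]
  }

  # Option 3: value rendered into the config file by the template block.
  # working_directory is a hypothetical config key.
  template {
    data        = <<EOT
working_directory: {{ env "APP_WORKING_DIR" }}
EOT
    destination = "local/app/config.yaml"
    change_mode = "restart"
  }
}
```

In every variant the value is written once in the job spec, so all allocations created from count = 3 see the same string.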

How can I pass a different value to the same task running in different instances of a group?

i.e. how do I give each instance a unique and consistent tag, so that it can be reliably identified?

Use Case:
  • Grafana Loki Read-Node and Backend-Node write checkpoints and data to the filesystem before pushing to downstream. Each node maintains its own checkpoint.
  • OpenTelemetry writes checkpoints and data to the filesystem when memory is full and exporting to downstream has failed. Each node maintains its own checkpoint.

I suspect many other applications store per-instance state on disk, where the state store is unique to each instance and is used to recover work for that instance.

What I’ve considered
  • Parameterized Block → Does not work for Service jobs
  • Template Block → Every task instance will receive the same data
  • Env Var → Every task instance will receive the same data
  • Meta Block → Every task instance will receive the same data
  • Variable Block → Every task instance will receive the same data
  • Dynamic Block → Possible solution, but this is essentially repeating Group Block
  • Repeating Group Block → Trying to avoid this
  • Multiple Job Spec → Trying to avoid this
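
For reference, the Dynamic Block option from the list above could look roughly like the sketch below. It assumes HCL2 dynamic blocks with labels are accepted for group; the variable name instances and the group naming scheme are made up for illustration, and the volumes, network, and services from the full job are left out.

```hcl
# Hypothetical variable listing the instance suffixes.
variable "instances" {
  type    = list(string)
  default = ["01", "02", "03"]
}

job "app-write" {
  datacenters = ["dc1"]
  type        = "service"

  # Generates one group per suffix: app-write-01, app-write-02, app-write-03.
  dynamic "group" {
    for_each = var.instances
    labels   = ["app-write-${group.value}"]

    content {
      count = 1

      task "app-write" {
        driver = "exec"

        config {
          command = "/usr/bin/app"
          args = [
            "-config.file=local/app/config.yaml",
            # The suffix is fixed per generated group.
            "-working-directory=/tmp/app${group.value}",
          ]
        }
      }
    }
  }
}
```

This avoids copy-pasting the group, but the rendered job still contains three separate groups with count = 1 each, which is why it is listed as essentially repeating the Group Block.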

It seems like it’s possible to loop within a Nomad task, but not across tasks.

Any solution appreciated, even hacky ones. TIA!

job "app-write" {
  datacenters = ["dc1"]
  type = "service"
  node_pool = "default"

  # Write
  group "app-write" {
    count = 3
    
    # /tmp/instance01
    volume "app01" {
      type = "host"
      read_only = false
      source = "tmpapp01"
    }

    # /tmp/instance02
    volume "app02" {
      type = "host"
      read_only = false
      source = "tmpapp02"
    }

    # /tmp/instance03
    volume "app03" {
      type = "host"
      read_only = false
      source = "tmpapp03"
    }
    
    network {
      port "http" { }  // 3100
      port "grpc" { } // 9095
      port "gossip" { }  // 7946
      # port "lb" { static = 8080 }
    }

    service {
      name = "app-write"
      address_mode = "host"
      port = "http"
      tags = ["http"]
      provider = "nomad"
    }

    service {
      name = "app-write"
      address_mode = "host"
      port = "grpc"
      tags = ["grpc"]
      provider = "nomad"
    }

    service {
      name = "app-write"
      address_mode = "host"
      port = "gossip"
      tags = ["gossip"]
      provider = "nomad"
    }

    task "app-write" {
      driver = "exec" # or "raw_exec"
      
      volume_mount {
        volume = "app01"
        destination = "/tmp/app01"
        read_only = false
      }

      volume_mount {
        volume = "app02"
        destination = "/tmp/app02"
        read_only = false
      }

      volume_mount {
        volume = "app03"
        destination = "/tmp/app03"
        read_only = false
      }

      config {
        command = "/usr/bin/app"
        args = [
          "-config.file=local/app/config.yaml",
          "-working-directory=/tmp/app01" # <-- Need this to change for each instance
        ]
      }

      resources {
        cpu = 100
        memory = 128
      }

      # Can change this for each instance too
      template {
        source = "/etc/app/config.yaml.tpl"
        destination = "local/app/config.yaml"
        change_mode = "restart"
      }
    }
  }
}