Nomad jobs with the raw_exec driver and Consul service discovery

Dear hashicorp community,

I am learning Nomad. I have a Nomad + Consul cluster running on a few machines, and I can run Nomad jobs.

I am deploying a clustered application on Nomad that consists of one server and multiple clients. Each client needs the server's hostname to start.

My Nomad jobs use the raw_exec driver.

The Consul service discovery examples I have seen all seem to be aligned with HTTP API requests; in my case, all communication between the clients and the server is plain TCP (not HTTP).

I would like to ask:

Would Consul service discovery work for TCP-only services? If yes, could you also please point me to the right documentation?

thank you very much

Could you share an example?

To use service discovery you have to declare the service in a service stanza, but I guess you already know that.

job "docs" {
  group "example" {
    task "server" {
        service {
          name = "YOUR-SERVICE"
          port = "YOUR-PORT" 
        }
    }
  }
}

This registers a service that you can reference dynamically from other jobs to get its IP and port. And to answer your question: Consul service discovery is protocol-agnostic. A registered service is just an address and a port, so it works the same for TCP-only services, and the health check can be of type tcp.
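As a rough sketch of the consuming side (assuming the service above is registered as YOUR-SERVICE with the Consul provider; the job name, file paths, and client binary here are hypothetical), a template stanza can render the discovered address into environment variables for the task:

job "consumer" {
  group "example" {
    task "client" {
      driver = "raw_exec"

      # Ask Consul for healthy instances of YOUR-SERVICE and expose the
      # first one's address and port to the task as environment variables.
      template {
        data        = <<EOF
{{- with service "YOUR-SERVICE" }}{{- with index . 0 }}
SERVER_ADDR={{ .Address }}
SERVER_PORT={{ .Port }}
{{- end }}{{- end }}
EOF
        destination = "local/server.env"
        env         = true
      }

      config {
        command = "bash"
        # hypothetical client binary; bash inherits the rendered variables
        args    = ["-c", "exec /path/to/your-client --server \"$SERVER_ADDR:$SERVER_PORT\""]
      }
    }
  }
}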

Yes, sure

Nomad job managing the lifecycle of a client application

job "slurm-cn" {
  priority    = 95 # priorities range 1-100; higher values take precedence
  datacenters = ["${var.datacenter}"]
  type        = "system" 
  group "slurmd-cn" {

    # Schedule each allocation on a different node.
    constraint {
      operator = "distinct_hosts"
      value    = "true"
    }

    task "setup" {
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
      driver = "raw_exec"
      user   = "root"
      config {
        command = "bash"
        args    = ["-c", "zypper install --no-confirm --no-recommends slurm=22.05.6-1 slurm-slurmd=22.05.6-1 slurm-perlapi=22.05.6-1 slurm-libpmi=22.05.6-1 slurm-devel=22.05.6-1 mpich mpich-devel pmix pmix-devel pmix-headers libpmix2 rpm-build && mkdir -p /etc/slurm && chown -R root:root /etc/slurm && chmod -R 0755 /etc/slurm && mkdir -p /var/spool/slurmd"]
      }
    }

    task "slurmd" {
      driver = "raw_exec"
      user   = "root"
      config {
        command = "/usr/sbin/slurmd"
        args    = ["-D", "-Z", "--conf-server", "${var.slurm-ctld-host}", "--conf", "Feature=compute"]
      }
      service {
        port = "slurmd"
        provider = "consul"
        check {
          type = "tcp"
          port = "slurmd"
          interval = "5s"
          timeout = "2s"
        }
        check_restart {
          limit = 3
          grace = "90s"
          ignore_warnings = false
        }
      }
    }

    task "remove" {
      lifecycle {
        hook    = "poststop"
        sidecar = false
      }
      driver = "raw_exec"
      user   = "root"
      config {
        command = "bash"
        args    = ["-c", "zypper remove --no-confirm slurm=22.05.6-1 slurm-slurmd=22.05.6-1 slurm-perlapi=22.05.6-1 slurm-libpmi=22.05.6-1 slurm-devel=22.05.6-1 mpich mpich-devel pmix pmix-devel pmix-headers libpmix2 rpm-build ; rm -rf /etc/slurm && rm -rf /var/spool/slurmd ; "]
      }
    }
    
    network {
      port "slurmd" {
        static = 6818 # slurmd's standard TCP port, bound statically on the host
      }
    }
  }
}

Nomad job managing the lifecycle of the server application

job "slurm-ctl" {
  priority    = 95 # priorities range 1-100; higher values take precedence
  datacenters = ["${var.datacenter}"]
  type        = "service"
  group "slurm-ctl" {

    task "setup" {
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
      driver = "raw_exec"
      user   = "root"
      config {
        command = "bash"
        args    = ["-c", "zypper install --no-confirm --no-recommends slurm=22.05.6-1 slurm-slurmctld=22.05.6-1 slurm-perlapi=22.05.6-1 slurm-libpmi=22.05.6-1 slurm-devel=22.05.6-1 mpich mpich-devel pmix pmix-devel pmix-headers libpmix2 && mkdir -p /etc/slurm && curl -o /etc/slurm/slurm.conf ${var.jfrog_repo_url}/${var.jfrog_repo_slurm_conf_file_name} && curl -o /etc/slurm/cgroup.conf ${var.jfrog_repo_url}/cgroup.conf && curl -o /etc/slurm/plugstack.conf ${var.jfrog_repo_url}/plugstack.conf"]
      }
    }

    task "slurmctld" {
      driver = "raw_exec"
      user = "root"
      config {
        command = "/usr/sbin/slurmctld"
        args = ["-D"]
      }
      service {
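        # name is omitted, so Nomad registers this service under the default
        # "<job>-<group>-<task>" name; an explicit name (e.g. name = "slurmctld")
        # is easier to look up from other jobs.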
        port = "slurmctl"
        provider = "consul"
        check {
          type = "tcp"
          port = "slurmctl"
          interval = "5s"
          timeout = "2s"
        }
        check_restart {
          limit = 3
          grace = "90s"
          ignore_warnings = false
        }
      }
    }

    task "remove" {
      lifecycle {
        hook    = "poststop"
        sidecar = false
      }
      driver = "raw_exec"
      user   = "root"
      config {
        command = "bash"
        args    = ["-c", "zypper remove --no-confirm slurm=22.05.6-1 slurm-slurmctld=22.05.6-1 slurm-perlapi=22.05.6-1 slurm-libpmi=22.05.6-1 slurm-devel=22.05.6-1 mpich mpich-devel pmix pmix-devel pmix-headers libpmix2 ; rm -rf /etc/slurm ; "]
      }
    }

    network {
      port "slurmctl" {
        static = 6817 # slurmctld's standard TCP port, bound statically on the host
      }
    }
  }

  constraint {
    attribute = "${attr.unique.hostname}"
    value = "${var.slurm-ctld-host}"
  }
}

The clients need to reach the server on ${var.slurm-ctld-host}:6817, where ${var.slurm-ctld-host} is just a hostname.
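If the slurmctld service were given an explicit name (see the comment in its service block above), the clients would not strictly need the hostname variable at all: the slurmd task could discover the controller through Consul with a template stanza. A rough sketch, assuming the server's service is registered as "slurmctld":

    task "slurmd" {
      driver = "raw_exec"
      user   = "root"

      # Render the controller's address from Consul instead of relying on
      # the ${var.slurm-ctld-host} variable.
      template {
        data        = <<EOF
{{- with service "slurmctld" }}{{- with index . 0 }}
SLURMCTLD_ADDR={{ .Address }}:{{ .Port }}
{{- end }}{{- end }}
EOF
        destination = "local/slurmctld.env"
        env         = true
        change_mode = "restart" # restart slurmd if the controller moves
      }

      config {
        command = "bash"
        # bash inherits SLURMCTLD_ADDR from the rendered template
        args    = ["-c", "exec /usr/sbin/slurmd -D -Z --conf-server \"$SLURMCTLD_ADDR\" --conf Feature=compute"]
      }

      # service / check / check_restart blocks unchanged from the job above
    }

Since slurmd accepts host:port in --conf-server, the static 6817 port would not have to be hard-coded on the client side either.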

Please note that these jobs use the raw_exec driver.

thank you