Exact meaning of lifecycle for sidecar (prestart) and main tasks

I'm trying to use the Nomad 0.11 lifecycle stanza to express a task dependency. I have task sidecarA (sidecar type, a long-running process) and main task B, where task B has to wait until sidecarA is fully ready (determined by a service health check?).
My question: when we deploy the job, how do we know that sidecarA is ready so that task B can start? This is different from the non-sidecar case, in which the exit of the prestart task means task B can be started. For a long-running sidecarA, how do we know B can start? I also register sidecarA as a service. Does that mean B will start only once the Consul health check passes? If that is not the case, how do we do that? Thanks.

I have this problem too. I am running into a race condition where the main task is started after the sidecar, and the main task tries to connect to upstreams that the sidecar should make available but that are not yet ready.

This causes the main task to fail and exit, and the whole job dies. I need a way to hold back the main task until the sidecar is truly ready to accept connections, not just “started” by the task driver, but I can’t see a way to do this.

Hi @easyfmxu and @spaulg :wave:

You could use a non-sidecar prestart task to perform any checks necessary.

Here’s an example of a task that waits for a sidecar endpoint to be available:

job "wait-for-sidecar" {
  datacenters = ["dc1"]

  group "wait-for-sidecar" {
    network {
      port "sidecar" {}
    }

    task "sidecar" {
      driver = "docker"

      config {
        image = "traefik/whoami"
        ports = ["sidecar"]
        args  = ["--port", "${NOMAD_PORT_sidecar}"]
      }

      lifecycle {
        hook    = "prestart"
        sidecar = true
      }
    }

    task "wait" {
      driver = "docker"

      config {
        image = "willwill/wait-for-it"
        args  = ["${NOMAD_ADDR_sidecar}", "-t", "0"]
      }

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
    }

    task "main" {
      driver = "docker"

      config {
        image   = "alpine:3.15"
        command = "/bin/sh"
        args    = ["local/script.sh"]
      }

      template {
        data = <<EOF
#!/bin/sh

apk update && apk add curl

while true;
do
  curl http://$NOMAD_ADDR_sidecar
  sleep 5
done
EOF

        destination = "local/script.sh"
      }
    }
  }
}

In the resulting lifecycle topology, the sidecar and wait tasks both start as prestart hooks, and the main task is blocked until wait completes successfully.

Hi

Thanks for the reply and advice.

This is what I have resorted to today. I’m using netcat to test for the open socket of each upstream port. I’m also going to see if I can use exported NOMAD_UPSTREAM_PORT_* environment variables to detect which ports to check dynamically.
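For anyone wanting to try the same approach, here is a rough sketch of what such a prestart script could look like. It is not something Nomad provides: the function names `upstream_ports` and `wait_for_ports` are made up for illustration, the host `127.0.0.1` assumes the upstreams are bound on localhost, and it assumes the `nc` in your image supports `-z`.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Nomad.

# Print the value of every NOMAD_UPSTREAM_PORT_* variable, one port per line.
upstream_ports() {
  env | sed -n 's/^NOMAD_UPSTREAM_PORT_[^=]*=//p' | sort -u
}

# Block until each given port accepts a TCP connection on the given host.
# Assumes an nc that supports -z (connect and close without sending data).
wait_for_ports() {
  host="$1"
  shift
  for port in "$@"; do
    until nc -z "$host" "$port" 2>/dev/null; do
      sleep 1
    done
  done
}

# Assumption: upstreams listen on localhost. With no upstream variables set,
# the loop body never runs and the script exits immediately.
# shellcheck disable=SC2046
wait_for_ports 127.0.0.1 $(upstream_ports)
```

Run as the command of a non-sidecar prestart task, this holds back the main task until every upstream port is open. It loops forever if a port never opens, so you may want to add your own timeout.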

I do think Nomad ought to have an option to do this on its own, though. Otherwise every developer who needs it ends up reinventing the wheel.

It would be worth documenting too. I’ve been developing a commercial proposition that uses Nomad for 18 months and hadn’t realised this race condition exists. I guess I just assumed that Nomad wouldn’t start the main task until the sidecar was ready.

Regards,
Simon

Definitely something we can improve.

Could you open an issue in our GitHub repo as a feature request to have main tasks wait for prestart sidecars?

Thank you!