Service Mesh envoy_bootstrap

Hello, :wave:

Today I am running:

Nomad v1.7.6
Consul v1.18.0
Vault v1.15.6

I am trying the HashiStack with workload identity once again. I still hope to solve this one day: Permission denied except example mongo job

Meanwhile, I am trying another test job, a Mosquitto stack like this:

job "mosquitto-stack" {
  region = "global"
  datacenters = ["dc1"]
  type = "service"
  node_pool = "all"

  group "mosquitto-server" {

    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay = "10s"
      mode = "delay"
    }

    network {
      mode = "bridge"

      port "mqtt" {
        to     = 1883
        static = 1883
      }
    }

    service {
      name = "mqtt"
      port = "1883"

      connect {
        sidecar_service {}

        sidecar_task {
          resources {
            cpu    = 64
            memory = 64
          }
        }
      }
    }

    task "server" {
      driver = "docker"

      config {
        image = "eclipse-mosquitto:latest"

        mount {
          type = "bind"
          target = "/mosquitto/config/mosquitto.conf"
          source = "local/mosquitto.conf"
          readonly = false
          bind_options {
            propagation = "rshared"
          }
        }

        ports = ["mqtt"]
      }

      template {
        data = <<EOH
listener 1883
allow_anonymous true
EOH
        destination = "local/mosquitto.conf"
      }

      template {
        data = <<EOH
ANSIBLE_FORCE_COLOR=TRUE

EOH
        destination = "secrets/file.env"
        env         = true
      }

      resources {
        cpu    = 128
        memory = 128
      }
    }
  }

  group "mosquitto-client" {

    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay = "10s"
      mode = "delay"
    }

    network {
      mode = "bridge"
    }

    service {
      name = "mesh"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "mqtt"
              local_bind_port  = "1883"
            }
          }
        }
        sidecar_task {
          resources {
            cpu    = 64
            memory = 64
          }
        }
      }
    }

    task "client" {
      driver = "docker"

      config {
        image = "alpine:latest"
        entrypoint = ["/bin/sleep", "3600"]
      }

      resources {
        cpu    = 128
        memory = 128
      }
    }
  }
}

And a new error! :partying_face:

Task hook failed: envoy_bootstrap: error creating bootstrap configuration for Connect proxy sidecar: exit status 1; see: <https://developer.hashicorp.com/nomad/s/envoy-bootstrap-error>

Here are the logs from around that time; I don’t know if they are relevant:

Mar 29 14:14:53 sandbox nomad[47832]:     2024-03-29T14:14:53.723+0100 [DEBUG] nomad: memberlist: Stream connection from=127.0.0.1:33840
Mar 29 14:14:55 sandbox consul[22198]: 2024-03-29T14:14:55.222+0100 [WARN]  agent: Check TCP connection failed: check=_nomad-check-55f807ac82d50afb200485e0caecc37d48aae135 error="dial tcp XXX.XXX.XXX.XXX(BIG SECRET):443: connect: connection refused"
Mar 29 14:14:55 sandbox consul[22198]: 2024-03-29T14:14:55.222+0100 [WARN]  agent: Check is now critical: check=_nomad-check-55f807ac82d50afb200485e0caecc37d48aae135
Mar 29 14:14:59 sandbox kernel: [269826.908380] [UFW BLOCK] IN=nomad OUT=nomad PHYSIN=vethb13ea72f PHYSOUT=veth10cde3e5 MAC=33:33:00:00:00:02:da:c3:b3:e2:a6:da:86:dd SRC=fe80:0000:0000:0000:d8c3:b3ff:fee2:a6da DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255 FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
Mar 29 14:15:01 sandbox consul[22198]: 2024-03-29T14:15:01.656+0100 [ERROR] agent.http: Request error: method=PUT url=/v1/agent/service/deregister/_nomad-task-f738180a-8964-0d63-5395-4949dee4a568-group-traefik-traefik-metrics-traefik_metrics from=127.0.0.1:47672 error="ACL not found"
Mar 29 14:15:02 sandbox nomad[47832]:     2024-03-29T14:15:02.390+0100 [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=07124209-2c68-4d2a-0f74-46ef717d12e4 task=connect-proxy-mesh type="Task hook failed" msg="envoy_bootstrap: error creating bootstrap configuration for Connect proxy sidecar: exit status 1; see: <https://developer.hashicorp.com/nomad/s/envoy-bootstrap-error>" failed=false
Mar 29 14:15:02 sandbox nomad[47832]:     2024-03-29T14:15:02.394+0100 [ERROR] client.alloc_runner.task_runner: prestart failed: alloc_id=07124209-2c68-4d2a-0f74-46ef717d12e4 task=connect-proxy-mesh error="prestart hook \"envoy_bootstrap\" failed: error creating bootstrap configuration for Connect proxy sidecar: exit status 1; see: <https://developer.hashicorp.com/nomad/s/envoy-bootstrap-error>"
Mar 29 14:15:02 sandbox nomad[47832]:     2024-03-29T14:15:02.395+0100 [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=07124209-2c68-4d2a-0f74-46ef717d12e4 task=connect-proxy-mesh reason="Restart within policy" delay=11.00834248s
Mar 29 14:15:02 sandbox nomad[47832]:     2024-03-29T14:15:02.395+0100 [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=07124209-2c68-4d2a-0f74-46ef717d12e4 task=connect-proxy-mesh type=Restarting msg="Task restarting in 11.00834248s" failed=false
Mar 29 14:15:02 sandbox nomad[47832]:     2024-03-29T14:15:02.470+0100 [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=c8706d23-ce39-0e0e-1e92-4c72873f72d9 task=connect-proxy-mqtt type="Task hook failed" msg="envoy_bootstrap: error creating bootstrap configuration for Connect proxy sidecar: exit status 1; see: <https://developer.hashicorp.com/nomad/s/envoy-bootstrap-error>" failed=false
Mar 29 14:15:02 sandbox nomad[47832]:     2024-03-29T14:15:02.471+0100 [ERROR] client.alloc_runner.task_runner: prestart failed: alloc_id=c8706d23-ce39-0e0e-1e92-4c72873f72d9 task=connect-proxy-mqtt error="prestart hook \"envoy_bootstrap\" failed: error creating bootstrap configuration for Connect proxy sidecar: exit status 1; see: <https://developer.hashicorp.com/nomad/s/envoy-bootstrap-error>"
Mar 29 14:15:02 sandbox nomad[47832]:     2024-03-29T14:15:02.471+0100 [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=c8706d23-ce39-0e0e-1e92-4c72873f72d9 task=connect-proxy-mqtt reason="Restart within policy" delay=11.00834248s

Help… :disappointed_relieved: :sob:

I have the same problem: jobs randomly start failing with the same error message: Task hook failed: envoy_bootstrap: error creating bootstrap configuration for Connect proxy sidecar: exit status 1; see: <https://developer.hashicorp.com/nomad/s/envoy-bootstrap-error>

Bumping this, as I am seeing the same problem. It seems to be very intermittent, and I am unsure of the cause. A “sometimes fix” for me is to migrate the failing workloads to a different Nomad client.
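
For anyone who wants to try that “different client” workaround from the job file itself, here is a rough sketch using a constraint. It is only an illustration: the node name "failing-client" is a placeholder I made up, and it only steers placement away from one node. It does not fix whatever makes envoy_bootstrap fail.

```hcl
# Hypothetical workaround sketch, not a fix for the envoy_bootstrap error.
# Place this block at the job level (or inside a single group) to keep
# allocations off one misbehaving Nomad client.
constraint {
  # Name of the Nomad client node as reported by the scheduler.
  attribute = "${node.unique.name}"
  operator  = "!="
  # "failing-client" is a placeholder; replace it with the affected node's name.
  value     = "failing-client"
}
```

After re-running the job with this constraint, the scheduler should place the affected groups on any other eligible client, assuming one has enough free resources.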