Exec format error on consul connect

Hello,

Raspberry Pi 4 8GB rev 1.4
Ubuntu 20.04
Nomad v1.0.4 (9294f35f9aa8dbb4acb6e85fa88e3e2534a3e41a)
Consul v1.9.5 Revision 3c1c22679
Installed via the community Ansible role, with Consul Connect enabled.

When I run a Nomad job without any Consul Connect configuration, it works with no problem.

But I want to learn about Consul Connect, so I followed the Consul Connect | Nomad guide by HashiCorp, including the CNI plugins step. I downloaded the arm64 version of the plugins.

But when I add a Consul Connect service to a Nomad job and launch it, I get this error on each connect-proxy-* task:

standard_init_linux.go:219: exec user process caused: exec format error
standard_init_linux.go:219: exec user process caused: exec format error
standard_init_linux.go:219: exec user process caused: exec format error
[...]

The Docker images are:
https://docs.linuxserver.io/images/docker-piwigo
https://docs.linuxserver.io/images/docker-mariadb

Both are available for arm64.

Where did I make a mistake?
Thanks


Hi @fred-gb :wave:

Consul Connect uses a sidecar proxy that Nomad automatically deploys alongside your jobs; those are the connect-proxy-* tasks that you see failing. This sidecar is a proxy called Envoy, which Consul automatically configures to enable the Connect magic sauce.

By default, Nomad will use an official Envoy Docker image that is compatible with the Consul version that you are running. Since you are running Consul 1.9.5, the Envoy version that Nomad would use is 1.16.2, which does have an arm64 image.

Could you check which version of the Envoy image is being used?

You can use the nomad job inspect command and look for the connect-proxy-* tasks to see which images they are using.

Thanks,

I cannot find this information with nomad inspect.

nomad job inspect piwigo.dev.lan | grep image
                            "image": "ghcr.io/linuxserver/piwigo"
                            "image": "ghcr.io/linuxserver/mariadb",
                            "image": "${meta.connect.sidecar_image}",
                            "image": "${meta.connect.sidecar_image}"

But when: docker images

REPOSITORY                             TAG       IMAGE ID       CREATED         SIZE
ghcr.io/linuxserver/piwigo             latest    83fe89c4831b   36 hours ago    301MB
ghcr.io/linuxserver/mariadb            latest    4fc227ba1b8d   11 days ago     343MB
traefik                                latest    db4a19369dcb   2 weeks ago     85.5MB
envoyproxy/envoy                       v1.11.2   72e91d8680d8   19 months ago   150MB
gcr.io/google_containers/pause-arm64   3.1       6cf7c80fe444   3 years ago     525kB

envoy 1.11.2 ?!

It’s a fresh install of Nomad and Consul; I never pulled Envoy myself before.

Do I need to force-pull a newer version?
Or is it a problem with the arm64 deployment?

Thanks

Oh sorry, I thought the version would be rendered there. Thanks for finding another way to check :sweat_smile:

1.11.2 is a bit old and not really supported by Consul, so it’s very strange that it got picked :thinking:

Could you run this command in one of your nodes to see which versions are reported by Consul?

$ curl localhost:8500/v1/agent/self | jq .xDS

To unblock you, can you try to manually specify the image to use? You can do so by setting a meta attribute in your client’s configuration:

client {
  enabled = true
  # ...
  meta {
    "connect.sidecar_image" = "envoyproxy/envoy:v1.16.2"
  }
}

1.11.2 is a bit old, and not really supported by Consul, so it’s very strange that it got picked

The 1.11.2 legacy envoy version is the fallback option when the version of Consul is too old to support that /v1/agent/self xDS response object. Is it possible the job was initially launched using an older version of Consul?

For reference, envoy didn’t have arm64 images until v1.16


Hello,

Thanks @lgfa29 & @shoenig

curl localhost:8500/v1/agent/self | jq .xDS

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  9540    0  9540    0     0  1552k      0 --:--:-- --:--:-- --:--:-- 1552k

null

So, I added this to client.hcl:

  meta {
    "connect.sidecar_image" = "envoyproxy/envoy:v1.16.2"
  }

It works! But…

These are my two service configs in the Nomad job:

    service {
      name = "piwigo-dev-lan"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "piwigo-db-dev-lan"
              local_bind_port  = 3306
            }
          }
        }
      }

      port = "webinterface"

      tags = [
              "traefik.enable=true",
              "traefik.http.routers.piwigo.entrypoints=http",
              "traefik.http.routers.piwigo.rule=Host(`piwigo.dev.lan`)",
              "traefik.http.routers.piwigo.service=piwigo-dev-lan",
      ]

    }

    service {
      name = "piwigo-db-dev-lan"
      port = "database"
      tags = []

      connect {
        sidecar_service {}
      }

    }

In the web install, when I enter localhost or localhost:3306, it’s unable to connect.

In connect-proxy-db… stderr I have this error in loop:

[2021-05-06 06:03:54.488][10][warning][config] [bazel-out/aarch64-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:101] StreamAggregatedResources gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2021-05-06 06:03:55.542][10][warning][config] [bazel-out/aarch64-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:101] StreamAggregatedResources gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2021-05-06 06:04:07.908][10][warning][config] [bazel-out/aarch64-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:101] StreamAggregatedResources gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination

stderr has a lot of information, but I don’t know if I can paste it all here.

Getting a null response would explain why the default Envoy image was used, as @shoenig mentioned.

Could you check if you have Consul Connect enabled?

You would need to have the values listed in the docs set in your Consul config file:

ports {
  grpc = 8502
}

connect {
  enabled = true
}

Thanks!

When installing Consul with the Ansible role, Connect is enabled, but I didn’t set the variable that fixes the gRPC port to 8502; it was set to -1.

I changed it and it works!
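For anyone using the same Ansible role, the change amounts to overriding the role's ports variable. The variable name below is taken from the role's defaults and may differ between role versions, so double-check it against your copy:

```yaml
# group_vars override (assumed variable name from the ansible-consul role):
# open Consul's gRPC port so Envoy sidecars can reach the xDS server
# (the role defaults this port to -1, i.e. disabled)
consul_ports:
  grpc: 8502
```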

curl localhost:8500/v1/agent/self | jq .xDS
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  9684    0  9684    0     0  1351k      0 --:--:-- --:--:-- --:--:-- 1576k
{
  "SupportedProxies": {
    "envoy": [
      "1.16.2",
      "1.15.3",
      "1.14.6",
      "1.13.7"
    ]
  }
}

And in my piwigo container:

netstat -lpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:24359           0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:9000          0.0.0.0:*               LISTEN      312/php-fpm.conf)
tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:22252           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      313/nginx.conf
tcp        0      0 127.0.0.1:19002         0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      313/nginx.conf
tcp        0      0 127.0.0.1:19003         0.0.0.0:*               LISTEN      -

Port 3306 appears!

But I still can’t connect with the web install. From what I can see, I’m not alone in this situation on the forum.

Thanks for your help! I will continue

Nice! You can remove that meta attribute from your Nomad client now if you want; Nomad should retrieve the right image on its own.

:tada:

From your service name, I’m assuming you are trying to run this Piwigo project. I don’t know much about it, but I was able to run the web install using this job:

job "piwigo" {
  datacenters = ["dc1"]

  group "piwigo" {
    task "piwigo" {
      driver = "docker"

      config {
        image = "linuxserver/piwigo:11.4.0-ls112"
        ports = ["web"]
      }
    }

    network {
      mode = "bridge"

      port "web" {
        to = 80
      }
    }

    service {
      name = "piwigo"
      port = "80"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "mysql"
              local_bind_port  = 3306
            }
          }
        }
      }
    }
  }

  group "mysql" {
    task "mysql" {
      driver = "docker"

      config {
        image = "mysql:5.7"
        ports = ["mysql"]
      }

      env {
        MYSQL_ROOT_PASSWORD = "root"
        MYSQL_DATABASE      = "piwigo"
        MYSQL_USER          = "piwigo"
        MYSQL_PASSWORD      = "piwigo"
      }

    }

    network {
      mode = "bridge"

      port "mysql" {
        to = 3306
      }
    }

    service {
      name = "mysql"
      port = "3306"

      connect {
        sidecar_service {}
      }
    }
  }
}

Please note that this was just a test, and you will most likely need to work on it a bit more :slightly_smiling_face:

To connect to the database I had to use 127.0.0.1 instead of localhost. It seems like the PHP client for MySQL treats these two values as different things:

Note :
Whenever you specify “localhost” or “localhost:port” as server, the MySQL client library will override this and try to connect to a local socket (named pipe on Windows). If you want to use TCP/IP, use “127.0.0.1” instead of “localhost”. If the MySQL client library tries to connect to the wrong local socket, you should set the correct path as in your PHP configuration and leave the server field blank.

(source: PHP: mysql_connect - Manual)

Try using 127.0.0.1 instead of localhost and see if that works for you.
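If it still doesn't connect, it can help to rule out the proxy itself by probing the upstream's local bind port from inside the task's network namespace. A sketch of that check (the alloc ID is a placeholder, and this assumes `nc` is available in the image):

```shell
# Check that the Envoy sidecar is listening on the upstream's
# local bind port inside the piwigo allocation
nomad alloc exec -task piwigo <alloc-id> nc -zv 127.0.0.1 3306
```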

Hello!

Thanks again!

Based on your example, I did this:

job "piwigo.dev.lan" {
  region      = "global"
  datacenters = ["dc1"]
  type        = "service"

  group "app" {

    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay = "10s"
      mode = "delay"
    }

    update {
        max_parallel     = 1
        canary           = 1
        min_healthy_time = "10s"
        healthy_deadline = "5m"
        auto_revert      = true
        auto_promote     = true
        health_check     = "checks"
        stagger          = "30s"
    }

    task "piwigo" {
      driver = "docker"

      config {
        image = "ghcr.io/linuxserver/piwigo"
        ports = ["web"]

        volumes = [
          "/data/piwigo.dev.lan/config:/config",
        ]
      }
    }

    network {
      mode = "bridge"

      port "web" {
        to = 80
      }
    }

    service {
      name = "piwigo"
      port = "web"

      tags = [
              "traefik.enable=true",
              "traefik.http.routers.piwigo.entrypoints=http",
              "traefik.http.routers.piwigo.rule=Host(`piwigo.dev.lan`)",
              "traefik.http.routers.piwigo.service=piwigo",
      ]

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "database"
              local_bind_port  = 3306
            }
          }
        }
      }
    }
  }

  group "database" {

    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay = "10s"
      mode = "delay"
    }

    update {
        max_parallel     = 1
        canary           = 1
        min_healthy_time = "10s"
        healthy_deadline = "5m"
        auto_revert      = true
        auto_promote     = true
        health_check     = "checks"
        stagger          = "30s"
    }

    task "mariadb" {
      driver = "docker"

      config {
        image = "ghcr.io/linuxserver/mariadb"
        ports = ["database"]

        volumes = [
          "/data/piwigo.dev.lan/mysql:/config",
        ]
      }

      env {
        MYSQL_ROOT_PASSWORD = "root"
        MYSQL_DATABASE      = "piwigo"
        MYSQL_USER          = "piwigo"
        MYSQL_PASSWORD      = "piwigo"
      }
    }

    network {
      mode = "bridge"
    }

    service {
      name = "database"
      port = "3306"

      connect {
        sidecar_service {}
      }
    }
  }
}

It works! So:

BEER FOR ALL!
:beers: :yum: :fireworks: :tada:

I had some problems building it, because I ran into issues with port labels, like in this thread: [resolved] Port labels not working intuitively

It’s a shame that dynamic port labels can’t be used in the Consul Connect upstream configuration.

Do you have any other good advice about this job file?

Hope this can help someone else.

Thanks again!
Have a great day!

I think you would need to set the address_mode for your service to alloc. Take a look at this answer: Correct way to connect to upstream that uses dynamic ports - #2 by lgfa29
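A minimal sketch of what that could look like for the database group above, assuming you switch the service to a dynamic port label (the label name is illustrative):

```hcl
network {
  mode = "bridge"

  port "db" {
    to = 3306
  }
}

service {
  name         = "database"
  port         = "db"      # dynamic port label instead of a fixed number
  address_mode = "alloc"   # advertise the allocation's address and port

  connect {
    sidecar_service {}
  }
}
```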

Nothing in particular, it’s looking good :grinning_face_with_smiling_eyes:

Just make sure you are properly persisting the MySQL data (check these guides on stateful workloads) and be careful with storing passwords in your job file.
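For the MariaDB data, one common approach is a host volume. A minimal sketch, with the volume name and path being illustrative:

```hcl
# Nomad client config (on the node): register a host volume
client {
  host_volume "piwigo-db" {
    path      = "/data/piwigo.dev.lan/mysql"
    read_only = false
  }
}
```

The database group would then declare a matching `volume` block with `type = "host"`, and the task a `volume_mount` with `destination = "/config"`, replacing the bind-mount entry in the Docker `volumes` list.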