Nomad v0.12.9 and Consul 1.9.0 service mesh: Envoy 1.11.2 is too old and is not supported by Consul

Hello,

After successfully testing the Traefik load-balancing configuration (https://learn.hashicorp.com/tutorials/nomad/load-balancing-traefik), in which Traefik retrieves instance information using the Consul provider,

I decided to try the service mesh configuration (https://learn.hashicorp.com/tutorials/nomad/consul-service-mesh#run-a-connect-enabled-job). Following the tutorial, I discovered that I can’t reach the services that Nomad registered in Consul, because some Consul health checks were failing:

I also do not see any port listeners for the instances (ip:port) that are registered in Consul. Further investigation led me to the envoyproxy container log, which contains the following:

[warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 3, Envoy 1.11.2 is too old and is not supported by Consul

I'm not sure whether that’s the root cause of my problem, but based on the image version used, it seems to be triggered here:

Further investigation shows that the Envoy version used is not supported by the latest (1.9) Consul:

Can this “warning” be the reason for the failing connections?
Can I work around this somehow, or do I need to downgrade to Consul 1.8.6 for now?

Thanks.

Yes, we ran into the same issue. You can add a connect.sidecar_image value to the client meta to force a specific Envoy version. Nomad 1.0.0 uses the supported proxies API in Consul 1.8.6 and later to choose this dynamically, but in earlier versions we have to set it manually.

    client {
      enabled = true
      meta = {
        # Pin the Envoy image used for Connect sidecars on this client.
        "connect.sidecar_image" = "envoyproxy/envoy:v1.14.5"
      }
    }

Indeed, the upcoming Nomad v1.0.0 release will significantly improve the compatibility story between Nomad, Consul, and Envoy by automatically choosing the latest version of Envoy supported by Consul at runtime. You can read more about the change in the 1.0 upgrade notes.

In the interim, you could do one of a few things:

  • Set the sidecar image manually as suggested by @mcfarlandj
  • Downgrade Consul to a version prior to 1.9
  • Upgrade to Nomad 1.0.0-beta3
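The first option can also be scoped to a single job instead of a whole client. The fragment below is only a sketch, assuming the countdash job layout from the tutorial: the job, group, service, and task names and the application image are taken from that example and may differ in your setup, and the Envoy tag is simply the one suggested earlier in this thread. It uses the `sidecar_task` block inside `connect` to override the image of the injected sidecar.

```hcl
job "countdash" {
  datacenters = ["dc1"]

  group "api" {
    network {
      mode = "bridge"
    }

    service {
      name = "count-api"
      port = "9001"

      connect {
        sidecar_service {}

        # Override only the image of the Envoy sidecar task for this service.
        # The tag is an assumption: pick one your Consul version supports.
        sidecar_task {
          config {
            image = "envoyproxy/envoy:v1.14.5"
          }
        }
      }
    }

    task "web" {
      driver = "docker"

      config {
        image = "hashicorpnomad/counter-api:v1"
      }
    }
  }
}
```

A per-job override like this can be handy when you cannot change the client configuration, but the image then has to be kept up to date in every Connect job, whereas the client meta value applies to all sidecars scheduled on that node.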

Hello, and many thanks @mcfarlandj and @shoenig.

Setting the sidecar Envoy version resolved the Consul health checks and listeners,
and I was able to connect the dashboard (UI) and the API (backend) from the countdash example. One issue so far is that sometimes, when you reload the page, the UI disconnects from the backend and it may take 5-7 seconds to reconnect. I'm not sure whether that’s specific to the example or a problem in my configuration; I will try some more examples and/or a Nomad update to clarify that.

Thanks again for such useful replies.

After upgrading Nomad and Consul I seem to be having similar issues. Note the startup message from Consul:

2021-02-26T11:49:30.821-0700 [ERROR] agent.envoy: Error handling ADS stream: error="rpc error: code = InvalidArgument desc = Envoy 1.11.2 is too old and is not supported by Consul"

Note:
% consul --version
Consul v1.9.3
Revision f55da9306
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

% nomad --version
Nomad v1.0.3 (08741d9f2003ec26e44c72a2c0e27cdf0eadb6ee)

Any thoughts or suggestions?

@schemmer Consul 1.9 drops support for Envoy 1.11; unfortunately, the only graceful in-place upgrade path is through Consul 1.8. From here you should be able to just re-run your Connect jobs, which causes the Envoy image to be re-evaluated to the newest supported version.

In the upgrade guide we do recommend doing node-drains this time to be sure of avoiding the incompatibility scenario.