Consul Upgrade v0.6.4 ---> v1.5.3. Failure to connect clients

I recently joined an org that is running Consul v0.6.4, which brings a host of issues. I’ve been tasked with upgrading Consul to the latest version and bringing our Docker Swarm environment up to speed with Consul and Vault upgrades.

After upgrading to v1.5.3, I was able to get a 3-node cluster up and running, and I can see that all three members of the cluster have joined. Vault was able to connect, and I can see the client in the consul members list. Other Docker containers, however, complain with:

Error connecting to Consul agent: dial tcp XX.XX.XX.XX:8400: getsockopt: connection refused

I noticed that the RPC protocol was removed in v0.8, and our applications might still be trying to connect over RPC on port 8400. To resolve this, would we have to upgrade the Consul agent across all our applications, causing an outage while we upgrade?

Is there a workaround to get this up and running quickly? We’re using smebberson/alpine-consul-nodejs (Docker Hub) images for our Node-based apps, and it seems like we would have to rebuild (worst case) with the latest images that are compatible with our Consul v1.5.3 servers.

@hamza15 What are your containers doing that is causing them to try to access port 8400? Are these Consul agents or your own application? Is it using the Consul CLI?

Nothing to add about the issue, just “kudos” to upgrading to the latest GA from a prehistoric version!!! :clap: :+1: :grin: :smiley:

Hi Matt! Thanks for reaching out. When we bring up an app container, we install a Consul agent. I see one of the bootstrap scripts running Consul CLI commands, which I assume fail because the agent is trying to use RPC on port 8400?

I recently took down Vault and tried to compose up from scratch and consul agent is stuck at:

+echo Looking for consul agent…
+counter=0
+curl -s -I http://127.0.0.1:8500/v1/agent/self
+sleep 10

Haha thanks. It’s a huge leap but it comes with its own set of issues.

Yes, if you are using a not-yet-upgraded binary and invoke the CLI, it’s going to try to use the client RPC, which no longer exists. You will need to update the binary to get it to use the HTTP API instead.
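
To find which containers still ship an old binary, something like the following might help. It is only a sketch: `uses_legacy_rpc` is a made-up helper name, and the 0.8 cutoff is based on the RPC interface having been removed from the CLI in that release.

```shell
# Hypothetical helper: succeeds (exit 0) when the given Consul version
# predates 0.8, i.e. its CLI still talks RPC on port 8400.
uses_legacy_rpc() {
    ver=${1#v}            # accept both "v0.6.4" and "0.6.4"
    major=${ver%%.*}      # text before the first dot
    rest=${ver#*.}        # text after the first dot
    minor=${rest%%.*}     # text before the next dot
    [ "$major" -eq 0 ] && [ "$minor" -lt 8 ]
}
```

Inside a container it could be fed from the binary itself, assuming the first line of `consul version` looks like `Consul v0.6.4`: `uses_legacy_rpc "$(consul version | awk 'NR==1 {print $2}')" && echo "needs upgrade"`.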

When you say it’s getting stuck, what are you seeing? Is the v1/agent/self endpoint returning something besides a 200 status, never returning, or something else?
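
One way to tell those cases apart is to have curl print the status code and give it a timeout. This is a rough sketch rather than anything from the image's bootstrap script; `describe_agent_status` and `probe_agent` are names invented here, and the address is assumed to be the default local HTTP port.

```shell
# Hypothetical helper: turn the code printed by curl's -w '%{http_code}'
# into a short diagnosis. curl prints 000 when it got no HTTP response.
describe_agent_status() {
    case "$1" in
        200) echo "agent is up" ;;
        000) echo "no HTTP response (connection refused or timed out)" ;;
        *)   echo "agent answered with HTTP $1" ;;
    esac
}

# Probe the agent; --max-time keeps a hung connection from blocking forever.
probe_agent() {
    addr=${1:-http://127.0.0.1:8500}
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$addr/v1/agent/self")
    describe_agent_status "$code"
}
```

Running `probe_agent` inside the container should print one of the three messages. Note that `curl -s` on its own suppresses error messages, so a refused connection produces no output at all unless the status code or exit code is checked.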

Will update the binaries across our applications. Thanks for the clarification!

Looking at the logs I don’t see any response whatsoever. The following just repeats every 10 seconds:

+echo Looking for consul agent…
+counter=0
+curl -s -I http://127.0.0.1:8500/v1/agent/self
+sleep 10