Hi All,
I am new to Consul and am trying out a few things. I would like to know whether this is an issue or not.
I have been following the Consul: Up and Running book and have two deployments and services within the mesh, i.e. frontend and backend.
At any point in time, both frontend and backend have only one replica running; however, the Consul UI seems to show four instances.
Assuming "instances" means the number of running instances of the deployments, that count is wrong, right?
I believe this is why I am running into these issues.
I am running Consul on Azure, and I usually stop and start the cluster. Could it be because of that?
How can I fix this issue?
Thanks in advance
maxb
April 15, 2023, 2:25pm
Considering your screenshot of the Consul web UI and the `kubectl get pods` output, the implication is that, for some unknown reason, Kubernetes has had to start a new pod several times, and all of the registrations have remained in Consul.
You can tell this because the Consul service instance names differ only in the 5-random-character suffix that Kubernetes automatically appends to generated object names.
Your next step should be to look into the checks registered on these services, to determine why they still show as ‘checks passing’ even after the relevant pods have gone away.
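As a starting point (just a sketch, assuming the Consul HTTP API is reachable on localhost:8500, ACLs are disabled, and the service is named frontend), you could list the registered instances and their checks with something like:

```shell
# List every registered instance of the frontend service together with its health checks.
# Assumes the Consul HTTP API is reachable on localhost:8500 and ACLs are disabled.
curl -s http://localhost:8500/v1/health/service/frontend \
  | jq '.[] | {ServiceID: .Service.ID, Node: .Node.Node, Checks: [.Checks[] | {Name, Status}]}'
```

Instances whose checks never go critical after their pod disappears are the ones worth digging into.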
That is the mystery. I guess Consul doesn't know that the pod was deleted, so I guess this is an issue.
I’ve posted a GitHub Issue to track this.
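In the meantime, I can remove a stale entry by hand as a workaround (a rough sketch; the service ID below is just a placeholder copied from the UI, and it assumes the instance was registered through a client agent, so the request has to go to the agent that owns the registration):

```shell
# Deregister one stale instance by its service ID (placeholder ID, copied from the Consul UI).
# The request must be sent to the agent the stale instance is registered with.
curl -s -X PUT http://localhost:8500/v1/agent/service/deregister/frontend-7d4b9c9f8d-abcde

# Equivalent CLI form:
consul services deregister -id=frontend-7d4b9c9f8d-abcde
```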
Opened 02:33 AM, 16 Apr 2023 UTC
#### Overview of the Issue
When an Azure cluster is restarted, the previous pods running inside the mesh are lost and new pods are created. Post restart, Consul isn't picking up that the previous pods are deleted/not present. Hence, it is still trying to route to the old pods, resulting in the following error when I try to consume the service via the UI/API.

Furthermore:
These are the active pods:

Whereas the Consul UI shows this:

Notice that only one frontend pod is running in AKS, whereas the Consul UI shows two instances of the service.
---
#### Reproduction Steps
Steps to reproduce this issue, e.g.:
1. Create a service mesh within Azure Kubernetes Service.
2. Stop and start the AKS cluster.
3. Notice that the previous pod info still exists in the Consul UI, whereas in reality it no longer exists in AKS (see the comparison sketch below).
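The mismatch can be verified with something like the following (a sketch; the `app=frontend` label selector and the port-forwarded address localhost:8500 are assumptions about my setup):

```shell
# Pods that actually exist in AKS (label selector is an assumption about the deployment).
kubectl get pods -l app=frontend

# Instances Consul still has registered for the same service.
curl -s http://localhost:8500/v1/catalog/service/frontend | jq '.[].ServiceID'
```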
### Consul info for both Client and Server
<details>
<summary>Client info</summary>
```
/ $ consul info
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 7c04b6a0
    version = 1.15.1
    version_metadata =
consul:
    acl = disabled
    bootstrap = true
    known_datacenters = 1
    leader = true
    leader_addr = 10.244.0.12:8300
    server = true
raft:
    applied_index = 2213
    commit_index = 2213
    fsm_pending = 0
    last_contact = 0
    last_log_index = 2213
    last_log_term = 4
    last_snapshot_index = 0
    last_snapshot_term = 0
    latest_configuration = [{Suffrage:Voter ID:b9744a41-cccd-861f-eca2-f3b18496e5b4 Address:10.244.0.12:8300}]
    latest_configuration_index = 0
    num_peers = 0
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 4
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 259
    max_procs = 4
    os = linux
    version = go1.20.1
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 1
    event_time = 4
    failed = 0
    health_score = 0
    intent_queue = 1
    left = 0
    member_time = 4
    members = 1
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1
    members = 1
    query_queue = 0
    query_time = 1
```
</details>
### Operating system and Environment details
Kubernetes Version: 1.24.10
Cloud Provider: Azure
Environment: Azure Kubernetes Service