Consul Failover Service-Resolver
Consul Version: v1.11.3
I currently have 3 datacenters, all governed by Consul. All Consul servers are healthy and aware of each other.
Goal: When I take one of my service pods offline and curl the service, I would like Consul to route the call to an instance in one of the other datacenters in my Consul configuration.
Experiment: When I delete the pod, so that it can no longer serve the request, and then run curl, I receive “no healthy upstream”.
I followed the service-resolver documentation.
My service-resolver was applied to the primary datacenter and reported as synced to the other datacenters.
Here is the service resolver.
kubectl get -n prd serviceresolvers.consul.hashicorp.com service-ci-prd -o yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"consul.hashicorp.com/v1alpha1","kind":"ServiceResolver","metadata":{"annotations":{},"name":"service-ci-prd","namespace":"prd"},"spec":{"connectTimeout":"15s","defaultSubset":"default","failover":{"*":{"datacenters":["k-dc-g1","k-dc-g3","k-dco-g1"]}},"subsets":{"default":{"filter":"Service.Meta.environment == prd"}}}}
  creationTimestamp: "2022-04-06T18:41:41Z"
  finalizers:
  - finalizers.consul.hashicorp.com
  generation: 17
  name: service-ci-prd
  namespace: prd
  resourceVersion: "137557685"
  selfLink: /apis/consul.hashicorp.com/v1alpha1/namespaces/prd/serviceresolvers/service-ci-prd
  uid: 9b5521a8-eb25-46e2-90c3-81434a5fc15f
spec:
  connectTimeout: 15s
  defaultSubset: default
  failover:
    '*':
      datacenters:
      - k-dc-g1
      - k-dc-g3
      - k-dco-g1
  subsets:
    default:
      filter: Service.Meta.environment == prd
status:
  conditions:
  - lastTransitionTime: "2022-04-08T18:23:06Z"
    status: "True"
    type: Synced
  lastSyncedTime: "2022-04-08T18:23:06Z"
In the Consul UI, every datacenter shows the other datacenters and the failover as applied. However, when the pod is killed and I curl the service, I receive “no healthy upstream”. Note that the resolver also reports as Synced, per the status above. For the subset filter I used the value shown in the UI under the service's instances, in the Meta section: environment prd.
Does anyone know what else I need to check so that calls are routed to one of the other datacenters when the single instance in the primary is down? For all intents and purposes, Consul looks to be aware of the other datacenters and the instances of the service in them.
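One check that might help narrow this down is whether each failover datacenter actually returns a passing instance that matches the subset filter. Below is a minimal sketch in Python that builds the Consul health API query (/v1/health/service with the dc, passing, and filter parameters) for each datacenter in the resolver. The service name, datacenter names, and filter are copied from the resolver above. The agent address http://127.0.0.1:8500 and the helper names are assumptions for illustration only, and the sketch assumes no ACL token is required.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Values copied from the service-resolver spec above.
SERVICE = "service-ci-prd"
FAILOVER_DATACENTERS = ["k-dc-g1", "k-dc-g3", "k-dco-g1"]
SUBSET_FILTER = "Service.Meta.environment == prd"

# Assumption: a Consul agent HTTP API is reachable at this address
# (for example through a port-forward). Adjust for your environment.
CONSUL_ADDR = "http://127.0.0.1:8500"


def health_url(base_url, service, datacenter, subset_filter):
    """Build a health query for passing instances of a service in one datacenter."""
    query = urlencode(
        {"dc": datacenter, "passing": "true", "filter": subset_filter},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base_url.rstrip('/')}/v1/health/service/{quote(service, safe='')}?{query}"


def failover_check_urls(base_url=CONSUL_ADDR):
    """Map each failover datacenter to the URL that lists its matching instances."""
    return {
        dc: health_url(base_url, SERVICE, dc, SUBSET_FILTER)
        for dc in FAILOVER_DATACENTERS
    }


def count_passing(url, timeout=5.0):
    """Fetch one health query and return how many instances it lists."""
    with urlopen(url, timeout=timeout) as response:
        return len(json.load(response))
```

Calling count_passing(url) on each URL from failover_check_urls() gives the number of passing instances per datacenter that match the filter. A count of 0 everywhere would suggest the instances in the other datacenters are unhealthy or do not carry the environment prd meta value. This only inspects the catalog; it does not test network connectivity between datacenters.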