I am trying to set up Consul multi-cluster with two local kind clusters [one control-plane and one worker node each]. I have Dapr set up on each of the clusters and my applications running on each cluster. The intention is to get these applications to communicate over gRPC. The first step is to get name resolution working for services running across these clusters - I am trying multiple tutorials shared in different learning paths.
For the datacenter setup I am getting an error during certificate creation for the federated datacenter - the -additional-dnsname option is erroring:
learn-consul-get-started-kubernetes % consul tls cert create -server -dc dca -domain consul -additional-dnsname=*.dcb.consul
zsh: no matches found: -additional-dnsname=*.dcb.consul
I also tried to set up multi-cluster with kind based on the instructions at -
This setup deploys the primary datacenter, but the secondary datacenter is not recognised in the ‘consul members’ output. I have built this setup on two Azure VMs, each hosting a 3-node kind cluster [one control-plane, 2 workers].
Help needed:
Is there an interactive lab or a consolidated set of instructions that I could use to set up 2 kind clusters and test HTTP calls originating from a service in one cluster being routed to the other cluster?
When running on K8s, the TLS cert creation is taken care of by the Helm chart. The command that is erroring for you is only required when you are deploying Consul on VMs. In addition, the error you are seeing is because zsh is expanding the * in your -additional-dnsname argument; it is not an issue with the CLI.
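For example, quoting the wildcard keeps zsh from trying to expand it - the same command with only the argument quoted:

consul tls cert create -server -dc dca -domain consul -additional-dnsname='*.dcb.consul'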
Consul 1.14.0 introduced Cluster Peering, which makes connecting multiple DCs easier than WAN federation. If you don’t have a requirement to use WAN federation specifically, I recommend you try the tutorial below on Cluster Peering.
The only change you may have to make is to properly expose the mesh gateway (according to the networking options available in your kind cluster) by modifying the following in the values file: Helm Chart Reference | Consul | HashiCorp Developer.
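As a rough example only (your kind networking may differ): exposing the mesh gateway as a NodePort would look roughly like this in the values file, with the port number being just a placeholder that must also be mapped in your kind node configuration:

meshGateway:
  enabled: true
  service:
    type: NodePort
    nodePort: 30100   # placeholder; must fall in the NodePort range and be reachable from the peer cluster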
Thanks for the help. I tried using cluster peering with consul to communicate between two services in two clusters and was able to do it successfully as well. My next step is to enable service discovery using Dapr, while using consul mesh in the background to communicate between two clusters.
Dapr offers service invocation between two services using its APIs and runs alongside the application container as a sidecar: Service invocation overview | Dapr Docs. For that, it requires annotating the application deployment with dapr.io/enabled: "true". For the Consul service mesh, it requires annotating the deployment with consul.hashicorp.com/connect-inject: "true". The combined deployment file looks like this:
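Roughly, the relevant part is the two sets of annotations on the pod template - the app name, port, and image below are placeholders, not my exact file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: counting
spec:
  replicas: 1
  selector:
    matchLabels:
      app: counting
  template:
    metadata:
      labels:
        app: counting
      annotations:
        dapr.io/enabled: "true"                       # dapr sidecar injection
        dapr.io/app-id: "countingapp"                 # dapr app id
        dapr.io/app-port: "9001"                      # port the app listens on
        consul.hashicorp.com/connect-inject: "true"   # Consul connect sidecar injection
    spec:
      containers:
        - name: counting
          image: hashicorp/counting-service:0.0.2
          ports:
            - containerPort: 9001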
When I run the deployment with only one of the two annotations (dapr.io/enabled: "true" or consul.hashicorp.com/connect-inject: "true"), the deployment runs fine. But when I enable both, the pods are stuck in the PodInitializing state. What could be the way to fix this?
I don’t have any prior exposure to dapr, nor have I seen any integrations of dapr and Consul so far. With that said, I just did a quick test on my side, and it looks like dapr creates a Service with the name <servicename>-dapr, while Consul requires that only one Kubernetes Service points to the pod that is connect-injected.
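One quick way to check whether more than one Kubernetes Service selects a given pod is to match the pod's IP against the Service endpoints (substitute your own pod name):

kubectl get endpoints | grep "$(kubectl get pod <pod-name> -o jsonpath='{.status.podIP}')"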
For example, I tried with the HashiCorp counting example service, and I saw this error in the consul-connect-inject-init container:
2023-01-20T06:31:55.761Z [INFO] Unable to find registered services; retrying
2023-01-20T06:31:55.761Z [ERROR] There are multiple Consul services registered for this pod when there must only be one. Check if there are multiple Kubernetes services selecting this pod and add the label `consul.hashicorp.com/service-ignore: "true"` to all services except the one used by Consul for handling requests.
Applying the label made the connect injection work, as the events below show, but the dapr container's readiness and liveness probes were failing due to the use of transparent proxy.
kubectl get event --field-selector involvedObject.name=counting-b9d79d85f-psq7t
LAST SEEN TYPE REASON OBJECT MESSAGE
13m Normal Scheduled pod/counting-b9d79d85f-psq7t Successfully assigned default/counting-b9d79d85f-psq7t to k3s
13m Normal Pulled pod/counting-b9d79d85f-psq7t Container image "hashicorp/consul:1.13.2" already present on machine
13m Normal Created pod/counting-b9d79d85f-psq7t Created container copy-consul-bin
13m Normal Started pod/counting-b9d79d85f-psq7t Started container copy-consul-bin
13m Normal Pulled pod/counting-b9d79d85f-psq7t Container image "hashicorp/consul-k8s-control-plane:0.49.0" already present on machine
13m Normal Created pod/counting-b9d79d85f-psq7t Created container consul-connect-inject-init
13m Normal Started pod/counting-b9d79d85f-psq7t Started container consul-connect-inject-init
13m Normal Pulled pod/counting-b9d79d85f-psq7t Container image "hashicorp/counting-service:0.0.2" already present on machine
13m Normal Created pod/counting-b9d79d85f-psq7t Created container counting
13m Normal Started pod/counting-b9d79d85f-psq7t Started container counting
13m Normal Pulled pod/counting-b9d79d85f-psq7t Container image "envoyproxy/envoy:v1.23.1" already present on machine
13m Normal Created pod/counting-b9d79d85f-psq7t Created container envoy-sidecar
13m Normal Started pod/counting-b9d79d85f-psq7t Started container envoy-sidecar
13m Normal Pulled pod/counting-b9d79d85f-psq7t Container image "docker.io/daprio/daprd:1.9.5" already present on machine
13m Normal Created pod/counting-b9d79d85f-psq7t Created container daprd
13m Normal Started pod/counting-b9d79d85f-psq7t Started container daprd
While this helps the pod start fully, I am unsure whether this would allow dapr to work. I hope you will be able to figure that part out.
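If it helps while debugging, you can check exactly which port and path the daprd probes target (and therefore what needs to remain reachable despite the transparent proxy) with something like the following, where the pod name is a placeholder:

kubectl describe pod <pod-name> | grep -Ei 'liveness|readiness'
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[?(@.name=="daprd")].readinessProbe}'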
Hi @Ranjandas, thanks for taking the time to help resolve the issue. Unfortunately, I couldn't get the pod started with the steps mentioned above. Detailing my steps below:
git clone git@github.com:hashicorp/demo-consul-101.git && cd demo-consul-101/k8s/04-yaml-connect-envoy
Added the dapr.io/enabled and consul.hashicorp.com/connect-inject annotations to the pod deployment (as in the combined deployment above), then labeled the dapr-created Service so Consul ignores it:
k label svc countingapp-dapr consul.hashicorp.com/service-ignore="true"
Even after running the above commands, the counting pod is stuck in the initializing state:
NAME READY STATUS RESTARTS AGE
counting-54cdf66b77-2z75v 0/3 Init:0/1 4 (76s ago) 10m
Also, I don’t see the log There are multiple Consul services registered for this pod when there must only be one. Check... in the consul-connect-injector pod. Where is this log generated? When I added the consul.hashicorp.com/service-ignore="true" label to countingapp-dapr, the following logs were generated in the consul-connect-injector pod:
2023-01-23T12:17:27.275Z INFO controller.endpoints retrieved {"name": "countingapp-dapr", "ns": "default"}
2023-01-23T12:17:27.275Z INFO controller.endpoints Ignoring endpoint labeled with `consul.hashicorp.com/service-ignore: "true"` {"name": "countingapp-dapr", "namespace": "default"}
2023-01-23T12:17:27.276Z INFO controller.endpoints deregistering service from consul {"svc": "counting-54cdf66b77-2z75v-countingapp-dapr"}
2023-01-23T12:17:27.283Z INFO controller.endpoints deregistering service from consul {"svc": "counting-54cdf66b77-2z75v-countingapp-dapr-sidecar-proxy"}
2023-01-23T12:17:27.287Z INFO controller.endpoints retrieved {"name": "countingapp-dapr", "ns": "default"}
2023-01-23T12:17:27.288Z INFO controller.endpoints registering service with Consul {"name": "countingapp-dapr", "id": ""}
2023-01-23T12:17:27.311Z INFO controller.endpoints registering proxy service with Consul {"name": "countingapp-dapr-sidecar-proxy"}
It seems to re-register the service after deregistering it. Is there a possible reason for this?
I had Consul installed with the following values:
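Roughly, the values were along these lines (a sketch, not the exact file - connect injection and transparent proxy are on, which is what matters for this issue):

global:
  name: consul
  datacenter: dc1
server:
  replicas: 1
connectInject:
  enabled: true
  transparentProxy:
    defaultEnabled: true   # transparent proxy on by default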
I can see the following logs in the daprd container:
time="2023-01-25T09:54:33.743453327Z" level=info msg="application configuration loaded" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:33.744644668Z" level=info msg="actors: state store is not configured - this is okay for clients but services with hosted actors will fail to initialize!" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:33.745020907Z" level=info msg="actor runtime started. actor idle timeout: 1h0m0s. actor scan interval: 30s" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime.actor type=log ver=1.9.5
time="2023-01-25T09:54:33.745625067Z" level=info msg="dapr initialized. Status: Running. Init Elapsed 50ms" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:33.746221736Z" level=debug msg="try to connect to placement service: dns:///dapr-placement-server.dapr-system.svc.cluster.local:50005" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime.actor.internal.placement type=log ver=1.9.5
time="2023-01-25T09:54:33.777206381Z" level=debug msg="established connection to placement service at dns:///dapr-placement-server.dapr-system.svc.cluster.local:50005" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime.actor.internal.placement type=log ver=1.9.5
time="2023-01-25T09:54:33.780328257Z" level=debug msg="placement order received: lock" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime.actor.internal.placement type=log ver=1.9.5
time="2023-01-25T09:54:33.780542195Z" level=debug msg="placement order received: update" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime.actor.internal.placement type=log ver=1.9.5
time="2023-01-25T09:54:33.780732961Z" level=info msg="placement tables updated, version: 0" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime.actor.internal.placement type=log ver=1.9.5
time="2023-01-25T09:54:33.780813163Z" level=debug msg="placement order received: unlock" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime.actor.internal.placement type=log ver=1.9.5
time="2023-01-25T09:54:49.312937795Z" level=info msg="dapr shutting down." app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:49.313134911Z" level=info msg="Stopping PubSub subscribers and input bindings" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:49.313177471Z" level=info msg="Shutting down actor" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:49.79355185Z" level=info msg="Stopping Dapr APIs" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:49.796882232Z" level=info msg="Waiting 5s to finish outstanding operations" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
time="2023-01-25T09:54:54.797623642Z" level=info msg="Shutting down all remaining components" app_id=countingapp instance=counting-7cbc7c46f6-2swzg scope=dapr.runtime type=log ver=1.9.5
I can’t figure out why daprd is shutting down.
For additional information, I installed dapr on k8s using the following command:
ubuntu@k3s:~$ dapr init -k --wait --enable-mtls=false -n dapr-system
⌛ Making the jump to hyperspace...
ℹ️ Note: To install Dapr using Helm, see here: https://docs.dapr.io/getting-started/install-dapr-kubernetes/#install-with-helm-advanced
ℹ️ Container images will be pulled from Docker Hub
✅ Deploying the Dapr control plane to your cluster...
✅ Success! Dapr has been installed to namespace dapr-system. To verify, run `dapr status -k' in your terminal. To get started, go here: https://aka.ms/dapr-getting-started
I looked at the dapr documentation to find the inbound and outbound ports it uses, so that I could tweak the deployment, but I couldn't find any details about them.
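If I can pin down the right ports, my plan is to exclude them from the transparent-proxy redirection using Consul's pod annotations - roughly like the snippet below, where the inbound port numbers are my guesses at dapr defaults and the outbound port is the placement server port seen in the daprd logs above:

  template:
    metadata:
      annotations:
        dapr.io/enabled: "true"
        consul.hashicorp.com/connect-inject: "true"
        # guessed dapr defaults: 3500/3501 HTTP API + healthz, 50001 gRPC, 9090 metrics
        consul.hashicorp.com/transparent-proxy-exclude-inbound-ports: "3500,3501,50001,9090"
        # placement server port from the daprd logs (dapr-placement-server...:50005)
        consul.hashicorp.com/transparent-proxy-exclude-outbound-ports: "50005"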