Auto Join issue

Hi,

I have a working K8s cluster that already hosts other projects. I installed Consul via Helm in the same cluster, which uses kubenet for networking.
The consul-ui service is exposed through a LoadBalancer, so I can connect to it.

Now I have an additional VM that is not part of the K8s cluster. I have installed the Consul agent on the VM, but it fails to join when I use the auto-join method described in "Auto-join a Cloud Provider | Consul | HashiCorp Developer".

Connection string used:

retry_join = ["provider=azure resource_group= vm_scale_set= tenant_id= client_id= subscription_id= secret_access_key="]

I get the error below:

Dec 17 19:37:41 hk-09-az consul[1655302]: 2021-12-17T19:37:41.836+0800 [ERROR] agent: Cannot discover address: cluster=LAN address="provider=azure resource_group= vm_scale_set= tenant_id= client_id= subscription_id= secret_access_key=" error="discover-azure: no interfaces"

The output of kubectl get service for Consul is as follows:

NAME            TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                                                                    AGE
consul-dns      ClusterIP      bb.bb.bb.bb   <none>        53/TCP,53/UDP                                                              3h5m
consul-server   ClusterIP      None          <none>        8501/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP   3h5m
consul-ui       LoadBalancer   aa.aa.aa.aa   xx.xx.xx.xx   443:30921/TCP                                                              3h5m

Can anyone guide me through this?

I updated consul.env with the following variables:
ARM_SUBSCRIPTION_ID=XXXX
ARM_TENANT_ID=XXXX
ARM_CLIENT_ID=XXXX
ARM_CLIENT_SECRET=XXXX

where XXXX represents the actual values.

and in consul.hcl I used:
retry_join = ["provider=azure tag_name=consul tag_value=tag"]

Now I see the error below:

Dec 23 17:10:27 hk-09-az bash[1494]: 2021-12-23T17:10:27.303+0800 [ERROR] agent: Cannot discover address: cluster=LAN address="provider=azure tag_name=consul tag_value=tag" error="discover-azure: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions//providers/Microsoft.Network/networkInterfaces?api-version=2015-06-15: StatusCode=400 -- Original Error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}"

When I used the configuration below:
retry_join = ["provider=azure resource_group=XXXX vm_scale_set=XXXX tenant_id=XXXX client_id=XXXX subscription_id=XXXX secret_access_key=XXXX"]

I saw the error below:

Dec 23 17:19:18 hk-09-az bash[2427]: 2021-12-23T17:19:18.204+0800 [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
Dec 23 17:19:18 hk-09-az bash[2427]: 2021-12-23T17:19:18.423+0800 [INFO] agent: Sending GET https://management.azure.com/subscriptions/b9820b3b-18ea-4d9d-8931-dd083ce943a4/providers/Microsoft.Network/networkInterfaces?api-version=2015-06-15: cluster=LAN
Dec 23 17:19:18 hk-09-az bash[2427]: 2021-12-23T17:19:18.713+0800 [INFO] agent: GET https://management.azure.com/subscriptions/b9820b3b-18ea-4d9d-8931-dd083ce943a4/providers/Microsoft.Network/networkInterfaces?api-version=2015-06-15 received 200 OK: cluster=LAN
Dec 23 17:19:18 hk-09-az bash[2427]: 2021-12-23T17:19:18.716+0800 [INFO] agent: Discovered servers: cluster=LAN cluster=LAN servers=
Dec 23 17:19:18 hk-09-az bash[2427]: 2021-12-23T17:19:18.716+0800 [WARN] agent: Join cluster failed, will retry: cluster=LAN retry_interval=30s error="No servers to join"

Can anyone help?

Hi @Jayant,

Per the docs for using cloud auto-join with Azure, can you confirm that you have properly applied the consul tag to the virtual NICs of the Consul servers in the tenant and subscription? If you are using a Virtual Machine Scale Set, this tag can be configured on the resource_group of the vm_scale_set.
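
For reference, a tag-based retry_join that passes the credentials inline might look like the sketch below. It only combines parameters already used in your two attempts, and it assumes the servers' NICs really carry the tag consul=tag; the <...> values are placeholders, not real settings. The "Identity not found" response in your second log suggests the agent may not have picked up the credentials from consul.env, so supplying them in the string itself could be worth a try.

```hcl
# consul.hcl on the external VM (client agent) -- illustrative sketch only.
# tag_name / tag_value must match the tag actually set on the servers' NICs.
# Every <...> value is a placeholder to be replaced with your own.
retry_join = [
  "provider=azure tag_name=consul tag_value=tag tenant_id=<tenant-id> client_id=<client-id> subscription_id=<subscription-id> secret_access_key=<client-secret>"
]
```

If discovery still returns an empty server list after this, the tag is probably not on the interfaces that the subscription query returns.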