Cross-Region Service Discovery & Auth in Federated Nomad & Consul Setup

Hi all,

I’m not sure whether this belongs in the Nomad or the Consul sub-forum.

I have two regions, r1 and r2. Each region has its own Consul cluster and its own Nomad cluster. Both the Nomad clusters and the Consul clusters are federated, following "Federate multi-region clusters | Nomad | HashiCorp Developer" and "Federate multiple datacenters with WAN gossip | Consul | HashiCorp Developer".
r1 acts as the primary (where applicable).

I’ve also set up workload identities, so that each Nomad cluster can create identities in the Consul cluster of its own region.

My mesh-enabled service “my-mesh-service” is located only in r1 and is exposed to legacy apps through api-gateways. The api-gateways run on Nomad, similar to this: GitHub - hashicorp-guides/consul-api-gateway-on-nomad: Deploying Consul API Gateway on nomad

To route requests from the api-gateway in r2 to the service in r1, I’ve added a service-resolver config entry that redirects them to r1.

resource "consul_config_entry" "service-resolver" {
  kind = "service-resolver"
  name = "my-mesh-service"
  config_json = jsonencode({
    Redirect = {
      Datacenter = "r1"
    }
  })
}

The api-gateway in r2 must be able to determine the health of “my-mesh-service” in r1. For this to work, the policies attached to the api-gateway token must not be tied to a specific datacenter. [1]
The problem I ran into is that the Consul token issued to the api-gateway in r2 is a local token and therefore only valid in r2. As a result, the api-gateway runs into the same issue as [1].
While it is possible to configure the auth method to issue global tokens, this only works in the primary region - not in secondaries. The request is rejected if you try to create an auth method with -token-locality=global in a secondary cluster.
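For illustration, this is roughly what a global-locality auth method looks like when it is created in the primary (r1). It is only a sketch under assumptions: the auth method name, the JWKS URL and the audience are placeholders and not taken from my actual setup, and the provider is assumed to be pointed at the Consul cluster in r1.

```hcl
# Sketch only: a JWT auth method in the PRIMARY datacenter (r1) that issues
# global tokens. The same resource with token_locality = "global" is what
# gets rejected when applied against the secondary (r2).
resource "consul_acl_auth_method" "nomad_workloads" {
  name           = "nomad-workloads" # hypothetical name
  type           = "jwt"
  token_locality = "global"

  config_json = jsonencode({
    # Hypothetical address of a Nomad server that serves the JWKS endpoint.
    JWKSURL          = "https://nomad.r1.example.com:4646/.well-known/jwks.json"
    JWTSupportedAlgs = ["RS256"]
    BoundAudiences   = ["consul.io"] # assumed audience of the workload identity
    ClaimMappings = {
      nomad_namespace = "nomad_namespace"
      nomad_job_id    = "nomad_job_id"
      nomad_task      = "nomad_task"
      nomad_service   = "nomad_service"
    }
  })
}
```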

From my understanding I have two options now:

  1. Establish a connection between Consul in r1 and Nomad in r2 (without relying on the mesh), so that workload identities can be created against the primary Consul cluster.
  2. Create a static Consul token for the api-gateway and not use WI for it at all.
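To make option 2 more concrete, a static global token could be sketched roughly as follows. This is an assumption-laden example, not something I have running: the policy name, the gateway service name "my-api-gateway" and the exact rules are placeholders, and the provider is assumed to talk to the primary (r1).

```hcl
# Sketch only: a policy that is not restricted to a datacenter, plus a
# global (non-local) token that the api-gateway in r2 could use.
resource "consul_acl_policy" "api_gateway" {
  name = "api-gateway-cross-dc" # hypothetical name

  # No `datacenters` argument is set, so the policy is valid in r1 and r2.
  # The rules below are an assumed minimal set for a gateway, not verified.
  rules = <<-EOT
    mesh = "read"
    service "my-api-gateway" { policy = "write" }
    service_prefix "" { policy = "read" }
    node_prefix "" { policy = "read" }
  EOT
}

resource "consul_acl_token" "api_gateway" {
  description = "Static token for the api-gateway in r2"
  policies    = [consul_acl_policy.api_gateway.name]
  local       = false # global token, replicated to secondary datacenters
}
```

The token’s secret would then have to be handed to the api-gateway job in r2 by some other means (for example Nomad variables), which is exactly the manual step that workload identities were supposed to remove.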

Both options feel more like workarounds than real solutions.
So I’m wondering whether there’s something I’ve missed that would solve this problem properly.

[1] failed_eds_health when using ingress-gateway to services in another datacenter · Issue #12623 · hashicorp/consul · GitHub