Acme_certificate in Azure across subscriptions from Azure DevOps

I’ve run out of ideas here.

I’ve got Terraform code that creates a certificate and drops it into a keyvault, using Azure DNS for the ACME challenge. This code works fine when run from my desktop after using az login to sign in as the service principal the DevOps pipeline runs as.

The DNS zones are in a different subscription that the Service Principal doesn’t have access to, but in the azure section of the acme_certificate config block I’m passing credentials for an SP that does have access. Again, this code works fine when run from my desktop.

However, when run from a DevOps pipeline I get the following error:

  azure: dns.ZonesClient#Get: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="ResourceGroupNotFound" Message="Resource group '' could not be found."

This is not dissimilar to this post, but there doesn’t appear to be a resolution there.

This smells like a permissions issue, as if the SP the pipeline runs as is being used to read the resource group and zone details from the DNS subscription. However, I’ve tried giving it Contributor permissions there and that doesn’t help. Also, without those permissions it runs fine from my desktop after logging in as that service principal.

The only other piece of useful information I’ve been able to find is that when I get the above error, no login is recorded for the Service Principal that’s being passed in the config block. When running it from my desktop a login is recorded (and it works then too). I would expect it to be using the client ID and secret provided in the config block to read the DNS resource group etc., but that doesn’t appear to be the case.

Does anyone have any ideas?

The reason it works from your laptop is that you have the Azure CLI installed and configured; see this link.

When running from a DevOps pipeline, you need to provide the azurerm provider credentials some other way.

You have multiple options for providing the credentials; since Azure DevOps and the DNS zones are in different accounts, passing them directly in the provider block (as sketched below) might be the easiest one.
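
A minimal sketch of that option, assuming the credentials arrive as pipeline variables (the variable names here are illustrative; the azurerm provider can also pick these up from the ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_SUBSCRIPTION_ID and ARM_TENANT_ID environment variables):

variable "client_id" { type = string }
variable "client_secret" {
  type      = string
  sensitive = true # requires Terraform 0.14+
}
variable "subscription_id" { type = string }
variable "tenant_id" { type = string }

provider "azurerm" {
  subscription_id = var.subscription_id
  client_id       = var.client_id
  client_secret   = var.client_secret
  tenant_id       = var.tenant_id
  features {}
}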

I’ve done exactly that: set the client ID and secret in the azurerm provider block and I’m still getting the same thing. That works without logging in with az login on my desktop, as you’d expect, but I get the same error out of the pipeline.

All the rest of my azurerm code happily uses the service principal the pipeline is running as without these creds, so I didn’t think it was needed, but even when I tried it, it still didn’t help. Additionally, the user I’d pass would be the pipeline user, which doesn’t have access to the DNS resource group/zones anyway, which is what makes this so weird.

As far as I can tell, it shouldn’t be trying to access this resource group with anything other than the client ID and secret passed in the azure config block of the acme_certificate resource.

Can you show your provider block? Also, just to make sure: you need all four values, including subscription_id and tenant_id…

Yep, no problem. Provider block below, and the config block from the acme_certificate underneath that.

provider "azurerm" {
  subscription_id = "subscription_id"
  features {}
  
  client_id     = "client_id"
  client_secret = "client_secret"
  tenant_id     = "tenant_id"
}

and config block

  dns_challenge {
    provider = "azure"

    config = {
      AZURE_CLIENT_ID       = "dnsSP_client_id"
      AZURE_CLIENT_SECRET   = "dnsSP_client_secret"
      AZURE_RESOURCE_GROUP  = var.dns_resource_group
      AZURE_SUBSCRIPTION_ID = var.dns_subscription_id
      AZURE_TENANT_ID       = data.azurerm_client_config.current.tenant_id
    }
  }

Just to be clear, these are two different Service Principals. The DNS one has DNS Contributor access on the resource group and read access to the whole subscription (not sure if that read access is required but I’ll sort that out when I get it working).

The base azurerm creds have access to the keyvault the cert is going into. The pipeline steps are running as the SP that has keyvault access.
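
In Terraform terms, those role assignments would look roughly like this (a sketch; the variable names are illustrative, and the built-in role is called "DNS Zone Contributor"):

resource "azurerm_role_assignment" "dns_contributor" {
  scope                = "/subscriptions/${var.dns_subscription_id}/resourceGroups/${var.dns_resource_group}"
  role_definition_name = "DNS Zone Contributor"
  principal_id         = var.dns_sp_object_id # object ID of the DNS service principal
}

resource "azurerm_role_assignment" "dns_subscription_reader" {
  scope                = "/subscriptions/${var.dns_subscription_id}"
  role_definition_name = "Reader" # possibly not required, per the above
  principal_id         = var.dns_sp_object_id
}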

Thanks

how is data.azurerm_client_config.current.tenant_id configured?

It’s just a blank data import. Since it should be running as the service connection user in the pipeline, it should be returning the right data.

data "azurerm_client_config" "current" {}

But… that would certainly screw it up if it’s not. I’ll replace that tomorrow with the actual value and see if that fixes it (here’s hoping). I’ve done so many iterations I can’t actually remember if I tried hardcoding that at some point today or not, but I’m guessing probably not, so it’s an excellent place to start tomorrow!
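
For the record, the test would look something like this (the GUID below is just a placeholder for the real tenant ID):

  dns_challenge {
    provider = "azure"

    config = {
      AZURE_CLIENT_ID       = "dnsSP_client_id"
      AZURE_CLIENT_SECRET   = "dnsSP_client_secret"
      AZURE_RESOURCE_GROUP  = var.dns_resource_group
      AZURE_SUBSCRIPTION_ID = var.dns_subscription_id
      # hardcoded instead of data.azurerm_client_config.current.tenant_id
      AZURE_TENANT_ID       = "00000000-0000-0000-0000-000000000000"
    }
  }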

cool.

I am guessing you would have two providers configured, and the azurerm_client_config data source might be tied to the wrong one… anyway, let me know.

Just a single one (well I have the acme provider but I doubt that counts ;))

I’ve pared this way back to basics trying to find the issue, so hopefully this is it!

Sadly, no luck. I’ve hardcoded everything in the pipeline code now, still with no success. About the only thing I can think of now is that it’s trying the wrong cloud. acme_certificate does have an AZURE_ENVIRONMENT config option, but every time I’ve tried to use it, it’s thrown an error that "public" is not a valid option, even though that’s what its documentation says to use. Digging into various Azure docs suggests AzurePublicCloud is perhaps the correct value, and I’ve tried that too (as well as combinations thereof), still with no luck :/

I’m now down to thinking it’s somehow pulling the subscription or tenant from environment variables and ignoring what I’m passing. This issue seems to indicate it does in fact do that. I’m just not actually setting those environment variables. My guess at this point is that selecting a Service Connection to run as sets those environment variables, so I’m going to have to set the release to run as a Service Connection in the right subscription, then add authentication code for the other subscription that hosts the keyvault, etc.


And for anyone else that runs into this, the answer is that you must run your pipeline as a user that is in the right subscription. I’m assuming doing that in the pipeline sets environment variables in the background, and the acme provider will use those over what you actually specify in the code.

Based on the issue linked above, that behaviour appears to be acceptable to the author, which is slightly discouraging, but that’s the solution in any case. In the issue they say:

The way around this would be by using provider-level configuration, where providers or environment would be set appropriately in each plugin’s local environment instead of cascading down.

But I’ve no idea how to do that, as the acme provider doesn’t give you any other way to pass authentication data except the config block.
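
For anyone wiring this up, here is a minimal sketch of the resulting arrangement, assuming the pipeline’s service connection now lives in the DNS subscription: the default azurerm provider inherits the pipeline identity, and an aliased provider (the alias and variable names are illustrative) authenticates to the subscription that hosts the keyvault:

# default provider: inherits the pipeline's service connection,
# which now targets the DNS subscription
provider "azurerm" {
  features {}
}

# aliased provider for the subscription that hosts the keyvault
provider "azurerm" {
  alias           = "keyvault"
  subscription_id = var.kv_subscription_id
  client_id       = var.kv_client_id
  client_secret   = var.kv_client_secret
  tenant_id       = var.kv_tenant_id
  features {}
}

# keyvault resources then select the aliased provider explicitly
data "azurerm_key_vault" "certs" {
  provider            = azurerm.keyvault
  name                = var.keyvault_name
  resource_group_name = var.keyvault_resource_group
}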