Azurerm_linux_function_app: azurerm tries to mutate network although no such change is planned

Summary

When changes are made to an Azure Function App (with regional VNet integration via a subnet connected to the App Service Plan), the terraform apply step fails even if the changes involve nothing network-related, whenever the identity running the command lacks permission to modify the network.

I wonder if anyone has encountered similar issues before, and might be kind enough to share how the problem was worked around.

Details

Scenario

The scenario, in general, is as follows:

  • Terraform is being run on a CI/CD agent
    • The agent is Azure Pipelines self-hosted agent on a Windows VM
    • Terraform uses the VM’s managed identity, which does not have permission to mutate networks and will not be granted it
  • The resource to manage is an Azure Function App
    • Created with Terraform
    • VNet integration is done manually after the resource is provisioned (not through CI/CD)
    • Integrated with a subnet, specified in the App Service Plan
  • After VNet integration, changes to the Function App are attempted
    • terraform plan correctly shows the changes as they are in the code
    • most importantly, no network changes are involved
    • terraform apply still shows the right changes, but fails when applying them

The provider version is 3.93.0. Re-creating the resources from scratch (except the network) does not help.

Debug details

Using debug output, I found that the interfering attribute is virtual_network_subnet_id, which is listed under ignore_changes in the Terraform code. The tfstate keeps this attribute empty. When the provider calls the Azure Resource Manager API endpoint, however, the request body contains the actual non-empty subnet ID (obtained from earlier calls that read the real infrastructure state). This is where it fails inexplicably: Azure assumes a network change is being made when in reality there is none.

Terraform code snippet

resource "azurerm_linux_function_app" "dap_funcapp" {
  name                = "xxxxxxx"
  resource_group_name = data.azurerm_resource_group.dap_rg.name
  location            = data.azurerm_resource_group.dap_rg.location

  storage_account_name       = data.azurerm_storage_account.dap_storage_acct.name
  storage_account_access_key = data.azurerm_storage_account.dap_storage_acct.primary_access_key
  service_plan_id            = data.azurerm_service_plan.funcapp_service_plan.id

  tags = local.tags

  site_config {
    always_on = true

    app_service_logs {
      disk_quota_mb         = 35
      retention_period_days = 0
    }

    application_stack {
      python_version = var.python_runtime_version
    }

  }

  app_settings = {
    WEBSITES_ENABLE_APP_SERVICE_STORAGE = true
    WEBSITE_ENABLE_SYNC_UPDATE_SITE     = true
    FUNCTIONS_WORKER_RUNTIME            = "python"
    FUNCTIONS_EXTENSION_VERSION         = "~4"
    SCM_DO_BUILD_DURING_DEPLOYMENT      = false
    AzureWebJobsFeatureFlags            = "EnableWorkerIndexing"
    BUILD_FLAGS                         = "UseExpressBuild"
    ENABLE_ORYX_BUILD                   = "true"
    XDG_CACHE_HOME                      = "/tmp/.cache"
  }

  lifecycle {
    ignore_changes = [
      virtual_network_subnet_id, site_config["ip_restriction"], site_config["vnet_route_all_enabled"]
    ]
  }
}

Outputs

Full terraform apply output (note some sensitive information is redacted):

data.azurerm_resource_group.dap_rg: Reading...
data.azurerm_resource_group.dap_rg: Read complete after 0s [id=/subscriptions/redacted/resourceGroups/redacted]
data.azurerm_service_plan.funcapp_service_plan: Reading...
data.azurerm_storage_account.dap_storage_acct: Reading...
data.azurerm_service_plan.funcapp_service_plan: Read complete after 0s [id=/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/serverFarms/redacted]
data.azurerm_storage_account.dap_storage_acct: Read complete after 0s [id=/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Storage/storageAccounts/redacted]
azurerm_linux_function_app.dap_funcapp: Refreshing state... [id=/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx]

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # azurerm_linux_function_app.dap_funcapp will be updated in-place
  ~ resource "azurerm_linux_function_app" "dap_funcapp" {
      ~ app_settings                                   = {
          + "FUNCTIONS_EXTENSION_VERSION"         = "~4"
            # (8 unchanged elements hidden)
        }
        id                                             = "/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxxx"
        name                                           = "xxxxxxx"
        # (27 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
azurerm_linux_function_app.dap_funcapp: Modifying... [id=/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx]
╷
│ Error: updating Linux App Service (Subscription: "redacted"
│ Resource Group Name: "redacted"
│ Site Name: "xxxxxxx"): performing CreateOrUpdate: unexpected status 403 with error: LinkedAuthorizationFailed: The client 'redacted' with object id 'redacted' has permission to perform action 'Microsoft.Web/sites/write' on scope '/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx'; however, it does not have permission to perform action(s) 'Microsoft.Network/virtualNetworks/subnets/join/action' on the linked scope(s) '/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Network/virtualNetworks/redacted/subnets/yyyyyyy' (respectively) or the linked scope(s) are invalid.
│ 
│   with azurerm_linux_function_app.dap_funcapp,
│   on main.tf line 51, in resource "azurerm_linux_function_app" "dap_funcapp":
│   51: resource "azurerm_linux_function_app" "dap_funcapp" {
│ 
│ updating Linux App Service (Subscription:
│ "redacted"
│ Resource Group Name: "redacted"
│ Site Name: "xxxxxxx"): performing CreateOrUpdate: unexpected
│ status 403 with error: LinkedAuthorizationFailed: The client
│ 'redacted' with object id
│ 'redacted' has permission to perform action
│ 'Microsoft.Web/sites/write' on scope
│ '/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx';
│ however, it does not have permission to perform action(s)
│ 'Microsoft.Network/virtualNetworks/subnets/join/action' on the linked
│ scope(s)
│ '/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Network/virtualNetworks/redacted/subnets/yyyyyyy'
│ (respectively) or the linked scope(s) are invalid.
╵
##[error]Cmd.exe exited with code '1'.

Excerpt from the verbose debug output:

PUT /subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx?api-version=2023-01-01 HTTP/1.1
Host: management.azure.com
User-Agent: HashiCorp/go-azure-sdk (Go-http-Client/1.1 webapps/2023-01-01) HashiCorp Terraform/1.6.6 (+https://www.terraform.io) Terraform Plugin SDK/2.10.1 terraform-provider-azurerm/3.93.0 VSTS_664bc712-e7c8-4475-8f5b-9a88f9d8df97_build_250_0 pid-222c6c49-1b0a-5959-a213-6608f9eb8820
Content-Length: 3911
Content-Type: application/json; charset=utf-8
X-Ms-Correlation-Request-Id: redacted
Accept-Encoding: gzip

<request body. note: truncated>
{"id":"/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx","kind":"functionapp,linux","location": ... "virtualNetworkSubnetId":"/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Network/virtualNetworks/redacted/subnets/yyyyyyy","vnetContentShareEnabled":false,"vnetImagePullEnabled":false,"vnetRouteAllEnabled":true},"tags": ... }

2024-04-11T11:36:44.437+0800 [DEBUG] provider.terraform-provider-azurerm_v3.93.0_x5.exe: PUT https://management.azure.com/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx?api-version=2023-01-01: timestamp="2024-04-11T11:36:44.423+0800"
2024-04-11T11:36:44.670+0800 [DEBUG] provider.terraform-provider-azurerm_v3.93.0_x5.exe: AzureRM Response for https://management.azure.com/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx?api-version=2023-01-01: 
HTTP/2.0 403 Forbidden
Content-Length: 767
Cache-Control: no-cache
Content-Type: application/json; charset=utf-8
Date: Thu, 11 Apr 2024 03:36:44 GMT
Expires: -1
Pragma: no-cache
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Cache: CONFIG_NOCACHE
X-Content-Type-Options: nosniff
X-Ms-Correlation-Request-Id: redacted
X-Ms-Failure-Cause: gateway
X-Ms-Request-Id: redacted
X-Ms-Routing-Request-Id: KOREACENTRAL:20240411T033644Z:e4e75263-8d93-43af-9365-37ebc847a9aa
X-Msedge-Ref: Ref A: 8EBCB52744924EC88FD8A1146739450A Ref B: SEL221051504037 Ref C: 2024-04-11T03:36:44Z

{"error":{"code":"LinkedAuthorizationFailed","message":"The client 'redacted' with object id 'redacted' has permission to perform action 'Microsoft.Web/sites/write' on scope '/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx'; however, it does not have permission to perform action(s) 'Microsoft.Network/virtualNetworks/subnets/join/action' on the linked scope(s) '/subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Network/virtualNetworks/redacted/subnets/yyyyyyy' (respectively) or the linked scope(s) are invalid."}}: timestamp="2024-04-11T11:36:44.661+0800"

Any help would be deeply appreciated.

Sorry for bumping this issue but I wonder whether anyone else has encountered similar issues?
If this is an issue with the Azure RM SDK or Azure REST API I could go ask them instead. Thanks

I suspect this is more an ARM API issue, or an issue with how the AzureRM provider interacts with that API, than a Terraform issue specifically, as the error response here comes from the underlying ARM API call.

I am not in a position to try and recreate this, but the API call from the provider contains the full config of the resource as it is required (based upon your debug log). ARM is then applying/validating that config (even though nothing has changed) and one of those actions is interacting with the network resource, rather than selectively applying changes only where they are required.

One way to confirm this would be to update a Linux Function App from the portal (e.g. the FUNCTIONS_EXTENSION_VERSION setting) as a user who also lacks access to the network resources used in the VNet integration, and see whether the same issue occurs. It might not (if the portal just patches the resource with the change), but if the portal sends the full config rather than only the changes, you may see the same issue.

Looking at the redacted logs you provided and the ARM API being called (Web Apps - Create Or Update - REST API (Azure App Service) | Microsoft Learn), it seems that the provider is calling this endpoint with the full config. A wider review of the API seems to indicate that the particular setting being changed cannot be changed through any other endpoint (although my review was not exhaustive).

It may be that your agent needs some additional rights that permit the action required for App Service VNet integration settings to be applied, granted via a custom role so that the agent still cannot change the network resources themselves. This may be of interest: RBAC Permission Needed for Configure VNet Integration · Issue #53672 · MicrosoftDocs/azure-docs · GitHub,
which appears to indicate that

Microsoft.Network/virtualNetworks/read
Microsoft.Network/virtualNetworks/subnets/join/action

are required - and it is specifically the last action that appears in the error coming back from the API.

Furthermore, they are listed in the VNet integration docs as part of the minimum required permissions: Integrate your app with an Azure virtual network - Azure App Service | Microsoft Learn

In short, the AzureRM provider is calling the Create/Update API endpoint with the full desired config. Terraform ignores any ‘changes’ on the platform for your ignored parameters, but in this instance the provider sends back what it got from Azure during the state refresh as the ‘current config’, merged with the parameter changes that you have not ignored, as a full config to be applied.
This API call requires the caller to have the above rights on the VNet-integrated subnets in order to succeed (regardless of whether a VNet-integration-related element has changed).
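For illustration only, here is a rough sketch of what such a custom role could look like in Terraform. I have not run this against your setup. The role name, the two variables, and the choice of scopes are all my assumptions; only the two action strings come from the linked issue and the error message. It would also have to be applied by an identity that is allowed to create role definitions and assignments, i.e. not the CI/CD agent itself.

```hcl
# Hypothetical inputs: neither variable exists in the original configuration.
variable "integration_vnet_id" {
  type        = string
  description = "Resource ID of the VNet containing the integrated subnet (assumed input)."
}

variable "agent_principal_id" {
  type        = string
  description = "Object ID of the CI/CD agent's managed identity (assumed input)."
}

data "azurerm_subscription" "current" {}

# Custom role limited to the two actions named in the linked docs issue.
# It grants no write access to the network resources themselves.
resource "azurerm_role_definition" "vnet_integration_join" {
  name        = "app-service-vnet-integration-join" # hypothetical name
  scope       = data.azurerm_subscription.current.id
  description = "Allows joining a subnet for App Service VNet integration only."

  permissions {
    actions = [
      "Microsoft.Network/virtualNetworks/read",
      "Microsoft.Network/virtualNetworks/subnets/join/action",
    ]
    not_actions = []
  }

  assignable_scopes = [data.azurerm_subscription.current.id]
}

# Assigned at VNet scope here. Narrowing the scope to the single subnet may
# be enough if only join/action is actually checked, but that is untested.
resource "azurerm_role_assignment" "agent_vnet_join" {
  scope              = var.integration_vnet_id
  role_definition_id = azurerm_role_definition.vnet_integration_join.role_definition_resource_id
  principal_id       = var.agent_principal_id
}
```

Whether this is acceptable depends on your organisation's policy: the agent would be able to attach apps to subnets in that VNet, but not to modify the VNet or its subnets.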

Hope that helps

Happy Terraforming

Thanks for the reply! So the key hypothesis here is that this problem is upstream (with Azure REST API).

I concur - perhaps it validates that the requester has the access needed to implement the whole request body, including the parts that have not changed. In this case, the request sent by the Terraform AzureRM provider on behalf of the CI/CD agent contains the VNet information as well, and Azure (maybe) checks whether this identity could apply the request in its entirety, even though some parts require no changes.

I guess I will probably open a ticket with Microsoft, and hope they won’t call this a downstream problem instead.

On the other hand, I used another account that has the exact same Azure roles assigned as the CI/CD agent managed identity (i.e. can manage resources but not the network), and updated the Function App’s application settings without issues.

And in general, adding the extra rights to the agent may be a workaround though there would be a lot of people to convince for it to happen.

Thanks again!

I have a suggestion. It won’t help you resolve your issue, but it may provide some further intelligence for any issue that you raise against the AzureRM provider or MS, and why the same issue does not occur when using the portal.

When submitting the change in the portal, turn on the browser dev tools and do a network capture. It may show that, when updating the same setting via the portal, it is doing a PATCH and sending only a subset of the properties (unlike the AzureRM provider appears to be doing).

Thanks, that’s a great idea. Turns out Azure Portal doesn’t work the same way as Terraform does; instead of PATCH-ing to a URL, it sends very intricate requests to only cover the relevant change.

For instance, when changing the app settings in Azure Portal, it sends a POST to the URL https://management.azure.com/batch?api-version=2020-06-01 containing the actual requests, such as PUT /subscriptions/redacted/resourceGroups/redacted/providers/Microsoft.Web/sites/xxxxxxx/config/appsettings?api-version=2022-03-01 with a body containing id, location, properties (the app settings), tags and type “Microsoft.Web/sites/config”.

So this can complicate the analysis.