Terraform init intermittently failing in Azure DevOps pipeline

Hi,

I am using Terraform in an Azure DevOps pipeline and running into an intermittent problem where tf init fails to get existing workspaces from a Storage Account backend.

The tf init call is in a loop and sometimes it works first time, sometimes it works on the second/third attempt, and sometimes it just hangs for 60 minutes until terminated by the pipeline.
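A minimal sketch of such a retry loop with a per-attempt timeout, so that a hung attempt is killed after a few minutes instead of running until the pipeline's 60-minute limit. The function name, attempt count, timeout, and delay are illustrative assumptions, not the values used in the pipeline described here:

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, timeout_s=300, sleep_s=10):
    """Run cmd until it exits 0; return True on success, False otherwise.

    All default values are illustrative assumptions.
    """
    for attempt in range(1, attempts + 1):
        try:
            # subprocess.run kills the child process and raises
            # TimeoutExpired if it has not finished within timeout_s seconds.
            result = subprocess.run(cmd, timeout=timeout_s)
            if result.returncode == 0:
                return True
            print(f"attempt {attempt}: exit code {result.returncode}")
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt}: timed out after {timeout_s}s")
        if attempt < attempts:
            time.sleep(sleep_s)
    return False
```

It could be called as run_with_retries(["terraform", "init", "-input=false"]), assuming terraform is on the agent's PATH.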

The pipeline is triggered by pushing changes to a readme file in the repo connected to the pipeline, so I am confident that nothing is changing between pipeline runs. As tf init is in a loop, I am equally certain that nothing within the pipeline environment is changing.

The IP address of the agent is added to the Storage Account as part of the pipeline, and az storage account network-rule list is used to confirm the IP has been added before continuing with tf init.
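A sketch of what that confirmation step might look like once the command's JSON output has been parsed. It assumes the output is an object with an ipRules list whose entries carry ipAddressOrRange and action keys; the function name is hypothetical. Such a check confirms only that the rule is present, not which source address the storage service sees for the request:

```python
import ipaddress


def ip_is_allowed(rule_set, ip):
    """Return True if ip matches an Allow entry in rule_set["ipRules"].

    Assumes rule_set is the parsed JSON from
    `az storage account network-rule list` (shape is an assumption).
    """
    addr = ipaddress.ip_address(ip)
    for rule in rule_set.get("ipRules", []):
        if rule.get("action", "Allow") != "Allow":
            continue
        # An entry may be a single address or a CIDR range.
        network = ipaddress.ip_network(rule["ipAddressOrRange"], strict=False)
        if addr in network:
            return True
    return False
```

For example, ip_is_allowed(json.loads(output), agent_ip), where output is the captured stdout of the az command and agent_ip is the address added earlier in the pipeline.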

tf init fails with the following error:

2020-05-04T09:38:40.4524162Z Error: Failed to get existing workspaces: storage: service returned error: StatusCode=403, ErrorCode=AuthorizationFailure, ErrorMessage=This request is not authorized to perform this operation.
2020-05-04T09:38:40.4525587Z RequestId:a2585fca-401e-0056-13f7-21c720000000

The storage account logs show the following:

2.0;2020-05-04T09:38:40.4344109Z;ListBlobs;IpAuthorizationError;403;9;9;authenticated;CONTAINER_NAME;CONTAINER_NAME;blob;"https://CONTAINER_NAME.blob.core.windows.net:443/tfstate?comp=list&prefix=STATE_FILE.tfstateenv%3A&restype=container";"/";a2585fca-401e-0056-13f7-21c720000000;0;11.22.33.44:1024;2018-03-28;420;0;130;246;0;;;;;;"Go/go1.12.13 (amd64-linux) azure-storage-go/v36.2.0 api-version/2018-03-28 blob";;;;;;;;;;
2.0;2020-05-04T09:38:50.6831026Z;ListBlobs;Success;200;9;9;authenticated;CONTAINER_NAME;CONTAINER_NAME;blob;"https://CONTAINER_NAME.blob.core.windows.net:443/tfstate?comp=list&prefix=STATE_FILE.tfstateenv%3A&restype=container";"/CONTAINER_NAME/tfstate";49f141d9-e01e-004f-0af7-21479b000000;0;11.22.33.44:1025;2018-03-28;420;0;152;247;0;;;;;;"Go/go1.12.13 (amd64-linux) azure-storage-go/v36.2.0 api-version/2018-03-28 blob";;;;;;;;;;

The first entry (09:38:40) fails. The pipeline then sleeps for 10 seconds and succeeds on the second attempt at 09:38:50. The only difference between the two requests seems to be in the path requested:

# Failed
# prefix=STATE_FILE.tfstateenv%3A&restype=container";"/";
# Succeeded
# prefix=STATE_FILE.tfstateenv%3A&restype=container";"/CONTAINER_NAME/tfstate";
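One way to compare the two entries field by field is sketched below. It assumes the lines follow Storage Analytics log format 2.0, in which the quoted URL is the request-url field and the quoted value after it is the requested-object-key field; the Python names are illustrative, and the two entries are shortened to their first 17 fields. On that reading, the request URL is the same in both entries and the value that differs is the object key recorded by the service:

```python
import csv

# First 17 fields of a Storage Analytics log entry (format 2.0, assumed).
FIELDS = [
    "version_number", "request_start_time", "operation_type",
    "request_status", "http_status_code", "end_to_end_latency_ms",
    "server_latency_ms", "authentication_type", "requester_account_name",
    "owner_account_name", "service_type", "request_url",
    "requested_object_key", "request_id_header", "operation_count",
    "requester_ip_address", "request_version_header",
]

URL = ('"https://CONTAINER_NAME.blob.core.windows.net:443/tfstate'
       '?comp=list&prefix=STATE_FILE.tfstateenv%3A&restype=container"')

# The two entries from the post, truncated after the request version field.
FAILED = ("2.0;2020-05-04T09:38:40.4344109Z;ListBlobs;IpAuthorizationError;"
          "403;9;9;authenticated;CONTAINER_NAME;CONTAINER_NAME;blob;" + URL +
          ';"/";a2585fca-401e-0056-13f7-21c720000000;0;'
          "11.22.33.44:1024;2018-03-28")
SUCCEEDED = ("2.0;2020-05-04T09:38:50.6831026Z;ListBlobs;Success;"
             "200;9;9;authenticated;CONTAINER_NAME;CONTAINER_NAME;blob;" + URL +
             ';"/CONTAINER_NAME/tfstate";49f141d9-e01e-004f-0af7-21479b000000;0;'
             "11.22.33.44:1025;2018-03-28")


def parse_entry(line):
    """Split one log line on ';', honouring double-quoted fields."""
    row = next(csv.reader([line], delimiter=";", quotechar='"'))
    return dict(zip(FIELDS, row))


def diff_entries(a, b):
    """Return {field: (a_value, b_value)} for the named fields that differ."""
    return {k: (a[k], b[k]) for k in FIELDS if a[k] != b[k]}


for name, (failed, ok) in diff_entries(parse_entry(FAILED),
                                       parse_entry(SUCCEEDED)).items():
    print(f"{name}: {failed!r} -> {ok!r}")
```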

Can anyone help shed some light on why tf init is sometimes requesting /, and sometimes /CONTAINER_NAME/tfstate? We’re looking into the permissions problem with Azure, but this seems like slightly unexpected behaviour.

Thanks,

Mike

For future Googlers: This turned out to be an issue with Azure. If a pipeline agent is located in the same region as the storage account, the request is routed over Microsoft’s internal IPv6 network. As a result, the source IP of the request is not the one that was added to the Storage Account firewall.
