We have a use case where the terraform apply command takes more than 60 minutes to complete. We are using the S3 backend, and our AWS session token expires at exactly 60 minutes, resulting in a session timeout. I found this post, which suggests I can set the S3 backend role_arn setting to force Terraform to call sts:AssumeRole again. However, I’ve been unable to do this successfully or to find documentation on how it is supposed to be configured or tested. Could you point me to a reference on how to enable this functionality?
Here is an example of the debug messaging we see once the session times out:
2022-02-17T19:37:49.077Z [INFO] plugin.terraform-provider-aws_v3.49.0_x5: 2022/02/17 19:37:49 [DEBUG] [aws-sdk-go] DEBUG: Request autoscaling/DescribeAutoScalingGroups Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
User-Agent: APN/1.0 HashiCorp/1.0 Terraform/0.12.13 (+https://www.terraform.io) terraform-provider-aws/3.49.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.39.0 (go1.16; linux; amd64)
Authorization: AWS4-HMAC-SHA256 Credential=ASIAYRDLK36GFEVMDZPN/20220217/us-east-1/autoscaling/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-security-token, Signature=<HIDDEN>
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Entry point command "entrypoints/rehydrate" failed
2022-02-17T19:37:49.435Z [INFO] plugin.terraform-provider-aws_v3.49.0_x5: 2022/02/17 19:37:49 [DEBUG] [aws-sdk-go] DEBUG: Response autoscaling/DescribeAutoScalingGroups Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 403 Forbidden
Date: Thu, 17 Feb 2022 19:37:49 GMT
2022-02-17T19:37:49.435Z [INFO] plugin.terraform-provider-aws_v3.49.0_x5: 2022/02/17 19:37:49 [DEBUG] [aws-sdk-go] <ErrorResponse xmlns="http://autoscaling.amazonaws.com/doc/2011-01-01/">
<Message>The security token included in the request is expired</Message>
- you probably need to renew your security token. You can find one method at https://cloudsentry.cloud.capitalone.com/about
Thanks @kpanic9 This was my first thought as well. I had actually tried it by manually refreshing the ~/.aws/credentials but I noticed that the terraform apply command wasn’t picking up the new credentials. I’ve been testing with TF_LOG=DEBUG which shows the credentials used when it is in the “wait” loop. I thought maybe the command was reading the credentials from env variables instead? Could I ask how you were able to get the terraform apply command to re-read the credentials file?
No trouble at all. We have a script that assumes the AWS IAM role and updates the credentials file. Once that script is run, Terraform automatically uses the renewed credentials. If this solution didn’t work for you, the only reason I can think of is that the timestamp values in ~/.aws/config and ~/.aws/credentials are not being updated.
This mechanism assumes that you have some longer-lived credentials outside of Terraform which the AWS provider and S3 backend will then use to call sts:AssumeRole to get the time-limited session credentials for the role.
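As a rough sketch of what that could look like, assuming the argument names from the older S3 backend and AWS provider documentation (role_arn and session_name at the top level of the backend, and an assume_role block in the provider). The bucket, key, account ID, role name and session name below are all hypothetical placeholders, and I have not verified this against v0.12.13; newer Terraform releases move the backend settings into a nested assume_role block instead.

```hcl
terraform {
  backend "s3" {
    # Hypothetical state location, replace with your own.
    bucket = "example-terraform-state"
    key    = "stack/terraform.tfstate"
    region = "us-east-1"

    # Role the backend assumes via sts:AssumeRole, using whatever
    # longer-lived base credentials are available in the environment.
    # Placeholder ARN and session name.
    role_arn     = "arn:aws:iam::123456789012:role/example-terraform"
    session_name = "terraform-apply"
  }
}

provider "aws" {
  region = "us-east-1"

  # The provider is configured separately from the backend, so it needs
  # its own assume_role block (same placeholder role as above).
  assume_role {
    role_arn     = "arn:aws:iam::123456789012:role/example-terraform"
    session_name = "terraform-apply"
  }
}
```

Note that this only helps if the base credentials outlive the run; the session Terraform requests for the role can be renewed, but the credentials used to request it cannot.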
If the top-level credentials you are using are time-limited themselves, and you have no possibility of issuing longer-lived credentials, then I think your only option would be to ensure your Terraform configuration can complete before the credentials time out. Terraform can only renew credentials that Terraform itself requested.
Thank you @kpanic9 and apologies for not responding sooner… I was a little tied up with something else. I looked at this closer and it does appear the new credentials are used except when Terraform is in that try/wait loop:
2022-03-10T15:37:20.847Z [INFO] plugin.terraform-provider-aws_v3.49.0_x5: 2022/03/10 15:37:20 [DEBUG] "blue-asg" Capacity: 0 ASG, 0 ELB/ALB, satisfied: false, reason: "Need exactly 1 healthy instances in ASG, have 0": timestamp=2022-03-10T15:37:20.847Z
2022-03-10T15:37:20.847Z [INFO] plugin.terraform-provider-aws_v3.49.0_x5: 2022/03/10 15:37:20 [TRACE] Waiting 10s before next try: timestamp=2022-03-10T15:37:20.847Z
module.stack.module.asg_blue.aws_autoscaling_group.asg: Still modifying... [id=blue-asg, 11m49s elapsed]
During this looping period I don’t see the new credentials picked up by Terraform, and this is the period in which they time out. It sounds like you may be seeing something different, and if so I’m wondering whether you are on a more recent version?
The problem is that when the credentials expire I don’t see Terraform attempt to request new ones (I’m testing with trace logging). Could you confirm these are the only settings we need and/or provide a link to a working example?
Looking at this example and back at the log you shared earlier, it seems like you are using an older version of Terraform (v0.12.13, it looks like?), so off the top of my head I can’t confirm how that older version would behave with the settings you shared.
Thank you that’s very helpful. Yes that’s correct, we use v0.12.13. I did notice when testing that I did not see the role_arn or session name variables in the S3 backend file, which was confusing at first but maybe explains your observation that the feature isn’t available in the version we are currently using.
I can see if we can potentially test with a more recent version. Is there a target version you’d recommend, or at least one that you know off the top of your head has this feature?