My 25-resource Terraform Project is Taking 3 Minutes to Plan

Hi there, I have a new Terraform repo I’ve been experimenting with:

https://github.com/lancejpollard/cloud

I am new to Terraform and have run into all kinds of gotchas in terms of syntax and capabilities; it feels very limiting. Currently I have an example that creates 26 resources in the California region for AWS, in one availability zone, but it’s set up to handle all regions and availability zones. It takes 3 to 5 minutes to compute the plan…

Is this normal? What might I be doing wrong? It’s hard to iterate when every run of terraform plan takes 3 to 5 minutes… I would like it to take 20 seconds or less.

Hi @lancejpollard,

I wasn’t able to review your configuration in sufficient detail to explain what exactly is going on here; perhaps you can see in the output some specific things that are taking a particularly long time, in which case I could try to explain why that might be.

With that said, it’s not typical to manage an entire multi-region infrastructure deployment in a single Terraform configuration. Usually a larger deployment would be split into smaller units – one per region is a common first level of decomposition, and then possibly further decomposition within each region by functional area or by frequency/risk of change – so that the potential impact of a particular change is reduced, and so Terraform does not have to resynchronize the entire space of objects on every change.

(Performance and “blast radius” considerations aside, there is also the concern of limiting the total dependency space of a particular configuration so that you don’t limit your ability to respond with configuration changes during partial outages. If your single Terraform configuration covers all of your regions and one region has an outage, you’d need to run Terraform in an unusual way to avoid work to reconfigure other regions being blocked by the outage. This is a reason for the first levels of decomposition to align with your system’s failure domains.)

I would typically use something like your region module as the first level of decomposition, either by writing a separate root module for each region that all call into that shared module, or using the region module itself as the root and using workspaces in its backend to keep each region’s state separated. For AWS in particular, the former is usually preferable from a failure domain standpoint, because otherwise all of your state storage will be colocated in a single S3 bucket in a single AWS region. Different tradeoffs can apply to other platforms.
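As a rough sketch of the first option, a per-region root module might look something like the following. The directory layout, bucket name, and the module's input variables are all hypothetical; I haven't checked them against your repository, so adjust them to whatever your region module actually expects.

```hcl
# live/us-west-1/main.tf (hypothetical path: one root module per region)

terraform {
  backend "s3" {
    # Assumption: each region's state lives in a bucket in that same region,
    # so an outage elsewhere does not block access to this state.
    bucket = "example-tfstate-us-west-1" # hypothetical bucket name
    key    = "region/terraform.tfstate"
    region = "us-west-1"
  }
}

provider "aws" {
  region = "us-west-1"
}

module "region" {
  # Hypothetical location of the shared region module
  source = "../../modules/region"

  # Hypothetical input variables
  region             = "us-west-1"
  availability_zones = ["us-west-1a"]
}
```

Each other region would get its own small directory like this one, differing only in the region-specific values. With the workspaces alternative you would instead keep a single root and could derive the region from `terraform.workspace`, at the cost of all the states sharing one backend.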

There are also often some extra objects that don’t belong to just one region. It looks like your “GlobalAccellerator” objects are examples of that. For these, it’s common to have a separate “global” configuration that can be applied once all of the other regions are active in order to produce whatever global objects are needed to make the parts appear as a single system. The global configuration can use data sources to retrieve information about the region-specific objects as necessary to complete the global object configurations. The global configuration will be spanning across multiple failure domains of course, so it’s best to keep it as small as possible within other constraints.
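To illustrate, a minimal "global" configuration could look roughly like this. It assumes, purely for the sake of example, that each regional configuration creates a load balancer with a predictable name; the names below are hypothetical and your actual endpoints may be something else entirely.

```hcl
# global/main.tf (hypothetical path)

# The Global Accelerator API is served from us-west-2.
provider "aws" {
  region = "us-west-2"
}

# Extra provider configuration for reading objects in a specific region.
provider "aws" {
  alias  = "us_west_1"
  region = "us-west-1"
}

# Look up a load balancer that the us-west-1 configuration created.
data "aws_lb" "california" {
  provider = aws.us_west_1
  name     = "example-us-west-1" # hypothetical load balancer name
}

resource "aws_globalaccelerator_accelerator" "main" {
  name            = "example" # hypothetical name
  ip_address_type = "IPV4"
  enabled         = true
}

resource "aws_globalaccelerator_listener" "https" {
  accelerator_arn = aws_globalaccelerator_accelerator.main.id
  protocol        = "TCP"

  port_range {
    from_port = 443
    to_port   = 443
  }
}

# One endpoint group per region, pointing at that region's load balancer.
resource "aws_globalaccelerator_endpoint_group" "california" {
  listener_arn          = aws_globalaccelerator_listener.https.id
  endpoint_group_region = "us-west-1"

  endpoint_configuration {
    endpoint_id = data.aws_lb.california.arn
    weight      = 100
  }
}
```

Because the data source only reads the regional object, this configuration never modifies anything the regional configurations manage; it just needs them to have been applied first.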

I hope that helps!


Excellent info, thanks for all the tips, I will have to subdivide this further. Would you recommend creating 1 Terraform “project” or configuration per AZ even? Or is per-region good enough?

Thanks again!