Manage large infrastructure

Hey guys.

I’ve been using HashiCorp software (such as Terraform, Packer, Consul and Vault) for a few years now and, despite some mishaps, it has been working quite well.

This time, my question is about Terraform and how to organise the code when dealing with large infrastructure.

Currently, I organise the code by grouping resources that “make sense” together. For example, I’ll place in one project all the resources that create a cluster of VMs, including, if needed, the provisioning scripts. But I’d put the database in a separate project. Why? Separation of concerns: the cluster of VMs and the database have different life-cycles.
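To make this concrete, here is a stripped-down sketch of what the VM-cluster project might contain (I’m assuming AWS purely for illustration; the resource names and the provisioning script path are hypothetical):

```hcl
variable "ami_id" {
  description = "Image to boot the cluster nodes from"
  type        = string
}

provider "aws" {
  region = "eu-west-1"
}

# The whole cluster, including its provisioning script, lives in this one
# project; the database is managed in a separate project with its own state.
resource "aws_instance" "node" {
  count         = 3
  ami           = var.ami_id
  instance_type = "t3.medium"
  user_data     = file("${path.module}/scripts/provision.sh")

  tags = {
    Name = "cluster-node-${count.index}"
  }
}
```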

So what do I do? For each project:

  • I create a separate Git repository with all the Terraform and provisioning files.
  • I create a CI/CD pipeline that builds and deploys the infrastructure.
  • I create a remote state file and, for each environment, a separate workspace (see the sketch after this list).
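For that third point, the backend configuration of each project looks roughly like this (S3 and the bucket name are illustrative; any backend that supports workspaces behaves the same way):

```hcl
terraform {
  backend "s3" {
    bucket = "acme-terraform-state"          # hypothetical shared state bucket
    key    = "vm-cluster/terraform.tfstate"  # one state key per project
    region = "eu-west-1"
  }
}
```

Then `terraform workspace new staging` and `terraform workspace select staging` give each environment its own state; the S3 backend keeps workspace states apart automatically under an `env:/<workspace>/` prefix.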

When I want to run everything, I run each CI/CD pipeline separately and in the right order. When I need to run just one project, I run only its pipeline.

The problem is, as you’ve already guessed, that it starts to get complex to manage the dependencies between all the projects.
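To illustrate the kind of dependency I mean: the cluster project needs, say, the database address, and with separate state files that ends up as a `terraform_remote_state` data source (backend details and output names are again illustrative):

```hcl
# In the vm-cluster project: read outputs from the database project's state.
data "terraform_remote_state" "database" {
  backend   = "s3"
  workspace = terraform.workspace  # stay in the same environment

  config = {
    bucket = "acme-terraform-state"
    key    = "database/terraform.tfstate"
    region = "eu-west-1"
  }
}

locals {
  # The database project must declare `output "address"` for this to work.
  db_address = data.terraform_remote_state.database.outputs.address
}
```

This solves passing values around, but not ordering: the database pipeline still has to run before the cluster pipeline, and that coordination is exactly what is becoming painful.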

My question is: are there any established best practices for managing large infrastructure projects? Something that makes it possible to keep separate state files (to minimise risk) and to deploy each project individually or all together, with little or no extra complexity?

I hope I was able to explain myself clearly and that it all makes sense.
Looking forward to your opinions.

Cheers.
Hugo