In my current implementation for setting up infrastructure on AWS, I have a single main.tf that creates a complete environment. A CI/CD pipeline applies it and swaps in a different variables file for each environment (QA, staging, prod, DR).
Now we are planning to move to layer-based code, such as a network layer (VPC, SG…), an app layer, and a DB layer, so that we can apply only the .tf files affected by a change. With the current implementation we have to apply the whole main.tf at once, so even a single error means recreating the complete infrastructure.
If anyone can share a reference on how to structure Terraform code, or even just how to group the different components, that would be great.
Our infra has:
VPC (private & public subnets)
EC2 & SG
ALB
Route 53
S3
IAM
How can I create a folder structure and maintain good coding practices for this? Any help would be appreciated.
You will find that this is a topic of great contention, but there are some good practices you can follow, such as using modules, remote state, and keeping .tf files small and concise to avoid confusion. I currently maintain dev, stage, and prod environments on AWS. An example would be deploying your VPC module(s), outputting the VPC ID, subnet IDs, etc., and then pulling those values into your EC2 module(s) using the terraform_remote_state data source.
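To make that concrete, here is a minimal sketch of the hand-off between the two layers. It assumes an S3 backend, and the resource names, variable, bucket, and key are placeholders, so adapt them to whatever backend and naming you actually use. The two snippets live in separate root configurations, as noted in the comments.

```hcl
# In the network/VPC configuration: expose what downstream layers need.
output "vpc_id" {
  value = aws_vpc.main.id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}

# In the EC2/app configuration: read the network layer's state.
# Backend type, bucket, key, and region here are placeholders.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-states"            # hypothetical bucket
    key    = "network/qa/terraform.tfstate"   # hypothetical key per layer/env
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id                  # hypothetical variable
  instance_type = "t3.micro"
  subnet_id     = data.terraform_remote_state.network.outputs.private_subnet_ids[0]
}
```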
I personally keep all of my Terraform code in GitLab and let CI/CD deploy everything. I have separate repositories per group of components instead of one folder structure containing everything.
A fresh deployment for me may look like:
Deploy RDS (output anything Lambda may need)
Deploy Lambda using outputs from the RDS module (output anything API Gateway might need); there is a rough sketch of this hand-off below the list
Deploy API Gateway (output anything Route53 may need)
Deploy Route53
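Applying the same remote-state pattern to the RDS → Lambda hop might look roughly like this. The resource names, role variable, bucket, and key are placeholders, and the split across the two repos is indicated in the comments.

```hcl
# In the RDS repo: output anything Lambda will need.
output "db_endpoint" {
  value = aws_db_instance.main.address
}

# In the Lambda repo: pull the RDS output in and pass it to the function.
data "terraform_remote_state" "rds" {
  backend = "s3"
  config = {
    bucket = "my-terraform-states"        # hypothetical bucket
    key    = "rds/qa/terraform.tfstate"   # hypothetical key
    region = "us-east-1"
  }
}

resource "aws_lambda_function" "api" {
  function_name = "api-handler"
  role          = var.lambda_role_arn     # hypothetical variable
  runtime       = "python3.12"
  handler       = "app.handler"
  filename      = "lambda.zip"

  environment {
    variables = {
      DB_HOST = data.terraform_remote_state.rds.outputs.db_endpoint
    }
  }
}
```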
Hopefully this is making sense. This is something you’ll really have to explore for yourself, and it will depend on what type of infrastructure you’re managing, how many people will be deploying with Terraform, and whether you need to keep scaling your infrastructure or are at a steady state. Feel free to respond with any new questions I may have generated.
I follow a pattern pretty much like @castironclay, though I’ve actually been making use of Terraform Cloud and GitHub, but… tomato, tomahto, as they say.
Much of my environment has moved towards serverless, so I’m not needing to define lower-level infrastructure like VPCs, gateways, subnets, etc., but I follow the same practice of having the downstream workspace import the upstream workspace via terraform_remote_state data sources to access the outputs it needs. I still have a few pieces out there that chain multiple deployments, and I’ve even implemented triggers so that when the deployment being depended on updates, the downstream workspaces start to update as well.
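With Terraform Cloud the same lookup goes through the remote backend rather than an S3 bucket. A minimal sketch, assuming placeholder organization and workspace names:

```hcl
# Downstream workspace: read outputs from the upstream VPC workspace.
data "terraform_remote_state" "vpc" {
  backend = "remote"
  config = {
    organization = "my-org"      # hypothetical organization
    workspaces = {
      name = "vpc-prod"          # hypothetical upstream workspace
    }
  }
}

# Downstream resources can then reference the upstream outputs, e.g.:
#   data.terraform_remote_state.vpc.outputs.vpc_id
```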
@castironclay Thanks for the help.
I liked the idea of keeping a separate repo (and separate tf state) per layer, which would help with long-term maintainability. If a single resource has to be updated, we can do it by running the pipeline job for that particular repo, without affecting any other running resources.
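For us that per-layer state split could be as simple as giving each layer's repo its own backend key. A rough sketch, assuming an S3 backend with hypothetical bucket and key names:

```hcl
# network repo, backend.tf
terraform {
  backend "s3" {
    bucket = "my-terraform-states"            # hypothetical bucket
    key    = "network/qa/terraform.tfstate"   # one key per layer and environment
    region = "us-east-1"
  }
}

# app repo, backend.tf: same bucket, different key
terraform {
  backend "s3" {
    bucket = "my-terraform-states"
    key    = "app/qa/terraform.tfstate"
    region = "us-east-1"
  }
}
```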
@jbouse Can you please elaborate on how you implemented the downstream trigger when something upstream gets changed? How does the downstream know that the data upstream has been changed/updated? (Do you run terraform plan on the downstream code and check?)
This would help us implement a better IaC cycle.
@Ruman1996 I’m using Terraform Cloud to execute, so I’m simply using Run Triggers between the workspaces. At the lowest level I have the workspace that defines my VPC; this is set as a Run Trigger source for the ALB and EFS workspaces, which would need to be updated if the VPC were changed. Likewise, if I split an application deployment into separate pieces, I can assign Run Triggers to connect them.
When triggered in this way, it only runs terraform plan and does not apply, despite Auto Apply being set on the workspace, so it does require going in, reviewing, and approving to push the downstream changes.
Right now all of my Run Trigger configuration is part of my one-time initial setup, along with variables, but using the hashicorp/tfe provider I don’t see why I couldn’t eventually fold this into a deployment run from a GitHub Action, possibly on an instance running inside my AWS environment capable of setting everything up. I just haven’t gotten to that point yet.
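For anyone curious what that might look like, here is a rough sketch using the hashicorp/tfe provider. The organization and workspace names are placeholders, and the provider token is expected via the TFE_TOKEN environment variable or terraform login credentials.

```hcl
provider "tfe" {
  # Token picked up from the TFE_TOKEN environment variable.
}

# Look up the upstream (VPC) and downstream (ALB) workspaces.
data "tfe_workspace" "vpc" {
  name         = "vpc-prod"   # hypothetical upstream workspace
  organization = "my-org"     # hypothetical organization
}

data "tfe_workspace" "alb" {
  name         = "alb-prod"   # hypothetical downstream workspace
  organization = "my-org"
}

# Queue a run in the ALB workspace whenever the VPC workspace applies.
resource "tfe_run_trigger" "alb_after_vpc" {
  workspace_id  = data.tfe_workspace.alb.id
  sourceable_id = data.tfe_workspace.vpc.id
}
```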