What would be the best practice for implementing Terraform?
My case: ideally, I want to build a CI/CD pipeline in Azure DevOps that uses Terraform to provision infrastructure resources in Azure and then configures them with Chef (ChefSolo at the moment; we will move to Chef Server in the long run).
I have created a working pipeline that provisions a basic Azure VM. However, the Terraform structure is very basic and does not use any Terraform variable files. All the variables are stored as pipeline variables, which replace tokens in the Terraform files during the build/release process.
In the long run, we want to use Terraform to create resources for different environments: development, QA, and production. Each environment has certain machine types with different specifications. How would I then structure all my Terraform files in the most effective way, and how would I manage the variable files for those machine types?
And if I want to integrate Chef into the Terraform CI/CD pipeline, how would I achieve that? I know there is a provisioner step in Terraform that we could use to install Chef and connect the resource to a Chef server. But is that a good approach? And if I only use ChefSolo, how would I integrate it into this Terraform pipeline?
There are many ways to implement Terraform in a pipeline. Regarding the first part of your question:
How would I then structure all my Terraform files in the most effective way?
A common pattern is to create a subdirectory for each long-running environment. Configuration that is likely to be reused can be factored out into a module.
For example:
environments/
|-- dev/
|   |-- dev.tf
|-- prod/
|   |-- prod.tf
resources/
|-- main.tf   # module, etc. for reusable configuration
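As a rough sketch of how an environment directory in this layout might consume the shared configuration, each environment's file could contain little more than a module block. The module name `app_vm`, the inputs `environment` and `vm_size`, and the VM size are illustrative assumptions, not part of the original layout; the module in resources/ would have to declare matching variables.

```hcl
# environments/dev/dev.tf (illustrative contents, not from the original post)
module "app_vm" {
  # Relative path to the shared module in the layout above.
  source = "../../resources"

  # Hypothetical inputs; resources/ must declare variables with these names.
  environment = "dev"
  vm_size     = "Standard_B2s"
}
```

Under the same assumptions, prod.tf would be identical apart from the argument values, which keeps the differences between environments visible in one small file.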
There are some fantastic documents and write-ups that can help:
When to Use Workspaces - While not a guide on VCS structure, per se, it will talk about some useful Terraform Enterprise patterns you can think about applying to your own pipeline.
There are many other patterns that will work; it all depends on scale and your development workflow. As for the second part of your question:
…how would I manage the variable files for those machine types?
You’re right, the variable files can get unruly! One way to handle this is to add a step in your pipeline that retrieves configuration from somewhere and injects it into the pipeline. This is what Terraform Enterprise does. I’ve also scripted the composition of variable files in a pipeline, but that required an investment in building the logic.
And if I want to integrate Chef into the Terraform CI/CD pipeline, how would I achieve that?
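Before investing in scripted composition, a simpler option in plain Terraform is to describe all machine types in one typed variable and keep one variable file per environment. This is a different technique from the injection step described above, shown only as a sketch; the variable name `machine_types`, its attributes, and the file names are assumptions.

```hcl
# variables.tf (hypothetical): one typed map describes every machine type.
variable "machine_types" {
  description = "Specification for each machine type in this environment."
  type = map(object({
    vm_size      = string
    disk_size_gb = number
  }))
}

# Example reference elsewhere in the configuration:
#   var.machine_types["web"].vm_size

# dev.tfvars (hypothetical), passed with: terraform plan -var-file=dev.tfvars
# machine_types = {
#   web = { vm_size = "Standard_B2s",    disk_size_gb = 64 }
#   db  = { vm_size = "Standard_D4s_v3", disk_size_gb = 256 }
# }
```

The pipeline then only needs to choose which variable file to pass for a given environment, rather than replacing tokens inside the Terraform files.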
In general, I prefer to use Packer to build an immutable image instead of running the provisioners. Packer has a chef-solo provisioner. If there are post-provisioning steps that can’t be baked into the image, you might be able to use the remote-exec provisioner to trigger chef-solo, and as you mentioned, there is the Chef provisioner if you choose to use a Chef server. There might be community plugins for chef-solo but they may not be maintained.
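For the remote-exec route, a minimal sketch might look like the following. It assumes the VM already exists and already has Chef installed (for example, baked in with Packer), and that a local chef/ directory holds the cookbooks together with solo.rb and node.json. The variable names, the directory layout, and the use of a `null_resource` are all assumptions made for illustration. Note also that the built-in Chef provisioner was later deprecated and removed from Terraform (in 0.15, as far as I know), so check what your version supports.

```hcl
# Hypothetical inputs: address, SSH user, and key path of an existing VM.
variable "vm_host" { type = string }
variable "ssh_user" { type = string }
variable "ssh_private_key_path" { type = string }

resource "null_resource" "chef_solo" {
  connection {
    type        = "ssh"
    host        = var.vm_host
    user        = var.ssh_user
    private_key = file(var.ssh_private_key_path)
  }

  # Upload the local chef/ directory; without a trailing slash on the
  # source, it lands at /tmp/chef on the remote machine.
  provisioner "file" {
    source      = "${path.module}/chef"
    destination = "/tmp"
  }

  # Run chef-solo against the uploaded config and node attributes.
  provisioner "remote-exec" {
    inline = [
      "cd /tmp/chef && sudo chef-solo -c solo.rb -j node.json",
    ]
  }
}
```

Provisioners run only when the resource is created, so anything that must be re-applied on every pipeline run is usually better handled by a separate pipeline step or by rebuilding the image.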
@joatmon08, very nice; your explanation helped me a lot and raised a question.
In my project I separated the environments by workspaces, but while reading some recommendations from HashiCorp I saw that this is not the best way. Here is the excerpt that raised my question:
"… In particular, organizations commonly want to create a strong separation between multiple deployments of the same infrastructure serving different development stages (e.g. staging vs. production) or different internal teams. In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls. Named workspaces are not a suitable isolation mechanism for this scenario.
Instead, use one or more re-usable modules to represent the common elements, and then represent each instance as a separate configuration that instantiates those common elements in the context of a different backend. In that case, the root module of each configuration will consist only of a backend configuration and a small number of module blocks whose arguments describe any small differences between the deployments…"
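A root module of the kind the excerpt describes, with just a backend configuration and a module block, might be sketched as follows. The file path reuses the directory layout from earlier in the thread; the storage names, the module inputs, and the VM size are placeholders and assumptions, not values from any real deployment.

```hcl
# environments/prod/prod.tf (hypothetical root module for one deployment)
terraform {
  # Each deployment gets its own backend; all names here are placeholders.
  backend "azurerm" {
    resource_group_name  = "rg-tfstate-prod"
    storage_account_name = "sttfstateprod"
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"
  }
}

module "app" {
  source = "../../resources" # shared module holding the common elements

  # Arguments capture the small differences between deployments
  # (hypothetical inputs that the module would need to declare).
  environment = "prod"
  vm_size     = "Standard_D4s_v3"
}
```

Because the backend block lives in each environment's directory, credentials and access controls can differ per deployment, which is the separation the excerpt is concerned with.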
After reading and re-reading, I concluded that the approach I used to isolate environments in this project is wrong. I did some research online to check whether my understanding was correct and found this thread, which further corroborated my conclusion: isolating environments by workspaces is not the best practice, and it would be ideal to use modules (as in the example you gave above).
One thing I still have a question about, and hope you can help with: why is isolating environments by workspaces not a good practice? In the project I’m working on, this strategy works well.
And would it be right to divide the Terraform configuration into subdirectories, with one directory per environment and common modules shared by every environment?
The short answer is because multiple projects still share the same state file. This means loss/damage to the state can impact multiple environments. That’s why it can work fine, but the wording is around “strong separation” and “isolation”: there is a single linchpin that essentially connects the two environments. How much of a risk is your call, really.
Sorry, I did not mention the state files in my project. They are separated by environment (one per workspace).
If I understood correctly, isolating environments by workspaces is not a bad practice, and I decide which solution is best for my scenario.
My apologies, @diogocapistrano! The workspace approach of Terraform Cloud versus the eventual subdirectory/module structure of open source Terraform CLI can be a little confusing. We’re in the process of clarifying our documents to better describe each approach. As @Justin-DynamicD pointed out, it will heavily depend on your expected workflow and how you manage your state.
You are correct in that isolation by workspaces is not a bad practice - it works fairly well if your organization can do multi-tenancy and/or a few staging environments. Something with a very flat network topology hosting one or two Kubernetes clusters could use workspaces with minimal overhead. Eventually, as more teams require additional isolation and provision more and more offerings, we often want to standardize configurations and ensure they can be used on-demand (that is where modules and subdirectory structures begin).
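With workspaces, a single configuration usually varies its settings based on the selected workspace. A minimal sketch, assuming workspaces named dev and prod and illustrative VM sizes:

```hcl
locals {
  # Hypothetical per-workspace settings; keys must match workspace names.
  vm_size_by_workspace = {
    dev  = "Standard_B2s"
    prod = "Standard_D4s_v3"
  }

  # terraform.workspace is the name of the currently selected workspace;
  # fall back to the small size for any workspace not listed above.
  vm_size = lookup(local.vm_size_by_workspace, terraform.workspace, "Standard_B2s")
}
```

Running `terraform workspace select prod` before `terraform plan` would then pick up the prod value. Once maps like this start to grow, that is often the point where modules and per-environment directories become easier to manage.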
If workspaces work for you, I would say continue with them! While it can be difficult to justify a refactor of infrastructure configuration, structuring the subdirectories too early in the project can often lead to some other anti-patterns.