It sounds like you’ve already identified the main source of risk with an unattended apply: Terraform makes its plan based both on what’s written in the current configuration and on the state of the existing objects that configuration is managing. Therefore it’s possible that differences in the state of existing objects can cause the same module to behave differently when applied in a different configuration/workspace.
Another similar source of risk, related to the first, is that if you have a sequence of configuration changes A, B, C that you apply separately on one workspace, but you have another workspace that you apply less often and so A+B+C is all applied at once, the result may not necessarily be the same if changes A or B had side-effects that are not directly visible to Terraform. This particular variant is uncommon, but it is possible, because not all remote API behaviors can be 100% encapsulated in Terraform’s abstraction. As a straightforward (though rather contrived) example, consider a sequence of changes A, B where A removes `resource "aws_instance" "foo"` and B re-introduces it with the same configuration: an EC2 instance is a stateful object, so destroying one and then replacing it (A then B) may have a visibly different result than leaving it untouched (applying A and B together), even though the final Terraform configuration is unchanged. How significant this is would depend on what software is running in the EC2 instance.
One final risk is non-determinism caused by Terraform applying actions concurrently and by a remote operation not always taking a consistent amount of time to complete. For example, if two objects have the same dependencies then Terraform is likely to try to apply their actions at the same time, which means that in practice the remote system(s) could perceive them to have arrived in either order. In most cases this isn’t an issue, but can be problematic if e.g. you have a module that is lacking a necessary dependency relationship and thus may either succeed or fail depending on what order those operations end up being taken in at runtime. Most dependencies come “for free” as a result of data flow between resources, but there are some cases where the design of the remote API makes a dependency invisible to Terraform unless explicitly recorded using the
depends_on argument. For example, if you create an IAM role, attach a policy to it, and pass that role to an AWS service, Terraform can generally see automatically that the service and the policy attachment both depend on the role, but the role isn’t actually “ready” until the policy is attached and so there is a hidden dependency between the service and the role.
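Here is a sketch of that IAM pattern, using a Lambda function as the example service; the resource names are illustrative, and I’ve omitted arguments that aren’t relevant to the dependency question:

```hcl
resource "aws_iam_role" "example" {
  name               = "example-role"
  assume_role_policy = data.aws_iam_policy_document.assume.json
}

resource "aws_iam_role_policy_attachment" "example" {
  role       = aws_iam_role.example.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_lambda_function" "example" {
  function_name = "example"
  role          = aws_iam_role.example.arn
  # (other required arguments omitted for brevity)

  # Terraform can see the dependency on the role via the reference above, but
  # not the hidden dependency on the attachment, so we record it explicitly
  # to ensure the role is fully configured before the function uses it.
  depends_on = [aws_iam_role_policy_attachment.example]
}
```

Without the `depends_on`, Terraform may create the function while the role exists but has no policy attached yet, which can fail or misbehave depending on timing.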
With all of that said, if I were building a system like the one you are describing I would plan to ensure that the following invariants hold for the full life of the system:
- Remote objects are changed only by running `terraform apply` with changes to the configuration.
- Take care when writing your modules to consider all of the necessary dependencies between resources.
- Consider carefully the implications of any non-idempotent actions the Terraform configuration takes. Provisioners are a prominent cause of non-idempotence, but some of the APIs Terraform wraps can have non-idempotent behaviors too.
- Ensure that the same sequence of changes is applied to every instance of the system. If you apply commits A, B, C sequentially to one instance of the system, make sure to do the same for all other instances of the system too, rather than skipping ahead and trying to apply A+B+C all at once.
- If you have any independently-versioned modules as part of your overall configurations, the above rules must apply to changes to those modules too: if you tested module changes D and E separately during development, make sure that you apply D and E separately in production too, rather than applying D+E together or applying in the opposite order E, D for some callers. (This will probably require extra coordination in your development process to make sure that every module change is tested against the result of the one before it, which means you won’t practically be able to develop two changes to the same module concurrently.)
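One way to help enforce that last rule is to pin each module call to an exact version and bump it one release at a time in every caller, rather than using a version range. The registry address and version numbers here are hypothetical:

```hcl
module "network" {
  # Hypothetical module address; pinning an exact version (not ">= 1.4") means
  # every workspace moves through the same sequence of module releases.
  source  = "app.terraform.io/example-corp/network/aws"
  version = "1.4.0" # bump to 1.5.0 in all callers before moving on to 1.6.0
}
```

A version range would let an infrequently-applied workspace jump several releases at once, which is exactly the D+E situation described above.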
The above set of constraints is pretty conservative. In practice you can probably get away with being a little more liberal, depending on the characteristics of the resource types you plan to use. In the end, Terraform’s behavior depends a lot on the behavior of remote APIs, and since we’re talking about the general case here I’m taking a pessimistic outlook. A module written with your use-case in mind can mitigate some of the worst-case scenarios through careful design and testing, but that will tend to require deep familiarity with the behaviors of the remote services in question and the interactions between them.
Incidentally, you mentioned in passing in your question the idea that Terraform generates CloudFormation configuration. I just wanted to note, separately from the rest of this answer, that Terraform does not use CloudFormation unless you explicitly use the `aws_cloudformation_stack` resource type: instead, it calls directly into the underlying AWS APIs, the same way that CloudFormation itself would.