Proper way to deal with transient resources?

stuart-c · March 2, 2021, 4:59pm

For this specific case, “as soon as possible” is still complex: in the normal case, where you want to use ACM’s standard auto-renewal feature, the CNAME is needed “forever”.

Even if you didn’t want that, the moment the CNAME is no longer needed could very easily be a time when Terraform isn’t running. For example, assume a DNS zone is not yet delegated, and therefore isn’t yet queryable by a normal resolver. Terraform can happily create the certificate and the CNAME record, but the validation will not yet succeed. That part might time out. At a later point the zone delegation is completed, so AWS can now validate the certificate and issue it. At that point the CNAME is “no longer needed” for this validation process, but Terraform isn’t running.

bcsgh · March 2, 2021, 5:32pm

I refer you back to the use of aws_acm_certificate_validation and the idiom of “keep looping on apply until it becomes a no-op”. Between those, “as soon as possible” becomes well defined and reasonably simple to accomplish.
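As a rough illustration of the pattern being referred to (a sketch, not configuration taken from this thread): the resource names, the domain, the `zone_id` variable, and the timeout value below are all placeholders, and the `for_each` form assumes AWS provider v3 or later.

```hcl
# Hypothetical input: the Route 53 hosted zone that serves the domain.
variable "zone_id" {
  type = string
}

# Request a DNS-validated certificate (domain name is a placeholder).
resource "aws_acm_certificate" "example" {
  domain_name       = "example.com"
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# One validation CNAME per domain listed on the certificate.
resource "aws_route53_record" "validation" {
  for_each = {
    for dvo in aws_acm_certificate.example.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  zone_id         = var.zone_id
  name            = each.value.name
  type            = each.value.type
  records         = [each.value.record]
  ttl             = 60
}

# Blocks the apply until ACM reports the certificate as issued,
# or fails once the create timeout elapses (value chosen arbitrarily here).
resource "aws_acm_certificate_validation" "example" {
  certificate_arn         = aws_acm_certificate.example.arn
  validation_record_fqdns = [for r in aws_route53_record.validation : r.fqdn]

  timeouts {
    create = "45m"
  }
}
```

If the validation resource times out (for example because the zone is not delegated yet), the certificate and the CNAME remain in state, and a later apply only has to retry the validation step, which is what makes re-applying until the plan is empty workable.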

The same pattern would also work with temporary resources and provider-implemented provisioners: if things timed out, they would be left in the last state, and a subsequent apply would continue from that state. The main difference would be that with these features, ignoring timeouts and other transient failures, success happens in a single apply rather than after many. Though I’d still re-apply until I get a no-op as validation.

Side note: what is the accepted practice for dealing with configurations that need to react to changes outside the control of Terraform? Say, the result of data lookups changing? The best solution I’m aware of would be to schedule periodic applies, which would be mechanically very similar to the apply-till-no-op from above.

ohmer · March 3, 2021, 5:36am

Hi,

That sounds a little harsh, on top of some other very assertive statements, though I have no problem agreeing to disagree… I am actually happy we disagreed; it challenges my thinking. I will try to find some time to support my idea with a more practical HCL example, and I might learn something in the process. I try to stick to evaluating ideas and proposals rather than the person, but your use case is interesting to me, so let me throw out a couple more ideas, hoping I am not wrong and that you can tell me whether they sound valid or invalid to you.

Periodic applies via continuous deployment or a scheduled script are a common practice in my experience. Terraform is designed to run in automation, although you can break this depending on how your environment is set up. One example is forcing MFA on the credentials used by the pipeline or cron job: the underlying AWS Go SDK will request an MFA token on standard input and fail. Some folks also use local providers, which work fine on a CLI but may fail in a pipeline (you might have no write permissions on $CWD and try to write a template file).

Periodic plans are also common for detecting drift. In a world where GitOps is not enforced, manual modifications can happen. Depending on how you write your templates, some drift may exist on the provider side that Terraform does not see. An example is describing an AWS security group and its rules as separate resources. If somebody adds a new rule in the console, there is no drift as far as Terraform is concerned: separate group and rule resources describe the presence of a security group and of a rule, but do not say that the rule is the only one in the group. driftctl detects that kind of drift, among other things.
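To make the security group case concrete, here is a minimal sketch of the "separate resources" style described above. The names, port, CIDR range, and the `vpc_id` variable are made up for illustration.

```hcl
# Hypothetical input: the VPC the group lives in.
variable "vpc_id" {
  type = string
}

# The group itself, declared without any inline ingress/egress blocks.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = var.vpc_id
}

# A single rule attached to the group as its own resource.
# Terraform tracks this rule only; it does not claim to own the full rule set.
resource "aws_security_group_rule" "https_in" {
  type              = "ingress"
  security_group_id = aws_security_group.app.id
  protocol          = "tcp"
  from_port         = 443
  to_port           = 443
  cidr_blocks       = ["0.0.0.0/0"]
}
```

With this layout, a rule added by hand in the console is not represented by any resource in the configuration, so a later `terraform plan` would be expected to report no changes even though the real group now differs from what was intended.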

There were some very interesting talks about these pitfalls at the last HashiTalks. Videos have not been posted yet. In the meantime, here are some pointers.
=> Hashitalks 2021 - Google Slides
=> https://www.youtube.com/channel/UCqUCM6vQKkq08amsqf_vKzg

Hope it helps.

bcsgh · March 3, 2021, 6:12am

driftctl sounds like an interesting project. It sounds like an attempt to solve at least part of a problem I’ve worked with before. I’ll have to look into it.

BTW: I didn’t intend my disagreement personally; if you understood it that way, I apologize. Also, what I was disagreeing with you about is not the technical issues here, it’s the mindset of who gets to define what is to be done: the product owner or its users? Interestingly, I’ve seen the same mindset (exaggerated: “this is how things should work and we expect the world to conform to our opinion”) in a number of projects associated with the Go programming language, including Go itself. Personally, I think anyone who thinks they can foresee all valid uses of a product is fooling themselves. And anyone who unnecessarily limits their offering to the uses they foresee is hamstringing themselves.

stuart-c · March 3, 2021, 8:39am

Ultimately it is the product owner/developers who decide what are the valid uses of the product they are creating.

That doesn’t mean there might not be other possible uses, but the creators get to decide which of those to ignore, either explicitly (we don’t want Terraform to ever be able to do X) or not (we’d like to be able to tackle that but don’t currently have the time).

As a user you can potentially influence a project, but you generally have no method to control what others decide to do. At least with Open Source projects you have the right (assuming you also have the time & technical ability) to fork a project to go in a different direction.

Most projects are opinionated in one way or another (some strongly), as from a developer’s perspective having a well-defined scope and way of approaching things makes things a lot easier, with the downside that those who disagree or want other things may be out of luck.

(I wouldn’t say things are normally “we expect the world to conform to our opinion” as there is generally no requirement to use a particular tool, but instead “it works this way/handles these cases and if you need anything different we might be open to discussions/contribution, but ultimately we may decide such other ideas aren’t going to be implemented/maintained by us, but you are free to go your own way”)

bcsgh · March 4, 2021, 5:56am

That is without question true. However, some choices those owners could make are better than others. If TF chose to try to add making coffee and mixed drinks to its functionality (despite how critical those are to some teams getting things done), I don’t think many would consider that a good idea. Similarly, TF choosing to not support resources whose ID can’t be chosen in advance would clearly be a bad idea.

Again true, but I do have a very effective way of controlling what the tools I use do: by controlling which tools I use.

And yet again true… but in my experience the scope of what the clients of a product need to do is not really under anyone’s control; it sort of just happens to everyone. If a product makes choices that are too restrictive then most clients will end up needing to work around those restrictions sooner or later.

(Further, the impression I’ve gotten working with things is that choices around restriction have an interesting similarity to Turing completeness: the system can either do almost everything, or almost nothing. And when you have a system that can do almost everything, and is designed to do 99% of what you need to do, people tend to figure out how to hack that last 1% out, despite what the maintainers would wish, and that generally ends up frustrating for all involved.)

How about a real-world example of a case where a product tried to make a choice of the type I’m saying is a bad idea? When protocol buffers added oneof a while back, the maintainers of the Go implementation looked at its semantics and noticed that it was impossible to implement using POD structs like all prior protobuf implementations had been. So rather than implement that feature, they posted a notice saying they had chosen not to. I never found out how, but a few weeks later, oneof was added to the Go implementation. The mistake made by the maintainers was assuming that they could choose what features their users needed; they can’t. In many cases maintainers can choose what features are provided, but not what people need. (In this case I expect there were others in positions of authority who could and did dictate that what was needed would be offered.)

In summary, my view is that most users will have a small but non-zero set of uses that are outside what the maintainers initially intend to support. Limiting the scope of the product to eliminate those needs will cascade into removing most use cases. If you don’t eliminate those uses, people will “solve” them. The best solution IMHO is to strive for a balance that minimizes the added complexity seen when dealing with the common case and, at the same time, maximizes the ability of the product to deal with new and novel uses in a sane and contained way (e.g. golang’s unsafe).

ohmer · March 4, 2021, 6:34am

Turing Completeness related to a discussion on IaC? I think you are overthinking it

apparentlymart · December 14, 2021, 4:11pm

A post was split to a new topic: Managing “transient” objects with Terraform
