I am working on module code-gen for my company.
One problem we’re facing is that it is difficult to infer valid combinations of attributes for a resource, because the provider does not return validators such as “ConflictsWith” from GetSchema.
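To make the problem concrete, here is a sketch using `aws_route`, whose mutually exclusive target arguments are (as far as I understand) enforced via `ConflictsWith` in the provider's Go code rather than being visible in the schema that GetSchema returns:

```hcl
# Both target attributes set: the provider's own validation rejects
# this at plan/validate time, but nothing in the GetSchema response
# tells a code-gen tool that these two arguments conflict.
resource "aws_route" "example" {
  route_table_id         = var.route_table_id
  destination_cidr_block = "10.0.0.0/16"

  gateway_id     = var.gateway_id     # conflicts with nat_gateway_id
  nat_gateway_id = var.nat_gateway_id
}
```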
One potential partial solution would be to ignore, as we query and import state, attributes that are set to their defaults. This would allow us to avoid setting conflicting attributes (as long as the actual state has no conflicts). However, the provider also does not return the Default or DefaultFunc portions of an attribute's schema.
Is there any way to get these details from the provider?
Are there any work-arounds?
Is this by design, or would HashiCorp consider proposals to expand what the provider returns? One potential benefit of adding these to GetSchema is that it may become possible to validate terraform files without calls to ValidateDataSourceConfig or ValidateResourceTypeConfig, reducing the exposed API surface area.
edit: here is a related post asking for these to be returned from the provider: How to document custom terraform providers?
The current design for Terraform is that provider schema only includes the bare minimum required to encode and decode the configuration or state information reliably – since those tasks are done by Terraform Core – but everything else is delegated to the provider to give maximum flexibility to adapt to the unique quirks of each target system.
You’ve seen that most providers are written with one of the official SDKs and that each of the SDKs offers some built-in mechanisms which a provider developer can just opt into with declarative schema information rather than by writing code directly. But from Terraform Core’s perspective that’s an implementation detail: it’s possible to implement equivalent functionality directly in code, and it’s possible to write a provider not using the SDK at all if you need to.
So all of this is why the provider protocol itself doesn’t expose details such as conflicting attributes: that behavior is intentionally encapsulated inside the provider to allow provider developers the most flexibility in implementing whatever logic they need to make a particular remote system work, even if that means bypassing the declarative mechanisms in the SDK and writing the logic by hand instead.
I don’t expect we would choose to change this architecture, because this flexibility has already proven itself useful many times. Any protocol additions exposing these details would either return misleading information when a provider is using its own logic, or would prevent providers from using custom logic.
Automatic code generation is an interesting challenge with lots of new requirements that we don’t currently consider to be in Terraform’s scope. However, I would be interested in hearing more about what you are aiming to achieve. It’s possible that we might be able to meet those needs in some other way that preserves the flexibility of the current architecture, or alternatively design some way for you to integrate with specific SDK implementations in particular, accepting that some providers won’t be compatible with your system because they aren’t using one of the SDKs you implement against.
That makes sense! I will do some more digging to understand how certain portions of the schema are implementation details / can use custom logic.
To give you a bit more detail to see if there are alternative methods:
- large AWS and GCP footprints
- Using terraform without any modules, resulting in obscure differences between every resource
- Modules autogenerated with the least amount of variables given our specific footprint
- We plan to “start fresh” by importing instead of porting existing terraform code/state
For example, since all our RDS instances are on the same DB version, this would be hard-coded in the module. From here we intend to drive the reduction of variables to make all resources similar.
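A minimal sketch of the kind of module we intend to generate (all names, the engine, and the version number here are hypothetical placeholders, not our actual values):

```hcl
# Sketch of a generated single-resource module. Attributes that are
# identical across our footprint are hard-coded; attributes that vary
# between resources become input variables.
variable "identifier" {
  type = string
}

variable "instance_class" {
  type = string
}

resource "aws_db_instance" "this" {
  identifier     = var.identifier
  instance_class = var.instance_class

  # Identical across our entire footprint, so hard-coded rather
  # than exposed as a variable (placeholder values).
  engine         = "postgres"
  engine_version = "13.7"
}
```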
We’re doing this by querying AWS and GCP directly, grabbing the resource schema from the provider, and combining this information to generate the modules.
Our current solution is to list out incompatible attributes in code.
We’re also considering, given the limitations of GetSchema, doing two passes: first we generate a module that may contain incompatible attributes, then we run terraform against it and parse the resulting errors to determine what’s actually incompatible. This idea has not been thoroughly thought out, and depends on being able to reliably capture and parse the error messages.
One interesting note is that it looks like the “Deprecated” field wasn’t returned from GetSchema in previous versions (according to the OP link). This field has helped us immensely now that it is returned.
Thanks for sharing that extra information!
What I’ve understood from your description is that you intend to take the schema of a set of resource types from a particular provider and machine-generate a module wrapping those which exports some subset of the union of all of the arguments to those resource types as input variables on your module.
I assume this would also be matched by exposing some subset of the attributes of all of the resource types as output values.
Did I understand that correctly?
I’m curious to know what drives the decision of whether to include a particular variable or output value in the generated module. Do you have some hand-written metadata which tells your system what parts of the provider schema it ought to be exposing?
While we intend to make this solution work for multi-resource modules, our first goal is to ensure it works for a module that wraps a single resource. We haven’t yet given much thought to output values, as we don’t use outputs extensively in our existing terraform.
The only hand-written data we are currently providing to the code-gen tool is which resource type to create a module for (along with our workarounds for conflicting attributes). All of the module content - resource, variables, and tfvar files - are inferred from the resource’s schema and our cloud footprint.
To derive this module information, we iterate through all of the attributes and all of the cloud footprint, and decide if an attribute can be static or needs to be a variable based on our existing resources.
If output values are added to the module, I would imagine these would need to be explicitly passed to the tool.
Thank you for the guidance so far! It sounds like code-gen isn’t yet a design consideration for terraform, but I think we’re getting close to finding some of the potential work-arounds for code-gen.
Thanks for the additional context!
While I wouldn’t typically recommend writing modules that just directly wrap a single resource (it doesn’t usually add anything beyond declaring the resource block directly), I expect there are some situations where it would be useful. Indeed, right now it will be difficult to exactly replicate a provider’s validation behavior in the input variables of a module, because there is a power mismatch: provider validation logic is arbitrary code written in Go, whereas Terraform variable validation rules are just isolated declarative tests.
I think for now in your situation it would work best to not try to reproduce that logic in your wrapping module at all, and instead just make sure all of your input variables pass through verbatim to the enclosed resource block and let the provider run its own validation logic against those inputs.
The results should be the same except that the error messages will point to the contents of the nested resource block instead of to the input variable declarations, which is not ideal but hopefully an acceptable compromise.
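To illustrate the pass-through approach (a sketch, reusing the hypothetical `aws_route` conflict from earlier; the variable names are placeholders): the module makes no attempt to reproduce the provider's conflict logic, and defaults the optional arguments to `null`, which Terraform treats the same as omitting the argument entirely, so the provider's own validation decides what is acceptable.

```hcl
variable "route_table_id" {
  type = string
}

variable "destination_cidr_block" {
  type = string
}

# Optional arguments default to null, which is equivalent to leaving
# them unset. No validation blocks attempt to mirror ConflictsWith;
# the provider rejects invalid combinations at plan time.
variable "gateway_id" {
  type    = string
  default = null
}

variable "nat_gateway_id" {
  type    = string
  default = null
}

resource "aws_route" "this" {
  route_table_id         = var.route_table_id
  destination_cidr_block = var.destination_cidr_block
  gateway_id             = var.gateway_id
  nat_gateway_id         = var.nat_gateway_id
}
```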