How to enable an optional "Filter" on a Datasource? (using TF Framework)

I’m migrating an internal TF provider from the SDK to the new Framework. It’s been smooth sailing so far, but there’s a feature I’m struggling to re-implement: datasources with optional filters.

type ThingModel struct {
  Name types.String `tfsdk:"name"`
}

type ThingList struct {
  Things []ThingModel `tfsdk:"things"`
}

data "myprovider_thing" "all_things" {}

This works great, and returns a list of all things from the API.

But I’d like to add an optional filter, so users can do something like:
data "myprovider_thing" "all_things" {
  filter = "some filter string"
}

Which should only return things whose name matches the filter’s value.

I’ve tried adding filter to Thing’s schema and to the ThingList struct. This gives me access to the filter string during Read, but it also saves the filter to state, dumps it to output, etc., which I don’t want.

What’s the correct way to implement this kind of behavior?

tl;dr: How do I enable an optional attr on a datasource that never gets stored in state, and is only used to modify the Read API query?

Hi @davemcphee :wave:

I guess the first question I would have is why you don’t want filter to be stored in state? Reason I ask is that it seems that it provides information as to the values that are returned by the API call.

I guess I … don’t really know. It just felt like I was going down the wrong path when I saw my filter string being written to state.

As I’m very new to the TF Framework, and quite new to TF provider development, I assumed that if something felt wrong, I was likely doing it wrong (and one of the drivers for re-writing the old TF SDK based provider was that it does do a lot of things wrong).

In thinking about how to answer your question, I do realize that there’s likely nothing wrong with storing the results of a query under the key used to make that query. What if the results change? Hmm.

If it’s appropriate to store a datasource’s optional filter string in state, then I’m happy to do it, but if there’s some other, more elegant technique, I’d like to hear it.

Terraform internally handles all resource types in much the same way throughout the code. For example, in state there are only “resources”, which have a mode of “managed” or “data”. This means their data structures and the rules governing them are roughly the same, even though they make different API calls at different times.

Because filter is part of the configuration, it will by necessity be stored as part of the resource state value along with whatever other computed attributes the data source has. The value we work with must conform to the resource schema, with filter only being optional and whatever outputs there are being computed.
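
As a rough illustration of that shape, here is a minimal sketch of what Read ends up doing. It uses plain Go types so it runs without the framework; in a real provider Filter would be a types.String field tagged tfsdk:"filter" and declared Optional in the schema, while things would be Computed. The names Thing and readThings, the pointer-for-null convention, and the substring matching are all assumptions made for the example, not something taken from your provider.

```go
package main

import (
	"fmt"
	"strings"
)

// Thing mirrors ThingModel from the question, using a plain string
// instead of types.String so the sketch runs without the framework.
type Thing struct {
	Name string
}

// ThingList stands in for the data source model. Filter is a pointer so
// that nil can represent "not set in configuration" (the framework
// expresses this as a null types.String).
type ThingList struct {
	Filter *string
	Things []Thing
}

// readThings is a hypothetical stand-in for the data source's Read logic.
// It takes the configured model and the full API result, and returns the
// value that would be written to state: the filter exactly as configured,
// plus the computed list of matching things.
func readThings(config ThingList, all []Thing) ThingList {
	state := ThingList{Filter: config.Filter, Things: []Thing{}}
	for _, t := range all {
		// Assumption: "matches" means a substring match on the name.
		if config.Filter == nil || strings.Contains(t.Name, *config.Filter) {
			state.Things = append(state.Things, t)
		}
	}
	return state
}

func main() {
	all := []Thing{{Name: "alpha"}, {Name: "beta"}, {Name: "alphabet"}}

	// With a filter configured, only matching things are returned,
	// and the filter travels along into the resulting state value.
	filter := "alpha"
	filtered := readThings(ThingList{Filter: &filter}, all)
	fmt.Println(len(filtered.Things), *filtered.Filter)

	// With no filter, everything is returned and Filter stays unset.
	unfiltered := readThings(ThingList{}, all)
	fmt.Println(len(unfiltered.Things), unfiltered.Filter == nil)
}
```

The point of the sketch is only that the configured filter and the computed results live in the same value, so whatever Read writes back has to carry both.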

The fact that data sources are still stored in the final state at all is a bit of a legacy behavior from early versions of Terraform, since they are always refreshed during plan and we don’t rely on the prior state at that point. That however doesn’t change how the data source is used internally.

Ah, that exactly addresses the source of my confusion; thank you for this explanation!

Writin’ the filter to state it is.