Data source behavior - resource not found

An interesting topic came up in a PR for a Terraform provider for Azure DevOps. I’d appreciate a clear answer on how the Read function of a data source should behave.

The only guidance that I’ve found for Read functions is through the Plugin Development Guide and the comments for Resources.

That guidance makes sense for a resource - if the resource is absent from the remote service, that should trigger a re-create.

However, what is the guidance for a data source? If the resource referenced by the data source is not found, does Terraform core expect an error to be returned, or should d.SetId("") be called and a nil error returned?

The general guidance I’ve heard most provider developers follow, and the closest thing I’ve heard to a stance from HashiCorp on this, is that it depends. I know that’s not the clear answer you’re looking for, so I’ll try to elaborate.

Data sources, at their core, are Terraform’s way of referencing information that Terraform doesn’t need to control itself. Whether to return an error or an empty value really depends on what is being referenced, just like with any API. A good illustrative question for this situation is: is there a valid use case for the empty value? When there is, it’s usually for data sources that return a list, where the empty list is useful in some contexts.

To give some concrete examples: if you were writing a data source that returns the instances in an autoscaling group, finding none in that group should probably result in an empty list being returned. Autoscalers sometimes scale down to 0, so that’s not an “error state”, that’s an expected state. If you were writing a data source that retrieves an image by ID, getting a 404 should probably yield an error, because that’s not an expected state. You were told something existed, and it doesn’t, and there’s no plausible reason why it wouldn’t during the normal course of events.

A lot of this comes down to how the information is going to be used. Is it going to be used in a count expression, in which case, a count of 0 is supplying valid information downstream? Or is it going to have its fields directly referenced, in which case it’s going to erroneously return default empty values and lie to whatever is interpolating it?

Does that guidance help clarify things a bit?

Indeed it does. This matches the proposal outlined by the OSS contributor that first identified the problem as well, so it is good to know that we are thinking about this correctly.

Is there a good place for me to request clarification on the documentation for the HashiCorp website? This is really helpful context that is somewhat at odds with the guidance I’ve found so far.

I would highly recommend opening an issue on the SDK repo to track the documentation improvement you’d like to see, and someone on our team can work on a solution.


I’ve added a new issue here: “[docs] Describe correct behavior for data source Read functions” (Issue #467, hashicorp/terraform-plugin-sdk on GitHub)
