How to handle errors when a resource isn't a 1:1 with its API?

This question stems from my earlier thread, “V2 SDK Provider is unexpectedly removing a nested block from state”.

But I wanted to ask the question more broadly (for both the v2 SDK and also the new plugin framework) to understand how other people are doing this. I’m hoping this problem isn’t present in the new plugin framework, but even if it isn’t, my provider will be on the v2 SDK for a while so I need to figure out a solution for that still.

So let me paint a picture of my provider…

My ‘resource’ isn’t a single API call.

My resource is actually multiple ‘things’.

For example…

resource "fastly_service_vcl" "example" {
    domain {
        ...
    }

    backend {
        ...
    }

    snippet {
        ...
    }
}

The resource and each of the nested blocks are all separate ‘things’ that each have their own API calls.

So when ‘creating’ this resource (fastly_service_vcl) I actually need to make multiple API calls: one to create the service, then one to create a ‘domain’, then another for ‘backend’, ‘snippet’, etc. Finally, once those nested things are created, I need to call the API one last time to activate the main resource (fastly_service_vcl).

If there’s an error creating one of the nested blocks, the state gets messed up: it reflects the planned diff even though I return an error from the API call as part of the Create step. Refer to “V2 SDK Provider is unexpectedly removing a nested block from state” for an abstracted example of this issue, whereby the plan diff intends to delete a nested block and add a new one, and that plan diff still gets persisted to my state even though the associated API calls either failed or were never made.

I’ve tried using Partial() and I’ve tried triggering a Read of each ‘thing’. Although the final state data looks correct once I’ve done the Read, my ‘Create’ function has to return an error, so the state that was read is dropped and the original planned diff is persisted. I’ve even stopped returning an error altogether and returned just the result of the Read, which is successful, and STILL the state reflects the planned diff rather than the state produced by the Read.

I think the problem is due to a mismatch between how your resource has been designed and what Terraform expects.

In general I think a resource is seen as a fairly atomic unit, and therefore generally maps to a single API call (rather than multiple). You’d then expect to create multiple resources of different types to implement the functionality.

@maxb suggested that Partial isn’t supported for create and delete, which makes some sense - a resource either exists or doesn’t, rather than partially existing.

My suggestion would be to change the design to have multiple resource types instead of trying to do everything in a single resource.

If you aren’t able to do that, you’d probably need to look at implementing some sort of rollback, so that if one of the API calls fails the provider makes additional calls to delete whatever was already created. For a partial delete failure I’m not sure how you could implement something similar, so you may have to accept that the state believes everything still exists, and make sure the delete doesn’t fail if you try to delete an already deleted item.
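To make the rollback idea a bit more concrete, here is a minimal sketch in plain Go. It deliberately does not use the Terraform SDK: the `step`, `createAll`, `deleteIgnoringNotFound`, `fakeAPI` and `ErrNotFound` names are all hypothetical, and the in-memory `fakeAPI` only stands in for a real API client. The idea is to remember each nested thing that was created, undo them in reverse order on the first failure, and treat "not found" during deletion as success so that deleting an already deleted item doesn't fail.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error for "object does not exist".
var ErrNotFound = errors.New("not found")

// step pairs one create call with the call that undoes it.
type step struct {
	name   string
	create func() error
	undo   func() error
}

// deleteIgnoringNotFound treats "already deleted" as success.
func deleteIgnoringNotFound(del func() error) error {
	if err := del(); err != nil && !errors.Is(err, ErrNotFound) {
		return err
	}
	return nil
}

// createAll runs the steps in order. On the first failure it undoes the
// completed steps in reverse order and returns the original error,
// annotated with any rollback failures.
func createAll(steps []step) error {
	var done []step
	for _, s := range steps {
		err := s.create()
		if err == nil {
			done = append(done, s)
			continue
		}
		err = fmt.Errorf("create %s: %w", s.name, err)
		for i := len(done) - 1; i >= 0; i-- {
			d := done[i]
			if uerr := deleteIgnoringNotFound(d.undo); uerr != nil {
				err = fmt.Errorf("%w; rollback of %s failed: %v", err, d.name, uerr)
			}
		}
		return err
	}
	return nil
}

// fakeAPI is an in-memory stand-in for a remote API (assumption, not a real client).
type fakeAPI struct {
	objects map[string]bool
	failOn  string // name whose creation should fail
}

func (a *fakeAPI) create(name string) error {
	if name == a.failOn {
		return errors.New("simulated API failure")
	}
	a.objects[name] = true
	return nil
}

func (a *fakeAPI) remove(name string) error {
	if !a.objects[name] {
		return ErrNotFound
	}
	delete(a.objects, name)
	return nil
}

// stepsFor builds one create/undo pair per named object.
func stepsFor(api *fakeAPI, names ...string) []step {
	var steps []step
	for _, n := range names {
		n := n // capture a per-iteration copy for the closures
		steps = append(steps, step{
			name:   n,
			create: func() error { return api.create(n) },
			undo:   func() error { return api.remove(n) },
		})
	}
	return steps
}

func main() {
	api := &fakeAPI{objects: map[string]bool{}, failOn: "snippet"}
	err := createAll(stepsFor(api, "service", "domain", "backend", "snippet"))
	fmt.Println("error:", err)
	fmt.Println("objects left behind:", len(api.objects))
}
```

In a real provider the rollback calls can fail too, so it is probably worth surfacing those failures in the returned error (as the sketch does) rather than swallowing them.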

terraform-plugin-framework seems to open up some interesting new options here…

You can return an error diagnostic from your create function, along with valid state, and Terraform will mark the resource as tainted (needing destroy and recreate) but will still save the state you returned, giving you another shot on a future apply to clean up.

You can return an error diagnostic from your delete function, along with valid state, so you can record that you made partial progress on deletion.
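As a rough illustration of the "error plus valid state" pattern, here is a small sketch in plain Go. It does not use the terraform-plugin-framework types: `serviceState`, `fakeAPI` and `createService` are hypothetical stand-ins. The point is only that the create logic hands back whatever it actually managed to create together with the error, so the caller can save that state and report the error, instead of discarding one in favour of the other. In the framework itself this would presumably correspond to calling `resp.State.Set` and `resp.Diagnostics.AddError` in the same Create call.

```go
package main

import (
	"errors"
	"fmt"
)

// errAPI is a hypothetical error standing in for a failed API call.
var errAPI = errors.New("simulated API failure")

// serviceState is a hypothetical model of what would be written to state.
type serviceState struct {
	ServiceID string
	Domains   []string
	Backends  []string
	Active    bool
}

// fakeAPI is a stand-in for a real client; it can be told to fail backend creation.
type fakeAPI struct {
	failBackend bool
}

// createService performs the multi-step create. On failure it returns the
// state describing what was really created so far, alongside the error.
func createService(api *fakeAPI, domains, backends []string) (serviceState, error) {
	state := serviceState{ServiceID: "svc-1"} // step 1: the service itself

	// step 2: domains (always succeed in this sketch)
	state.Domains = append(state.Domains, domains...)

	// step 3: backends (may fail part-way through)
	for _, b := range backends {
		if api.failBackend {
			return state, fmt.Errorf("create backend %q: %w", b, errAPI)
		}
		state.Backends = append(state.Backends, b)
	}

	// step 4: activation only happens once every nested thing exists
	state.Active = true
	return state, nil
}

func main() {
	state, err := createService(&fakeAPI{failBackend: true},
		[]string{"example.com"}, []string{"origin"})

	// Save the partial state first, then surface the error.
	fmt.Printf("state to save: %+v\n", state)
	fmt.Println("error to report:", err)
}
```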

In case it’s useful for anyone else who stumbles across this discussion, I also asked on Stack Overflow and received an answer there: “How should Terraform provider handle resource error when it consists of multiple entities?”