Provider plugins that live for the duration of a Terraform run

Continued from https://github.com/stefansundin/terraform-provider-ssh/issues/11

In Terraform 0.12, it appears that provider plugins are shut down as soon as their resources and data sources in the apply graph are complete.

terraform-provider-ssh relied on the previous behavior, launching a goroutine to maintain each SSH connection it created for local port forwarding.
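For context, here is a rough sketch of that goroutine-per-tunnel pattern (not the provider's actual code), using golang.org/x/crypto/ssh with placeholder addresses:

```go
package sshtunnel

import (
	"io"
	"net"

	"golang.org/x/crypto/ssh"
)

// startTunnel opens an SSH connection and serves a local port forward from
// localAddr to remoteAddr (as seen from the SSH server) until the plugin
// process exits.
func startTunnel(sshAddr, localAddr, remoteAddr string, config *ssh.ClientConfig) (string, error) {
	client, err := ssh.Dial("tcp", sshAddr, config)
	if err != nil {
		return "", err
	}

	listener, err := net.Listen("tcp", localAddr)
	if err != nil {
		client.Close()
		return "", err
	}

	// This goroutine only lives as long as the plugin process itself; in
	// Terraform 0.12 that process is torn down once the data source is read.
	go func() {
		for {
			local, err := listener.Accept()
			if err != nil {
				return
			}
			remote, err := client.Dial("tcp", remoteAddr)
			if err != nil {
				local.Close()
				continue
			}
			go func() {
				defer local.Close()
				defer remote.Close()
				go io.Copy(remote, local)
				io.Copy(local, remote)
			}()
		}
	}()

	return listener.Addr().String(), nil
}
```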

Is there a viable way to do this in Terraform 0.12? @stefansundin was thinking of forking a child process to maintain the connections and using inter-process communication between the provider plugin and that process. At first glance that seems like it might be viable to me, but it adds a lot of programming overhead compared to using goroutines and channels.
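A minimal, Unix-only sketch of that child-process idea might look like the following; the helper binary and its -control-socket flag are hypothetical, and the choice of IPC mechanism is left open:

```go
package sshtunnel

import (
	"os/exec"
	"syscall"
)

// startTunnelHelper launches a separate, long-lived helper binary that owns
// the SSH connections, so the tunnels survive the plugin process being killed.
func startTunnelHelper(helperPath, controlSocket string) (int, error) {
	cmd := exec.Command(helperPath, "-control-socket", controlSocket)
	// Unix-only: start the helper in its own session so it is not torn down
	// along with the provider plugin process.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	pid := cmd.Process.Pid
	// Release the process handle, since the provider will never Wait on the
	// helper; the helper keeps running after this process exits.
	if err := cmd.Process.Release(); err != nil {
		return 0, err
	}
	return pid, nil
}
```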

The tricky bit here is that other resources will depend on data_source_ssh_tunnel: they wait on data_source_ssh_tunnel being created as the signal that the tunnel is ready. Even if it were possible to wait for the completion of dependent resources (as far as I know it's not), there's still no easy way to keep a goroutine running until they complete.

Just looking for guidance on whether there might be a viable path here for a provider, or whether I should redirect my effort towards landing this in core.


Unfortunately there is no way to achieve this in Terraform 0.12. It worked by coincidence in Terraform 0.11 because the old provider protocol (based on net/rpc and yamux) supported multiple instances of the provider object living in the same process. The new protocol (based on gRPC) supports only one object per process, so Terraform launches multiple plugin processes when multiple provider instances are needed and destroys each one as soon as it is no longer needed.

Since what this provider is doing is not within the scope of what the provider protocol is intended for, I think the complexity of dealing with this non-standard pattern must unfortunately fall on the provider itself rather than on Terraform Core. With that said, I expect it’ll be tricky to implement this as a separate shared process, because there is no direct signal to a provider that it is “done” and so I’m not sure what you could use to know when that other process ought to shut down.
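To illustrate the shutdown question, one hypothetical heuristic (not something anyone in this thread has endorsed) would be for the helper process to exit on its own after going some interval without a “still in use” ping from any provider instance, at the cost of having to guess a reasonable idle window:

```go
package tunnelhelper

import (
	"sync"
	"time"
)

// tunnelHelper is a hypothetical long-lived helper that owns the SSH tunnels.
// It shuts itself down once no provider instance has pinged it for a while,
// since Terraform gives the provider no explicit "done" signal to relay.
type tunnelHelper struct {
	mu       sync.Mutex
	lastPing time.Time
}

func newTunnelHelper() *tunnelHelper {
	return &tunnelHelper{lastPing: time.Now()}
}

// Ping would be called (over whatever IPC mechanism is chosen) every time a
// provider instance touches a tunnel.
func (h *tunnelHelper) Ping() {
	h.mu.Lock()
	h.lastPing = time.Now()
	h.mu.Unlock()
}

// watchIdle polls the last ping time and calls shutdown once the helper has
// been idle for longer than the given window. The window is a guess: too
// short and tunnels drop mid-apply, too long and the helper lingers.
func (h *tunnelHelper) watchIdle(idle time.Duration, shutdown func()) {
	for {
		time.Sleep(idle / 4)
		h.mu.Lock()
		expired := time.Since(h.lastPing) > idle
		h.mu.Unlock()
		if expired {
			shutdown()
			return
		}
	}
}
```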

I think if Terraform Core were to change to support this use-case, we’d be better off doing something like what’s discussed in GitHub issue #8367, which covers either implementing SSH tunnels directly as part of Terraform itself or adding some new concept to the provider protocol for modelling helper objects that should exist only for the duration of a single walk and be explicitly destroyed once the walk completes. (That is, Terraform would still re-create it separately for refresh, then plan, then apply, just because each of those steps is isolated from the next to reduce the risk of Terraform behaving differently when they all happen in one command vs. when they are run as separate commands.)

The Terraform team unfortunately hasn’t had time to do further design and prototyping for that feature due to priorities being elsewhere, so it’s unlikely that either of these could be implemented in the very near future.

Thanks for the detailed reply, Martin! It sounds like the complexity of this feature and its broader implications for the Terraform plugin lifecycle mean it’s a bit much for me to try to take on in a PR. If you think there’s an opportunity to contribute I can take a look, but otherwise I will report back and update the provider project with links to this discussion.