Can a TF provider just manage its own state instead of using a meta value?

Does the value returned from a ConfigureContextFunc serve any necessary function for the SDK/Terraform core? My understanding is that it’s purely a provider-defined value. If that’s the case, then is there any reason my provider couldn’t manage its own state and ignore the meta value altogether?

For example:

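package main

import (
	"context"

	"github.com/hashicorp/terraform-plugin-sdk/v2/diag"
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
	"github.com/hashicorp/terraform-plugin-sdk/v2/plugin"
)

// MyClient and Request are the provider's own API client types (not shown).
type MyProvider struct {
	client MyClient
}

// Provider has a pointer receiver so that every method value below binds to
// the same MyProvider instance rather than to a copy of it.
func (p *MyProvider) Provider() *schema.Provider {
	return &schema.Provider{
		Schema: map[string]*schema.Schema{
			"id_token": {
				Type:     schema.TypeString,
				Required: true,
			},
		},
		ConfigureContextFunc: p.configure, // a method on this instance
		ResourcesMap: map[string]*schema.Resource{
			"myprov_thing": {
				CreateContext: p.create, // operations are methods on this instance
				ReadContext:   p.read,   // read, delete, and update are analogous to create (not shown)
				DeleteContext: p.delete,
				UpdateContext: p.update,
				Schema: map[string]*schema.Schema{
					"name": {
						Type:     schema.TypeString,
						Required: true,
					},
				},
			},
		},
	}
}

func (p *MyProvider) configure(_ context.Context, d *schema.ResourceData) (interface{}, diag.Diagnostics) {
	// Set field of the instance
	p.client = MyClient{
		Token: d.Get("id_token").(string),
	}

	// return nothing to the SDK
	return nil, nil
}

// create also needs a pointer receiver: with a value receiver, p.create would
// capture a copy of the struct taken before configure ran.
func (p *MyProvider) create(ctx context.Context, d *schema.ResourceData, _ interface{}) diag.Diagnostics {
	res := p.client.CreateThing(ctx, Request{Name: d.Get("name").(string)})
	d.SetId(res.ID)
	return nil
}

func main() {
	prov := &MyProvider{}
	plugin.Serve(&plugin.ServeOpts{ProviderFunc: prov.Provider})
}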

Is the config function ever called more than once in the same process, like maybe in the case of aliased providers with different configs?

Is HashiCorp not active in this forum?

Anyone here? Bueller?

As far as I can tell, Terraform is creating separate processes for each aliased provider. I can’t think of any reason why this shouldn’t work as long as this holds true. Is that something I can rely on?

Hi there! Sorry, we were swamped with release work and then some needed PTO to recover from shipping.

I do believe that, at the moment, configure is not called more than once per process. I don’t believe anyone has decided it can’t be, though, so relying on it is probably risky.

Is there a reason you don’t want to just return prov from the Configure function and let the SDK inject it and manage it for you?

Thanks for getting back to me!

Yes. My provider receives contributions from many different teams and the customary singleton architecture has become a maintenance nightmare for us. We were stepping all over each other’s toes, local unit testing had become unmanageably difficult and brittle, and we ran into all of the usual reasons not to use global/shared state.

We want to present a unified provider for the sake of customer experience, but we need a way to decouple the teams. We have managed to separate them except for a common meta value that more or less serves as a dependency injection mechanism.

This mechanism works well enough at run time, but the separate resources are still far too tightly coupled because of this complicated workaround. It would be a LOT simpler for us if each group of resources could manage their own state in a struct instance. The CRUD entrypoints could be methods on a struct and thereby carry their own state. That would let us use conventional dependency injection, making testing (and everything else) easier.

The only problem is that meta value serving as a stand-in for instance state. As long as there’s only one meta value per process and one process per alias, we’re golden.

You may find terraform-plugin-mux interesting.

This is something there is ongoing investigation around and experimentation with. Go’s story for dependency injection makes it a bit tricky to pull off well, however.

I think language is betraying us here. The meta value returned by Configure is the configured state of the provider. It is not the state of the resource. I think that’s what you’re saying here, too, I just want to be sure because “instances” are specific instantiations of resources in Terraform, too, so I want to be sure we’re talking about the same thing here.

Fundamentally, the point of that meta value is to enable access to the values of the provider configuration block from within resources, or the artifacts the Configure function created from those values.
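To make that flow concrete, here is a simplified, SDK-free model of how a configured value travels from Configure to a resource function. `miniProvider`, `apiClient`, and `createThing` are hypothetical stand-ins; in the real SDK the corresponding pieces are `schema.Provider`, its `ConfigureContextFunc`, and the `meta interface{}` argument passed to each CRUD function.

```go
package main

import (
	"errors"
	"fmt"
)

// apiClient is a hypothetical artifact built from the provider configuration.
type apiClient struct {
	Token string
}

// miniProvider models only the parts of a provider that matter here: a
// configure callback, a create callback, and the stored meta value.
type miniProvider struct {
	configure func(config map[string]string) (interface{}, error)
	create    func(name string, meta interface{}) (string, error)
	meta      interface{} // kept by the framework after configure runs
}

// Configure runs the provider-defined callback and remembers its result.
func (p *miniProvider) Configure(config map[string]string) error {
	meta, err := p.configure(config)
	if err != nil {
		return err
	}
	p.meta = meta
	return nil
}

// Create hands the remembered meta value to the resource function.
func (p *miniProvider) Create(name string) (string, error) {
	return p.create(name, p.meta)
}

// createThing recovers the concrete client from the opaque meta value.
func createThing(name string, meta interface{}) (string, error) {
	client, ok := meta.(*apiClient)
	if !ok {
		return "", errors.New("provider has not been configured")
	}
	return name + "@" + client.Token, nil
}

func newProvider() *miniProvider {
	return &miniProvider{
		configure: func(config map[string]string) (interface{}, error) {
			return &apiClient{Token: config["id_token"]}, nil
		},
		create: createThing,
	}
}

func main() {
	p := newProvider()
	if err := p.Configure(map[string]string{"id_token": "abc"}); err != nil {
		panic(err)
	}
	id, err := p.Create("thing")
	fmt.Println(id, err)
}
```

Because the meta value lives on each provider value, two providers built by `newProvider` and configured differently never see each other's client.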

I guess I’m just confused about how this solves your pain point, because it just moves the global mutable state to a different place? You’ll still have your global MyProvider. (I mean, technically, the result from Configure isn’t global mutable state, in that it’s not global…) I guess I’m just struggling to understand how this solution fits into the problem you described.

All this to say: what you’ve described will likely work today, there’s no guarantee it’ll work tomorrow, and I’m interested in hearing more about your pain points to keep in mind for future work.

Yes. When I say “instance” and “state”, I mean an instance of a Go struct which implements a provider and whose members make up its state (not a resource instance/state).

Yes, it’s not global in the strictest sense. I make the comparison because it is a singleton instance that is so widely shared throughout the provider that it poses similar problems for us as global state would.

I did find terraform-plugin-mux interesting because it’s very similar to the solution we landed on. We had already invested in making our own version of it by the time it was released some weeks back. I estimate it would be difficult to convince management to abandon ours now that we have so much built on top of it.

I had guessed that the intent of this meta value was so that TF doesn’t necessarily need to spawn many processes for aliases of the same provider. Once I noticed that aliased providers always got new processes, passing around a meta value like that seemed obtuse given that methods on an object could just as easily serve as resource CRUD functions.

Thanks for the clear answer on that. If you’re saying it’s not a reliable assumption that configure is called only once per process, then I guess we’ll just have to process the meta value. There are still other ways we can decouple teams, but I was hoping we could get away with simply ignoring meta.

I guess I just nitpick on this one because I don’t see how your solution solves it, because your proposed solution is, in fact, global state. So I’m trying to understand what difference it brings to you in a practical sense, because I’d like to understand the shortcomings and take them into account for future design work.

Oh neat! I will warn you that terraform-plugin-go (which terraform-plugin-mux is built on) is considered the “interoperable base layer” for these kinds of tools to be built on, and so while it’s totally amazing that you’ve built your own version, and I don’t want to discourage that at all, I would make sure it’s updated to use terraform-plugin-go’s exported types for the RPC calls, as that will let you interoperate with our future efforts. (I’m not sure how you were exposing RPC calls before, as the types the SDK relied upon were unimportable, but that’s neither here nor there.)

To be clear, ignoring meta is totally supported, and something we do in official providers (random doesn’t even bother supporting the Configure function). It’s just the hoisting of the provider configuration values to global mutable state in the main function you may have a bad time with. For example, I’m pretty sure the helper/resource test framework as it exists will give you some grief there, as it instantiates several copies of the provider in the same process, and they will all have their Configure function called at various times.
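A small SDK-free sketch of that failure mode, assuming a harness that builds two provider copies from the same factory in one process and configures them at different times. `sharedProvider`, `instance`, and `newInstance` are hypothetical names; `newInstance` plays the role that a `ProviderFunc` bound to a single struct would play.

```go
package main

import "fmt"

type client struct {
	Token string
}

// sharedProvider mirrors the struct-held-state design: one Go value whose
// methods are handed to every provider copy built from it.
type sharedProvider struct {
	client client
}

func (p *sharedProvider) configure(token string) {
	p.client = client{Token: token}
}

func (p *sharedProvider) create(name string) string {
	return name + "@" + p.client.Token
}

// instance models one provider copy as the framework sees it: just the
// callbacks it was given.
type instance struct {
	configure func(token string)
	create    func(name string) string
}

// newInstance returns a fresh copy, but its callbacks are bound to the
// same underlying sharedProvider every time.
func (p *sharedProvider) newInstance() instance {
	return instance{configure: p.configure, create: p.create}
}

func main() {
	prov := &sharedProvider{}

	a := prov.newInstance()
	b := prov.newInstance()

	a.configure("token-a")
	b.configure("token-b") // overwrites the client that a relies on

	fmt.Println(a.create("thing")) // prints "thing@token-b", not token-a
	fmt.Println(b.create("thing")) // prints "thing@token-b"
}
```

Both copies end up using the last configuration applied, which is exactly what a per-copy meta value avoids.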

You may be able to make it work. I think whether it’s a good idea depends on how closely you want to track the SDK developments in the future, as we may accidentally break it as it has different requirements than most providers; how important it is to you to be able to use the latest and greatest versions of the SDK, as if we break something for this particular case and only this particular case, we will obviously care about fixing it, but need to weigh that against work that would impact more of our providers; and how hard it will be to remove once implemented, as you may find yourself in a corner where you need to pivot to something else.

I can’t make any of those calls for you, unfortunately. :) All I can tell you is that hoisting our presumed-local provider configuration information up to a global value is outside the expectations we have for compatibility considerations, and so you’re kind of taking your compatibility into your own hands at that point.

Injecting dependencies for the purpose of local unit testing, such as mock API clients, has been troublesome for us. We haven’t found a great way to deal with that yet.

We didn’t. Our approach was to have each component contribute resource schemas for their resources and then pass an aggregate schema to the SDK. Everything was in-process.


That’s really helpful feedback, thanks!