Framework v1.1.1 - Strategies for diagnosing resource state churn

@hQnVyLRx you can eliminate the Read call entirely from troubleshooting by using terraform plan -refresh=false. Building the provider with the latest changes on terraform-plugin-framework yields some key log entries:

2023-02-06T09:04:12.081-0500 [DEBUG] provider.terraform-provider-issue644: Detected value change between proposed new state and prior state: @caller=/Users/bflad/src/github.com/hashicorp/terraform-plugin-framework/internal/fwserver/server_planresourcechange.go:191 tf_resource_type=issue644_rack_type tf_rpc=PlanResourceChange @module=sdk.framework tf_attribute_path=leaf_switches tf_provider_addr=example.com/chrismarget/issue644 tf_req_id=38190b69-4c1a-a9bb-7420-8a503757b6ac timestamp=2023-02-06T09:04:12.081-0500
2023-02-06T09:04:12.081-0500 [DEBUG] provider.terraform-provider-issue644: Marking Computed attributes with null configuration values as unknown (known after apply) in the plan to prevent potential Terraform errors: tf_attribute_path=id tf_provider_addr=example.com/chrismarget/issue644 tf_req_id=38190b69-4c1a-a9bb-7420-8a503757b6ac @module=sdk.framework tf_rpc=PlanResourceChange @caller=/Users/bflad/src/github.com/hashicorp/terraform-plugin-framework/internal/fwserver/server_planresourcechange.go:200 tf_resource_type=issue644_rack_type timestamp=2023-02-06T09:04:12.081-0500
2023-02-06T09:04:12.081-0500 [DEBUG] provider.terraform-provider-issue644: marking computed attribute that is null in the config as unknown: @module=sdk.framework tf_attribute_path=AttributeName("id") tf_provider_addr=example.com/chrismarget/issue644 tf_req_id=38190b69-4c1a-a9bb-7420-8a503757b6ac @caller=/Users/bflad/src/github.com/hashicorp/terraform-plugin-framework/internal/fwserver/server_planresourcechange.go:364 tf_resource_type=issue644_rack_type tf_rpc=PlanResourceChange timestamp=2023-02-06T09:04:12.081-0500
2023-02-06T09:04:12.081-0500 [DEBUG] provider.terraform-provider-issue644: marking computed attribute that is null in the config as unknown: @module=sdk.framework tf_req_id=38190b69-4c1a-a9bb-7420-8a503757b6ac tf_attribute_path="AttributeName("leaf_switches").ElementKeyValue(tftypes.Object["logical_device":tftypes.Object["name":tftypes.String, "panels":tftypes.List[tftypes.Object["columns":tftypes.Number, "port_groups":tftypes.List[tftypes.Object["port_count":tftypes.Number, "port_roles":tftypes.Set[tftypes.String], "port_speed":tftypes.String]], "rows":tftypes.Number]]], "name":tftypes.String, "redundancy_protocol":tftypes.String, "spine_link_count":tftypes.Number, "spine_link_speed":tftypes.String]<"logical_device":tftypes.Object["name":tftypes.String, "panels":tftypes.List[tftypes.Object["columns":tftypes.Number, "port_groups":tftypes.List[tftypes.Object["port_count":tftypes.Number, "port_roles":tftypes.Set[tftypes.String], "port_speed":tftypes.String]], "rows":tftypes.Number]]]<null>, "name":tftypes.String<"leaf switch label">, "redundancy_protocol":tftypes.String<null>, "spine_link_count":tftypes.Number<"1">, "spine_link_speed":tftypes.String<"10G">>).AttributeName("logical_device")" tf_provider_addr=example.com/chrismarget/issue644 tf_resource_type=issue644_rack_type tf_rpc=PlanResourceChange @caller=/Users/bflad/src/github.com/hashicorp/terraform-plugin-framework/internal/fwserver/server_planresourcechange.go:364 timestamp=2023-02-06T09:04:12.081-0500

Inspecting the protocol-level data to work out why the framework logged "Detected value change between proposed new state and prior state" is a little trickier when you cannot run this in an acceptance test. That data comes directly from Terraform. The TF_LOG_SDK_PROTO_DATA_DIR environment variable can be used to dump MessagePack-encoded files, e.g. TF_LOG_SDK_PROTO_DATA_DIR=/tmp terraform plan -refresh=false, after which you can compare TIMESTAMP_PlanResourceChange_Request_PriorState.msgpack against TIMESTAMP_PlanResourceChange_Request_ProposedNewState.msgpack.

Another option is starting the provider in debug mode, after adding code like that shown on the "Debugging Framework Providers" page of the Plugin Development documentation on HashiCorp Developer.

Starting the provider that way, via editor configuration or manually via delve, should print a generated environment variable that tells Terraform to use the already running, debugger-attached provider:

Provider started. To attach Terraform CLI, set the TF_REATTACH_PROVIDERS environment variable with the following:

        TF_REATTACH_PROVIDERS='{"example.com/chrismarget/issue644":{"Protocol":"grpc","ProtocolVersion":6,"Pid":75532,"Test":true,"Addr":{"Network":"unix","String":"/var/folders/f3/2mhr8hkx72z9dllv0ry81zm40000gq/T/plugin1988883645"}}}'
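If Terraform does not seem to pick up the running provider, one thing worth ruling out is a value that was mangled by shell quoting when it was copied. The following is a minimal sketch, not part of the framework or Terraform, that decodes the value and prints what it points at. The `reattachConfig` type and `parseReattach` helper are hypothetical names; the field layout is simply read off the output above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// sample is the value printed by the provider in the output above.
const sample = `{"example.com/chrismarget/issue644":{"Protocol":"grpc","ProtocolVersion":6,"Pid":75532,"Test":true,"Addr":{"Network":"unix","String":"/var/folders/f3/2mhr8hkx72z9dllv0ry81zm40000gq/T/plugin1988883645"}}}`

// reattachConfig mirrors the JSON shape seen in the provider output.
// It is an illustrative type, not one exported by Terraform.
type reattachConfig struct {
	Protocol        string
	ProtocolVersion int
	Pid             int
	Test            bool
	Addr            struct {
		Network string
		String  string
	}
}

// parseReattach decodes a TF_REATTACH_PROVIDERS value into a map keyed by
// provider address.
func parseReattach(raw string) (map[string]reattachConfig, error) {
	var out map[string]reattachConfig
	if err := json.Unmarshal([]byte(raw), &out); err != nil {
		return nil, fmt.Errorf("invalid TF_REATTACH_PROVIDERS: %w", err)
	}
	return out, nil
}

func main() {
	// Use the exported value if present, otherwise fall back to the sample.
	raw := os.Getenv("TF_REATTACH_PROVIDERS")
	if raw == "" {
		raw = sample
	}
	cfgs, err := parseReattach(raw)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for addr, c := range cfgs {
		fmt.Printf("%s: pid=%d protocol=%s v%d socket=%s\n",
			addr, c.Pid, c.Protocol, c.ProtocolVersion, c.Addr.String)
	}
}
```

If the value fails to decode, the single quotes around it were most likely lost when exporting it.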

Set that generated environment variable before executing Terraform. Using the debugger-guided output (since I was having trouble using fq on the MessagePack files directly), it appears that Terraform version 1.3.7 is sending a known value in the prior state for logical_device while it's sending a null value in the proposed new state (plan) data.

This type of value difference will trigger the framework to mark any Computed and unconfigured attributes as unknown in the plan. Terraform core will use its typical set-based difference rules (e.g. the whole value of a set element serves as its “index”) when rendering a plan for that type. Running against a build from Terraform's main branch seems to yield the same result.

I think in terms of next steps here, there are two options:

  • One, potentially filing a Terraform core issue to see if it's possible to associate the underlying Computed data correctly in the proposed new state from the available data in the prior state (via the hashicorp/terraform issue tracker on GitHub).
  • Two, trying to disassociate Computed attributes from the configurable set attribute. In general, sets which contain both configurable and computed data are problematic, as by definition a set is “indexed” on its whole value. When there is partially configured data, the set “index” is different from its complete value. This can confuse logic on both sides of the protocol.
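
To make the set “index” point in the second option concrete, here is a small sketch in plain Go. It is only an illustration of whole-value identity, not how Terraform core or the framework actually implement sets. The `leafSwitch` type, its `key` method, the `unmatched` helper, and the "device-a" value are all hypothetical; the other field values are taken from the log output earlier in this reply.

```go
package main

import "fmt"

// leafSwitch is a simplified, hypothetical stand-in for one leaf_switches
// set element. A nil LogicalDevice models a null (unconfigured) value.
type leafSwitch struct {
	Name           string
	SpineLinkCount int
	SpineLinkSpeed string
	LogicalDevice  *string
}

// key renders the whole element value. With whole-value identity, this is
// the only thing that distinguishes one set element from another.
func (l leafSwitch) key() string {
	ld := "<null>"
	if l.LogicalDevice != nil {
		ld = *l.LogicalDevice
	}
	return fmt.Sprintf("%s|%d|%s|%s", l.Name, l.SpineLinkCount, l.SpineLinkSpeed, ld)
}

// unmatched returns the proposed elements with no whole-value match in prior.
func unmatched(prior, proposed []leafSwitch) []leafSwitch {
	seen := make(map[string]bool, len(prior))
	for _, p := range prior {
		seen[p.key()] = true
	}
	var out []leafSwitch
	for _, p := range proposed {
		if !seen[p.key()] {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	known := "device-a" // hypothetical value previously computed by the provider

	// Prior state: the Computed nested value is known.
	prior := []leafSwitch{{Name: "leaf switch label", SpineLinkCount: 1, SpineLinkSpeed: "10G", LogicalDevice: &known}}
	// Proposed new state: same configuration, but the Computed value is null.
	proposed := []leafSwitch{{Name: "leaf switch label", SpineLinkCount: 1, SpineLinkSpeed: "10G"}}

	for _, e := range unmatched(prior, proposed) {
		fmt.Printf("no prior element matches %q\n", e.key())
	}
}
```

Even though the configured fields are identical, the proposed element has no counterpart in the prior state, so it looks like a different element. Keeping Computed data outside the set removes that mismatch.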