Terraform 0.13 - handling of data source - Data resource reads can no longer be disabled by -refresh=false

Hi,

I’m looking for some advice on how to deal with a data source I’m using within a module. As I understand it, Terraform 0.13 evaluates data sources during the plan phase (upgrade guide). However, if a data source is used within a module, it might return nothing during the plan phase and therefore fail, even though the data source filter depends on a module input (variable).

I have a code base which can be applied successfully using Terraform 0.12.29. However, it fails with Terraform 0.13.5.
The module takes an AWS TGW_ID as input and should look up properties of the TGW. However, the TGW isn’t deployed yet (plan phase). I don’t want to end up running apply multiple times.

So I’m wondering:
a) Is my understanding correct? Should modules avoid data sources altogether?
b) Are there recommended approaches to get out of that “deadlock”?
c) Should I just provide the output of the data source as a required input to the module, so that the data source is run outside of the module?
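
For illustration, option (c) could look roughly like the sketch below. It is only a sketch under assumptions: the module path, the description, and the transit_gateway_route_table_id input are hypothetical names, the module's other inputs are omitted, and it assumes the route table the module needs is the gateway's default association route table, which the aws_ec2_transit_gateway resource exports as association_default_route_table_id.

```hcl
# Root module: the transit gateway is managed here, so its attributes are
# unknown during plan but still tracked in the dependency graph.
resource "aws_ec2_transit_gateway" "tgw" {
  description = "example transit gateway" # hypothetical value
}

module "mymodule" {
  source = "./modules/mymodule" # hypothetical path

  transit_gateway_id = aws_ec2_transit_gateway.tgw.id

  # Hypothetical extra input: hand the route table ID to the module directly
  # instead of having the module look it up with a data source.
  transit_gateway_route_table_id = aws_ec2_transit_gateway.tgw.association_default_route_table_id

  # (other module inputs omitted)
}

# Inside the module (for example in its variables.tf): declare the new input.
variable "transit_gateway_route_table_id" {
  type        = string
  description = "ID of the transit gateway route table to add routes to"
}
```

Because the value comes from a managed resource attribute, it should show up as (known after apply) in the plan rather than depending on a lookup inside the module.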
LOG Output:

2020-12-03T20:26:22.256+0100 [DEBUG] plugin.terraform-provider-aws_v3.19.0_x5: </DescribeAccountAttributesResponse>
2020/12/03 20:26:22 [DEBUG] Resource instance state not found for node "aws_ec2_transit_gateway.tgw", instance aws_ec2_transit_gateway.tgw
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "local.name"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "local.id"
2020/12/03 20:26:22 [DEBUG] ReferenceTransformer: "aws_ec2_transit_gateway.tgw" references: []
2020/12/03 20:26:22 [DEBUG] Resource instance state not found for node "module.mymodule.aws_customer_gateway.cgw", instance module.mymodule.aws_customer_gateway.cgw
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.ip_address"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "local.merged_tags"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.vpn_type"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.bgp_asn"
2020/12/03 20:26:22 [DEBUG] ReferenceTransformer: "module.mymodule.aws_customer_gateway.cgw" references: []
2020/12/03 20:26:22 [WARN] Provider "registry.terraform.io/hashicorp/aws" produced an invalid plan for module.mymodule.aws_customer_gateway.cgw, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .tags: planned value cty.UnknownVal(cty.Map(cty.String)) does not match config value cty.MapVal(map[string]cty.Value{"Name":cty.UnknownVal(cty.String)})
2020/12/03 20:26:22 [WARN] Provider "registry.terraform.io/hashicorp/aws" produced an invalid plan for aws_ec2_transit_gateway.tgw, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .tags: planned value cty.UnknownVal(cty.Map(cty.String)) does not match config value cty.MapVal(map[string]cty.Value{"Name":cty.UnknownVal(cty.String)})
      - .vpn_ecmp_support: planned value cty.StringVal("enable") does not match config value cty.NullVal(cty.String)
      - .dns_support: planned value cty.StringVal("enable") does not match config value cty.NullVal(cty.String)
      - .amazon_side_asn: planned value cty.NumberIntVal(64512) does not match config value cty.NullVal(cty.Number)
      - .auto_accept_shared_attachments: planned value cty.StringVal("disable") does not match config value cty.NullVal(cty.String)
      - .default_route_table_association: planned value cty.StringVal("enable") does not match config value cty.NullVal(cty.String)
      - .default_route_table_propagation: planned value cty.StringVal("enable") does not match config value cty.NullVal(cty.String)
2020/12/03 20:26:22 [DEBUG] Resource instance state not found for node "module.mymodule.aws_vpn_connection.vpn", instance module.mymodule.aws_vpn_connection.vpn
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.static_routes_only"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.vpn_type"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "local.merged_tags"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.transit_gateway_id"
2020/12/03 20:26:22 [DEBUG] ReferenceTransformer: "module.mymodule.aws_vpn_connection.vpn" references: []
2020/12/03 20:26:22 [DEBUG] Resource instance state not found for node "module.mymodule.data.aws_ec2_transit_gateway_route_table.tgwrtb", instance module.mymodule.data.aws_ec2_transit_gateway_route_table.tgwrtb
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.transit_gateway_id"
2020/12/03 20:26:22 [DEBUG] ReferenceTransformer: "module.mymodule.data.aws_ec2_transit_gateway_route_table.tgwrtb" references: []
2020/12/03 20:26:22 [WARN] Provider "registry.terraform.io/hashicorp/aws" produced an invalid plan for module.mymodule.aws_vpn_connection.vpn, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .tags: planned value cty.UnknownVal(cty.Map(cty.String)) does not match config value cty.MapVal(map[string]cty.Value{"Name":cty.UnknownVal(cty.String)})
2020/12/03 20:26:22 [DEBUG] Resource instance state not found for node "module.mymodule.data.external.tgwlookup", instance module.mymodule.data.external.tgwlookup
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "var.transit_gateway_id"
2020/12/03 20:26:22 [WARN] ReferenceTransformer: reference not found: "path.module"

Hi @tbugfinder,

The primary motivation for reading data resources during the plan phase rather than in a separate refresh phase was actually to address the problem you’ve described here, by allowing data resources to exist in the same dependency graph as the plan actions and thus let Terraform better handle situations where a data resource depends on a managed resource.

However, we did get there gradually and I believe there were some rough edges for more complex cases in the intermediate steps in Terraform 0.13, before we were able to remove the separate refresh phase altogether.

If you can share a little more detail about what you’ve tried and what behavior you saw then I might be able to suggest an approach that will work correctly for you on Terraform 0.13.

Terraform 0.14, released yesterday, completes this gradual work: there is no longer a separate refresh step at all. Upgrading to 0.14 might therefore make it work better, but it might require some extra upgrade steps to deal with the fact that you can’t successfully run terraform apply (and thus can’t fully complete the automatic 0.12 to 0.13 upgrade process). If you’re interested in this path, I’m happy to try to work through that process with you. The short version is that I expect you’ll need to use the terraform state replace-provider command to fix up some legacy provider addresses in your state, which Terraform v0.13 would normally have done automatically after your first successful terraform apply.

Hi @apparentlymart
Indeed, I also tried Terraform 0.14.0, without luck.
Months ago I opened this issue: https://github.com/hashicorp/terraform-provider-aws/issues/11554.

I’ve compiled a test-repo.

$ terraform apply

Error: Required attribute is not set

  on ../../../main.tf line 46, in resource "aws_ec2_transit_gateway_route" "this":
  46:   transit_gateway_route_table_id = data.aws_ec2_transit_gateway_route_table.this.id



Error: Required attribute is not set

  on ../../../main.tf line 46, in resource "aws_ec2_transit_gateway_route" "this":
  46:   transit_gateway_route_table_id = data.aws_ec2_transit_gateway_route_table.this.id



Error: Required attribute is not set

  on ../../../main.tf line 46, in resource "aws_ec2_transit_gateway_route" "this":
  46:   transit_gateway_route_table_id = data.aws_ec2_transit_gateway_route_table.this.id


Terraform 0.12.29 is happy if the value is “routed through” a null data source ( transit_gateway_route_table_id = data.null_data_source.values.outputs....).

Thanks & Best

I tried to overcome that by injecting this code.

 transit_gateway_route_table_id = coalesce(data.aws_ec2_transit_gateway_route_table.this.id,"dummy")

But Terraform 0.12.29, 0.13.5, and 0.14.0 then all reported:

Error: Provider produced inconsistent final plan

When expanding the plan for
module.testvpn.aws_ec2_transit_gateway_route.this[1] to include new values
learned so far during apply, provider "registry.terraform.io/-/aws" produced
an invalid new value for .transit_gateway_route_table_id: was
cty.StringVal("dummy"), but now cty.StringVal("tgw-rtb-xxxxxxxxxxxxxxxxx").

This is a bug in the provider, which should be reported in the provider's own
issue tracker.

So I’m making the use of the data source value dependent on the input (var.transit_gateway_id) which is also used within the data source:

transit_gateway_route_table_id = length(var.transit_gateway_id) > 4 ? data.aws_ec2_transit_gateway_route_table.this.id : "notgwinput"

With that, the data source is evaluated late (computed), and the configuration applies successfully with Terraform 0.12.29 through 0.14.0.

Hi @tbugfinder! Thanks for sharing the example repository.

I’m a bit confused by it though: in the error message output you shared there’s an aws_ec2_transit_gateway_route resource, but I couldn’t find that resource in the configuration you shared. Is there something more that isn’t included in the repository? :thinking:

Hi @apparentlymart!
Thanks for following up. Unfortunately I’ve hidden it in the branch tfdatasource.

Hi @tbugfinder! Thanks for that extra pointer. Now that you’ve drawn attention to it, I see that your original link was to that branch, but the forum’s presentation of the link as a big link box obscured that. I’m sorry for not seeing it. :confounded:

I did try to plan your test configuration on Terraform v0.14 and was able to reproduce the error messages you shared.

In order to bypass the error messages I switched to the hard-coded transit_gateway_route_table_id you had commented out, which then allowed the plan to succeed. Out of curiosity then I tried looking at the data resource id result directly via an output value:

output "route_table_id" {
  value = data.aws_ec2_transit_gateway_route_table.this.id
}

The plan output then showed the incorrect behavior directly:

Changes to Outputs:
  + fixture_output = {
      + route_table_id = null
       # (omitted the rest for clarity)
    }

A data resource we’ve not read yet because it’s delayed until the apply step should show as (known after apply), not as null. We can in fact also see this in the plan for the data resource, where the absence of id implies that it has the value null, because in resources null represents omitting an attribute:

  # module.testvpn.data.aws_ec2_transit_gateway_route_table.this will be read during apply
  # (config refers to values not yet known)
 <= data "aws_ec2_transit_gateway_route_table" "this"  {
      + arn                             = (known after apply)
      + default_association_route_table = (known after apply)
      + default_propagation_route_table = (known after apply)
      + tags                            = (known after apply)
      + transit_gateway_id              = (known after apply)

      + filter {
          + name   = "default-association-route-table"
          + values = [
              + "true",
            ]
        }
      + filter {
          + name   = "transit-gateway-id"
          + values = [
              + (known after apply),
            ]
        }
    }

Interestingly the same isn’t true for the other data resource. It has a correct (known after apply) value for its id:

  # module.testvpn.data.aws_ec2_transit_gateway_vpn_attachment.this will be read during apply
  # (config refers to values not yet known)
 <= data "aws_ec2_transit_gateway_vpn_attachment" "this"  {
      + id                 = (known after apply)
      + tags               = (known after apply)
      + transit_gateway_id = (known after apply)
      + vpn_connection_id  = (known after apply)
    }

This led me to wonder what is different between how these two resource types are defined. Looking at the schemas, I can see the underlying cause:

      "id": {
        "type": "string",
        "description_kind": "plain",
        "optional": true,
        "computed": true
      },
      "id": {
        "type": "string",
        "description_kind": "plain",
        "optional": true
      },

The second of these is the one for aws_ec2_transit_gateway_route_table, and it’s incorrect because the absence of "computed": true means that this is an argument intended to be set in the configuration, rather than to be populated as a result.

I’m not sure what changed in Terraform 0.13 to make this only then become relevant, but the behavior of Terraform v0.13 and v0.14 seems to be correct here: there is no id value set in the configuration for that data resource, and therefore its result is null to represent it being unset. Based on the schema rules, Terraform “knows” that its value can’t change as a result of reading the value during the apply step, so Terraform doesn’t mark it as (known after apply).

I think making this work correctly would require changing the provider’s declared schema for aws_ec2_transit_gateway_route_table to mark id as a “computed” attribute, which will then tell Terraform that its value will be decided only after reading the underlying data source, unless it’s set explicitly in the configuration as one of the query arguments.

I’ll add a note about this to the AWS provider issue you linked to earlier. Unfortunately based on this I don’t have a ready workaround to suggest, but hopefully there will be a new version of the AWS provider with a correct schema and then that will allow you to use this data source successfully after upgrading. :crossed_fingers:


Hi @apparentlymart
Thank you very much! Another day of great learnings :slight_smile:

Having a little more time to think about it, a possible workaround did come to mind:

For many AWS object types, the ARN has the object’s ID embedded inside it. If that’s true for EC2 Transit Gateway Route Tables then you might be able to write a more complex expression to extract the ID out of the arn attribute instead of directly using the id attribute. Because the arn is (known after apply), that should cause the derived ID to also be unknown and thus get the effect you were looking for.

I’m not familiar with this part of the EC2 API so I can’t say for certain what this would look like, but with some quick searching I found the following as an example of a transit gateway route table ARN, which seems promising in that it seems to have the object’s ID at the end of it:

arn:aws:ec2:us-east-2:111122223333:transit-gateway/tgw-0262a0e521EXAMPLE

Based on that ARN syntax, the workaround could look something like this:

split("/", data.aws_ec2_transit_gateway_route_table.this.arn)[1]

That is, to split on the single / character in the ARN and then take the second part, which should be the same as the ID.

Of course, this is obtuse to read, so I’d suggest adding a comment about what’s going on here and then switching back to the more “obvious” expression as soon as a fix for the schema problem is available.
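
Put together with such a comment, the workaround might look something like the following sketch. It assumes the two data resources from the test configuration are declared elsewhere in the module and that the route table ARN really does contain exactly one / followed by the ID. The example ARN above has the resource type transit-gateway, so it is worth checking the real arn value before relying on this. The local value name and the CIDR block are hypothetical.

```hcl
locals {
  # WORKAROUND: the provider schema for aws_ec2_transit_gateway_route_table
  # does not mark "id" as computed, so Terraform plans it as null rather than
  # (known after apply). The arn attribute is computed, so deriving the ID
  # from it keeps the value unknown until the data resource is actually read.
  # Assumes the ARN has the form "<prefix>/<id>" with a single slash.
  # Switch back to data.aws_ec2_transit_gateway_route_table.this.id once the
  # provider schema is fixed.
  tgw_route_table_id = split("/", data.aws_ec2_transit_gateway_route_table.this.arn)[1]
}

resource "aws_ec2_transit_gateway_route" "this" {
  destination_cidr_block         = "10.0.0.0/16" # hypothetical value
  transit_gateway_attachment_id  = data.aws_ec2_transit_gateway_vpn_attachment.this.id
  transit_gateway_route_table_id = local.tgw_route_table_id
}
```

Keeping the expression in a single local value means there is only one place to change when the workaround is no longer needed.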