I’m using data sources to generate lists of VM IP addresses, which I then use as an Ansible inventory to provision the machines. I use the following code:
data "digitalocean_droplets" "all" {}

data "aws_instances" "all_aws" {}

locals {
  all_aws = data.aws_instances.all_aws.public_ips
  all_do  = data.digitalocean_droplets.all.droplets[*].ipv4_address
}

resource "local_file" "inventory" {
  filename = "inventory"
  content  = <<-EOF
    [do]
    %{for ip in local.all_do~}
    ${ip}
    %{endfor~}
    [aws]
    %{for ip in local.all_aws~}
    ${ip}
    %{endfor~}
  EOF
}
This works great for me, except for one annoying caveat. Every time I add or remove a VM I need to run terraform apply twice: the first run does the creation/removal, and the second run recreates the local inventory file. I tried forcing the ordering with depends_on, both on the local file and on the data sources, but it does not work: in the early apply phase the new machines do not exist yet, so the file is recreated with the old contents.
Is there any hack to force a refresh of the data sources after the machines are created? Running apply twice or using taint is frustrating.
So depends_on on the data sources actually fixes the issue; however, there are two problems with this approach:

I need to declare depends_on in every data source.

I have to modify the data sources to point at every resource with a machine I have, which is counterproductive to the aim of using data sources in this scenario (which was to consolidate information without additional manual copy-paste).
Is there any way to force a resource to be created last in the plan? I could just make a null resource and force the data sources to depend on it.
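For concreteness, the workaround I have in mind would look roughly like this (the resource names aws_instance.web and digitalocean_droplet.db are just placeholders for my actual machine resources):

resource "null_resource" "machines_ready" {
  depends_on = [aws_instance.web, digitalocean_droplet.db]
}

data "aws_instances" "all_aws" {
  depends_on = [null_resource.machines_ready]
}

data "digitalocean_droplets" "all" {
  depends_on = [null_resource.machines_ready]
}

That way only the null resource would need to list every machine, and the data sources would all wait on it.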
What you’ve described here is a typical problem that arises if you try to manage and read the same objects in the same configuration.
I typically recommend against doing this altogether. Instead, it’s often better to refer directly to the resource blocks that are managing these objects so that Terraform can automatically recognize that the local file depends on the EC2 instances and the DigitalOcean droplets, and therefore order the operations correctly without you needing to explicitly specify the hidden dependencies.
The snippet you shared doesn’t include the resource "aws_instance" "..." and resource "digitalocean_droplet" "..." blocks, so I can’t show a full example, but the general idea would be to remove the data "digitalocean_droplets" "all" block and the data "aws_instances" "all_aws" block and refer to one or more aws_instance and digitalocean_droplet resources elsewhere in the configuration. The resources to refer to will presumably be the same resources you previously specified in your depends_on argument.
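As a sketch, assuming hypothetical resources aws_instance.web and digitalocean_droplet.db that each use count, the direct-reference version would look something like this:

locals {
  all_aws = aws_instance.web[*].public_ip
  all_do  = digitalocean_droplet.db[*].ipv4_address
}

resource "local_file" "inventory" {
  filename = "inventory"
  content  = <<-EOF
    [do]
    %{for ip in local.all_do~}
    ${ip}
    %{endfor~}
    [aws]
    %{for ip in local.all_aws~}
    ${ip}
    %{endfor~}
  EOF
}

Because the locals now refer directly to managed resources, Terraform sees the dependency and writes the inventory file after creating or destroying instances, all within a single apply.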
I think I will abandon this approach altogether in favor of inventory plugins in Ansible. The approach I described above is really convoluted: I have several explicit resources and a dozen resources defined in my module hierarchy, and trying to consolidate them this way is asking for trouble. I’ll just tag the instances and use the tags to build the host hierarchy for provisioning.
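On the Terraform side, the tag-based direction would look roughly like this (the tag values are just examples); Ansible's dynamic inventory plugins for AWS and DigitalOcean can then group hosts by these tags instead of reading a generated file:

resource "aws_instance" "web" {
  # ... instance arguments ...
  tags = {
    ansible_group = "web"
  }
}

resource "digitalocean_droplet" "db" {
  # ... droplet arguments ...
  tags = ["db"]
}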