Data sources are removed from the Terraform state file when we run refresh

Data sources that have a depends_on, either set directly or inherited because the data source lives in a module that depends on another module or template, are wiped from the state file when we run terraform refresh.

Simple example:

data "local_file" "foo" {
  filename   = "${path.module}/foo.bar"
  depends_on = [null_resource.cluster]
}

terraform {
  required_providers {
    local = {
      source  = "hashicorp/local"
      version = "2.2.3"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.2.1"
    }
  }
}

resource "null_resource" "cluster" {
  # Changes to any instance of the cluster requires re-provisioning
  triggers = {
    date = timestamp()
  }
}

  1. terraform apply and terraform show
harinireddy@Harinis-MBP test-ds-refresh % terraform apply         

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create
 <= read (data resources)

Terraform will perform the following actions:

  # data.local_file.foo will be read during apply
  # (depends on a resource or a module with changes pending)
 <= data "local_file" "foo" {
      + content        = (known after apply)
      + content_base64 = (known after apply)
      + filename       = "./foo.bar"
      + id             = (known after apply)
    }

  # null_resource.cluster will be created
  + resource "null_resource" "cluster" {
      + id       = (known after apply)
      + triggers = {
          + "date" = (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

null_resource.cluster: Creating...
null_resource.cluster: Creation complete after 0s [id=5266093942093123266]
data.local_file.foo: Reading...
data.local_file.foo: Read complete after 0s [id=ad881bf69d02d5529818391e5eb0afea4c26349e]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
harinireddy@Harinis-MBP test-ds-refresh % terraform show 
# data.local_file.foo:
data "local_file" "foo" {
    content        = "hi hello"
    content_base64 = "aGkgaGVsbG8="
    filename       = "./foo.bar"
    id             = "ad881bf69d02d5529818391e5eb0afea4c26349e"
}

# null_resource.cluster:
resource "null_resource" "cluster" {
    id       = "5266093942093123266"
    triggers = {
        "date" = "2022-11-29T11:24:21Z"
    }
}

  2. terraform refresh and terraform show

harinireddy@Harinis-MBP test-ds-refresh % terraform refresh
null_resource.cluster: Refreshing state... [id=5266093942093123266]
harinireddy@Harinis-MBP test-ds-refresh % terraform show   
# null_resource.cluster:
resource "null_resource" "cluster" {
    id       = "5266093942093123266"
    triggers = {
        "date" = "2022-11-29T11:24:21Z"
    }
}

Hi @hkantare,

Data sources are only kept in the final state as an implementation detail, though they can help with debugging at times. If values are needed externally from the data source, those should be assigned to root module outputs. Is there a reason you need the data source to be present in the state?
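
For reference, a minimal sketch of what assigning the value to a root module output could look like, assuming the configuration from the question. The output name foo_content is made up for illustration, and I have not verified how this interacts with the refresh behavior described above.

```hcl
# Hypothetical output name; exposes the file content read by the
# data source so it does not have to be looked up via data.local_file.foo
# in the state.
output "foo_content" {
  value = data.local_file.foo.content
}
```

After an apply, the value should then be readable with terraform output foo_content rather than by inspecting the data source entry in terraform show.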

As for the behavior, I have a feeling that this is from some of the quirks in the old refresh command. I would try using the -refresh-only plan option and see if it gives you a different result with the current release.

I recently asked a question which led to some informative responses about data sources, which may be useful for more background reading: Why does `-refresh=false` not disable refresh of data sources?