Sensitive state, hashing, and compliance needs

I imagine folks have already discussed this some, and I’m happy to relocate my thoughts elsewhere, but I’d like to ask about sensitive values in state and potential approaches to making state a little less sensitive.

As I understand it, the established solution for handling resources that interact with sensitive information is to consider the state file sensitive:

For any Terraform module that reads or writes Vault secrets, these files should be treated as sensitive and protected accordingly.

from https://registry.terraform.io/providers/hashicorp/vault/latest/docs

For some organizations, this doesn’t provide sufficient protection or meet compliance and legal requirements. One possible solution that has come up previously is, for some classes of sensitive information, to store a hash of the data in the state file and detect changes by comparing it with the hash of the new value. Moving the management of sensitive information out of Terraform also resolves these concerns, but it hampers Terraform rollout and adoption and splits up related infrastructure functionality.
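To make the hash-comparison idea concrete, here is a minimal sketch in Python. It is not how Terraform or any provider works today; the function names, the "hmac-sha256:salt:digest" storage format, and the per-value random salt are all assumptions made for illustration.

```python
import hashlib
import hmac
import os


def hash_for_state(secret: str, salt: bytes) -> str:
    """Return a salted digest that could be stored in state instead of the secret.

    Hypothetical storage format: "hmac-sha256:<salt hex>:<digest hex>".
    """
    digest = hmac.new(salt, secret.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"hmac-sha256:{salt.hex()}:{digest}"


def has_changed(stored: str, new_secret: str) -> bool:
    """Report whether new_secret differs from the value behind the stored digest."""
    _, salt_hex, digest = stored.split(":")
    candidate = hmac.new(
        bytes.fromhex(salt_hex), new_secret.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison, so the check itself does not leak timing information.
    return not hmac.compare_digest(candidate, digest)


stored = hash_for_state("s3cr3t", os.urandom(16))
print(has_changed(stored, "s3cr3t"))   # False
print(has_changed(stored, "rotated"))  # True
```

One caveat with this kind of scheme: a salted hash of a low-entropy value (a short password, say) can still be brute-forced by anyone who can read the state file, so it likely suits high-entropy secrets such as generated keys better than human-chosen ones.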

Some potential approaches to realizing sensitive data hashing, which come with varying trade-offs around who ends up supporting the functionality:

  • define a recommended hashing strategy for Terraform providers, and expect providers to add appropriate (opt-in?) support; this pushes most of the work out to providers
  • define this strategy, and add minimal functionality to Terraform core to support providers that want to implement it (perhaps just a terraform-block level flag like hash_sensitive_state)
  • add support to Terraform core explicitly, such that reads from Terraform state prefer a hash comparison over a real read (perhaps not viable, given the existing SDK/API), and writes of sensitive data hash it prior to writing out to the state file
  • add a hash_changes field to the lifecycle block (or some similar directive) that asks Terraform to explicitly hash the value of the resource attribute instead of storing it verbatim
  • introduce a different kind of plugin (distinct from the provider kind) that allows the plugin to abstract or intercept state interactions and add this sort of functionality (among others)

For organizations moving toward Terraform as a tool for encoding durable infrastructure, it can be painful and detrimental to adoption to have to reject proposals on the grounds that they don’t meet compliance needs, and then to require infra engineers to reimplement provider functionality in a local-exec provisioner just to configure sensitive data.


For those interested in security, keeping sensitive values out of .tfstate is very much needed!!

I like the plugin suggestion.

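variable "private_key_pem" {
  type      = string
  sensitive = true
  # Proposed argument, not an existing Terraform feature:
  # names the plugin that transforms the value before it reaches state.
  transform = name_of_plugin
}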

A transform plugin could, at minimum, provide raw_to_state and state_to_raw functions, and maybe also a raw_to_redacted function for display output.
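As a rough illustration of that plugin surface, here is a sketch in Python. Only the three function names come from the suggestion above; the Transform protocol, the Sha256Transform class, and the "sha256:" prefix are hypothetical. It also shows one design wrinkle: a one-way hash cannot implement state_to_raw, so that function would only make sense for reversible transforms such as encryption.

```python
import hashlib
from typing import Protocol


class Transform(Protocol):
    """Hypothetical interface a state-transform plugin might expose."""

    def raw_to_state(self, raw: str) -> str: ...
    def state_to_raw(self, stored: str) -> str: ...
    def raw_to_redacted(self, raw: str) -> str: ...


class Sha256Transform:
    """One-way transform: state holds only a digest of the raw value."""

    def raw_to_state(self, raw: str) -> str:
        return "sha256:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def state_to_raw(self, stored: str) -> str:
        # A hash cannot be reversed; a reversible transform (for example,
        # encryption with an externally held key) would return the raw value here.
        raise NotImplementedError("hashing is one-way; the raw value is not recoverable")

    def raw_to_redacted(self, raw: str) -> str:
        # Placeholder shown in display output instead of the real value.
        return "(sensitive value)"


transform: Transform = Sha256Transform()
stored = transform.raw_to_state("-----BEGIN PRIVATE KEY-----")
print(stored.startswith("sha256:"))                  # True
print(transform.raw_to_redacted("anything at all"))  # (sensitive value)
```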