Terraform Provider for Vault is behaving like it's not Idempotent

We keep running into “path is already in use” errors for basic resources, such as auth methods:

https://registry.terraform.io/providers/hashicorp/vault/latest/docs/resources/auth_backend

Try running some Terraform code to create a userpass auth method more than once, if you want to see this in action. Will provide more repro steps later if anyone’s interested.

Does the Terraform provider complain if the resource has already been created, but not by the same Terraform token? E.g. if another Terraform run, or another, separate state file / workspace, has created the auth method in Vault?

Some of the details:

$ terraform providers
.
├── provider.vault ~> 2.12
├── provider.vault.engineering
└── provider.vault.finance



# Enable an auth method
resource "vault_auth_backend" "userpass" {
  path = "userpass"
  type = "userpass"
}

# Create a user under that auth method
resource "vault_generic_endpoint" "student" {
  depends_on           = [vault_auth_backend.userpass]
  path                 = "auth/userpass/users/student"
  ignore_absent_fields = true

  data_json = <<EOT
{
  "policies": ["client", "admins"],
  "password": "changeme#&*(atonce"
}
EOT
}

To reproduce, try running this from two different state files.

You should get an error like the following:

Error: error writing to Vault: Error making API request.

URL: POST https://sfvault.opr.test.statefarm.org:8200/v1/sys/mounts/userpass
Code: 400. Errors:

* path is already in use at userpass/

  on userpass_auth.tf line 2, in resource "vault_mount" "mount_userpass":
  2: resource "vault_mount" "mount_userpass" {

This is with the latest Terraform and Vault Enterprise 1.5.

Looks like the issue is from running this from 2 different workspaces.

Terraform refuses to take over the pre-existing resource. So technically it is idempotent: the second workspace never establishes control of the resource in the first place.

Hi @v61!

The intended behavior of Terraform is that a provider should return an error if asked to create something that already exists. Otherwise the remote object would likely end up bound to two different Terraform resource instances, which would violate Terraform’s assumptions. For example, if you removed the resource block from one of the configurations and applied it, Terraform would plan to delete the remote object, which would inadvertently cause “drift” for the other configuration.
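As a rough sketch of how a second configuration could depend on the auth method without also managing it: assuming the provider version in use ships the vault_auth_backend data source, the second workspace can read the mount that the first workspace owns. The output name below is only illustrative.

```hcl
# In the SECOND configuration: read the existing mount, do not create it.
# Assumes the auth method was already enabled at "userpass" by the
# configuration that owns it.
data "vault_auth_backend" "userpass" {
  path = "userpass"
}

# Hypothetical output, just to show the mount's attributes are available.
output "userpass_accessor" {
  value = data.vault_auth_backend.userpass.accessor
}
```

With this shape, only one configuration ever issues the create and delete calls, so removing the data block from the second configuration does not touch the mount.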

If you want to bind an existing remote object to an instance in a new configuration, you can use terraform import to achieve that, but note the warning on that page that you must be careful to preserve the constraint that each object is managed by only one configuration. The terraform import command cannot verify that automatically.
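For the auth method in this thread, the import could look roughly like the following. This is a sketch that assumes the mount already exists at the path userpass and that no other configuration manages it; the provider documentation describes importing auth methods by their path.

```hcl
# Resource block in the ONE configuration that should own the existing mount.
resource "vault_auth_backend" "userpass" {
  path = "userpass"
  type = "userpass"
}

# Then, from that configuration's directory, bind the existing mount to it:
#   terraform import vault_auth_backend.userpass userpass
#
# Afterwards, terraform plan should show no pending create for this resource.
```

After the import, the other configuration should drop its own vault_auth_backend block for the same path, otherwise both would again claim the same object.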

With that said, in this case it seems to be Vault itself that is enforcing this rule, because the underlying Vault API distinguishes between the following operations:

  • Enable Auth Method, which the Terraform provider uses to implement “create”
  • Disable Auth Method, which the Terraform provider uses to implement “delete”
  • Tune Auth Method, which the Terraform provider uses to implement “update” (supported only for the contents of the tune block; all other changes require object replacement)

Based on this, it seems that Vault itself doesn’t consider enabling an auth method to be idempotent (it’s modeled as an HTTP POST request), and so the Terraform provider gets that behavior “for free” and therefore automatically meets Terraform’s expectations that creating an already-existing object should fail.

Yeah, Seth’s Codifying Vault Policies and Configuration (and the crappy Python I wrote when trying to imitate Seth’s approach) both rely on that implicit “idempotency” of Vault’s API. Vault’s API design already lets me avoid that “resource so nice we created it twice!” issue, so “hats off” to the Vault devs on that.

Now back to the Terraform provider. Let’s say I only used 1 workspace, like a good Terraform user should, but I had already enabled the auth method manually.

Haven’t tested that scenario yet, but my guess/hypothesis/suspicion, from what I read here, is that terraform apply might still cause an error.

Or am I wrong, and terraform apply will detect the existing state and avoid making a POST, because it’s running a GET first to detect that the resource already exists?

Further updates to come!

P.S. this terminology of “bind” & “remote object” as in “bind an existing remote object,” “resource instances” as in “Terraform resource instances,” “configurations” as in “one of the configurations”, and “implement” as in “uses to implement ‘update’” seems specific enough to have technical definitions, as do the assumptions you refer to. In a cursory review of the Terraform docs, I didn’t find docs that seemed specific to these assumptions, although I think what you’re saying here’s pretty much implied by these two guys: