Problem getting vault secrets in a Nomad job

I don’t know if this is a Vault question or Nomad question, but I’m trying here. Admins, feel free to move this over. :slight_smile:

Based on the docs (Vault Integration and Retrieving Dynamic Secrets | Nomad - HashiCorp Learn), I think I’m doing things correctly, but I could be way off.

I have a vault policy like so:

path "kv-v2/db" {
  capabilities = ["read"]
}

I add that to vault like so:

vault policy write db /ops/shared/config/db-policy.hcl

My Nomad servers have a valid vault token which was created using the steps in the doc link above. The token was put in the environment, and then Nomad was started.

My jobs have this stanza in the top level:

vault {
  policies = ["db"]
  change_mode   = "signal"
  change_signal = "SIGUSR1"
}

My template in the job is this:

      template {
        data = <<EOT
          {{ with secret "kv-v2/db" }}
POSTGRES_USER="{{ .Data.data.user }}"
POSTGRES_PASSWORD="{{ .Data.data.pass | toJSON }}"
          {{ end }}
EOT
        destination = "db.env"
        env         = true
      }

From the docs, it looks like I’m doing everything right, but when I run my jobs, they fail to run, and I get this error in the Nomad console:

Missing: vault.read(kv-v2/db)

I created the ‘db’ policy, which allows reads to kv-v2/db, and I put the ‘db’ policy in the vault stanza for the job…but it’s not working.

Any hints? Please feel free to point me to the paragraph(s) in the documentation I have missed. :slight_smile:

Thank you!

I found this post, “Nomad unable to get vault token”, but from what I can see, I did everything the poster did in their “working” setup. I’ve modified my setup slightly, but not in any way that should change how it works.

I moved the vault stanza to the task section.

I renamed the mount point to ‘secret’, so I now write the secret like this:

vault kv put secret/db/config user=blah pass=blah

My policy, which is added to Vault under the name ‘db’, is:

path "secret/db/*" {
  capabilities = ["read"]
}

The vault stanza in my task is:

      vault {
        policies = ["db"]
        change_mode   = "signal"
        change_signal = "SIGUSR1"
      }

I would be very appreciative of any tips. This is my last hurdle to get my Proof of Concept cluster up and going. :slight_smile:

OK, digging around on the client syslogs, I found this:

Apr  1 22:09:09 ip-172-31-27-91 nomad[1815]:     2021/04/01 22:09:09.197562 [WARN] (view) vault.read(secret/db/config): vault.read(secret/db/config): Error making API request.
Apr  1 22:09:09 ip-172-31-27-91 nomad[1815]: URL: GET http://active.vault.service.consul:8200/v1/secret/data/db/config
Apr  1 22:09:09 ip-172-31-27-91 nomad[1815]: Code: 403. Errors:
Apr  1 22:09:09 ip-172-31-27-91 nomad[1815]: * 1 error occurred:
Apr  1 22:09:09 ip-172-31-27-91 nomad[1815]: #011* permission denied

So, I could have surmised that. :slight_smile: It still doesn’t tell me why, of course. How do I go about debugging this? How do I see what token the server is trying to use? What token the client is trying to use?

Thanks!

Logging in to the Vault UI with the Nomad server token gives me a view where I can’t see secret/.

So, that gives me something. Hmm…nothing in the bring-up docs talks about giving permissions to the nomad server to read the tree.

I’m guessing read permissions to secret/* need to be added to the nomad-server-policy.hcl presented in the Vault/Nomad integration doc referenced in my first post.

But that doesn’t make sense either. What purpose does the task’s policy serve if Nomad can read the entire tree anyway?

OR…is the policy needed to render the template different from the policy/role given to the app once it’s started?

Any insight on all this would be appreciated. :slight_smile:

OK…I’m out of ideas. I added this to the nomad-server policy.

path "secret/" {
  capabilities = ["read"]
}

path "secret/*" {
  capabilities = ["read"]
}

Now, when I log in to Vault using the nomad token, I can see secret/ in the top-level list, but when I click on “secret/” I am told: “You don’t have access to secret/.”

Help.

The saga continues. I changed the nomad policy to

path "secret/" {
  capabilities = ["list", "read"]
}

path "secret/*" {
  capabilities = ["list", "read"]
}

I can now list everything in the vault UI when logging in with the nomad token.

Still getting this when the app starts: Missing: vault.read(secret/db/config)

Next try: add “list” to the db policy the app uses.

I edited my db policy to

path "secret/db/*" {
  capabilities = ["list", "read"]
}

Still being told “Missing vault.read”

I’m lost now…

Hey, I’ve got Vault working with Nomad. KVv2 stores need an extra item in the “with secret” section.

This:

{{ with secret "kv-v2/db" }}
POSTGRES_USER="{{ .Data.data.user }}"

Should be like:

{{ with secret "kv-v2/data/db" }}
POSTGRES_USER="{{ .Data.data.user }}"

Notice the “/data” between the mount point and the path.
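Put together for the later layout in this thread (mount named secret, secret written with vault kv put secret/db/config), the whole template might look roughly like the sketch below. The secrets/db.env destination and the unquoted toJSON values are assumptions for illustration, not something taken from the original job.

```hcl
template {
  # KV v2: the API path is <mount>/data/<secret path>, and the values
  # sit one level down, under .Data.data
  data = <<EOT
{{ with secret "secret/data/db/config" }}
POSTGRES_USER={{ .Data.data.user | toJSON }}
POSTGRES_PASSWORD={{ .Data.data.pass | toJSON }}
{{ end }}
EOT

  # Assumption: render into the task's secrets/ directory
  destination = "secrets/db.env"
  env         = true
}
```

toJSON already emits a quoted string, which is why the values are not wrapped in a second pair of quotes here.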

Here’s the documentation on V1 vs V2 KV engines in the template docs: template Stanza - Job Specification | Nomad by HashiCorp

As noted, this also impacts the vault policies within Nomad.
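On the policy side, the 403 in the client log earlier in the thread was for GET /v1/secret/data/db/config, which a rule on secret/db/* does not cover. A minimal sketch of a ‘db’ policy that should match that request, assuming the mount is named secret as in the later posts (for the original mount the prefix would be kv-v2/data/ instead):

```hcl
# db-policy.hcl (sketch for a KV v2 mount named "secret")

# Reads of secret values go through <mount>/data/<path>
path "secret/data/db/*" {
  capabilities = ["read"]
}

# Optional: listing keys on KV v2 goes through <mount>/metadata/<path>
path "secret/metadata/db/*" {
  capabilities = ["list"]
}
```

It would be loaded the same way as before, with vault policy write db followed by the path to the file.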

Thank you for pointing that out! I’m not sure how I missed that. I think that needs to be called out in more places and much louder! :slight_smile:

I agree 100%. On more than one occasion I felt that some lines in the docs should have the old (now deprecated) “marquee” or “blink” tag. Like, “hey reader, you really really need to know this bit here” :grinning_face_with_smiling_eyes:

Update: it all works now! Wow…that took WAAAAAY too long to solve…

The best part is the lack of error messages in either Vault or Nomad, at least that I saw. I just happened to have come across that during my setup.

Glad it’s working for you now!

The only “error” I got was “Missing vault.read,” but yeah, that was annoying.

It may have to do with the fact that policies can be created before the mount or path exists, so Vault can’t check path validity when the policy is written. But a better error would certainly help. However, that would be hard too…because it’s a case of “the path doesn’t match,” which, technically, isn’t an error…it’s just “no privs because no path match.” Maybe the policy shouldn’t have been tied to the underlying API. Hmm…

I think a solution might be for vault to return an error that the mandatory /data for a KVv2 engine wasn’t included in the request.

That wouldn’t help on the client side, but the Vault server should know the engine type and know that the API requires /data for it.
