Nomad Job Spec Environment Variable Best Practices

At my job we had a forced migration to Nomad after using K8s/Helm successfully for a while.

Our developers have been pretty unimpressed with the lack of tooling/paved roads into deploying Nomad applications. Particularly around the handling of environment variables and secrets. As mentioned, we used Helm which allows for config maps and has tight integration with Vault.

Nomad has a strange concept called dynamic environment variables to pull variables from Vault and Consul. These are different than regular environment variables (why?) and require awful template syntax for adding env vars from Consul/Vault. There is no way to lint or check syntax/values. I was expecting much better integration considering these are all HashiCorp products.

The other solutions involving JSON files, S3, importing via consul kv are just insanity. Right now we have to paste environment variables back and forth between separate files, some in JSON format, some in the strange template HCL format. No concise syntax.

Is this really the best we have with Nomad for an environment variable/secret solution?

Hello.

It’s been a while since I worked with Helm/K8s, so I can’t quite compare which is easier.

Are you referring to the consul-template formatting when you say ‘strange concept called dynamic environment variables’?

These act as normal environment variables, except they are not passed/stored in plain text (vault).

Can also pass & store secrets/kv-data as files, etc if you like (also from consul/vault). E.g. certificates, etc.

Hello thanks for the response. Yes that is exactly what I am referring to. I thought my developers were joking when they showed me this:

      template {
        data = <<EOF
APP_DEBUG="{{key "rtb/APP_DEBUG"}}"
APP_ENV="{{key "rtb/APP_ENV"}}"
APP_NAME="{{key "rtb/APP_NAME"}}"
APP_URL="{{key "rtb/APP_URL"}}"

.....30 more

{{with secret "secret/data/rtb/supply-service/app_key"}}
APP_KEY="{{.Data.data.key}}"
{{end}}

This block above has to be duplicated a few lines below for the same application’s worker processes.
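One possible way to avoid that duplication, assuming the job file is parsed as HCL2, is to define the template body once in a locals block and reference it from both tasks. This is only a sketch: the job, group, and task names and the image are hypothetical, and the shared heredoc must not contain `${` or `%{` sequences, since HCL2 would try to interpolate them.

```hcl
locals {
  # Shared template body. Go template delimiters ({{ }}) pass through HCL2 untouched.
  app_env = <<EOF
APP_DEBUG="{{key "rtb/APP_DEBUG"}}"
APP_ENV="{{key "rtb/APP_ENV"}}"
EOF
}

job "rtb" {
  datacenters = ["dc1"]

  group "web" {
    task "app" {
      driver = "docker"

      config {
        image = "example/rtb-app:latest" # hypothetical image
      }

      template {
        data        = local.app_env
        destination = "local/app.env"
        env         = true
      }
    }
  }

  group "worker" {
    task "worker" {
      driver = "docker"

      config {
        image = "example/rtb-app:latest" # hypothetical image
      }

      # Same body as the web task, defined only once above.
      template {
        data        = local.app_env
        destination = "local/app.env"
        env         = true
      }
    }
  }
}
```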

Then in our code repo, we have to store a JSON file with the non-secret values, one for prod, one for dev.

On deploy we have to call consul kv import with the JSON file as input. And then this job spec will load from Consul.

You can’t lint those ENV vars above, and we have to copy and paste the keys to 5-6 different places each time. The main application Nomad spec, the worker application Nomad spec, our docker-compose.yaml, our docker-compose.test.yaml, the JSON file for dev, the JSON file for prod.

To add insult to injury, I figured… alright, well at least I can write a CLI tool that can generate all of this stuff for us. But no, HCL is not really JSON compatible as it claims to be. I have yet to find a single tool that can go back and forth between HCL2 and JSON.

I am at my wits end trying to make this situation better!

That sounds a bit painful, yeah.

A few things that I’ve done at some point, that you will hopefully find helpful:

  • Use existing tooling to render the job files from templates. I’ve used/use terraform and nomad-pack (Levant). Then you can e.g. replace k/v paths per environment.
  • Use environment-file(s) instead of individual values for consul/vault values.
  • Use separate (nomad/consul/vault) clusters for different environments.
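As a rough sketch of the environment-file idea: instead of one Consul key per variable, a single key can hold the whole KEY=value file, and the template block just reads it. The key name rtb/env and the destination path are hypothetical; the file could be uploaded with something like `consul kv put rtb/env @.env.prod`.

```hcl
template {
  # "rtb/env" is a hypothetical key whose value is the full KEY=value file.
  data = <<EOF
{{ key "rtb/env" }}
EOF

  destination = "local/app.env"
  env         = true # each KEY=value line becomes an environment variable
}
```

With separate clusters per environment, the same key path can hold different contents in dev and prod, so the job spec itself does not need to change.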

It’s also worth mentioning that Consul-template is very flexible (and a bit confusing; for me at least), and you can use a lot more advanced features besides the bare minimum shown in the nomad docs.

You can also ignore hcl/hcl2 altogether, and write/submit jobs as json, if you REALLY don’t like it.

Sounds like you have suffered/struggled a bit, so you probably know most of this already :wink:

Hi @nickpoulos :wave:

I’m sorry to hear your team is having trouble migrating to Nomad. We appreciate you reaching out for help, and please keep raising these so we can help you and improve Nomad.

You certainly don’t have to manually read keys into env vars if their names match the desired variable names. You can instead range over the result of the ls function and set each key as an environment variable with the corresponding value.

Here’s an example based on what you provided:

job "env" {
  datacenters = ["dc1"]
  type        = "batch"

  group "env" {
    task "env" {
      driver = "docker"

      config {
        image   = "alpine:3.15"
        command = "/bin/sh"
        args    = ["-c", "env | grep APP"]
      }

      template {
        data        = <<EOF
{{ range ls "rtb" }}
{{ .Key }}={{ .Value }}
{{ end }}
EOF
        destination = "local/env"
        env         = true
      }
    }
  }
}

After setting some keys in Consul under the rtb/* path you will get an output like this:

We’re also working on Nomad Pack to provide Helm-like management and workflows. It’s still a work in progress, but we would love some early feedback. Here’s a tutorial on how to get started:

For linting, I’m not sure what kind of validations you are looking for, but have you looked at semgrep? Their HCL support is still in beta, but it seems to work quite well.

Here’s an example of a linting rule to check for the env vars from the job above:
https://semgrep.dev/s/L3jX

Is this the type of thing you are looking for?


Thank you everyone for the thoughtful responses, even when my original post sounds a bit ranty on a second read! I guess the frustration was coming out :grimacing:

So we do have separate clusters for dev/prod for all 3 services - Nomad, Vault, Consul.

We looked at Nomad Pack and Levant, but it was confusing which one was supposed to do what. We were also hesitant to learn yet another templating language.

Right now we have landed on exactly what you suggested @runeron. We went back to good old .env files and took everything out of Consul.

We are committing .env files (.env, .env.dev, .env.prod) since they do not have sensitive data in them anymore. For that, we have a .secrets file that holds the vault annotations we need, which also gets committed.

Rather than learn another template language and install another dependency, we stuck with our app’s language, PHP, which also has a lot of templating features.

We have our job spec as a HCL/PHP template, jobspec.hcl.php and run php jobspec.hcl.php > jobspec.hcl to render our template on deploy.

At this point we could drop the .env files and instead use PHP to render a template that loads the values into Consul, or use the new range method, but there are not a lot of benefits, and the current setup keeps our job spec less complex.

We use PHP to load the .secrets file and build the “with secret” blocks dynamically.

Thank you @lgfa29, I did not realize we could range over things like that from Consul, good to know if we ever go back to storing env in Consul.


I definitely understand and empathize with your frustration. It’s just environment variables, how hard could this be :sweat_smile:

I’m glad you found a workflow that works for you. If there’s anything that we could do that would prevent you from going through this pain (maybe we’re missing some docs?) just let us know :slightly_smiling_face:

One thing that I forgot to mention is that, if you would want to iterate over Vault secrets, you can use the secrets function instead of ls in your task template. Since you are still using Vault, this can be handy in some cases.
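A minimal sketch of what that could look like, under several assumptions: the mount is KV v2 (so listing goes through secret/metadata/... and reading through secret/data/...), every secret under the path stores its value in a field named key (as app_key does above), the path contains no sub-folders, and the task’s Vault policy allows both list and read. The destination path is hypothetical.

```hcl
template {
  data = <<EOF
{{ range $name := secrets "secret/metadata/rtb/supply-service" }}
{{ with secret (printf "secret/data/rtb/supply-service/%s" $name) }}
{{ $name | toUpper }}="{{ .Data.data.key }}"
{{ end }}
{{ end }}
EOF

  # Rendered into the task's secrets directory and exported as env vars.
  destination = "secrets/app.env"
  env         = true
}
```

Under those assumptions, a secret named app_key would be exported as APP_KEY.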


@lgfa29 I would like to side with the OP about the insanity that I found Go templating to be.

After a calmer, second thought, I realized it was more of a “prior knowledge” bias for me, as I knew Jinja2 templating before, and the Go templating (with its prefix notation) seemed too weird.

That said, I feel there is a lack of examples in the docs for the templating section.

Also, the two step jump from Nomad docs into ConsulTemplate docs might not seem obvious to a new user of Nomad.

Bear with me here, I had the following recent experience …

I wanted to do some eq string matching. As I didn’t find an example, I (unnecessarily) ended up doing a regexMatch of ^foo$.

I accidentally stumbled upon an answer in this discuss forum of an eq match and then realized the simpler alternative.
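For anyone landing here with the same question, a small sketch of the eq form inside a Nomad template block. The Consul key rtb/APP_ENV comes from earlier in the thread; the LOG_LEVEL variable, its values, and the destination path are made up for illustration.

```hcl
template {
  data = <<EOF
{{ if eq (key "rtb/APP_ENV") "production" }}
LOG_LEVEL=warn
{{ else }}
LOG_LEVEL=debug
{{ end }}
EOF

  destination = "local/log.env"
  env         = true
}
```

The regexMatch version of the same condition would be `{{ if key "rtb/APP_ENV" | regexMatch "^production$" }}`, which works but is harder to read. Note the prefix notation: eq comes first, followed by its two arguments.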

So, what I think might help is, not only just listing what all functions are available with their explanation, but also examples of the same.

I have ended up writing templates after cobbling things together from learn examples, discuss answers, random searches, etc. I feel it could be simpler in one place, possibly within the ConsulTemplate wiki docs?

Regards,
Shantanu


Oh 100%. Go templates have a steep learning curve and they manage to, at the same time, look weird both to people who are used to other templating languages and to those used to Go itself :grimacing:

Unfortunately that’s what’s most used and readily available in the Go ecosystem. If there’s one benefit to Go templates, it is that at least you can transfer some knowledge (for example, Docker’s --format flag uses Go templates). So once you are past that initial pain you will start seeing other places where it can be applied.

Consul templates add another layer on top of that. Even though it’s just a new set of functions that are made available to templates, you still need to find and learn them.

Another resource that I forgot to send before is our Learn guide about Go templates:

So, from what I gather so far in the thread, we could improve our template docs page by:

  • Explicitly mentioning that they use Go templates and linking to their official docs as well as to the guide I sent above.
  • Linking directly to the list of functions that Consul Template includes instead of the repo home.
  • Adding more examples, especially around Consul and looping over variables.

Wikis are a little tricky to manage and moderate, but PRs that add more examples to our docs would be great :slightly_smiling_face:


Doc updates are available in this PR:

Let me know if I missed anything that would be worth adding :slightly_smiling_face:


Yes agreed 100%. I feel like just having many different examples of real world usage of these Go template constructs inside both Nomad templates and Consul templates would be very helpful here.

For us, I think a lot of it was the combination of having to learn both HCL and Go templates at the same time. While HCL has some agreeable and noble goals, it seems to take the worst of both worlds and merge them into something wholly proprietary and half-compatible with other more familiar config languages. Then when we ran into the environment variable questions, I had to come ask if this was truly as insane as I thought. Glad to see it’s not quite the case! We will continue to adjust our workflows as we learn more and settle on some best practices.

@lgfa29 I did have one more question about env var precedence. As I mentioned, we have two groups in our job spec. One for the web app, and one for worker jobs. They both use the same environment variables, with a few exceptions.

If we use the range Go template syntax to load the env vars from Consul in a template block, but then also specify variables in the env block, which will take precedence within the application context?

Just thinking … I would ideally want to collect the examples in one place (ConsulTemplate), though there are Nomad specific examples, which would not work in Consul Template:

Example:

{{ env "attr.memory.totalbytes" | parseInt | divide 1073741824 }}g

You can use terraform to ‘import’ variables to consul instead of consul cli
https://registry.terraform.io/providers/hashicorp/consul/latest/docs/resources/keys

You can import all sorts of things (scripts for example), not only env variables.
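A minimal sketch of that approach using the consul_keys resource from the linked provider docs. The key paths follow the rtb/* layout used earlier in the thread; the values and the resource name are placeholders, and the Consul provider is assumed to be configured elsewhere (address, token).

```hcl
resource "consul_keys" "rtb_env" {
  # Each key block writes one Consul KV entry.
  key {
    path  = "rtb/APP_ENV"
    value = "production" # placeholder value
  }

  key {
    path  = "rtb/APP_DEBUG"
    value = "false" # placeholder value
  }
}
```

Running terraform apply against each environment’s Consul cluster would then take the place of the consul kv import step on deploy.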