Run new client "bootstrap" job before allowing "normal" jobs

Good afternoon …

I am looking for some guidance on how I may be able to achieve the below … Let’s assume my Nomad cluster is fully operational and I am using appropriate ACLs …

  • New client node (running as root) comes online and successfully registers itself with the servers
  • Nomad deploys a bootstrap job (one and done) to the new node to bring it into full alignment
    • Let’s say the bootstrap job is an Ansible playbook, run against localhost
    • No other jobs will be deployed to this new client node until the bootstrap job has finished
    • Just for clarity, I am not referring to bootstrapping the ACL system, these are traditional server configs (yum packages, dirs, files, perms, etc.)
  • When bootstrap job has finished, node becomes available to allocate normal jobs
    • Example of a normal job would be to deploy a Java artifact using the Java driver
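To make the gating concrete, a “normal” job could carry a constraint that only matches nodes whose meta says they are ready. This is a minimal sketch, not a tested spec — the job/task names, jar path, and artifact URL are all hypothetical:

```hcl
# java-app.nomad -- sketch only; names and the artifact URL are assumptions
job "java-app" {
  type = "service"

  # keep "normal" jobs off nodes that haven't finished bootstrapping
  constraint {
    attribute = "${meta.state}"
    value     = "ready"
  }

  group "app" {
    task "app" {
      driver = "java"

      config {
        jar_path = "local/app.jar"
      }

      artifact {
        source = "https://example.com/app.jar"
      }
    }
  }
}
```

With this in place, the scheduler simply won’t consider the new node for this job until its meta flips to ready.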

I assume that:

  • The bootstrap job would use the raw_exec driver
  • The bootstrap job would use the sysbatch scheduler
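Putting those two assumptions together, the bootstrap job might look roughly like this — a minimal sketch, where the job/task names, playbook path, and ansible-playbook arguments are assumptions on my part:

```hcl
# bootstrap.nomad -- a sketch, assuming the playbook lives on the node
job "bootstrap" {
  type = "sysbatch"

  # only land on nodes that are still bootstrapping
  constraint {
    attribute = "${meta.state}"
    value     = "bootstrapping"
  }

  group "bootstrap" {
    task "ansible" {
      driver = "raw_exec"

      config {
        command = "ansible-playbook"
        args    = ["--connection=local", "--inventory=localhost,", "bootstrap.yml"]
      }
    }
  }
}
```

Because it’s sysbatch, the job should run once on every node that matches the constraint, including nodes that join later.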

But how do I get this ONE bootstrap job to a) discover this new node, and b) run as the first job, before any other “normal” job tries to allocate to it ??

My initial idea was to start with the client config, adding some meta that declares the current state as bootstrapping … My bootstrap job would have that state as a constraint, the job runs against the new node (again, an Ansible playbook against localhost), and the playbook would then transition the state to ready during the bootstrap process by updating the config (see below) …
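For illustration, the meta would sit inside the client stanza of the node’s config file — a minimal sketch, where the file name and the surrounding stanzas are assumptions:

```hcl
# client.hcl -- sketch; only the meta block matters here
client {
  enabled = true

  meta {
    state = "bootstrapping"
  }
}
```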

But then I THINK I would need to restart the client for that change to take effect (I read somewhere that a SIGHUP doesn’t fully reload all client configs, and confirmed it to be so when running Nomad in -dev mode) – and so my assumption is the job would fail (using the thing, to update the thing, to restart the thing) …

My preference would be that I could fully manage this only using Nomad tooling and not have to rely on other components (i.e. Consul, consul-templates, Vault, complicated user-data script, etc.) << not against them at all, just trying to find ways to baby step introduce this into existing workflows …

I hope this all makes sense, and I am ok with a “that’s not what this was designed for” answer if that’s what it is – don’t want to square peg it …

Thanks for the help !!

# client config

# before
meta {
  state = "bootstrapping"
}

# after
meta {
  state = "ready"
}
FWIW, I was able to figure this out, and it wasn’t that hard … I just needed a legitimate cluster running (1 server, 1 client), where job allocations would not be removed after restarting Nomad … It seems running with the -dev option removes those allocations/jobs after a restart ??

Anyway – here is my working example:

I tried to be descriptive in the README so you can follow along …

On a side note, this is really nice, because it allows me to introduce Nomad into existing team workflows, without having to introduce a bunch of tooling … I can easily see where Nomad could manage config management executions in a centralized environment, instead of relying on “complicated” server cron jobs …

Anyway – I hope this helps if you have similar requirements/goals …


Hi @gkspranger and thanks so much for answering and posting a link to your repository.

jrasell and the Nomad team