Local Nomad Cluster Using Vagrant

Hi,

I’m looking for resources on setting up a local, stateful Nomad cluster for development, using Vagrant. The Nomad tutorial introduces tooling to create a stateless -dev cluster. I’d like to expand this to a stateful cluster that uses Vault and Consul as well. Any tips would be appreciated.

Thanks,
Manoj

Hey Manoj,

I set this up locally via Ansible + custom playbooks for each service, which extend the ansible-community Consul, Vault, and Nomad roles.

Depending on your configuration, you may be able to end up with a fully provisioned cluster from a single vagrant up call.
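For reference, a playbook along these lines is the general shape (the host group name, versions, and variable values here are placeholders, not my actual files; the roles are the ansible-community ones, cloned into your roles path):

```yaml
# Hypothetical sketch: group name, versions, and variable values are
# placeholders. Roles are the ansible-community consul/nomad roles,
# assumed to be available under roles/ with these directory names.
- hosts: hashistack
  become: true
  roles:
    - role: ansible-consul
      vars:
        consul_version: "1.11.2"
    - role: ansible-nomad
      vars:
        nomad_version: "1.2.5"
        nomad_use_consul: true
```

With a Vagrantfile's ansible provisioner pointed at a playbook like this, the provisioning runs as part of vagrant up.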


Hi @egmanoj :wave:

Have you tried HashiQube?

It’s a community project that will spin up a VM with several of the HashiCorp tools running. Give it a try and see if it fits what you need :slightly_smiling_face:


I haven’t. And it looks very interesting. Will take it for a spin. Thanks!


Nice! Let us know how it goes :grinning_face_with_smiling_eyes:

Tried another approach first. Rationale: I want to use the same tools, as much as possible, for local, QA, staging, and prod. With that in mind I tried my hand at writing a Terraform script to provision a local VirtualBox cluster, hoping to reuse (more or less) the same TF script for QA (AWS) and prod (AWS/GCP/Azure/custom).

There is a Terraform plugin for VirtualBox, but it appears to be flaky. I was not able to get past terraform apply. Two boxes got created, but apply itself failed after running for ~11 minutes:

virtualbox_vm.node[0]: Still creating... [11m40s elapsed]
╷
│ Error: [ERROR] Starting VM: exit status 1
│ 
│   with virtualbox_vm.node[1],
│   on main.tf line 15, in resource "virtualbox_vm" "node":
│   15: resource "virtualbox_vm" "node" {
│ 
╵
╷
│ Error: [ERROR] Starting VM: exit status 1
│ 
│   with virtualbox_vm.node[0],
│   on main.tf line 15, in resource "virtualbox_vm" "node":
│   15: resource "virtualbox_vm" "node" {
│ 

I think I’ll put TF for local aside for now and look closely at HashiQube. I’d prefer a ‘simpler’ cluster than what HQ builds by default, with only Nomad, Consul, and Vault installed. TF can wait until I’m ready to build QA+ clusters.
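For anyone retracing this, the failing config was roughly of this shape (a hypothetical sketch using the community terra-farm/virtualbox provider; the box image URL and sizing are placeholders, and main.tf line 15 in the errors above corresponds to the resource block):

```hcl
terraform {
  required_providers {
    virtualbox = {
      source = "terra-farm/virtualbox"
    }
  }
}

# Hypothetical sketch of the resource the errors above refer to;
# the image URL, count, and sizing are placeholders.
resource "virtualbox_vm" "node" {
  count  = 2
  name   = "node-${count.index}"
  image  = "https://app.vagrantup.com/ubuntu/boxes/bionic64/versions/1.0.282/providers/virtualbox.box"
  cpus   = 2
  memory = "1024 mib"
}
```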

Update: I’ve been working with ansible-consul and ran into a problem.

@momer Are you at liberty to share your custom playbooks? If so, could you share them/snippets?

Update: Tried using both the ansible-consul and ansible-nomad roles in the same playbook to install Consul and Nomad on the same servers, without success. As far as I can tell:

  1. Consul gets installed, starts, and is accessible via UI
  2. Nomad is installed, but doesn’t start

I am stuck at this point. How can I troubleshoot? Here is what systemd is telling me when I query the status of nomad:

vagrant@consul1:/etc/nomad.d$ sudo service nomad status
● nomad.service - nomad agent
   Loaded: loaded (/lib/systemd/system/nomad.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Fri 2022-02-04 09:37:02 UTC; 13s ago
     Docs: https://nomadproject.io/docs/
  Process: 17166 ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d (code=exited, status=1/FAILURE)
 Main PID: 17166 (code=exited, status=1/FAILURE)

  1. Has anyone managed to set up Consul and Nomad on the same servers using Vagrant (and Ansible) locally? If so, could you share your Vagrantfile and Ansible playbook?
  2. A question for HashiCorp folks: ansible-consul and ansible-nomad are very useful, but I’m not sure whether they are actively supported. Do you have documentation somewhere describing how to use them together to create local clusters?

Ref: https://github.com/ansible-community/ansible-nomad/issues/147

Hi @egmanoj,

Thanks for using Nomad. I haven’t set things up with Ansible, but the following process works for me reliably using Vagrant and Nomad. I hope it is useful for you.

Note: I use the Vagrantfile in the Nomad repo

Host machine

  • Install Vagrant
  • Install Virtualbox
  • mkdir -p /opt/gopath/src/github.com/hashicorp
  • sudo chown <username> /opt/gopath/src
  • cd /opt/gopath/src/github.com/hashicorp
  • git clone https://github.com/hashicorp/nomad
  • cd nomad
  • vagrant up linux nomad-server01 nomad-client01
  • Get coffee because that’s gonna take a bit
  • vagrant status

You should see a list of boxes like this:

linux                     not created (virtualbox)
linux-ui                  not created (virtualbox)
freebsd                   not created (virtualbox)
nomad-server01            not created (virtualbox)
nomad-client01            not created (virtualbox)
nomad-server02            not created (virtualbox)
nomad-client02            not created (virtualbox)
nomad-server03            not created (virtualbox)
nomad-client03            not created (virtualbox)

Single Server Setup

nomad-server01

  • vagrant ssh nomad-server01
  • make bootstrap - installs dependencies
  • make dev - builds Nomad from source in the current directory bin folder
  • cp bin/nomad /opt/gopath/bin/nomad
  • sudo mkdir /etc/nomad.d/data
  • sudo chown vagrant /etc/nomad.d
  • vim /etc/nomad.d/server.hcl
log_level = "DEBUG"
datacenter = "dc1"
data_dir = "/etc/nomad.d/data"
enable_debug = true

server {
  enabled = true
  bootstrap_expect = 1 # this could also be 3 or 5 if you need/want to test with multiple servers
}

addresses {
  rpc = "192.168.56.11"
  serf = "192.168.56.11"
}
  • sudo vim /etc/systemd/system/nomad.service
[Unit]
Description=Nomad Service
Requires=network-online.target
After=network-online.target

[Service]
LimitAS=infinity
LimitRSS=infinity
LimitCORE=infinity
LimitNOFILE=infinity
User=root
EnvironmentFile=-/etc/sysconfig/nomad
Environment=GOMAXPROCS=2
Restart=on-failure
ExecStart=/opt/gopath/bin/nomad agent $OPTIONS -config=/etc/nomad.d
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT
StandardOutput=journal

[Install]
WantedBy=multi-user.target
  • sudo systemctl enable nomad
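One thing worth noting: systemctl enable on its own only registers the unit for boot, so you still need to start it once. A quick sanity check from inside the box might look like this (assuming the server address from the config above):

```shell
# Start the agent now (enable alone doesn't start it),
# then confirm the server registered itself.
sudo systemctl start nomad
nomad server members -address=http://192.168.56.11:4646
```

If the server came up cleanly, the members list should show nomad-server01 as alive and (with a single server) as the leader.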

Client Setup

  • vagrant ssh nomad-client01
  • cp bin/nomad /opt/gopath/bin/nomad
  • sudo mkdir /etc/nomad.d/data
  • sudo chown vagrant /etc/nomad.d
  • vim /etc/nomad.d/client.hcl
log_level = "DEBUG"
datacenter = "dc1"
data_dir = "/etc/nomad.d/data"
enable_debug = true

client {
  enabled = true
  server_join {
    retry_join = ["192.168.56.11"]
  }
}
  • sudo vim /etc/systemd/system/nomad.service
[Unit]
Description=Nomad Service
Requires=network-online.target
After=network-online.target

[Service]
LimitAS=infinity
LimitRSS=infinity
LimitCORE=infinity
LimitNOFILE=infinity
User=root
EnvironmentFile=-/etc/sysconfig/nomad
Environment=GOMAXPROCS=2
Restart=on-failure
ExecStart=/opt/gopath/bin/nomad agent $OPTIONS -config=/etc/nomad.d
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT
StandardOutput=journal

[Install]
WantedBy=multi-user.target
  • sudo systemctl enable nomad

At this point I have a running cluster with one server and one client. I can repeat the process and adjust the configuration to add more servers (bootstrap_expect = 3, for example) or clients. Don’t forget to use an odd number of servers (1, 3, or 5) so that the cluster can establish quorum and tolerate failures. I recommend starting with the simplest possible config (a single server) and expanding progressively.
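On the odd-server-count advice: Raft quorum is a majority, i.e. n/2 + 1 with integer division, which a quick sketch makes concrete:

```shell
#!/bin/sh
# Raft quorum for an n-server cluster: majority = n/2 + 1 (integer division).
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 1   # -> 1
quorum 3   # -> 2: a 3-server cluster tolerates 1 failure
quorum 5   # -> 3: a 5-server cluster tolerates 2 failures
quorum 4   # -> 3: 4 servers tolerate only 1 failure, same as 3
```

This is why even server counts buy you nothing: 4 servers have the same failure tolerance as 3, with one more machine that can fail.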

Please let me know if that helps, or if you run into issues. I’m currently working on making this easier for users and would welcome any feedback.


No kidding :grinning: :point_down:

$ time vagrant up linux nomad-server01 nomad-client01
...
    nomad-client01: Processing triggers for dbus (1.12.2-1ubuntu1.2) ...
    nomad-client01: Processing triggers for mime-support (3.60ubuntu1) ...

real  202m56.823s
user  0m30.517s
sys   0m12.081s

Working on the rest, will keep you posted.

Hi @DerekStrickland,

Thanks a lot. I now have a local nomad cluster up and running.

I successfully tested a simple Tomcat job using raw_exec (This article helped).

Some questions/comments.

  1. Of the three VMs started using vagrant up, linux is used as a build server. If I download and install pre-built binaries, I should be able to get rid of this one. Is that correct?
  2. I see that Consul is installed as well. Is Nomad connected to Consul?

Hi @egmanoj

Sorry for the delay. My inbox is too full :cry:

So you’d have to add config to the server/client HCL to connect to Consul. And for good measure, I think you’d probably want to add a similar /etc/systemd/system/consul.service file for Consul.
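Concretely, the Consul wiring is a consul block in the Nomad agent config; something like this (the address assumes a Consul agent already running locally on the default port):

```hcl
# Added to /etc/nomad.d/server.hcl (and client.hcl); assumes a local
# Consul agent on the default HTTP port.
consul {
  address          = "127.0.0.1:8500"
  auto_advertise   = true
  server_auto_join = true
  client_auto_join = true
}
```

With that in place, Nomad registers itself in Consul and servers/clients can discover each other through it.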

No problem!

Figured this out, thanks.

A note for future readers:

In the end I managed to create a local Nomad cluster as follows:

  1. Install Vagrant, and Ansible in the host (my laptop)
  2. Create a Vagrantfile to spin up three virtual machines - one server, two clients
  3. Provision them (I used Ansible, mainly because I’m very familiar with the tool)
    1. Install Docker (I used the vagrant-docker-compose plugin)
    2. Install Consul
    3. Configure Consul - consul.hcl, systemd unit
    4. Install Nomad
    5. Configure Nomad - server/client.hcl, systemd unit
  4. Finally, run vagrant up

With the cluster up and running I tested out the official tutorial on load balancing using Fabio. I did run into some issues there, but those were unrelated to the cluster itself.

Note that I did not use the official(-ish) ansible-consul and ansible-nomad roles, as I could not get them to work together. Instead I wrote simple roles to install (using apt) and configure (a combination of copy and template) Consul and Nomad.
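The Vagrantfile for step 2 above has roughly this shape (a hypothetical sketch, not my actual file: the box name, IPs, and playbook path are placeholders):

```ruby
# Hypothetical sketch of the three-VM layout from step 2 above;
# box name, IPs, and playbook path are placeholders.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"

  { "server1" => "192.168.56.11",
    "client1" => "192.168.56.12",
    "client2" => "192.168.56.13" }.each do |name, ip|
    config.vm.define name do |node|
      node.vm.hostname = name
      node.vm.network "private_network", ip: ip
      # Run the Ansible provisioning (step 3) on each box.
      node.vm.provision "ansible" do |ansible|
        ansible.playbook = "site.yml"
      end
    end
  end
end
```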

For future readers, part 2: I’ve published my tooling and instructions on GitHub. Feel free to take it for a spin and share comments.


Thank you @egmanoj!!! I love the way this community shares! Great stuff!
