I’m looking for resources on setting up a local, stateful Nomad cluster for development, using Vagrant. The Nomad tutorial introduces tooling to create a stateless -dev cluster. I’d like to expand this to a stateful cluster that uses Vault and Consul as well. Any tips would be appreciated.
I tried another approach first. Rationale: I want to use the same tools, as much as possible, for local, QA, staging, and prod. With that in mind I tried my hand at writing a Terraform script to provision a local VirtualBox cluster. I was hoping to reuse (more or less) the same TF script for QA (AWS) and prod (AWS/GCP/Azure/custom).
There is a Terraform provider for VirtualBox, but it appears to be flaky. I was not able to get past terraform apply: two boxes got created, but apply itself failed after running for ~11 minutes.
virtualbox_vm.node[0]: Still creating... [11m40s elapsed]
╷
│ Error: [ERROR] Starting VM: exit status 1
│
│ with virtualbox_vm.node[1],
│ on main.tf line 15, in resource "virtualbox_vm" "node":
│ 15: resource "virtualbox_vm" "node" {
│
╵
╷
│ Error: [ERROR] Starting VM: exit status 1
│
│ with virtualbox_vm.node[0],
│ on main.tf line 15, in resource "virtualbox_vm" "node":
│ 15: resource "virtualbox_vm" "node" {
│
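For reference, a virtualbox_vm resource for the terra-farm/virtualbox provider looks roughly like this sketch (provider version, box image URL, and network settings are illustrative, not necessarily the exact config that failed):

```hcl
terraform {
  required_providers {
    virtualbox = {
      source = "terra-farm/virtualbox"
    }
  }
}

resource "virtualbox_vm" "node" {
  count  = 2
  name   = "node-${count.index}"
  # A Vagrant .box image; the provider downloads and imports it.
  image  = "https://app.vagrantup.com/ubuntu/boxes/bionic64/versions/20180903.0.0/providers/virtualbox.box"
  cpus   = 2
  memory = "1024 mib"

  network_adapter {
    type           = "hostonly"
    host_interface = "vboxnet0"
  }
}
```

The "Starting VM: exit status 1" error above is VBoxManage failing underneath the provider, so checking the VirtualBox logs for the half-created VMs is usually more informative than the Terraform output itself.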
I think I’ll put TF for local aside for now and look more closely at HashiQube. I’d prefer a ‘simpler’ cluster than what HQ builds by default, with only Nomad, Consul, and Vault installed. TF can wait until I’m ready to build QA+ clusters.
Update: I tried using both the ansible-consul and ansible-nomad roles in the same playbook to install Consul and Nomad on the same servers, without success. As far as I can tell:
Consul gets installed, starts, and is accessible via UI
Nomad is installed, but doesn’t start
I am stuck at this point. How can I troubleshoot? Here is what systemd is telling me when I query the status of nomad:
vagrant@consul1:/etc/nomad.d$ sudo service nomad status
● nomad.service - nomad agent
     Loaded: loaded (/lib/systemd/system/nomad.service; enabled; vendor preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Fri 2022-02-04 09:37:02 UTC; 13s ago
       Docs: https://nomadproject.io/docs/
    Process: 17166 ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d (code=exited, status=1/FAILURE)
   Main PID: 17166 (code=exited, status=1/FAILURE)
Has anyone managed to set up Consul and Nomad on the same servers using Vagrant (and Ansible) locally? If so, could you share your Vagrantfile and Ansible playbook?
A question for the HashiCorp folks: ansible-consul and ansible-nomad are very useful, but I’m not sure if they are actively supported. Do you have documentation somewhere describing how to use them together to create local clusters?
Thanks for using Nomad. I haven’t set things up with Ansible, but the following process, using Vagrant and the Vagrantfile in the Nomad repository, works for me reliably. I hope it is useful for you. This is what vagrant status shows before bringing anything up:
linux not created (virtualbox)
linux-ui not created (virtualbox)
freebsd not created (virtualbox)
nomad-server01 not created (virtualbox)
nomad-client01 not created (virtualbox)
nomad-server02 not created (virtualbox)
nomad-client02 not created (virtualbox)
nomad-server03 not created (virtualbox)
nomad-client03 not created (virtualbox)
Single Server Setup
vagrant up nomad-server01
vagrant ssh nomad-server01
make bootstrap - installs dependencies
make dev - builds Nomad from source in the current directory bin folder
cp bin/nomad /opt/gopath/bin/nomad
sudo mkdir /etc/nomad.d/data
sudo chown vagrant /etc/nomad.d
vim /etc/nomad.d/server.hcl
log_level    = "DEBUG"
datacenter   = "dc1"
data_dir     = "/etc/nomad.d/data"
enable_debug = true

server {
  enabled          = true
  bootstrap_expect = 1 # this could also be 3 or 5 if you need/want to test with multiple servers
}

addresses {
  rpc  = "192.168.56.11"
  serf = "192.168.56.11"
}
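For the client side, a minimal /etc/nomad.d/client.hcl could look like the sketch below (this is my own sketch, not copied from a known-good setup; the server address matches the rpc/serf address above, and 4647 is Nomad’s default RPC port):

```hcl
log_level  = "DEBUG"
datacenter = "dc1"
data_dir   = "/etc/nomad.d/data"

client {
  enabled = true
  # Point the client at the server's RPC endpoint.
  servers = ["192.168.56.11:4647"]
}
```

With the config files in place, start the agent on each box with something like nomad agent -config=/etc/nomad.d.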
At this point I have a running cluster with one server and one client. I can repeat the process and adjust the configuration to add more servers (bootstrap_expect=3 for example) or clients. Don’t forget to use an odd number of servers (1, 3, or 5) so that a majority quorum can always be established. I recommend starting with the simplest possible config (a single server) and expanding progressively.
Please let me know if that helps, or if you run into issues. I’m currently working on making this easier for users and would welcome any feedback.
$ time vagrant up linux nomad-server01 nomad-client01
...
nomad-client01: Processing triggers for dbus (1.12.2-1ubuntu1.2) ...
nomad-client01: Processing triggers for mime-support (3.60ubuntu1) ...
real 202m56.823s
user 0m30.517s
sys 0m12.081s
Thanks a lot. I now have a local Nomad cluster up and running.
I successfully tested a simple Tomcat job using raw_exec (This article helped).
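In case it helps others, the job looked roughly like the following (the Tomcat path and job layout are illustrative, not my exact job file):

```hcl
job "tomcat" {
  datacenters = ["dc1"]

  group "web" {
    task "tomcat" {
      # raw_exec runs the command directly on the host,
      # with no isolation - fine for local experiments.
      driver = "raw_exec"

      config {
        command = "/opt/tomcat/bin/catalina.sh"
        args    = ["run"]
      }
    }
  }
}
```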
Some questions/comments.
Of the three VMs started using vagrant up, linux is used as a build server. If I download and install pre-built binaries I should be able to get rid of this one. Is that correct?
I see that consul is installed as well. Is nomad connected to consul?
So you’d have to add config to the server/client HCL to connect to Consul. And for good measure, I think you’d probably want to add a similar /etc/systemd/system/consul.service file for Consul.
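The Consul stanza in the Nomad agent config is small; a sketch, assuming Consul is running locally on its default HTTP port:

```hcl
# In /etc/nomad.d/server.hcl or client.hcl:
# tell Nomad where the local Consul agent is listening.
consul {
  address = "127.0.0.1:8500"
}
```

With that in place Nomad registers itself and its services in Consul automatically. The consul.service unit can mirror the nomad.service one, with ExecStart pointing at consul agent -config-dir=/etc/consul.d.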
With the cluster up and running I tested out the official tutorial on load balancing using Fabio. I did run into some issues there, but those were unrelated to the cluster itself.
Note that I did not use the official(-ish) ansible-consul and ansible-nomad roles, as I could not get them to work together. Instead I wrote simple roles to install (using apt) and configure (a combination of copy and template) Consul and Nomad.
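As a sketch of the kind of role I mean (the package source, task names, and file paths are illustrative, not my exact role):

```yaml
# roles/nomad/tasks/main.yml (illustrative)
- name: Install Nomad from the HashiCorp apt repository
  ansible.builtin.apt:
    name: nomad
    state: present
    update_cache: yes

- name: Render the server configuration
  ansible.builtin.template:
    src: server.hcl.j2
    dest: /etc/nomad.d/server.hcl
  notify: restart nomad
```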
Thanks a lot for the tutorial; it’s finally working for me and I can see the Nomad UI.
One thing to add:
Ports were not forwarded for some reason (server01). I checked the settings in VirtualBox and only port 22 was forwarded, so I added port 4646 there and it worked.
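For anyone hitting the same thing, the equivalent Vagrantfile entry looks like this (the machine name is an assumption; adjust to whatever your Vagrantfile defines):

```ruby
# Vagrantfile (excerpt): forward Nomad's HTTP port so the UI
# is reachable from the host at http://localhost:4646
config.vm.define "nomad-server01" do |server|
  server.vm.network "forwarded_port", guest: 4646, host: 4646
end
```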