Working/modern version of "auto bootstrapping nomad cluster" post

This post seems outdated: "Auto-bootstrapping a Nomad Cluster". The repo it links to has questions and no answers, and running through it hits many syntax errors.

Is there anything recent that can shed some light on bootstrapping a production Nomad/Consul cluster? Presumably using Terraform.

Hi @josh.m.sharpe :wave:

Yes, that’s an old post and repo that seems to have been created more as a demonstration of a Nomad feature at the time.

For a more up-to-date resource, you can check the terraform/aws directory on the main branch of the hashicorp/nomad repository on GitHub.

Let us know if you hit any problems with this as well.

@lgfa29 just a thought: how about a post (like an announcement post) where people can reply with links to articles that they notice are old, outdated, or just not working (usually because the network stanza moved to the group level)?

The Learn/Docs Team could then mark those repos/links/blogs as deprecated with a top-level banner indicating to go look elsewhere for “latest version” documentation!

#just-a-thought

I agree, it would be great if users could give more feedback to the docs and especially the “learn” resources (since there’s no public git repo for them). I know, this causes more work for the documentation team, but could improve the documentation and lower the number of support/clarification requests in the forum.

@lgfa29 Looks like it got all the way to this point and then hung there for 10-15 minutes before I pressed Ctrl-C:

==> amazon-ebs: Creating AMI hashistack 1618687925 from instance i-0b5d0b2a49ce4abba
amazon-ebs: AMI: ami-0b55bf923cce1a245
==> amazon-ebs: Waiting for AMI to become ready...
Cancelling build after receiving interrupt
==> amazon-ebs: Error waiting for AMI: RequestCanceled: waiter context canceled
==> amazon-ebs: caused by: context canceled
==> amazon-ebs: Provisioning step had errors: Running the cleanup provisioner, if present...
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' errored after 16 minutes 15 seconds: Error waiting for AMI: RequestCanceled: waiter context canceled
caused by: context canceled

==> Wait completed after 16 minutes 15 seconds
Cleanly cancelled builds after being interrupted.

I see the (relatively) recent commit to update the AMI, but using a version of Ubuntu that is 5 years old doesn’t give me a lot of faith this thing will work.

Thanks for the idea @shantanugadgil. I raised this issue internally and we’ll add a note to blog posts that are outdated (here’s an example from Vault: Vault: Cubbyhole Authentication Principles).

With regards to a pinned post, I think it could get hard to follow individual discussions if everything is reported in the same post. Everyone should feel welcome to just start a new post :slightly_smiling_face:

Thank you for the feedback @fhemberger. Causing more work is not really a problem since we would always like to know when things don’t work :grinning_face_with_smiling_eyes:

Learn guides have a feedback button in their footer that you can use to report any issues:

Once you click on one of them, a text box will appear where you can provide more details.

It seems like the AMI snapshot is still being built:

==> amazon-ebs: Waiting for AMI to become ready...

This sometimes happens with AWS, and I am not quite sure why. If you look in your AWS dashboard under EC2 -> Snapshots you will be able to see its progress there.
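If you would rather check from a script than from the dashboard, here is a minimal Python sketch that summarizes snapshot progress. It assumes you already have the `Snapshots` list from an EC2 `describe_snapshots` response (for example via boto3, which is not imported here); the function names `summarize_snapshots` and `all_completed` and the sample snapshot ID are made up for illustration.

```python
from typing import Dict, Iterable, List


def summarize_snapshots(snapshots: Iterable[Dict]) -> List[str]:
    """Return one readable line per snapshot: id, state and progress."""
    lines = []
    for snap in snapshots:
        snapshot_id = snap.get("SnapshotId", "<unknown>")
        state = snap.get("State", "<unknown>")
        # EC2 reports progress as a string such as "42%".
        progress = snap.get("Progress") or "n/a"
        lines.append(f"{snapshot_id}: {state} ({progress})")
    return lines


def all_completed(snapshots: Iterable[Dict]) -> bool:
    """True only if there is at least one snapshot and all are completed."""
    snapshots = list(snapshots)
    return bool(snapshots) and all(
        s.get("State") == "completed" for s in snapshots
    )


if __name__ == "__main__":
    # Hypothetical sample data. With boto3 installed and credentials set up,
    # the list could instead come from something like:
    #   boto3.client("ec2").describe_snapshots(OwnerIds=["self"])["Snapshots"]
    sample = [
        {"SnapshotId": "snap-0123", "State": "pending", "Progress": "42%"},
    ]
    for line in summarize_snapshots(sample):
        print(line)
    print("done" if all_completed(sample) else "still building")
```

As long as the snapshot is still `pending`, Packer will keep waiting, so a slowly climbing progress value means the build is slow rather than stuck.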

I started updating some of the dependencies, and the new code should be available soon. Thanks for the heads up :slightly_smiling_face:

Has any progress been made here? I’m hours into the documentation-blog-YouTube-o-sphere this morning and every resource I find has some blocker. Maybe I have a misunderstanding of this product line. Does everyone manually set up their initial cluster? Is it just me, or is it ironic that a platform designed for configurable/repeatable provisioning doesn’t have a well-documented method of setting itself up in a few common environments?

Here are some examples:
Nomad - The Hard Way - YouTube → EXCELLENT video. Suggests we manually provision instances and download/install the client. Easy enough, sure, but if I’m going to do it a bunch of times while I learn and break stuff couldn’t it be automated?

https://registry.terraform.io/modules/hashicorp/nomad/aws/latest/submodules/nomad-cluster → “Specify the ID of the Nomad AMI.” → Yes, which one? Where do I find that?

GitHub - hashicorp/nomad-auto-join: Terraform config to automatically bootstrap a Nomad cluster → Maybe? 4 years old.

Even this black box of non-learning, "HashiCorp Nomad on AWS - Quick Start", is failing with:

CREATE_FAILED Template error: Fn::Select cannot select nonexistent value at index 2

Hi @josh.m.sharpe :wave:

Yes, we’ve been discussing internally how to best provide practical guidance on how to deploy Nomad. We know it’s frustrating that there is no “single” way to do it, but the challenge has been how to balance different environments and requirements into a single turnkey solution.

The work that I mentioned about updating the Terraform configuration got sidetracked, but I was able to do some more updates today and it seems to be working now. You can check my changes in this branch: https://github.com/hashicorp/nomad/tree/refactor-aws-terraform/terraform/aws

I would say that Packer is probably the most common way to deploy Nomad, but what goes into the image (configuration files, for example) is very dependent on each environment. Ansible is a common tool as well. There are also things that need to be done manually, like bootstrapping the ACL system and setting up TLS and gossip encryption.
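As one small example of those manual steps, gossip encryption needs a shared secret that every server is configured with. The sketch below generates and sanity-checks such a key in Python. It assumes the key is a standard base64-encoded random value of 16, 24 or 32 bytes (the AES key sizes), with 32 bytes as the default; please check the Nomad documentation for your version for the exact requirement. The function names are made up for illustration and are not part of any Nomad tooling.

```python
import base64
import secrets

# Assumption: accepted raw key lengths are the AES key sizes.
VALID_KEY_SIZES = (16, 24, 32)


def generate_gossip_key(num_bytes: int = 32) -> str:
    """Return a base64-encoded random key of the requested size."""
    if num_bytes not in VALID_KEY_SIZES:
        raise ValueError(f"key size must be one of {VALID_KEY_SIZES}")
    raw = secrets.token_bytes(num_bytes)  # cryptographically secure bytes
    return base64.b64encode(raw).decode("ascii")


def is_valid_gossip_key(key: str) -> bool:
    """Check that a key is valid base64 and decodes to an accepted size."""
    try:
        raw = base64.b64decode(key, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return len(raw) in VALID_KEY_SIZES


if __name__ == "__main__":
    key = generate_gossip_key()
    print(key)
    print("valid" if is_valid_gossip_key(key) else "invalid")
```

The generated value would then go into the image or configuration management (Packer, Ansible, etc.) as a secret, so that all servers in the cluster share the same key.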