Generic locker to give identity to machines in an AWS ASG

Hi,

Does anyone know of a tool like (something like, not necessarily identical to) seatgeek/resec (“ReSeC: Redis Service Consul”, https://github.com/seatgeek/resec), but for the generic purpose of obtaining a serially increasing number for machines inside an AWS ASG?

The end goal is that every new machine in an ASG will try to “lock” a certain key and acquire an integer. This integer becomes the Nth identity of the machine.

If the AWS ASG has EC2 health checks, then when this machine reboots it could potentially “release” this number in the KV store and try to “reacquire” a number after the reboot.
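
For illustration, here is a minimal sketch of that acquire/release flow, assuming Consul as the KV store with a local agent on each machine; the asg/ids/ prefix, the slot limit, and the key payload are made-up names. A Consul session is tied to the creating node’s serfHealth check by default, so if the machine dies outright the lock is invalidated and the slot frees itself:

```python
import requests

CONSUL = "http://127.0.0.1:8500"  # local Consul agent (assumed)
PREFIX = "asg/ids/"               # hypothetical KV prefix for identity slots
MAX_SLOTS = 100                   # assumed upper bound on fleet size

def create_session():
    # Behavior=delete removes the lock key if the session is ever
    # invalidated (e.g. the node dies), freeing the slot automatically.
    r = requests.put(f"{CONSUL}/v1/session/create",
                     json={"Name": "asg-identity", "Behavior": "delete"})
    r.raise_for_status()
    return r.json()["ID"]

def acquire_first_free_slot(session_id):
    # Walk the slots in order and take the first lock we win; "?acquire"
    # is atomic in Consul, so two nodes cannot win the same slot.
    for n in range(MAX_SLOTS):
        r = requests.put(f"{CONSUL}/v1/kv/{PREFIX}{n}",
                         params={"acquire": session_id}, data="held")
        r.raise_for_status()
        if r.json() is True:
            return n
    raise RuntimeError("no free identity slot")

def release_slot(session_id, n):
    # Explicit release for the clean-shutdown path.
    requests.put(f"{CONSUL}/v1/kv/{PREFIX}{n}",
                 params={"release": session_id})

if __name__ == "__main__":
    session = create_session()
    index = acquire_first_free_slot(session)
    print(f"this machine is node {index}")
```

Since the scan always starts at 0, this also gives “first unused slot” semantics rather than an ever-growing counter.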

Any ideas/help for this use case would be appreciated.

Also, within the AWS ASG I have Nomad agents as well, which will leverage the id (it can be saved as Nomad client “meta”).
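
As a rough illustration of that last point (the asg_index key name is made up), the startup script could template the acquired id into the Nomad client configuration’s meta block:

```hcl
client {
  enabled = true

  meta {
    # Hypothetical key, written by the startup script once the
    # identity slot has been acquired from the KV store.
    asg_index = "3"
  }
}
```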

Regards,
Shantanu Gadgil

Addendum: I have some home-grown scripts and a systemd unit which “acquire” and “release” the KV path at startup and shutdown, but I was hoping there would be an existing tool.

The ASG shutdown relies on a notification hook, which gives time to execute quick scripts when a clean shutdown is possible.
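
For reference, a sketch of what such a unit might look like; the unit name and script path are hypothetical, and the lifecycle-hook handler would trigger the stop side during scale-in:

```ini
# /etc/systemd/system/asg-identity.service (hypothetical name/paths)
[Unit]
Description=Acquire/release ASG identity slot in the KV store
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Acquire the slot at boot; release it during a clean shutdown.
ExecStart=/usr/local/bin/asg-identity acquire
ExecStop=/usr/local/bin/asg-identity release

[Install]
WantedBy=multi-user.target
```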

Shantanu,

Let’s sketch out a use case that I think should address the simpler scenario where:

  1. You have a Spot Fleet on AWS
  2. You maintain external EBS volume(s) per instance in the Spot Fleet that hold the state
  3. When the Spot Fleet gets a termination recommendation or notice, you immediately start to drain the old node and spin up a new one
  4. When that new node comes up, you set it up (using userdata) with the same “node specific” metadata and attach the relevant external EBS volume(s); see the sketch after this list
  5. You have essentially migrated the old node to the new node
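
A rough sketch of step 4, assuming the volumes are pre-created and carry a hypothetical slot tag matching the acquired index (boto3 against the EC2 API; the device name is arbitrary, and the volume must live in the instance’s AZ):

```python
import boto3
import requests

# The instance metadata service tells the node its own instance id.
MD = "http://169.254.169.254/latest/meta-data"
instance_id = requests.get(f"{MD}/instance-id", timeout=2).text

index = 3  # whatever slot the KV lock handed this node (assumed)

# Hypothetical convention: one pre-created EBS volume per slot,
# tagged slot=<n>, so the new node finds the old node's state.
ec2 = boto3.client("ec2")
vols = ec2.describe_volumes(
    Filters=[{"Name": "tag:slot", "Values": [str(index)]}])
volume_id = vols["Volumes"][0]["VolumeId"]

ec2.attach_volume(VolumeId=volume_id,
                  InstanceId=instance_id,
                  Device="/dev/xvdf")
```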

Does this properly capture your use case so we can work on it, or did I lose details/context in the translation?

Apologies for the delayed response …

Let’s sketch out a use case that I think should address the simpler scenario where:

  1. You have a Spot Fleet on AWS

Yes, spot or reserved (either way the termination hook is caught and the drain etc. is done).

  2. You maintain external EBS volume(s) per instance in the Spot Fleet that hold the state

This is not needed yet; I don’t yet need to mimic a full stateful set. The current need is just that all the “EC2 index numbers” stay defragmented, i.e. any new machine should take the “first unused” slot in the list.

  3. When the Spot Fleet gets a termination recommendation or notice, you immediately start to drain the old node and spin up a new one
  4. When that new node comes up, you set it up (using userdata) with the same “node specific” metadata and attach the relevant external EBS volume(s)

Yes, this is essentially spot on. Sometimes the node is simply being terminated due to EC2 health check failures etc., but yes, this covers the specific point.

  5. You have essentially migrated the old node to the new node

This would be needed for a full stateful set; it is not an immediate need, but it satisfies the use case.

Ok, appreciate the feedback. I will be working on this starting in August. If you want to sketch out the use case a bit more in the meantime, or commit some code, let me know.

I will message you on the Gitter/TF channel for quick chats and use the code repo’s issues section to ask longer/detailed questions that are not a good fit for chat.

Would you be open to occasional Zoom calls? I’m on PST.