Zero downtime deployment

balancerofthings · February 23, 2020, 5:07pm

Given an existing Consul server, I deploy my application service onto an Amazon EC2 instance along with a Consul client. The client finds the server by its tags using the AWS API (with an IAM role on the instance) and registers with it. Consul then configures the backend pool of a supported proxy via its data plane API.

When I want to deploy a new version of my application service, I provision new EC2 instances in a new autoscaling group (immutable infrastructure). Before destroying the previous autoscaling group, I would typically test the new target group before adding it to the load balancer. I am aware that Consul has health checks, and so do the backend pools of the configured proxy. There are two scenarios I'm trying to understand:

A bad service application (fails the Consul health checks). I assume Consul will not add the instance to the backend pool of the proxy if the health check fails, and it is up to my pipeline to handle this. I assume I can use the Consul REST API to confirm passing health checks.
The new service application is healthy and I delete the old autoscaling group. Is there a race condition between the machine getting the shutdown signal from the autoscaling group terminating the instance and the Consul client reporting to the Consul server that the instance should be removed? I suspect so.
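
To make both scenarios concrete, here is a rough sketch of what I have in mind, not a tested implementation. It assumes a local Consul agent on the default HTTP port (8500) and that the new instances register with a version tag such as "v2"; the tag, the service name and the helper names are all made up for illustration. The only Consul endpoints used are GET /v1/health/service/<name> and PUT /v1/agent/service/deregister/<service_id>.

```python
import json
import urllib.request

# Assumption: a Consul agent is reachable locally on the default HTTP port.
CONSUL = "http://127.0.0.1:8500"


def all_passing(entries, tag, expected):
    """Return True when at least `expected` instances carrying `tag`
    have nothing but passing checks.

    `entries` is the decoded JSON list from GET /v1/health/service/<name>.
    """
    healthy = 0
    for entry in entries:
        tags = entry["Service"].get("Tags") or []  # Tags can be null
        if tag not in tags:
            continue
        if all(check["Status"] == "passing" for check in entry["Checks"]):
            healthy += 1
    return healthy >= expected


def fetch_health(service):
    """Scenario 1: fetch node, service and check data for every instance."""
    url = f"{CONSUL}/v1/health/service/{service}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def deregister_request(service_id):
    """Scenario 2: build the request that removes a service instance
    from its local agent, intended to be sent before the app stops."""
    url = f"{CONSUL}/v1/agent/service/deregister/{service_id}"
    return urllib.request.Request(url, method="PUT")
```

The pipeline would poll all_passing(fetch_health("web"), "v2", 3) until it returns True (or a timeout expires) before deleting the old autoscaling group. On the old instances, a shutdown hook could send urllib.request.urlopen(deregister_request("web-1")) before the application stops. Whether that fully closes the race presumably still depends on how quickly the proxy picks up the change.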

Thanks

Wolfsrudel · February 23, 2020, 7:01pm

Sounds like a job for a service mesh, especially L7 traffic management and canary deployments.
