ECS service destroy stuck issue

sanjay.verma · September 19, 2023, 9:04am

Hi All,

I am creating an ECS cluster of the EC2 launch type and deploying my container images to it using an ECS service. The deployment is successful: all the required resources are created and the container is deployed onto the ECS cluster.

But when I try to destroy the resources, the operation gets stuck at the ECS service destroy step and fails after the 20-minute timeout (although the ECS service is deleted in the backend, it remains in the destroying state in Terraform). When I run terraform destroy again, it deletes the remaining resources.

Please share your suggestions if anyone has faced this kind of issue.

I have also attached a screenshot of the timeout error for reference.

PeterBocan · September 19, 2023, 12:34pm

The service can only be deleted if you scale it down to 0 replicas. It’s not possible to remove a service that still has one or more replicas (desired count). It’s the stupid ECS thing.
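As a rough sketch of that manual sequence with the AWS CLI: scale the service to zero, wait for it to settle, then delete it. The function name and the cluster and service names in the usage comment are hypothetical placeholders, not values from this thread.

```shell
# Sketch only: scale an ECS service to zero tasks, then delete it.
# The function name and its arguments are illustrative placeholders.
scale_down_and_delete() {
  local cluster="$1" service="$2"

  # Ask ECS to stop all tasks that belong to this service.
  aws ecs update-service --cluster "$cluster" --service "$service" --desired-count 0 || return 1

  # Block until the service reaches a steady state with the new desired count.
  aws ecs wait services-stable --cluster "$cluster" --services "$service" || return 1

  # With zero desired tasks, the service can be deleted.
  aws ecs delete-service --cluster "$cluster" --service "$service"
}

# Usage (hypothetical names):
# scale_down_and_delete my-cluster my-service
```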

sanjay.verma · September 19, 2023, 2:11pm

During the stuck phase I can see in the AWS console that the service has been deleted from the ECS cluster and shows the status ‘Draining’. So it’s ECS that is not updating the actual status.
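One way to watch this from the command line instead of the console is to poll the service status, which ECS reports as ACTIVE, DRAINING, or INACTIVE. This is a minimal sketch: the function names, the polling interval, and the attempt limit are all assumptions chosen for illustration.

```shell
# Sketch only: print the status ECS reports for a service.
service_status() {
  local cluster="$1" service="$2"
  aws ecs describe-services \
    --cluster "$cluster" \
    --services "$service" \
    --query 'services[0].status' \
    --output text
}

# Poll until the service is no longer DRAINING.
# interval (seconds) and attempts are hypothetical defaults.
wait_until_drained() {
  local cluster="$1" service="$2" interval="${3:-10}" attempts="${4:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(service_status "$cluster" "$service")" != "DRAINING" ]; then
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  return 1  # still draining after all attempts
}

# Usage (hypothetical names):
# wait_until_drained my-cluster my-service 10 30
```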

intoxicated · December 1, 2023, 4:25am

I’m experiencing the same issue. I ended up deleting terraform.tf and removing all the deployed resources manually. Are there better ways to handle this kind of issue?

sanjay.verma · December 1, 2023, 4:41am

Yes, I came up with a solution. We need to scale the ECS service down to zero once the destroy operation is sent, using a local-exec provisioner.

# ECS service for gitlabrunner

resource "aws_ecs_service" "gitlabrunner_service" {
  name            = var.service_name
  cluster         = aws_ecs_cluster.gitlabrunner_cluster.id
  task_definition = aws_ecs_task_definition.gitlabrunner_td.arn
  desired_count   = var.desired_tasks
  #launch_type = "EC2"

  capacity_provider_strategy {
    capacity_provider = aws_ecs_capacity_provider.gitlabrunner_cp.name
    weight            = 100
  }

  #force_new_deployment = true

  deployment_circuit_breaker {
    enable   = true
    rollback = true
  }

  # Runs only on destroy: scale the service to zero before Terraform deletes it
  provisioner "local-exec" {
    when    = destroy
    command = <<EOF
echo "Update service desired count to 0 before destroy."
# Get the region out of the cluster ARN
REGION=$(echo ${self.cluster} | cut -d':' -f4)
echo "Region: $REGION"
# Set the service desired count to 0
aws ecs update-service --region $REGION --cluster ${self.cluster} --service ${self.name} --desired-count 0 --force-new-deployment
echo "Update service command executed successfully."
EOF
  }

  timeouts {
    #create = "10m" # Timeout for resource creation
    delete = "5m" # Timeout for resource deletion
  }
}

lashshop160 · January 26, 2024, 10:32am

The ECS service destruction process gets stuck and eventually fails after a 20-minute timeout, even though the ECS service is deleted in the backend. Strangely, upon running terraform destroy again, the remaining resources are successfully removed. I have attached a screenshot of the timeout error for reference.
