Need Support with Terraform AWS EKS

Hi Team,

We have set up our AWS infrastructure with Terraform. Please find the Terraform infrastructure modules below for your reference.

module "vpc" {
  source           = "./modules/vpc"
  env              = var.env
  cidr_block       = var.cidr_block
  public_subnets   = var.public_subnets
  private_subnets  = var.private_subnets
  database_subnets = var.database_subnets
  vpc_log_bucket   = var.central_vpc_log_s3_bucket_arn
}

module "launch-template" {
  source                      = "./modules/launch-template"
  env                         = var.env
  eks_cluster_id              = module.eks.eks_cluster_id
  eks_ami_id                  = var.eks_ami_id
  eks_nodegroup_instance_type = var.eks_nodegroup_instance_type
  eks_nodegroup_volume_size   = var.eks_nodegroup_volume_size
  eks_nodegroup               = var.eks_nodegroup
  enable_keypair              = true
}

EKS

module "eks" {
  source                      = "./modules/eks"
  env                         = var.env
  region                      = var.region
  eks_cluster_version         = var.eks_cluster_version
  eks_nodegroup_instance_type = var.eks_nodegroup_instance_type
  eks_nodegroup               = var.eks_nodegroup
  public_subnets              = module.vpc.public_subnets
  private_subnets             = module.vpc.private_subnets
  eks_desired_node_size       = var.eks_desired_node_size
  eks_min_node_size           = var.eks_min_node_size
  eks_max_node_size           = var.eks_max_node_size
  launch_template_ids         = module.launch-template.launch_template_ids
  launch_template_versions    = module.launch-template.launch_template_versions
}

First, we deployed the infrastructure with the parameters below:

eks_nodegroup_instance_type = ["r5d.xlarge", "r5d.8xlarge", "r5d.4xlarge"]

#EKS nodegroup external volume size
eks_nodegroup_volume_size = [100, 250, 100]

#EKS node group names
eks_nodegroup = ["kafka", "neo4j", "starburst"]

#EKS Node group min size
eks_min_node_size = [3, 1, 1]

#EKS Node group desired size
eks_desired_node_size = [23, 3, 2]

#EKS Node group max size
eks_max_node_size = [30, 6, 6]

In the meantime, we manually added one node group (e.g. demo-system-ng).

After the requirements changed, we updated the parameters as below.

Also, incidentally, the new node group we were trying to add (demo-system-ng) has the same name as the manually created node group.

eks_nodegroup_instance_type = ["r5d.xlarge", "r5d.8xlarge", "r5d.4xlarge", "r5d.4xlarge"]

#EKS nodegroup external volume size
eks_nodegroup_volume_size = [150, 300, 100, 100]

#EKS node group names
eks_nodegroup = ["kafka", "neo4j", "starburst", "system"]

#EKS Node group min size
eks_min_node_size = [3, 1, 1, 1]

#EKS Node group desired size
eks_desired_node_size = [30, 23, 2, 2]

#EKS Node group max size
eks_max_node_size = [30, 6, 6, 5]

Below are the issues we faced and the questions we have:

After re-deploying, the apply took around 90 minutes to complete, even though, as you can see, we were only updating the node group configuration.
1: Can you please help us understand why it takes so much time to apply this change through Terraform, when the same change through the AWS Console hardly takes 20-30 minutes?
Also, is there any workaround to reduce this time, since our production environment may have clusters with a node count close to 100?
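For context, our understanding is that bumping the launch template version or the volume size triggers a rolling replacement of every node in the managed node group, and Terraform waits for that rollout to finish before reporting success. Below is a minimal sketch of the change we are considering inside ./modules/eks (the resource and variable names here are our assumptions, not our actual module code): setting update_config so EKS replaces several nodes in parallel instead of one at a time.

# Sketch only - assumes the node group resource in ./modules/eks looks
# roughly like this and is created per node group via count.
resource "aws_eks_node_group" "this" {
  count           = length(var.eks_nodegroup)
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "${var.env}-${var.eks_nodegroup[count.index]}"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.private_subnets

  scaling_config {
    desired_size = var.eks_desired_node_size[count.index]
    min_size     = var.eks_min_node_size[count.index]
    max_size     = var.eks_max_node_size[count.index]
  }

  launch_template {
    id      = var.launch_template_ids[count.index]
    version = var.launch_template_versions[count.index]
  }

  # EKS defaults to replacing 1 node at a time during a rolling update;
  # allowing a percentage of the group to be unavailable should shorten
  # updates on large node groups (25% here is just an example value).
  update_config {
    max_unavailable_percentage = 25
  }
}

Does this look like the right lever to pull, or is there a better way to speed up node group updates?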

Further, since we were trying to create a node group with the same name as an existing node group (created manually), the terraform apply failed after 90 minutes with the error "resource already exists".
2: Here we need to understand why terraform plan did not warn about this issue, and why this error did not come up at the start of the deployment.
Is there any recommended tool through which we can get such details before terraform apply?
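For reference, our current understanding is that terraform plan only compares the configuration against the state file and does not query AWS for resources it is not already tracking, which would explain why the name clash only surfaced when the create call actually ran during apply. Would importing the manually created node group into state be the recommended fix? A minimal sketch of what we have in mind (Terraform >= 1.5 import block; the resource address inside ./modules/eks and the cluster name are our assumptions):

# Assumption: the node group resource in ./modules/eks is
# aws_eks_node_group.this with count, and index 3 is the new "system" entry.
# The import ID format for aws_eks_node_group is "<cluster_name>:<node_group_name>".
import {
  to = module.eks.aws_eks_node_group.this[3]
  id = "demo-eks-cluster:demo-system-ng"
}

On older Terraform versions we assume the equivalent would be the CLI command terraform import 'module.eks.aws_eks_node_group.this[3]' 'demo-eks-cluster:demo-system-ng'.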

Also, we observed that if we try to update the node group role (add/remove permission policies), Terraform tries to re-deploy the entire node group instead of just updating the IAM role.
3: Is this the expected behaviour from Terraform? We have this concern because we can update the IAM role policies directly from the AWS Console without any changes being made to the node group.
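For reference, a minimal sketch of how we assumed the extra permissions could be attached without touching the node group (the role name aws_iam_role.node and the policy ARN below are examples, not our actual code):

# Assumption: the node group IAM role is defined as aws_iam_role.node inside
# ./modules/eks. Attaching or detaching a managed policy through a separate
# aws_iam_role_policy_attachment resource only modifies the role itself, so it
# should not force a node group replacement; as far as we know, only changing
# node_role_arn on the aws_eks_node_group resource forces one.
resource "aws_iam_role_policy_attachment" "node_extra" {
  role       = aws_iam_role.node.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}

Please let us know whether this is the expected pattern, or whether something in our module is changing the role ARN and thereby forcing the replacement.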