Advice on bootstrapping complex Kubernetes infrastructure

I’d like to get some perspective from the community. I work mostly with Kubernetes, and I keep running into the same recurring problems, so I wonder if others are struggling with them too. I haven’t seen these specific issues addressed elsewhere, but they seem common enough that someone may have found a better solution than the one I describe below.

A common use case I’ve seen for Terraform is deploying a managed Kubernetes cluster (ex. AKS, EKS, GKE) together with managed nodes (ex. a managed node group, autoscaling group, etc). In this process, there may be some pre-node configuration to be done in Kubernetes (ex. for CNI/CSI plugins), and some post-node configuration to complete before the cluster is made available to clients (ex. installing controllers like cert-manager or an ingress controller).
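
For concreteness, a minimal sketch of the kind of baseline I mean, assuming EKS - the role and subnet variables are placeholders for values defined elsewhere:

```hcl
# Managed control plane
resource "aws_eks_cluster" "this" {
  name     = "example"
  role_arn = var.cluster_role_arn # placeholder

  vpc_config {
    subnet_ids = var.subnet_ids # placeholder
  }
}

# Managed nodes
resource "aws_eks_node_group" "default" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "default"
  node_role_arn   = var.node_role_arn # placeholder
  subnet_ids      = var.subnet_ids

  scaling_config {
    desired_size = 2
    max_size     = 3
    min_size     = 1
  }
}
```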

Possible issues I see during deployment:

  • Resources are deployed successfully, but are not necessarily available right away - ex. EKS reports ready, but network/DNS delays prevent access to the API
  • Some resources (ex. controllers, nodes, etc) indirectly depend on other resources (ex. IAM policy attachments) in ways that implicit dependencies (references) do not capture and that are easy to miss as explicit ones (ex. with depends_on) - see the sketch after this list
  • Testing for these issues is generally hard for a number of reasons, like random timeouts or varying orders of execution
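
The dependency bullet is easy to hit with EKS managed node groups, for example: the node group references the node IAM role, but nothing references the policy attachments on that role, so Terraform may create or destroy them at the wrong time. A sketch of the fix, reusing names from the sketch above (the role and its assume-role policy are assumed to exist elsewhere):

```hcl
# The attachment the nodes actually need to join the cluster;
# the role itself is assumed to be defined elsewhere
resource "aws_iam_role_policy_attachment" "node_worker" {
  role       = aws_iam_role.node.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
}

resource "aws_eks_node_group" "default" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "default"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.subnet_ids

  scaling_config {
    desired_size = 2
    max_size     = 3
    min_size     = 1
  }

  # node_role_arn only references the role, not its attachments, so
  # without this edge nodes can register before the policy is attached
  # (or lose it mid-destroy)
  depends_on = [aws_iam_role_policy_attachment.node_worker]
}
```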

Possible issues I see during destruction:

  • Kubernetes controllers removed before the resources that depend on them - resources Terraform is usually completely unaware of - causing sticky situations where things cannot easily be deleted (see the sketch after this list)
  • Controllers removed before the resources they manage are finalized, leaving orphaned cloud resources that continue to cost money (esp. LoadBalancers, volumes, nodes, etc)
  • Helm delete hooks that cannot run because the nodes are tainted or destroyed
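
To make the first two bullets concrete, here is a sketch assuming the AWS Load Balancer Controller (the chart values and the Service are illustrative). Nothing ties the Service to the controller release, so terraform destroy is free to remove the controller first; the Service then hangs on its finalizer, or its load balancer is orphaned:

```hcl
# Controller that reconciles Services of type LoadBalancer
resource "helm_release" "aws_lb_controller" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = "kube-system"
  # chart values (clusterName, etc.) omitted for brevity
}

# The cloud load balancer behind this Service is created and cleaned
# up by the controller above, but Terraform sees no edge between the
# two resources, so on destroy the controller may be removed first
resource "kubernetes_service" "app" {
  metadata {
    name = "app"
  }

  spec {
    type = "LoadBalancer"

    selector = {
      app = "app"
    }

    port {
      port        = 80
      target_port = 8080
    }
  }
}
```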

Proposed solution:

I am considering a chain of null_resource “checkpoints”, with dependencies on each other representing the bootstrapping process - cluster_ready → plugins_ready → nodes_ready → controllers_ready → kubernetes_ready. Each checkpoint depends on whatever additional resources it needs to be considered “ready” - ex. the EKS cluster for cluster_ready, or the node instances for nodes_ready, etc. Precondition checks can also make sure the resources are actually available.
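
A minimal sketch of the chain, reusing the EKS resources from earlier (the add-on and Helm release names are illustrative):

```hcl
# cluster_ready - gate on the control plane, and verify it is actually
# usable rather than merely deployed
resource "null_resource" "cluster_ready" {
  triggers = {
    endpoint = aws_eks_cluster.this.endpoint
  }

  lifecycle {
    precondition {
      condition     = aws_eks_cluster.this.status == "ACTIVE"
      error_message = "EKS control plane is not ACTIVE."
    }
  }
}

# plugins_ready - pre-node configuration (CNI/CSI)
resource "null_resource" "plugins_ready" {
  depends_on = [
    null_resource.cluster_ready,
    aws_eks_addon.vpc_cni, # illustrative
  ]
}

# nodes_ready - nodes join only after the plugins are in place
resource "null_resource" "nodes_ready" {
  depends_on = [
    null_resource.plugins_ready,
    aws_eks_node_group.default,
  ]
}

# controllers_ready - controllers need nodes to run on
resource "null_resource" "controllers_ready" {
  depends_on = [
    null_resource.nodes_ready,
    helm_release.cert_manager, # illustrative
  ]
}

# kubernetes_ready - the single handle clients depend on
resource "null_resource" "kubernetes_ready" {
  depends_on = [null_resource.controllers_ready]
}
```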

During create, this seems to help manage complex dependencies through a single reference (“nodes_ready”). During destroy, it seems to help ensure dependent resources (ex. Kubernetes workloads) are cleaned up before the “checkpoint” and its own dependencies (ex. controllers) are destroyed. It can also be managed separately in the module and does not interfere with Terraform’s normal dependency mechanisms - resources can still reference each other and add their own explicit dependencies, as long as they don’t cause a cycle with the checkpoints.
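
Consumers would then need only the one edge, something like:

```hcl
# One edge pulls in the whole bootstrap chain on create; on destroy,
# Terraform removes this release before the checkpoint and everything
# behind it
resource "helm_release" "app" {
  name       = "my-app"
  repository = "https://example.com/charts" # illustrative
  chart      = "my-app"

  depends_on = [null_resource.kubernetes_ready]
}
```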

The most obvious alternative I considered is breaking the stages out into separate modules, but that just seems to move the complexity outside Terraform, and it does not by itself ensure the stages run in the correct order, either.

Before proceeding further, I want to check with the broader community. I am concerned that this approach is unnecessarily complex, or maybe the problem is just complex by nature. Does the approach make sense to anyone else? Is there a better way to handle this?