EKS/GKE/AKS -> Kubernetes Resources - Provider Dependency


Over the past few months, I have been working on an enterprise project to migrate an interactive node app into AWS. The specifics of the app aren’t important; however, I built the infrastructure for the whole stack through Terraform (0.12). The backend architecture is essentially an EKS cluster running Traefik 2.3 as the ingress controller, fronted by an NLB.

During development of the project, I came across some interesting behaviour with Terraform that I had to work around by structuring the project differently, into “staged” Terraform runs. Here is a summary of what I noticed:

If you have a TF project that creates a new EKS cluster using the AWS provider and then also deploys k8s resources onto the newly created cluster, you cannot achieve this in a single “terraform apply”. The reason is that the Kubernetes TF provider is initialised with its auth data during the plan phase, before the cluster exists. So if all of your TF resources are defined in a single .tf configuration, the EKS cluster is created with no issues, but creation of the kubernetes resources fails with auth errors. If you then re-run “terraform apply”, the kubernetes provider picks up the now-existing EKS auth data and the creation of the k8s resources succeeds.

To get around this issue, I had to break up the project into “layers” or “phases”, if you will. I have a directory called “layer-1” which creates the resources provisioned by the AWS provider, like the EKS cluster, auto-scaling template and ASG. I have configured this layer to output a local file containing the kubeconfig for the EKS cluster, which is then consumed by “layer-2”. “layer-2” contains all of the configuration for the resources provisioned by the kubernetes provider (the Flux GitOps controller in this instance). This structure results in isolated state files for each “layer” (which has actually proved to be helpful in some cases, although I am pretty sure this is not the way TF was intended to be used!).
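A minimal sketch of how the two layers could be wired together is below. The file names, the `local.kubeconfig_yaml` value and the `flux` namespace are illustrative assumptions, not the actual project code; it also assumes each directory gets its own “terraform apply”, layer-1 first.

```hcl
# layer-1/kubeconfig.tf (hypothetical file name)
# Writes the cluster's kubeconfig to disk so layer-2 can read it.
# local.kubeconfig_yaml is a placeholder for however the kubeconfig
# is rendered in layer-1 (for example from a template); it is not
# defined in this fragment.
resource "local_file" "kubeconfig" {
  content  = local.kubeconfig_yaml
  filename = "${path.module}/kubeconfig"
}

# layer-2/main.tf (hypothetical file name)
# The cluster already exists by the time this layer is planned, so the
# provider can authenticate using the file written by layer-1.
provider "kubernetes" {
  config_path = "${path.module}/../layer-1/kubeconfig"
}

# Example resource only; the real layer-2 deploys the Flux controller.
resource "kubernetes_namespace" "flux" {
  metadata {
    name = "flux"
  }
}
```

Because layer-2 only reads a file, it has no Terraform-level dependency on layer-1, which is why the run order has to be enforced outside Terraform (a wrapper script or pipeline stages).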

This has been noted elsewhere (https://blog.logrocket.com/dirty-terraform-hacks/ - Break up dependent providers into staged Terraform runs).

I have been trying to read up on whether this scenario has been reported elsewhere, but I haven’t had much luck. I was also wondering whether this may have been resolved in TF 0.13, but the changelogs don’t appear to reference this use case.

Just wondering if anyone else has encountered this?

Edit: Commented on https://github.com/hashicorp/terraform/issues/2430 which appears to detail the same topic

Yes, this is perfectly normal.

For infrastructure of any complexity (which yours qualifies as) I would expect to have a number of different state files in use. This can be due to the difficulties you mention (you can’t log in to a Kubernetes cluster that doesn’t yet exist), but mainly it is to split things into logical sections.

Trying to put too much in a single state file causes a number of issues. It can be harder to organise very different resources in a single code directory (yes, you can use modules, but that doesn’t always make it clearer), it is slower to process (the more resources, the longer it takes to refresh state, calculate changes and apply them) and it prevents parallelisation (you can only run a single plan or apply against a state file at a time).

For a similar architecture we split things into several code repos/state files/build pipelines. This allows each piece to be deployed separately, keeping things quick and helping people understand what is happening a bit more easily. We then use remote state to pull necessary values from one piece of code to another.
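As a rough sketch of the remote state approach, a per-application configuration could read outputs published by the base layer like this. The S3 backend, bucket, key, region and the `cluster_name` output are all assumed for illustration; the actual backend and output names are not given here.

```hcl
# Read the outputs of the base (VPC/EKS) state from a per-app configuration.
# Backend type and all values below are hypothetical.
data "terraform_remote_state" "base" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "base/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Use a value exported by the base layer; this assumes the base
# configuration declares an output named "cluster_name".
data "aws_eks_cluster" "this" {
  name = data.terraform_remote_state.base.outputs.cluster_name
}
```

Only values declared as outputs in the base configuration are visible this way, which keeps the interface between the pieces explicit.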

We split as follows:

  • Account level resources (things which apply to the whole AWS account and aren’t environment specific)
  • Base resources (per environment) - VPC, EKS cluster, common SGs
  • Repo per namespace/application - both Kubernetes resources (usually using Helm) and associated AWS dependencies (SGs, S3 buckets, RDS, queues, etc.)

Once things have passed the initial setup stage we find that 90% of changes are being made to the per namespace/app code, with the majority for the functionality this system is providing (as opposed to the supporting namespaces such as metrics, system or logging applications). By having that code separate from everything else, things are both safer (because you can’t accidentally make a code change that breaks the whole cluster) and quicker (as Terraform only has to refresh/calculate/apply changes for a small set of resources).

If we tried to have everything in a single state file, not only would we have the issues you mentioned, but more importantly every build (every time a branch is changed, including work in progress) could easily take 10 minutes longer (just to refresh all the VPC, EKS and other Kubernetes resources) and would cause lots of state lock contention, again slowing things down.

I have a single TF state file that:
a) creates a VPC
b) creates an EKS cluster
c) deploys a bunch of stuff into the EKS cluster

I don’t have problems with dependency ordering. In one ‘terraform apply’, it can create the vpc, create the eks cluster in the newly created vpc subnets, configure credential access to the k8s cluster, and deploy a bunch of kubernetes resources. It appears to do something intelligent to delay configuration of providers and resources that rely on the output of other resources. Or am I just getting lucky?
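The configuration behind this is not shown, so the following is only a sketch of one commonly used way to wire a single-state setup, not necessarily what is used here. It assumes a cluster resource named `aws_eks_cluster.this` defined elsewhere in the same configuration, and a 1.x kubernetes provider (the `load_config_file` argument was removed in 2.0).

```hcl
# Data sources that depend on the cluster resource, so they are only
# read once the cluster exists. aws_eks_cluster.this is assumed to be
# defined elsewhere in the same configuration.
data "aws_eks_cluster" "this" {
  name = aws_eks_cluster.this.name
}

data "aws_eks_cluster_auth" "this" {
  name = aws_eks_cluster.this.name
}

# Provider arguments are taken from the data sources above instead of
# from a kubeconfig file on disk.
provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
  load_config_file       = false # 1.x provider only
}
```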