Hello,
I am following the "Provision a GKE Cluster (Google Cloud)" tutorial on HashiCorp Learn and trying to build on it, but I am having trouble.
I am able to apply code based on this set of instructions and successfully get an empty Kubernetes cluster, as I expect. However, the instructions then switch to running a series of manual kubectl commands, which frankly defeats the purpose of configuration as code.
Once I attempt to add a Terraform block that does something with the cluster, such as installing Jenkins from Helm, I run into auth problems.
The tutorial code includes this commented-out provider block:
...
# provider "kubernetes" {
# load_config_file = "false"
# host = google_container_cluster.primary.endpoint
# username = var.gke_username
# password = var.gke_password
But that demonstrates a deprecated authentication approach and is a dead end. Instead, I came up with this:
data "google_client_config" "default" {}

provider "kubernetes" {
  host                   = google_container_cluster.primary.endpoint
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)
}
However, terraform apply results in this error:
Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
Then the next page of the Learn site, "Manage Kubernetes Resources via Terraform", says that a “cloud-specific auth plugin” is the top recommendation, and that “The cloud provider [instructions] will configure the Kubernetes provider using cloud-specific auth tokens”, which refers back to the page above that recommends deprecated basic auth.
How exactly should I declare the Kubernetes provider to use cloud-specific auth plugin?
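For anyone landing here with the same question, one shape the "auth plugin" approach can take is the exec block of the Kubernetes provider. This is only a sketch under assumptions: it assumes the gke-gcloud-auth-plugin binary is installed and on the PATH of the machine running Terraform, and it reuses the google_container_cluster.primary resource from the tutorial. I have not confirmed this is what the Learn page has in mind.

```hcl
# Sketch: let the provider call an external credential plugin for a token.
# Assumption: gke-gcloud-auth-plugin is installed and on PATH.
provider "kubernetes" {
  host                   = "https://${google_container_cluster.primary.endpoint}"
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "gke-gcloud-auth-plugin"
  }
}
```

With this form no token is stored in the configuration; the plugin is invoked whenever the provider needs credentials.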
Update: it works if I manually reconfigure kubectl after the error, pointing it at the new cluster (e.g., by running gcloud container clusters get-credentials <new cluster name> --region <region>). I also ran export KUBE_CONFIG_PATH=/home/me/.kube/config, though that might not have been needed.
My question has shifted slightly: how do I write a Terraform configuration that authenticates to the cluster it just created, so that it can start installing into the cluster?
Update: after much frustration I solved it.
For the possible benefit of others, here’s how I understand the way to auth without a manual step and without writing credentials to a file.
The sample makes use of Google’s google_container_cluster resource. This resource creates the new empty GKE cluster and exports values containing the certs and keys needed to install things into the cluster with something like Helm. Those values (the docs call them “attributes”; I think of them as outputs) are described here: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster#attributes-reference
(Side note. In the Hashi docs I found a confusing note which I understood to tell me to separate creation of the GKE cluster from the scripts that install into the cluster, because of timing concerns. I did this when things weren’t working and it made no difference; it might not be necessary but I left it this way.)
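In case the fully separated layout from that note is useful to someone, here is a rough sketch of what the 'install apps' side could look like as its own configuration, looking the cluster up with a data source instead of receiving it from a module. The variable names cluster_name and region are my own placeholders, and it assumes the google provider is already configured with the right project.

```hcl
# Sketch of a separate configuration applied after the cluster exists.
# Assumption: var.cluster_name and var.region are hypothetical inputs
# naming a cluster created by an earlier, independent terraform apply.
variable "cluster_name" {}
variable "region" {}

data "google_client_config" "default" {}

data "google_container_cluster" "my_cluster" {
  name     = var.cluster_name
  location = var.region
}

provider "kubernetes" {
  host                   = "https://${data.google_container_cluster.my_cluster.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(data.google_container_cluster.my_cluster.master_auth[0].cluster_ca_certificate)
}
```

The point of the split is that the provider is only configured once the cluster's endpoint and certificate are already known.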
My ‘create cluster’ module uses the basics of the Hashi sample. I renamed primary to my_cluster, which is less confusing to me, but most importantly I exposed the creds/keys as outputs:
output "config" {
  value = {
    name               = google_container_cluster.my_cluster.name
    region             = var.region
    kubernetes_cluster = google_container_cluster.my_cluster
    ca_certificate     = base64decode(google_container_cluster.my_cluster.master_auth[0].cluster_ca_certificate)
    client_certificate = base64decode(google_container_cluster.my_cluster.master_auth[0].client_certificate)
    client_key         = base64decode(google_container_cluster.my_cluster.master_auth[0].client_key)
    host               = google_container_cluster.my_cluster.endpoint
  }
}
And my ‘install apps’ module uses those outputs:
variable "cluster_config" {}

provider "kubernetes" {
  host                   = var.cluster_config.host
  cluster_ca_certificate = var.cluster_config.ca_certificate
  client_certificate     = var.cluster_config.client_certificate
  client_key             = var.cluster_config.client_key
}

provider "helm" {
  kubernetes {
    host                   = var.cluster_config.host
    cluster_ca_certificate = var.cluster_config.ca_certificate
    client_certificate     = var.cluster_config.client_certificate
    client_key             = var.cluster_config.client_key
  }
}
Then my parent module, which ties create cluster together with install apps, is:
module "my_cluster" {
  source     = "./create-cluster"
  project_id = var.project_id
  region     = var.region
}

module "my_cluster_apps" {
  source         = "./install-apps"
  cluster_config = module.my_cluster.config
}
Update: yesterday the above worked fine. Today I got several errors on terraform apply:
Error: <some operation> forbidden: User "system:anonymous" cannot get resource <blah>
Maybe someone could explain why this happens. My speculation is that when the cluster was created yesterday, a default token was somehow set, and that today the token is no longer valid, which results in these confusing errors.
The workaround was to add this:
data "google_client_config" "provider" {}

provider "kubernetes" {
  ...
  token = data.google_client_config.provider.access_token
}

provider "helm" {
  kubernetes {
    ...
    token = data.google_client_config.provider.access_token
  }
}
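A possible explanation, offered as a guess rather than a confirmed diagnosis: newer GKE clusters do not issue a client certificate unless asked to, so client_certificate and client_key may simply be empty, in which case requests without a token arrive as system:anonymous. If that is what is happening, the token and CA certificate alone should be enough. Below is a sketch of the 'install apps' providers written that way. It assumes var.cluster_config has the shape of the config output from my create-cluster module and that the Helm provider is a 2.x release, where kubernetes is a nested block.

```hcl
# Sketch: token-based auth only, with the client certificate/key omitted.
# Assumption: var.cluster_config matches the "config" output shown earlier.
variable "cluster_config" {}

# Supplies a short-lived OAuth2 access token for the credentials
# Terraform is currently running with.
data "google_client_config" "provider" {}

provider "kubernetes" {
  host                   = "https://${var.cluster_config.host}"
  cluster_ca_certificate = var.cluster_config.ca_certificate
  token                  = data.google_client_config.provider.access_token
}

provider "helm" {
  kubernetes {
    host                   = "https://${var.cluster_config.host}"
    cluster_ca_certificate = var.cluster_config.ca_certificate
    token                  = data.google_client_config.provider.access_token
  }
}
```

The access token is short-lived and is fetched again on each Terraform run, so nothing has to be written to a kubeconfig file.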