I am generating my Kubernetes certificates with Vault through Terraform, but I’ve come across an issue: certain client certificates, such as those of the controller-manager or the scheduler, use a system: prefix in the CN, for example:
Subject: CN = system:kube-scheduler
These certificates are issued from a certificate role that I manage through Terraform.
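The role looks roughly like the sketch below; the allowed domains, flags, and key settings are placeholders here, only the overall shape matters:

```hcl
# Sketch only: allowed domains, flags, and key settings are placeholders,
# not my exact configuration.
resource "vault_pki_secret_backend_role" "kubernetes" {
  backend = "kube1-ca"
  name    = "kubernetes"

  allowed_domains    = ["company.internal"]
  allow_subdomains   = true
  allow_bare_domains = true
  server_flag        = true
  client_flag        = true

  key_type = "rsa"
  key_bits = 2048
}
```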
Is there a role option that allows : in the CN, or is it simply impossible to do this in Vault?
This is the error that I’m getting:
╷
│ Error: error creating certificate system:kube-scheduler by kubernetes for PKI secret backend "kube1-ca": Error making API request.
│
│ URL: PUT https://vault-0.company.internal:8200/v1/kube1-ca/issue/kubernetes
│ Code: 400. Errors:
│
│ * common name system:kube-scheduler not allowed by this role
│
│ with vault_pki_secret_backend_cert.kube1_scheduler["kube-controlplane-2"],
│ on certificates.tf line 407, in resource "vault_pki_secret_backend_cert" "kube1_scheduler":
│ 407: resource "vault_pki_secret_backend_cert" "kube1_scheduler" {
│
╵
I’ve opened a feature request for this, but it all seems hopeless. It’s quite clear that HashiCorp is not particularly friendly towards self-hosted setups; the obstacles are unending. Maybe I’m wrong, I don’t know, I might be missing something, but I’ve been using all these tools, Terraform, Packer, Vault, Consul, for quite some time now.
So if I understand this correctly, nobody has successfully created Kubernetes certificates using Vault through Terraform until now. You would expect complete compatibility, given that we’re not talking about some obscure piece of software; it’s really the thing people talk about most.
It’s just been a terrible experience, to put it bluntly.
I do agree that there are parts of the products that are not as polished as I would hope, but I don’t think a lack of support for self-hosted setups is the problem.
In my opinion, the problem you are experiencing here is that Vault’s PKI support is narrowly focused on creating TLS certificates for basic HTTPS use cases, and the support for more flexible use cases is immature, missing, or incomplete.
I agree with you that the Vault PKI secrets engine is not well suited for acting as a Kubernetes cluster CA.
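As far as I know, the closest you can get is to loosen the role so it no longer validates common names as hostnames, roughly along these lines (untested against your exact setup):

```hcl
resource "vault_pki_secret_backend_role" "kubernetes" {
  backend = "kube1-ca"
  name    = "kubernetes"

  # Allow arbitrary common names and stop enforcing hostname syntax,
  # so a CN like "system:kube-scheduler" should pass validation.
  allow_any_name    = true
  enforce_hostnames = false

  client_flag = true
  key_type    = "rsa"
  key_bits    = 2048
}
```

That effectively turns it into a “sign anything” role, though, which rather defeats the point of role-based constraints and is part of why I say it isn’t a great fit as a cluster CA.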
I have similar issues with Terraform when trying to provision clusters (Consul, Vault, Kubernetes, whatever) where one node needs information (tokens, data) from another. The code becomes terribly redundant because of this.
I don’t think it’s an over-generalisation. People (and HashiCorp) rely on whatever the cloud has to offer, which automatically solves these issues and keeps your code at least sane. There are so many hacks you have to repeat over and over just to set up the basics in a self-hosted environment. I’m not saying I’m doing everything right; I’m sure there are many things I could do more elegantly even within these limits, but it’s still terribly easy to lose perspective. Public clouds abstract away a huge amount of logic, and that’s why it all seems (and, from the user’s perspective, is) reasonable. It stops being reasonable when you have to build the tooling yourself.
Just as a side note: I also use Proxmox. A bug was introduced in Packer where qcow2 as a storage type is simply no longer accepted (“not supported”), so you can no longer create virtual machines with that type of storage at all. Under normal circumstances you’d call that a huge bug, but because it’s Proxmox, it’s low to no priority. It’s been months and nobody is taking care of it, because Proxmox, and to a lesser extent self-hosting in general, just doesn’t really matter.
Running Vault and Consul on plain old VMs, built and installed manually, is a valid way of self-hosting. Back when Vault and Consul were being conceived, in the 0.x days, it may even have been the only expected target environment.
I’m not saying that that’s a good thing, I’m just saying that “hard to self-host” and “hard to completely ephemerally provision brand new clusters in a fully automated manner” aren’t quite the same thing, and precision helps guide conversations in useful directions.
That’s indeed a fair point; those are two different things. Having said that, Terraform is all about automation, and I’d have expected more elegant means of provisioning virtual machines (less redundant code, also when it comes to modules and so on), and better compatibility, at least between their own products.
Regarding Proxmox and Packer: there’s also another error where it complains if I unmount the CD when creating the image. The way I handle it now is to pin the Packer version so I don’t hit the qcow2 bug again, and to force it to ignore the errors at the end so it doesn’t delete the virtual machine. Or I simply tell it not to unmount the CD-ROM at all, but that’s a problem because the clones will inherit it. It’s embarrassing. But OK, I’m done with my diatribe.
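Concretely, the pinning is just something like this in the template (the version constraints here are only an example, not a recommendation):

```hcl
packer {
  # Pin the Packer core and the Proxmox plugin so a newer release
  # doesn't reintroduce the qcow2 regression. Versions shown are
  # placeholders, not the exact ones I use.
  required_version = "~> 1.8.0"

  required_plugins {
    proxmox = {
      version = "= 1.1.0"
      source  = "github.com/hashicorp/proxmox"
    }
  }
}
```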
I guess what I’m trying to do is kind of overkill anyway, so I’ll probably just let kubeadm manage the certificates. Even if I solve this issue, there’s still a lot left to do, and I’m not sure it’s worth generating all the certificates myself anyway, given the complexity of Kubernetes itself.
Later edit: I’ll still have the certificates for the etcd cluster issued through Vault, since it’s also easy to add them to the initial control plane node.
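Issuing those etcd certificates through Terraform then looks roughly like this; the hostnames and SANs below are illustrative, not my exact values:

```hcl
resource "vault_pki_secret_backend_cert" "kube1_etcd" {
  # Node names are illustrative; one certificate per control plane node.
  for_each = toset(["kube-controlplane-1", "kube-controlplane-2"])

  backend     = "kube1-ca"
  name        = "kubernetes"                       # PKI role to issue against
  common_name = "${each.key}.company.internal"
  alt_names   = ["${each.key}.company.internal"]   # plus whatever SANs etcd needs
  ip_sans     = ["127.0.0.1"]
}
```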