Vault agent injector failed to fetch v1 mutating webhook config
I have an AKS cluster and deployed Vault via the Helm chart with TLS disabled.
The vault namespace is labeled for Istio, and PeerAuthentication is set to PERMISSIVE.
After deployment the agent injector pod can’t start and logs the following errors:
Listening on ":8080"...
2021-12-01T09:11:28.404Z [WARN] handler: failed to determine Admissionregistration API version, defaulting to v1: error="Get "https://10.254.0.1:443/apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations": dial tcp 10.254.0.1:443: connect: connection refused"
2021-12-01T09:11:28.405Z [INFO] handler: Starting handler..
2021-12-01T09:11:28.405Z [INFO] handler.auto-tls: Generated CA
2021-12-01T09:11:28.406Z [WARN] handler.auto-tls: failed to fetch v1 mutating webhook config: WebhookName=vault-agent-injector-cfg err="Get "https://10.254.0.1:443/apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations/vault-agent-injector-cfg": dial tcp 10.254.0.1:443: connect: connection refused"
2021-12-01T09:11:28.406Z [INFO] handler.certwatcher: Updated certificate bundle received. Updating certs...
Error updating MutatingWebhookConfiguration: Patch "https://10.254.0.1:443/apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations/vault-agent-injector-cfg": dial tcp 10.254.0.1:443: connect: connection refused
Error updating MutatingWebhookConfiguration: Patch "https://10.254.0.1:443/apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations/vault-agent-injector-cfg": dial tcp 10.254.0.1:443: connect: connection refused
It looks like the agent injector can’t reach the Kubernetes API server (10.254.0.1:443) to read or patch the MutatingWebhookConfiguration.
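To separate a plain TCP reachability problem from anything Vault-specific, a probe along these lines could be run from a pod in the same namespace. This is only a sketch: the address and port are copied from the log above, can_connect is a hypothetical helper name, and it assumes a Python interpreter is available in the container, which may not be true for the vault-k8s image.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address taken from the injector log; adjust for your cluster.
    print(can_connect("10.254.0.1", 443))
```

If this prints False only while the namespace has the mesh enabled, the refusal is happening at the network layer before the injector's own logic is involved.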
I tried creating ingress service entries, but that didn’t help.
When I remove the ingress from the namespace, everything works like a charm, including auto-unseal.
The values.yaml file I use for the deployment is below:
global:
  enabled: true
  tlsDisable: true
  openshift: false
injector:
  enabled: true
  image:
    repository: "hashicorp/vault-k8s"
    tag: "0.14.0"
  resources:
    requests:
      memory: 256Mi
      cpu: 250m
    limits:
      memory: 256Mi
      cpu: 250m
server:
  enabled: true
  image:
    repository: "hashicorp/vault"
    tag: "1.9.0"
  # These Resource Limits are in line with node requirements in the
  # Vault Reference Architecture for a Large Cluster
  resources:
    requests:
      memory: 256Mi
      cpu: 250m
    limits:
      memory: 256Mi
      cpu: 250m
  readinessProbe:
    enabled: true
  livenessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true"
    initialDelaySeconds: 60
  # extraEnvironmentVars is a list of extra environment variables to set with the stateful set.
  # These could be used to include variables required for auto-unseal.
  extraEnvironmentVars:
    VAULT_AZUREKEYVAULT_VAULT_NAME:
    VAULT_AZUREKEYVAULT_KEY_NAME:
    # VAULT_CACERT: /vault/userconfig/tls-ca/ca.crt
  # extraVolumes is a list of extra volumes to mount. These will be exposed
  # to Vault in the path `/vault/userconfig/<name>/`.
  #extraVolumes:
  #  - type: secret
  #    name: tls-server
  #  - type: secret
  #    name: tls-ca
  # This configures the Vault Statefulset to create a PVC for audit logs.
  # See https://www.vaultproject.io/docs/audit/index.html to know more
  dataStorage:
    enabled: true
    size: 1Gi
  auditStorage:
    enabled: true
    size: 1Gi
  standalone:
    enabled: false
  # Run Vault in "HA" mode.
  ha:
    enabled: true
    replicas: 2 # connected with number of retry_join clauses
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true
        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        seal "azurekeyvault" {
          tenant_id = "xxx"
        }
        storage "raft" {
          path = "/vault/data"
          retry_join {
            leader_api_addr = "http://vault-0.vault-internal:8200"
          }
          autopilot {
            cleanup_dead_servers = "true"
            last_contact_threshold = "200ms"
            last_contact_failure_threshold = "10m"
            max_trailing_logs = 250
            min_quorum = 5
            server_stabilization_time = "10s"
          }
        }
        service_registration "kubernetes" {}
I have tried various solutions, but nothing helped. Could someone point out what is wrong in the config?
Is it even possible to have ingress enabled on the vault’s namespace to handle TLS?