Hello,
I’m trying to deploy HashiCorp Vault in my Kubernetes cluster and am running into an issue with Vault initialization.
Here is an excerpt from the Helm values.yml:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hashicorp-vault
  namespace: hashicorp
  labels:
    app.kubernetes.io/name: vault
    app.kubernetes.io/instance: hashicorp-vault
    app.kubernetes.io/managed-by: Helm
spec:
  serviceName: hashicorp-vault-internal
  podManagementPolicy: Parallel
  replicas: 1
  updateStrategy:
    type: OnDelete
  selector:
    matchLabels:
      app.kubernetes.io/name: vault
      app.kubernetes.io/instance: hashicorp-vault
      component: server
  template:
    metadata:
      labels:
        helm.sh/chart: vault-0.29.1
        app.kubernetes.io/name: vault
        app.kubernetes.io/instance: hashicorp-vault
        component: server
      annotations:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: vault
                  app.kubernetes.io/instance: "hashicorp-vault"
                  component: server
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      serviceAccountName: hashicorp-vault
      securityContext:
        fsGroup: 1000
        runAsGroup: 1000
        runAsNonRoot: true
        runAsUser: 100
      hostNetwork: false
      volumes:
        - name: config
          configMap:
            name: hashicorp-vault-config
        - name: home
          emptyDir: {}
      initContainers:
        - command:
            - sh
            - -c
            - |
              chmod +x /vault/data
              #chown -R vault:vault /vault/data
              #chmod -R 770 /vault/data
              ls -l /vault/data
          image: busybox
          imagePullPolicy: IfNotPresent
          name: set-permissions
          securityContext:
            privileged: true
            runAsGroup: 0
            runAsUser: 0
          volumeMounts:
            - mountPath: /vault/data
              name: data
      containers:
        - name: vault
          image: hashicorp/vault:1.18.1
          imagePullPolicy: IfNotPresent
          command:
            - "/bin/sh"
            - "-ec"
          args:
            - |
              cp /vault/config/extraconfig-from-values.hcl /tmp/storageconfig.hcl;
              [ -n "${HOST_IP}" ] && sed -Ei "s|HOST_IP|${HOST_IP?}|g" /tmp/storageconfig.hcl;
              [ -n "${POD_IP}" ] && sed -Ei "s|POD_IP|${POD_IP?}|g" /tmp/storageconfig.hcl;
              [ -n "${HOSTNAME}" ] && sed -Ei "s|HOSTNAME|${HOSTNAME?}|g" /tmp/storageconfig.hcl;
              [ -n "${API_ADDR}" ] && sed -Ei "s|API_ADDR|${API_ADDR?}|g" /tmp/storageconfig.hcl;
              [ -n "${TRANSIT_ADDR}" ] && sed -Ei "s|TRANSIT_ADDR|${TRANSIT_ADDR?}|g" /tmp/storageconfig.hcl;
              [ -n "${RAFT_ADDR}" ] && sed -Ei "s|RAFT_ADDR|${RAFT_ADDR?}|g" /tmp/storageconfig.hcl;
              /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl
          securityContext:
            allowPrivilegeEscalation: false
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: VAULT_K8S_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: VAULT_K8S_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: VAULT_ADDR
              value: "http://127.0.0.1:8200"
            - name: VAULT_API_ADDR
              value: "http://$(POD_IP):8200"
            - name: SKIP_CHOWN
              value: "true"
            - name: SKIP_SETCAP
              value: "true"
            - name: HOSTNAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: VAULT_CLUSTER_ADDR
              value: "https://$(HOSTNAME).hashicorp-vault-internal:8201"
            - name: HOME
              value: "/home/vault"
          volumeMounts:
            - name: data
              mountPath: /vault/data
            - name: config
              mountPath: /vault/config
            - name: home
              mountPath: /home/vault
          ports:
            - containerPort: 8200
              name: http
            - containerPort: 8201
              name: https-internal
            - containerPort: 8202
              name: http-rep
          readinessProbe:
            # Check status; unsealed vault servers return 0
            # The exit code reflects the seal status:
            #   0 - unsealed
            #   1 - error
            #   2 - sealed
            exec:
              command: ["/bin/sh", "-ec", "vault status -tls-skip-verify"]
            failureThreshold: 2
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 3
          lifecycle:
            # Vault container doesn't receive SIGTERM from Kubernetes
            # and after the grace period ends, Kube sends SIGKILL. This
            # causes issues with graceful shutdowns such as deregistering itself
            # from Consul (zombie services).
            preStop:
              exec:
                command: [
                  "/bin/sh", "-c",
                  # Adding a sleep here to give the pod eviction a
                  # chance to propagate, so requests will not be made
                  # to this pod while it's terminating
                  "sleep 5 && kill -SIGTERM $(pidof vault)",
                ]
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: smb-storage-class
Here is the PersistentVolume:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-hashicorp-data-share
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: "smb-storage-class"
  csi:
    driver: smb.csi.k8s.io
    volumeHandle: pv-hashicorp-data-share
    volumeAttributes:
      source: "//192.168.0.100/hashicorp-data-share"
  mountOptions:
    - guest
    # mfsymlinks option is important to enable sym links on smb share
    - mfsymlinks
    - dir_mode=0777
    - file_mode=0777
EOF
Storage Class:
# Storage Class
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: smb-storage-class
provisioner: smb.csi.k8s.io
parameters:
  # On Windows, "*.default.svc.cluster.local" could not be recognized by csi-proxy
  source: //smb-server.default.svc.cluster.local/share
  # if csi.storage.k8s.io/provisioner-secret is provided, will create a sub directory
  # with PV name under source
  csi.storage.k8s.io/provisioner-secret-name: smbcreds
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: smbcreds
  csi.storage.k8s.io/node-stage-secret-namespace: default
volumeBindingMode: Immediate
reclaimPolicy: Retain
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  #- uid=1001
  #- gid=1001
  - noperm
  - unix
  - mfsymlinks
  - cache=strict
  - noserverino # required to prevent data corruption
EOF
My SMB Share config:
[hashicorp-data-share]
path = /shared/hashicorp/data
browseable = yes
writable = yes
guest ok = yes
create mask = 0777
directory mask = 0777
force user = nobody
unix extensions = no
#force group = nogroup
#force user = 1000
#force group = 1000
It looks like the container does not have the rights to write to the shared path. When I open a shell in the Vault pod and run vault operator init, I get the following error:
/vault/data/core $ vault operator init
Error initializing: Error making API request.
URL: PUT http://127.0.0.1:8200/v1/sys/init
Code: 400. Errors:
* failed to initialize barrier: failed to persist keyring: chmod /vault/data/core/_keyring2677234820: operation not permitted
/vault/data/core $ ls -lrta
total 1
drwxrwxrwx 2 root vault 0 Dec 21 21:49 ..
-rwxrwxrwx 1 root vault 0 Dec 21 22:05 _keyring2677234820
drwxrwxrwx 2 root vault 0 Dec 21 22:05 .
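The listing above shows that the _keyring file was created, while the error message names chmod as the step that failed. A minimal sketch of a check that separates those two steps is below. The function name check_chmod and the probe file name are hypothetical and not part of Vault or the Helm chart; the sketch only assumes a POSIX sh with touch, chmod and rm available.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file can be created in a directory
# and whether its mode can then be changed with chmod.
# Usage: check_chmod DIR
# Return codes: 0 = both steps work, 1 = create failed, 2 = chmod failed.
check_chmod() {
  dir="$1"
  tmp="$dir/_chmod_probe_$$"   # probe file name is an arbitrary choice

  # Step 1: create an empty file in the target directory.
  if ! touch "$tmp" 2>/dev/null; then
    echo "create failed in $dir"
    return 1
  fi

  # Step 2: change the mode of the file that was just created.
  if ! chmod 0600 "$tmp" 2>/dev/null; then
    echo "chmod failed in $dir"
    rm -f "$tmp"
    return 2
  fi

  rm -f "$tmp"
  echo "create and chmod OK in $dir"
  return 0
}
```

Pasting the function into a shell inside the Vault container and running check_chmod /vault/data/core as the container user would show whether chmod is refused on the SMB mount independently of Vault. A "chmod failed" result would point at the mount options or the share configuration rather than at the directory permissions shown by ls.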
Output of init container:
root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
bin:x:2:2:bin:/bin:/bin/false
sys:x:3:3:sys:/dev:/bin/false
sync:x:4:100:sync:/bin:/bin/sync
mail:x:8:8:mail:/var/spool/mail:/bin/false
www-data:x:33:33:www-data:/var/www:/bin/false
operator:x:37:37:Operator:/var:/bin/false
nobody:x:65534:65534:nobody:/home:/bin/false
total 0
drwxrwxrwx 2 root 1000 0 Dec 21 22:05 core
My Id is -- uid=0(root) gid=0(root) groups=0(root),10(wheel),1000
I can’t figure out what the issue is here. It seems the vault user doesn’t have permission to write to /vault/data/core, but I cannot set the permissions in the init container because the vault user does not exist there.
I went through the GitHub issue "failed to initialize barrier: failed to persist keyring: mkdir /vault/data/core: permission denied" (Issue #20953, hashicorp/vault) and other recommended posts, but couldn’t resolve the issue.