Vault pods won't start on openshift 4.14

Using hashicorp-vault 1.15.2 on OpenShift 4.14.6, the vault-0 pod fails to start after a cluster reboot with this error:

18m Warning FailedCreate statefulset/vault create Pod vault-2 in StatefulSet vault failed error: Pod “vault-2” is invalid: [spec.containers[0].image: Required value, spec.containers[0].readinessProbe.httpGet.port: Invalid value: 0: must be between 1 and 65535, inclusive]
18m Warning RecreatingFailedPod statefulset/vault StatefulSet hashicorp-vault/vault is recreating failed Pod vault-0
18m Normal SuccessfulDelete statefulset/vault delete Pod vault-0 in StatefulSet vault successful
5m22s Warning FailedCreate statefulset/vault create Pod vault-0 in StatefulSet vault failed error: Pod “vault-0” is invalid: [spec.containers[0].image: Required value, spec.containers[0].readinessProbe.httpGet.port: Invalid value: 0: must be between 1 and 65535, inclusive]
18m Warning FailedDelete statefulset/vault delete Pod vault-0 in StatefulSet vault failed error: pods “vault-0” not found
3m26s Normal NoPods poddisruptionbudget/vault No matching pods found

If I patch the stateful set via the web interface and set the port to 8200, I then get hit with this:

3m26s Warning FailedCreate statefulset/vault create Pod vault-0 in StatefulSet vault failed error: Pod “vault-0” is invalid: spec.containers[0].image: Required value

And now I’m lost. How do I recover the vault?

How are you deploying? Helm? Kustomize? Yaml?

Feels like open shift is doing some patching of its own and breaking the config.

Deploying was done using helm as follows:

helm repo add openshift-helm-charts https://charts.openshift.io/

helm repo add hashicorp https://helm.releases.hashicorp.com

helm repo update

helm install vault openshift-helm-charts/hashicorp-vault -n hashicorp-vault --set=‘global.openshift=true’ --set=‘server.ha.enabled=true’ --set=‘server.ha.raft.enabled=true’ --set=‘server.dataStorage.storageClass=ocs-storagecluster-ceph-rbd’ --set=‘server.extraEnvironmentVars.VAULT_CLI_NO_COLOR=1’

Ok, this is a self inflicted wound from doing this:

oc patch statefulset vault --type=merge -p ‘{“spec”:{“template”:{“spec”:{“containers”:[{“name”: “vault”, “readinessProbe”:{“httpGet”:{“path”:“/v1/sys/health?uninitcode=204&standbyok=true”}}}]}}}}’

Sigh. Solution: don’t do that. But it somehow seems wrong if the rest of the parameters get nuked.

Recovery was merging the original statefulset that gets created when installing. Maybe my oc-foo isn’t quite up to grade for changing a single parameter in there.