HashiCorp Vault frequently changes the leader

We compiled HashiCorp Vault ourselves for s390x and are using it as a 3-node cluster with HAProxy in front. The Vault version was self-compiled.
Vault is running on Red Hat Enterprise Linux 9, 2 CPU Cores and 8GB RAM. There are periods where HashiCorp Vault runs stably for several hours, followed by phases with frequent leader changes. The log shows the following:

[WARN] storage.raft: failed to contact: server-id=node2 time=3.658501986s
[WARN] storage.raft: failed to contact: server-id=node3 time=3.592689253s
[WARN] storage.raft: failed to contact quorum of nodes, stepping down

[WARN] core.cluster-listener: no TLS config found for ALPN: ALPN=[“req_fw_sb-act_v1”]
[WARN] core.cluster-listener: no TLS config found for ALPN: ALPN=[“req_fw_sb-act_v1”]
[WARN] core.cluster-listener: no TLS config found for ALPN: ALPN=[“req_fw_sb-act_v1”]
[WARN] core.cluster-listener: no TLS config found for ALPN: ALPN=[“req_fw_sb-act_v1”]
[WARN] core.cluster-listener: no TLS config found for ALPN: ALPN=[“req_fw_sb-act_v1”]
[WARN] core.cluster-listener: no TLS config found for ALPN: ALPN=[“req_fw_sb-act_v1”]

We attempted to mitigate this behavior by setting performance_multiplier = 7, but this did not lead to any improvement.

Does anyone have recommendations for further configuration options or general approaches to resolve this issue?

Best regards

Stefan

Are there any other items in the Vault logs? Is there heavy Vault utilization from your workloads during this period? I would not expect “frequent” changes, so could also be a network issue (but unsure of your network configuration) - for example if you’re nodes are spread across multiple failure domains, increased latency could cause this: