Does Raft require HTTPS?

Hello,

I am trying to run a Vault cluster of 5 pods using the official Helm chart. I can provide more details if necessary, but I think at this stage the problem I am experiencing is not Kubernetes-specific.

I don’t want to use TLS, so I disabled/removed all TLS-related configuration options. Looking at the pods’ logs, it seems they can’t manage to talk to each other. All of them report the same messages in their logs:

[INFO]  core: attempting to join possible raft leader node: leader_addr=http://vault-2:8200
[WARN]  core: join attempt failed: error="error during raft bootstrap init call: Put \"http://vault-2:8200/v1/sys/storage/raft/bootstrap/challenge\": dial tcp: lookup vault-2 on 10.96.0.10:53: server misbehaving"

I did some research on this “server misbehaving” error message, but couldn’t find anything on the internet. I suspect that pods expect other pods to talk HTTPS, and aren’t happy when they get a plain HTTP response. Could that be the cause of this “server misbehaving” message?

Thanks for any help! If you require more details (version numbers, configuration, logs, etc.) please ask and I will provide them.

Look more closely at this part of your error message:

lookup vault-2 on 10.96.0.10:53: server misbehaving

The error mentions lookup, an IP address that looks very much like what Kubernetes would pick for its default in-cluster DNS service, and port 53, the DNS port.

So “server misbehaving” refers to your Kubernetes DNS server, not to Vault-to-Vault communication.
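To make the structure of that message concrete, here is a minimal Go sketch that rebuilds it with the standard library's net.DNSError type. It assumes the message came from Go's standard resolver, which is likely since Vault is written in Go. The helper name lookupMessage is made up for illustration, and the field values are copied from the log above.

```go
package main

import (
	"fmt"
	"net"
)

// lookupMessage is a hypothetical helper: it builds a net.DNSError from its
// three visible parts and returns the text Go would print for it.
func lookupMessage(name, server, reason string) string {
	err := &net.DNSError{
		Name:   name,   // the hostname that was being resolved
		Server: server, // the DNS server that was asked (IP:port)
		Err:    reason, // what went wrong, as reported by the resolver
	}
	return err.Error()
}

func main() {
	// Values taken from the log line in the question.
	fmt.Println(lookupMessage("vault-2", "10.96.0.10:53", "server misbehaving"))
	// prints: lookup vault-2 on 10.96.0.10:53: server misbehaving
}
```

Read this way, "vault-2" is the name being resolved, "10.96.0.10:53" is the DNS server that was queried, and "server misbehaving" is the resolver's verdict on that DNS server's answer.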

Hi @maxb ,

Yes, that’s right, thanks for pointing this out! I will have to investigate why the DNS server is “misbehaving”…

Thanks!

No, the Raft protocol does not require TLS.
I’ll start by saying I’m a Kubernetes noob and will most likely move to Nomad for my use cases anyway. That said, it sounds like the most likely cause of that message is some garbled data from the Kubernetes DNS system. Are you using external-names or service registration? If not, you need to statically list the names of the other nodes in your configuration. Most likely the names you are using are not being resolved to the right nodes.

It appears this is the Go DNS client’s rather odd way of expressing “the DNS server sent me an error, but not an error I consider meaningful for this operation”.

As a starting point, you might want to exec into a pod within Kubernetes and check whether you’re actually able to resolve “vault-2” there.