Vault load balancing and read\write requests

davidd · September 28, 2021, 11:22am

Hi.

We plan on using Vault Enterprise behind a load balancer, which will likely be haproxy. I was thinking of using haproxy to send any PUT requests to the active node and any GET requests to the active node or performance standbys. This would avoid the standbys having to redirect writes to the active node.
I’m thinking we would check for a 200/473 status and also whether the request is a read or a write.
From my research, we’d likely need http mode, and haproxy would need to decrypt the traffic on receipt and re-encrypt it before sending it to the Vault nodes. Does this sound workable?
Thanks

mikegreen · September 28, 2021, 1:47pm

That is workable.
I’d question whether the effort and added complexity are really needed. What are your performance/load requirements for writes?

davidd · September 28, 2021, 2:26pm

We don’t have exact requirements on how quick the response should be. I was thinking that hitting the active node directly for writes would be preferable. Do people normally set haproxy in tcp mode, just look for the 200/473 response, send requests to any node, and let writes be redirected? We’re not tied to using haproxy as a load balancer, so we can change if another is better suited.

mikegreen · September 28, 2021, 3:41pm

Nginx, HAProxy, F5: anything that can parse the HTTP response codes from sys/health is normal to use.
It depends on the read/write workload balance. If 90% of your requests are reads, you’re not going to get a lot of performance benefit from splitting them up. It’s all somewhat hand-wavy if you don’t have an idea of your workload/throughput requirements.
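
For reference, the status codes a load balancer would be parsing here are the sys/health defaults: 200 for the active node, 429 for a standby, 472 for a DR secondary, 473 for a performance standby, 501 for an uninitialized node and 503 for a sealed one. A minimal Python sketch of the routing decision follows; the function names and the choice of 200 and 473 as "routable" are illustrative assumptions, not part of any Vault or load balancer API.

```python
# Default status codes returned by Vault's /v1/sys/health endpoint.
HEALTH_STATES = {
    200: "initialized, unsealed, active",
    429: "unsealed, standby",
    472: "disaster recovery secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}

# Assumption for this sketch: keep active nodes and performance
# standbys in rotation, as discussed in this thread.
ROUTABLE = frozenset({200, 473})


def describe(status: int) -> str:
    """Return a human-readable node state for a sys/health status code."""
    return HEALTH_STATES.get(status, "unknown")


def is_routable(status: int, routable: frozenset = ROUTABLE) -> bool:
    """Return True if the load balancer should keep this node in rotation."""
    return status in routable


if __name__ == "__main__":
    for code in (200, 429, 473, 503):
        print(code, describe(code), is_routable(code))
```

Passing a different `routable` set (for example one that also contains 429) mirrors what you would put in the load balancer's "expected status" setting.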

davidd · September 28, 2021, 4:26pm

Thanks. More than 90% of requests will likely be reads.

aram · September 29, 2021, 8:42am

Using an ALB is probably the only way to do Vault architecture correctly. The nice thing is that the application-layer check is built into the Vault health check:

In AWS, for a target group you can use /v1/sys/health?perfstandbyok=true against 8200, which gives you the healthy nodes (active plus performance standbys), and /v1/sys/health against 8201, which gives you the leader node, in two nice groups. 8200 can be used to direct your ALB for your users, and 8201 can be used as your cluster port for any PR or DR connectivity.
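
As far as I know from the sys/health API docs, the query parameters that change what a performance standby returns are `perfstandbyok` (return the active code from performance standbys) and `performancestandbycode` (override the 473). A small Python sketch for building the health check path is below; the host name is made up and `health_check_url` is a hypothetical helper, not something Vault ships.

```python
from urllib.parse import urlencode


def health_check_url(host: str, port: int = 8200, **params: str) -> str:
    """Build a sys/health URL, appending any query parameters given."""
    base = f"https://{host}:{port}/v1/sys/health"
    query = urlencode(params)
    return f"{base}?{query}" if query else base


if __name__ == "__main__":
    # Only the active node answers 200 on this one.
    print(health_check_url("vault.example.internal"))
    # Active node and performance standbys answer 200 on this one.
    print(health_check_url("vault.example.internal", perfstandbyok="true"))
```

The part after the port is what would go into the target group's health check path.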

davidd · September 29, 2021, 9:43am

Thanks. We’re on-premises, so we plan on using haproxy to check for a 200 or 473 status code. I’ve thought about it since, and it may be pointless to check for read\write requests, since we’ll use performance replication and writes will be directed to the primary cluster anyway. So checking for the 200 or 473 status may be sufficient.

aram · September 29, 2021, 9:45am

There is no need to check the type of request. All nodes can reply to reads or writes. When a write request is made, the node that received it internally tells the leader to store the updated value. There is no need for you to check or validate that, and you’ll just cause more headaches for yourself if you do.

davidd · September 29, 2021, 10:00am

So if it’s a write, the node that receives it will reply to the client and forward the request to the leader? And since it’s asynchronous, there’s no delay? With performance replication, does the leader in a secondary cluster forward the write to the leader in the primary cluster? So the write request may be received by a performance standby in a secondary, which replies to the client after forwarding the request to its local leader, which forwards it to the leader in the primary cluster? Sorry for all the questions.

aram · September 29, 2021, 10:12am

Essentially correct.

Correct.

BTW, even the nodes in the same Vault cluster that are not “leader” nodes are considered performance standbys. You can see this in the health check output of any non-leader node in the primary cluster.

  "performance_standby": true,
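
If you want to check that field from a script rather than by eye, a rough Python sketch is below. It assumes the health response body is JSON containing the boolean fields `sealed`, `standby` and `performance_standby`; the sample body is hand-written for illustration (not captured from a real cluster) and `node_role` is a hypothetical helper.

```python
import json


def node_role(health_body: str) -> str:
    """Classify a node from the JSON body of a sys/health response."""
    data = json.loads(health_body)
    if data.get("sealed", False):
        return "sealed"
    if data.get("performance_standby", False):
        return "performance standby"
    if data.get("standby", False):
        return "standby"
    return "active"


# Illustrative body only; a real response carries more fields.
SAMPLE = '{"initialized": true, "sealed": false, "standby": true, "performance_standby": true}'

if __name__ == "__main__":
    print(node_role(SAMPLE))
```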

icassano · October 22, 2022, 4:02pm

Hello, could you share your haproxy conf, please?
I am going crazy because my haproxy conf does not work for Vault. It returns SSL handshake errors, but connecting directly to a Vault server works fine.
Ignazio

davidd · October 24, 2022, 8:19am

This is what I was testing with. I did see intermittent SSL handshake errors when using haproxy.

    frontend vault_https
      mode tcp
      log global
      timeout client 30000
      bind *:443 
      description Vault over https
      default_backend vault_https
      use_backend vault_https_backup if { nbsrv(vault_https) lt 3 }
      option tcplog
      log         /dev/log local2 debug

    backend vault_https
      mode tcp
      timeout check 5000
      timeout server 30000
      timeout connect 5000
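      # Health check: 200 = active, 473 = performance standby, 429 = standby
      option httpchk GET /v1/sys/health
      http-check expect rstatus 200|473|429
      # tcplog only: httplog has no effect on a mode tcp proxy
      option tcplog
      log         /dev/log local2 debug
      # send-proxy prepends a PROXY protocol header to every connection, so
      # the Vault listeners must be configured to accept it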
      server server1 server1:8200 check port 8200 check-ssl verify none inter 2000  send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server2 server2:8200 check port 8200 check-ssl verify none inter 2000  send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server3 server3:8200 check port 8200 check-ssl verify none inter 2000  send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server4 server4:8200 check port 8200 check-ssl verify none inter 2000  send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server5 server5:8200 check port 8200 check-ssl verify none inter 2000  send-proxy fastinter 1000 downinter 10000 fall 2 rise 2

    backend vault_https_backup
      mode tcp
      timeout check 5000
      timeout server 30000
      timeout connect 5000
      option httpchk GET /v1/sys/health
      http-check expect rstatus 200|473|429
      server server6 server6:8200 check port 8200 check-ssl verify none inter 2000 send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server7 server7:8200 check port 8200 check-ssl verify none inter 2000 send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server8 server8:8200 check port 8200 check-ssl verify none inter 2000 send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server9 server9:8200 check port 8200 check-ssl verify none inter 2000 send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
      server server10 server10:8200 check port 8200 check-ssl verify none inter 2000 send-proxy fastinter 1000 downinter 10000 fall 2 rise 2
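
One thing worth ruling out with a config like this, offered as an assumption rather than a confirmed diagnosis: `send-proxy` makes haproxy put a PROXY protocol v1 line at the start of every backend connection, ahead of the client's TLS bytes, while the health checks do not carry that line unless `check-send-proxy` is also set. A listener that is not expecting the line sees it where the TLS ClientHello should be, so checks can pass while real traffic fails the handshake. On the Vault side this is governed by the listener's `proxy_protocol_behavior` setting. The Python sketch below only illustrates the byte-level mismatch; the addresses are documentation examples and the helper names are hypothetical.

```python
TLS_HANDSHAKE_RECORD = 0x16  # first byte of a TLS handshake record


def proxy_v1_header(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build the PROXY protocol v1 line sent ahead of the payload."""
    line = f"PROXY TCP4 {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def looks_like_tls(first_bytes: bytes) -> bool:
    """Return True if the stream starts like a TLS handshake record."""
    return len(first_bytes) > 0 and first_bytes[0] == TLS_HANDSHAKE_RECORD


if __name__ == "__main__":
    header = proxy_v1_header("192.0.2.10", "198.51.100.5", 51234, 8200)
    print(header)
    # A TLS-only listener inspects these bytes first and rejects them.
    print(looks_like_tls(header))
```

Either dropping `send-proxy` from the server lines or enabling PROXY protocol on the Vault listeners should make the two sides agree.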
