Question regarding preferred Storage Backend(s) for large k/v store (up to 50 million secrets)

Hi there,

I’ve been using Vault lately to evaluate its feasibility for an upcoming project, and so far I am quite happy with it :)

My main concern today is that there is a good chance we will end up with up to 50 million secrets that need to be stored in the kv secrets engine (possibly in multiple subfolders).

Does anyone have experience with this amount of data inside Vault and can maybe recommend a storage backend for this task? The only constraint currently is that it must be hosted onsite and can’t be a cloud-based storage backend.

To give this question some background:

  • I’m currently running vault with a postgres backend, both inside docker-containers on a virtual (ubuntu) server
  • My test script is inserting secrets into vault and currently sits at approx 9 million secrets
  • The rate of insertion has steadily dropped from approx. 75 secrets/second to now approx. 20 secrets/second
  • When I open the UI (while the test is running), it takes 5-7 minutes to show the list of secrets
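
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of an insertion benchmark. It is not the script used above. It assumes a KV version 2 engine mounted at `secret/`, a Vault server at `http://127.0.0.1:8200`, and a placeholder token; the path prefix `archive/item-…` and the function names are made up for illustration.

```python
import json
import time
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # assumption: local test server
VAULT_TOKEN = "dev-only-token"         # assumption: placeholder token
MOUNT = "secret"                       # assumption: KV v2 mounted at secret/


def write_secret(path, data):
    """Write one secret via the KV v2 HTTP API (POST /v1/<mount>/data/<path>)."""
    req = urllib.request.Request(
        f"{VAULT_ADDR}/v1/{MOUNT}/data/{path}",
        data=json.dumps({"data": data}).encode(),
        headers={"X-Vault-Token": VAULT_TOKEN, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def measure_insert_rate(total, batch, write=write_secret, clock=time.perf_counter):
    """Insert `total` secrets; return (count_so_far, secrets_per_second) per batch."""
    rates = []
    start = clock()
    for i in range(1, total + 1):
        write(f"archive/item-{i:08d}", {"value": f"payload-{i}"})
        if i % batch == 0:
            now = clock()
            rates.append((i, batch / (now - start)))
            start = now  # the next batch is timed on its own
    return rates
```

Calling `measure_insert_rate(100_000, 10_000)` against a test server would return one rate per 10,000 inserts, which makes a gradual slowdown like the one described above easy to plot per storage backend.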

The actual use-case for the project at hand would be something like this:

  • have an ‘archive’ folder/path with the said max. 50 million secrets that basically never need to be accessed. In the event that one is accessed, it may very well take a couple of minutes - no problem
  • have a ‘work directory’ that at any point holds roughly 100k secrets, which are actively added, modified, retrieved and finally moved to the archive. Actions in this folder should be “fast”
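
The "move to the archive" step could look roughly like the sketch below. As far as I know the kv engine has no single "move" call for a secret, so this reads the secret, writes it to the new path and then removes the old one. It assumes KV version 2 mounted at `secret/` and the path prefixes `work/` and `archive/`; the address, token and function names are placeholders, not anything from my actual setup.

```python
import json
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # assumption: local test server
VAULT_TOKEN = "dev-only-token"         # assumption: placeholder token
MOUNT = "secret"                       # assumption: KV v2 mounted at secret/


def vault_request(method, path, body=None):
    """Send one request to the Vault HTTP API and return the parsed JSON (or None)."""
    req = urllib.request.Request(
        f"{VAULT_ADDR}/v1/{path}",
        data=json.dumps(body).encode() if body is not None else None,
        headers={"X-Vault-Token": VAULT_TOKEN},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def archive_secret(name, request=vault_request):
    """Copy work/<name> to archive/<name>, then delete the work copy."""
    current = request("GET", f"{MOUNT}/data/work/{name}")
    payload = current["data"]["data"]  # KV v2 nests the secret under data.data
    request("POST", f"{MOUNT}/data/archive/{name}", {"data": payload})
    # Deleting the metadata path removes all versions of the work copy.
    request("DELETE", f"{MOUNT}/metadata/work/{name}")
    return payload
```

Note that the three calls are not atomic: if the script dies between the write and the delete, the secret exists in both places, so a retry has to tolerate that.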

I’m still very much experimenting in order to get a feeling for Vault and how I can best use it to solve my challenges. I appreciate any input that you can provide and look forward to an interesting discussion.

Many thanks in advance
Julian

Hi Julian,

I’d recommend Consul as the storage back end. It supports high availability and can be run onsite. It’s also been Vault’s main one for a while, so is battle-tested.

We have a reference architecture here that may help as well: https://learn.hashicorp.com/vault/operations/ops-reference-architecture.

-Becca

Hi Becca,
thanks a lot for your reply.

I had a quick look at consul while researching the different storage backends and found a paragraph in the KV description that made me cautious:

Consul KV allows users to store indexed objects, though its main uses are storing configuration parameters and metadata. Please note that it is a simple KV store and is not intended to be a full featured datastore (such as DynamoDB) but has some similarities to one.

I have never worked at this scale (in terms of number of secrets) with either Vault or Consul. Do you think that Consul (with a moderate number of servers/nodes) can handle this number of secrets?

Many thanks in advance
Cheers - Julian

Hi @tyrannosaurus-becks,

sorry to ping you again, I’m researching this issue and would highly appreciate your (or anyone’s :)) input.

I am not sure if Consul is the right backend for us, tbh. Especially these points make me wonder:

It should be noted that the Consul key/value store is not designed to be used as a general purpose database.

NOTE: Consul is not designed to serve as a general purpose database, and you should keep this in mind when choosing what data are populated to the key/value store.

  • This issue on the Consul GitHub, which indicates that Consul stores everything in memory and crashes when memory runs out. I imagine that tens of millions of keys add up to a significant amount of RAM, which I’d like to avoid paying for when the bulk of the data is only seldom accessed (see my initial post above)

  • The etcd documentation on GitHub. While I don’t know how reliable that information is, it states that Consul can only reliably store “hundreds of megabytes”.

Since I’d like to use a high availability backend, I’m currently inclined to give Cassandra a try. Can you recommend that approach?
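
To put a rough number on the RAM concern, here is a back-of-the-envelope sketch. Every size in it is an assumption chosen purely for illustration (average key length, average encrypted value size, per-entry overhead of the store); none of them are measured values from Vault or Consul.

```python
def estimate_store_bytes(num_keys, avg_key_bytes, avg_value_bytes, overhead_bytes_per_entry):
    """Rough in-memory footprint: every entry costs key + value + bookkeeping overhead."""
    return num_keys * (avg_key_bytes + avg_value_bytes + overhead_bytes_per_entry)


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# Assumed sizes, for illustration only:
total = estimate_store_bytes(
    num_keys=50_000_000,
    avg_key_bytes=64,             # assumption: average path length
    avg_value_bytes=512,          # assumption: average encrypted secret size
    overhead_bytes_per_entry=128, # assumption: index/bookkeeping per entry
)
print(f"{to_gib(total):.1f} GiB")  # prints 32.8 GiB for these assumed sizes
```

With these made-up sizes the data set alone would need about 33 GiB of RAM per server, before replication or snapshots are taken into account, and nearly all of it would be for archive data that is hardly ever read.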

Many thanks in advance
Best Regards
Julian

If you want to store an unbounded number of keys, none of the in-memory databases – this includes both Consul and etcd – should be your target. (The information on etcd’s page seems suspect – like etcd, Consul is fine, until you run out of memory.)

You may want to look at the built-in Raft storage that will hopefully go GA in Vault 1.4. This allows for HA but it stores data on disk. You may also want to consider a different architecture, such as using Vault’s Transit secret engine to perform encryption as a service, then storing the actual (encrypted) data in a separate, highly scalable data store.
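
A minimal sketch of the Transit idea, assuming the transit engine is mounted at `transit/` and a key named `archive-key` already exists (both are assumptions, as are the address and token). A plain dict stands in for the separate data store here; in practice that would be whatever scalable database you choose.

```python
import base64
import json
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # assumption: local test server
VAULT_TOKEN = "dev-only-token"         # assumption: placeholder token


def transit_call(operation, key_name, body):
    """POST to /v1/transit/<encrypt|decrypt>/<key_name> and return the parsed JSON."""
    req = urllib.request.Request(
        f"{VAULT_ADDR}/v1/transit/{operation}/{key_name}",
        data=json.dumps(body).encode(),
        headers={"X-Vault-Token": VAULT_TOKEN},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def encrypt_and_store(store, record_id, secret_bytes, key_name="archive-key", call=transit_call):
    """Encrypt with Transit, then keep only the ciphertext in the external store."""
    plaintext_b64 = base64.b64encode(secret_bytes).decode()  # Transit expects base64
    resp = call("encrypt", key_name, {"plaintext": plaintext_b64})
    store[record_id] = resp["data"]["ciphertext"]


def load_and_decrypt(store, record_id, key_name="archive-key", call=transit_call):
    """Fetch the ciphertext from the external store and decrypt it with Transit."""
    resp = call("decrypt", key_name, {"ciphertext": store[record_id]})
    return base64.b64decode(resp["data"]["plaintext"])
```

With this split, Vault only holds the encryption key, so the number of archived records no longer affects Vault's storage backend at all.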

@JulesRenz I would be really interested in how you benchmarked Vault and the numbers you got using different storage options. A blog post with the numbers and process would be of great value to the community.

I was interested in using Raft storage so we don’t have to manage another Consul cluster. However, since it is a beta feature, I am reluctant to take the jump.