Vault_kv_store is large

Hello everyone,

I need help. I have Vault with a PostgreSQL backend, and the vault_kv_store table occupies 272 GB, over 210 million records. I'm trying to clean it up, but every query runs for a very long time. What is the best way to clean this table?

I thought about executing DELETE FROM vault_kv_store WHERE parent_path LIKE '/sys/expire/id/auth/kubernetes/'; but even a SELECT with this predicate does not complete.
I also can't work out how this would affect the services that use Vault, because they read their data through queries against this same storage.
So, what is the best way to clean the table?

It is not intended that operators change the contents of the Vault storage manually in any way - doing so without extremely intimate knowledge of how Vault works may cause severe problems for your installation.

Instead, your approach should be to identify which paths in the Vault store are responsible for most of the data, use that to understand the usage patterns causing the growth, and change those patterns.

For example, you’ve implied you are concerned about /sys/expire/id/auth/kubernetes/ - lots of data here would imply there’s an unreasonably huge number of active Vault tokens for login sessions via the auth/kubernetes/ auth method.

If the TTL of tokens from auth/kubernetes/ logins was high - say a month - but you have lots of processes in Kubernetes which just log in, do a few operations, and then discard their token, getting a new one from a fresh login next time they interact with Vault - then this would leave Vault stuck tracking a huge number of tokens which the applications have discarded, until they eventually expire.
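If that scenario applies, the fix is on the Vault side rather than in SQL: cap the token TTLs on the auth role so abandoned tokens expire quickly. A sketch with the Vault CLI (the role name myrole is a hypothetical placeholder, not something from this thread; pick TTLs that fit your workloads):

```shell
# Inspect the current token settings on a kubernetes auth role
# ("myrole" is a hypothetical role name)
vault read auth/kubernetes/role/myrole

# Cap token lifetimes on that role so discarded tokens expire soon
# after the workload stops renewing them (values are illustrative)
vault write auth/kubernetes/role/myrole token_ttl=1h token_max_ttl=4h
```

Note this only affects newly issued tokens; tokens already issued keep the TTL they were created with and must age out (or be revoked) separately.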

There may be other scenarios causing problems as well, as 272 GB is very large indeed. The production Vault instance I’m familiar with has just 1 GB of stored data.

I am unfamiliar with using Postgres as a Vault backend, and I’m not sure if this will work given the large size of the database, but if you were able to run and post here the output from something like:

SELECT COUNT(*), parent_path FROM vault_kv_store GROUP BY parent_path;

it could be very useful to help understand where the large amounts of data are in your Vault store.
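If the full GROUP BY over 210 million rows is too slow, a variant that trades accuracy for speed is to aggregate over a sample of the table's pages. A sketch, assuming psql access to the backend and a database named vault (the database name is an assumption):

```shell
# Approximate per-path row counts from roughly 1% of the table's
# pages (TABLESAMPLE SYSTEM is available in PostgreSQL 9.5+)
psql -d vault -c "
  SELECT parent_path, COUNT(*) AS n
  FROM vault_kv_store TABLESAMPLE SYSTEM (1)
  GROUP BY parent_path
  ORDER BY n DESC
  LIMIT 20;"
```

The counts will be roughly 1% of the true values, but the ranking of the heaviest parent_path prefixes should still be representative.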

However, if you’ve already identified /sys/expire/id/auth/kubernetes/ as an area of concern, it might be best to tackle that first, to start reducing the database size.


The reasons for the table's growth are known. Now I'm trying to solve the problem of how to clean the table. It is impossible to remove the unnecessary rows - the query latency is far too high. For example, deleting the /sys/expire/id/auth/kubernetes/ rows with
DELETE FROM vault_kv_store WHERE parent_path LIKE '/sys/expire/id/auth/kubernetes/';
has been running for more than 14 hours with no result.
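For what it's worth, one reason a single DELETE over millions of rows runs for hours is that it does all the work in one transaction. If manual deletion is attempted anyway, despite the warning above, the usual pattern is to delete in small batches, committing between batches. A sketch, assuming psql access to the backend and a database named vault (an assumption); note the trailing % in the pattern, which LIKE needs in order to match anything other than the literal string:

```shell
# Delete matching rows in batches of 10000 by ctid;
# rerun until the command reports "DELETE 0"
psql -d vault -c "
  DELETE FROM vault_kv_store
  WHERE ctid IN (
    SELECT ctid FROM vault_kv_store
    WHERE parent_path LIKE '/sys/expire/id/auth/kubernetes/%'
    LIMIT 10000
  );"
```

Also note that DELETE alone does not return disk space to the operating system; a VACUUM afterwards makes the space reusable, and only VACUUM FULL (which takes an exclusive lock on the table) actually shrinks the file.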

You’ve kind of ignored the core point in my reply, so let me restate it:

The way forward here is to solve the root cause of the problem, not to start deleting bits of the Vault data store ad-hoc.


Sorry for taking so long to answer.
There is no way to show the output of SELECT COUNT(*), parent_path FROM vault_kv_store GROUP BY parent_path; because that query never completes. I can only report the total number of records in the database: 208 million.
Unfortunately, no token lifetime is set in the authentication method's settings.
I am still looking for a way to clean up the database.

I want to set a token lifetime on some of the authentication methods; let's see whether old records then start being deleted.
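Lifetimes can also be capped at the mount level rather than per role. A sketch (TTL values are illustrative, not recommendations):

```shell
# Set default and maximum lease TTLs on the kubernetes auth mount
vault auth tune -default-lease-ttl=1h -max-lease-ttl=24h kubernetes/

# Verify the mount's effective configuration
vault read sys/auth/kubernetes/tune
```

As with role-level TTLs, this only affects tokens issued after the change; the existing 200+ million lease records keep their original (or absent) expiry and have to be dealt with separately.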

Why not? Or are you saying it takes too long to run and times out?

Yeah, I received
vault list sys/leases/lookup/auth/kubernetes
Error listing sys/leases/lookup/auth/kubernetes: context deadline exceeded
and the SQL query against the backend times out too.
This is a big problem - I can't clean up the old leases.
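For cleaning leases through Vault itself rather than raw SQL, Vault supports prefix-based revocation, which asks Vault to revoke every lease under a path. A sketch (this may also hit timeouts at this scale, and -force is destructive, so treat it as a last resort):

```shell
# Revoke all leases under the kubernetes auth mount.
# -force removes the lease entries from storage even if revocation
# of the underlying credential fails; it requires -prefix.
vault lease revoke -force -prefix auth/kubernetes/
```

Unlike a manual DELETE against vault_kv_store, this goes through Vault's own expiration manager, so the in-memory lease state and the storage backend stay consistent.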