Basic vault setup consuming 2GB of disk space in /opt/vault/data/sys and rising by the day

Hi,

Vault newbie here!

A few months ago I deployed a very simple Vault setup to host three secrets for (currently) 15 hosts as part of an automation project using Salt.

Vault and Salt are working together as expected, however I recently noticed Vault is consuming an awful lot of disk space (just short of 2GB) for just a simple setup and it’s been slowly incrementing every 15 minutes since it was deployed (based on my internal monitoring).

The salt minion on all the endpoints is scheduled to run salt-call state.apply every 15 minutes to keep their state in sync, so I think that is highly likely to be relevant.

The salt minion only needs a password stored in Vault in certain circumstances, but I suspect the salt minion is requesting it on every execution to populate its pillar data.

The salt master is configured to use Vault using an approle as per the documentation at Basic Configuration - Salt Extension for interacting with Vault. A redacted copy of the config is available at the bottom of this message.

I suspect Salt is generating a refresh token for each minion for the requests made every fifteen minutes. I presumed the old tokens would expire and be removed from the database, but now I’m not so sure.

I also wondered if they are being expired but the “database” needs defragging or similar. As I’m using the simple file storage backend it seems all the tokens are placed on the disk as flat files, not entered into a single database file so that seems unlikely.

I’ve been unable to find a ridiculous number of tokens “hanging around” when I navigate Vault using the Web GUI.

Does anyone have any advice as to how I might find and (if there’s loads of them) expire old tokens issued for the salt-minions?
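
For reference, one way to look for them from the CLI, as a rough sketch rather than a tested procedure: Vault can list token accessors under auth/token/accessors, and an accessor can be used to look up or revoke a token without knowing the token itself. The function names below are made up for illustration, the calling token needs sudo capability on that path, and with a very large number of tokens the listing may be slow.

```shell
# Hypothetical helpers; the function names are illustrative, not part of Vault.

# Print one token accessor per line.
# The default table output of "vault list" starts with a "Keys" header
# and a "----" separator line, which sed strips here.
list_token_accessors() {
    vault list auth/token/accessors | sed '1,2d'
}

# Show the properties of the token behind a given accessor.
show_token() {
    vault token lookup -accessor "$1"
}

# Revoke the token behind a given accessor.
revoke_token() {
    vault token revoke -accessor "$1"
}
```

For example, `list_token_accessors | head -n 5 | while read -r acc; do show_token "$acc"; done` would print the details of five tokens, which should be enough to see their display name, creation time and remaining TTL before deciding whether to revoke anything.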

Failing that, does anyone have any advice on how to troubleshoot this further?

Lastly, is using the file storage backend okay for this very simple setup, or is there a better storage backend which would allow me to deal with this type of issue better?

Many thanks

Steve


Disk space incrementing every 15 minutes.

Output from vault status

Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    5
Threshold       3
Version         1.17.5
Build Date      2024-08-30T15:54:57Z
Storage Type    file
Cluster Name    vault-cluster-XXXXXXXX
Cluster ID      XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
HA Enabled      false

Vault’s config file

cat /etc/vault.d/vault.hcl

ui = true

storage "file" {
  path = "/opt/vault/data"
}

# HTTPS listener
listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/etc/nginx/ssl/XXXX.cer"
  tls_key_file  = "/etc/nginx/ssl/XXXX.key"
}

Disk usage highlighting the disk space is being consumed in /opt/vault/data/sys/token

sudo du -d 1 -h /opt/vault/data/
196K    /opt/vault/data/auth
20K     /opt/vault/data/audit
240K    /opt/vault/data/logical
100K    /opt/vault/data/core
1.8G    /opt/vault/data/sys
1.8G    /opt/vault/data/

sudo du -d 1 -h /opt/vault/data/sys/
124K    /opt/vault/data/sys/counters
1.2G    /opt/vault/data/sys/token
586M    /opt/vault/data/sys/expire
40K     /opt/vault/data/sys/policy
1.8G    /opt/vault/data/sys/

sudo du -d 1 -h /opt/vault/data/sys/token
581M    /opt/vault/data/sys/token/accessor
587M    /opt/vault/data/sys/token/id
1.2G    /opt/vault/data/sys/token

Salt master’s Vault config (redacted)

cat /etc/salt/master.d/vault.conf
vault:
  auth:
    method: approle
    approle_mount: salt-master-approle
    role_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    secret_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
  issue:
    type: approle
    approle:
      mount: salt-minions-approle
  server:
    url: https://FQDN:8200
    verify: /etc/ssl/certs/ca-certificates.crt
  metadata:
    entity:
      minion-id: '{minion}'
    secret:
      saltstack-jid: '{jid}'
      saltstack-minion: '{minion}'
      saltstack-user: '{user}'
  policies:
    assign:
      - saltstack/minions
      - saltstack/{minion}
    cache_time: 60
    refresh_pillar: null

Additional information…

Having taken a closer look at the files in /opt/vault/data/sys/token/id/ I can see that around 72 files are being created every 15 minutes and that the oldest files are dated approx 32 days ago…
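
That count can be reproduced without eyeballing the ll output. A small sketch, assuming a find that supports -mmin (GNU and BSD find both do); count_recent_files is a name made up here:

```shell
# Count regular files in a directory modified within the last N minutes.
# count_recent_files is a hypothetical helper name; -mmin is a GNU/BSD find
# extension rather than POSIX.
count_recent_files() {
    find "$1" -type f -mmin "-$2" | wc -l | tr -d ' '
}
```

Run as root or the vault user, `count_recent_files /opt/vault/data/sys/token/id 15` should report roughly the number of tokens created by the most recent Salt run.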

# ll -t /opt/vault/data/sys/token/id/ | head
total 587M
drwx------ 2 vault vault  18M Sep  5 13:16 ./
-rw------- 1 vault vault 1.1K Sep  5 13:16 _h81c7a9a0e6b0948e8799b8dc6c4d439916789f517dbbd37eacde1f99755946d2
-rw------- 1 vault vault  885 Sep  5 13:16 _hf713fcc5e968d4f8028af56d31fccd4ae08abf3a428522e4c9eb7a94a92802fa
-rw------- 1 vault vault  885 Sep  5 13:16 _h57d9607271f94a99ee153b1b4ea99b8326c66a46798ce2d3e10c6306dcd796b3
-rw------- 1 vault vault 1.1K Sep  5 13:16 _hdc78bdb1cd057a50c73bfb28a7edbb956882236e664c4886d81f5780649facb8
-rw------- 1 vault vault  885 Sep  5 13:16 _h3c30235fc9409a23b99f2f2e4ffe2dbbb5bd4a46d31178dd1e0effca12be8b71
-rw------- 1 vault vault  885 Sep  5 13:16 _h3adfd8854c7b9e290b2c6224bc333d28525a67e9251bf40771c547897f5525a7
-rw------- 1 vault vault 1.1K Sep  5 13:16 _h5257118ad0a131f54d68f3af16c914b7a9f02c9583131a22a2bceda9f0f87e76
-rw------- 1 vault vault  885 Sep  5 13:16 _h00be03968d1c3f7d1702d790847f0e883388167e4146fcd1a052cbdff3488ea8

# ll -t /opt/vault/data/sys/token/id/ | tail
-rw------- 1 vault vault  885 Aug  4 13:30 _hb20f25fc19fa0180d51641535c7bd5a23f66d738e7cf065f946f9fb488da6dbb
-rw------- 1 vault vault  885 Aug  4 13:30 _hf57416727568142a6c284a1277da75ee05fb489e1c37a329dd4404b314e38cb2
-rw------- 1 vault vault  885 Aug  4 13:30 _hec320e36a1720c6366a0368f8c32942a80b46d3eb207ff65144394666cb56a91
-rw------- 1 vault vault  885 Aug  4 13:30 _h602b3b8a342deced36d2fe4f51187cd23644d4f5baab02feb7177bdd1f093c02
-rw------- 1 vault vault  885 Aug  4 13:30 _h4c8eabf26b2674fc4ae86b4773a889c74d50f69994d61cfefc5a1af89cb88105
-rw------- 1 vault vault  885 Aug  4 13:30 _hd143e0647305a65198d3735e588ff6ef407016bbe85b4517a94ae132b2a521dd
-rw------- 1 vault vault  885 Aug  4 13:30 _h0ab66b5e7d168f214f5739ecb8946cdf63cd4901d6e1c93305afb2bfb1496581
-rw------- 1 vault vault  885 Aug  4 13:30 _h71d6b5dbc5cafeba4f3c4dfdb132a1ecfbb6bcef3d7870b873e602f9b29d5737
-rw------- 1 vault vault  885 Aug  4 13:30 _h26ea5e9d2348af618a70edcc7ebb0f41571af81f21e0051df7feb060b1862219
-rw------- 1 vault vault  721 Jul 24 16:52 _h4a5a8a02d1a7dddb2aacf71e8ecd9d3382ab8d22e11f3d8560172f3c7bc4f563

Those oldest files are being trimmed on a rolling basis, so maybe usage will level off at around 2GB and it isn’t really a problem (I had thought the host would eventually run out of disk space).

root@captain:~ # ll -t /opt/vault/data/sys/token/id/ | tail
-rw------- 1 vault vault  885 Aug  4 13:30 _ha942ecee09c6f3c8fd0728269ad7ba33723b435e5f68d7c8e858a283dcdbb743
-rw------- 1 vault vault  885 Aug  4 13:30 _hb20f25fc19fa0180d51641535c7bd5a23f66d738e7cf065f946f9fb488da6dbb
-rw------- 1 vault vault  885 Aug  4 13:30 _hf57416727568142a6c284a1277da75ee05fb489e1c37a329dd4404b314e38cb2
-rw------- 1 vault vault  885 Aug  4 13:30 _hec320e36a1720c6366a0368f8c32942a80b46d3eb207ff65144394666cb56a91
-rw------- 1 vault vault  885 Aug  4 13:30 _h602b3b8a342deced36d2fe4f51187cd23644d4f5baab02feb7177bdd1f093c02
-rw------- 1 vault vault  885 Aug  4 13:30 _h4c8eabf26b2674fc4ae86b4773a889c74d50f69994d61cfefc5a1af89cb88105
-rw------- 1 vault vault  885 Aug  4 13:30 _hd143e0647305a65198d3735e588ff6ef407016bbe85b4517a94ae132b2a521dd
-rw------- 1 vault vault  885 Aug  4 13:30 _h0ab66b5e7d168f214f5739ecb8946cdf63cd4901d6e1c93305afb2bfb1496581
-rw------- 1 vault vault  885 Aug  4 13:30 _h71d6b5dbc5cafeba4f3c4dfdb132a1ecfbb6bcef3d7870b873e602f9b29d5737
-rw------- 1 vault vault  885 Aug  4 13:30 _h26ea5e9d2348af618a70edcc7ebb0f41571af81f21e0051df7feb060b1862219

#
# Run same command 15 minutes later
#

root@captain:~ # ll -t /opt/vault/data/sys/token/id/ | tail
-rw------- 1 vault vault  885 Aug  4 13:45 _h35c9602c98efd14406fbb4d4484022b3cef84e1e0aec941b4c56604509d4cd78
-rw------- 1 vault vault  885 Aug  4 13:45 _hf053047cba751a68c9910247d91042f0b7dbbbea336d6c789acf66b59e12cadc
-rw------- 1 vault vault  885 Aug  4 13:45 _hea82785c11fa3e32ab0e5577ee8010ffdde0cfc315e0bb209ebcc0c35bf0c8aa
-rw------- 1 vault vault  885 Aug  4 13:45 _h76432fb259fb55123ce8139653301c98f0c52d3b4804123864438691cbbcf946
-rw------- 1 vault vault  885 Aug  4 13:45 _hbc4a31ac9e50091188589e653a475cfffa0f10a59da29f5760879249d397836d
-rw------- 1 vault vault  885 Aug  4 13:45 _h830ae7cd56daf3f9d298859de77be84d42f70d4006a95f7766f38478b34efa51
-rw------- 1 vault vault  885 Aug  4 13:45 _h0b959c0a21afbf297cc0a21da4b0d18382b80932c6821a4a6da988dcb452b687
-rw------- 1 vault vault  885 Aug  4 13:45 _hd0583907bd164323213b53809b555a241ef995532b5105d3e13d736bba591055
-rw------- 1 vault vault  885 Aug  4 13:45 _hfd7dd4e81d4ab80677ca60c8e7c47565551001ce26abbd381831163462879033
-rw------- 1 vault vault  885 Aug  4 13:45 _h6c77b7ecfe6514d9cd0b8930315544ed644256edeac7fb4031f1467e32664cca

I think I’m getting closer to working this out.

Token management | Vault | HashiCorp Developer - https://developer.hashicorp.com/ states that…

The default token TTL (default_lease_ttl) and the max TTL (max_lease_ttl) is set to 32 days (768 hours). This implies that the tokens are valid for 32 days from its creation whether an app is using the token or not.

That page also offers some advice on how to tune the default_lease_ttl and max_lease_ttl.

$ vault read sys/auth/token/tune
Key                  Value
---                  -----
default_lease_ttl    768h
description          token based credentials
force_no_cache       false
max_lease_ttl        768h
token_type           default-service
$ vault write sys/auth/token/tune default_lease_ttl=8h max_lease_ttl=720h
Success! Data written to: sys/auth/token/tune
$ vault read sys/auth/token/tune
Key                  Value
---                  -----
default_lease_ttl    8h
description          token based credentials
force_no_cache       false
max_lease_ttl        720h
token_type           default-service

I’ve tuned mine down to 8 hours but will probably reduce it further to 15 minutes.

Changing the default_lease_ttl and max_lease_ttl didn’t immediately release any disk space… I suspect all the tokens already on disk have a 32-day expiry, so I’ll need to wait for them to expire naturally. Hopefully in 32 days I’ll only have 8 hours’ worth of tokens.
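
A back-of-the-envelope estimate of what "8 hours' worth" means. This is only a sketch: it assumes roughly 72 new tokens per 15-minute run (the figure seen in the id directory), that tokens are never renewed or revoked early, and that new tokens actually pick up the shorter TTL. steady_state_tokens is a made-up helper name.

```shell
# Estimate how many unexpired tokens exist at steady state.
# Arguments: tokens created per run, minutes between runs, token TTL in minutes.
# steady_state_tokens is a hypothetical helper name.
steady_state_tokens() {
    echo $(( $1 * $3 / $2 ))
}

steady_state_tokens 72 15 $(( 768 * 60 ))   # 32-day TTL
steady_state_tokens 72 15 $(( 8 * 60 ))     # 8-hour TTL
steady_state_tokens 72 15 15                # 15-minute TTL
```

Under those assumptions this prints 221184, 2304 and 72, so an 8-hour TTL would leave a few thousand token files on disk rather than a couple of hundred thousand.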

Finally, the following command shows how many tokens currently exist.

$ vault read sys/internal/counters/tokens
Key         Value
---         -----
counters    map[service_tokens:map[total:145884]]

I’ll keep an eye on this value, but I suspect it’ll start to reduce over time. :crossed_fingers:
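
As a rough cross-check, that token count lines up with the du figures if each small token file occupies one filesystem block. The sketch below assumes a 4 KiB block size, which is not confirmed anywhere above; the 18M is the size of the id directory itself as shown in the ll output.

```shell
tokens=145884          # service_tokens total reported by Vault
block_kib=4            # ASSUMPTION: one 4 KiB filesystem block per token file
dir_mib=18             # size of the id/ directory itself, from the ll output

# Disk used by the token files alone, in MiB (integer division rounds down).
files_mib=$(( tokens * block_kib / 1024 ))
total_mib=$(( files_mib + dir_mib ))

echo "token files: ~${files_mib} MiB, with directory: ~${total_mib} MiB"
```

This prints about 569 MiB for the files and 587 MiB including the directory, which is close to the 587M that du reported for sys/token/id. If the block-size assumption holds, it would also explain why files of around 885 bytes add up to far more than their byte sizes suggest.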

Thanks for listening :rofl:

Regards

Steve

While poking about in the Web GUI I noticed that the salt-master-approle authentication method still had the 32 days default so I adjusted the commands as follows…

$ vault read sys/auth/salt-master-approle/tune
Key                   Value
---                   -----
default_lease_ttl     768h
description           n/a
force_no_cache        false
listing_visibility    hidden
max_lease_ttl         768h
token_type            default-service
$ vault write sys/auth/salt-master-approle/tune default_lease_ttl=8h max_lease_ttl=720h
$ vault read sys/auth/salt-master-approle/tune
Key                   Value
---                   -----
default_lease_ttl     8h
description           n/a
force_no_cache        false
listing_visibility    hidden
max_lease_ttl         720h
token_type            default-service

Changing those also didn’t release any disk space, but I’m hopeful over time it’ll start to reduce.
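
The same pair of commands can be pointed at any auth mount. The Salt config above also names a second mount, salt-minions-approle; whether its settings are what govern the per-minion token TTL is an assumption, not something verified here. A sketch with made-up wrapper names:

```shell
# Hypothetical wrappers around the tune endpoint used above.

# Show the current tuning for an auth mount, e.g. "salt-minions-approle".
show_mount_ttls() {
    vault read "sys/auth/$1/tune"
}

# Set the default and max lease TTLs on an auth mount.
tune_mount_ttls() {
    vault write "sys/auth/$1/tune" default_lease_ttl="$2" max_lease_ttl="$3"
}
```

For example, `show_mount_ttls salt-minions-approle` followed, if it still shows 768h, by `tune_mount_ttls salt-minions-approle 8h 720h`.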