Token character set?

Recently the size and format of at least AppRole tokens changed. I have one user who is parsing the returned JSON for the token using a regex, and that seems to fail from time to time.
I've looked through the docs but didn't find anything describing the character set used for tokens.
Does anyone know if that is documented anywhere?

You may be referring to this changelog entry from 1.10:

  • Server Side Consistent Tokens: Service tokens have been updated to be longer (a minimum of 95 bytes) and token prefixes for all token types are updated from s., b., and r. to hvs., hvb., and hvr. for service, batch, and recovery tokens respectively. Vault clusters with integrated storage will now have read-after-write consistency by default. [GH-14109]

But, please don’t try to parse JSON using regex. Use a real JSON parser, which pretty much every programming language has available these days.
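For example, in Python this is a few lines with the standard library. The response body below is a trimmed, made-up sketch of what an AppRole login returns; the real response contains more fields:

```python
import json

# Trimmed example of the JSON returned by a Vault AppRole login.
# The actual response has additional fields; this is illustrative only.
response_body = """
{
  "auth": {
    "client_token": "hvs.exampletokenvalue",
    "lease_duration": 3600,
    "renewable": true
  }
}
"""

data = json.loads(response_body)
token = data["auth"]["client_token"]  # no regex needed
print(token)
```

A proper parser handles whitespace changes, field reordering, and token format changes without any modification, which is exactly where a regex breaks.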

The current implementation of tokens involves a type prefix, as explained in the changelog, followed by a random or encoded string, possibly followed by a namespace suffix when Enterprise namespaces are in use.

As far as I know, there’s no formal commitment not to change the character sets in use, so I’d encourage not creating a dependency on that. Just treat them as an opaque string.

But if you really want to know, perusing the source code suggests that service and recovery tokens use base62, and batch tokens use URL-safe base64. We can't see the source code for Enterprise namespace suffixes, of course, but empirically they appear to be a dot followed by base62.
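For curiosity only (again, not something to build on), a quick empirical check of those observed alphabets might look like the following sketch. The prefixes come from the 1.10 changelog above; the alphabets are just what the source reading suggests, not a documented guarantee:

```python
import string

# Observed (NOT guaranteed) alphabets for current token bodies.
BASE62 = set(string.ascii_letters + string.digits)
URLSAFE_B64 = BASE62 | set("-_")

def matches_observed_charset(token: str) -> bool:
    """Illustrative only: check a token body against the alphabets
    observed empirically. Do not take a dependency on this."""
    if token.startswith(("hvs.", "hvr.")):
        alphabet = BASE62       # service/recovery: base62 (observed)
    elif token.startswith("hvb."):
        alphabet = URLSAFE_B64  # batch: URL-safe base64 (observed)
    else:
        return False            # legacy s./b./r. prefixes, or unknown
    body = token[4:]
    # An Enterprise namespace suffix appears (empirically) as ".<base62>",
    # so split on dots and check each segment.
    return all(part and set(part) <= alphabet for part in body.split("."))
```

Even this can silently become wrong after an upgrade, which is the whole argument for treating tokens as opaque.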

Again, though, please don’t take a dependency on this, just treat them as opaque strings.

/\"client_token\"\:\ \"(.*)\",/gm

That will capture based on the JSON key/value pair rather than the format of the token itself. That said, almost every OS and basic language has a JSON parser available, so why try to regex JSON output? The token length and format changed recently, and there is no guarantee they won't change again.
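To make the brittleness concrete, here is that regex run against two made-up response bodies in Python. The greedy `(.*)` over-captures when other fields follow on the same line, and compact JSON (no space after the colon, no trailing comma) doesn't match at all:

```python
import re

# The regex from above, transliterated to Python.
PATTERN = re.compile(r'"client_token"\: \"(.*)\",')

# Greedy .* runs to the LAST '",' on the line, swallowing other fields.
body = '{"client_token": "hvs.abc", "accessor": "acc", "policies": []}'
match = PATTERN.search(body)
print(match.group(1))  # captures far more than the token

# Compact JSON (a perfectly valid serialization) doesn't match at all.
compact = '{"client_token":"hvs.abc"}'
print(PATTERN.search(compact))  # None
```

Both bodies are valid JSON carrying the same token, yet the regex gives a wrong answer for one and no answer for the other.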