Is the format of this ciphertext documented somewhere? Is it possible to parse this ciphertext into separate fields? Specifically, I’m interested in parsing out the initialization vector and the tag, but I would like to know what all of the fields are.
(I understand that the ciphertext from that example might not have been encrypted with an AES key. If the format of the ciphertext varies by key type, I’m specifically interested in the ciphertext format for AES keys. But I imagine that others might be interested in the ciphertext format for other key types.)
The first prefix identifies it as having been wrapped by Vault (and is configurable). The “v1” indicates that it’s key version 1, so that when you rotate keys we know which version to use for decryption. The last part is a base64’d concatenation of the IV and ciphertext. Assuming you’re using AES-GCM, after de-base-64ing, the first 96 bits will be the IV, and the rest the ciphertext.
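To make that layout concrete, here is a minimal Python sketch that splits a transit ciphertext into its fields. The names `parse_transit_ciphertext` and `TransitCiphertext` are made up for illustration. It assumes AES-GCM with a 96-bit IV, and it assumes the 128-bit authentication tag is appended to the end of the encrypted bytes. That is how GCM output is commonly laid out, but it is not confirmed here against the Vault source.

```python
import base64
from typing import NamedTuple

IV_LEN = 12   # 96-bit IV, as described above
TAG_LEN = 16  # assumption: 128-bit GCM tag appended after the ciphertext


class TransitCiphertext(NamedTuple):
    """Hypothetical container for the parsed fields."""
    prefix: str
    key_version: int
    iv: bytes
    ciphertext: bytes
    tag: bytes


def parse_transit_ciphertext(value: str) -> TransitCiphertext:
    """Split 'vault:vN:<base64>' into prefix, key version, IV, ciphertext, tag."""
    prefix, version, payload = value.split(":", 2)
    if not version.startswith("v"):
        raise ValueError("expected a key version such as 'v1'")
    raw = base64.b64decode(payload)
    if len(raw) < IV_LEN + TAG_LEN:
        raise ValueError("payload too short for an IV and a GCM tag")
    return TransitCiphertext(
        prefix=prefix,
        key_version=int(version[1:]),
        iv=raw[:IV_LEN],
        ciphertext=raw[IV_LEN:len(raw) - TAG_LEN],
        tag=raw[len(raw) - TAG_LEN:],
    )
```

Calling `parse_transit_ciphertext("vault:v1:...")` on a real value returns the five fields. The encrypted part has the same length as the plaintext, since GCM adds no padding.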
This won’t change the parsing, but the ciphertext includes padding and some kind of MAC (like a SHA256 hash of the plaintext) to detect ciphertext tampering:
$ vault write --field=plaintext transit/decrypt/mykey ciphertext=vault:v1:VLiFIPvGDnBcRj+luztNV5OFiK/4vpRxQb59hNurwgQRb+f471LmU4A=
Error writing data to transit/decrypt/guillaume: Error making API request.
URL: PUT https://localhost:8200/v1/transit/decrypt/mykey
Code: 400. Errors:
* invalid ciphertext: unable to decrypt
So if after parsing you intend to decrypt by some other means (assuming you have the key), you will have to take this into account.
@mikewertheim There is no MAC. It uses AES-GCM so it is authenticated but that’s built-in. There is also no padding outside of the normal AES algorithm.
That’s where I would start investigating the source to find the format of the actual bytes that are encrypted. With that knowledge, it is just a matter of cherry-picking the GCM data from the ciphertext and decrypting it.
I don’t like when I get questions about my use case from someone who doesn’t know the history behind it, so I’ll resist the urge to question yours… Still an interesting challenge. I might give it a shot!
Thanks @jeff, still a little confused about the authenticated data… I’ll read up on that. I was expecting the AAD part of GCM to be separate, like the IV.
Anyway, to answer @mikewertheim, here is a roundtrip decryption of transit ciphertext. Works on my machine, ymmv!
$ vault read transit/export/encryption-key/guillaume
Key Value
--- -----
keys map[1:VxJWkOYm2F5z1nF1th9zreS6ZAZMFkCq0c/Ik460ayw=]
name guillaume
type aes256-gcm96
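For reference, the decryption step with an exported key can be sketched in Python as follows. This is an illustration under stated assumptions, not necessarily the snippet originally posted. The function name `decrypt_transit` is hypothetical. The sketch assumes a non-convergent, non-derived aes256-gcm96 key, no additional authenticated data, and that the key passed in is the exported key for the version named in the ciphertext. It uses the `cryptography` package.

```python
import base64

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_LEN = 12  # 96-bit IV, matching the aes256-gcm96 key type


def decrypt_transit(ciphertext: str, exported_key_b64: str) -> bytes:
    """Decrypt a 'vault:vN:<base64>' value with a base64-encoded exported key."""
    _prefix, _version, payload = ciphertext.split(":", 2)
    raw = base64.b64decode(payload)
    key = base64.b64decode(exported_key_b64)
    nonce, sealed = raw[:NONCE_LEN], raw[NONCE_LEN:]
    # 'sealed' is the encrypted data followed by the GCM tag; decrypt()
    # verifies the tag and raises InvalidTag on any mismatch.
    # Assumption: no additional authenticated data (AAD) was used.
    return AESGCM(key).decrypt(nonce, sealed, None)
```

With the export above, the second argument would be the base64 string listed under key version 1.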
Thanks @ixe013, this works for AES256-GCM with non-convergent encryption. I’m struggling to make it work in Python with convergent encryption enabled.
I keep getting cryptography.exceptions.InvalidTag, which suggests that the nonce/IV or the key is incorrect. Have you tried this, or successfully gotten it working?
import base64
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.backends import default_backend
backend = default_backend()
ciphertext = base64.b64decode('72CrTbu2j0++F1SfDkr2JVSqlK5lU0frqm0qWROOcvzv8V1K4nc6P62mCw==')
iv = ciphertext[:96 // 8]  # First 96 bits (12 bytes)
actual_ciphertext = ciphertext[96 // 8:]  # Remaining bytes
aad = None
password_bytes = base64.b64decode('/FVNsIbURbaNDxBEXxK4HRwCuT7xHqZ07Ji0cZwcPT0=')
hkdf = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=b'',  # No salt; HKDF then behaves as if the salt were all zeros
    info=b'1234',  # The context used for the convergent encryption
    backend=backend
)
key = hkdf.derive(password_bytes)
plaintext = AESGCM(key).decrypt(iv, actual_ciphertext, aad)
print(plaintext)
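One way to narrow down an InvalidTag like this is to check the key-derivation step on its own. Below is a minimal standard-library HKDF-SHA256 following RFC 5869, which can be compared against the output of `hkdf.derive(...)` above. The function name `hkdf_sha256` is made up for illustration. This only verifies the HKDF arithmetic. How Vault actually derives the key and nonce for convergent encryption is not confirmed here, including whether the context is used as raw bytes or base64-decoded first.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand to 'length' bytes."""
    hash_len = hashlib.sha256().digest_size
    if not salt:
        # RFC 5869: a missing salt is treated as hash_len zero bytes.
        salt = bytes(hash_len)
    # Extract: PRK = HMAC(salt, IKM)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]
```

If `hkdf_sha256(password_bytes, b'1234')` matches the derived `key`, the HKDF call itself is behaving as expected, and the mismatch is more likely in the inputs (context, nonce) than in the library.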