Init and auto-unseal of Vault on AWS using awskms - nodes not booting/starting

I am attempting to automate the deployment of Vault on Kubernetes in AWS using EKS. Terraform provisions the Kubernetes cluster, the storage, and a KMS key to be used for unseal. Once the workers are deployed, we stand up various services using Helm.

The three Vault nodes are not booting up completely because of a couple of errors. First, we get an access denied error on the KMS key: an assumed role is not authorized to call kms:DescribeKey. I have no idea where this assumed role came from; it does not exist on our systems and is very different from the service account defined in the Helm manifests. Here is the error message:

Error parsing Seal configuration: error fetching AWS KMS wrapping key information: AccessDeniedException: User: arn:aws:sts::##############:assumed-role/eks-imply.<my account>/i-0a5ec9371c6d6eee0 is not authorized to perform: kms:DescribeKey on resource: arn:aws:kms:us-west-2:############:key/16bad6d5-19fd-4dc3-a4dd-4cfc6b9f63eb because no identity-based policy allows the kms:DescribeKey action
	status code: 400, request id: c5bb43ff-f527-4a84-a593-35f83d0d7d8

We are also seeing errors when a node tries to rejoin the cluster after a single pod is deleted (even though it never comes up all the way). It cannot find the TLS config it needs to communicate with the leader node. These files are not created since none of the nodes come up. We are also unable to exec into the pods for more detailed troubleshooting.

Failed to initiate raft retry join, "failed to create tls config to communicate with leader node (retry_join index: 0): failed to read CA file: open /vault/userconfig/tls-ca/ca.crt: no such file or directory"
2022-04-14T18:44:15.745Z [WARN]  storage.raft.fsm: raft FSM db file has wider permissions than needed: needed=-rw------- existing=-rw-rw----

Any recommendations, suggestions or questions are welcome. Thanks in advance.

I know it’s been a bit since you posted this, but…

Can you share more about the KMS key you created? Specifically, the key policy, and the IAM role policy you attach to the node groups or to the service account Vault uses.

My key and key policy look like this:

resource "aws_kms_key" "vault" {
  description = "Vault unseal key"
  policy      = data.aws_iam_policy_document.vault_key_policy.json

  tags = module.label.tags
}

data "aws_iam_policy_document" "vault_key_policy" {
  # Copy of default KMS policy that lets you manage it
  statement {
    sid = "Enable IAM User Permissions"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type        = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root",
      ]
    }
  }

  # Required for EKS
  statement {
    sid = "Allow service-linked Vault role use of the CMK"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey"
    ]
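    # In a key policy "*" means "this key". Referencing aws_kms_key.vault.arn
    # here would create a dependency cycle with the key's own policy argument.
    resources = ["*"]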

    principals {
      type = "AWS"
      identifiers = [
        aws_iam_role.eks-vault-role.arn
      ]
    }
  }
}

and the service policy

resource "aws_iam_role_policy" "vault-auto-unseal-policy" {
  name   = "${title(var.name)}EKS_Vault_Auto_Unseal_RolePolicy"
  policy = data.aws_iam_policy_document.vault_auto_unseal.json
  # aws_iam_role_policy expects the role name here, not its ARN
  role   = aws_iam_role.eks-vault-role.name
}

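data "aws_iam_policy_document" "vault_auto_unseal" {
  # KMS actions Vault needs for auto-unseal, scoped to the unseal key
  statement {
    effect = "Allow"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:DescribeKey",
    ]
    resources = [aws_kms_key.vault.arn]
  }

  # ec2:DescribeInstances does not support resource-level permissions,
  # so it has to be granted on "*" rather than on the key ARN
  statement {
    effect    = "Allow"
    actions   = ["ec2:DescribeInstances"]
    resources = ["*"]
  }
}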
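
The role itself (aws_iam_role.eks-vault-role) is not shown in these snippets. For what it's worth, the principal in your error, assumed-role/eks-imply.<my account>/i-0a5ec9371c6d6eee0, has an EC2 instance ID as its session name, which suggests the pod is picking up the worker node's instance profile rather than a role bound to the Vault service account. Below is a minimal sketch of a role the service account could assume, assuming you use IAM roles for service accounts (IRSA). The two variables, the role name, and the "vault" namespace and service account name are placeholders I made up, not values from either setup.

```hcl
# Assumptions: var.oidc_provider_arn and var.oidc_provider_url describe the
# cluster's IAM OIDC provider (URL without the https:// prefix), and Vault
# runs as service account "vault" in namespace "vault".
variable "oidc_provider_arn" { type = string }
variable "oidc_provider_url" { type = string }

data "aws_iam_policy_document" "vault_assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    # Trust the cluster's OIDC provider instead of the EC2 service
    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    # Only tokens issued for the Vault service account may assume the role
    condition {
      test     = "StringEquals"
      variable = "${var.oidc_provider_url}:sub"
      values   = ["system:serviceaccount:vault:vault"]
    }
  }
}

resource "aws_iam_role" "eks-vault-role" {
  name               = "eks-vault-role" # placeholder name
  assume_role_policy = data.aws_iam_policy_document.vault_assume_role.json
}
```

With a role like this, the Vault service account also needs the eks.amazonaws.com/role-arn annotation pointing at the role's ARN (in the Vault Helm chart that goes under server.serviceAccount.annotations). Without it the pods fall back to the node's credentials, which would match the error you are seeing.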

Finally, how are you mounting your CA secret into the Vault pods? I'm assuming you are not using auto-generated certificates, but either way:

failed to read CA file: open /vault/userconfig/tls-ca/ca.crt: no such file or directory

leads me to believe this may not be happening.
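
For comparison, here is a rough sketch of one way to get the CA file to that path, assuming you drive Helm from Terraform and already have the CA certificate on disk. The "vault" namespace, the local certs/ca.crt path, and the resource names are placeholders of mine; only the secret name tls-ca is taken from the path in your error.

```hcl
# Hypothetical names: the "vault" namespace and the local ca.crt location
# are assumptions, not taken from the original setup.
resource "kubernetes_secret" "vault_tls_ca" {
  metadata {
    name      = "tls-ca"
    namespace = "vault"
  }

  data = {
    "ca.crt" = file("${path.module}/certs/ca.crt")
  }
}

resource "helm_release" "vault" {
  name       = "vault"
  namespace  = "vault"
  repository = "https://helm.releases.hashicorp.com"
  chart      = "vault"

  values = [yamlencode({
    server = {
      # The chart mounts each extraVolumes entry at /vault/userconfig/<name>,
      # so this secret shows up as /vault/userconfig/tls-ca/ca.crt
      extraVolumes = [
        { type = "secret", name = kubernetes_secret.vault_tls_ca.metadata[0].name }
      ]
    }
  })]
}
```

If the secret does not exist in the same namespace as the Vault pods, or the extraVolumes entry is missing, the file named in your retry_join config will not be there, which is what the log line says.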

We eventually got to this same point. A few items were overlooked in the initial attempt.

Thank you.