Issues using Vault 0.10.4 and Consul 1.8

Importing from GitHub #8448.

I am trying to set up Vault as the CA provider in my Consul Connect setup, but Consul does not pick up the configuration specified for Vault as the CA provider.

It falls back to the default Consul CA provider, and it does not show any error messages in the Consul server logs.

connect configuration:

   "connect": {
        "enabled": true
        "ca_provider": "vault"
        "ca_config": {
            "address": "<vault_address>"
            "token": "<vault_root_token_for_demo>"
            "root_pki_path": "/consul-connect-demo-root"
            "intermediate_pki_path": "/consul-connect-demo-intermediate"
            "TLSSkipVerify": "true"
    }
},

log snippet:

    2020-08-06T18:18:48.158Z [INFO]  agent.server.connect: initialized primary datacenter CA with provider: provider=consul
    2020-08-06T18:18:48.158Z [INFO]  agent.leader: started routine: routine="federation state anti-entropy"
    2020-08-06T18:18:48.158Z [INFO]  agent.leader: started routine: routine="federation state pruning"
    2020-08-06T18:18:48.158Z [INFO]  agent.leader: started routine: routine="CA root pruning"

Operating system: CentOS 7

Consul version: 1.8.0
Vault version: 0.10.4

What am I missing?

Regards,
Ashwin

Hi Ashwin -

Thanks for posting, and welcome to the forums. I copied over your question, as this may get more traction here.

  • Can you please post the command you are using to start Consul?
  • What configuration file and command are you using for Vault?
  • Do the “Connect CA” messages continue after restarting the servers?

Best,
Jono

Hi @jsosulska,

Following are my answers:

Q. Can you please post the command you are using to start Consul?
A. consul agent -config-dir=consul.hcl

Q. What configuration file and command are you using for Vault?
A.
consul configuration
    {
        "bootstrap_expect": 1,
        "server": true,
        "disable_update_check": true,
        "ui": true,
        "datacenter": "dc1",
        "data_dir": "<data_dir>",
        "log_level": "INFO",
        "encrypt": "<encrypt_key>",
        "ports": {
            "http": 8500,
            "dns": 8600,
            "serf_lan": 8301,
            "serf_wan": 8302
        },
        "client_addr": "0.0.0.0",
        "raft_protocol": 2,
        "performance": {
            "raft_multiplier": 1,
            "leave_drain_time": "20s",
            "rpc_hold_timeout": "20s"
        },
        "connect": {
            "enabled": true
            "ca_provider": "vault"
            "ca_config": {
                "address": "<vault_address>"
                "token": "<vault_root_token_for_demo>"
                "root_pki_path": "/consul-connect-demo-root"
                "intermediate_pki_path": "/consul-connect-demo-intermediate"
                "TLSSkipVerify": "true"
            }
        },
        "node_name": "<consul_server>",
        "advertise_addr": "<consul_server_address>",
        "retry_join": ["<consul_server>"]
    }

vault configuration

    ui = true

    pid_file = "<path_to_pid_location>/vault.pid"
    listener "tcp" {
      address = "0.0.0.0:<vault_port>"
      tls_disable = "false"
      tls_cert_file = "<path_to_cert_file>"
      tls_key_file = "<path_to_key_file>"
      tls_disable_client_certs = "true"
    }
    api_addr = "https://<vault_node_name>:<vault_port>"
    storage "consul" {
      address = "localhost:<consul_http_port>"
      path = "<vault_data_path>"
      consistency_mode = "default"
      max_parallel = "128"
      service = "vault"
      scheme = "http"
      token = ""
    }

Q. Do the “Connect CA” messages continue after restarting the servers?
A. Yes, they continue even after restarting the servers. I even tried deleting the data folder and starting fresh, but it still shows Consul as the CA provider. I verified the root certificate from the /v1/connect/ca/roots endpoint as well.
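
One way to double-check which provider the servers actually loaded is to read the CA configuration endpoint (/v1/connect/ca/configuration) instead of inspecting the roots. The following is a minimal sketch, assuming an agent reachable at http://127.0.0.1:8500 with no ACL token required; the helper names and the sample response body are hypothetical and only for illustration.

```python
import json
import urllib.request


def provider_from_response(body: str) -> str:
    """Extract the provider name from a /v1/connect/ca/configuration response body."""
    return json.loads(body)["Provider"]


def fetch_active_provider(consul_addr: str = "http://127.0.0.1:8500") -> str:
    """Ask a running agent which CA provider is active.

    The default address is an assumption; adjust it (and add an ACL token
    header if your cluster needs one) for a real deployment.
    """
    url = f"{consul_addr}/v1/connect/ca/configuration"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return provider_from_response(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with a trimmed, made-up response body.
    sample = '{"Provider": "consul", "Config": {}}'
    print(provider_from_response(sample))
```

If this reports consul even though the agent config asks for vault, the connect block was most likely never applied.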

Regards,
Ashwin

Hi @ashwinkupatkar,

I had a similar issue with the CA configuration not being updated properly. I had to “helm delete” the chart and manually delete the persistent volume claims from Consul so that the changes were reflected on the next installation.

Hi @seguidor777,

I am deploying the Consul server on a VM, and the Consul clients are running inside k8s and on VM nodes.

Okay, I suggest you try a fresh install after deleting the persistent volumes; that worked for me.

Cheers

I tried deleting the data directory of the Consul cluster and starting fresh, but no luck.

I thought you were using persistent volumes. BTW, check whether there is anything else you can delete so that your configuration is updated correctly. You can also try calling the Consul Connect API to update the CA provider on the fly.

For Vault, the documentation states that Consul will mount and configure the following paths on its own if they do not exist:

root_pki_path
intermediate_pki_path

  • RootPKIPath / root_pki_path ( string: <required> ) - The path to a PKI secrets engine for the root certificate. If the path doesn’t exist, Consul will attempt to mount and configure this automatically.

  • IntermediatePKIPath / intermediate_pki_path ( string: <required> ) - The path to a PKI secrets engine for the generated intermediate certificate. This certificate will be signed by the configured root PKI path. If this path doesn’t exist, Consul will attempt to mount and configure this automatically.

Hey there @ashwinkupatkar - Have you tried setting up your topology using this guide on VM+K8s deployments? Let me know if that works for you!

As for the Vault information - when you launch Consul, do you see those paths created in Vault at all?

Also - I have heard from several people that this is a pain point, and there is content coming that should help alleviate it. I’ll be sure to post it here and in the Consul/Vault forums when it launches.

Hi, is it possible you’re making this change after the cluster is running? Once the cluster is running, it ignores changes to any of these config values and you must use the API (https://www.consul.io/api/connect/ca#update-ca-configuration) or CLI (https://www.consul.io/docs/commands/connect/ca#set-config) commands to update it.
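
As a rough illustration of the API route, the Python sketch below builds (but does not send) a PUT request for the update-CA-configuration endpoint. The payload shape (Provider plus a Config object with Address, Token, RootPKIPath and IntermediatePKIPath) follows the linked API documentation as I read it; the function name, the local agent address and the placeholder values are assumptions made for the example.

```python
import json
import urllib.request


def build_ca_update_request(consul_addr, vault_addr, vault_token,
                            root_pki_path, intermediate_pki_path):
    """Build a PUT request that switches the Connect CA provider to Vault.

    Hypothetical helper: it only constructs the request so it can be
    inspected; nothing is sent until urlopen() is called on it.
    """
    payload = {
        "Provider": "vault",
        "Config": {
            "Address": vault_addr,
            "Token": vault_token,
            "RootPKIPath": root_pki_path,
            "IntermediatePKIPath": intermediate_pki_path,
        },
    }
    return urllib.request.Request(
        url=f"{consul_addr}/v1/connect/ca/configuration",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # All values below are placeholders, not real addresses or tokens.
    req = build_ca_update_request(
        "http://127.0.0.1:8500",
        "<vault_address>",
        "<vault_token>",
        "consul-connect-demo-root",
        "consul-connect-demo-intermediate",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply it against a real cluster: urllib.request.urlopen(req)
```

The same JSON body, saved to a file, should also work with the set-config CLI command linked above.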

I just tested this. config.json:

    {
        "connect": {
            "enabled": true,
            "ca_provider": "vault",
            "ca_config": {
                "address": "<vault_address>",
                "token": "<vault_root_token_for_demo>",
                "root_pki_path": "/consul-connect-demo-root",
                "intermediate_pki_path": "/consul-connect-demo-intermediate",
                "TLSSkipVerify": "true"
            }
        }
    }

    consul agent -server -config-file config.json -data-dir ./data -bootstrap-expect=1 | grep agent.server
    BootstrapExpect is set to 1; this is the same as Bootstrap mode.
    bootstrap = true: do not enable unless necessary
    2020-08-10T14:36:54.648-0700 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:e772c47a-d5e4-85ad-1272-71430e179296 Address:10.0.1.4:8300}]"
    2020-08-10T14:36:54.649-0700 [INFO]  agent.server.raft: entering follower state: follower="Node at 10.0.1.4:8300 [Follower]" leader=
    2020-08-10T14:36:54.650-0700 [INFO]  agent.server.serf.wan: serf: EventMemberJoin: Lukes-MacBook-Pro-2.local.dc1 10.0.1.4
    2020-08-10T14:36:54.651-0700 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: Lukes-MacBook-Pro-2.local 10.0.1.4
    2020-08-10T14:36:54.651-0700 [INFO]  agent.server: Adding LAN server: server="Lukes-MacBook-Pro-2.local (Addr: tcp/10.0.1.4:8300) (DC: dc1)"
    2020-08-10T14:36:54.652-0700 [INFO]  agent.server: Handled event for server in area: event=member-join server=Lukes-MacBook-Pro-2.local.dc1 area=wan
    2020-08-10T14:37:02.432-0700 [WARN]  agent.server.raft: heartbeat timeout reached, starting election: last-leader=
    2020-08-10T14:37:02.432-0700 [INFO]  agent.server.raft: entering candidate state: node="Node at 10.0.1.4:8300 [Candidate]" term=2
    2020-08-10T14:37:02.466-0700 [INFO]  agent.server.raft: election won: tally=1
    2020-08-10T14:37:02.466-0700 [INFO]  agent.server.raft: entering leader state: leader="Node at 10.0.1.4:8300 [Leader]"
    2020-08-10T14:37:02.466-0700 [INFO]  agent.server: cluster leadership acquired
    2020-08-10T14:37:02.467-0700 [INFO]  agent.server: New leader elected: payload=Lukes-MacBook-Pro-2.local
    2020-08-10T14:37:02.487-0700 [INFO]  agent.server: Cannot upgrade to new ACLs: leaderMode=0 mode=0 found=true leader=10.0.1.4:8300
    2020-08-10T14:37:02.498-0700 [INFO]  agent.server: Created the builtin namespace: namespace=default
    2020-08-10T14:37:02.556-0700 [ERROR] agent.server: failed to establish leadership: error="error generating CA root certificate: Get %3Cvault_address%3E/v1/consul-connect-demo-root/ca/pem: unsupported protocol scheme """

You can see it trying to talk to vault.

Hi @lkysow,

I tried the exact same command that you used.

I even tried with the latest version of Vault (1.4), but the CA provider still remains Consul.

All of the above setups were tried from scratch (that is, with no existing data directory).

@jsosulska, can you please update the Vault version from 10.4 to 0.10.4 in the ticket title?

Thanks,
Ashwin

Hi @jsosulska,

I do not see those paths created in Vault when I log in to Vault.

Can you run this and paste the output for me please:

cd /tmp
mkdir -p consul-test/data
cd consul-test
cat <<EOF > config.json
{"connect": {
        "enabled": true,
        "ca_provider": "vault",
        "ca_config": {
            "address": "<vault_address>",
            "token": "<vault_root_token_for_demo>",
            "root_pki_path": "/consul-connect-demo-root",
            "intermediate_pki_path": "/consul-connect-demo-intermediate",
            "TLSSkipVerify": "true"
    }
}}
EOF
consul agent -server -config-file config.json -data-dir ./data -bootstrap-expect=1|grep agent.server

Can you also show the output of consul version?

Hi @lkysow,

Thank you so much. After comparing the configs, I realised that I was missing the commas after the ca_config attributes. After fixing that, Vault shows up as the CA provider in the logs and the corresponding paths are created in Vault.

One thing to improve here: when I specify the config in a .hcl file, it gives no errors in the log. However, if I rename the config to a .json file, it gives me the errors, and that is when we get to know that there are parsing issues.

This is something .hcl formatted files should provide as well: proper error logging, if there are any errors.
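
A missing delimiter like the one described above can be caught before the agent is started. The following is a minimal sketch, assuming the config is meant to be plain JSON; it uses only Python's standard json module, says nothing about how Consul itself parses .hcl files, and the function name and the two sample strings are made up for illustration.

```python
import json


def check_json_config(text):
    """Return None if text parses as JSON, else a short description of the first syntax error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Trimmed-down version of a connect block with the commas left out.
BROKEN = """{"connect": {
    "enabled": true
    "ca_provider": "vault"
}}"""

# The same block with the commas in place.
FIXED = """{"connect": {
    "enabled": true,
    "ca_provider": "vault"
}}"""

if __name__ == "__main__":
    for name, text in (("broken", BROKEN), ("fixed", FIXED)):
        problem = check_json_config(text)
        print(name, "->", problem if problem else "ok")
```

Running a check like this against the config file before renaming or restarting anything points at the offending line directly.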

Thanks much @lkysow !

Regards,
Ashwin

Hi @ashwinkupatkar - Happy that this could get resolved! Updated the title so others can find it.
