Hi, we have a Vault server set up on EC2 instances.
One of the EC2 instances was replaced with a new one by the Auto Scaling group (ASG) after an EC2 health check failure.
We then found that the new EC2 instance would not join the cluster no matter what, although it is detected on the old EC2 instance.
vault status on the new instance:
Key                      Value
---                      -----
Recovery Seal Type       awskms
Initialized              true
Sealed                   true
Total Recovery Shares    0
Threshold                0
Unseal Progress          0/0
Unseal Nonce             n/a
Version                  1.11.2
Build Date               2022-07-29T09:48:47Z
Storage Type             raft
HA Enabled               true
vault status on the old instance:
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   false
Total Recovery Shares    5
Threshold                3
Version                  1.11.2
Build Date               2022-07-29T09:48:47Z
Storage Type             raft
Cluster Name             vault-cluster-8e51cf81
Cluster ID               4813eebf-4254-d876-036d-d910ee0e65a2
HA Enabled               true
HA Cluster               https://10.11.36.88:8201
HA Mode                  active
Active Since             2022-09-06T17:41:28.706117295Z
Raft Committed Index     2737746
Raft Applied Index       2737745
We then tried deleting the Vault data in the old instance.
The new instance tries to join the old instance, but shows this error:
Sep 06 18:11:46 ip-10-11-10-243 vault[17305]: 2022-09-06T18:11:46.423Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:11:46 ip-10-11-10-243 vault[17305]: 2022-09-06T18:11:46.424Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
On the leader (old instance):
Sep 06 18:18:06 ip-10-11-36-88 vault[22523]: 2022-09-06T18:18:06.573Z [ERROR] storage.raft: failed to heartbeat to: peer=10.11.10.243:8201 error="dial tcp 10.11.10.243:8201: connect: connection refused"
Sep 06 18:18:07 ip-10-11-36-88 vault[22523]: 2022-09-06T18:18:07.458Z [ERROR] storage.raft: failed to make requestVote RPC: target="{Voter 713bd1bc-1b5a-e013-2ba6-e64f23a37ca9 10.11.17.229:8201}" error="dial tcp 10.11.17.229:8201: connect: connection refused"
We have the config below:
disable_mlock = true
ui            = true
api_addr      = "https://vault-server.internal-domain.io:8200"
cluster_addr  = "https://{{ GetPrivateIP }}:8201"

listener "tcp" {
  address                            = "[::]:8200"
  cluster_address                    = "[::]:8201"
  tls_disable                        = "false"
  tls_client_ca_file                 = "/opt/vault/tls/ca.crt"
  tls_cert_file                      = "/opt/vault/tls/tls.crt"
  tls_key_file                       = "/opt/vault/tls/tls.key"
  tls_require_and_verify_client_cert = "true"
  proxy_protocol_behavior            = "allow_authorized"
  proxy_protocol_authorized_addrs = [
    "10.0.0.0/8"
  ]
}

storage "raft" {
  path = "/opt/vault/data"

  retry_join {
    auto_join               = "provider=\"aws\" region=\"us-west-2\" tag_key=\"retry_join\" tag_value=\"vault-server-34141-001\" addr_type=\"private_v4\""
    auto_join_port          = 8200
    auto_join_scheme        = "https"
    leader_tls_servername   = "vault"
    leader_ca_cert_file     = "/opt/vault/tls/ca.crt"
    leader_client_cert_file = "/opt/vault/tls/tls.crt"
    leader_client_key_file  = "/opt/vault/tls/tls.key"
  }

  autopilot {
    cleanup_dead_servers           = "true"
    last_contact_threshold         = "200ms"
    last_contact_failure_threshold = "10m"
    max_trailing_logs              = 250
    min_quorum                     = 3
    server_stabilization_time      = "60s"
  }
}

seal "awskms" {
  region     = "us-west-2"
  kms_key_id = "alias/vault-server-kms-key"
}

telemetry {
  prometheus_retention_time = "30s"
  disable_hostname          = true
}
Is there anything I can do to get the new node to join the leader (old instance)?
The new instance already tries to join the old instance / cluster, but the unseal always fails:
Sep 06 18:19:03 ip-10-11-10-243 vault[17350]: 2022-09-06T18:19:03.499Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:19:03 ip-10-11-10-243 vault[17350]: 2022-09-06T18:19:03.499Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Full log from the new instance:
Sep 06 18:40:02 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:02.954Z [INFO] storage.raft: creating Raft: config="&raft.Config{ProtocolVersion:3, HeartbeatTimeout:15000000000, ElectionTimeout:15000000000, CommitTimeout:50000000, MaxAppendEntries:64, BatchApplyCh:true, ShutdownOnRemove:true, TrailingLogs:0x2800, SnapshotInterval:120000000000, SnapshotThreshold:0x2000, LeaderLeaseTimeout:2500000000, LocalID:\"5f917b7c-e84d-8b5e-825f-0370afb2b993\", NotifyCh:(chan<- bool)(0x4000bc0310), LogOutput:io.Writer(nil), LogLevel:\"DEBUG\", Logger:(*hclog.interceptLogger)(0x4000c35770), NoSnapshotRestoreOnStart:true, skipStartup:false}"
Sep 06 18:40:02 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:02.955Z [INFO] storage.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:88592729-6d11-69ff-d47a-db87809717f7 Address:10.11.36.88:8201} {Suffrage:Voter ID:713bd1bc-1b5a-e013-2ba6-e64f23a37ca9 Address:10.11.17.229:8201} {Suffrage:Voter ID:e836b268-a1e3-2fb0-22f0-0cf740d55c2c Address:10.11.7.133:8201} {Suffrage:Nonvoter ID:5f917b7c-e84d-8b5e-825f-0370afb2b993 Address:10.11.10.243:8201}]"
Sep 06 18:40:02 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:02.955Z [INFO] core: successfully joined the raft cluster: leader_addr=https://10.11.36.88:8200
Sep 06 18:40:02 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:02.955Z [INFO] storage.raft: entering follower state: follower="Node at 10.11.10.243:8201 [Follower]" leader-address= leader-id=
Sep 06 18:40:03 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:03.121Z [WARN] storage.raft: failed to get previous log: previous-index=2745136 last-index=1 error="log not found"
Sep 06 18:40:05 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:05.182Z [INFO] http: TLS handshake error from 10.11.59.81:5497: EOF
Sep 06 18:40:06 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:06.481Z [WARN] storage.raft: failed to get previous log: previous-index=2739064 last-index=1 error="log not found"
Sep 06 18:40:06 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:06.699Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:40:06 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:06.699Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Sep 06 18:40:11 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:11.700Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:40:11 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:11.700Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Sep 06 18:40:12 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:12.070Z [INFO] http: TLS handshake error from 10.11.7.91:2938: EOF
Sep 06 18:40:12 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:12.918Z [INFO] http: TLS handshake error from 10.11.7.91:49652: EOF
Sep 06 18:40:15 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:15.181Z [INFO] http: TLS handshake error from 10.11.59.81:20852: EOF
Sep 06 18:40:16 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:16.701Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:40:16 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:16.701Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Sep 06 18:40:21 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:21.702Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:40:21 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:21.702Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Sep 06 18:40:22 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:22.070Z [INFO] http: TLS handshake error from 10.11.7.91:28835: EOF
Sep 06 18:40:22 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:22.918Z [INFO] http: TLS handshake error from 10.11.7.91:4524: EOF
Sep 06 18:40:25 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:25.182Z [INFO] http: TLS handshake error from 10.11.59.81:46091: EOF
Sep 06 18:40:26 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:26.703Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:40:26 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:26.703Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Sep 06 18:40:31 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:31.704Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 18:40:31 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:31.705Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Sep 06 18:40:32 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:32.070Z [INFO] http: TLS handshake error from 10.11.7.91:57071: EOF
Sep 06 18:40:32 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:32.918Z [INFO] http: TLS handshake error from 10.11.7.91:18522: EOF
Sep 06 18:40:35 ip-10-11-10-243 vault[17422]: 2022-09-06T18:40:35.182Z [INFO] http: TLS handshake error from 10.11.59.81:35948: EOF
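To make the membership in the "initial configuration" line above easier to read, here is a minimal Python sketch that splits the servers list into one entry per peer. The helper name parse_raft_servers and the shortened sample string are only for illustration; the regular expression assumes the exact log layout shown above and has not been checked against other Vault versions.

```python
import re

# Shortened sample taken from the "initial configuration" log line above
# (two of the four servers, to keep the example brief).
LINE = (
    'servers="[{Suffrage:Voter ID:88592729-6d11-69ff-d47a-db87809717f7 '
    'Address:10.11.36.88:8201} '
    '{Suffrage:Nonvoter ID:5f917b7c-e84d-8b5e-825f-0370afb2b993 '
    'Address:10.11.10.243:8201}]"'
)

# One match per "{Suffrage:... ID:... Address:...}" group.
SERVER_RE = re.compile(
    r"\{Suffrage:(\w+) ID:([0-9a-f-]+) Address:([\d.]+:\d+)\}"
)


def parse_raft_servers(line):
    """Return a list of dicts describing each raft server in the log line."""
    return [
        {"suffrage": suffrage, "id": node_id, "address": address}
        for suffrage, node_id, address in SERVER_RE.findall(line)
    ]


if __name__ == "__main__":
    for server in parse_raft_servers(LINE):
        print(server["suffrage"], server["address"], server["id"])
```

Run against the full line, this lists three Voter entries and the new instance (10.11.10.243:8201) as a Nonvoter.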
Update:
- After replacing the instance with a new one again, I found interesting logs suggesting that it attempts the raft challenge against itself (this is my assumption, based on the logs below).
I believe it needs to send the challenge to the cluster leader's IP address instead. Can anyone point out what is wrong with my configuration?
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:41.152Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://10.11.36.88:8200
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:41.153Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://10.11.13.5:8200
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:41.153Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://10.11.17.229:8200
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:41.159Z [ERROR] core: failed to get raft challenge: leader_addr=https://10.11.13.5:8200
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: error=
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: | error during raft bootstrap init call: Error making API request.
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: |
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: | URL: PUT https://10.11.13.5:8200/v1/sys/storage/raft/bootstrap/challenge
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: | Code: 503. Errors:
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: |
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: | * Vault is sealed
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]:
Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:41.182Z [INFO] http: TLS handshake error from 10.11.7.91:57983: EOF
Sep 06 20:20:43 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:43.582Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 20:20:43 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:43.583Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
Sep 06 20:20:45 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:45.617Z [INFO] http: TLS handshake error from 10.11.7.91:60729: EOF
Sep 06 20:20:48 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:48.584Z [INFO] core: stored unseal keys supported, attempting fetch
Sep 06 20:20:48 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:48.584Z [WARN] failed to unseal core: error="stored unseal keys are supported, but none were found"
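To check the self-join assumption against a longer log, here is a small Python sketch that flags every join attempt whose leader_addr equals the node's own IP. It assumes the journald hostname encodes the private IP (as in ip-10-11-13-5), which may not hold on hosts with a custom hostname. The helper name self_join_attempts is hypothetical. The node presumably shows up as a candidate because it carries the same retry_join tag as the other members, but that is a guess and not confirmed here.

```python
import re

# Captures the host IP from the "ip-A-B-C-D" hostname and the candidate
# leader IP from the "attempting to join" message.
JOIN_RE = re.compile(
    r"ip-(\d+-\d+-\d+-\d+) vault\[\d+\]: .*"
    r"attempting to join possible raft leader node: "
    r"leader_addr=https://([\d.]+):\d+"
)


def self_join_attempts(log_lines):
    """Return the leader IPs of join attempts that target the local node."""
    hits = []
    for line in log_lines:
        match = JOIN_RE.search(line)
        if match and match.group(1).replace("-", ".") == match.group(2):
            hits.append(match.group(2))
    return hits


# Two of the join attempts copied from the log above.
SAMPLE = [
    "Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:41.152Z [INFO] "
    "core: attempting to join possible raft leader node: "
    "leader_addr=https://10.11.36.88:8200",
    "Sep 06 20:20:41 ip-10-11-13-5 vault[1762]: 2022-09-06T20:20:41.153Z [INFO] "
    "core: attempting to join possible raft leader node: "
    "leader_addr=https://10.11.13.5:8200",
]

if __name__ == "__main__":
    print(self_join_attempts(SAMPLE))
```

Only the second sample line is flagged, which matches the address in the failed raft challenge (https://10.11.13.5:8200).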