Single node consul fails to start

I’m working on a project that uses Vault and Consul - a home-lab sort of setup.

I’m trying to get Consul working. The configuration I came up with is, for some reason, not working. It’s a single instance of Consul. The config, stripped down, is:

data_dir = "/opt/consul"
client_addr = "127.0.0.1 192.168.1.120"
ui_config {
  enabled = true
}
server = true
bind_addr = "127.0.0.1"     # Listen on localhost
bind_addr = "192.168.1.120" # Listen on LAN
bootstrap_expect = 1

When I start the service via systemd it just hangs, eventually failing:

# systemctl start consul
Job for consul.service failed because a timeout was exceeded.
See "systemctl status consul.service" and "journalctl -xeu consul.service" for details.

If I look in the syslog file I see a number of things:

2024-10-09T10:06:10.573889-04:00 controller systemd[1]: Starting consul.service - "HashiCorp Consul - A service mesh solution"...
2024-10-09T10:06:10.795452-04:00 controller consul[1025613]: ==> Starting Consul agent...
2024-10-09T10:06:10.796599-04:00 controller consul[1025613]:                Version: '1.19.2'
2024-10-09T10:06:10.798401-04:00 controller consul[1025613]:             Build Date: '2024-08-27 16:06:44 +0000 UTC'
2024-10-09T10:06:10.799390-04:00 controller consul[1025613]:                Node ID: 'd5eb6879-32e3-0349-75e4-0e47acec4d71'
2024-10-09T10:06:10.799555-04:00 controller consul[1025613]:              Node name: 'controller'
2024-10-09T10:06:10.799632-04:00 controller consul[1025613]:             Datacenter: 'dc1' (Segment: '<all>')
2024-10-09T10:06:10.799701-04:00 controller consul[1025613]:                 Server: true (Bootstrap: true)
2024-10-09T10:06:10.799765-04:00 controller consul[1025613]:            Client Addr: [127.0.0.1 192.168.1.120] (HTTP: 8500, HTTPS: -1, gRPC: -1, gRPC-TLS: 8503, DNS: 8600)
2024-10-09T10:06:10.799849-04:00 controller consul[1025613]:           Cluster Addr: 192.168.1.120 (LAN: 8301, WAN: 8302)
2024-10-09T10:06:10.802665-04:00 controller consul[1025613]:      Gossip Encryption: false
2024-10-09T10:06:10.802793-04:00 controller consul[1025613]:       Auto-Encrypt-TLS: false
2024-10-09T10:06:10.802880-04:00 controller consul[1025613]:            ACL Enabled: false
2024-10-09T10:06:10.802928-04:00 controller consul[1025613]:      Reporting Enabled: false
2024-10-09T10:06:10.802971-04:00 controller consul[1025613]:     ACL Default Policy: allow
2024-10-09T10:06:10.803017-04:00 controller consul[1025613]:              HTTPS TLS: Verify Incoming: false, Verify Outgoing: false, Min Version: TLSv1_2
2024-10-09T10:06:10.803077-04:00 controller consul[1025613]:               gRPC TLS: Verify Incoming: false, Min Version: TLSv1_2
2024-10-09T10:06:10.803126-04:00 controller consul[1025613]:       Internal RPC TLS: Verify Incoming: false, Verify Outgoing: false (Verify Hostname: false), Min Version: TLSv1_2
2024-10-09T10:06:10.803182-04:00 controller consul[1025613]: ==> Log data will now stream in as it occurs:
2024-10-09T10:06:10.803236-04:00 controller consul[1025613]: 2024-10-09T10:06:10.790-0400 [WARN]  agent: skipping file /etc/consul.d/consul.env, extension must be .hcl or .json, or config format must be set
2024-10-09T10:06:10.803309-04:00 controller consul[1025613]: 2024-10-09T10:06:10.791-0400 [WARN]  agent: skipping file /etc/consul.d/consul.hcl-orig, extension must be .hcl or .json, or config format must be set
2024-10-09T10:06:10.803367-04:00 controller consul[1025613]: 2024-10-09T10:06:10.791-0400 [WARN]  agent: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2024-10-09T10:06:10.803453-04:00 controller consul[1025613]: 2024-10-09T10:06:10.791-0400 [WARN]  agent: bootstrap = true: do not enable unless necessary
2024-10-09T10:06:10.816329-04:00 controller consul[1025613]: 2024-10-09T10:06:10.816-0400 [WARN]  agent.auto_config: skipping file /etc/consul.d/consul.env, extension must be .hcl or .json, or config format must be set
2024-10-09T10:06:10.817266-04:00 controller consul[1025613]: 2024-10-09T10:06:10.817-0400 [WARN]  agent.auto_config: skipping file /etc/consul.d/consul.hcl-orig, extension must be .hcl or .json, or config format must be set
(...)
2024-10-09T10:06:19.549385-04:00 controller consul[1025613]: 2024-10-09T10:06:19.548-0400 [INFO]  agent.leader: stopped routine: routine="virtual IP version check"
2024-10-09T10:06:20.896342-04:00 controller consul[1025613]: 2024-10-09T10:06:20.895-0400 [ERROR] agent.server.autopilot: Failed to reconcile current state with the desired state
2024-10-09T10:06:21.738096-04:00 controller consul[1025613]: 2024-10-09T10:06:21.737-0400 [INFO]  agent: Synced node info

Eventually, when the start times out, this is the next bit in the log:

2024-10-09T10:07:40.753131-04:00 controller systemd[1]: consul.service: start operation timed out. Terminating.
2024-10-09T10:07:40.755261-04:00 controller consul[1025613]: 2024-10-09T10:07:40.752-0400 [INFO]  agent: Caught: signal=terminated
2024-10-09T10:07:40.755725-04:00 controller consul[1025613]: 2024-10-09T10:07:40.752-0400 [INFO]  agent: Graceful shutdown disabled. Exiting
2024-10-09T10:07:40.755956-04:00 controller consul[1025613]: 2024-10-09T10:07:40.752-0400 [INFO]  agent: Requesting shutdown
2024-10-09T10:07:40.756118-04:00 controller consul[1025613]: 2024-10-09T10:07:40.753-0400 [INFO]  agent.server: shutting down server

Given this is a single-node setup, what am I doing wrong / missing? I’ve read over the docs and nothing is obvious to me. I know a single node is not redundant / HA - this is for a lab :wink: I set bootstrap_expect to 1, which I believe allows for a single node. Is there more to it?
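While the agent was hanging, these were useful checks to see whether the single node ever elected itself leader (a diagnostic fragment, assuming the default HTTP API on 127.0.0.1:8500 and a running agent):

```shell
# Does the node ever elect itself leader? An empty reply means no leader yet.
curl -s http://127.0.0.1:8500/v1/status/leader

# Raft state on the agent itself ("state = Leader" is what a single node should reach)
consul info | grep -A3 'raft:'

# Cluster membership (should be just this node, state "alive")
consul members
```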

If I have Vault started/running, in the Consul UI I see:

Vault Sealed Status
  ServiceName: vault
  CheckID: vault:192.168.1.120:8200:vault-sealed-check
  Type: ttl
  Notes: Vault service is healthy when Vault is in an unsealed status and can become an active Vault server

I assume this is because Consul itself isn’t happy yet?

I seem to have things working now. I also figured out the last bit about Vault - I needed to do the Vault init (vault operator init), so that is working too. Here is the Consul config I ended up with - a slight modification from what I posted earlier:

connect {
  enabled = true
}
autopilot {
  min_quorum = 1
}
data_dir = "/opt/consul"
client_addr = "127.0.0.1"
ui_config {
  enabled = true
}
server = true
bootstrap_expect = 1
retry_join = ["127.0.0.1"]

The retry_join setting fixes this. The behaviour started after a recent change to the packaged systemd unit file, where Type=simple was changed to Type=notify.
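To see why Type=notify changes things: under it, systemd waits for the service to write READY=1 to the socket named in $NOTIFY_SOCKET before considering the start complete, and Consul only sends that once the server has joined a cluster and elected a leader - which a lone node without retry_join never did, so the start timed out. Here is a minimal sketch of the notification protocol (a toy notifier plus a toy stand-in for systemd's listener, not Consul's actual code):

```python
import os
import socket
import tempfile

def sd_notify(state: str) -> bool:
    """Send a notification the way sd_notify(3) does: a single
    datagram to the unix socket named in $NOTIFY_SOCKET."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under Type=notify; nothing to do
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as s:
        s.sendto(state.encode(), path)
        return True

# Toy stand-in for systemd: bind a datagram socket and wait for READY=1.
sock_path = os.path.join(tempfile.mkdtemp(), "notify.sock")
listener = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
listener.bind(sock_path)
os.environ["NOTIFY_SOCKET"] = sock_path

# The service signals readiness only once it considers itself up --
# for Consul under Type=notify, that is after leader election succeeds.
sd_notify("READY=1")

msg, _ = listener.recvfrom(4096)
print(msg.decode())  # READY=1
```

Until that READY=1 datagram arrives, systemd keeps the unit in "activating" - exactly the hang seen above.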

ref: SystemD configuration broken with single-node / dev setups · Issue #18097 · hashicorp/consul · GitHub
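If you’d rather not add retry_join, an alternative workaround (my suggestion, not something from the post above) is to revert the unit to the old startup behaviour with a systemd drop-in, e.g. via `systemctl edit consul.service` (a sketch, assuming the unit is named consul.service):

```ini
# /etc/systemd/system/consul.service.d/override.conf
[Service]
# Revert to simple startup so systemd does not wait for READY=1.
Type=simple
```

Run `systemctl daemon-reload` afterwards for the override to take effect.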