2 node - Controller-Worker Setup on Vagrant

I'm trying to set up a simple 2-node setup inside a Vagrant box on Ubuntu. The worker and controller configs are below, in that order. The setup checks out fine, except that the worker behaves a little oddly. The worker systemd service starts and authenticates to the controller successfully, but then keeps trying to reach a local listener on the default controller cluster port (0.0.0.0:9201) forever, which can never succeed because the worker runs on its own dedicated node.
Details below:

Worker:

disable_mlock = true

# listener denoting this is a worker proxy
listener "tcp" {
  address = "172.16.1.112:9202"
  tls_disable = "true"
  purpose = "proxy"
}

# worker block for configuring the specifics of the
# worker service
worker {
#  name= "worker-1"
  public_addr = "172.16.1.112"
  initial_upstreams =  [ "controller-1:9201" ]
  address = "172.16.1.112"
  auth_storage_path = "/etc/boundary.d/test"
#  tags {
#   type = ["worker-1"]
#   }

}

Controller:

# disable memory from being swapped to disk
        disable_mlock = true

        # API listener configuration block
        listener "tcp" {
          # Should be the address of the NIC that the controller server will be reached on
          # Use 0.0.0.0 to listen on all interfaces
          address = "0.0.0.0:9200"
          # The purpose of this listener block
          purpose = "api"

          # TLS Configuration
          tls_disable   = false
          tls_cert_file = "/etc/boundary.d/tls/boundary.crt"
          tls_key_file  = "/etc/boundary.d/tls/boundary.key"

          # Uncomment to enable CORS for the Admin UI. Be sure to set the allowed origin(s)
          # to appropriate values.
          #cors_enabled = true
          #cors_allowed_origins = ["https://yourcorp.yourdomain.com", "serve://boundary"]
        }

        # Data-plane listener configuration block (used for worker coordination)
        listener "tcp" {
          # Should be the IP of the NIC that the worker will connect on
          address = "0.0.0.0:9201"
          # The purpose of this listener
          purpose = "cluster"
          tls_disable   = "false"
          tls_cert_file = "/etc/boundary.d/tls/boundary.crt"
          tls_key_file  = "/etc/boundary.d/tls/boundary.key"
        }

        # Ops listener for operations like health checks for load balancers
        listener "tcp" {
          # Should be the address of the interface where your external systems'
          # (eg: Load-Balancer and metrics collectors) will connect on.
          address = "0.0.0.0:9203"
          # The purpose of this listener block
          purpose = "ops"

          tls_disable   = false
          tls_cert_file = "/etc/boundary.d/tls/boundary.crt"
          tls_key_file  = "/etc/boundary.d/tls/boundary.key"
        }

        # Controller configuration block
        controller {
          # This name attr must be unique across all controller instances if running in HA mode
          name = "controller-1"
          description = "Boundary controller number one"
          #tls_disable   = "true"
          #tls_cert_file = "/etc/boundary.d/tls/boundary.crt"
          #tls_key_file  = "/etc/boundary.d/tls/boundary.key"
          # This is the public hostname or IP where the workers can reach the
          # controller. This should typically be a load balancer address
          public_cluster_address = "controller-1"

          # Enterprise license file, can also be the raw value or env:// value
          # license = "file:///path/to/license/file.hclic"

          # After receiving a shutdown signal, Boundary will wait 10s before initiating the shutdown process.
          graceful_shutdown_wait_duration = "10s"

          # Database URL for postgres. This is set in boundary.env and
          #consumed via the “env://” notation.
          database {
                  url = "postgresql://postgres:postgres@127.0.0.1:5432/postgres?sslmode=disable"
          }
        }

Worker-led authentication is being used.

Issue: despite the controller IP/name being configured, the worker continuously polls a local listener at 0.0.0.0:9201 as if it were the controller, and automatically updates its upstreams to 0.0.0.0.

The first line confirms that worker authentication to the controller is successful.

Jul 26 16:06:20 boundary-2 boundary[12248]: {"id":"J03YG5y5C2","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"system","data":{"version":"v0.1","op":"worker.(Worker).upstreamDialerFunc","data":{"msg":"worker has successfully authenticated"}},"datacontentype":"application/cloudevents","time":"2023-07-26T16:06:20.727256595Z"}
Jul 26 16:06:20 boundary-2 boundary[12248]: {"id":"Yx79mQz1C9","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"system","data":{"version":"v0.1","op":"worker.(Worker).updateAddrs","data":{"msg":"Upstreams after first status set to: [0.0.0.0:9201]"}},"datacontentype":"application/cloudevents","time":"2023-07-26T16:06:20.778388436Z"}
Jul 26 16:06:20 boundary-2 boundary[12248]: {"id":"IJ3tOFcXIq","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"error","data":{"error":"(nodeenrollment.protocol.Dial) unable to dial to server: (nodeenrollment.protocol.Dial) unable to dial to server: dial tcp 0.0.0.0:9201: connect: connection refused","error_fields":{},"id":"e_ju8KOvVuJx","version":"v0.1","op":"worker.(Worker).upstreamDialerFunc"},"datacontentype":"application/cloudevents","time":"2023-07-26T16:06:20.779632794Z"}
Jul 26 16:06:20 boundary-2 boundary[12248]: {"id":"KUWh4ctx2A","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"error","data":{"error":"worker.(Worker).upstreamDialerFunc: unknown, unknown: error #0: (nodeenrollment.protocol.Dial) unable to dial to server: (nodeenrollment.protocol.Dial) unable to dial to server: dial tcp 0.0.0.0:9201: connect: connection refused","error_fields":{"Code":0,"Msg":"","Op":"worker.(Worker).upstreamDialerFunc","Wrapped":{}},"id":"e_O7cDecx1Sy","version":"v0.1","op":"worker.(Worker).upstreamDialerFunc"},"datacontentype":"application/cloudevents","time":"2023-07-26T16:06:20.780113562Z"}

Some more logs, showing the valid upstream being dropped and the upstream list reverting to the local listener address 0.0.0.0:9201:

Jul 26 16:20:06 boundary-2 boundary[12248]: {"id":"udbvJrPV64","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"system","data":{"version":"v0.1","op":"worker.(Worker).updateAddrs","data":{"msg":"Upstreams has changed; old upstreams were: [0.0.0.0:9201], new upstreams are: [0.0.0.0:9201 controller-1:9201]"}},"datacontentype":"application/cloudevents","time":"2023-07-26T16:20:06.643339422Z"}
Jul 26 16:20:06 boundary-2 boundary[12248]: {"id":"G05FlTrvnH","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"system","data":{"version":"v0.1","op":"worker.(Worker).upstreamDialerFunc","data":{"msg":"worker has successfully authenticated"}},"datacontentype":"application/cloudevents","time":"2023-07-26T16:20:06.671962992Z"}
Jul 26 16:20:08 boundary-2 boundary[12248]: {"id":"m47sSaaJmu","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"error","data":{"error":"(nodeenrollment.protocol.Dial) unable to dial to server: (nodeenrollment.protocol.Dial) unable to dial to server: dial tcp 0.0.0.0:9201: connect: connection refused","error_fields":{},"id":"e_bdtJclHeam","version":"v0.1","op":"worker.(Worker).upstreamDialerFunc"},"datacontentype":"application/cloudevents","time":"2023-07-26T16:20:08.198884097Z"}
Jul 26 16:20:08 boundary-2 boundary[12248]: {"id":"0BSEgjgRTT","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"error","data":{"error":"worker.(Worker).upstreamDialerFunc: unknown, unknown: error #0: (nodeenrollment.protocol.Dial) unable to dial to server: (nodeenrollment.protocol.Dial) unable to dial to server: dial tcp 0.0.0.0:9201: connect: connection refused","error_fields":{"Code":0,"Msg":"","Op":"worker.(Worker).upstreamDialerFunc","Wrapped":{}},"id":"e_IYq1iGSYme","version":"v0.1","op":"worker.(Worker).upstreamDialerFunc"},"datacontentype":"application/cloudevents","time":"2023-07-26T16:20:08.199088417Z"}
Jul 26 16:20:09 boundary-2 boundary[12248]: {"id":"Y9e2t4yKhr","source":"https://hashicorp.com/boundary/boundary-2/worker","specversion":"1.0","type":"system","data":{"version":"v0.1","op":"worker.(Worker).updateAddrs","data":{"msg":"Upstreams has changed; old upstreams were: [0.0.0.0:9201 controller-1:9201], new upstreams are: [0.0.0.0:9201]"}},"datacontentype":"application/cloudevents","time":"2023-07-26T16:20:09.069749314Z"}

The worker shows up fine in the output of the list command:

vagrant@boundary-1:~$ boundary workers list -tls-insecure
Direct usage of BOUNDARY_TOKEN env var is deprecated; please use "-token env://<env var name>" format, e.g. "-token env://BOUNDARY_TOKEN" to specify an env var to use.

Worker information:
  ID:                        w_BHwoctASdf
    Type:                    pki
    Version:                 1
    Authorized Actions:
      no-op
      read
      update
      delete
      add-worker-tags
      set-worker-tags
      remove-worker-tags

  ID:                        w_VI2cwvmup9
    Type:                    pki
    Version:                 1
    Address:                 172.16.1.112:9202
    ReleaseVersion:          Boundary v0.13.0
    Last Status Time:        Wed, 26 Jul 2023 16:15:18 UTC
    Authorized Actions:
      no-op
      read
      update
      delete
      add-worker-tags
      set-worker-tags
      remove-worker-tags

Any pointers towards solving this would be appreciated. The controller and worker hostnames resolve locally, so I have used hostnames directly wherever applicable.

Basic sanity checks done:
9200, 9201, 9203 - running on controller, reachable via worker
9202 - running on worker, reachable via controller
Firewalls disabled.

You’ve got the attribute public_cluster_address in your controller config – that should be public_cluster_addr.

(FYI, I noticed after I posted my previous reply that we had the incorrect parameter name in one of our tutorials – that’s been fixed now.)
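
For reference, here is a sketch of how the controller block from the question might look with the attribute renamed. All values are copied from the config posted above; only the key name changes. The explanatory comment is an assumption based on the logs: with the misspelled key apparently not taking effect, the controller seems to have advertised its cluster listener address (0.0.0.0:9201) to the worker instead of "controller-1".

```hcl
controller {
  # This name attr must be unique across all controller instances if running in HA mode
  name        = "controller-1"
  description = "Boundary controller number one"

  # Address that workers use to reach the controller's cluster listener.
  # The key is public_cluster_addr, not public_cluster_address.
  # Assumption: when this is not set, the cluster listener address
  # (0.0.0.0:9201 in this setup) is what gets handed to workers.
  public_cluster_addr = "controller-1"

  graceful_shutdown_wait_duration = "10s"

  database {
    url = "postgresql://postgres:postgres@127.0.0.1:5432/postgres?sslmode=disable"
  }
}
```

After restarting the controller with a change like this, the worker's "Upstreams after first status set to" log line should presumably list the controller address rather than 0.0.0.0:9201.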

Thank you! Using the correct parameter name fixed the problem.