Nomad error creating an Ingress Gateway with sidecar service

Hi,
I'm running into trouble creating an ingress gateway for a service registered with sidecar_service.
I have a service, exposed from a Docker container on internal port "8161", named "amq-management". I want to create an ingress point for this service so I can reach it at a static address like "amq-management.ingress.dc1.consul:8080". So I tried to register the service with a sidecar service and to create an ingress point that listens on port 8080 and points to that service. One small premise: I tried the examples in the HashiCorp documentation and everything works fine, so I think I haven't fully understood how this works.

This is my nomad job:

job "activemq-job" {
  datacenters = ["dc1"]

  group "ActiveMQ" {
    count = 1

    network {
      mode = "bridge"
    }

    service {
      name = "amq-management"
      id   = "amq-management-1"
      tags = ["activemqmanagement"]
      port = "8161"
      connect {
        sidecar_service {}
      }
    }

    task "ActiveMQ" {
      driver = "docker"

      config {
        image = "..."
      }
    }
  }

  group "ingress-group" {
    network {
      mode = "bridge"
      port "inbound" {
        static = 8080
        to     = 8080
      }
    }

    service {
      name = "my-ingress-service"
      port = "8080"

      connect {
        gateway {
          ingress {
            listener {
              port     = 8080
              protocol = "tcp"
              service {
                name = "amq-management"
              }
            }
          }
        }
      }
    }
  }
}

The problem is that the "ActiveMQ" group doesn't start, because "connect-proxy-amq-management" fails with this error:

envoy_bootstrap: error creating bootstrap configuration for Connect proxy sidecar: exit status 1

Looking at the Nomad log, I see these errors:

2021-05-25T09:43:30.473+0200 [ERROR] client.alloc_runner.task_runner.task_hook.envoy_bootstrap: error creating bootstrap configuration for Connect proxy sidecar: alloc_id=9ce9b0c3-e10e-10ae-6c00-0b15df0a696e task=connect-proxy-amq-management error="exit status 1" stderr="No sidecar proxy registered for _nomad-task-9ce9b0c3-e10e-10ae-6c00-0b15df0a696e-group-ActiveMQ-amq-management-8161
May 25 09:43:30 consul02 nomad[394648]: "
May 25 09:43:30 consul02 nomad[394648]:     2021-05-25T09:43:30.474+0200 [ERROR] client.alloc_runner.task_runner: prestart failed: alloc_id=9ce9b0c3-e10e-10ae-6c00-0b15df0a696e task=connect-proxy-amq-management error="prestart hook "envoy_bootstrap" failed: error creating bootstrap configuration for Connect proxy sidecar: exit status 1"
May 25 09:43:30 consul02 nomad[394648]:     2021-05-25T09:43:30.474+0200 [INFO]  client.alloc_runner.task_runner: not restarting task: alloc_id=9ce9b0c3-e10e-10ae-6c00-0b15df0a696e task=connect-proxy-amq-management reason="Exceeded allowed attempts 2 in interval 30m0s and mode is "fail""

It says that a sidecar proxy is missing.

I tried to change

service {
  name = "amq-management"
  id   = "amq-management-1"
  tags = ["activemqmanagement"]
  port = "8161"
  connect {
    sidecar_service {}
  }
}

with

service {
  name = "amq-management"
  id   = "amq-management-1"
  tags = ["activemqmanagement"]
  port = "8161"
  connect {
    sidecar_service {
      proxy {
        local_service_port = 8161
      }
    }
  }
}

but I get the same error.

I tried executing this command:

consul connect proxy -sidecar-for amq-management-1 > amq-proxy.log &

and the error is the same there, too.

What am I doing wrong?

Hi @nico_ntrax ! Sorry you're running into trouble, but I think we can help. First, what version of Nomad are you running? The recent v1.1.0 release included a fix around group/service name validation to ensure names used by Connect services do not contain upper-case characters, due to an underlying Consul bug. When running this job with Nomad v1.1.0, it stops with:

$ nomad job run activemq-job.nomad
Error submitting job: Unexpected response code: 500 (1 error occurred:
	* Consul Connect group "ActiveMQ" with service "amq-management" must not contain uppercase characters

)

Other than that I think the job looks alright, although without a docker image for the Active MQ task it’s hard to test :slight_smile:
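For reference, on the job side the fix is just renaming the group to a lowercase name. A sketch (the name "activemq" here is an arbitrary lowercase choice):

```hcl
# Renamed from "ActiveMQ": groups/services used by Connect must be lowercase
group "activemq" {
  network {
    mode = "bridge"
  }

  service {
    name = "amq-management"
    port = "8161"
    connect {
      sidecar_service {}
    }
  }
}
```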

First of all, many many thanks for your suggestions. Using lowercase names allows Nomad to start the job!
Now everything is green in both Consul and Nomad, and it seems the connections between the components should work. However, when I try to run a curl against the upstream URI, it doesn't work.
The status is this:

Only one CentOS machine, running both Consul and Nomad.

This is the Consul configuration:

data_dir = "/var/lib/consul"
log_level = "DEBUG"

datacenter = "dc2"

server = true
disable_update_check = true
bootstrap_expect = 1
ui = true

enable_syslog = true


verify_incoming = false
verify_outgoing = false

bind_addr = "0.0.0.0"
client_addr = "0.0.0.0"
advertise_addr = "{{GetInterfaceIP \"eno1\"}}"

ports {
  grpc = 8502
}

connect {
  enabled = true
}

config_entries {
  bootstrap {
    kind = "proxy-defaults"
    name = "global"

    config {
      protocol = "http"
    }

    mesh_gateway = {
      mode = "local"
    }
  }
}

This is the Nomad configuration:

log_level = "DEBUG"
datacenter = "dc2"
data_dir = "/opt/nomad"

bind_addr = "{{ GetInterfaceIP \"eno1\" }}" 


server {
  enabled = true
  bootstrap_expect = 1
}

client {
  enabled = true
  network_interface = "eno1"
  options {
    docker.cleanup.image = false
  }
}

disable_update_check = true
log_file = "/var/log/nomad.log"


addresses {
  http = "{{ GetInterfaceIP \"eno1\" }}"
}

advertise {
  http = "127.0.0.1"
  rpc  = "{{ GetInterfaceIP \"eno1\" }}"
  serf = "127.0.0.1"
}


plugin "raw_exec" {
  config {
    enabled = true
    no_cgroups = true
  }
}

plugin "docker" {
  config {
    volumes {
      enabled = true
    }
  }
}

consul {
  address = "127.0.0.1:8500"
  server_service_name = "nomad"
  client_service_name = "nomad-client"

  auto_advertise = true

  server_auto_join = true
  client_auto_join = true
}

After starting Consul and Nomad, I try to create an Envoy ingress gateway by running this from the command line:

consul connect envoy -gateway=ingress -register -service=ingress-service -address=<IP>:21003

Here "ingress-service" is the service name I used in my ingress gateway config entry:

Kind = "ingress-gateway"
Name = "ingress-service"

TLS {
   Enabled = false
}

Listeners = [
  {
    Port = 8080
    Protocol = "http"
    Services = [
      {
        Name = "amq-management"
      }
    ]
  }
]

So the situation in Consul is this:

Now I try to run this job from Nomad:

job "my-activemq" {
  datacenters = ["dc2"]

  group "my-activemq-group" {
    count = 1

    network {
      mode = "bridge"

      port "management" {
        to = 8161
      }
    }

    service {
      name = "amq-management"
      id   = "amq-management-1"
      port = "management"
      connect {
        sidecar_service {
          proxy {}
        }
      }
    }

    task "amq" {
      driver = "docker"

      config {
        image = "rmohr/activemq"
        ports = ["management"]
      }
    }
  }
}

(Many thanks again, because Nomad is now able to run the job!!!)

Now the status in Consul is this:

If I open "ingress-service", the connection looks right:

ingress_gateway_3

And this is the upstream:
ingress_gateway_4

But if I run this curl:

curl --noproxy "*" -v  http://amq-management.ingress.dc2.consul:8080

(--noproxy is necessary because of the company proxy)

the output is this:

* Rebuilt URL to: http://amq-management.ingress.dc2.consul:8080/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to amq-management.ingress.dc2.consul (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: amq-management.ingress.dc2.consul:8080
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< content-length: 91
< content-type: text/plain
< date: Fri, 28 May 2021 10:41:02 GMT
< server: envoy
< x-envoy-upstream-service-time: 7
<
* Connection #0 to host amq-management.ingress.dc2.consul left intact
upstream connect error or disconnect/reset before headers. reset reason: connection failure

What I expect is this:
ingress_gateway_5

What's wrong?

(Sorry for all the images, but I tried to recreate step by step what I have done, to make it easier to follow.)

Hi,
I have made some progress. I don't know if this is the correct way, but it works:

First, I create an ingress HCL file with the following contents:

kind = "ingress-gateway"
name = "amq-test-ingress-gateway"


TLS {
  Enabled = false
}

listeners = [
  {
    port = 5001
    protocol = "http"
    services = [
       {
         name = "amq-management"
       }
    ]
  }
]

Then, from the command line, I execute this:

 consul connect envoy -register -gateway=ingress -service=amq-test-ingress-gateway -admin-bind='127.0.0.1:9000' -address='<myIP>:8000' -grpc-addr 0.0.0.0:8502

And finally I run this job from Nomad:

job "activemq-test" {
  datacenters = ["dc1"]

  group "gateway" {
    network {
      mode = "bridge"
    }

    service {
      name = "amq-api-gateway"

      connect {
        gateway {
          proxy {}

          terminating {
            service {
              name = "amq-management"
            }
          }
        }
      }
    }
  }

  group "activemq-testgroup" {
    count = 1

    network {
      mode = "bridge"
      port "management" {
        to = 8161
      }
    }

    service {
      name = "amq-management"
      id   = "amq-management-1"
      port = "management"
    }

    task "amq" {
      driver = "docker"

      config {
        image = "rmohr/activemq"
        ports = ["management"]
      }
    }
  }
}

Now if I execute a curl like this:

curl  http://amq-management.ingress.dc1.consul:5001/

it finally works.

However, I used a terminating gateway, which is meant for routing to a service outside the mesh. I'm still having problems creating a route from an upstream URL (like amq-management.ingress.dc1.consul:5001) to a service that has joined the mesh.

I'll keep you updated if there is more progress.

Hey @nico_ntrax sorry for the slow reply. I believe the last bug in that job config is the use of the port label for the destination service.port, rather than just setting the numeric port. I was able to get this minimalist configuration running and responding, using the -dev and -dev-connect modes on consul and nomad.
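To make the change concrete, here is a sketch of the difference in the amq-management service stanza (comments are mine):

```hcl
service {
  name = "amq-management"
  # port = "management"   # port *label* from the network stanza -- this was the bug
  port = 8161              # numeric port, which Consul needs for the Connect service
  connect {
    sidecar_service {}
  }
}
```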

Start Consul

consul agent -dev

Start Nomad

sudo nomad agent -dev-connect

Job file

job "my-activemq" {
  datacenters = ["dc1"]

  group "ingress-group" {
    network {
      mode = "bridge"
      port "inbound" {
        static = 8080
        to     = 8080
      }
    }

    service {
      name = "my-ingress-service"
      port = "8080"
      connect {
        gateway {
          ingress {
            listener {
              port     = 8080 
              protocol = "tcp"
              service {
                name = "amq-management"
              }
            }
          }
        }
      }
    }
  }

  group "my-activemq-group" {
    network {
      mode = "bridge"
    }

    service {
      name = "amq-management"
      port = 8161 # not a port mapping
      connect {
        sidecar_service {}
      }
    }


    task "amq" {
      driver = "docker"
      config {
        image = "rmohr/activemq"
      }
    }
  }
}

Run job

nomad job run my-activemq.nomad

(wait a few seconds to launch)

Query through ingress gateway

curl $(dig +short @127.0.0.1 -p 8600 amq-management.ingress.dc1.consul. ANY):8080

Of course your environment is slightly different with the multi-DC setup, but I think that port value was the culprit. Note that you can also set listener.protocol = "http" here, but you’d need to set the service-defaults config entry first, e.g.

default.hcl

Kind     = "service-defaults"
Name     = "amq-management"
Protocol = "http"

Write the config entry

consul config write default.hcl

Hope that helps. And thanks for providing such clear reproduction steps, that helps us immensely when troubleshooting. :slightly_smiling_face:

1 Like

You are the best! It works!!! Many thanks! The deeper I get into the HashiCorp stack, the more I realize how powerful all these tools are!