Panic crash caused by new multiport functionality

I cannot get basic functionality working with a simple service supporting two ports: 8080 (HTTP) and 9080 (gRPC). I followed the steps listed here using a small greeter service that I wrote:

However, it is not working: the injector pod does this, which I guess is not the desired outcome.

panic: runtime error: index out of range [-1]

goroutine 258 [running]:
github.com/hashicorp/consul-k8s/control-plane/connect-inject.(*EndpointsController).createServiceRegistrations(_, {{{0x1b293d8, 0x3}, {0x22c9093, 0x2}}, {{0xc0003d3760, 0x1f}, {0xc0003d3780, 0x1a}, {0xc000623b20, ...}, ...}, ...}, ...)
	/home/runner/work/consul-k8s/consul-k8s/control-plane/connect-inject/endpoints_controller.go:417 +0x2374
github.com/hashicorp/consul-k8s/control-plane/connect-inject.(*EndpointsController).registerServicesAndHealthCheck(_, {{{0x1b293d8, 0x3}, {0x22c9093, 0x2}}, {{0xc0003d3760, 0x1f}, {0xc0003d3780, 0x1a}, {0xc000623b20, ...}, ...}, ...}, ...)
	/home/runner/work/consul-k8s/consul-k8s/control-plane/connect-inject/endpoints_controller.go:251 +0x2e8
github.com/hashicorp/consul-k8s/control-plane/connect-inject.(*EndpointsController).Reconcile(0xc000305e00, {0x2689b60, 0xc00059f800}, {{{0xc0007da990?, 0x20f4f80?}, {0xc0008ca4b0?, 0xc0005be000?}}})
	/home/runner/work/consul-k8s/consul-k8s/control-plane/connect-inject/endpoints_controller.go:195 +0x10db
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc0005d7cc0, {0x2689b60, 0xc00059f770}, {{{0xc0007da990?, 0x20f4f80?}, {0xc0008ca4b0?, 0xc000791c80?}}})
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:114 +0x222
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0005d7cc0, {0x2689ab8, 0xc0005be380}, {0x1f9d360?, 0xc000446e80?})
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:311 +0x2e9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0005d7cc0, {0x2689ab8, 0xc0005be380})
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:266 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:227 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:223 +0x307

I am deducing that this is a bug, so I wrote an issue:

I hope someone can look at this and determine whether something is missing or unclear in the docs that could cause this.

Hi @darkn3rd, thank you for opening this question. I was able to solve it for you where you opened the issue on the GitHub repository.

Thanks.

Question: Should the injector panic or just fail with an error?

To me, it makes sense to panic as it makes the error more obvious. But I can also see an argument for just erroring out. I made a note in the engineering sync agenda for next week where the Consul on Kubernetes engineers will discuss what the better user experience is.

Thanks. I was worried that panic as a form of feedback to end users has been normalized. :wink:

I took my multi-port demo application and published it here:

I am working on a more advanced application, a distributed graph database that requires multiple ports. I pray this one works. If I can get this to work, I can work on other features, such as observability, ACLs, and ingress gateway.

I have been on this journey thus far for 11 days, so I am happy to see some success with Consul Connect. In the past, I experimented with NGINX Service Mesh (took 5 days), Istio (3 days), and Linkerd (1 day).

I’m seeing this too.

In our case, I had tried to configure multiple consul services for rabbitmq: one for amqp and one for UI/http, but it started crashing randomly. Can someone post the link to the issue when it is created so that others who encounter this issue know where to track it?

This is not documented very clearly (but docs that are there are definitely appreciated). This is what I found with multiple ports used by a single service.

  • cannot have two services that use the same port. Thus in case you are using a StatefulSet that has a headless service and a regular service, you will need to add a label on the headless service to tell Consul to ignore it, i.e. consul.hashicorp.com/service-ignore: 'true'
  • cannot have a service with 2+ ports, so you have to split it up into one port per service
  • server: add consul.hashicorp.com/connect-service annotation to list each of the services.
  • clients: add consul.hashicorp.com/connect-service-upstreams annotation listing the services the client can connect to.
  • transparent proxy will not work with this, so to make that explicit, I add the consul.hashicorp.com/transparent-proxy: "false" annotation.
  • metrics will not work with multi-port, so metrics should not be enabled, as this will cause stacktraces.
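
For anyone trying to piece this together, here is a rough sketch of how the points above might look for the greeter example (8080 HTTP, 9080 gRPC). Treat it as an illustration under assumptions, not a verified setup: the names greeter, greeter-http, greeter-grpc, greeter-headless, and greeter-client, the image placeholders, and the local upstream ports 18080 and 19080 are all hypothetical. The consul.hashicorp.com/connect-service-port annotation is not in the list above; as far as I understand, it gives the port for each service in the same order as connect-service. The second single-port Service (greeter-grpc) and any ServiceAccounts that your ACL setup requires are left out for brevity.

```yaml
# Headless Service (e.g. for a StatefulSet): labeled so that Consul ignores it.
apiVersion: v1
kind: Service
metadata:
  name: greeter-headless            # hypothetical name
  labels:
    consul.hashicorp.com/service-ignore: "true"
spec:
  clusterIP: None
  selector:
    app: greeter
  ports:
    - name: http
      port: 8080
---
# One Kubernetes Service per port; repeat as greeter-grpc for 9080.
apiVersion: v1
kind: Service
metadata:
  name: greeter-http                # hypothetical name
spec:
  selector:
    app: greeter
  ports:
    - port: 8080
      targetPort: 8080
---
# Server: annotations go on the pod template, not on the Deployment itself.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: greeter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: greeter
  template:
    metadata:
      labels:
        app: greeter
      annotations:
        consul.hashicorp.com/connect-inject: "true"
        consul.hashicorp.com/transparent-proxy: "false"
        # One Consul service per port, listed in matching order.
        consul.hashicorp.com/connect-service: "greeter-http,greeter-grpc"
        consul.hashicorp.com/connect-service-port: "8080,9080"
    spec:
      containers:
        - name: greeter
          image: <greeter image>
          ports:
            - containerPort: 8080
            - containerPort: 9080
---
# Client: each upstream is exposed on a local port of the client pod.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: greeter-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: greeter-client
  template:
    metadata:
      labels:
        app: greeter-client
      annotations:
        consul.hashicorp.com/connect-inject: "true"
        consul.hashicorp.com/transparent-proxy: "false"
        # Format is <consul service>:<local port>; the ports are arbitrary picks.
        consul.hashicorp.com/connect-service-upstreams: "greeter-http:18080,greeter-grpc:19080"
    spec:
      containers:
        - name: client
          image: <client image>
```

With this layout the client would talk to localhost:18080 for HTTP and localhost:19080 for gRPC instead of using the Kubernetes service names.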

@darkn3rd This worked perfectly! Thank you!

I found another thing, and I am not sure about the workaround. With transparent proxy, traffic should go through the mesh. But when you use upstreams, the client tunnels through localhost. However, other containers, inside or outside the mesh, can still reach the service that should be secured. I’m looking for annotations or other docs that could fix this, but I have not found any yet. I wrote this up as Connections bypass ACL security in multi-port · Issue #1606 · hashicorp/consul-k8s · GitHub.

Following up on this issue: when you enable multi-port, which turns off transparent proxy mode, the service will be vulnerable. The only way to restrict this further is to bind the service itself to localhost, so that the service has to do its own firewalling, or to use another technology, such as network policies. Restricting the service to localhost makes it uniquely challenging to integrate ingress controllers or API gateways.

The current architecture of Consul does not seem compatible with a service mesh solution for multi-port, given the security and observability issues. Essentially, it doesn’t support finalized Kubernetes APIs, such as a Service with a list of ports; Consul can only support a single port for transparent proxy and observability. In contrast, other service meshes support multiple ports out of the box, and transparent proxy is the default and only option.
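
As a sketch of the network policy route, something like the following might limit inbound traffic to the sidecar listeners only, so that the application ports 8080 and 9080 cannot be reached directly from other pods. This rests on assumptions I have not verified: that the pods carry a hypothetical app: greeter label, that the two sidecars listen for inbound mesh traffic on ports 20000 and 20001 (check the actual ports on your pods), and that your cluster's CNI plugin enforces NetworkPolicy at all.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: greeter-mesh-only           # hypothetical name
  namespace: default
spec:
  # Applies to the multi-port service pods.
  podSelector:
    matchLabels:
      app: greeter
  policyTypes:
    - Ingress
  ingress:
    # Allow inbound traffic only on the assumed sidecar listener ports.
    # Anything sent straight to 8080 or 9080 from another pod is dropped.
    - ports:
        - protocol: TCP
          port: 20000
        - protocol: TCP
          port: 20001
```

This does not stop other containers in the same pod from reaching the application over localhost, and health checks or metrics scrapers that hit the application ports from other pods would need their own rules.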

For the ingress-controller issue, couldn’t you create a consul ingress-gateway and point the ingress to the service there? Essentially the ingress gateway provides the external visibility and the entrance to the service mesh. Then the localhost issue would be irrelevant because the gateway is going to hit the envoy proxy. Am I thinking about this wrong?

[Ingress Controller] ---(ingress)--> [Ingress Gateway] --(consul service def)--> [Multi-port service]

I thought of this, but the current docs/tutorial is in Terraform + Kustomize and requires installing EKS. It would take time to break that up and extract the needed solution path to be able to use it.

Also, out of the box, it doesn’t yet support gRPC. So if gRPC is needed on the edge, this is not a solution.

The ingress gateway is just a CRD. You can create one easily:

apiVersion: consul.hashicorp.com/v1alpha1
kind: IngressGateway
metadata:
  name: <name>
  namespace: <namespace>
spec:
  listeners:
    - port: 8080
      protocol: tcp
      services:
        - name: <consul service>
  tls: # you can enable this
    enabled: false

You’ll need to update your ingress definition to point to the ingress-gateway and a service intention to allow the inbound traffic from the ingress-gateway to your service.
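
A minimal sketch of the service intention mentioned above could look like this. The destination name greeter-http and the source name ingress-gateway are assumptions; use the Consul service names actually registered in your catalog, which may differ from the Kubernetes Service names.

```yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: ingress-gateway-to-greeter  # hypothetical name
spec:
  destination:
    # Consul service that receives the traffic (assumed name).
    name: greeter-http
  sources:
    # Consul service name of the ingress gateway (assumed name).
    - name: ingress-gateway
      action: allow
```

A multi-port setup would need one such resource per Consul service that the gateway should reach, since each port is registered as its own service.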

I’ve never tried gRPC. We use web sockets and there is support at L4 but not L7 for them. We’re making do for now. Are you sure you can’t just treat them like straight tcp sockets?

I could do that, but then I lose certificates, correct? I haven’t tried the Ingress Gateway, as I thought it was for a datacenter-to-datacenter use case.

Right now I cannot get service intentions to work between a service and a client in different namespaces.

You’re probably thinking of Mesh Gateways for dc to dc.

You would lose TLS at layer 4 (TCP). Logically, I believe you need to be using L7 for that.

I believe the service intentions would list the consul service, not the k8s service. If you’re using namespaces (and not using enterprise), you’ve probably got the wrong service name.