Consul Connect integration with Jaeger in Kubernetes

Hi all,

I’m looking to integrate Consul Connect with OpenTracing (Jaeger) for deployments in Kubernetes. I’m using the official consul-helm chart to deploy Consul and so on.

I have tried out this demo https://github.com/hashicorp/consul-demo-tracing
although it left me a bit confused:

The demo uses the fake service, which has OpenTracing instrumentation built in. I thought the point of using a service mesh was that the mesh could report spans to the collector automatically, without explicitly instrumenting the code? (Whether or not that is useful is a different debate.)

I’m looking at the Envoy configuration for Jaeger, https://github.com/envoyproxy/envoy/tree/master/examples/jaeger-tracing, and it appears to require a listener section to be specified for the initial OpenTracing span: https://github.com/envoyproxy/envoy/blob/master/examples/jaeger-tracing/service1-envoy-jaeger.yaml#L2-L31

My question is, how do I specify that in the context of consul-helm in Kubernetes? Has anyone done that?

Thanks in advance!

Hi

A service mesh cannot automatically enable tracing for your applications because of the way traces are built. A trace is a collection of independent spans; to relate them together into a trace, each span contains a reference to its parent.

While the data plane in the service mesh can create spans for the things it understands, such as inbound requests, upstream calls, and retries, it does not understand what is going on inside your application. For example, when your application calls an upstream through the data plane, the mesh has no way of automatically knowing which initial request triggered that upstream call.

To solve this, when you make an upstream call, your application needs to forward the request's span details as either HTTP headers or gRPC metadata.
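As a rough sketch of that forwarding step: with Envoy's Zipkin tracer the span context travels in B3 headers, so the application only has to copy those from the inbound request onto each outbound call. The helper name trace_headers_to_forward is made up for illustration, and the sample header values are placeholders.

```python
# Headers that Envoy's documentation lists for trace context propagation
# when the Zipkin tracer is in use.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def trace_headers_to_forward(inbound_headers):
    """Return only the inbound headers that carry trace context.

    Hypothetical helper: names are matched case-insensitively because
    HTTP header names are case-insensitive.
    """
    lowered = {name.lower(): value for name, value in inbound_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


if __name__ == "__main__":
    # Placeholder values standing in for a request received via the sidecar.
    inbound = {
        "X-B3-TraceId": "463ac35c9f6413ad",
        "X-B3-SpanId": "a2fb4a1d1a96d312",
        "X-B3-Sampled": "1",
        "Content-Type": "application/json",
    }
    # Pass the result as extra headers on the outbound upstream request.
    print(trace_headers_to_forward(inbound))
```

The returned dictionary would be merged into the headers of whatever HTTP client the service uses for its upstream call; no tracing library is needed just to keep the spans linked.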

This blog post explains the process in a little more detail:

You can also see how this works using OpenTracing with Go in the following examples:

Configuring Zipkin Tracing with Consul

To enable tracing in Envoy for Consul, you need to create a proxy-defaults configuration like the following example. The socket address configuration points to your Zipkin collector. In this instance we are using Jaeger, which can accept Zipkin-format spans and is exposed by a Kubernetes service at the address jaeger on port 9411.

When you launch a Connect-enabled pod, the automatically injected Envoy proxy will be configured with these defaults.

kind = "proxy-defaults"
name = "global"

config {
  envoy_extra_static_clusters_json = <<EOL
    {
      "connect_timeout": "3.000s",
      "dns_lookup_family": "V4_ONLY",
      "lb_policy": "ROUND_ROBIN",
      "load_assignment": {
          "cluster_name": "jaeger_9411",
          "endpoints": [
              {
                  "lb_endpoints": [
                      {
                          "endpoint": {
                              "address": {
                                  "socket_address": {
                                      "address": "jaeger",
                                      "port_value": 9411,
                                      "protocol": "TCP"
                                  }
                              }
                          }
                      }
                  ]
              }
          ]
      },
      "name": "jaeger_9411",
      "type": "STRICT_DNS"
    }
  EOL

  envoy_tracing_json = <<EOL
    {
        "http": {
            "config": {
                "collector_cluster": "jaeger_9411",
                "collector_endpoint": "/api/v1/spans"
            },
            "name": "envoy.zipkin"
        }
    }
  EOL
}

To deploy this configuration, you can use the consul config write CLI command or the API, or you can set it in the Helm chart values.

An example of the Helm chart values containing the above Jaeger config can be seen below.

  centralConfig:
    enabled: true

    # defaultProtocol allows you to specify a convenience default protocol if
    # most of your services are of the same protocol type. The individual annotation
    # on any given pod will override this value. A protocol must be provided,
    # either through this setting or individual annotation, for a service to be
    # registered correctly. Valid values are "http", "http2", "grpc" and "tcp".
    defaultProtocol: null

    # proxyDefaults is a raw json string that will be applied to all Connect
    # proxy sidecar pods that can include any valid configuration for the
    # configured proxy.
    proxyDefaults: |
      {
        "envoy_extra_static_clusters_json": "{\"connect_timeout\": \"3.000s\", \"dns_lookup_family\": \"V4_ONLY\", \"lb_policy\": \"ROUND_ROBIN\", \"load_assignment\": { \"cluster_name\": \"jaeger_9411\", \"endpoints\": [ { \"lb_endpoints\": [ { \"endpoint\": { \"address\": { \"socket_address\": { \"address\": \"jaeger\", \"port_value\": 9411, \"protocol\": \"TCP\" } } } } ] } ] }, \"name\": \"jaeger_9411\", \"type\": \"STRICT_DNS\" }",
        "envoy_tracing_json": "{ \"http\": { \"config\": { \"collector_cluster\": \"jaeger_9411\", \"collector_endpoint\": \"/api/v1/spans\" }, \"name\": \"envoy.zipkin\" }}"
      }
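Escaping the nested JSON by hand is easy to get wrong, so here is a small sketch that generates the proxyDefaults document instead. The function name build_proxy_defaults and its parameters are my own for illustration; the keys and values are the ones from the config above.

```python
import json


def build_proxy_defaults(collector_host, collector_port=9411):
    """Build the JSON document to paste under proxyDefaults.

    Hypothetical helper. The two Envoy settings are JSON documents stored
    as string values, so each is serialized on its own and then serialized
    again as part of the outer object, which yields the escaped quotes.
    """
    cluster_name = "jaeger_%d" % collector_port
    cluster = {
        "connect_timeout": "3.000s",
        "dns_lookup_family": "V4_ONLY",
        "lb_policy": "ROUND_ROBIN",
        "load_assignment": {
            "cluster_name": cluster_name,
            "endpoints": [{"lb_endpoints": [{"endpoint": {"address": {
                "socket_address": {
                    "address": collector_host,
                    "port_value": collector_port,
                    "protocol": "TCP",
                },
            }}}]}],
        },
        "name": cluster_name,
        "type": "STRICT_DNS",
    }
    tracing = {
        "http": {
            "name": "envoy.zipkin",
            "config": {
                "collector_cluster": cluster_name,
                "collector_endpoint": "/api/v1/spans",
            },
        }
    }
    outer = {
        "envoy_extra_static_clusters_json": json.dumps(cluster),
        "envoy_tracing_json": json.dumps(tracing),
    }
    return json.dumps(outer, indent=2)


if __name__ == "__main__":
    # "jaeger" is the Kubernetes service name used in the example above.
    print(build_proxy_defaults("jaeger"))
```

Because the cluster name is derived once and reused, the collector_cluster in the tracing block cannot drift from the static cluster's name.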

Kind regards,

Nic

Thank you for your reply.

Maybe I wasn’t clear in the original post, but I wasn’t asking for the service mesh to instrument my service internally. My use case is really just this:

To elaborate a bit more: I have a service (let’s call it Service A) which is already instrumented with OpenTracing (Jaeger). Service A calls Service B, which is not yet instrumented. Service B calls Service C, which is instrumented. I’d like to use Envoy to automatically create a span for Service B when Service A calls it.

I actually figured it out a few days ago, after reading the code in Consul Connect, consul-k8s and Envoy. It does involve having Service B forward a couple of OpenTracing headers on to Service C.

Thank you again for replying.

Hi @nic,
I am trying to use the centralConfig block in a Helm installation of Consul and Consul Connect, but it looks like this block is being ignored by the injector.

The only way I managed to make it work was to create the proxy-defaults.hcl file and write to consul using the command line.

Adding my consul-values.yaml:

# Choose an optional name for the datacenter
global:
  datacenter: minidc

# Enable the Consul Web UI via a NodePort
ui:
  service:
    type: 'NodePort'

# Enable Connect for secure communication between nodes
connectInject:
  enabled: true
  centralConfig:
    enabled: true
    proxyDefaults: |
      {
        "envoy_prometheus_bind_addr": "0.0.0.0:9102",
        "envoy_dogstatsd_url": "udp://127.0.0.1:9125",
        "envoy_extra_static_clusters_json": "{\"connect_timeout\": \"3.000s\", \"dns_lookup_family\": \"V4_ONLY\", \"lb_policy\": \"ROUND_ROBIN\", \"load_assignment\": { \"cluster_name\": \"jaeger_9411\",\"endpoints\": [{\"lb_endpoints\": [{\"endpoint\": {\"address\": {\"socket_address\": {\"address\": \"simplest-collector\",\"port_value\": 9411,\"protocol\": \"TCP\"}}}}]}]},\"name\": \"jaeger_9411\",\"type\": \"STRICT_DNS\"}",
        "envoy_tracing_json": "{\"http\":{\"name\":\"envoy.zipkin\",\"config\":{\"collector_cluster\":\"jaeger_9411\",\"collector_endpoint\":\"/api/v1/spans\",\"shared_span_context\":false}}}"
      }

client:
  enabled: true

# Use only one Consul server for local development
server:
  replicas: 1
  bootstrapExpect: 1
  disruptionBudget:
    enabled: true
    maxUnavailable: 0

Any idea?

Hey @ezraroi,

I have just tested your config and it seems to apply correctly using v0.23.1 (also tested with v0.16). The only thing I changed was the address, since on my server the collector runs as jaeger-collector.

Below is my full vars file but I am not seeing anything really different.

This is the kubeconfig I am using for Jaeger.

Even without changes, the configuration applied correctly. You can check the config values with the following command.

➜ kubectl exec -it consul-consul-server-0 sh
/ # consul config read -name global -kind proxy-defaults
{
    "Kind": "proxy-defaults",
    "Name": "global",
    "Config": {
        "envoy_dogstatsd_url": "udp://127.0.0.1:9125",
        "envoy_extra_static_clusters_json": "{\"connect_timeout\": \"3.000s\", \"dns_lookup_family\": \"V4_ONLY\", \"lb_policy\": \"ROUND_ROBIN\", \"load_assignment\": { \"cluster_name\": \"jaeger_9411\",\"endpoints\": [{\"lb_endpoints\": [{\"endpoint\": {\"address\": {\"socket_address\": {\"address\": \"jaeger-collector\",\"port_value\": 9411,\"protocol\": \"TCP\"}}}}]}]},\"name\": \"jaeger_9411\",\"type\": \"STRICT_DNS\"}",
        "envoy_prometheus_bind_addr": "0.0.0.0:9102",
        "envoy_tracing_json": "{\"http\":{\"name\":\"envoy.zipkin\",\"config\":{\"collector_cluster\":\"jaeger_9411\",\"collector_endpoint\":\"/api/v1/spans\",\"shared_span_context\":false}}}"
    },
    "MeshGateway": {},
    "Expose": {},
    "CreateIndex": 4,
    "ModifyIndex": 4
}
/ # 
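If reading the escaped strings by eye is awkward, a short script can unpack them. This is only a sketch: summarize_proxy_defaults is a hypothetical helper, and it assumes you have saved the JSON printed by consul config read (as shown above) into a string.

```python
import json


def summarize_proxy_defaults(config_read_output):
    """Decode the nested Envoy JSON strings from a proxy-defaults entry.

    Hypothetical helper: takes the JSON text printed by
    `consul config read -name global -kind proxy-defaults`.
    """
    entry = json.loads(config_read_output)
    config = entry["Config"]
    # Both values are JSON documents stored as strings, so decode them again.
    cluster = json.loads(config["envoy_extra_static_clusters_json"])
    tracing = json.loads(config["envoy_tracing_json"])
    socket = (cluster["load_assignment"]["endpoints"][0]["lb_endpoints"][0]
              ["endpoint"]["address"]["socket_address"])
    return {
        "collector_address": "%s:%d" % (socket["address"], socket["port_value"]),
        # The tracer must reference the static cluster by the same name.
        "cluster_matches": tracing["http"]["config"]["collector_cluster"] == cluster["name"],
    }


if __name__ == "__main__":
    import sys

    # Example: kubectl exec ... consul config read ... | python this_script.py
    print(summarize_proxy_defaults(sys.stdin.read()))
```

A cluster_matches value of False would mean the tracer points at a cluster name that was never defined, which is one way spans can silently go nowhere.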

If you would like to check out my setup, you can run the demo with the following command. You just need Docker running.

curl https://shipyard.run/apply | bash -s github.com/shipyard-run/blueprints//consul-k8s

If you don’t get it sorted, ping me; maybe we can jump on a Zoom call to try and diagnose the problem.

Kind regards,

Nic

Hi @nic, thanks for your reply…

I found the issue… looks like:

enabled: true

Is not working, while:

enabled: "true"

Is working…

Looks like a bug in the code or in the docs:

enabled (boolean: true) - Turns on the central configuration feature. Pods that have a Connect proxy injected will have their service automatically registered in this central configuration.

Hi, that’s weird. Looking at the code, it uses a Helm if .Values.connectInject.centralConfig.enabled condition, which should behave the same for true or "true".

One thing that may be the underlying cause is that this config can’t be changed once the cluster is brought up. Consul will never update the config, even if you change it in the Helm chart and run helm upgrade. Because of this, we recommend you manage this config outside of the Helm chart, using the consul config write command.
