I’m not able to get trace spans from Envoy into Datadog when running in Fargate. I get traces from the app, but when calling another app with a similar task definition, I don’t get the Envoy spans I expect to see.
In the closest I have come to a working setup, the Envoy container logs an error:
failed to generate xDS resources for "type.googleapis.com/envoy.config.listener.v3.Listener": Any JSON doesn't have '@type'
The task definition I’m using has 4 containers:
1 - the app
2 - consul
3 - envoy
4 - datadog
The Consul and Envoy containers use the same image:
nicholasjackson/consul-envoy:v1.11.2-v1.20.1
and they both use CONSUL_LOCAL_CONFIG to pass the configuration.
The Consul agent container starts with:
"command": [
  "agent",
  "-bind",
  "{{ GetInterfaceIP \"eth1\" }}",
  "-data-dir",
  "/consul/data",
  "-retry-join",
  "DATACENTER"
]
The Envoy container starts with:
"command": [
  "consul",
  "connect",
  "envoy",
  "-sidecar-for",
  "SERVICE",
  "-token",
  "CONSUL_ACL_TOKEN"
]
The CONSUL_LOCAL_CONFIG variable contains a JSON blob with the sidecar proxy config, including the Envoy tracing settings, like:
{
  "service": {
    "connect": {
      "sidecar_service": {
        "proxy": {
          "destination_service_name": "SERVICE",
          "local_service_port": 9090,
          "config": [
            {
              "envoy_public_listener_json": "{\"address\":{...}}",
              "envoy_tracing_json": "{\"http\":{...}}",
              "envoy_extra_static_clusters_json": "{\"name\":\"datadog_local\",...}"
            }
          ]
        }
      }
    }
  }
}
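Because each envoy_*_json value is itself a JSON document embedded as a string, the escaping is easy to get wrong by hand. Below is a minimal sketch of assembling such a blob in Python. It assumes the same keys as the blob above. The helper name build_local_config is hypothetical, and the listener, tracer, and cluster bodies are abbreviated placeholders rather than the real configuration.

```python
import json


def build_local_config(service, port, listener, tracing, cluster):
    """Return a CONSUL_LOCAL_CONFIG string with Envoy overrides as escaped JSON.

    Hypothetical helper: each Envoy override is serialized on its own first,
    so it ends up as a JSON *string* inside the outer document.
    """
    proxy_config = {
        "envoy_public_listener_json": json.dumps(listener),
        "envoy_tracing_json": json.dumps(tracing),
        "envoy_extra_static_clusters_json": json.dumps(cluster),
    }
    blob = {
        "service": {
            "connect": {
                "sidecar_service": {
                    "proxy": {
                        "destination_service_name": service,
                        "local_service_port": port,
                        "config": [proxy_config],
                    }
                }
            }
        }
    }
    return json.dumps(blob)


# Abbreviated placeholder bodies, for illustration only.
local_config = build_local_config(
    "SERVICE",
    9090,
    listener={"address": {"socket_address": {"address": "127.0.0.1", "port_value": 9090}}},
    tracing={"http": {"name": "envoy.tracers.datadog"}},
    cluster={"name": "datadog_local"},
)
print(local_config)
```

Serializing the inner documents separately keeps the double escaping out of hand-written text, and decoding the result again is a quick way to confirm that each override is still valid JSON.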
Those vars expand like:
envoy_extra_static_clusters_json:
{
  "name": "datadog_local",
  "connect_timeout": "3.000s",
  "lb_policy": "ROUND_ROBIN",
  "dns_lookup_family": "V4_ONLY",
  "load_assignment": {
    "cluster_name": "datadog_local",
    "endpoints": [
      {
        "lb_endpoints": [
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "127.0.0.1",
                  "port_value": 8126,
                  "protocol": "TCP"
                }
              }
            }
          }
        ]
      }
    ]
  }
}
envoy_public_listener_json:
{
  "address": {
    "socket_address": {
      "address": "127.0.0.1",
      "port_value": 9090
    }
  },
  "filter_chains": [
    {
      "filters": [
        {
          "name": "envoy.filters.network.http_connection_manager",
          "typed_config": {
            "@type": "type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager",
            "generate_request_id": "true",
            "request_id_extension": {
              "typed_config": {
                "@type": "type.googleapis.com/envoy.extensions.request_id.uuid.v3.UuidRequestIdConfig",
                "use_request_id_for_trace_sampling": "false"
              }
            },
            "codec_type": "auto",
            "stat_prefix": "public_listener",
            "route_config": {
              "name": "local_route",
              "virtual_hosts": [
                {
                  "name": "backend",
                  "domains": [
                    "*"
                  ],
                  "routes": [
                    {
                      "match": {
                        "prefix": "/"
                      },
                      "route": {
                        "cluster": "service1"
                      }
                    }
                  ]
                }
              ]
            },
            "http_filters": [
              {
                "name": "envoy.filters.http.health_check",
                "typed_config": {
                  "@type": "type.googleapis.com/envoy.extensions.filters.http.health_check.v3.HealthCheck",
                  "pass_through_mode": "false",
                  "headers": [
                    {
                      "exact_match": "/healthcheck",
                      "name": ":path"
                    }
                  ]
                }
              },
              {
                "name": "envoy.filters.http.router",
                "typed_config": {}
              }
            ],
            "use_remote_address": "true"
          }
        }
      ]
    }
  ]
}
envoy_tracing_json:
{
  "http": {
    "name": "envoy.tracers.datadog",
    "typed_config": {
      "@type": "type.googleapis.com/envoy.config.trace.v3.DatadogConfig",
      "collector_cluster": "datadog_local",
      "service_name": "envoy"
    }
  }
}
The Datadog config being referenced is from Datadog’s documentation.
The Envoy log shows warnings while generating the xDS listener resource, which point to a misconfigured listener:
[warning][config] [./source/common/config/grpc_stream.h:196] DeltaAggregatedResources gRPC config stream closed since 46s ago: 14, failed to generate all xDS resources from the snapshot: failed to generate xDS resources for "type.googleapis.com/envoy.config.listener.v3.Listener": Any JSON doesn't have '@type'
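Since the error complains about an Any message without '@type', one way to narrow it down is to scan the expanded JSON for typed_config objects that lack that key. Below is a minimal sketch in Python, assuming the listener JSON is available as a string. The function name find_untyped_configs and the sample document are made up for illustration, and the check only looks at keys named typed_config, not at every field Envoy treats as an Any.

```python
import json


def find_untyped_configs(node, path="$"):
    """Yield JSON paths of every "typed_config" object lacking an "@type" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "typed_config" and isinstance(value, dict) and "@type" not in value:
                yield child
            # Keep descending: typed_config objects can nest other typed_config objects.
            yield from find_untyped_configs(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_untyped_configs(item, f"{path}[{index}]")


# Made-up sample: one filter carries "@type", the other does not.
sample = json.loads("""
{
  "http_filters": [
    {"name": "a", "typed_config": {"@type": "type.googleapis.com/example.A"}},
    {"name": "b", "typed_config": {}}
  ]
}
""")

missing = list(find_untyped_configs(sample))
print(missing)
```

Run against the public listener JSON above, a scan like this would report the typed_config of the envoy.filters.http.router entry, which is an empty object. Whether that is what triggers the error is not confirmed here.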
Traces are generated for the app, but Envoy spans are not visible.
I’ve been able to get this to work in Docker containers on my local machine using nicholasjackson’s fake-service demo tracing repo, and I opened a bug on his fake-service repo referencing that. But I can’t seem to get this working in Fargate.
Any guidance on this Envoy config in Fargate would be appreciated.