Get Nomad job logs into Splunk/Elasticsearch

What’s the recommended way to get Docker logs into both the Nomad CLI and GUI and an external logging facility like ELK?

The following works, but it breaks the nomad logs CLI command and the Nomad GUI:

logging {
  type = "syslog"
  config {
    "syslog-address"  = "udp://127.0.0.1:514"
    "syslog-facility" = "local4"
    "tag"             = "foobar"
  }
}

Here some people recommend using the sidecar pattern to run ‘filebeat’, ‘logstash’, ‘fluentd’, or ‘vector.dev’ alongside the Nomad job.

I’d rather not duplicate logging in every nomad job, and instead have an agent running on every nomad agent that forwards all logs. In order to do that, how would you correlate the allocation to a service?

How have others centralized nomad logs?


Hey Spencer,

We use SumoLogic in our environment, so for now I’m using the vendor-provided Docker image for logging containers. I run it as a system job on all Nomad clients. The config basically mounts the Docker socket, so the container can get logs via Docker’s API.

This approach lets me configure labels in the Nomad docker job. The sumo agent is looking for docker labels that correlate environment, app name, etc… to categorize the logs per container.

I don’t have to touch the default docker json-file logging approach, or touch any logging options in the Nomad jobs. It also does not impact nomad’s internal logging for jobs. So “nomad alloc logs” commands will still work just fine.

I believe something similar could be set up with Filebeat as a system job, but I haven’t tried, as we don’t use Elastic for logs. From a cursory glance at the docs, it looks like you can probably configure this with “autodiscover hints”:

https://www.elastic.co/guide/en/beats/filebeat/current/running-on-docker.html
https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html


Hello!
One approach that I have successfully implemented is to use Filebeat installed in the Nomad client node machine.

In this context Filebeat uses the Docker input to capture logs of all the containers running in that node: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-docker.html

I added a rule to the Filebeat configuration to capture logs only from containers that have the label logging=True. This is a convention I created to control which containers have their logs forwarded to Logstash: the event is dropped when the label is not present. https://www.elastic.co/guide/en/beats/filebeat/current/drop-event.html
https://www.nomadproject.io/docs/drivers/docker.html#labels
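The opt-in convention described above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not Filebeat code or configuration; the function names and the event layout are hypothetical, and only the logging=True label comes from the post.

```python
def should_forward(labels, key="logging", wanted="True"):
    """Return True when the container labels opt in to log forwarding."""
    return labels.get(key) == wanted


def filter_events(events):
    """Drop every event whose container does not carry the opt-in label."""
    return [e for e in events if should_forward(e.get("labels", {}))]


# Hypothetical sample events: one opted-in container, one without the label.
events = [
    {"container": "web", "labels": {"logging": "True"}, "message": "GET /"},
    {"container": "cron", "labels": {}, "message": "tick"},
]
kept = filter_events(events)
```

Making forwarding opt-in means a container that is missing the label is silently skipped, which keeps noisy or sensitive workloads out of the pipeline by default.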

To install Filebeat on the Nomad client node, you can either install it directly on the machine or run it as a system job, which runs on every client node and is more convenient.


Another option is to run nomad_follower as a system job. I did some work on it recently so that it works on locked-down clusters, uses temporary Vault credentials to fetch allocations, and so forth. Since it uses a Nomad token rather than the Docker socket, it’s a little more secure, and it writes the logs to a host_volume-mounted directory (so you do need Nomad 0.10.0).

Here’s my fork with the fixes:

Good luck!

We have been using filebeat for a few weeks to ship the logs of several Nomad clusters into Elasticsearch. We wrote a custom module that is currently being contributed back to filebeat. It is implemented as an autodiscover provider, which allows us to specify certain tags in the meta stanza to customize how the logs of a given job/group/task should be parsed.

We deploy/run filebeat as a system job on all the nodes and mount the alloc_dir directory into the container. This way each filebeat instance talks to the local agent and is responsible for shipping only the logs of the current node. Normally we write directly to Kafka, but it could also write directly to ES with the elasticsearch output.

For example, a task running an instance of Nginx (with the default log format) could be tagged on Nomad with the following meta:

meta {
    task-key                                       = "custom-meta"
    "co.elastic.logs/processors.dissect.tokenizer" = "%{ip} - %{user} [%{local_time}] \"%{request}\" %{status} %{bytes_sent} \"%{referer}\" \"%{user_agent}\""
}

This would append the following “section” to the event before it is sent to ES:

"dissect": {
    "bytes_sent": "7231",
    "referer": "http://nginx-web.nomad.trivago.com/",
    "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36",
    "ip": "10.2.10.138",
    "user": "-",
    "local_time": "15/Nov/2019:09:04:04 +0000",
    "request": "GET / HTTP/1.1",
    "status": "200"
}
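To make the mapping from tokenizer to fields easier to follow, here is a rough Python emulation of what a dissect-style tokenizer does. It is a sketch that turns each %{key} into a non-greedy regex group and assumes the literal text between keys is enough to delimit them; it is not how Filebeat implements the processor, and the dissect function name and the sample log line are made up for illustration (the line is reconstructed from the field values shown above).

```python
import re


def dissect(tokenizer, line):
    """Split a line into named fields using a %{key}-style tokenizer."""
    # re.split with one capture group alternates: literal, key, literal, key, ...
    parts = re.split(r"%\{(\w+)\}", tokenizer)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal delimiter text
        else:
            pattern += "(?P<%s>.*?)" % part     # named, non-greedy field
    match = re.fullmatch(pattern, line)
    return match.groupdict() if match else None


TOKENIZER = ('%{ip} - %{user} [%{local_time}] "%{request}" %{status} '
             '%{bytes_sent} "%{referer}" "%{user_agent}"')

# Sample nginx access-log line in the default format.
LINE = ('10.2.10.138 - - [15/Nov/2019:09:04:04 +0000] "GET / HTTP/1.1" 200 7231 '
        '"http://nginx-web.nomad.trivago.com/" "Mozilla/5.0 (Macintosh)"')

fields = dissect(TOKENIZER, LINE)
```

As in the event shown above, every extracted value stays a string; any type conversion would have to happen in a later step.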

By default we enrich each log event with the following data from the Nomad job:

  • job
  • namespace
  • status
  • type (job type: system/service/batch)
  • task.* (information about the task and custom metadata defined in the job/group/task using the meta stanza)
  • datacenters
  • region
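As a rough picture of this enrichment step, the Python sketch below attaches the listed fields to a log event. The job dictionary uses key names from Nomad’s job JSON (Name, Namespace, Status, Type, Datacenters, Region), but the enrich function, the "nomad" key, and the overall event layout are assumptions made for illustration and may not match what the module actually emits.

```python
def enrich(event, job, task_name, task_meta):
    """Return a copy of the event with Nomad job metadata attached."""
    enriched = dict(event)
    enriched["nomad"] = {            # hypothetical field layout
        "job": job["Name"],
        "namespace": job["Namespace"],
        "status": job["Status"],
        "type": job["Type"],         # system/service/batch
        "datacenters": job["Datacenters"],
        "region": job["Region"],
        # task.*: the task name plus custom metadata from the meta stanza
        "task": {"name": task_name, **task_meta},
    }
    return enriched


# Made-up sample values, for illustration only.
job = {
    "Name": "nginx-web",
    "Namespace": "default",
    "Status": "running",
    "Type": "service",
    "Datacenters": ["dc1"],
    "Region": "global",
}
event = {"message": "GET / HTTP/1.1"}
enriched = enrich(event, job, "nginx", {"task-key": "custom-meta"})
```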

:warning: We don’t support metrics (i.e. Metricbeat) yet, but it is on our roadmap.

Hi @jorgelbg,
Your filebeat nomad module sounds very interesting and like something that we could leverage as well. Do you have any info on whether this will be included with filebeat, or whether it will be an open-source module we could leverage? Thanks for any info!

Hi @codyja, we’re contributing our code back to filebeat. Currently, we have opened this PR https://github.com/elastic/beats/pull/14954 that includes the autodiscover module and the processor(s) for Filebeat (as described in my previous comment).

The response from the beats maintainers has been very welcoming so I think that the module will be available in a future release :crossed_fingers:.

That’s awesome news @jorgelbg! Let me know if I can help in any way. I may see if I can figure out how to test it out myself.

If you want to give it a try, I can push the internal Docker image that we’ve been using to a public registry. We’re still working through certain edge cases that we’ve found, but we are using it in our production clusters.

Sure I’d like to give it a spin, that would be great. Thanks

I pushed a private image into docker’s registry. Do you have a docker ID for me to add you as a collaborator so that you can pull the image?

@jorgelbg
I was also searching for a similar solution. Can you share a small doc explaining how to set it up?

Sure! If you have a Docker ID, I can add you to the private repo (where I’ve been pushing the image that we’re currently using internally), and you can test it. I would love to get some external feedback as well.

The documentation (including the sample configuration) is included in the PR on the elastic repository.

I’d be interested in testing this out as well!

Hi @alievrouw sorry for the delayed response, missed the email notification :sweat:

Do you still want to use/test it? I have a container built that we’re using internally and can share it.

Yes, I’m interested in trying it out!

@alievrouw Thanks for giving it a try!

You should be able to pull the jorgelbg/filebeat-nomad image from DockerHub. This is the image version that we’re running right now in all of our environments. Keep in mind that this image was not built using the current filebeat master.

We deploy this as a system job into our entire cluster with a config like:

filebeat.autodiscover:
  providers:
    - type: nomad
      host: {{ env "node.unique.name" }}
      hints.enabled: true
      hints.default_config:
        type: log
        paths:
          - /appdata/nomad/alloc/${data.meta.alloc_id}/alloc/logs/${data.meta.task.name}.stderr.[0-9]*
          - /appdata/nomad/alloc/${data.meta.alloc_id}/alloc/logs/${data.meta.task.name}.stdout.[0-9]*
        ignore_older: 24h

By default, filebeat connects to the local agent (does not need to be specified) to run all queries. Let me know if you have any questions.
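To see which files the two path templates in that config would pick up, the placeholder substitution can be mimicked in Python. This is only a sketch: the resolve helper, the allocation ID, and the task name are hypothetical, and fnmatch is used as a stand-in for Filebeat’s own glob matching.

```python
from fnmatch import fnmatchcase

# Path templates copied from the config above.
PATH_TEMPLATES = [
    "/appdata/nomad/alloc/${data.meta.alloc_id}/alloc/logs/${data.meta.task.name}.stderr.[0-9]*",
    "/appdata/nomad/alloc/${data.meta.alloc_id}/alloc/logs/${data.meta.task.name}.stdout.[0-9]*",
]


def resolve(template, alloc_id, task_name):
    """Fill in the allocation ID and task name placeholders."""
    return (template
            .replace("${data.meta.alloc_id}", alloc_id)
            .replace("${data.meta.task.name}", task_name))


# Hypothetical allocation ID and task name, for illustration only.
patterns = [resolve(t, "1234abcd", "nginx") for t in PATH_TEMPLATES]

log_dir = "/appdata/nomad/alloc/1234abcd/alloc/logs/"
matches_rotated = fnmatchcase(log_dir + "nginx.stdout.0", patterns[1])
matches_other = fnmatchcase(log_dir + "nginx.stdout.tmp", patterns[1])
```

The [0-9]* suffix restricts matching to files whose name continues with a digit after stdout. or stderr., so other files in the same directory that share the task prefix are skipped.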
