When a sidecar container fails

I’m running a bunch of containers in Nomad that are built with a log-shipper sidecar container, following the pattern outlined here: https://nomadproject.io/guides/operating-a-job/accessing-logs/#log-shipper-pattern

When the primary container finishes its task (or fails), there’s a grace period for the sidecar to finish processing logs before it also terminates. I’ve run into a few instances, however, where the sidecar has failed, which in turn causes my primary task/container to terminate.
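Roughly, my groups follow the shape from that guide (a sketch with placeholder image names and timeouts, not my actual jobspec). `leader = true` on the primary task is what couples the sidecar's lifecycle to it, and `kill_timeout` on the sidecar is the grace period:

```hcl
job "batch-with-logs" {
  type = "batch"

  group "work" {
    task "primary" {
      driver = "docker"
      # When the leader task exits, Nomad shuts down the
      # remaining tasks in the group (i.e. the sidecar).
      leader = true

      config {
        image = "example/batch-worker:latest" # placeholder
      }
    }

    task "log-shipper" {
      driver = "docker"
      # Grace period for the shipper to flush logs after
      # the leader exits before it is killed.
      kill_timeout = "60s"

      config {
        image = "example/log-shipper:latest" # placeholder
      }
    }
  }
}
```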

Is there a way to allow the primary task/container to continue running if the sidecar fails? While I want my logs to be as complete as possible, losing a log shipper is far less impactful than having a batch job crash midway through processing its data set. I’ve seen articles that show how to do this in k8s, but I’m not sure how to do it in Nomad. Help?



Not currently. To work around this problem myself, I’ve usually run log shippers as a system job on each host and had them pick up logs from the directories where each allocation writes them. That avoids tying the two lifecycles together (and typically has the bonus of using less memory on the host).
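A rough sketch of that system job (the image name is a placeholder, the host path assumes Nomad's default data_dir, and the Docker driver's host volumes must be enabled on your clients — adjust all of these for your cluster):

```hcl
job "log-shipper" {
  # A system job runs one allocation on every client node,
  # independently of the batch jobs whose logs it ships.
  type = "system"

  group "shipper" {
    task "shipper" {
      driver = "docker"

      config {
        image = "example/log-shipper:latest" # placeholder

        volumes = [
          # Read the alloc log directories from the host.
          # Assumes Nomad's default data_dir of /var/nomad;
          # logs land under <data_dir>/alloc/<alloc_id>/alloc/logs/.
          "/var/nomad/alloc:/nomad/alloc:ro",
        ]
      }
    }
  }
}
```

Because the shipper's lifecycle is tied to the node rather than the batch group, a crashed shipper gets restarted by Nomad without ever touching the batch allocation.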

The upcoming task dependencies feature (https://github.com/hashicorp/nomad/pull/6843) might also help you here.