[Nomad][Autoscaler] Observability of scaling actions


Are there any metrics or events that record which scaling actions have been performed? We are trying to create scaling policies that work for our jobs, and it would be extremely helpful to build a dashboard showing host resource metrics (used and allocated) alongside scaling actions.

This might be a general Nomad question, since a manual scale action (one not taken by the autoscaler) would ideally be visible in the same metric/event stream.

The autoscaler logs are very useful for seeing which checks resulted in an action, and they do contain all the actions. However, using them requires manual searching and correlating. For the aws-asg target plugin we were able to avoid that by using the events from the autoscaling group. I am hoping there is a way to do something similar for jobs.

If not, I’d love to hear how other teams figured out what scaling policies work for their jobs.