Over the past month or so I have been gradually updating various services to use the autoscaler, and I am starting to run into issues when scaling off of Datadog metrics. The main issue is that Datadog, like other SaaS APM tools, enforces fairly aggressive rate limits on timeseries queries made through its API. With Datadog it is possible to work around these limits by having them raised, but this doesn't seem scalable, since they would need to be raised again and again as more scaling checks and services are created.
Does anyone using the autoscaler have advice or best practices for limiting the number of queries it makes against APM tools, for example by aggregating queries across multiple services?
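
To illustrate the kind of query sharing I have in mind, here is a rough Python sketch. It is not how the autoscaler works today and it does not call any real Datadog API. `CachedMetricSource`, `fetch`, and the query string are all hypothetical names I made up for the example. The idea is that scaling checks issuing the same query within a short window would share one upstream API call instead of each making their own.

```python
import time
from typing import Callable, Dict, Tuple


class CachedMetricSource:
    """Serve repeated identical queries from a short-lived cache.

    Hypothetical helper: `fetch` stands in for whatever function
    actually calls the APM provider's API.
    """

    def __init__(
        self,
        fetch: Callable[[str], float],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps query string -> (expiry time, cached value).
        self._cache: Dict[str, Tuple[float, float]] = {}
        # Counts how many calls actually reached the upstream API.
        self.upstream_calls = 0

    def query(self, query: str) -> float:
        now = self._clock()
        hit = self._cache.get(query)
        if hit is not None and now < hit[0]:
            # Still fresh: answer from the cache, no API call.
            return hit[1]
        value = self._fetch(query)
        self.upstream_calls += 1
        self._cache[query] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    def fake_fetch(query: str) -> float:
        # Placeholder for a real API call; always returns the same value.
        return 42.0

    source = CachedMetricSource(fake_fetch, ttl_seconds=30)
    # Five scaling checks asking for the same (illustrative) query.
    for _ in range(5):
        source.query("avg:system.cpu.user{service:web}")
    print(source.upstream_calls)  # prints 1
```

The trade-off is staleness: a check may act on a value up to `ttl_seconds` old, so the window would need to stay shorter than the evaluation interval of the checks that share it.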