[Nomad Autoscaler] nomad-autoscaler.policy.source.error_count metric not being emitted in version 0.3.7

In my Nomad deployment I'm running Nomad Autoscaler 0.3.7, and I have noticed that the nomad-autoscaler.policy.source.error_count metric is not being emitted as expected. I need this metric for monitoring and scaling decisions. According to the telemetry documentation at https://developer.hashicorp.com/nomad/tools/autoscaling/telemetry#nomad-autoscaler-telemetry, the metric should be there.

Here are all the metrics emitted on my side when I query the autoscaler metrics endpoint:

curl http://localhost:8080/v1/metrics\?format\=prometheus
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 4.6621e-05
go_gc_duration_seconds{quantile="0.25"} 0.000122481
go_gc_duration_seconds{quantile="0.5"} 0.000185463
go_gc_duration_seconds{quantile="0.75"} 0.000808118
go_gc_duration_seconds{quantile="1"} 0.179447639
go_gc_duration_seconds_sum 453.801392779
go_gc_duration_seconds_count 44674
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 101
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.17.9"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 2.7680152e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 4.97384121408e+11
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 3.309918e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 4.078634973e+09
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 8.514704e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 2.7680152e+07
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 8.5630976e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 3.043328e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 326453
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 8.138752e+07
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 1.16064256e+08
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.6952750613060596e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 4.078961426e+09
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 2400
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 416432
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 933888
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 3.3155216e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 877098
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 1.376256e+06
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 1.376256e+06
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 1.31092504e+08
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 11
# HELP nomad_autoscaler_plugin_manager_access_ms nomad_autoscaler_plugin_manager_access_ms
# TYPE nomad_autoscaler_plugin_manager_access_ms summary
nomad_autoscaler_plugin_manager_access_ms{plugin_name="aws-asg-us-east-1",plugin_type="target",quantile="0.5"} 0.009200000204145908
nomad_autoscaler_plugin_manager_access_ms{plugin_name="aws-asg-us-east-1",plugin_type="target",quantile="0.9"} 0.009200000204145908
nomad_autoscaler_plugin_manager_access_ms{plugin_name="aws-asg-us-east-1",plugin_type="target",quantile="0.99"} 0.009200000204145908
nomad_autoscaler_plugin_manager_access_ms_sum{plugin_name="aws-asg-us-east-1",plugin_type="target"} 3861.547602577135
nomad_autoscaler_plugin_manager_access_ms_count{plugin_name="aws-asg-us-east-1",plugin_type="target"} 64946
nomad_autoscaler_plugin_manager_access_ms{plugin_name="aws-asg-us-east-2",plugin_type="target",quantile="0.5"} 0.011451000347733498
nomad_autoscaler_plugin_manager_access_ms{plugin_name="aws-asg-us-east-2",plugin_type="target",quantile="0.9"} 0.011451000347733498
nomad_autoscaler_plugin_manager_access_ms{plugin_name="aws-asg-us-east-2",plugin_type="target",quantile="0.99"} 0.011451000347733498
nomad_autoscaler_plugin_manager_access_ms_sum{plugin_name="aws-asg-us-east-2",plugin_type="target"} 5588.599981206935
nomad_autoscaler_plugin_manager_access_ms_count{plugin_name="aws-asg-us-east-2",plugin_type="target"} 34710
# HELP nomad_autoscaler_policy_total_num nomad_autoscaler_policy_total_num
# TYPE nomad_autoscaler_policy_total_num gauge
nomad_autoscaler_policy_total_num 2
# HELP nomad_autoscaler_runtime_alloc_bytes nomad_autoscaler_runtime_alloc_bytes
# TYPE nomad_autoscaler_runtime_alloc_bytes gauge
nomad_autoscaler_runtime_alloc_bytes 2.756036e+07
# HELP nomad_autoscaler_runtime_free_count nomad_autoscaler_runtime_free_count
# TYPE nomad_autoscaler_runtime_free_count gauge
nomad_autoscaler_runtime_free_count 4.078635008e+09
# HELP nomad_autoscaler_runtime_heap_objects nomad_autoscaler_runtime_heap_objects
# TYPE nomad_autoscaler_runtime_heap_objects gauge
nomad_autoscaler_runtime_heap_objects 324884
# HELP nomad_autoscaler_runtime_malloc_count nomad_autoscaler_runtime_malloc_count
# TYPE nomad_autoscaler_runtime_malloc_count gauge
nomad_autoscaler_runtime_malloc_count 4.078959872e+09
# HELP nomad_autoscaler_runtime_num_goroutines nomad_autoscaler_runtime_num_goroutines
# TYPE nomad_autoscaler_runtime_num_goroutines gauge
nomad_autoscaler_runtime_num_goroutines 97
# HELP nomad_autoscaler_runtime_sys_bytes nomad_autoscaler_runtime_sys_bytes
# TYPE nomad_autoscaler_runtime_sys_bytes gauge
nomad_autoscaler_runtime_sys_bytes 1.31092504e+08
# HELP nomad_autoscaler_runtime_total_gc_pause_ns nomad_autoscaler_runtime_total_gc_pause_ns
# TYPE nomad_autoscaler_runtime_total_gc_pause_ns gauge
nomad_autoscaler_runtime_total_gc_pause_ns 4.53801377792e+11
# HELP nomad_autoscaler_runtime_total_gc_runs nomad_autoscaler_runtime_total_gc_runs
# TYPE nomad_autoscaler_runtime_total_gc_runs gauge
nomad_autoscaler_runtime_total_gc_runs 44674
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 7606.34
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 25
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 7.2183808e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.69413238968e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 8.36616192e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19

Hi @kuklipuk,

This happens because no errors are being recorded, so the telemetry exporter garbage-collects the data point. Currently, the Nomad Autoscaler does not pre-define data points and export them regardless of whether any data has been recorded. Therefore, you can assume no metric means a zero count.
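
On the consumer side, the "no metric means zero" rule can be applied when reading the endpoint output. The following is a minimal sketch, not part of the Nomad Autoscaler: the function name `counter_total` is hypothetical, and the Prometheus-style name `nomad_autoscaler_policy_source_error_count` is an assumption based on how the other metrics in the output above are rendered (dots and dashes replaced by underscores).

```python
import re

# Matches a sample line of the Prometheus text format: name, optional {labels}, value.
SAMPLE_LINE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)")


def counter_total(exposition_text: str, metric_name: str) -> float:
    """Sum every sample of metric_name; return 0.0 when the series is absent."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match and match.group(1) == metric_name:
            total += float(match.group(3))
    return total


# Two lines taken from the endpoint output above; the error counter is absent.
SAMPLE = """\
# TYPE nomad_autoscaler_policy_total_num gauge
nomad_autoscaler_policy_total_num 2
"""

# Assumed Prometheus name for nomad-autoscaler.policy.source.error_count.
errors = counter_total(SAMPLE, "nomad_autoscaler_policy_source_error_count")
policies = counter_total(SAMPLE, "nomad_autoscaler_policy_total_num")
print(errors, policies)
```

The missing series is simply read as zero, which matches the behaviour described above, but it cannot distinguish "no errors yet" from "the exporter is not running", so a separate liveness check is still useful.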

Predefining metrics sounds like a good feature addition, so please feel free to raise a request against the repository.

Thanks,
jrasell and the Nomad team

“you can assume no metric means a zero count”

“Time series that are not present until something happens are difficult to deal with, as the usual simple operations are no longer sufficient to correctly handle them. To avoid this, export a default value such as 0 for any time series you know may exist in advance.”
Source: Prometheus instrumentation best practices, "Avoid missing metrics".
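
To illustrate what the Prometheus guidance is asking for, here is a minimal sketch of an exporter that pre-declares its counters so they are rendered with a value of 0 before any event occurs. This is not how the Nomad Autoscaler or its telemetry library is implemented: the `CounterRegistry` class is hypothetical, and the metric name is the assumed Prometheus rendering of nomad-autoscaler.policy.source.error_count.

```python
class CounterRegistry:
    """Hypothetical registry that always exports pre-declared counters."""

    def __init__(self, predeclared):
        # Every known counter starts at 0 so it is present from the first scrape.
        self._values = {name: 0.0 for name in predeclared}

    def inc(self, name, amount=1.0):
        # Counters that were not pre-declared are created on first use.
        self._values[name] = self._values.get(name, 0.0) + amount

    def render(self):
        # Emit a TYPE line and a sample line per counter, in name order.
        lines = []
        for name in sorted(self._values):
            lines.append(f"# TYPE {name} counter")
            lines.append(f"{name} {self._values[name]:g}")
        return "\n".join(lines) + "\n"


registry = CounterRegistry(["nomad_autoscaler_policy_source_error_count"])
before = registry.render()  # the counter is exported as 0 with no errors recorded

registry.inc("nomad_autoscaler_policy_source_error_count")
after = registry.render()   # the same series now reports 1
print(before, after, sep="")
```

With the series always present, alert rules can compare values or rates directly instead of special-casing a missing metric.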