Issue with adjusting service weights based on check return values?

Hello,

I cannot seem to get the Consul agent to adjust the weight of a node serving a service based on the result of a check script. I am using HAProxy resolvers.

Versions:

Consul v1.15.2
Revision 5e08e229
Build Date 2023-03-30T17:51:19Z
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

Config:
consul.hcl:

enable_local_script_checks = true
service {
  name = "myservice"
  port = 5000
  weights = {
    passing = 63999
    warning = 255
  }
  checks = [
    {
      id = "maint"
      name = "maint"
      args = ["/home/consul/exit_code.sh"]
      interval = "10s"
      timeout = "1s"
    },
    {
      id = "check"
      name = "HTTP API on port 5000"
      http = "http://localhost:5000/health"
      interval = "10s"
      timeout = "1s"
    }
  ]
}

exit_code.sh:

#!/bin/bash
exit 1

Per the docs

Check script conventions
A check script's exit code is used to determine the health check status:

Exit code 0 - Check is passing
Exit code 1 - Check is warning
Any other code - Check is failing

I wrote a test script, exit_code.sh, that just exits 0 or exits 1, and I edit it manually to try to flip the weight. When it returns 0, everything works. When it returns 1, the service goes into maintenance in the load balancer: the weight is not adjusted, and a DNS query returns only one of the two nodes. I thought, perhaps incorrectly, that the warning weight would be returned when a check goes to warning. Is that not the case?
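
For illustration, the exit-code convention quoted above could be driven by flag files instead of hand edits. This is only a sketch under assumptions: the function name check_status and both flag-file paths are hypothetical and not part of the setup described here.

```shell
#!/bin/bash
# Sketch only: maps two hypothetical flag files onto the exit codes
# Consul reads from a script check (0 passing, 1 warning, other failing).
check_status() {
  # Both default paths are assumptions made for illustration.
  local warn_flag="${1:-/home/consul/warn.flag}"
  local fail_flag="${2:-/home/consul/fail.flag}"

  if [ -e "$fail_flag" ]; then
    return 2   # any code other than 0 or 1 marks the check as failing
  fi
  if [ -e "$warn_flag" ]; then
    return 1   # warning
  fi
  return 0     # passing
}
```

Ending the script with `check_status; exit $?` would hand the result to the agent, so touching or removing the warning flag file flips the check between warning and passing without editing the script.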

Thank you,
Paul

According to the service configuration reference on HashiCorp Developer, Consul weights only apply to DNS SRV queries.

Whatever weight configuration exists in your load balancer would be a matter for whatever piece of integration code queries Consul to modify the load balancer configuration.

Hi Max,
Thanks for the reply. I’m basically doing this.

I did check the SRV records using dig, and the node that is in the warning state is not returned at all.

I guess that simplifies things - we can ignore the haproxy part of the environment, as the behaviour has been checked without that subsystem.

I’m not really sure what to check next though - my only ideas are fairly basic ones:

  • There is a setting - dns_config { only_passing = true } - which can be set in the Consul config file to hide warning status nodes from DNS.

  • Maybe something unexpected has happened and the node isn’t in warning status?
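
Both ideas can probably be checked from a shell on the node. The sketch below assumes the agent's default ports (8500 for HTTP, 8600 for DNS) and the service name myservice from the config above; srv_weights is an illustrative helper, not a Consul tool, and it only reformats `dig +short` SRV lines (priority, weight, port, target).

```shell
#!/bin/bash
# Assumed commands, shown as comments because they need a running agent:
#   Check status as the agent sees it (look at the "Status" fields):
#     curl -s http://127.0.0.1:8500/v1/health/checks/myservice
#   SRV answer with weights, one line per returned node:
#     dig @127.0.0.1 -p 8600 myservice.service.consul SRV +short | srv_weights

# Reads "priority weight port target" lines on stdin and prints
# "target:port weight=N", dropping the trailing dot from the target.
srv_weights() {
  awk 'NF == 4 { sub(/\.$/, "", $4); print $4 ":" $3, "weight=" $2 }'
}
```

If the check shows as warning in the HTTP output but the node is missing from the SRV answer, that would point at the DNS side (such as only_passing) rather than at the check script.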

Max,

I completely overlooked the simple dns_config only_passing setting. It was set to true. Fixed that, and everything is working as expected. Thanks for the tip.

Best,
Paul