What is the specific definition of "local agent"?

Hi guys, I’m trying to use consul.

The terms “local agent”, “local Consul agent”, and “local state” come up a lot in the documentation, for example on the anti-entropy and /agent/service pages.

I would like to know what it means exactly, is it a generic term for all agents in the current datacenter, or is it a single agent that is currently being accessed? What exactly is the range that 'local' refers to?

For example, /agent/service is used to interact with the services of the “local consul agent”, while /catalog can interact with all services, and it can also cross data centers.

So can agent/service interact with all services in the current datacenter, or can it only interact with services registered to the currently connected agent node?

Sorry, I’m still using dev mode and not running a cluster yet, so I can’t experiment with multiple agents for now. I’d like to hear the official definition, but I can’t find it in the glossary or other documentation.

In a traditional VM-based or bare metal hosting environment, each machine runs a Consul agent.

Each service registers with the Consul agent local to the machine it is running on. When you see “local agent”, it’s a reference to that pattern.
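As a rough illustration of that pattern, here is a minimal Go sketch of a service registering itself with the agent on its own machine via PUT /v1/agent/service/register. It assumes the agent listens on the default http://127.0.0.1:8500; the service name `web`, the ID `web-1`, the port, and the health-check URL are made up for the example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// check is a small subset of the health-check definition Consul accepts.
type check struct {
	HTTP     string
	Interval string
}

// registration is a small subset of the body for PUT /v1/agent/service/register.
type registration struct {
	ID    string
	Name  string
	Port  int
	Check check
}

// registerURL returns the registration endpoint of one specific agent.
func registerURL(agentAddr string) string {
	return agentAddr + "/v1/agent/service/register"
}

// register sends the registration to the agent at agentAddr only;
// that agent then syncs it into the datacenter-wide catalog.
func register(agentAddr string, r registration) error {
	body, err := json.Marshal(r)
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPut, registerURL(agentAddr), bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	// Hypothetical service values, for illustration only.
	r := registration{
		ID:    "web-1",
		Name:  "web",
		Port:  8080,
		Check: check{HTTP: "http://127.0.0.1:8080/healthz", Interval: "10s"},
	}
	if err := register("http://127.0.0.1:8500", r); err != nil {
		fmt.Println("registration failed:", err)
	}
}
```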

Each agent is responsible for syncing information about the services which have registered with it - locally - back into the overall cluster catalog of services - and making sure that stays up to date - that’s what anti-entropy refers to.

The /agent/ HTTP APIs are the ones used for interactions between services and their local agents. They are distinct from the rest of the HTTP APIs, which are mostly served by Consul servers, with client agents forwarding such requests to them (servers being specific agents which play an additional role in managing the overall datacenter cluster state).


Hi @maxb , Thank you for your reply.

I think I have a general understanding of: The /agent/service API, strictly speaking, can only interact with services that are registered to this Agent. In the usual deployment structure, these services are the services on the machine where the Agent is located.

I have a few more questions and thoughts. To describe them clearly, I will divide them into three parts:
(English is not my native language, so please understand any reading difficulties this may cause for you. Thank you very much for reading and I hope to hear back from you!)

1. Confirmation of a question

I would like to confirm a question:
Suppose I have two services and two client agents running on two cloud hosts. In each service I have written code to register itself and to discover the other service, and this code connects to the client agent on its own machine.

So: should the code for discovering services be implemented using the /catalog API? Because if /agent/service is used, each service will only be able to query itself?

2. Thoughts that follow if the above is correct

If so, /agent/service seems a bit hard to use because the code is coupled to where the service is deployed – the service you want to discover using the /agent/service API must be deployed locally.

So, is it common to use the /catalog API to discover services? Because that’s how I can guarantee that I can always look up the service I need.

Purely in terms of ease of use, I think /agent/service + /agent/health/service/id works better than /catalog + /health, because the latter two both return arrays of service instances and /health cannot be queried by ID.

Am I right in thinking this way?

3. About the deployment of client agent

In the first sentence of your reply, you mentioned that a consul agent is usually deployed for each machine (I think you mean client agent)

I found a description in this document: “In a typical deployment, you must run client agents on every compute node in your datacenter”
May I ask if the “compute node” mentioned here refers to a single machine?

You are right in thinking that the /agent APIs give you a view of only that one agent.

The /catalog APIs give you a view of the whole datacenter.
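To make the difference in scope concrete, here is a small Go sketch that builds the two kinds of request URL. Both are sent to the same agent address; only the path decides whether the answer covers that one agent or the whole datacenter. The agent address and the names `web` and `dc2` are assumptions for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// agentServicesURL lists only the services registered with the agent
// that receives the request.
func agentServicesURL(agentAddr string) string {
	return agentAddr + "/v1/agent/services"
}

// catalogServiceURL lists every instance of a service known to the
// datacenter catalog. A non-empty dc asks for another datacenter.
func catalogServiceURL(agentAddr, service, dc string) string {
	u := agentAddr + "/v1/catalog/service/" + url.PathEscape(service)
	if dc != "" {
		u += "?dc=" + url.QueryEscape(dc)
	}
	return u
}

func main() {
	const agent = "http://127.0.0.1:8500" // assumed default local agent address
	fmt.Println(agentServicesURL(agent))
	fmt.Println(catalogServiceURL(agent, "web", ""))
	fmt.Println(catalogServiceURL(agent, "web", "dc2"))
}
```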

However, for discovering services, the /query APIs are the usual approach. They give you automatic exclusion of unhealthy services, and the ability to query for services in other datacenters if desired.

Yes, “compute node” is just a way of saying “machine, however physical or virtual it might be”.


Hi @maxb , Thanks for the guidance, hopefully I understand it completely.

I experimented with /query

  • It can return both service details (including address) and health status.
  • It requires creating a “preparedQuery” (and optionally a “session”), plus some extra code to delete the preparedQuery, so that the creation code does not fail when the program restarts and runs again (two preparedQueries with the same name are not allowed).
  • By setting OnlyPassing=false, it can return service instances with passing and warning status, but without critical.

However, I also found an API: /health/service

  • It can also return both service details (with address) and health status. (Its response content has the same service and check sections as /query)
  • It is less code than /query because it does not need to create a preparedQuery.
  • Setting passing=false, it can return service instances in any health state (including critical), so it’s more comprehensive than /query.

Based on the above comparison, I feel /health/service is a bit better to use.
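For reference, a minimal Go sketch of handling a /health/service response could look like the following. It only decodes a handful of fields (Node, Service, Checks) and works on an embedded sample response rather than a live agent; the sample values are invented. A real client would fetch the body from something like GET /v1/health/service/web?passing=true on the local agent.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strconv"
)

// entry holds a subset of one element of the /v1/health/service/<name> array.
type entry struct {
	Node struct {
		Node    string
		Address string
	}
	Service struct {
		ID      string
		Service string
		Address string
		Port    int
	}
	Checks []struct {
		CheckID string
		Status  string
	}
}

// sample is an invented response with a single instance.
const sample = `[{
  "Node":    {"Node": "host-a", "Address": "10.0.0.5"},
  "Service": {"ID": "web-1", "Service": "web", "Address": "", "Port": 8080},
  "Checks":  [
    {"CheckID": "serfHealth",    "Status": "passing"},
    {"CheckID": "service:web-1", "Status": "passing"}
  ]
}]`

// decodeEntries parses the JSON array returned by the endpoint.
func decodeEntries(body []byte) ([]entry, error) {
	var out []entry
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, err
	}
	return out, nil
}

// endpoint returns host:port for an instance. When the service has no
// address of its own, the address of its node is used instead.
func endpoint(e entry) string {
	addr := e.Service.Address
	if addr == "" {
		addr = e.Node.Address
	}
	return net.JoinHostPort(addr, strconv.Itoa(e.Service.Port))
}

func main() {
	entries, err := decodeEntries([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, e := range entries {
		fmt.Println(e.Service.ID, endpoint(e))
	}
}
```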

But since you said /query is commonly used, I feel unsure about it. I tried to Google some golang sample code for service discovery using consul, however, there are too few useful results.


Also, the data returned by /query and /health/service always contains a serfHealth entry in the Checks array. It seems to be related to gossip; do developers need to care about it? Is it always at array[0]? If so, I would not have to write additional logic to determine the health of the service I actually want.

Thanks for reading.

If the more limited capabilities of /health/service are sufficient for you, by all means use it.

But, if you want Consul to support automatic fallback to services running in different Consul datacenters, you do need to use queries.

The query endpoint can be used with sessions and anonymous queries, but I am more familiar with using it with predefined named queries. It also supports registering templated queries, although this feature is shockingly underdocumented: AFAIK it is mentioned only in Prepared Queries - HTTP API | Consul | HashiCorp Developer, and in not that much detail. With a template it’s possible to, for example, register a query called best-, with a regular expression of ^best-.*$, and use it to query for best-<name-of-service> for multiple services.
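A templated query of that kind might be defined roughly as in the Go sketch below, which only builds the JSON body for POST /v1/query and the path used to execute it. Several details are assumptions rather than something stated in this thread: the regular expression is given a capture group so that `${match(1)}` can pick out the service name, and the `NearestN: 2` failover setting is just an example value for falling back to other datacenters.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// queryTemplate is the Template section of a prepared query.
type queryTemplate struct {
	Type   string
	Regexp string
}

// failover controls fallback to other datacenters.
type failover struct {
	NearestN int
}

// serviceQuery is a subset of the Service section of a prepared query.
type serviceQuery struct {
	Service     string
	OnlyPassing bool
	Failover    failover
}

// preparedQuery is a subset of the body for POST /v1/query.
type preparedQuery struct {
	Name     string
	Template queryTemplate
	Service  serviceQuery
}

// bestTemplate builds a catch-all template for names starting with "best-".
func bestTemplate() preparedQuery {
	return preparedQuery{
		Name: "best-",
		Template: queryTemplate{
			Type:   "name_prefix_match",
			Regexp: "^best-(.*)$", // capture group added for this example
		},
		Service: serviceQuery{
			Service:     "${match(1)}", // filled in from the capture group
			OnlyPassing: true,
			Failover:    failover{NearestN: 2}, // example value
		},
	}
}

// executePath returns the path that runs a query by name.
func executePath(name string) string {
	return "/v1/query/" + url.PathEscape(name) + "/execute"
}

func main() {
	body, err := json.MarshalIndent(bestTemplate(), "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body))          // body to POST to /v1/query
	fmt.Println(executePath("best-web")) // then GET this path to look up "web"
}
```

Because the template is registered once and matched by prefix, the same definition serves every service name, which also avoids creating one named query per service.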

The meaning of the serfHealth check is, “Is the Consul agent through which this service is registered, considered healthy by the Consul cluster membership gossip system (serf)?” That is, it uses the Consul cluster membership gossip as a way to detect hosts which have gone down.

I doubt that serfHealth is guaranteed to always be first in the Checks array.

You generally shouldn’t be writing that kind of health-evaluation logic yourself anyway. Just specify passing=true and let Consul take care of it for you.
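If client-side inspection of the checks is needed for some other reason, a safer approach than indexing into the array is to look a check up by its CheckID, or to take the worst status across all checks. The following Go sketch shows both under simple assumptions: the check IDs and statuses are invented sample data, and any status other than passing or warning is treated as the worst case.

```go
package main

import "fmt"

// check holds the two fields of a Consul health check used here.
type check struct {
	CheckID string
	Status  string
}

// findCheck returns the check with the given ID, wherever it sits in the slice.
func findCheck(checks []check, id string) (check, bool) {
	for _, c := range checks {
		if c.CheckID == id {
			return c, true
		}
	}
	return check{}, false
}

// rank orders statuses from best to worst. Anything that is not
// "passing" or "warning" is ranked alongside "critical".
func rank(status string) int {
	switch status {
	case "passing":
		return 0
	case "warning":
		return 1
	default:
		return 2
	}
}

// worstStatus returns the status of the worst check, or "passing"
// when there are no checks at all.
func worstStatus(checks []check) string {
	worst := "passing"
	for _, c := range checks {
		if rank(c.Status) > rank(worst) {
			worst = c.Status
		}
	}
	return worst
}

func main() {
	// Invented sample data; serfHealth is deliberately not first.
	checks := []check{
		{CheckID: "service:web-1", Status: "warning"},
		{CheckID: "serfHealth", Status: "passing"},
	}
	if c, ok := findCheck(checks, "serfHealth"); ok {
		fmt.Println("node gossip status:", c.Status)
	}
	fmt.Println("overall status:", worstStatus(checks))
}
```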


Ok, @maxb , thank you very much for your help.

Your words have been very helpful to me. I think I am getting to understand how to use consul properly.

Hi, @maxb

What is the function of automatic fallback?

I can’t find anything about it in the /query API documentation.

I have no idea. I have never heard of “auto-rewind” in Consul.

@maxb , Sorry, it’s a translation problem.
To keep this Q&A from getting verbose, I corrected my original question, and I hope you will edit your reply as well.

See this part of the documentation: Prepared Queries - HTTP API | Consul | HashiCorp Developer
