Consul API Gateway - Plans or Suggestions for Rate Limiting and Monitoring?

Right now, we manage a fleet of Nginx instances on AWS EC2 and use consul-template to generate our Nginx configuration files. The generated configuration defines Nginx upstreams so that we can route requests to pods running inside our EKS cluster.

We implement rate limits by IP and HTTP Headers using custom Lua code for Nginx, including logic for implementing slow start, etc.

The Consul API Gateway resource looks very appealing. It could potentially reduce the operational burden of managing our Nginx fleet and allow a cleaner division of responsibility between the cluster operator and the various service owners who own specific endpoints.

How would you recommend implementing rate limiting for various endpoints with a deployment of Consul API Gateway? Is this being researched or on the roadmap?

Another question: how would you recommend implementing monitoring for Consul API Gateway? For example, we would want to see requests that, for one reason or another, were not routed to a pod, such as when the gateway itself served a 429 or 502 (if such logic exists).

Hi @ktham,

Consul API Gateway doesn’t currently support rate limiting but it is on our roadmap.

Can you tell me more about how you want to do rate limiting? One specific question I have is, if you have multiple instances of the same logical gateway, for a given source IP or HTTP header, do you want the rate limit to be what is allowed by each instance or the total of what’s allowed by all the instances?

As for monitoring the traffic, API Gateway doesn’t add any monitoring capabilities to what Consul already has. You can find more information in the Observability and the Distributed Tracing sections of the Consul product documentation.

We do have additional traffic monitoring capabilities on the roadmap for API Gateway.

Let me know if you have any other questions.

Kind Regards,
Jeff


Hey @Jeff-Apple, I was out for a while, apologies for the late response.

> One specific question I have is, if you have multiple instances of the same logical gateway, for a given source IP or HTTP header, do you want the rate limit to be what is allowed by each instance or the total of what’s allowed by all the instances?

We’d like the configuration to specify the total allowed by all gateway instances. So if there are 5 instances of the logical gateway, we’d like to specify a rate N, and each gateway instance would then enforce round_up(N/5). This assumes that requests are reasonably evenly distributed across all gateway instances. When the number of gateway instances is scaled up or down, the rate for each instance would ideally be updated automatically.
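To make the arithmetic concrete, here is a minimal Python sketch of the round_up(N/instances) split described above. It is only an illustration of the request, not anything Consul API Gateway provides; the function name `per_instance_rate` is made up for this example.

```python
import math


def per_instance_rate(total_rate: float, instance_count: int) -> int:
    """Split a fleet-wide rate N across gateway instances, rounding up.

    Hypothetical helper: assumes traffic is spread evenly across instances.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return math.ceil(total_rate / instance_count)


if __name__ == "__main__":
    total = 100  # example fleet-wide limit N, in requests per second
    for instances in (5, 4, 3):
        # Recomputed whenever the number of gateway instances changes.
        print(instances, per_instance_rate(total, instances))
```

One consequence of rounding up: with N = 100 and 3 instances, each instance allows 34, so the fleet as a whole can admit up to 102. That small overshoot would be acceptable for our use case.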

The idea here is that we’d like to do both IP-based rate limiting and API-key-based rate limiting (with the key specified through an HTTP header) for specific paths/routes.
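As a rough sketch of the behavior we have in mind, the Python below keeps one token bucket per (route, client IP) and one per (route, API key), and rejects with a 429 if either is exhausted. All class and function names here are hypothetical, the token bucket is just one possible algorithm, and none of this reflects an existing Consul API Gateway feature.

```python
import time


class TokenBucket:
    """Simple token bucket: refills at `rate` tokens/second up to `burst`."""

    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.updated = now

    def allow(self, now):
        # Refill based on elapsed time, capped at the burst size.
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class KeyedRateLimiter:
    """One bucket per key, e.g. (route, client IP) or (route, API key)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self._clock = clock
        self._buckets = {}

    def allow(self, key):
        now = self._clock()
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, now)
            self._buckets[key] = bucket
        return bucket.allow(now)


def check_request(route, client_ip, api_key, ip_limiter, key_limiter):
    """Return the HTTP status a gateway instance would serve (200 or 429)."""
    ip_ok = ip_limiter.allow((route, client_ip))
    # Requests without an API key are only subject to the IP limit here.
    key_ok = api_key is None or key_limiter.allow((route, api_key))
    return 200 if ip_ok and key_ok else 429
```

Note that this simplified version consumes a token from both limiters even when one of them rejects the request, and it never evicts idle buckets; a real implementation would need to handle both.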

Hi @ktham,

I really appreciate you taking the time to provide us this info. It helps as we plan the details for adding rate limiting to the product.

Kind Regards,
Jeff

To clarify: I am interested in getting requests coming into the API gateway to show up in the Jaeger trace. Is this possible today with distributed tracing? So far I have only managed to get the first service to show up.

And if not, when is this planned?