Clustering/Microservice deployments (Grafana Mimir/Loki)

I am trying to create a sensible Nomad-Pack for deploying Grafana Mimir in microservice mode, but I am stuck on a few problems (the Loki deployment uses a similar model).

Mimir consists of multiple components/targets, which can be scaled & deployed individually.
The different components keep track of other instances using memberlist (gossip protocol). Communication between individual instances & components is done using gRPC (multiple hash-rings).

I want to create a pack as follows:

  • Dynamic number of Nomad Groups, each representing an identical configuration (consisting of one or more components).
  • Each Nomad group can scale to multiple instances.
  • Each instance registers with memberlist on start (including its own gRPC/memberlist connection-info)
  • Every instance in any group needs to be reachable from all other instances via its published gRPC & memberlist connection-info.

Currently I deploy as follows:

  • Register gRPC & memberlist as service for each instance
  • Expose the grpc-ports for each instance on host (dynamic port) in the Nomad “network” config.
  • Add memberlist-service as consul upstream, and use this local port in application-config when registering w/memberlist (since using gossip)
  • Use Nomad env.variables from dynamically exposed ports (memberlist/grpc) in application-config for registering in memberlist.
  • Each nomad-group is created based on a variable (list) w/custom parameters
  • Frontend proxy (nginx) manages read/write routing using consul-template + consul-service tags.

Example from current config:

# ..snip
memberlist:
  cluster_label: x-nomad
  node_name: $${NOMAD_GROUP_NAME}-$${NOMAD_ALLOC_INDEX}
  advertise_addr: $${NOMAD_HOST_IP_memberlist}
  advertise_port: $${NOMAD_HOST_PORT_memberlist}
  join_members: ["localhost:7947"]

ingester:
  ring:
    replication_factor: 2
    instance_id: $${NOMAD_GROUP_NAME}-$${NOMAD_ALLOC_INDEX}
    instance_addr: $${NOMAD_HOST_IP_grpc}
    instance_port: $${NOMAD_HOST_PORT_grpc}
    kvstore:
      store: memberlist

distributor:
  remote_timeout: 5s
  ring:
    instance_id: $${NOMAD_GROUP_NAME}-$${NOMAD_ALLOC_INDEX}
    instance_addr: $${NOMAD_HOST_IP_grpc}
    instance_port: $${NOMAD_HOST_PORT_grpc}
    kvstore:
      store: memberlist
# ..snip
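
For reference, the Nomad side of one such group might look roughly like the sketch below. It is illustrative only: the group name, count, service names and tag are hypothetical, and attaching the Connect sidecar to the memberlist service is an assumption. Only the port labels (grpc, memberlist) and the local upstream port 7947 are taken from the config above.

```hcl
# Hypothetical group; in the pack, one of these is generated per list entry.
group "ingester" {
  count = 2

  network {
    mode = "bridge"

    # Dynamic host ports. These labels are what produce
    # NOMAD_HOST_IP_grpc / NOMAD_HOST_PORT_grpc and
    # NOMAD_HOST_IP_memberlist / NOMAD_HOST_PORT_memberlist.
    port "grpc" {}
    port "memberlist" {}
  }

  # gRPC registered as a plain Consul service (name and tag are assumed).
  service {
    name = "mimir-grpc"
    port = "grpc"
    tags = ["ingester"]
  }

  # memberlist registered as a service, with itself as a Connect upstream
  # so that join_members can point at localhost:7947.
  # Assumption: the sidecar is attached to this service.
  service {
    name = "mimir-memberlist"
    port = "memberlist"

    connect {
      sidecar_service {
        proxy {
          upstreams {
            destination_name = "mimir-memberlist"
            local_bind_port  = 7947
          }
        }
      }
    }
  }

  # task "mimir" { ... } omitted
}
```

The two dynamic port blocks are what publish gRPC and memberlist on the host; the upstream is only used for the initial join.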

I want to solve this WITHOUT having to expose the ports on the host (keep all communication “inside” consul mesh), but I am not able to see a sensible way of doing this. Any suggestions for solving this deployment scenario in a better way would be much appreciated (or pointers to similar nomad-jobs)!

Were you able to solve this? I’d like to give this a shot too.

I was not able to solve it the way I wanted, no.

The example pack from the original question is more or less what I ended up using (it exposes the ports on the hosts).

I didn’t spend much more time on it, since no one else seemed to have the same issue/challenge.