How to use Maglev load balancing with gRPC on Kubernetes

Hi there, I'm trying to deploy sticky sessions with a gRPC client, a gRPC server, and a load balancer, but failing hard.
Here is my load balancer:

apiVersion: 'v1'
kind: 'Service'
metadata:
  name: grpc-test-load-balancer
  labels:
    app: grpc-loadbalancer
spec:
  ports:
    - protocol: 'TCP'
      port: 18788
      targetPort: 18788
  selector:
    app: 'grpc-test'
  type: 'LoadBalancer'
  loadBalancerIP: ''
---
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: grpc-test-load-balancer
spec:
  loadBalancer:
    policy: maglev
    hashPolicies:
      - field: "cookie"
        fieldValue: "brown-cookie"

Here is my gRPC server deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-test
  template:
    metadata:
      name: grpc-test
      labels:
        app: grpc-test
      annotations:
        'consul.hashicorp.com/connect-inject': 'true'
        'prometheus.io/scrape': 'true'
        'prometheus.io/port': '9102'

    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      serviceAccountName: grpc-test
      containers:
        - name: grpc-test
          imagePullPolicy: Never
          image: docker-beast
          command: ["/bin/sh", "-c"]
          args: ["cd /usr/local/bin; ./grpc-test-server -e 0.0.0.0:18788"]
          ports:
            - containerPort: 18788
              name: grpc
          env:
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /usr/local/bin/
              name: kuberbins

      volumes:
        - name: kuberbins
          hostPath:
            path: /mnt/k8s/
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grpc-test

I’m using the service DNS name and port (grpc-test-load-balancer:18788) in my gRPC client to connect to the gRPC server pods, but I can’t figure out what I’m doing wrong. The load-balancing algorithm is always random. Can you help?

Hi @canuysal,

You will also need to deploy a sidecar proxy alongside your gRPC client application and configure the client to connect to a local port exposed by the proxy, which routes traffic across the service mesh to the upstream gRPC server.

Here’s an example client and server deployment config that may help explain how the communication should be configured.

# gRPC server resolver configuration
---
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: grpc-test
spec:
  loadBalancer:
    policy: maglev
    hashPolicies:
      - field: "cookie"
        fieldValue: "brown-cookie"

# gRPC server deployment and service account
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-test
  template:
    metadata:
      name: grpc-test
      labels:
        app: grpc-test
      annotations:
        'consul.hashicorp.com/connect-inject': 'true'
        'prometheus.io/scrape': 'true'
        'prometheus.io/port': '9102'
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      serviceAccountName: grpc-test
      containers:
        - name: grpc-test
          imagePullPolicy: Never
          image: docker-beast
          command: ["/bin/sh", "-c"]
          args: ["cd /usr/local/bin; ./grpc-test-server -e 0.0.0.0:18788"]
          env:
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /usr/local/bin/
              name: kuberbins
      volumes:
        - name: kuberbins
          hostPath:
            path: /mnt/k8s/
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grpc-test

# gRPC client deployment and service account
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-client
  template:
    metadata:
      name: grpc-client
      labels:
        app: grpc-client
      annotations:
        'consul.hashicorp.com/connect-inject': 'true'
        'consul.hashicorp.com/connect-service-upstreams': 'grpc-test:18788'
        'prometheus.io/scrape': 'true'
        'prometheus.io/port': '9102'
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      serviceAccountName: grpc-client
      containers:
        - name: grpc-client
          imagePullPolicy: Never
          image: docker-beast
          command: ["/bin/sh", "-c"]
          args: ["cd /usr/local/bin; ./grpc-client -e 127.0.0.1:18788"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grpc-client

I also recommend checking out the following tutorial, which walks through the process of configuring services to communicate over the Consul service mesh.

Hi @blake, thanks for the detailed response. I get it now; it all became clear at the 127.0.0.1:18788 part on the client side.

Right now I’m able to connect the two services together with Consul’s service mesh, but I still can’t get the service resolver working.

I don’t think my

    hashPolicies:
      - field: "cookie"
        fieldValue: "brown-cookie"

part is correct for gRPC Maglev load balancing. I tried things like

policy: maglev
sourceIP: true

and

policy: maglev
hashPolicies:
  - field: "header"
    fieldValue: "content-type"

They don’t seem to change my load-balancing algorithm. Can you point me to the correct approach? I can’t find the right policies for gRPC; I’ve read several articles on Consul and I’m still stuck.

Should I also recreate my client/server (or anything else) after I deploy my service resolver?

I managed to get it working. In case anyone stumbles upon this, read on!

Apparently I missed this part of the load balancing documentation for Envoy:

In order to enable service resolution and apply load balancer policies, you first need to configure HTTP as the service protocol in the service's service-defaults configuration entry.

So for gRPC, we first create a ServiceDefaults YAML like this:

apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceDefaults
metadata:
  name: grpc-test
spec:
  protocol: grpc

And for service resolution to happen, we need to select an appropriate header for gRPC, which you can inspect with Wireshark (HTTP/2 protocol). So our service resolver should look like this:

apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: grpc-test
spec:
  loadBalancer:
    policy: maglev
    hashPolicies:
      - field: header
        fieldValue: content-type

And don’t forget to use the local IP (127.0.0.1) in your pods so the Envoy magic can happen, as @blake mentioned. Good luck out there!

Last update: apparently this resolves all traffic to the same hash, because content-type is the same for all gRPC messages.
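
In theory, real stickiness should be possible by hashing on a metadata key that the client sets itself, since gRPC metadata is sent as HTTP/2 headers that Envoy can hash on. A rough sketch, assuming a made-up x-session-id metadata key that the client attaches to every call (I haven’t verified this end to end):

apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: grpc-test
spec:
  loadBalancer:
    policy: maglev
    hashPolicies:
      # x-session-id is a hypothetical metadata key the client must set on
      # every call; calls carrying the same value should land on the same pod.
      - field: header
        fieldValue: x-session-id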

I think SourceIP is bugged, because the docs state:

SourceIP (bool: false) - Determines whether the hash should be of the source IP address 
rather than of a field and field value. Cannot be specified along with Field or FieldValue.

But if I try to add it to hashPolicies without a field, I get this error:

admission webhook "mutate-serviceresolver.consul.hashicorp.com" denied the request:
 serviceresolver.consul.hashicorp.com "grpc-test" is invalid: 
spec.loadBalancer.hashPolicies[1].field: Invalid value: "":
 must be one of "header", "cookie", "query_parameter"

And I think there is no way of load balancing gRPC without using the IP address. @blake

Hi @canuysal,

I’m glad you were able to get your services communicating over the mesh and to validate some aspects of the service resolver config.

spec.loadBalancer.hashPolicies[1].field: Invalid value: “”: must be one of “header”, “cookie”, “query_parameter”

Thanks for reporting this. It appears Consul’s CRD controller does not correctly support this SourceIP parameter.

I’ll make sure that we file a GitHub issue to track updating the CRD to fully support all of the load balancer parameters provided by the service resolver.

The pull request “Fix incorrect validation for ServiceResolver” by lkysow (hashicorp/consul-k8s#456) should fix this.

Hi @canuysal,

A fix for this is now available in consul-k8s version 0.25.0.
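
Once you’re on that version, a resolver that hashes on the client’s source IP should look roughly like this (a minimal sketch based on the SourceIP parameter described above, reusing the grpc-test resolver from earlier in the thread):

apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: grpc-test
spec:
  loadBalancer:
    policy: maglev
    hashPolicies:
      # Hash on the downstream connection's source IP address; field and
      # fieldValue must not be set alongside sourceIP.
      - sourceIP: true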