Nomad Server: ERR_SOCKET_NOT_CONNECTED

Hi,
We have a Nomad cluster with 50+ clients and have noticed that the UI sometimes stops responding and throws ERR_SOCKET_NOT_CONNECTED or "Connection reset by peer" errors. This tends to happen when several administrators/developers are using the UI simultaneously and loading pages in quick succession.

We were able to reproduce this locally: if we spam the UI with requests, it eventually stops responding. The logs don’t indicate anything useful.

$ curl http://127.0.0.1:4646
curl: (56) Recv failure: Connection reset by peer

Reproduction steps:

  • nomad agent -dev -bind 0.0.0.0 -log-level DEBUG
  • Open browser to http://127.0.0.1:4646
  • Spam F5 or load the UI in a lot of tabs
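
For anyone who would rather reproduce this from a terminal than from a browser, the steps above can be roughly approximated with a burst of concurrent curl requests. This is only a sketch: `burst_requests` is a made-up helper name, the default count of 200 is arbitrary, and plain GETs are not the same traffic the UI generates, so it may or may not trigger the same behaviour.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nomad): fire COUNT concurrent GET requests
# at URL and print a tally of the HTTP status codes that came back.
# curl reports 000 when it got no HTTP response at all (e.g. connection reset).
burst_requests() {
  local url="$1" count="${2:-200}"   # 200 is an arbitrary default
  local i
  {
    for i in $(seq 1 "$count"); do
      # -s: quiet, -o /dev/null: discard body, -w: print only the status code
      curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "$url" &
    done
    wait   # let every background request finish before tallying
  } | sort | uniq -c
}

# Example (assumes the dev agent from the steps above is listening):
#   burst_requests http://127.0.0.1:4646 500
```

A growing number of `000` entries in the tally would correspond to the "Connection reset by peer" failures shown earlier.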

By monitoring the number of sockets Nomad has open, we noticed that the count plateaus at the point where the UI stops responding:
ss -apn | grep nomad | wc -l
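
To see the plateau over time rather than from one-off samples, the same count can be printed in a loop. A minimal sketch, assuming bash and `ss` are available; `count_matching` and `watch_sockets` are hypothetical helper names, and `ss -p` may need root to show processes owned by other users.

```shell
#!/usr/bin/env bash
# Count the lines on stdin that mention the given name.
# Same result as `grep NAME | wc -l`; `|| true` keeps a zero count from
# being treated as a failure (grep -c exits 1 when nothing matches).
count_matching() {
  grep -c -- "$1" || true
}

# Hypothetical helper: print a timestamp and the current socket count
# every INTERVAL seconds until interrupted with Ctrl-C.
watch_sockets() {
  local name="${1:-nomad}" interval="${2:-1}"
  while true; do
    printf '%s %s\n' "$(date +%T)" "$(ss -apn | count_matching "$name")"
    sleep "$interval"
  done
}

# Example: sample once per second while spamming the UI in another window.
#   watch_sockets nomad 1
```

If the count stops increasing at the same moment requests start failing, that would line up with the plateau described above.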

Is there some sort of throttling mechanism that our team unintentionally trips when we’re performing too many calls to a server in quick succession?

Thanks,
VH

Hi @vincenthuynh. Thanks for using Nomad!

I wonder if you are running into this issue? It feels the same to me based on your post.