We have a Nomad cluster with 50+ clients and have noticed that the UI sometimes stops responding and returns
"Connection reset by peer" errors. This tends to happen when several administrators/developers use the UI simultaneously and load pages in quick succession.
We were able to reproduce this locally by spamming the UI until it stops responding. The logs don’t indicate anything useful.
$ curl http://127.0.0.1:4646
curl: (56) Recv failure: Connection reset by peer
nomad agent -dev -bind 0.0.0.0 -log-level DEBUG
- Open browser to
- Spam F5 or load the UI in a lot of tabs
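The manual steps above could also be scripted. The following is only a rough sketch, assuming `curl` is installed; `hammer_ui` is a hypothetical helper name, and the request count and 5-second timeout are arbitrary choices rather than values we measured.

```shell
# Hypothetical helper: fire N concurrent GET requests at URL
# and print how many of them failed (any non-zero curl exit).
hammer_ui() {
  url=$1
  n=${2:-200}   # number of concurrent requests; arbitrary default
  failed=$(
    i=0
    while [ "$i" -lt "$n" ]; do
      # Each request runs in the background; a failure emits one "fail" line.
      { curl -s -o /dev/null --max-time 5 "$url" || echo fail; } &
      i=$((i + 1))
    done
    wait   # block until every background request has finished
  )
  count=$(printf '%s\n' "$failed" | grep -c fail || true)
  echo "failed=$count of $n"
}
```

For example, `hammer_ui http://127.0.0.1:4646 500` against the dev agent. A non-zero failure count would suggest that connections are being refused or reset, though it does not by itself say why.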
By monitoring the number of sockets Nomad has open, we noticed that the count plateaus at the point where the UI stops responding:
ss -apn | grep nomad | wc -l
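A per-state breakdown may be more telling than a raw line count. This is a minimal sketch, assuming an iproute2 `ss` that supports `-H` (no header) so that the first column of `ss -H -tanp` is the socket state; `count_by_state` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ss -H -tanp` output on stdin and print the
# number of TCP sockets per state that belong to the named process.
# Sockets listed without process info (e.g. TIME-WAIT) are not counted.
count_by_state() {
  awk -v proc="${1:-nomad}" '
    # ss shows the owner as users:(("nomad",pid=...,fd=...)),
    # so match the quoted process name.
    index($0, "\"" proc "\"") { n[$1]++ }
    END { for (s in n) print s, n[s] }
  ' | sort
}
```

Usage would be `ss -H -tanp | count_by_state nomad`, run as a user allowed to see the process column, and repeated every second or so while reproducing the problem to see which state the plateau is in.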
Is there some sort of throttling mechanism that our team is unintentionally tripping by making too many calls to a server in quick succession?