Hi,
in our environment we have a Gluster cluster, and we use a self-written CSI plugin to connect to a volume. In general this works well, but sometimes we get the error: Transport endpoint is not connected
When I run ls in the job container itself, I get:
root@ac0c0829a9af:/opt# ls -la
ls: cannot access 'shared': Transport endpoint is not connected
total 12
drwxr-xr-x 1 root root 4096 Sep 25 14:20 .
drwxr-xr-x 1 root root 4096 Sep 25 14:20 ..
d????????? ? ? ? ? ? shared
root@ac0c0829a9af:/opt#
On the client, I can see the data from the volume:
root@client03(nomadclient-internal-hetz):/var/lib/nomad/client/csi/monolith/csi.gluster/per-alloc/d1abed91-330a-a8fc-9906-ba12afe9be91/application/rw-file-system-multi-node-multi-writer$ ls -la
total 192
drwxrwxr-x 49 root root 4096 Sep 25 14:54 .
drwx------ 3 root root 4096 Sep 25 14:19 ..
drwxr-xr-x 3 root root 4096 Sep 24 07:04 folder1
drwxr-xr-x 3 root root 4096 Sep 11 10:27 folder2
The same is true inside the CSI plugin:
root@ec3ce43542ba:/mnt# ls -la /csi/per-alloc/d1abed91-330a-a8fc-9906-ba12afe9be91/application/rw-file-system-multi-node-multi-writer
total 192
drwxrwxr-x 49 root root 4096 Sep 25 14:54 .
drwx------ 3 root root 4096 Sep 25 14:19 ..
drwxr-xr-x 3 root root 4096 Sep 24 07:04 folder1
drwxr-xr-x 3 root root 4096 Sep 11 10:27 folder2
So mounting the volume via the CSI plugin works.
But sometimes I also get the failure in the CSI plugin itself, which is really strange.
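The same check could be scripted in all three places (job container, client, CSI plugin) to see where the mount breaks first. Below is a minimal Python sketch, assuming Python is available in each of them. The function name mount_state is made up for illustration and is not part of our plugin. The path /opt/shared is the one from the job container output above. ENOTCONN is the errno behind the message "Transport endpoint is not connected".

```python
import errno
import os


def mount_state(path):
    """Classify a path as 'ok', 'disconnected', 'missing', or 'error'.

    Hypothetical helper: 'disconnected' means stat() failed with ENOTCONN,
    which is what ls reports as "Transport endpoint is not connected".
    """
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            return "disconnected"
        if exc.errno == errno.ENOENT:
            return "missing"
        return "error"
    return "ok"


if __name__ == "__main__":
    # Path as seen inside the job container; adjust for the client or plugin.
    print(mount_state("/opt/shared"))
```

Running this periodically in each place would only show where and when the mount goes stale. It does not explain the cause.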
Does anyone know this error?