I am trying to run the NFS CSI driver on my Nomad v1.3.5 cluster. The cluster is using the containerd driver version 0.9.3.
EDIT: The following works with the Docker driver. This is specifically an issue with the containerd driver.
Here is my node job:
job "nfs-node" {
  datacenters = ["dc1"]
  type = "system"
  group "node" {
    task "node" {
      driver = "containerd-driver"
      config {
        image = "registry.k8s.io/sig-storage/nfsplugin:v4.1.0"
        privileged = true
        host_network = true
        args = [
          "-v=5",
          "--nodeid=${attr.unique.hostname}",
          "--endpoint=unix:///csi/csi.sock",
        ]
      }
      csi_plugin {
        id = "nfs"
        type = "node"
        mount_dir = "/csi"
      }
    }
  }
}
And here is my controller job:
job "nfs-controller" {
  datacenters = ["dc1"]
  group "controller" {
    task "controller" {
      driver = "containerd-driver"
      config {
        image = "registry.k8s.io/sig-storage/nfsplugin:v4.1.0"
        privileged = true
        host_network = true
        args = [
          "-v=5",
          "--nodeid=${attr.unique.hostname}",
          "--endpoint=unix:///csi/csi.sock", # Adjust accordingly
        ]
      }
      csi_plugin {
        id = "nfs"
        type = "controller"
        mount_dir = "/csi"
      }
    }
  }
}
Both jobs deploy fine, and I am able to create a volume with the following volume config:
# volume.hcl
id = "nginx" # ID as seen in nomad
name = "nginx" # Display name
type = "csi"
plugin_id = "nfs" # Needs to match the deployed plugin
capability {
  access_mode = "multi-node-multi-writer"
  attachment_mode = "file-system"
}
mount_options {
  fs_type = "nfs"
}
parameters {
  server = "10.240.0.50"
  share = "/srv/nfs/nomad"
}
$ nomad volume create volume.hcl
Created external volume 10.240.0.50#srv/nfs/nomad#nginx# with ID nginx
When I deploy the following job, which mounts the volume, the task fails to start with the error shown after the job file:
# nginx.nomad
job "nginx-persistent" {
  datacenters = ["dc1"]
  group "nginx-persistent" {
    volume "nginx" {
      type = "csi"
      read_only = false
      source = "nginx"
      attachment_mode = "file-system"
      access_mode = "multi-node-multi-writer"
    }
    network {
      mode = "cni/bridge"
      port "http" {
        to = 80
      }
    }
    service {
      name = "nginx"
      port = "http"
      provider = "nomad"
    }
    task "nginx" {
      driver = "containerd-driver"
      volume_mount {
        volume = "nginx"
        destination = "/usr/share/nginx/html"
      }
      config {
        image = "nginx:latest"
      }
    }
  }
}
Error:
rpc error: code = Unknown desc = Error in creating task: failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/opt/nomad/client/csi/node/nfs-2/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer" to rootfs at "/usr/share/nginx/html": stat /opt/nomad/client/csi/node/nfs-2/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer: no such file or directory: unknown
According to the logs of the node job, the plugin appears to mount the volume successfully:
I0913 18:43:50.888390 1 utils.go:95] GRPC response: {}
I0913 18:45:51.578087 1 utils.go:88] GRPC call: /csi.v1.Node/NodePublishVolume
I0913 18:45:51.578107 1 utils.go:89] GRPC request: {"target_path":"/local/csi/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer","volume_capability":{"AccessType":{"Mount":{"fs_type":"nfs"}},"access_mode":{"mode":5}},"volume_context":{"server":"10.240.0.50","share":"/srv/nfs/nomad","subdir":"nginx"},"volume_id":"10.240.0.50#srv/nfs/nomad#nginx#"}
I0913 18:45:51.578310 1 nodeserver.go:129] NodePublishVolume: volumeID(10.240.0.50#srv/nfs/nomad#nginx#) source(10.240.0.50:/srv/nfs/nomad/nginx) targetPath(/local/csi/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer) mountflags([])
I0913 18:45:51.578339 1 mount_linux.go:183] Mounting cmd (mount) with arguments (-t nfs 10.240.0.50:/srv/nfs/nomad/nginx /local/csi/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer)
I0913 18:45:51.579257 1 utils.go:88] GRPC call: /csi.v1.Node/NodeUnpublishVolume
I0913 18:45:51.579271 1 utils.go:89] GRPC request: {"target_path":"/local/csi/per-alloc/047cb962-7915-09d1-53db-3f3a1f92e75c/nginx/rw-file-system-multi-node-multi-writer","volume_id":"10.240.0.50#srv/nfs/nomad#nginx#"}
I0913 18:45:51.579297 1 nodeserver.go:163] NodeUnpublishVolume: unmounting volume 10.240.0.50#srv/nfs/nomad#nginx# on /local/csi/per-alloc/047cb962-7915-09d1-53db-3f3a1f92e75c/nginx/rw-file-system-multi-node-multi-writer
I0913 18:45:51.581044 1 mount_helper_common.go:99] "/local/csi/per-alloc/047cb962-7915-09d1-53db-3f3a1f92e75c/nginx/rw-file-system-multi-node-multi-writer" is a mountpoint, unmounting
I0913 18:45:51.581059 1 mount_linux.go:294] Unmounting /local/csi/per-alloc/047cb962-7915-09d1-53db-3f3a1f92e75c/nginx/rw-file-system-multi-node-multi-writer
W0913 18:45:51.586546 1 mount_helper_common.go:133] Warning: "/local/csi/per-alloc/047cb962-7915-09d1-53db-3f3a1f92e75c/nginx/rw-file-system-multi-node-multi-writer" is not a mountpoint, deleting
I0913 18:45:51.586596 1 nodeserver.go:168] NodeUnpublishVolume: unmount volume 10.240.0.50#srv/nfs/nomad#nginx# on /local/csi/per-alloc/047cb962-7915-09d1-53db-3f3a1f92e75c/nginx/rw-file-system-multi-node-multi-writer successfully
I0913 18:45:51.586603 1 utils.go:95] GRPC response: {}
I0913 18:45:51.612249 1 utils.go:151] skip chmod on targetPath(/local/csi/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer) since mode is already 020000000777)
I0913 18:45:51.612279 1 nodeserver.go:148] volume(10.240.0.50#srv/nfs/nomad#nginx#) mount 10.240.0.50:/srv/nfs/nomad/nginx on /local/csi/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer succeeded
I0913 18:45:51.612292 1 utils.go:95] GRPC response: {}
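One detail that stands out when comparing the two outputs: the error refers to a host path under /opt/nomad/client/csi/node/nfs-2/, while the plugin logs refer to /local/csi/ inside the plugin container. A rough sketch of a check that could be run on the client host to see whether the plugin's mount is visible from the host side is below. The check_path helper is hypothetical (it is not part of Nomad or the NFS plugin), and it assumes the mountpoint utility from util-linux is installed.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nomad or the CSI plugin: reports whether a
# path exists and, if it does, whether it is a mountpoint.
# Assumes `mountpoint` (util-linux) is available on the host.
check_path() {
  path="$1"
  if [ ! -e "$path" ]; then
    echo "missing: $path"
  elif mountpoint -q "$path"; then
    echo "mountpoint: $path"
  else
    echo "not a mountpoint: $path"
  fi
  return 0
}

# Host-side path copied from the error message above.
check_path "/opt/nomad/client/csi/node/nfs-2/per-alloc/aa738bdf-87ba-efd0-cb76-ac653799285e/nginx/rw-file-system-multi-node-multi-writer"
```

If this reports the path as missing or not a mountpoint while the plugin logs report a successful mount, the mount made inside the plugin container is not reaching the host. That would narrow the problem down but would not by itself explain the cause.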
Any ideas why I’m seeing that error? Thanks!