I have created multiple volumes at DigitalOcean (via Terraform), and they show up in the DigitalOcean console as unattached. In my Nomad cluster, the digitalocean/do-csi-plugin
is running. When I deploy a job that requests a CSI volume, the deployment responds with this message:
Constraint missing CSI Volume sample-volume filtered 4 nodes
The job specification has:
...
volume "cloud-volume" {
  type            = "csi"
  source          = "sample-volume"
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
task "app" {
  driver = "exec"
  volume_mount {
    volume      = "cloud-volume"
    destination = "${NOMAD_TASK_DIR}/state"
  }
...
The DigitalOcean CSI plugin is running as two jobs, one for the controller and one for the nodes. The output from those two jobs looks like this:
Controller
-----------
time="2023-04-06T16:38:36Z" level=info msg="removing socket" host_id=349266694 region=nyc3 socket=/csi/csi.sock version=v4.5.1
time="2023-04-06T16:38:36Z" level=info msg="starting server" grpc_addr=/csi/csi.sock host_id=349266694 http_addr= region=nyc3 version=v4.5.1
time="2023-04-06T16:38:36Z" level=info msg="probe called" host_id=349266694 method=probe region=nyc3 version=v4.5.1
time="2023-04-06T16:38:36Z" level=info msg="get plugin info called" host_id=349266694 method=get_plugin_info region=nyc3
response="name:\"dobs.csi.digitalocean.com\" vendor_version:\"v4.5.1\" " version=v4.5.1
time="2023-04-06T16:38:36Z" level=info msg="probe called" host_id=349266694 method=probe region=nyc3 version=v4.5.1
time="2023-04-06T16:38:36Z" level=info msg="get plugin capabitilies called" host_id=349266694 method=get_plugin_capabilities region=nyc3
response="capabilities:<service:<type:CONTROLLER_SERVICE > >
capabilities:<service:<type:VOLUME_ACCESSIBILITY_CONSTRAINTS > >
capabilities:<volume_expansion:<type:ONLINE > > "
version=v4.5.1
time="2023-04-06T16:38:36Z" level=info msg="probe called" host_id=349266694 method=probe region=nyc3 version=v4.5.1
time="2023-04-06T16:38:36Z" level=info msg="controller get capabilities called" host_id=349266694 method=controller_get_capabilities region=nyc3
response="capabilities:<rpc:<type:CREATE_DELETE_VOLUME > >
capabilities:<rpc:<type:PUBLISH_UNPUBLISH_VOLUME > >
capabilities:<rpc:<type:LIST_VOLUMES > >
capabilities:<rpc:<type:CREATE_DELETE_SNAPSHOT > >
capabilities:<rpc:<type:LIST_SNAPSHOTS > >
capabilities:<rpc:<type:EXPAND_VOLUME > >
capabilities:<rpc:<type:LIST_VOLUMES_PUBLISHED_NODES > > "
version=v4.5.1
...
Node
-----
time="2023-04-06T16:40:33Z" level=info msg="removing socket" host_id=349266693 region=nyc3 socket=/csi/csi.sock version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="starting server" grpc_addr=/csi/csi.sock host_id=349266693 http_addr= region=nyc3 version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="probe called" host_id=349266693 method=probe region=nyc3 version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="get plugin info called" host_id=349266693 method=get_plugin_info region=nyc3
response="name:\"dobs.csi.digitalocean.com\" vendor_version:\"v4.5.1\" " version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="probe called" host_id=349266693 method=probe region=nyc3 version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="get plugin capabitilies called" host_id=349266693 method=get_plugin_capabilities region=nyc3
response="capabilities:<service:<type:CONTROLLER_SERVICE > >
capabilities:<service:<type:VOLUME_ACCESSIBILITY_CONSTRAINTS > >
capabilities:<volume_expansion:<type:ONLINE > > "
version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="node get info called" host_id=349266693 method=node_get_info region=nyc3 version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="probe called" host_id=349266693 method=probe region=nyc3 version=v4.5.1
time="2023-04-06T16:40:33Z" level=info msg="node get capabilities called" host_id=349266693 method=node_get_capabilities
node_capabilities="[rpc:<type:STAGE_UNSTAGE_VOLUME >
rpc:<type:EXPAND_VOLUME >
rpc:<type:GET_VOLUME_STATS > ]"
region=nyc3
version=v4.5.1
...
I see no errors, warnings, or any other information in the logs about the plugin trying to find or attach volumes for the job. If I at least had an error, I would have something to work with.
From the command line I check the status:
$ nomad plugin status cloud-provider
ID = cloud-provider
Provider = dobs.csi.digitalocean.com
Version = v4.5.1
Controllers Healthy = 1
Controllers Expected = 1
Nodes Healthy = 4
Nodes Expected = 4
Allocations
ID Node ID Task Group Version Desired Status Created Modified
0eddf2bd 8d9d9f32 primary 0 run running 2m2s ago 2m1s ago
f1173ddb 8c5733e8 primary 0 run running 2m2s ago 2m2s ago
5a7d760d 665a1555 primary 0 run running 2m2s ago 2m2s ago
a24c0c4f 59c3dcb3 primary 0 run running 2m2s ago 2m2s ago
ae4bc347 59c3dcb3 primary 0 run running 4m ago 3m49s ago
---
$ nomad volume status
Container Storage Interface
No CSI volumes
What should my next step be in discovering why the volumes are not being found and used?
Versions: Nomad 1.5.0 and 1.5.2 (mixed across multiple nodes)
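For reference, `nomad volume status` lists only volumes that have been registered with (or created through) Nomad, so the "No CSI volumes" output above is consistent with the scheduler not knowing about `sample-volume` at all. Below is a minimal sketch of a volume specification that could be passed to `nomad volume register`. It assumes the plugin ID `cloud-provider` from the status output and reuses the access and attachment modes from the job; the `external_id` value is a placeholder for the DigitalOcean volume ID and is not taken from the original post.

```hcl
# volume.hcl: hypothetical registration spec for an existing DigitalOcean volume
id        = "sample-volume" # must match `source` in the job's volume block
name      = "sample-volume"
type      = "csi"
plugin_id = "cloud-provider" # plugin ID shown by `nomad plugin status`

# Placeholder (assumption): the ID DigitalOcean assigned to the volume,
# for example the `id` attribute of the Terraform volume resource.
external_id = "REPLACE-WITH-DIGITALOCEAN-VOLUME-ID"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
```

If this applies, running `nomad volume register volume.hcl` and then `nomad volume status` again should show whether the volume is now visible to the scheduler.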