Hi, I’m trying out CSI with aws-ebs-csi-driver in my Nomad cluster, but my controller and node plugins stay in an “Unhealthy” state after their corresponding tasks are launched, and I’m puzzled as to why.
I followed the tutorial in Stateful Workloads with Container Storage Interface | Nomad - HashiCorp Learn and submitted the following snippet to my Nomad cluster:
job "plugin-aws-ebs-controller" {
datacenters = ["dc1"]
group "controller" {
task "plugin" {
driver = "docker"
config {
image = "amazon/aws-ebs-csi-driver:v1.4.0"
args = [
"controller",
"--endpoint=unix://csi/csi.sock",
"--logtostderr",
"--v=5",
]
}
csi_plugin {
id = "aws-ebs0"
type = "controller"
mount_dir = "/csi"
}
resources {
cpu = 500
memory = 256
}
}
}
}
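For completeness, the node plugin job I submitted follows the same tutorial: essentially the same spec run as a system job, with type = "node" and privileged = true. Reproduced from memory, so treat this as a sketch rather than the exact file:

job "plugin-aws-ebs-nodes" {
  datacenters = ["dc1"]

  # run on every client so each node gets a copy of the plugin
  type = "system"

  group "nodes" {
    task "plugin" {
      driver = "docker"

      config {
        image = "amazon/aws-ebs-csi-driver:v1.4.0"

        args = [
          "node",
          "--endpoint=unix://csi/csi.sock",
          "--logtostderr",
          "--v=5",
        ]

        # node plugins need privileged access to mount disks on the host
        privileged = true
      }

      csi_plugin {
        id        = "aws-ebs0"
        type      = "node"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}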
The task was launched successfully, but the plugin doesn’t seem to register: it shows up as “Unhealthy” under /ui/csi/plugins. When I query the plugin from the API at /v1/plugin/csi/aws-ebs0, I get this response:
{
  "Allocations": [],
  "ControllerRequired": false,
  "Controllers": {},
  "ControllersExpected": 1,
  "ControllersHealthy": 0,
  "CreateIndex": 944,
  "ID": "aws-ebs0",
  "ModifyIndex": 951,
  "Nodes": {},
  "NodesExpected": 0,
  "NodesHealthy": 0,
  "Provider": "",
  "Version": ""
}
Could someone point me to how I could debug this? I’ve checked the Docker container running the CSI controller: it seems to be running fine and is listening on the socket unix://csi/csi.sock, but I don’t think it’s receiving any RPC requests from the Nomad cluster.
One of the questions I have in mind is whether the socket needs to be mounted from inside the container back onto the host, where the Nomad agent is running. I had assumed the mount_dir option of the csi_plugin block would do this automagically, but I could be wrong.
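To make that question concrete: if csi_plugin doesn’t handle the host side of the socket, would I need an explicit bind mount in the Docker config along these lines? The host path below is purely a guess on my part, not something I’ve confirmed Nomad uses:

config {
  image = "amazon/aws-ebs-csi-driver:v1.4.0"

  args = [
    "controller",
    "--endpoint=unix://csi/csi.sock",
    "--logtostderr",
    "--v=5",
  ]

  # hypothetical: bind a host directory over /csi so the Nomad client
  # can reach the plugin's csi.sock, unless csi_plugin.mount_dir
  # already does the equivalent of this
  volumes = [
    "/opt/nomad/csi/aws-ebs0:/csi",
  ]
}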