I have set up a Nomad cluster in DigitalOcean using do-hashicorp-cluster, which is quite lovely and very well done.
I wanted to use DO Volumes for stateful data and found this example.
I’m having an issue with it, however. When I bring up the example redis job, I get this error in the allocation:
2021-08-23T13:00:27+01:00 Setup Failure failed to setup alloc: pre-run hook "csi_hook" failed: claim volumes: rpc error: cannot change attachment mode of claimed volume
Additionally, I’m somewhat confused as to why Terraform is being used to create the volume instead of letting the CSI driver do it. Is this something specific to DO?
Thanks!
Hi @bhechinger,
Thanks for using Nomad! I’m sorry to hear you are having issues with the csi_plugin feature.
I’m currently working on re-creating your issue and will let you know what I find.
Cheers,
@DerekStrickland and the Nomad Team
That’s amazing, thank you so much!
Hi @bhechinger,
Thanks for reporting this issue. There was some outdated Terraform configuration in that demo folder. You can either check out this pending PR and grab the updates from the branch, or wait for it to get merged into main.
Thanks again for using Nomad!
@DerekStrickland and the Nomad team
I probably won’t get to messing with this again until Monday. I’ll report back here though once I’ve had a chance to try it again.
Thanks again for all your help!
Sounds great. Have a wonderful weekend!
Failed as follows:
Recent Events:
Time Type Description
2021-09-01T12:24:19+01:00 Setup Failure failed to setup alloc: pre-run hook "csi_hook" failed: claim volumes: rpc error: controller publish: attach volume: controller attach volume: CSI.ControllerAttachVolume: rpc error: code = Unknown desc = POST https://api.digitalocean.com/v2/volumes/1ed876e7-0b17-11ec-992a-0a58ac144393/actions: 404 (request "5fcadc85-3cf8-4f05-86dd-d534036040aa") invalid volume id: allocation not found
2021-09-01T12:24:17+01:00 Received Task received by client
Are you running the demo inside the do-hashicorp-cluster container? That tripped me up. I had to clone nomad there, set the do_token variable, run the ssh tunnels script that do-hashicorp-cluster provided, and run terraform apply from inside the container.
I’m not, but if it only relies on the tunnels, that shouldn’t matter: I copied that script out of the repo and run my own copy, so I get the tunnels without the container running.
I just tried this from inside the container, in case something else was going on that I wasn’t aware of. It made zero difference; I get the exact same error.
Just curious if you made any progress? I haven’t forgotten about you. I just can’t seem to reproduce the error.
None whatsoever.
Let me tear it all down and stand it back up, documenting exactly what steps I take, so we can see if maybe I’m missing something.
Just sat down and tried this again and this time it worked.
So it… worked? Maybe?
In the job spec it has this:
volume_mount {
  volume      = "test"
  destination = "/test"
}
Should it have mounted that into the container? I don’t see it there, however:
root@79416ca8426d:/data# mount | grep temp
root@79416ca8426d:/data# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          25G  3.3G   21G  14% /
tmpfs            64M     0   64M   0% /dev
tmpfs           489M     0  489M   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
/dev/vda1        25G  3.3G   21G  14% /data
tmpfs           1.0M     0  1.0M   0% /secrets
udev            472M     0  472M   0% /test
tmpfs           489M     0  489M   0% /proc/acpi
tmpfs           489M     0  489M   0% /proc/scsi
tmpfs           489M     0  489M   0% /sys/firmware
root@79416ca8426d:/data# ls /test
/test
root@79416ca8426d:/data# ls -l /test
brw-rw---- 1 root disk 8, 0 Mar 1 11:57 /test
root@79416ca8426d:/data#
That doesn’t seem right?
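A note on reading that output: the leading `b` in `brw-rw----` means /test is a block device node rather than a mounted directory. That would be consistent with the volume being claimed in block-device attachment mode, although nothing in this thread confirms that is the cause. As a minimal sketch, assuming Nomad 1.1 or later (where the group-level volume block accepts access_mode and attachment_mode), a job that asks for a mounted filesystem might look roughly like this. The job and group names, the Docker image, and the source ID "volume0" are placeholders, not values taken from the demo:

```hcl
job "redis" {
  datacenters = ["dc1"] # assumption: adjust to your cluster

  group "cache" {
    # Claim the CSI volume for this group.
    volume "test" {
      type            = "csi"
      source          = "volume0"            # hypothetical: the ID the volume was registered under
      access_mode     = "single-node-writer"
      attachment_mode = "file-system"        # ask for a mounted filesystem, not a raw block device
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2" # assumption: any image works for checking the mount
      }

      # Same volume_mount as in the job spec quoted above.
      volume_mount {
        volume      = "test"
        destination = "/test"
      }
    }
  }
}
```

If the volume was originally registered or claimed with a different attachment mode, it may need to be released and re-registered first; the earlier "cannot change attachment mode of claimed volume" error suggests Nomad will not switch modes on a volume that still has claims.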