Hi,
I am trying to deploy a job that should run on all nodes with a specific attribute.
I have tried setting the job type to `system` and then applying
a constraint on the instance type, and in another attempt on a meta tag.
I am getting this error:
finished with status "complete" but failed to place all allocations
I need to deploy my jobs on all nodes that have a specific attribute/tag,
so that when I add or remove nodes the changes apply automatically.
Hi @eyal-ha. Thanks for using Nomad.
Can you post your jobspec and node configuration? Based on the docs, it seems like this should work for you, but it is hard to see what isn’t working without the configuration to review.
Thank you.
This is the node agent config. We have the same config for the second machine group, with ec2.
/etc/nomad.d/nomad.hcl:
datacenter = "ca-dc"
region     = "canada"
data_dir   = "/var/lib/nomad"
bind_addr  = "0.0.0.0"

leave_on_terminate = true
enable_syslog      = true

client {
  enabled          = true
  max_kill_timeout = "180s"

  options = {
    "docker.auth.config" = "/etc/docker/config.json"
    "docker.auth.helper" = "ecr-login"
  }

  meta {
    public_ip = "${public_ip}"
    env       = "lightsail"
  }
}

telemetry {
  collection_interval        = "10s"
  disable_hostname           = true
  prometheus_metrics         = true
  publish_allocation_metrics = true
  publish_node_metrics       = true
  datadog_address            = "localhost:8125"
}
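One quick check before touching the jobspec: confirm that each client actually registered its `meta` block, e.g. with `nomad node status -verbose <node-id>` or the `/v1/node/:id` HTTP API (the full node payload includes a `Meta` map). As a rough sketch of how an equality constraint on `${meta.env}` behaves (this is illustrative data and a simplified check, not the real Nomad scheduler):

```python
# Sketch: given full node payloads like those returned by Nomad's
# /v1/node/:id API, list which nodes would satisfy a ${meta.env}
# equality constraint. Node data below is made up for illustration.
nodes = [
    {"Name": "lightsail-1", "Status": "ready",
     "Meta": {"env": "lightsail", "public_ip": "203.0.113.10"}},
    {"Name": "ec2-1", "Status": "ready", "Meta": {"env": "ec2"}},
    # A client whose meta stanza was not loaded registers with no Meta
    # keys and silently fails the constraint, which for a system job
    # shows up as "failed to place all allocations".
    {"Name": "ec2-2", "Status": "ready", "Meta": {}},
]

def matches_constraint(node, attribute, value):
    """Simplified mirror of an equality constraint on ${meta.<attribute>}."""
    return node["Status"] == "ready" and node.get("Meta", {}).get(attribute) == value

eligible = [n["Name"] for n in nodes if matches_constraint(n, "env", "lightsail")]
print(eligible)  # only lightsail-1 carries env=lightsail
```

If a node you expect to match is missing from its real-world equivalent of this list, the problem is in the client config, not the jobspec.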
This is a sample of our job file:
job "app" {
  datacenters = ["ca-dc"]
  type        = "system"

  update {
    health_check      = "checks"
    min_healthy_time  = "10s"
    healthy_deadline  = "3m"
    progress_deadline = "5m"
    auto_revert       = true
    auto_promote      = false
    canary            = 0
    stagger           = "7s"
  }

  group "app1" {
    constraint {
      attribute = "${meta.env}"
      value     = "lightsail"
    }

    network {
      port "someport" {
        static = 1234
      }
    }

    restart {
      attempts = 2
      interval = "30s"
      delay    = "5s"
      mode     = "delay"
    }

    task "app1-task" {
      driver = "docker"

      env {
        CONFIG1 = "test"
      }

      config {
        network_mode = "host"

        logging {
          type = "fluentd"
          config {
            fluentd-address = "localhost:24224"
          }
        }

        ulimit {
          nofile = "40960:40960"
        }

        image = "some image"
      }

      service {
        name = "app1-service"
        tags = ["some-tag"]

        check {
          type     = "tcp"
          port     = "someport"
          interval = "10s"
          timeout  = "2s"
        }
      }

      resources {
        cpu    = 2150
        memory = 500
      }

      kill_timeout = "20s"
    }
  }

  group "app2" {
    constraint {
      attribute = "${meta.env}"
      value     = "ec2"
    }

    network {
      port "someport" {
        static = 5678
      }
    }

    restart {
      attempts = 2
      interval = "30s"
      delay    = "5s"
      mode     = "delay"
    }

    task "app2-task" {
      driver = "docker"

      env {
        CONFIG1 = "test"
      }

      config {
        # set network mode for docker driver https://github.com/hashicorp/nomad/issues/8747
        network_mode = "host"

        logging {
          type = "fluentd"
          config {
            fluentd-address = "localhost:24224"
          }
        }

        ulimit {
          nofile = "40960:40960"
        }

        image = "some image"
      }

      service {
        name = "app2-service"
        tags = ["some-tag"]

        check {
          type     = "tcp"
          port     = "someport"
          interval = "10s"
          timeout  = "2s"
        }
      }

      resources {
        cpu    = 2150
        memory = 500
      }

      kill_timeout = "20s"
    }
  }
}
If I am reading this correctly, you are creating a single system job with a constraint on one group for meta = lightsail and on another group for meta = ec2. No single node will ever be able to run both groups, and a system job tries to place every group on every eligible node, which is why you see the placement failure. I think you need two different jobs: one with the group for lightsail and another with the group for ec2.
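Concretely, the suggested split might look like the two system jobs below, trimmed to the constraint (names are illustrative, and each job goes in its own file; the task details stay as in your original groups):

```hcl
# One system job per environment, so each job can place on
# every node that carries its meta tag (sketch only).
job "app-lightsail" {
  datacenters = ["ca-dc"]
  type        = "system"

  group "app1" {
    constraint {
      attribute = "${meta.env}"
      value     = "lightsail"
    }
    # network, restart, task ... as in the original group "app1"
  }
}
```

```hcl
job "app-ec2" {
  datacenters = ["ca-dc"]
  type        = "system"

  group "app2" {
    constraint {
      attribute = "${meta.env}"
      value     = "ec2"
    }
    # network, restart, task ... as in the original group "app2"
  }
}
```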
This is only to show the configs. I have already created separate system jobs for each meta tag, and it still fails.