I’m looking for a way to restrict network communication in Nomad between different groups/jobs on the same physical host when using bridge mode for networking.
I tested this by creating two jobs, identical except for the job name:
job "one" {
datacenters = ["dc1"]
type = "service"
group "main" {
network {
mode = "bridge"
}
task "ubuntu" {
driver = "docker"
config {
image = "alpine:latest"
command = "sleep"
args = ["infinity"]
}
}
}
}
Then, in the first container: nc -v -l -p 1234
And in the second: echo 'hello world!' | nc 172.26.64.2 1234
Result: hello world! is printed to the shell connected to the first container, proving that arbitrary network traffic is possible between the two containers.
(Normally I would just ping one container from the other, but that doesn’t seem to be possible here, probably because the containers don’t have CAP_NET_RAW or something.)
Ideally, I’d like to be able to disable network communication between containers in bridge networking mode, and then selectively re-enable it with Consul Connect so that containers can only communicate with specific endpoints on specific ports (roughly as sketched below). Is that possible?
I guess it doesn’t specifically have to use bridge networking; I just assumed that would be a requirement because you have to use bridge mode for Consul Connect. But if there’s another way to accomplish my goal (no network traffic between distinct jobs/groups that isn’t explicitly allowed), then I’m all ears.
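To illustrate, here’s roughly the kind of policy I have in mind, expressed as a Consul service-intentions config entry (an untested sketch; "web" and "redis" are placeholder service names):

  # redis-intentions.hcl: only "web" may talk to "redis"; everything else is denied
  Kind = "service-intentions"
  Name = "redis"
  Sources = [
    {
      Name   = "web"
      Action = "allow"
    },
    {
      Name   = "*"
      Action = "deny"
    }
  ]

which I’d apply with something like consul config write redis-intentions.hcl. (My understanding is that with Consul’s ACL default policy set to deny, intentions default to deny as well, which is the behavior I’m after.)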
Hi @jfmontanaro, when using bridge networking Nomad creates a nomad bridge on the host that is shared among allocations using bridge networking. The intent behind bridge networking is to enable the Connect model, where tasks need only bind to 127.0.0.1 inside their group’s network namespace, making them accessible only via the Connect Envoy proxy. The security layer then falls on Connect and its ACLs/intentions, etc.
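For example, a Connect-enabled group looks roughly like this (an illustrative sketch adapted from the docs; the service name, port, and image are placeholders). The task listens only on 127.0.0.1 inside the namespace, and other groups reach it through their own Envoy sidecar via an upstream:

  group "api" {
    network {
      mode = "bridge"
    }

    service {
      name = "count-api"
      port = "9001"          # the task binds to 127.0.0.1:9001 inside the namespace

      connect {
        sidecar_service {}   # the Envoy sidecar is the only externally reachable entry point
      }
    }

    task "api" {
      driver = "docker"
      config {
        image = "hashicorpnomad/counter-api:v3"
      }
    }
  }

  # on the consumer side, dial 127.0.0.1:8080 instead of another allocation's IP
  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "count-api"
          local_bind_port  = 8080
        }
      }
    }
  }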
For a more advanced networking model, Nomad supports CNI plugins.
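With CNI you bring your own network configuration, so isolation or firewall rules can be enforced there. Roughly, it looks like this (a sketch, assuming a CNI network named "mynet" whose conflist is installed on the client; the paths shown are Nomad’s defaults):

  # Nomad client config
  client {
    cni_path       = "/opt/cni/bin"
    cni_config_dir = "/opt/cni/config"   # contains mynet.conflist
  }

  # job file: the group attaches to the CNI network instead of the shared nomad bridge
  group "main" {
    network {
      mode = "cni/mynet"
    }
  }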
(Normally I would just ping one container from the other, but that doesn’t seem to be possible here, probably because the containers don’t have CAP_NET_RAW or something.)
Indeed, Nomad’s docker and exec/java task drivers drop CAP_NET_RAW by default.
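If you need it back for something like ping, it can be re-enabled per task; for the docker driver that’s roughly the following (a sketch, not a recommendation, and the client’s docker plugin allow_caps list must also permit net_raw):

  task "ubuntu" {
    driver = "docker"
    config {
      image   = "alpine:latest"
      cap_add = ["net_raw"]   # restores CAP_NET_RAW so ICMP (ping) works
    }
  }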
Hi Seth, thanks for your reply.
This is what I expected to be the case, and I’m perfectly happy to use Connect/Consul ACLs to restrict access as much as necessary. My concern is that it seems possible to bypass Connect (and therefore ignore whatever ACLs are in place) by just using the target task’s IP address directly.
In theory, then, an attacker who gained access to one application running in Nomad could exploit this to reach other, unrelated Nomad-controlled applications. For example, a lot of applications store session data in Redis, and Redis is notorious for allowing access to anyone who can talk to its network socket. If any other Nomad task that happens to be on the same host can talk to that Redis server, it could potentially take control of the Redis-backed application.
It’s sounding like Nomad doesn’t really have any mechanism to lock down network traffic the way I’ve described, is that correct?
@jfmontanaro How is that attack vector going to work if the tasks are bound to 127.0.0.1 inside the network namespace?
I run a lot of prepackaged stuff, which doesn’t always expose configuration for which address to bind to. I’m sure with Redis specifically I could make it work, but I’d much prefer to specify at a global level which tasks can access what over the network. Relying on every network-accessible service in every task to bind to localhost seems very fragile to me.
Ah, yes, that is true; the security model depends on tasks being configured to do the right thing. In the future we hope to add support for Connect’s transparent proxy feature, which I think would provide the kind of locked-down environment you’re looking for, but so far Consul’s and Nomad’s priorities haven’t lined up to make that possible.