How to add to the terraform dependency graph when resources are in different modules?

I’m struggling to find how to make one resource dependent on another when the two are in different modules.

My use case is that I have AWS DataSync agents running on EC2 instances launched from the AWS DataSync AMI. When AWS publishes a new AMI while I have instances already running, terraform plan wants to replace the instances, but it does not replace the agents. I’m using the “ip_address” attribute of the DataSync agent resource as the mechanism for activating the agent, and the issue is that the IP is static (for each instance I’ve defined a separate ENI resource, so the IP survives instance replacement), which means no change is detected in the DataSync agent resource. So if I run terraform apply, the instances get replaced and all the agents go offline with no way to reactivate them, because they were activated against a previous instance on the same IP. What I want to happen is for an agent resource to be replaced whenever its underlying instance is replaced.
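For context, a flattened sketch of the kind of setup described above (the resource names and variables here are placeholders, not my real module layout):

resource "aws_network_interface" "datasync" {
  subnet_id       = var.subnet_id
  security_groups = [var.security_group_id]
}

resource "aws_instance" "datasync_instance" {
  ami           = var.datasync_ami_id
  instance_type = var.instance_type

  network_interface {
    network_interface_id = aws_network_interface.datasync.id
    device_index         = 0
  }
}

# The agent is activated against the static private IP of the ENI, so replacing
# the instance with a new AMI changes nothing in this resource's arguments
resource "aws_datasync_agent" "datasync" {
  name       = "datasync-agent"
  ip_address = aws_network_interface.datasync.private_ip
}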

Extra constraints and notes:

  • removing the ENI resource from the terraform configuration will not always result in a new IP being used, because we have a limited number of free IPs in the subnet and it’s likely the same one would be picked anyway, and I need this to work in every case
  • if I have 4 instances running, I may only want to replace 2 of them with the new AMI, since the other 2 may be in the middle of running DataSync tasks which I don’t want to interrupt (I can get terraform to replace only 2 instances by pinning the AMI id of the other ones to the previous, n-1, version)

My instances and agents are defined within modules in my main.tf (simplified for brevity):

locals {
  datasync_ids = {
    "ds_001" = { ... } # some config, including which AMI id to use
    "ds_002" = { ... } # some config
    "ds_003" = { ... } # some config
    "ds_004" = { ... } # some config
  }
}

module "datasyncservers" {
  source = "./datasyncservers"

  datasync_ids = local.datasync_ids
}

module "datasyncagent" {
  source     = "./datasyncagent"
  depends_on = [module.securitygroup]

  datasync_ids     = local.datasync_ids
  datasync_ids_ips = try(module.datasyncservers.datasync_ids_ips, {})
}

Ideally, what I want to do for the DataSync agent resource (whose full address would be module.datasyncagent.module.agent[var.uid].aws_datasync_agent.datasync) is something like this:

resource "aws_datasync_agent" "datasync" {
  depends_on = [ module.datasyncservers.module.datsyncinstances[var.uid].aws_instance.datasync_instance ]

where module.datsyncinstances uses for_each over datasync_ids in the module definition, and var.uid is the key from local.datasync_ids, e.g. “ds_001”; but terraform gives me this error if I try this:

Error: Invalid expression

  on datasyncagent/agent/agent.tf line 32, in resource "aws_datasync_agent" "datasync":
  32:   depends_on = [ module.datasyncservers.module.datsyncinstances[var.uid].aws_instance.datasync_instance ]

A single static variable reference is required: only attribute access and
indexing with constant keys. No calculations, function calls, template
expressions, etc are allowed here.

which I think is a result of a few issues, the two main ones being:

  • I can’t directly reference a “fully-qualified” resource address in this way, because the reference is resolved relative to the submodule I’m referencing from rather than from the root of the graph, so it can’t reach a resource that lives in a different submodule
  • it also doesn’t like the variable as the map index, because terraform can’t determine at plan time exactly which instance this resource should depend on (see the sketch below)
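In other words, as far as I understand the rules for depends_on (the first reference below is only an abstract example, not from my configuration):

# allowed: a static reference to a resource or module call declared in the same
# module, indexed only with constant keys
depends_on = [ module.datsyncinstances["ds_001"] ]

# not allowed: reaching into another module's resources, or indexing with a variable
depends_on = [ module.datasyncservers.module.datsyncinstances[var.uid].aws_instance.datasync_instance ]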

If I could do this, I think it would solve the problem, because it would tie each agent back to the instance it is running on within the terraform dependency graph, which means that if I had 4 instances and was only replacing 2 of them, only 2 of the agents would be replaced. There is only 1 agent per instance; it is a 1-to-1 mapping.

The simplistic way is to make the whole agent module dependent on the instances module (as sketched below), but this means that all my agents would be replaced if any of the instances were changed, which is unsuitable.
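i.e. a module-level depends_on in the root module, something like:

module "datasyncagent" {
  source     = "./datasyncagent"
  depends_on = [module.datasyncservers]

  datasync_ids     = local.datasync_ids
  datasync_ids_ips = try(module.datasyncservers.datasync_ids_ips, {})
}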

How do I make each agent dependent upon the instance it is activated against? Thanks in advance for discussion and suggestions.

Hi @rba1-source,

I can’t quite picture what it is you are trying to do from the description; a more complete configuration example would be better than trying to infer the details from imprecise terminology.

I would start though by asking why you need depends_on at all. The statement “… all my agents would be replaced if any of the instances were changed” leads me to believe you have an incorrect understanding of what dependencies are in Terraform: depends_on does not itself cause changes or force replacement, it only helps determine the order of operations when there is no other way to create references between the resources.

Normally dependencies are figured out implicitly via references and the data flow through the configuration. If you do need to create a dependency between resources in separate modules which don’t otherwise reference each other, you can pass some data from one resource through module input variables and outputs, then reference that value within the scope of the module.
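For example, something roughly like this (the output and variable names are just illustrative, and I’m ignoring the nested module layer for brevity):

# In the instances module: expose an attribute of each instance as an output
output "datasync_instance_ids" {
  value = { for k, inst in aws_instance.datasync_instance : k => inst.id }
}

# In the root module: pass that output into the agent module as an input variable
module "datasyncagent" {
  source                    = "./datasyncagent"
  datasync_ids_instance_ids = module.datasyncservers.datasync_instance_ids
}

# Inside the agent module, any expression which references
# var.datasync_ids_instance_ids now depends on the corresponding instance,
# with no depends_on required.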

Hi @jbardin, yes, you’re correct. I was talking about depends_on, but what I actually meant is that I need one resource to be replaced when a different resource is replaced, where there’s no explicit reference between them.

I think my answer will lie in using the replace_triggered_by lifecycle argument together with the terraform_data resource (since the instance and the agent are in different modules under the root, I can use terraform_data to stand in for the aws_instance resource directly in my agent module):

# Replace the agent if the underlying EC2 instance is replaced: terraform_data
# changes whenever the instance id passed into this module changes, and the
# lifecycle block below replaces the agent when that happens
resource "terraform_data" "replacement" {
  input = var.datasync_ids_instance_id
}

resource "aws_datasync_agent" "datasync" {
  # ...
  lifecycle {
    replace_triggered_by = [terraform_data.replacement]
  }
}
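For completeness, this is roughly how I expect to thread the instance id through to the agent module (the output and variable names here are placeholders for whatever I actually end up using):

# In the datasyncservers module: expose the instance id per key
output "datasync_ids_instance_ids" {
  value = { for k, m in module.datsyncinstances : k => m.instance_id }
}

# In the root module: pass the map into the agent module alongside the IPs
module "datasyncagent" {
  source                    = "./datasyncagent"
  datasync_ids              = local.datasync_ids
  datasync_ids_ips          = try(module.datasyncservers.datasync_ids_ips, {})
  datasync_ids_instance_ids = module.datasyncservers.datasync_ids_instance_ids
}

# Inside the agent module, each agent submodule then receives the single id for
# its own uid, e.g. datasync_ids_instance_id = var.datasync_ids_instance_ids[each.key]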