Possible to dynamically create several blocks of the same name?

I want to create a data structure to drive a list of datadog_logs_custom_pipeline resources (docs), each of which has several blocks called processor.

Previously, I made a dynamic block with a content block so that I could build a single block. However, since there can be many blocks with the name processor, is there any way to dynamically create several of these objects?
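For reference, the dynamic-block pattern I had been using looks something like this (the block and attribute names here are illustrative only, not from the Datadog provider):

dynamic "setting" {
  for_each = var.settings   # illustrative variable: a list of objects
  content {
    name  = setting.value.name
    value = setting.value.value
  }
}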

The data structure is called pipelines, where each element can have a map of processors. The pipelines API returns a structure that contains a list called pipelines. Example from the docs:

{
  "filter": {
    "query": "source:python"
  },
  "id": "string",
  "is_enabled": false,
  "is_read_only": false,
  "name": "",
  "processors": [
    {
      "grok": {
        "match_rules": "rule_name_1 foo\nrule_name_2 bar\n",
        "support_rules": "rule_name_1 foo\nrule_name_2 bar\n"
      },
      "is_enabled": false,
      "name": "string",
      "samples": [],
      "source": "message",
      "type": "grok-parser"
    }
  ],
  "type": "pipeline"
}

So again, if I have this list-of-maps type of structure, how can I dynamically create processor blocks?

Example TF snippet from the docs:

resource "datadog_logs_custom_pipeline" "sample_pipeline" {
  filter {
    query = "source:foo"
  }
  name       = "sample pipeline"
  is_enabled = true

  processor {
    grok_parser {
      samples = ["sample log 1"]
      source  = "message"
      grok {
        support_rules = ""
        match_rules   = "Rule %%{word:my_word2} %%{number:my_float2}"
      }
      name       = "sample grok parser"
      is_enabled = true
    }
  }
  processor {
    lookup_processor {
      source         = "service_id"
      target         = "service_name"
      lookup_table   = ["1,my service"]
      default_lookup = "unknown service"
      name           = "sample lookup processor"
      is_enabled     = true
    }
  }
}

Hi @darkn3rd,

This is a horrible one :slight_smile: as the datadog_logs_custom_pipeline resource’s processor block contains a nested block for each processor type, each with a unique set of attributes (which are sometimes blocks themselves). Thankfully, though, the Nested Schema for processor does define them all.

This would be quite simple if the resource had been written to support Attributes as Blocks for the processor block, but unfortunately it seems not.

My example below takes a sample JSON output (created based upon your example and the docs - as I don’t have the ability to call the datadog API) which replicates the first three processors shown in the datadog_logs_custom_pipeline resource’s example.

It does the following:

  • Transforms the JSON into an HCL object which can be used in the for_each of the dynamic processor block.
  • In each iteration of the dynamic block, checks the processor against the nested dynamic blocks.
  • If the processor matches an included dynamic block, includes that block’s content, setting its attributes.
  • Where a specific processor also includes nested blocks, creates those nested blocks too.

I have included a datadog_logs_custom_pipeline.sample_pipeline resource (copied from the doc example) which shows the static (example) version vs. the dynamic one. From what I can see, the terraform plan output is identical.

Copy the code block below into a main.tf in an empty directory, add the relevant datadog provider block (see the sketch just below), and you should be able to run a terraform plan and experiment as needed to explore the solution.
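For example, a minimal provider setup might look like this (the variable names are placeholders; the Datadog provider can typically also read credentials from environment variables):

terraform {
  required_providers {
    datadog = {
      source = "DataDog/datadog"
    }
  }
}

# Placeholder variables - supply your own credentials
provider "datadog" {
  api_key = var.datadog_api_key
  app_key = var.datadog_app_key
}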

locals {

    # This is sample JSON output from the pipelines API that contains a list of processors
    # that will be used to create the processors in the `datadog_logs_custom_pipeline` resource
    # Note: it replicates the first 3 processors from the example at
    # https://registry.terraform.io/providers/DataDog/datadog/latest/docs/resources/logs_custom_pipeline
    processors_json = <<JSON
{
  "filter": {
    "query": "source:python"
  },
  "id": "string",
  "is_enabled": false,
  "is_read_only": false,
  "name": "",
  "processors": [
    {
      "arithmetic_processor": {
        "expression": "(time1 - time2)*1000",
        "is_enabled": true,
        "is_replace_missing": true,
        "name": "sample arithmetic processor",
        "target": "my_arithmetic"
      }
    },
    {
      "attribute_remapper": {
        "is_enabled": true,
        "name": "sample attribute processor",
        "override_on_conflict": false,
        "preserve_source": true,
        "source_type": "tag",
        "sources": ["db.instance"],
        "target": "db",
        "target_format": "string",
        "target_type": "attribute"
      }
    },
    {
      "category_processor": {
        "category": [
          { "filter": { "query": "@severity: \".\"" }, "name": "debug" },
          { "filter": { "query": "@severity: \"-\"" }, "name": "verbose" }
        ],
        "is_enabled": true,
        "name": "sample category processor",
        "target": "foo.severity"
      }
    }
  ],
  "type": "pipeline"
}
JSON

    # Decode the JSON, extract the list of processors, and convert it to an object of objects (using merge and the ... expansion syntax)
    processors = merge(jsondecode(local.processors_json).processors...)
}
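For clarity, the resulting local.processors value is a single object keyed by processor type, roughly this shape (abridged):

# Abridged illustration of local.processors after the merge:
#
# {
#   arithmetic_processor = { expression = "(time1 - time2)*1000", ... }
#   attribute_remapper   = { sources = ["db.instance"], ... }
#   category_processor   = { category = [...], target = "foo.severity", ... }
# }

One caveat: because merge keys by processor type, two processors of the same type in one pipeline would collapse into a single entry. That is fine here since each type appears only once in the sample.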

# This resource is taken from the example in the docs (using the first 3 processors)
# This is to show a comparison between the static and dynamic approach
resource "datadog_logs_custom_pipeline" "sample_pipeline" {
  filter {
    query = "source:foo"
  }
  name       = "sample pipeline"
  is_enabled = true
  processor {
    arithmetic_processor {
      expression         = "(time1 - time2)*1000"
      target             = "my_arithmetic"
      is_replace_missing = true
      name               = "sample arithmetic processor"
      is_enabled         = true
    }
  }
  processor {
    attribute_remapper {
      sources              = ["db.instance"]
      source_type          = "tag"
      target               = "db"
      target_type          = "attribute"
      target_format        = "string"
      preserve_source      = true
      override_on_conflict = false
      name                 = "sample attribute processor"
      is_enabled           = true
    }
  }
  processor {
    category_processor {
      target = "foo.severity"
      category {
        name = "debug"
        filter {
          query = "@severity: \".\""
        }
      }
      category {
        name = "verbose"
        filter {
          query = "@severity: \"-\""
        }
      }
      name       = "sample category processor"
      is_enabled = true
    }
  }
}


# This resource uses a dynamic block to create the processors.
# Each processor type is itself a nested dynamic block inside it,
# as each processor type can have different attributes and the
# processor block does not support the attributes-as-blocks format :(
# https://developer.hashicorp.com/terraform/language/attr-as-blocks
resource "datadog_logs_custom_pipeline" "sample_pipeline_dynamic" {
  filter {
    query = "source:foo"
  }
  name       = "sample pipeline"
  is_enabled = true
  dynamic "processor" {
    for_each = local.processors
    content {
      # dynamic block for each processor type (You will need to add more for other processor types)
      dynamic "arithmetic_processor" {
        for_each = processor.key == "arithmetic_processor" ? [processor.value] : []
        content {
          expression         = arithmetic_processor.value.expression
          target             = arithmetic_processor.value.target
          is_replace_missing = arithmetic_processor.value.is_replace_missing
          name               = arithmetic_processor.value.name
          is_enabled         = arithmetic_processor.value.is_enabled
        }
      }
      dynamic "attribute_remapper" {
        for_each = processor.key == "attribute_remapper" ? [processor.value] : []
        content {
          sources              = attribute_remapper.value.sources
          source_type          = attribute_remapper.value.source_type
          target               = attribute_remapper.value.target
          target_type          = attribute_remapper.value.target_type
          target_format        = attribute_remapper.value.target_format
          preserve_source      = attribute_remapper.value.preserve_source
          override_on_conflict = attribute_remapper.value.override_on_conflict
          name                 = attribute_remapper.value.name
          is_enabled           = attribute_remapper.value.is_enabled
        }
      }
      dynamic "category_processor" {
        for_each = processor.key == "category_processor" ? [processor.value] : []
        content {
          target = category_processor.value.target
          # This processor type has a list of categories that are also dynamic!
          dynamic "category" {
            for_each = category_processor.value.category
            content {
              name = category.value.name
              filter {
                query = category.value.filter.query
              }
            }
          }
          name       = category_processor.value.name
          is_enabled = category_processor.value.is_enabled
        }
        
      }
    }
  }
}

Hope that helps!

Happy Terraforming


This is awesome! I have a list of pipelines, each of which has a list of processors. Below is an example.

I get this by using:

curl -sX GET "https://api.datadoghq.com/api/v1/logs/config/pipelines" \
-H "Accept: application/json" \
-H "DD-API-KEY: ${DD_API_KEY}" \
-H "DD-APPLICATION-KEY: ${DD_APP_KEY}"

Example

[
  {
    "id": "xxxxx",
    "type": "pipeline",
    "name": "Datadog Agent",
    "is_enabled": true,
    "is_read_only": true,
    "filter": {
      "query": "some_query"
    },
    "processors": [
      {
        "name": "Parsing Datadog Agent logs",
        "is_enabled": true,
        "source": "message",
        "samples": ["string1", "string2", "string3"],
        "grok": {
          "support_rules": "",
          "match_rules": "some_rule"
        },
        "type": "grok-parser"
      },
      {
        "name": "define timestamp",
        "is_enabled": true,
        "sources": [
          "timestamp"
        ],
        "type": "date-remapper"
      },
      {
        "name": "define level",
        "is_enabled": true,
        "sources": [
          "level"
        ],
        "type": "status-remapper"
      }
    ]
  }
]

So far, what I have done is something like this:

locals {
  workspace = trimprefix(var.org_workspace, "my-org-")
  pipelines = jsondecode(file("${path.module}/tfvars/${local.workspace}.pipelines.json"))
}

resource "datadog_logs_custom_pipeline" "pipelines" {
  for_each = { for pipeline in local.pipelines : pipeline.name => pipeline }

  name       = each.value.name
  is_enabled = each.value.is_enabled

  filter {
    query = each.value.filter.query
  }

  # work in progress
  dynamic "processor" {
    for_each = each.value.processors
    content {
      dynamic "grok_parser" {
        for_each = processor.value.type == "grok-parser" ? [processor.value] : []
        content {
          samples = grok_parser.value.samples
          source  = grok_parser.value.source
          // cannot get this to work, as it seems to iterate through each line in the multiline string
          //          dynamic "grok" {
          //            for_each = grok_parser.value.grok
          //            content {
          //              support_rules = grok.value.support_rules
          //              match_rules   = grok.value.match_rules
          //            }
          //          }
          // use static for now
          grok {
            support_rules = grok_parser.value.grok.support_rules
            match_rules   = grok_parser.value.grok.match_rules
          }
          name       = grok_parser.value.name
          is_enabled = grok_parser.value.is_enabled
        }
      }
    }
  }
}

But I am not able to get the grok for_each block correct. So I just changed it to static.

I updated the solution I am working on to handle some of the processors (not all of them); I’ll add the rest as needed. I am pasting it below in case others are interested in this.

One thing I am not sure how to do: the structure seems to be recursive, as a processor can be another pipeline with processors inside it. I suppose I’ll cross that bridge when I get there.

resource "datadog_logs_custom_pipeline" "pipelines" {
  for_each = { for pipeline in local.pipelines : pipeline.name => pipeline }

  name       = each.value.name
  is_enabled = each.value.is_enabled

  filter {
    query = each.value.filter.query
  }

  dynamic "processor" {
    for_each = each.value.processors
    content {
      dynamic "attribute_remapper" {
        for_each = processor.value.type == "attribute-remapper" ? [processor.value] : []
        content {
          sources              = attribute_remapper.value.sources
          source_type          = attribute_remapper.value.source_type
          target               = attribute_remapper.value.target
          target_type          = attribute_remapper.value.target_type
          preserve_source      = attribute_remapper.value.preserve_source
          override_on_conflict = attribute_remapper.value.override_on_conflict

          name       = attribute_remapper.value.name
          is_enabled = attribute_remapper.value.is_enabled
        }
      }

      dynamic "category_processor" {
        for_each = processor.value.type == "category-processor" ? [processor.value] : []
        content {
          target = category_processor.value.target

          dynamic "category" {
            for_each = category_processor.value.categories
            content {
              name = category.value.name
              filter {
                query = category.value.filter.query
              }
            }
          }

          name       = category_processor.value.name
          is_enabled = category_processor.value.is_enabled
        }
      }

      dynamic "date_remapper" {
        for_each = processor.value.type == "date-remapper" ? [processor.value] : []
        content {
          sources = date_remapper.value.sources

          name       = date_remapper.value.name
          is_enabled = date_remapper.value.is_enabled
        }
      }

      dynamic "grok_parser" {
        for_each = processor.value.type == "grok-parser" ? [processor.value] : []
        content {
          samples = grok_parser.value.samples
          source  = grok_parser.value.source
          grok {
            support_rules = grok_parser.value.grok.support_rules
            match_rules   = grok_parser.value.grok.match_rules
          }

          name       = grok_parser.value.name
          is_enabled = grok_parser.value.is_enabled
        }
      }

      dynamic "message_remapper" {
        for_each = processor.value.type == "message-remapper" ? [processor.value] : []
        content {
          sources = message_remapper.value.sources

          name       = message_remapper.value.name
          is_enabled = message_remapper.value.is_enabled
        }
      }

      dynamic "status_remapper" {
        for_each = processor.value.type == "status-remapper" ? [processor.value] : []
        content {
          sources = status_remapper.value.sources

          name       = status_remapper.value.name
          is_enabled = status_remapper.value.is_enabled
        }
      }

      dynamic "trace_id_remapper" {
        for_each = processor.value.type == "trace-id-remapper" ? [processor.value] : []
        content {
          sources = trace_id_remapper.value.sources

          name       = trace_id_remapper.value.name
          is_enabled = trace_id_remapper.value.is_enabled
        }
      }

      dynamic "url_parser" {
        for_each = processor.value.type == "url-parser" ? [processor.value] : []
        content {
          sources                  = url_parser.value.sources
          target                   = url_parser.value.target
          normalize_ending_slashes = url_parser.value.normalize_ending_slashes

          name       = url_parser.value.name
          is_enabled = url_parser.value.is_enabled
        }
      }

      dynamic "user_agent_parser" {
        for_each = processor.value.type == "user-agent-parser" ? [processor.value] : []
        content {
          sources    = user_agent_parser.value.sources
          target     = user_agent_parser.value.target
          is_encoded = user_agent_parser.value.is_encoded

          name       = user_agent_parser.value.name
          is_enabled = user_agent_parser.value.is_enabled
        }
      }

    }
  }

}
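One caveat I am allowing for (an assumption on my part about the API output, not something I have confirmed): if some pipelines omit optional attributes in the JSON, wrapping those lookups in try() with a null fallback lets the provider apply its own default, e.g.:

dynamic "url_parser" {
  for_each = processor.value.type == "url-parser" ? [processor.value] : []
  content {
    sources = url_parser.value.sources
    target  = url_parser.value.target
    # assumption: null lets the provider default apply if the API omitted the field
    normalize_ending_slashes = try(url_parser.value.normalize_ending_slashes, null)

    name       = url_parser.value.name
    is_enabled = url_parser.value.is_enabled
  }
}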

Hi @darkn3rd,

It’s looking great - good job!

As the documentation for the grok_parser states the following for the grok block:

grok (Block List, Min: 1, Max: 1)

just using a static block, as you have done, is all that’s needed. If there could be zero or more than one such block then for_each would be relevant, but as the schema states the block will always appear, and will only appear once, the for_each is unnecessary.
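That said, if a single nested block like this ever were optional, a sketch of the pattern would be to wrap the object in a single-element list. Iterating the object itself is what made your commented-out version step through each attribute of grok one at a time:

dynamic "grok" {
  # try() yields an empty list if the grok attribute is absent;
  # wrapping in [ ] avoids iterating over the object's attributes
  for_each = try([grok_parser.value.grok], [])
  content {
    support_rules = grok.value.support_rules
    match_rules   = grok.value.match_rules
  }
}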

I am wondering if there is a method to have recursion, as the data structure is recursive.

A Terraform provider schema cannot have unbounded nesting of configuration blocks because the schema format in the provider protocol is itself a strict tree.

Therefore there must be a finite amount of nesting possible here, rather than unbounded recursive nesting. While it might require an annoying amount of configuration boilerplate to represent it, you should always be able to describe the full extent of any resource type’s nested block structure using a finite set of dynamic blocks if you want to.
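As a sketch with hypothetical block names (not taken from the Datadog provider), each level of nesting simply becomes another nested dynamic block, down to whatever finite depth the schema defines:

dynamic "outer" {
  for_each = local.outers            # hypothetical collection
  content {
    name = outer.value.name
    dynamic "inner" {
      for_each = outer.value.inners  # hypothetical nested list
      content {
        name = inner.value.name
        # ...and so on, one dynamic block per level the schema allows
      }
    }
  }
}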
