How to check if a list has any duplicate value(s)?

Hi there!

I’m trying to do input validation where I need to make sure the values in a list are unique, and the condition should return false if they are not. Any idea how to do that efficiently? So far, I could come up with this:

> ! can(length(var.my_list) == length(distinct(var.my_list)))
false

but it doesn’t look any good at all. I couldn’t find anything in the forum either. What are my options?

-San

This example you’ve shared is returning whether it’s possible to compare the length of two collections for equality, so I expect this expression will always return false because it’s always possible to compare two numbers (and it’s negated).
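To illustrate with a hypothetical three-element list (the local names below are made up for this sketch and are not from the original question):

```hcl
locals {
  # Hypothetical input containing one duplicate ("a").
  sample = ["a", "b", "a"]

  # The bare comparison is what actually detects duplicates:
  # length is 3, distinct length is 2, so this is false.
  lengths_match = length(local.sample) == length(distinct(local.sample))

  # can() only reports whether the expression evaluated without an error.
  # Comparing two numbers never fails, so this is true regardless of the input.
  wrapped = can(length(local.sample) == length(distinct(local.sample)))

  # Negating a value that is always true always gives false.
  negated = !local.wrapped
}
```

So the negated `can(...)` form gives false for a list with duplicates and for a list without them, which is why it looked correct in a single negative test.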

However, the part you have inside the can call seems like a reasonable answer to me, reading as “the length of the list should remain the same after removing any duplicates”:

  validation {
    condition     = length(var.my_list) == length(distinct(var.my_list))
    error_message = "All elements must be unique."
  }

If this is a collection of unique items that aren’t in any particular order then a different answer could be to declare the type as being a set rather than a list. Sets automatically coalesce duplicate elements by definition, so you wouldn’t need a validation rule at all in that case:

variable "example" {
  type = set(string)
}

An advantage of this approach is that a potential user of your module can see from the type of the variable that it’s intended to be an unordered set rather than an ordered list. However, it won’t work if you need to preserve the order of elements given by the caller, because list is the only collection kind that preserves ordering.

I didn’t realize that was the case. I tested with a negative set, which came out false as expected, so I thought it should be okay. Thanks for pointing that out, @apparentlymart

Thanks for your example @apparentlymart , that helps a lot!
An extended question here: is there a way in today’s Terraform to find out which values in that list are duplicated?

Cheers,
Michael

Hi @sdhuang32,

I think your goal would require quite a different approach since identifying which elements are duplicated requires preserving more information throughout the process.

One way to do it would be over two steps, both using for expressions:

  1. Use a for expression with braces {}, using the “grouping mode” symbol ..., to produce a mapping from distinct string value to a list of elements of that value.
  2. Use a for expression with brackets [] on the result from the previous step to produce a sequence of keys from the mapping where the associated list has more than one element, which would therefore indicate a duplicate element.

The result would then be a tuple of strings that appeared more than once in the input list. The ordering of the input elements is not preserved, so you may wish to pass the result through the toset function to clearly indicate that the order is not meaningful.
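A minimal sketch of those two steps, assuming the input is the `var.my_list` variable from earlier in this topic and that it is a list of strings. The local names `grouped` and `duplicates`, the default value, and the output are hypothetical and only there for illustration:

```hcl
variable "my_list" {
  type    = list(string)
  default = ["a", "b", "a", "c", "b"] # hypothetical sample input
}

locals {
  # Step 1: grouping mode (the trailing "...") collects every element
  # that shares a key into a list instead of raising a duplicate-key error.
  # For the sample input: { a = ["a", "a"], b = ["b", "b"], c = ["c"] }
  grouped = { for v in var.my_list : v => v... }

  # Step 2: keep only the keys whose group has more than one element.
  # toset makes it explicit that the order is not meaningful.
  duplicates = toset([for k, vs in local.grouped : k if length(vs) > 1])
}

output "duplicates" {
  value = local.duplicates
}
```

With the sample input above, `local.duplicates` contains "a" and "b". The same expression could also be interpolated into a validation `error_message` to tell the caller which values are repeated, provided your Terraform version allows expressions there.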

This is an old topic, so if you have further questions please start a new topic so we can discuss without causing notifications for the folks who participated in this one. Thanks!

Hi @apparentlymart ,

I successfully implemented it using the “grouping mode” technique as you mentioned.

Thank you and @dsantanu so much for saving my day!

Cheers,
Michael
