How to preserve double backslashes in heredoc?

Terraform documentation says that “Backslash sequences are not interpreted in a heredoc string”, but in my case two consecutive backslashes appear to be replaced with four.
To create an aws_glue_catalog_table for VPC flow logs, I store the regex in a heredoc and reference it as the input.regex parameter.
When the plan is printed, every double backslash is shown as four backslashes.
Using Terraform v0.12.29.

vpc_flow_log_regex = <<EOF
^([^ ]+)\\s+([0-9]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([0-9]+)\\s+([0-9]+)\\s+([^ ]+)\\s+([^ ]+)$
EOF

resource "aws_glue_catalog_table" "vpc_flow_logs" {
...
      parameters = {
        "serialization.format" = "1"
        "input.regex"          = trimspace(local.vpc_flow_log_regex)
      }

Plan output:

+ parameters = {
+ "input.regex" = "^([^ ]+)\\\\s+([0-9]+)\\\\s+([^ ]+)\\\\s+([^ ]+)\\\\s+([^ ]+)\\\\s+([^ ]+)\\\\s+([^ ]+)\\\\s+([^ ]+)\\\\s+([^ ]+)\\\\s+([^ ]+)\\\\s+([0-9]+)\\\\s+([0-9]+)\\\\s+([^ ]+)\\\\s+([^ ]+)$"
+ "serialization.format" = "1"

Hi @romangres!

From what you’ve shared it doesn’t look like those are really four backslashes. Instead, Terraform’s rendering of the plan is illustrating the string value in quotes, and so it’s inserting those extra backslashes for display purposes only, because they would’ve been required if you had written vpc_flow_log_regex in a normal quoted string.

The real value of input.regex, without the quoting for display in the UI, would be this:

^([^ ]+)\\s+([0-9]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([0-9]+)\\s+([0-9]+)\\s+([^ ]+)\\s+([^ ]+)$

I’m not familiar with AWS Glue, so I’m not sure what the expected formatting of input.regex is, but I’d worry that this is still too many backslashes, because in typical regular expression syntax \\s matches a literal backslash followed by the letter s, rather than acting as the whitespace metacharacter \s. If my assumptions here are correct, you’d need to write that string using only single backslashes, which will be taken literally by Terraform (because, as you noted, backslashes are not significant in a “heredoc”):

vpc_flow_log_regex = <<EOF
^([^ ]+)\s+([0-9]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([0-9]+)\s+([0-9]+)\s+([^ ]+)\s+([^ ]+)$
EOF

Because Terraform renders this as a quoted string in the plan output, you’ll see extra backslashes in the UI, which are needed to make it valid quoted string syntax for display:

+ "input.regex" = "^([^ ]+)\\s+([0-9]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+([0-9]+)\\s+([0-9]+)\\s+([^ ]+)\\s+([^ ]+)$"

As before, those extra backslashes are there only because this value is being rendered in quotes for the UI, and they won’t be included in the data sent to the remote API.

(In case you’re wondering, Terraform uses the quoted form in the plan output here because you used trimspace on that value and so there are no longer any newlines in it by the time it’s assigned to input.regex. Therefore Terraform is forced to use quoted string syntax to render the value accurately in the UI, because the Terraform language has no way to write a heredoc that doesn’t introduce at least one newline character at the end of the string.)
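If you want to convince yourself that the doubled backslashes are only a display artifact, one option is to compare the heredoc form against the equivalent quoted form. This is a minimal sketch with a shortened regex; the names heredoc_regex, quoted_regex and same_value are made up for illustration and are not from the configuration above.

```hcl
# Hypothetical names throughout; a sketch, not the original configuration.
locals {
  # Heredoc: backslashes are taken literally, and the value ends with a newline.
  heredoc_regex = <<EOF
^([^ ]+)\s+([0-9]+)$
EOF

  # Quoted string: each literal backslash has to be written as two backslashes.
  quoted_regex = "^([^ ]+)\\s+([0-9]+)$"
}

output "same_value" {
  # Compares the trimmed heredoc value with the quoted form of the same text.
  value = trimspace(local.heredoc_regex) == local.quoted_regex
}
```

If the two forms really describe the same characters, same_value should come out as true, which would mean the value sent to the provider contains single backslashes even though the plan prints them doubled.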

Running into something similar here (also for creating a glue data table with a gnarly regex…): https://github.com/hashicorp/terraform/issues/24604#issuecomment-705897229

@apparentlymart thank you very much. Your assumptions were correct: I needed to write that string using only single backslashes.
@mwarkentin I believe this will work for you as well.
In my case I save the string to a local value and reference it using trimspace.

locals {
  vpc_flow_log_regex = <<EOF
^([^ ]+)\s+([0-9]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([^ ]+)\s+([0-9]+)\s+([0-9]+)\s+([^ ]+)\s+([^ ]+)$
EOF
}

"input.regex" = trimspace(local.vpc_flow_log_regex)

@romangres thanks for that, I’ll try it out.

I think I have managed to get this working, and it is the case that the \"([^\\s]+?)\" should actually be \"([^s]+?)\"

Not sure why the docs needed \s. Maybe it needed to be escaped for some reason when being entered through the console.

In any case, the final regex I used was ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) ([-0-9]*) ([-0-9]*) "([^ ]*) ([^ ]*) (- |[^ ]*)" "([^"]*)" ([A-Z0-9-]+) ([A-Za-z0-9.-]*) ([^ ]*) "([^"]*)" "([^"]*)" "([^"]*)" ([-.0-9]*) ([^ ]*) "([^"]*)" "([^"]*)" "([^ ]*)" "([^s]+?)" "([^s]+)" "([^ ]*)" "([^ ]*)" being loaded in via file()
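For anyone wanting to try the file() approach, here is a rough sketch of how it might be wired up. The file name log_regex.txt and the local name log_regex are assumptions for illustration; the file is expected to hold the regex exactly as the service should receive it, with no extra escaping for quotes or backslashes.

```hcl
# Hypothetical file and local names; adjust to your own layout.
locals {
  # file() returns the file contents as a string without interpreting
  # backslashes or quotes; trimspace() drops a trailing newline if the
  # editor added one.
  log_regex = trimspace(file("${path.module}/log_regex.txt"))
}

output "log_regex_preview" {
  # Terraform shows this in quoted form, so backslashes and double
  # quotes appear escaped in the output even though the value is unchanged.
  value = local.log_regex
}
```

The local can then be referenced the same way as the heredoc version, for example "input.regex" = local.log_regex inside the parameters map.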