Issues with running Terraform in a pod in Kubernetes

We’ve been running Terraform along with Terragrunt in our on-prem k8s clusters for quite some time now and started having issues a while back after upgrading TF and TG.
When the same versions of TF and TG are run on a local Windows system, everything works. Note: The cluster is Linux, so that is one difference.
In the k8s clusters we used to have really good success with occasional failures; now we have mostly failures with occasional success. The failure mode is really weird as well, with Terraform outputting what looks like verbose logging even though we never set it to verbose. See sample logs below.
The deploys will run until manually stopped; they never complete on their own once they get into this state.
The two primary providers are the Rancher2 provider and the Kubernetes provider, with state being saved in Artifactory.
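
Since a deploy in this state never exits on its own, one stopgap is to put a deadline on the command so the stage fails instead of hanging. This is only a minimal sketch, assuming GNU coreutils `timeout` is available in the image (it is part of the Ubuntu base image); `run_with_deadline` is a hypothetical helper name and is not part of the pipeline shown further down.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original pipeline):
# run a command under a deadline so a hung run fails instead of waiting forever.
run_with_deadline() {
    local deadline="$1"
    shift
    # timeout sends SIGTERM at the deadline and SIGKILL 30s later if the
    # command is still running.
    timeout --kill-after=30s "$deadline" "$@"
    local status=$?
    if [ "$status" -eq 124 ]; then
        echo "command timed out after $deadline: $*" >&2
    elif [ "$status" -eq 137 ]; then
        # 137 = 128 + 9 (SIGKILL): either the deadline escalation above or an
        # external kill, such as the kernel OOM killer.
        echo "command ended via SIGKILL: $*" >&2
    fi
    return "$status"
}
```

It would be called from the `sh` step as, for example, `run_with_deadline 30m terragrunt apply -auto-approve`; the 30 minute value is arbitrary.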

Here’s a sample of log output for a successful run.

16:44:00  [terragrunt] [/terraform/jenkins-test] 2020/10/16 00:44:00 Running command: terraform --version
16:44:01  [terragrunt] 2020/10/16 00:44:00 Terraform version: 0.13.3
16:44:01  [terragrunt] 2020/10/16 00:44:00 Reading Terragrunt config file at /terraform/jenkins-test/terragrunt.hcl
16:44:01  [terragrunt] 2020/10/16 00:44:00 Generated file /terraform/jenkins-test/rancher2_provider_tg.tf.
16:44:01  [terragrunt] 2020/10/16 00:44:00 Generated file /terraform/jenkins-test/kubernetes_provider_tg.tf.
16:44:01  [terragrunt] 2020/10/16 00:44:00 Generated file /terraform/jenkins-test/versions_tg.tf.
16:44:01  [terragrunt] 2020/10/16 00:44:00 Generated file /terraform/jenkins-test/variables_tg.tf.
16:44:01  [terragrunt] 2020/10/16 00:44:00 Generated file /terraform/jenkins-test/backend_tg.tf.
16:44:01  [terragrunt] 2020/10/16 00:44:00 Initializing remote state for the artifactory backend
16:44:01  [terragrunt] 2020/10/16 00:44:00 Running command: terraform init
16:44:01  Initializing modules...
16:44:01  Downloading git::https://bitbucket.metro.ad.selinc.com/scm/jenkinsconfig/terraform-jenkins-configuration.git for jenkins...
16:44:01  - jenkins in .terraform/modules/jenkins
16:44:01  
16:44:01  Initializing the backend...
16:44:01  
16:44:01  Successfully configured the backend "artifactory"! Terraform will automatically
16:44:01  use this backend unless the backend configuration changes.
16:44:01  
16:44:01  Initializing provider plugins...
16:44:01  - Finding rancher/rancher2 versions matching "~> 1.10.3"...
16:44:01  - Finding hashicorp/kubernetes versions matching "~> 1.13.2"...
16:44:03  - Installing rancher/rancher2 v1.10.3...
16:44:10  - Installed rancher/rancher2 v1.10.3 (signed by a HashiCorp partner, key ID 2EEB0F9AD44A135C)
16:44:10  - Installing hashicorp/kubernetes v1.13.2...
16:44:11  - Installed hashicorp/kubernetes v1.13.2 (signed by HashiCorp)
16:44:11  
16:44:11  Partner and community providers are signed by their developers.
16:44:11  If you'd like to know more about provider signing, you can read about it here:
16:44:11  https://www.terraform.io/docs/plugins/signing.html
16:44:11  
16:44:11  Terraform has been successfully initialized!
16:44:11  
16:44:11  You may now begin working with Terraform. Try running "terraform plan" to see
16:44:11  any changes that are required for your infrastructure. All Terraform commands
16:44:11  should now work.
16:44:11  
16:44:11  If you ever set or change modules or backend configuration for Terraform,
16:44:11  rerun this command to reinitialize your working directory. If you forget, other
16:44:11  commands will detect it and remind you to do so if necessary.
16:44:11  [terragrunt] [/terraform/jenkins-test] 2020/10/16 00:44:11 Running command: terraform --version
16:44:12  [terragrunt] 2020/10/16 00:44:12 Terraform version: 0.13.3
16:44:12  [terragrunt] 2020/10/16 00:44:12 Reading Terragrunt config file at /terraform/jenkins-test/terragrunt.hcl
16:44:12  [terragrunt] 2020/10/16 00:44:12 The file path /terraform/jenkins-test/rancher2_provider_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
16:44:12  [terragrunt] 2020/10/16 00:44:12 Generated file /terraform/jenkins-test/rancher2_provider_tg.tf.
16:44:12  [terragrunt] 2020/10/16 00:44:12 The file path /terraform/jenkins-test/kubernetes_provider_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
16:44:12  [terragrunt] 2020/10/16 00:44:12 Generated file /terraform/jenkins-test/kubernetes_provider_tg.tf.
16:44:12  [terragrunt] 2020/10/16 00:44:12 The file path /terraform/jenkins-test/versions_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
16:44:12  [terragrunt] 2020/10/16 00:44:12 Generated file /terraform/jenkins-test/versions_tg.tf.
16:44:12  [terragrunt] 2020/10/16 00:44:12 The file path /terraform/jenkins-test/variables_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
16:44:12  [terragrunt] 2020/10/16 00:44:12 Generated file /terraform/jenkins-test/variables_tg.tf.
16:44:12  [terragrunt] 2020/10/16 00:44:12 The file path /terraform/jenkins-test/backend_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
16:44:12  [terragrunt] 2020/10/16 00:44:12 Generated file /terraform/jenkins-test/backend_tg.tf.
16:44:12  [terragrunt] 2020/10/16 00:44:12 Backend artifactory has not changed.
16:44:12  [terragrunt] 2020/10/16 00:44:12 Running command: terraform apply -var-file=../common.tfvars -auto-approve
16:44:20  module.jenkins.rancher2_namespace.jenkins-namespace: Refreshing state... [id=jenkins-test]
16:44:20  module.jenkins.rancher2_secret.additional-users[0]: Refreshing state... [id=jenkins-test:deploy-creds]
16:44:20  module.jenkins.rancher2_secret.bitbucket-service-account: Refreshing state... [id=jenkins-test:bitbucket-service-account]
16:44:20  module.jenkins.rancher2_secret.teams-webhook[0]: Refreshing state... [id=jenkins-test:teams-webhook]
16:44:20  module.jenkins.rancher2_secret.jira-service-account: Refreshing state... [id=jenkins-test:jira-service-account]
16:44:20  module.jenkins.rancher2_secret.sonarqube-token[0]: Refreshing state... [id=jenkins-test:sonarqube-token]
16:44:20  module.jenkins.rancher2_secret.artifactory-service-account: Refreshing state... [id=jenkins-test:artifactory-service-account]
16:44:20  module.jenkins.rancher2_secret.kubernetes-token: Refreshing state... [id=jenkins-test:kubernetes-token]
16:44:20  module.jenkins.rancher2_secret.container-repositories[0]: Refreshing state... [id=jenkins-test:container-repositories]
16:44:20  module.jenkins.rancher2_registry.regcred[0]: Refreshing state... [id=jenkins-test:regcred]
16:44:20  module.jenkins.rancher2_secret.bitbucket-hook-token: Refreshing state... [id=jenkins-test:bitbucket-hook-token]
16:44:21  module.jenkins.kubernetes_persistent_volume_claim.jenkins-pv: Refreshing state... [id=jenkins-test/jenkins-test]
16:44:22  module.jenkins.rancher2_app.jenkins: Refreshing state... [id=p-ssc7m:jenkins-test]
16:44:29  module.jenkins.rancher2_app.jenkins: Modifying... [id=p-ssc7m:jenkins-test]
16:44:39  module.jenkins.rancher2_app.jenkins: Still modifying... [id=p-ssc7m:jenkins-test, 10s elapsed]
16:44:49  module.jenkins.rancher2_app.jenkins: Still modifying... [id=p-ssc7m:jenkins-test, 20s elapsed]
16:44:59  module.jenkins.rancher2_app.jenkins: Still modifying... [id=p-ssc7m:jenkins-test, 30s elapsed]
16:49:01  module.jenkins.rancher2_app.jenkins: Still modifying... [id=p-ssc7m:jenkins-test, 40s elapsed]
16:49:01  module.jenkins.rancher2_app.jenkins: Still modifying... [id=p-ssc7m:jenkins-test, 50s elapsed]
16:49:01  module.jenkins.rancher2_app.jenkins: Still modifying... [id=p-ssc7m:jenkins-test, 1m0s elapsed]
16:49:01  module.jenkins.rancher2_app.jenkins: Still modifying... [id=p-ssc7m:jenkins-test, 1m10s elapsed]
16:49:01  module.jenkins.rancher2_app.jenkins: Modifications complete after 1m12s [id=p-ssc7m:jenkins-test]
16:49:01  
16:49:01  Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
16:49:04  Posting build status of SUCCESSFUL to Bitbucket for commit id [a9de72041505d536aef85e73566141cd87d266ac] and ref 'null'
16:49:06  Finished: SUCCESS

And here is what an unsuccessful deploy looks like. This is highly truncated for space and clarity.

12:14:40  [terragrunt] [/terraform/jenkins-edapthw] 2020/11/17 20:14:40 Running command: terraform --version
12:14:41  [terragrunt] 2020/11/17 20:14:41 Terraform version: 0.13.5
12:14:41  [terragrunt] 2020/11/17 20:14:41 Reading Terragrunt config file at /terraform/jenkins-edapthw/terragrunt.hcl
12:14:41  [terragrunt] 2020/11/17 20:14:41 Generated file /terraform/jenkins-edapthw/rancher2_provider_tg.tf.
12:14:41  [terragrunt] 2020/11/17 20:14:41 Generated file /terraform/jenkins-edapthw/kubernetes_provider_tg.tf.
12:14:41  [terragrunt] 2020/11/17 20:14:41 Generated file /terraform/jenkins-edapthw/versions_tg.tf.
12:14:41  [terragrunt] 2020/11/17 20:14:41 Generated file /terraform/jenkins-edapthw/variables_tg.tf.
12:14:41  [terragrunt] 2020/11/17 20:14:41 Generated file /terraform/jenkins-edapthw/backend_tg.tf.
12:14:41  [terragrunt] 2020/11/17 20:14:41 Initializing remote state for the artifactory backend
12:14:41  [terragrunt] 2020/11/17 20:14:41 Running command: terraform init
12:14:42  Initializing modules...
12:14:42  Downloading git::https://bitbucket.metro.ad.selinc.com/scm/jenkinsconfig/terraform-jenkins-configuration.git for jenkins...
12:14:42  - jenkins in .terraform/modules/jenkins
12:14:42  
12:14:42  Initializing the backend...
12:14:42  
12:14:42  Successfully configured the backend "artifactory"! Terraform will automatically
12:14:42  use this backend unless the backend configuration changes.
12:14:42  
12:14:42  Initializing provider plugins...
12:14:42  - Finding rancher/rancher2 versions matching "~> 1.10.3"...
12:14:43  - Finding hashicorp/kubernetes versions matching "~> 1.13.2"...
12:14:45  - Installing rancher/rancher2 v1.10.6...
12:15:03  - Installed rancher/rancher2 v1.10.6 (signed by a HashiCorp partner, key ID 2EEB0F9AD44A135C)
12:15:03  - Installing hashicorp/kubernetes v1.13.3...
12:15:04  - Installed hashicorp/kubernetes v1.13.3 (signed by HashiCorp)
12:15:04  
12:15:04  Partner and community providers are signed by their developers.
12:15:04  If you'd like to know more about provider signing, you can read about it here:
12:15:04  https://www.terraform.io/docs/plugins/signing.html
12:15:05  
12:15:05  
12:15:05  Warning: Interpolation-only expressions are deprecated
12:15:05  
12:15:05    on .terraform/modules/jenkins/container_repositories.tf line 37, in resource "rancher2_secret" "kaniko-secret":
12:15:05    37:         "${var.containerRepositories[0].url}" = {
12:15:05  
12:15:05  Terraform 0.11 and earlier required all non-constant expressions to be
12:15:05  provided via interpolation syntax, but this pattern is now deprecated. To
12:15:05  silence this warning, remove the "${ sequence from the start and the }"
12:15:05  sequence from the end of this expression, leaving just the inner expression.
12:15:05  
12:15:05  Template interpolation syntax is still used to construct strings from
12:15:05  expressions when the template includes multiple interpolation sequences or a
12:15:05  mixture of literal strings and interpolations. This deprecation applies only
12:15:05  to templates that consist entirely of a single interpolation sequence.
12:15:05  
12:15:05  (and 19 more similar warnings elsewhere)
12:15:05  
12:15:05  Terraform has been successfully initialized!
12:15:05  
12:15:05  You may now begin working with Terraform. Try running "terraform plan" to see
12:15:05  any changes that are required for your infrastructure. All Terraform commands
12:15:05  should now work.
12:15:05  
12:15:05  If you ever set or change modules or backend configuration for Terraform,
12:15:05  rerun this command to reinitialize your working directory. If you forget, other
12:15:05  commands will detect it and remind you to do so if necessary.
12:15:05  [terragrunt] [/terraform/jenkins-edapthw] 2020/11/17 20:15:05 Running command: terraform --version
12:15:06  [terragrunt] 2020/11/17 20:15:06 Terraform version: 0.13.5
12:15:06  [terragrunt] 2020/11/17 20:15:06 Reading Terragrunt config file at /terraform/jenkins-edapthw/terragrunt.hcl
12:15:06  [terragrunt] 2020/11/17 20:15:06 The file path /terraform/jenkins-edapthw/kubernetes_provider_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
12:15:06  [terragrunt] 2020/11/17 20:15:06 Generated file /terraform/jenkins-edapthw/kubernetes_provider_tg.tf.
12:15:06  [terragrunt] 2020/11/17 20:15:06 The file path /terraform/jenkins-edapthw/versions_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
12:15:06  [terragrunt] 2020/11/17 20:15:06 Generated file /terraform/jenkins-edapthw/versions_tg.tf.
12:15:06  [terragrunt] 2020/11/17 20:15:06 The file path /terraform/jenkins-edapthw/variables_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
12:15:06  [terragrunt] 2020/11/17 20:15:06 Generated file /terraform/jenkins-edapthw/variables_tg.tf.
12:15:06  [terragrunt] 2020/11/17 20:15:06 The file path /terraform/jenkins-edapthw/rancher2_provider_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
12:15:06  [terragrunt] 2020/11/17 20:15:06 Generated file /terraform/jenkins-edapthw/rancher2_provider_tg.tf.
12:15:06  [terragrunt] 2020/11/17 20:15:06 The file path /terraform/jenkins-edapthw/backend_tg.tf already exists, but was a previously generated file by terragrunt. Since if_exists for code generation is set to "overwrite_terragrunt", regenerating file.
12:15:06  [terragrunt] 2020/11/17 20:15:06 Generated file /terraform/jenkins-edapthw/backend_tg.tf.
12:15:06  [terragrunt] 2020/11/17 20:15:06 Backend artifactory has not changed.
12:15:06  [terragrunt] 2020/11/17 20:15:06 Running command: terraform plan -var-file=../common.tfvars
12:15:18  Refreshing Terraform state in-memory prior to plan...
12:15:18  The refreshed state will be used to calculate this plan, but will not be
12:15:18  persisted to local or remote state storage.
12:15:18  
12:15:20  module.jenkins.rancher2_namespace.jenkins-namespace: Refreshing state... [id=jenkins-edapthw]
12:15:59  2020/11/17 20:15:55 [INFO] Terraform version: 0.13.5  
12:15:59  2020/11/17 20:15:55 [INFO] Go runtime version: go1.14.7
12:15:59  2020/11/17 20:15:55 [INFO] CLI args: []string{"terraform", "plan", "-var-file=../common.tfvars"}
12:15:59  2020/11/17 20:15:55 [DEBUG] Attempting to open CLI config file: /root/.terraformrc
12:15:59  2020/11/17 20:15:55 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
12:15:59  2020/11/17 20:15:55 [DEBUG] ignoring non-existing provider search directory terraform.d/plugins
12:15:59  2020/11/17 20:15:55 [DEBUG] ignoring non-existing provider search directory /root/.terraform.d/plugins
12:15:59  2020/11/17 20:15:55 [DEBUG] ignoring non-existing provider search directory /root/.local/share/terraform/plugins
12:15:59  2020/11/17 20:15:55 [DEBUG] ignoring non-existing provider search directory /usr/local/share/terraform/plugins
12:15:59  2020/11/17 20:15:55 [DEBUG] ignoring non-existing provider search directory /usr/share/terraform/plugins
12:15:59  2020/11/17 20:15:55 [INFO] CLI command args: []string{"plan", "-var-file=../common.tfvars"}
12:15:59  2020/11/17 20:15:55 [TRACE] Meta.Backend: built configuration for "artifactory" backend with hash value 3773684453
12:15:59  2020/11/17 20:15:55 [TRACE] Preserving existing state lineage "d44f6141-f343-24c5-ead7-7f17f6e2950e"
12:15:59  2020/11/17 20:15:55 [TRACE] Preserving existing state lineage "d44f6141-f343-24c5-ead7-7f17f6e2950e"
12:15:59  2020/11/17 20:15:55 [TRACE] Meta.Backend: working directory was previously initialized for "artifactory" backend
12:15:59  2020/11/17 20:15:55 [TRACE] Meta.Backend: using already-initialized, unchanged "artifactory" backend configuration
12:15:59  2020/11/17 20:15:55 [TRACE] Meta.Backend: instantiated backend of type *artifactory.Backend
12:15:59  2020/11/17 20:15:55 [TRACE] providercache.fillMetaCache: scanning directory .terraform/plugins
12:15:59  2020/11/17 20:15:55 [TRACE] getproviders.SearchLocalDirectory: .terraform/plugins is a symlink to .terraform/plugins
12:15:59  2020/11/17 20:15:55 [TRACE] getproviders.SearchLocalDirectory: found registry.terraform.io/hashicorp/kubernetes v1.13.3 for linux_amd64 at .terraform/plugins/registry.terraform.io/hashicorp/kubernetes/1.13.3/linux_amd64
12:15:59  2020/11/17 20:15:55 [TRACE] getproviders.SearchLocalDirectory: found registry.terraform.io/rancher/rancher2 v1.10.6 for linux_amd64 at .terraform/plugins/registry.terraform.io/rancher/rancher2/1.10.6/linux_amd64
12:15:59  2020/11/17 20:15:55 [TRACE] providercache.fillMetaCache: including .terraform/plugins/registry.terraform.io/hashicorp/kubernetes/1.13.3/linux_amd64 as a candidate package for registry.terraform.io/hashicorp/kubernetes 1.13.3
12:15:59  2020/11/17 20:15:55 [TRACE] providercache.fillMetaCache: including .terraform/plugins/registry.terraform.io/rancher/rancher2/1.10.6/linux_amd64 as a candidate package for registry.terraform.io/rancher/rancher2 1.10.6
12:15:59  2020/11/17 20:15:56 [TRACE] providercache.fillMetaCache: using cached result from previous scan of .terraform/plugins
12:15:59  2020/11/17 20:15:56 [DEBUG] checking for provisioner in "."
12:15:59  2020/11/17 20:15:56 [DEBUG] checking for provisioner in "/terraform"
12:15:59  2020/11/17 20:15:56 [INFO] Failed to read plugin lock file .terraform/plugins/linux_amd64/lock.json: open .terraform/plugins/linux_amd64/lock.json: no such file or directory
12:15:59  2020/11/17 20:15:56 [TRACE] Meta.Backend: backend *artifactory.Backend does not support operations, so wrapping it in a local backend
12:15:59  2020/11/17 20:15:56 [INFO] backend/local: starting Plan operation
12:15:59  2020/11/17 20:15:56 [TRACE] backend/local: requesting state manager for workspace "default"
12:15:59  2020/11/17 20:15:56 [TRACE] backend/local: requesting state lock for workspace "default"
12:15:59  2020/11/17 20:15:56 [TRACE] backend/local: reading remote state for workspace "default"
12:15:59  2020/11/17 20:15:56 [TRACE] backend/local: retrieving local state snapshot for workspace "default"
12:15:59  2020/11/17 20:15:56 [TRACE] backend/local: building context for current working directory
12:15:59  2020/11/17 20:15:56 [TRACE] terraform.NewContext: starting
12:15:59  2020/11/17 20:15:56 [TRACE] terraform.NewContext: loading provider schemas
12:15:59  2020/11/17 20:15:56 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/hashicorp/kubernetes"
12:15:59  2020-11-17T20:15:56.883Z [INFO]  plugin: configuring client automatic mTLS
12:15:59  2020-11-17T20:15:56.973Z [DEBUG] plugin: starting plugin: path=.terraform/plugins/registry.terraform.io/hashicorp/kubernetes/1.13.3/linux_amd64/terraform-provider-kubernetes_v1.13.3_x4 args=[.terraform/plugins/registry.terraform.io/hashicorp/kubernetes/1.13.3/linux_amd64/terraform-provider-kubernetes_v1.13.3_x4]
12:15:59  2020-11-17T20:15:56.974Z [DEBUG] plugin: plugin started: path=.terraform/plugins/registry.terraform.io/hashicorp/kubernetes/1.13.3/linux_amd64/terraform-provider-kubernetes_v1.13.3_x4 pid=445
12:15:59  2020-11-17T20:15:56.974Z [DEBUG] plugin: waiting for RPC address: path=.terraform/plugins/registry.terraform.io/hashicorp/kubernetes/1.13.3/linux_amd64/terraform-provider-kubernetes_v1.13.3_x4
12:15:59  2020-11-17T20:15:57.491Z [INFO]  plugin.terraform-provider-kubernetes_v1.13.3_x4: configuring server automatic mTLS: timestamp=2020-11-17T20:15:57.488Z
12:15:59  2020-11-17T20:15:57.685Z [DEBUG] plugin.terraform-provider-kubernetes_v1.13.3_x4: plugin address: address=/tmp/plugin482242119 network=unix timestamp=2020-11-17T20:15:57.683Z
12:15:59  2020-11-17T20:15:57.686Z [DEBUG] plugin: using plugin: version=5
12:15:59  2020/11/17 20:15:58 [TRACE] GRPCProvider: GetSchema
12:15:59  2020-11-17T20:15:58.176Z [TRACE] plugin.stdio: waiting for stdio data
12:16:01  2020/11/17 20:16:01 [TRACE] No provider meta schema returned
12:23:09  Aborted by Morgan Howard

Here’s the Docker container image we build to do the deploys.

# ##########################
# This is the Terraform Jenkins Deployer container image.
# It is used to deploy Jenkins instances using Terragrunt and Terraform
# Author: Dave Sargent
# Date: 06/04/2020
# ##########################

FROM docker.sel.inc/ubuntu:focal-20201008

LABEL maintainer="Dave Sargent <dave_sargent@selinc.com>"

# Needed to install packages, get remote repos and for Terraform to install providers during init
ENV http_proxy "http://10.105.116.2:8080/"
ENV https_proxy "http://10.105.116.2:8080/"
ENV HTTP_PROXY "http://10.105.116.2:8080/"
ENV HTTPS_PROXY "http://10.105.116.2:8080/"
ENV no_proxy "localhost,localaddress,rancher2.ad.selinc.com,ad.selinc.com,sel.inc,127.0.0.1,127.0.0.0,0.0.0.0,10.43.0.1,127.0.0.0/8,10.0.0.0/8,10.*.*.*,172.16.0.0/12,192.168.0.0/16,192.168.0.*"
ENV NO_PROXY "localhost,localaddress,rancher2.ad.selinc.com,ad.selinc.com,sel.inc,127.0.0.1,127.0.0.0,0.0.0.0,10.43.0.1,127.0.0.0/8,10.0.0.0/8,10.*.*.*,172.16.0.0/12,192.168.0.0/16,192.168.0.*"
 
# Install all of the tools we will need such as wget, git, unzip, etc
RUN apt-get update && apt-get full-upgrade -y && apt-get install -y --no-install-recommends \
    wget unzip git ca-certificates \
    && apt-get -qq clean \
    && rm -rf /var/lib/apt/lists/*

# Create and set the current directory to /terraform/
WORKDIR /terraform/

# Set up credential caching so when Terraform uses git to grab the remote data from BitBucket it uses the cached key.
# This assumes we will be cloning a repo using the Personal Access Token before running TG
# A single setting is enough: the cache helper with a one-hour timeout.
RUN git config --global credential.helper 'cache --timeout=3600'

# Download Terraform and Terragrunt, unzip them and set them to executable.
# Terraform URL can be found here: https://www.terraform.io/downloads.html
ADD https://releases.hashicorp.com/terraform/0.13.5/terraform_0.13.5_linux_amd64.zip terraform.zip
# Terragrunt releases can be found here: https://github.com/gruntwork-io/terragrunt/releases
ADD https://github.com/gruntwork-io/terragrunt/releases/download/v0.26.2/terragrunt_linux_amd64 terragrunt
RUN unzip terraform.zip && \
    rm -r terraform.zip && \
    chmod +x terraform && \
    chmod +x terragrunt

# Add /terraform to the path so terragrunt and terraform are found.
ENV PATH="/terraform:$PATH"

And the Pod we spin up.

// Get the repository name from the Job Name.
// This is assuming convention of using the repository name as the name of the job in Jenkins.
// A job name will look like: "Jenkins Deploys/jenkins-demo/master"
String REPO_NAME = ""
String job=env.JOB_NAME
job=job.substring(job.indexOf('/') + 1)
REPO_NAME=job.substring(0, job.indexOf('/'))
String label = "deploy-${REPO_NAME}-${UUID.randomUUID().toString()}"

// Typically 'master' but can be any branch
// If is 'master' we do an apply, if not we do a plan
String BRANCH_NAME = env.BRANCH_NAME

// This pod template includes resource requests tailored for this build job.
// I determined needed resources by inserting a long sleep(120000) after the last line in the last stage 
// and then viewing the build pods resource usage using Rancher.  
// Note: It takes a few minutes for the resource graphs to be updated.
podTemplate(label: label, yaml: """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: terraform
    # Note: Though we do not usually recommend using the latest tag, in this case it is very useful as
    # we have a copy of this same file in every instance repository and it is not efficient to have to update
    # the Jenkinsfile every time I rev the terraform-jenkins-deployer image.
    # Since I am using latest, I must also use imagePullPolicy of Always. 
    image: devops-docker-dev.artifactory.metro.ad.selinc.com/terraform-jenkins-deployer:latest
    # Because we are using the "latest" tag, I must set imagePullPolicy to Always so we are assured that we have 
    # the latest image on the node that is doing the deployment.
    imagePullPolicy: Always
    command:
    - /bin/cat
    tty: true
    resources:
      requests:
        memory: "10Mi"
        cpu: "420m"
      limits:
        memory: "100Mi"
        cpu: "520m"
  """) {
    node(label) {
        stage("Clone terraform-jenkins-configuration") {
            container(name: "terraform", shell: "/bin/bash") {
                withCredentials([usernamePassword(credentialsId: 'deploy-creds', usernameVariable: 'USERNAME', passwordVariable: 'PASSWORD')]) {
                    sh """#!/bin/bash
                    cd /terraform/
                    git clone https://${USERNAME}:${PASSWORD}@bitbucket.metro.ad.selinc.com/scm/jenkinsconfig/terraform-jenkins-configuration.git
                    cp terraform-jenkins-configuration/examples/common.tfvars common.tfvars
                    cp terraform-jenkins-configuration/examples/terragrunt.hcl terragrunt.hcl
                    """
                }
            }
        }

        stage("Clone ${REPO_NAME}") {
            container(name: "terraform", shell: "/bin/bash") {
                withCredentials([usernamePassword(credentialsId: 'deploy-creds', usernameVariable: 'USERNAME', passwordVariable: 'PASSWORD')]) {
                    print "${REPO_NAME} branch is " + BRANCH_NAME
                    sh """#!/bin/bash
                    cd /terraform/
                    git clone -b ${BRANCH_NAME} https://${USERNAME}:${PASSWORD}@bitbucket.metro.ad.selinc.com/scm/jenkinsconfig/${REPO_NAME}.git
                    """
                }
            }
        }

        if (BRANCH_NAME == 'master') {
            stage("Deploy ${REPO_NAME}") {
                container(name: "terraform", shell: "/bin/bash") {
                    sh """#!/bin/bash
                    cd /terraform/${REPO_NAME}/
                    terragrunt init
                    terragrunt apply -auto-approve
                    """
                }
            }
        }

        if (BRANCH_NAME != 'master') {
            stage("Plan ${REPO_NAME}") {
                container(name: "terraform", shell: "/bin/bash") {
                    sh """#!/bin/bash
                    cd /terraform/${REPO_NAME}/
                    terragrunt init
                    terragrunt plan
                    """
                }
            }
        }
    }
}

Any pointers would be greatly appreciated!

Fixed.
I really thought we had already taken a look at the resource requests and limits, but it turns out that was for a different project entirely. Yeah, it was getting OOM killed and restarted. Terraform switching into verbose log output is spooky, but beyond that everything is good.
To be clear, we just needed to raise the memory limit on the terraform container in the pod template, which was set to 100Mi. Looks like newer versions of Terraform or Terragrunt need more RAM, which is not a big deal as the limits were so low to begin with.
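
For anyone who hits the same symptoms, one way to confirm an OOM kill from inside the container is to read the counters the kernel keeps for the container's memory cgroup. The following is only a sketch under assumptions: `report_cgroup_memory` is a hypothetical helper name, the cgroup filesystem is assumed to be mounted at /sys/fs/cgroup, and older kernels may not expose an `oom_kill` counter at all (the value is then printed empty).

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original pipeline):
# print the memory limit and OOM-kill count recorded for this container's cgroup.
# The optional first argument is the cgroup mount point.
report_cgroup_memory() {
    local root="${1:-/sys/fs/cgroup}"
    local limit_file events_file

    if [ -f "$root/memory.max" ]; then
        # cgroup v2 layout
        limit_file="$root/memory.max"
        events_file="$root/memory.events"
    elif [ -f "$root/memory/memory.limit_in_bytes" ]; then
        # cgroup v1 layout
        limit_file="$root/memory/memory.limit_in_bytes"
        events_file="$root/memory/memory.oom_control"
    else
        echo "no memory cgroup files found under $root" >&2
        return 1
    fi

    echo "limit=$(cat "$limit_file")"
    # Both layouts report kills on a line of the form "oom_kill <count>".
    echo "oom_kill=$(awk '$1 == "oom_kill" { print $2 }' "$events_file" 2>/dev/null)"
}
```

Running `report_cgroup_memory` in the terraform container while a deploy is stuck (or right after one fails) prints the limit in bytes and the kill count; a non-zero `oom_kill` with a limit of 104857600 (100Mi) would match what we saw.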