Cannot create backup after migration from Replicated to Docker

I recently migrated a Terraform Enterprise (TFE) installation from the Replicated deployment option to Docker.
Since the migration, I cannot get backups to work. My setup is as follows:

  • GCP
  • Managed PostgreSQL 14
  • TFE v202503-1 (this is the latest Replicated version). I have also tried upgrading to v202507-1, but the backup still fails.
  • Docker Engine 24.0.2
  • Google Cloud Storage for object storage
  • Single Compute Engine instance behind an external Application Load Balancer

My docker-compose.yml is as follows:

---

services:
    tfe:
        image: 'images.releases.hashicorp.com/hashicorp/terraform-enterprise:v202503-1'
        environment:
            TFE_DISK_CACHE_VOLUME_NAME: '${COMPOSE_PROJECT_NAME}_terraform-enterprise-cache'
            TFE_ENCRYPTION_PASSWORD: '$TFE_ENCRYPTION_PASSWORD'
            TFE_HOSTNAME: $TFE_HOSTNAME
            TFE_HTTP_PORT: 8080
            TFE_HTTPS_PORT: 8443
            TFE_CAPACITY_CONCURRENCY: 5
            TFE_CAPACITY_MEMORY: 4096
            TFE_DATABASE_USER: postgres
            TFE_DATABASE_PASSWORD: '$TFE_DATABASE_PASSWORD'
            TFE_DATABASE_HOST: $TFE_DATABASE_HOST
            TFE_DATABASE_NAME: $TFE_DATABASE_NAME
            TFE_DATABASE_PARAMETERS: 'sslmode=disable'
            TFE_IACT_TIME_LIMIT: 60
            TFE_FLUENTBIT_BUFFERCHUNKSIZE: 128kb
            TFE_FLUENTBIT_BUFFERMAXSIZE: 128kb
            TFE_METRICS_HTTP_PORT: 9090
            TFE_METRICS_HTTPS_PORT: 9091
            TFE_OBJECT_STORAGE_TYPE: google
            TFE_OBJECT_STORAGE_GOOGLE_BUCKET: '$TFE_OBJECT_STORAGE_GOOGLE_BUCKET'
            TFE_OBJECT_STORAGE_GOOGLE_CREDENTIALS: "$TFE_OBJECT_STORAGE_GOOGLE_CREDENTIALS"
            TFE_OBJECT_STORAGE_GOOGLE_PROJECT: '$TFE_OBJECT_STORAGE_GOOGLE_PROJECT'
            TFE_OPERATIONAL_MODE: external
            TFE_RUN_PIPELINE_DRIVER: docker
            TFE_RUN_PIPELINE_DOCKER_EXTRA_HOSTS: $TFE_HOSTNAME:$TFE_INSTANCE_PRIVATE_IP
            TFE_RUN_PIPELINE_DOCKER_NETWORK: tfe_terraform_isolation
            TFE_TLS_CA_BUNDLE_FILE: /etc/ssl/private/terraform-enterprise/bundle.pem
            TFE_TLS_CERT_FILE: /etc/ssl/private/terraform-enterprise/cert.pem
            TFE_TLS_KEY_FILE: /etc/ssl/private/terraform-enterprise/key.pem
            TFE_TLS_VERSION: 'tls_1_2_tls_1_3'
            TFE_VAULT_ADDRESS: http://127.0.0.1:8200
            TFE_VAULT_CLUSTER_ADDRESS: http://{{ GetPrivateIP }}:8201
            TFE_VAULT_TOKEN_RENEW: 3600
            TFE_LICENSE: '$TFE_LICENSE'
        cap_add:
            - IPC_LOCK
        read_only: true
        tmpfs:
            - /tmp:mode=01777
            - /run
            - /var/log/terraform-enterprise
        ports:
            - 80:8080
            - 443:8443
        volumes:
            - type: bind
              source: /var/run/docker.sock
              target: /run/docker.sock
            - type: bind
              source: /etc/terraform-enterprise/certs
              target: /etc/ssl/private/terraform-enterprise
            - type: volume
              source: terraform-enterprise-cache
              target: /var/cache/tfe-task-worker/terraform
volumes:
    terraform-enterprise-cache: {}

Backup endpoint request:

curl \
  --header "Authorization: Bearer $TOKEN" \
  --request POST \
  --data @payload.json \
  --output backup-$(date +%Y%m%d-%H%M%S).blob \
  https://$MY_TFE_HOSTNAME/_backup/api/v1/backup

where TOKEN is fetched using the command:
docker exec -t terraform-enterprise-tfe-1 /bin/bash -c 'cat /var/run/terraform-enterprise/backup-restore/config.hcl | grep backup_token'
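A side note on the token step: `docker exec -t` allocates a pseudo-terminal, so captured output can end in a carriage return, and `grep` prints the whole line rather than the bare token. Below is a minimal sketch for getting only the value into `TOKEN`. It assumes the line has the usual HCL shape `backup_token = "..."`, which is an assumption because the file contents are not shown here. `extract_backup_token` is a hypothetical helper name.

```shell
# Hypothetical helper: read config text on stdin, print only the token value.
# Assumes a line of the form: backup_token = "VALUE" (not confirmed by the post).
extract_backup_token() {
  tr -d '\r' |
    sed -n 's/^[[:space:]]*backup_token[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' |
    head -n 1
}

# Usage sketch (no -t, so no pseudo-terminal is allocated):
# TOKEN=$(docker exec terraform-enterprise-tfe-1 \
#   cat /var/run/terraform-enterprise/backup-restore/config.hcl | extract_backup_token)
```

If `TOKEN` ends up empty, the line format probably differs from the assumed one and the `sed` pattern needs adjusting.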

I get a blob back, but it contains HTML rather than backup data:

<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>502 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>
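
Because `--output` writes whatever the server returns, a failed request still produces a `.blob` file. The following is a minimal sketch for catching that case, not a definitive check. `looks_like_html` is a hypothetical helper, and `backup.blob` is a placeholder file name.

```shell
# Hypothetical helper: succeed if the file starts like an HTML page
# instead of binary backup data. Only the first 512 bytes are inspected.
looks_like_html() {
  head -c 512 "$1" | tr -d '\r' | LC_ALL=C grep -qi '<html'
}

# Usage sketch: with --fail, curl exits non-zero on HTTP errors such as 502
# and does not save the error body.
# curl --fail --header "Authorization: Bearer $TOKEN" --request POST \
#   --data @payload.json --output backup.blob \
#   "https://$MY_TFE_HOSTNAME/_backup/api/v1/backup" || echo "backup request failed" >&2
# looks_like_html backup.blob && echo "backup.blob is an error page, not a backup" >&2
```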

One curious thing: the automated snapshot that was enabled on the Replicated deployment is still running and creating snapshots as expected.
Any ideas?
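
The 502 page above looks like the generic error page served by Google's load balancers rather than a response from TFE itself. One way to narrow this down is to send the same request directly to the instance, bypassing the load balancer. This is a minimal sketch under stated assumptions: it is run from a machine that can reach the instance's private IP, and the certificate on the instance is valid for the hostname (otherwise `--cacert` would be needed). `backup_direct` and `backup-direct.blob` are hypothetical names.

```shell
# Hypothetical helper: send the backup request straight to the instance.
# $1 = TFE hostname, $2 = instance IP. Expects TOKEN in the environment
# and payload.json in the current directory.
backup_direct() {
  curl --silent --show-error \
    --resolve "$1:443:$2" \
    --header "Authorization: Bearer $TOKEN" \
    --request POST \
    --data @payload.json \
    --output backup-direct.blob \
    --write-out '%{http_code}\n' \
    "https://$1/_backup/api/v1/backup"
}

# Usage sketch: prints the HTTP status code returned by the instance.
# backup_direct "$MY_TFE_HOSTNAME" "$TFE_INSTANCE_PRIVATE_IP"
```

If the direct request succeeds while the request through the load balancer returns 502, the load balancer's backend settings (for example its backend timeout or health check) would be the next place to look.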
