Hello, I just upgraded from Terraform 1.3.3 to 1.3.9 after getting tired of seeing it crash on several occasions. The upgrade fixed the crash, but now I'm seeing another problem that I don't believe occurred before.
A null_resource is used to run a number of local and remote provisioners (we need this for our fairly complex cluster/storage stack configuration), and I recently added collection of the script's output/exit-status files.
The null_resource is reporting a failure even though all of its provisioners have completed successfully. Debug logging revealed this error:
provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
The problem is interesting: it only seems to happen when remote-exec has failed once before and terraform apply is repeated after the error in the script has been fixed. A failure in null_resource prevents the subsequent resources from running. Until recently null_resource was the last item in the chain, and the script output handling was only added recently, so the problem may have existed in earlier versions and simply gone unnoticed. A third apply does not even re-run null_resource and proceeds to the end.
Relevant code:
resource "null_resource" "cluster_config" {
depends_on = [
local_file.scale_config_params
]
# Changes to any instance of the cluster requires re-provisioning
triggers = {
...
}
...
provisioner "remote-exec" {
inline = [
"set -o errexit",
"chmod +x /home/${var.admin.sshUserName}/scale-config-launchpad.sh",
"/bin/sudo /home/${var.admin.sshUserName}/scale-config-launchpad.sh ${local.masterNodeFEIP} ${var.clusterDeployMode} ${local.clusterExpCounter} /home/${var.admin.sshUserName}/${local.scaleParmsFile} >/home/${var.admin.sshUserName}/stdout",
]
on_failure = fail # OR continue
}
# Return code file 'exitstatus' on master is created by scale-config-launchpad.sh; 'stdout' was generated in 'remote_exec'
# Gather both those file locally for further processing of the output.
provisioner "local-exec" {
command = <<EOC
scp -i ${var.admin.sshPrivatekeyPath} -o LogLevel=error -o StrictHostKeyChecking=no -o "UserKnownHostsFile=/dev/null" ${self.triggers.user}@${self.triggers.host}:/home/${self.triggers.user}/stdout ${local.localTempdir}/;
scp -i ${var.admin.sshPrivatekeyPath} -o LogLevel=error -o StrictHostKeyChecking=no -o "UserKnownHostsFile=/dev/null" ${self.triggers.user}@${self.triggers.host}:/home/${self.triggers.user}/exitstatus ${local.localTempdir}/;
ls -l ${local.localTempdir}/ ;
EOC
interpreter = ["/bin/bash", "-c"]
}
}
resource "random_uuid" "complete" {
depends_on = [
null_resource.cluster_config
]
}
#--- from data-sources.tf ---
data "local_file" "scaleConfigStatus" {
  filename = "${local.localTempdir}/exitstatus"
  depends_on = [
    resource.random_uuid.complete
  ]
  lifecycle {
    precondition {
      condition     = random_uuid.complete.result != null
      error_message = "Scale config did not finish successfully"
    }
  }
}

data "local_file" "scaleConfigOutput" {
  filename = "${local.localTempdir}/stdout"
  depends_on = [
    resource.random_uuid.complete
  ]
  lifecycle {
    precondition {
      condition     = random_uuid.complete.result != null
      error_message = "Scale config did not finish successfully"
    }
  }
}
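For completeness, here is a minimal sketch of an alternative wiring of the status data source; this is only an assumption on my part (the direct dependency on the null_resource and the fileexists() guard), not the configuration that produced the logs below:

# Sketch only: depend on null_resource.cluster_config directly instead of
# going through random_uuid.complete, and guard the read with fileexists().
data "local_file" "scaleConfigStatus" {
  filename = "${local.localTempdir}/exitstatus"
  depends_on = [
    null_resource.cluster_config
  ]
  lifecycle {
    precondition {
      # Assumed guard: fail with a clearer message if the local-exec scp
      # has not copied the file back from the master node.
      condition     = fileexists("${local.localTempdir}/exitstatus")
      error_message = "exitstatus was not copied back from the master node"
    }
  }
}

Regardless of the wiring, my expectation is that the existing depends_on chain (null_resource -> random_uuid -> data.local_file) should already force the data sources to read only after the local-exec has copied the files.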
On the first run, remote-exec fails:
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): Preparing source code to upload...
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): Packing up scale backend config dirs from '/Users/neetagarimella/syscale' using the following tar command.
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): tar czf /Users/neetagarimella/syscale/TF/tmp/scale-config-load-VERSION_STRING.tar.gz tools lib
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'file'...
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'file'...
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'file'...
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'remote-exec'...
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Connecting to remote host via SSH...
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Host: 34.145.97.171
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): User: sycompacct
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Password: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Private key: true
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Certificate: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): SSH Agent: true
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Checking Host Key: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Target Platform: unix
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Connected!
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Error: The actual scale nodecount of '4' in the cluster is different from the passed current scale node count '3'. Please review parameters.
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): #LINE#Failed to retrieve or validate parameters from '/home/sycompacct/scaleParams_E1.json'. Execution time: '2'seconds, exit code: '2'. #LINE#
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): For installation log, please refer to log file 'sycomp-scale-UPDATE_20230307.232913.log'.
╷
│ Error: remote-exec provisioner error
│
│ with module.vmscale-gcp[0].null_resource.cluster_config,
│ on ../modules/sycomp-vmscale-gcp/main.tf line 888, in resource "null_resource" "cluster_config":
│ 888: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_375047149.sh": Process exited with status 2
Second run (after fixing the error in the remote script): remote-exec completes, but the local_file data sources fail. The "missing" files are very much there, as the debug output further below shows.
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): Packing up scale backend config dirs from '/Users/neetagarimella/syscale' using the following tar command.
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): tar czf /Users/neetagarimella/syscale/TF/tmp/scale-config-load-VERSION_STRING.tar.gz tools lib
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'file'...
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'file'...
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'file'...
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'remote-exec'...
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Connecting to remote host via SSH...
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Host: 34.145.97.171
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): User: sycompacct
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Password: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Private key: true
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Certificate: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): SSH Agent: true
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Checking Host Key: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Target Platform: unix
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Connected!
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Still reading... [10s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Still reading... [10s elapsed]
module.vmscale-gcp[0].null_resource.cluster_config: Still creating... [10s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Still reading... [20s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Still reading... [20s elapsed]
module.vmscale-gcp[0].null_resource.cluster_config: Still creating... [20s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Still reading... [30s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Still reading... [30s elapsed]
module.vmscale-gcp[0].null_resource.cluster_config: Still creating... [30s elapsed]
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'local-exec'...
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): Executing: ["/bin/bash" "-c" " scp -i ~/sycompacct-key -o LogLevel=error -o StrictHostKeyChecking=no -o \"UserKnownHostsFile=/dev/null\" sycompacct@34.145.97.171:/home/sycompacct/stdout /Users/neetagarimella/syscale/TF/tmp/;\n scp -i ~/sycompacct-key -o LogLevel=error -o StrictHostKeyChecking=no -o \"UserKnownHostsFile=/dev/null\" sycompacct@34.145.97.171:/home/sycompacct/exitstatus /Users/neetagarimella/syscale/TF/tmp/;\n sync; sleep 1;\n"]
module.vmscale-gcp[0].null_resource.cluster_config: Creation complete after 37s [id=2508484827329135134]
╷
│ Error: Read local file data source error
│
│ with module.vmscale-gcp[0].data.local_file.scaleConfigStatus,
│ on ../modules/sycomp-vmscale-gcp/data-sources.tf line 51, in data "local_file" "scaleConfigStatus":
│ 51: data "local_file" "scaleConfigStatus" {
│
│ The file at given path cannot be read.
│
│ +Original Error: open /Users/neetagarimella/syscale/TF/tmp/exitstatus: no such file or directory
╷
│ Error: Read local file data source error
│
│ with module.vmscale-gcp[0].data.local_file.scaleConfigOutput,
│ on ../modules/sycomp-vmscale-gcp/data-sources.tf line 66, in data "local_file" "scaleConfigOutput":
│ 66: data "local_file" "scaleConfigOutput" {
│
│ The file at given path cannot be read.
│
│ +Original Error: open /Users/neetagarimella/syscale/TF/tmp/stdout: no such file or directory
╵
Third run: null_resource is not re-run and the apply completes.
module.vmscale-gcp[0].data.google_compute_instance.masterNodeData: Read complete after 0s [id=projects/sycomp-storage/zones/us-west1-b/instances/ng2mgmt01]
module.vmscale-gcp[0].google_compute_instance.scale[0]: Refreshing state... [id=projects/sycomp-storage/zones/us-west1-b/instances/ng2scale04]
module.vmscale-gcp[0].null_resource.cluster_config: Refreshing state... [id=2508484827329135134]
module.vmscale-gcp[0].random_uuid.complete: Refreshing state... [id=9584f45b-08d7-61b4-dae2-1c1ea2b8d131]
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Reading...
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Reading...
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Read complete after 0s [id=0dda603dc11e661e305e7d96e8744b35bc5b4d65]
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Read complete after 0s [id=09d2af8dd22201dd8d48e5dcfcaed281ff9422c7]
Changes to Outputs:
~ scaleConfigLogOutput = {
~ configDuration = "31 Seconds" -> "30 Seconds"
~ configLogFile = "sycomp-scale-UPDATE_20230307.231203.log" -> "sycomp-scale-UPDATE_20230307.232556.log"
~ meterUnits = "scaleStorageTBUnits:3 scaleRhelUnits:5 scaleSDMUnits:0 Install-Time:2023-03-07 23:12:33.459" -> "scaleStorageTBUnits:3 scaleRhelUnits:5 scaleSDMUnits:0 Install-Time:2023-03-07 23:26:25.681"
# (2 unchanged elements hidden)
}
You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.
Do you want to perform these actions in workspace "ng2-E1"?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value:
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
The second (failing) run was also captured with debug logging enabled.
Note that an 'ls -l' of the files the data sources fail to read was added to the local-exec command.
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'remote-exec'...
2023-03-07T15:31:53.293-0800 [INFO] using private key for authentication
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Connecting to remote host via SSH...
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Host: 34.145.97.171
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): User: sycompacct
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Password: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Private key: true
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Certificate: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): SSH Agent: true
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Checking Host Key: false
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Target Platform: unix
2023-03-07T15:31:53.294-0800 [DEBUG] Connecting to 34.145.97.171:22 for SSH
2023-03-07T15:31:53.326-0800 [DEBUG] Connection established. Handshaking for user sycompacct
2023-03-07T15:31:53.592-0800 [DEBUG] Telling SSH config to forward to agent
2023-03-07T15:31:53.592-0800 [DEBUG] Setting up a session to request agent forwarding
2023-03-07T15:31:53.683-0800 [INFO] agent forwarding enabled
module.vmscale-gcp[0].null_resource.cluster_config (remote-exec): Connected!
2023-03-07T15:31:53.683-0800 [DEBUG] starting ssh KeepAlives
2023-03-07T15:31:53.683-0800 [DEBUG] opening new ssh session
2023-03-07T15:31:53.718-0800 [DEBUG] Starting remote scp process: 'scp' -vt /tmp
2023-03-07T15:31:53.752-0800 [DEBUG] Started SCP session, beginning transfers...
2023-03-07T15:31:53.752-0800 [DEBUG] Beginning file upload...
2023-03-07T15:31:53.799-0800 [DEBUG] SCP session complete, closing stdin pipe.
2023-03-07T15:31:53.800-0800 [DEBUG] Waiting for SSH session to complete.
2023-03-07T15:31:53.834-0800 [ERROR] scp stderr: "Sink: C0644 219 terraform_1170819954.sh\n"
2023-03-07T15:31:53.834-0800 [DEBUG] opening new ssh session
2023-03-07T15:31:53.900-0800 [DEBUG] starting remote command: chmod 0777 /tmp/terraform_1170819954.sh
2023-03-07T15:31:53.987-0800 [DEBUG] remote command exited with '0': chmod 0777 /tmp/terraform_1170819954.sh
2023-03-07T15:31:53.987-0800 [DEBUG] opening new ssh session
2023-03-07T15:31:54.050-0800 [DEBUG] starting remote command: /tmp/terraform_1170819954.sh
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Still reading... [10s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Still reading... [10s elapsed]
module.vmscale-gcp[0].null_resource.cluster_config: Still creating... [10s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Still reading... [20s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Still reading... [20s elapsed]
module.vmscale-gcp[0].null_resource.cluster_config: Still creating... [20s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigOutput: Still reading... [30s elapsed]
module.vmscale-gcp[0].data.local_file.scaleConfigStatus: Still reading... [30s elapsed]
module.vmscale-gcp[0].null_resource.cluster_config: Still creating... [30s elapsed]
2023-03-07T15:32:25.159-0800 [DEBUG] remote command exited with '0': /tmp/terraform_1170819954.sh
2023-03-07T15:32:25.159-0800 [DEBUG] opening new ssh session
2023-03-07T15:32:25.194-0800 [DEBUG] Starting remote scp process: 'scp' -vt /tmp
2023-03-07T15:32:25.230-0800 [DEBUG] Started SCP session, beginning transfers...
2023-03-07T15:32:25.230-0800 [DEBUG] Copying input data into temporary file so we can read the length
2023-03-07T15:32:25.251-0800 [DEBUG] Beginning file upload...
2023-03-07T15:32:25.293-0800 [DEBUG] SCP session complete, closing stdin pipe.
2023-03-07T15:32:25.293-0800 [DEBUG] Waiting for SSH session to complete.
2023-03-07T15:32:25.329-0800 [ERROR] scp stderr: "Sink: C0644 0 terraform_1170819954.sh\n"
module.vmscale-gcp[0].null_resource.cluster_config: Provisioning with 'local-exec'...
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): Executing: ["/bin/bash" "-c" " scp -i ~/sycompacct-key -o LogLevel=error -o StrictHostKeyChecking=no -o \"UserKnownHostsFile=/dev/null\" sycompacct@34.145.97.171:/home/sycompacct/stdout /Users/neetagarimella/syscale/TF/tmp/;\n scp -i ~/sycompacct-key -o LogLevel=error -o StrictHostKeyChecking=no -o \"UserKnownHostsFile=/dev/null\" sycompacct@34.145.97.171:/home/sycompacct/exitstatus /Users/neetagarimella/syscale/TF/tmp/;\n ls -l /Users/neetagarimella/syscale/TF/tmp/ ;\n"]
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): total 192
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): -rw-r--r-- 1 neetagarimella staff 2 Mar 7 15:32 exitstatus
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): -rw-r--r-- 1 neetagarimella staff 81222 Mar 7 15:31 scale-config-load-VERSION_STRING.tar.gz
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): -rwxr-xr-x 1 neetagarimella staff 859 Mar 7 10:14 scaleParams.json
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): -rwxr-xr-x 1 neetagarimella staff 1404 Mar 7 14:15 scaleParams_E1.json
module.vmscale-gcp[0].null_resource.cluster_config (local-exec): -rw-r--r-- 1 neetagarimella staff 717 Mar 7 15:32 stdout
module.vmscale-gcp[0].null_resource.cluster_config: Creation complete after 34s [id=4225292809342519822]
2023-03-07T15:32:26.472-0800 [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
2023-03-07T15:32:26.473-0800 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/hashicorp/null/3.2.1/darwin_amd64/terraform-provider-null_v3.2.1_x5 pid=64028
2023-03-07T15:32:26.473-0800 [DEBUG] provider: plugin exited
2023-03-07T15:32:26.473-0800 [DEBUG] states/remote: state read serial is: 23; serial is: 23
2023-03-07T15:32:26.473-0800 [DEBUG] states/remote: state read lineage is: 69fdf268-7ba3-9617-3a34-c14920b16402; lineage is: 69fdf268-7ba3-9617-3a34-c14920b16402
╷
│ Error: Read local file data source error
│
│ with module.vmscale-gcp[0].data.local_file.scaleConfigStatus,
│ on ../modules/sycomp-vmscale-gcp/data-sources.tf line 51, in data "local_file" "scaleConfigStatus":
│ 51: data "local_file" "scaleConfigStatus" {
│
│ The file at given path cannot be read.
│
│ +Original Error: open /Users/neetagarimella/syscale/TF/tmp/exitstatus: no such file or directory
╵
╷
│ Error: Read local file data source error
│
│ with module.vmscale-gcp[0].data.local_file.scaleConfigOutput,
│ on ../modules/sycomp-vmscale-gcp/data-sources.tf line 66, in data "local_file" "scaleConfigOutput":
│ 66: data "local_file" "scaleConfigOutput" {
│
│ The file at given path cannot be read.
│
│ +Original Error: open /Users/neetagarimella/syscale/TF/tmp/stdout: no such file or directory
╵
2023-03-07T15:32:27.241-0800 [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
2023-03-07T15:32:27.242-0800 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/hashicorp/local/2.3.0/darwin_amd64/terraform-provider-local_v2.3.0_x5 pid=64025
2023-03-07T15:32:27.242-0800 [DEBUG] provider: plugin exited
Please review the deployment results above and save them for future reference.
The file terraform_1170819954.sh does exist on the remote node, though it is 0 bytes:
[root@ng2mgmt01 tools]# ls -l /tmp/terraform_1170819954.sh
-rwxrwxrwx 1 sycompacct sycompacct 0 Mar 7 23:32 /tmp/terraform_1170819954.sh
[root@ng2mgmt01 tools]#