### Terraform Version

```
Terraform v1.3.7
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v4.54.0
+ provider registry.terraform.io/hashicorp/local v2.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/tls v4.0.4
```
### Terraform Configuration Files

```terraform
variable "key_name" {
  description = "SSH Key Name For Authentication"
  type        = string
  default     = "ubuntu"
}

resource "tls_private_key" "ubuntu" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "generated_key" {
  key_name   = var.key_name
  public_key = tls_private_key.ubuntu.public_key_openssh
}

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}

resource "aws_instance" "ubuntu" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type
  key_name      = aws_key_pair.generated_key.key_name

  network_interface {
    network_interface_id = aws_network_interface.ubuntu.id
    device_index         = 0
  }

  metadata_options {
    http_endpoint = "disabled"
  }

  connection {
    user        = "ubuntu"
    type        = "ssh"
    host        = self.public_ip
    private_key = tls_private_key.ubuntu.private_key_pem
    timeout     = "1m"
  }

  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update -y"
    ]
  }

  depends_on = [
    aws_key_pair.generated_key
  ]
}
```
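For completeness: the configuration above references `var.instance_type` and `aws_network_interface.ubuntu` without showing their declarations. A minimal sketch of what those might look like (the default instance type and the subnet reference are assumptions, not part of the original configuration):

```terraform
# Hypothetical declarations for the references above; values are placeholders.
variable "instance_type" {
  description = "EC2 instance type for the bastion host"
  type        = string
  default     = "t3.micro"
}

resource "aws_network_interface" "ubuntu" {
  # Assumes a subnet defined elsewhere in the omitted networking configuration.
  subnet_id = aws_subnet.public.id
}
```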
### Debug Output
### Expected Behavior

`terraform apply` should complete successfully, and the `remote-exec` provisioner should connect over SSH and run the inline commands.
### Actual Behavior

`terraform apply` fails with the following SSH error when the provisioner tries to connect:

```
╷
│ Error: file provisioner error
│
│ with aws_instance.web-template,
│ on ec2.tf line 51, in resource "aws_instance" "web-template":
│ 51: provisioner "file" {
│
│ timeout - last error: SSH authentication failed (ubuntu@35.175.205.156:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported
│ methods remain
╵
```
However, if I remove the following block, everything works smoothly: `terraform apply` succeeds and `remote-exec` connects and executes the script.

```terraform
metadata_options {
  http_endpoint = "disabled"
}
```
Additionally, connecting to the instance manually with the generated SSH key fails with the same error.
### Steps to Reproduce

1. `terraform init`
2. `terraform apply`
### Additional Context

I need to build a bastion host with IMDS disabled by default as a security requirement, and hence I need the following metadata configuration in the `aws_instance` resource:

```terraform
metadata_options {
  http_endpoint = "disabled"
}
```
What I fail to understand is why, or rather how, this setting interferes with SSH communication at all. Why does `remote-exec` need to contact the IMDS service when all it really needs is the SSH private key, which is being provided?
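For reference, a minimal sketch of the `metadata_options` arguments involved, shown here with IMDSv2 required rather than the endpoint disabled outright. These are standard `aws_instance` arguments, but treating this as a workaround is only an assumption, not a confirmed fix, and it may not satisfy the "IMDS disabled" requirement above:

```terraform
# Sketch only: keep the metadata endpoint reachable but require IMDSv2 tokens.
metadata_options {
  http_endpoint               = "enabled"
  http_tokens                 = "required" # IMDSv2 only
  http_put_response_hop_limit = 1
}
```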
### References

Other similar issues I looked at prior to filing this report:

**Similar issue 1** (opened 05:46PM - 29 May 22 UTC, closed 08:04AM - 31 May 22 UTC; labels: bug, new):
### Terraform Version
```
Terraform v1.1.2
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v4.16.0
```
### Terraform Configuration Files
```terraform
resource "aws_instance" "swarm-manager" {
count = "${var.swarm_managers}"
ami = "${var.swarm_ami_id}"
instance_type = "${var.swarm_instance_type}"
tags = {
Name = "swarm-manager"
}
vpc_security_group_ids = [
"${aws_security_group.docker.id}"
]
key_name = "devops21"
connection {
type = "ssh"
user = "ubuntu"
private_key = file(var.privatekeypath)
host = self.public_ip
timeout = "2m"
}
provisioner "remote-exec" {
inline = [
"if ${var.swarm_init}; then docker swarm init --advertise-addr ${self.private_ip}; fi",
"if ! ${var.swarm_init}; then docker swarm join --token ${var.swarm_manager_token} --advertise-addr ${self.private_ip} ${var.swarm_manager_ip}:2377; fi"
]
}
}
```
### Debug Output
The IP differs because I collected the debug output from a subsequent apply attempt.
```
[INFO] using private key for authentication
[DEBUG] Connecting to 3.69.156.72:22 for SSH
[ERROR] connection error: dial tcp 3.69.156.72:22: connect: connection refused
[WARN] retryable error: dial tcp 3.69.156.72:22: connect: connection refused
[DEBUG] Connecting to 3.69.156.72:22 for SSH
[DEBUG] Connection established. Handshaking for user ubuntu
[WARN] SSH authentication failed (ubuntu@3.69.156.72:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
[WARN] retryable error: SSH authentication failed (ubuntu@3.69.156.72:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
[WARN] Errors while provisioning aws_instance.swarm-manager[0] with "remote-exec", so aborting
[ERROR] vertex "aws_instance.swarm-manager[0]" error: remote-exec provisioner error
[DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = transport is closing"
[DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/hashicorp/aws/4.16.0/linux_amd64/terraform-provider-aws_v4.16.0_x5 pid=486858
2022-05-30T09:49:52.758+0300 [DEBUG] provider: plugin exited
```
### Expected Behavior
An EC2 instance with Docker Swarm initialized.
### Actual Behavior
While applying:
```
aws_instance.swarm-manager[0] (remote-exec): Connecting to remote host via SSH...
aws_instance.swarm-manager[0] (remote-exec): Host: 18.184.14.73
aws_instance.swarm-manager[0] (remote-exec): User: ubuntu
aws_instance.swarm-manager[0] (remote-exec): Password: false
aws_instance.swarm-manager[0] (remote-exec): Private key: true
aws_instance.swarm-manager[0] (remote-exec): Certificate: false
aws_instance.swarm-manager[0] (remote-exec): SSH Agent: true
aws_instance.swarm-manager[0] (remote-exec): Checking Host Key: false
aws_instance.swarm-manager[0] (remote-exec): Target Platform: unix
```
At the end:
```
Error: remote-exec provisioner error
with aws_instance.swarm-manager[0],
on swarm.tf line 21, in resource "aws_instance" "swarm-manager":
21: provisioner "remote-exec" {
timeout - last error: SSH authentication failed (ubuntu@18.184.14.73:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
```
But I can successfully open an SSH session to the instance:
```
$ ssh -i ~/.ssh/devops21.pem ubuntu@18.184.14.73
The authenticity of host '18.184.14.73 (18.184.14.73)' can't be established.
ED25519 key fingerprint is SHA256:+ECy1hKKPM9neykXBHCkHZ5p5ceqohVxc1h8BvW2GLw.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '18.184.14.73' (ED25519) to the list of known hosts.
Welcome to Ubuntu 22.04 LTS (GNU/Linux 5.15.0-1008-aws x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Sun May 29 17:42:16 UTC 2022
System load: 0.080078125 Processes: 104
Usage of /: 27.7% of 7.58GB Users logged in: 0
Memory usage: 28% IPv4 address for docker0: 172.17.0.1
Swap usage: 0% IPv4 address for eth0: 172.31.44.233
0 updates can be applied immediately.
ubuntu@ip-172-31-44-233:~$
```
The `key_name` is correct:
```
$ aws ec2 describe-key-pairs --key-names devops21
{
    "KeyPairs": [
        {
            "KeyPairId": "key-0c9ae318a43621a13",
            "KeyFingerprint": "e7:31:85:fb:ba:e6:09:dc:56:30:31:52:9f:7b:25:77:68:2f:7f:cf",
            "KeyName": "devops21",
            "KeyType": "rsa",
            "Tags": [],
            "CreateTime": "2022-05-29T15:57:30+00:00"
        }
    ]
}
```
### Steps to Reproduce
1. `terraform init`
2. `terraform apply -target aws_instance.swarm-manager -var swarm_init=true -var swarm_managers=1`
### Additional Context
### References
**Similar issue 2** (opened 04:28PM - 13 Feb 21 UTC, closed 03:33PM - 13 Dec 21 UTC; labels: bug, upstream, provisioner/file, provisioner/remote-exec, v0.14, v1.0):
Hello,
With the most recent version of CoreOS, 33.20210117.3.2, I cannot use the Terraform SSH connection provisioner anymore. It used to work in prior versions.
The problem is that an SSH connection cannot be established by means of the Terraform connector, but it works with a plain `ssh core@<host>`. Here are two log entries from the host's syslog, acquired with journalctl while trying to log in with either method:
When trying with the Terraform connector (failure):
```
Feb 13 14:27:10 k1.local.vlan audit[18899]: USER_LOGIN pid=18899 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login acct="core" exe="/usr/sbin/sshd" hostname=? addr=192.168.56.1 terminal=ssh res=failed'
```
When trying with “ssh core@…” (success):
```
Feb 13 14:29:00 k1.local.vlan audit[20006]: USER_LOGIN pid=20006 uid=0 auid=1000 ses=12 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=1000 exe="/usr/sbin/sshd" hostname=? addr=192.168.56.1 terminal=/dev/pts/1 res=success'
```
The user is the same in both cases (core) and authentication is performed via an RSA public key from ssh-agent. Here is a simple Terraform module to reproduce the case:
```
resource "null_resource" "copy" {
connection {
type = "ssh"
host = "<some coreos host>"
user = "core"
timeout = "10m"
agent = true
}
provisioner "file" {
content = "blabla"
destination = "ttt"
}
}
```
When running `terraform init` and `terraform apply`, the apply hangs and eventually dies. It works for any other host that is not CoreOS.
When setting TF_LOG to TRACE, there is a related message:
```
SSH authentication failed (core@k1.local.vlan:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
```
Some more context is below under "debug output".
There is some more info in the host's syslog when I try to connect via the Terraform connector:
```
userauth_pubkey: key type ssh-rsa not in PubkeyAcceptedKeyTypes [preauth]
USER_ERR pid=4191 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:bad_ident grantors=? acct="?" exe="/usr/sbin/sshd" hostname=192.168.56.1 addr=192.168.56.1 terminal=ssh res=failed'
```
But I suspect that this is not fully relevant, since the plain ssh command works with the same key on that host, while the Terraform connector doesn't. I suspect it is related to the specific way Terraform invokes SSH, in combination with the particular sshd configuration on this Fedora CoreOS version.
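Given the `PubkeyAcceptedKeyTypes` rejection of `ssh-rsa` above, a minimal sketch of one untested variation is to point the connection at a non-RSA key file instead of relying on the agent's RSA key; the ED25519 key path below is a hypothetical placeholder, not something from the report:

```terraform
resource "null_resource" "copy" {
  connection {
    type        = "ssh"
    host        = "<some coreos host>"
    user        = "core"
    timeout     = "10m"
    agent       = false
    private_key = file("~/.ssh/id_ed25519") # hypothetical ED25519 key path
  }

  provisioner "file" {
    content     = "blabla"
    destination = "ttt"
  }
}
```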
### Terraform Version
```
0.14
```
### Debug Output (relevant parts only)
```
2021-02-13T16:44:20.301+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:20 [INFO] sleeping for 1s
2021-02-13T16:44:21.305+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:21 [DEBUG] Connecting to k1.local.vlan:22 for SSH
2021-02-13T16:44:21.305+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:21 [DEBUG] Connection established. Handshaking for user core
2021-02-13T16:44:21.346+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:21 [WARN] SSH authentication failed (core@k1.local.vlan:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2021-02-13T16:44:21.346+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:21 [WARN] retryable error: SSH authentication failed (core@k1.local.vlan:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2021-02-13T16:44:21.346+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:21 [INFO] sleeping for 2s
2021-02-13T16:44:23.350+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:23 [DEBUG] Connecting to k1.local.vlan:22 for SSH
2021-02-13T16:44:23.351+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:23 [DEBUG] Connection established. Handshaking for user core
2021-02-13T16:44:23.394+0100 [DEBUG] plugin.terraform: file-provisioner (internal) 2021/02/13 16:44:23 [WARN] SSH authentication failed (core@k1.local.vlan:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
```
### Expected Behavior
It should be possible to connect to CoreOS hosts with the Terraform SSH connection and the file provisioner (or any other provisioner).
### Actual Behavior
An SSH connection is not possible via the Terraform connector, while it is possible with a plain ssh command.
### Steps to Reproduce
Copy the code above into a `main.tf`, set up a simple CoreOS VM somewhere, then run:
1. `terraform init`
2. `terraform apply`
### References
See also my [related forum post at Fedora Discussions](https://discussion.fedoraproject.org/t/cannot-login-per-ssh-via-terraform-connection/26976).