What I want to do
I’m trying to create multiple EC2 instances using Terraform and then add an additional SSH port to all of them, because port 22 traffic might be blocked on some networks. I’m able to add the extra port to /etc/ssh/sshd_config on the instances using the user_data attribute of the Terraform EC2 resource, but getting the instances to actually listen on that port is proving problematic.
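For reference, a minimal sketch of how the port line could be appended idempotently, so that running the script more than once does not duplicate entries. The function name add_ssh_port and the port 2222 are hypothetical placeholders, not part of my actual setup.

```shell
#!/bin/sh
# Hypothetical helper: append "Port <n>" to an sshd config file only if
# an identical, uncommented Port line is not already present.
add_ssh_port() {
    config_file=$1
    port=$2
    # Match an active (not commented-out) Port directive for exactly this port.
    if grep -Eq "^[[:space:]]*Port[[:space:]]+${port}[[:space:]]*$" "$config_file"; then
        return 0
    fi
    printf 'Port %s\n' "$port" >> "$config_file"
}

# Example use inside user_data (runs as root, so no sudo is needed):
#   add_ssh_port /etc/ssh/sshd_config 22
#   add_ssh_port /etc/ssh/sshd_config 2222
```

Note that once any explicit Port line is present, sshd stops assuming port 22 implicitly, which is why the sketch adds both ports just as the original user_data does.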
Can’t apply SSH changes
I need to use Ubuntu instances, and I noticed that they don’t have an sshd.service unit, even though they accept SSH connections and tab completion offers ssh.service as an option. If I connect to an instance through the AWS Console and run sudo systemctl restart sshd.service, it fails with an error message saying that the service isn’t available.
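A possible explanation, stated as an assumption about the AMI rather than something I have confirmed: on Debian and Ubuntu the OpenSSH server unit is named ssh.service, and on recent Ubuntu releases sshd may be socket-activated through ssh.socket, in which case the listening ports belong to the socket unit. The sketch below picks which unit to restart from the output of systemctl list-unit-files. The function names pick_ssh_unit and restart_ssh_listener are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "UNIT STATE ..." lines on stdin (the format
# printed by `systemctl list-unit-files --no-legend`) and print the unit
# that most likely owns the SSH listening ports.
pick_ssh_unit() {
    awk '
        $1 == "ssh.socket" && $2 == "enabled" { socket = 1 }
        $1 == "ssh.service"                   { ssh = 1 }
        $1 == "sshd.service"                  { sshd = 1 }
        END {
            if (socket)    print "ssh.socket"
            else if (ssh)  print "ssh.service"
            else if (sshd) print "sshd.service"
            else           exit 1
        }
    '
}

# Defined but not called here: reload unit files, then restart the chosen
# unit. Intended to run as root, for example at the end of user_data.
restart_ssh_listener() {
    systemctl daemon-reload
    unit=$(systemctl list-unit-files --no-legend 'ssh*' | pick_ssh_unit) || return 1
    systemctl restart "$unit"
}
```

If ssh.socket turns out to be enabled, restarting only ssh.service would not be expected to open a new port, which would be consistent with a reboot being the only thing that worked.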
Previously working method
I found that the port changes get applied when I reboot the instance. So I installed AWS CLI v2 and created a null_resource that runs aws ec2 reboot-instances from the Terraform code. The following code snippet was working and giving the expected outcome until yesterday.
# EC2 INSTANCE MODULE
module "ec2_module" {
  source         = "./Modules/EC2"
  depends_on     = [ aws_key_pair.ec2_ssh_key, module.sg_module, tls_private_key.ansible_ssh_key ]
  for_each       = var.ec2_instances
  instance_name  = "${var.project_prefix}-${each.value.name}"
  instance_type  = each.value.type
  instance_sg    = [ module.sg_module.security_group_id ]
  root_vol_size  = each.value.root_size
  ssh_public_key = aws_key_pair.ec2_ssh_key.key_name
  user_data      = <<-EOF
    #!/usr/bin/env bash
    echo -e "Port 22\nPort ${var.external_access_ports["SSH_Alt"]}" | sudo tee -a /etc/ssh/sshd_config
    echo "${tls_private_key.ansible_ssh_key.public_key_openssh}" | sudo tee -a /home/${var.ec2_username}/.ssh/authorized_keys
  EOF
}
# Reboot EC2 instances once after creation
resource "null_resource" "reboot_ec2_instances" {
  for_each = module.ec2_module
  triggers = {
    instance_id = each.value.instance_id
    # instance_state = each.value.instance_state
  }
  provisioner "local-exec" {
    # aws ec2 wait instance-running --instance-ids ${each.value.instance_id} --region ${var.infra_region}
    command = "aws ec2 reboot-instances --instance-ids ${each.value.instance_id} --region ${var.infra_region}"
  }
  depends_on = [ module.ec2_module ]
}
New/Current problems
The snippet above would successfully create the instances, add the SSH
port, and then reboot each instance, which led to the new port being
applied and active. But for some reason, it stopped working
yesterday. Moreover, the null_resource
seems to make the instances take much longer than normal to become
reachable over SSH, and connecting through the AWS
Console is also unavailable until several minutes have passed, even though
the Status Checks complete earlier.
Other approaches that didn’t work:
- Changing the trigger to wait for instance state.
- Running aws ec2 wait instance-running ... before the reboot command (commented out in the snippet).
- Any kind of reboot command such as systemctl reboot, shutdown -r, etc. in user_data doesn’t apply the change.
- Tried CloudInit YAML on a suggestion from Amazon Q, but it doesn’t even seem to add the port.
- Same behaviour between Snap and direct download versions.
I just want to apply the SSH port changes. Rebooting seemed to do the
trick, but now my method of rebooting the instances doesn’t seem to work. What is a
reliable method to reboot them or otherwise apply the port changes?