I’m looking to see if there is an easier way to tackle a problem I’m having. I’m trying to spin up a VPC with multiple EC2 instances and an ALB. The instances are private and should only be accessed via the VPN. What I currently do is run Terraform to create the VPC and the EC2 instances. Once the VPN appliance is up, I log into the VPN. I then run another Terraform configuration on a bastion to provision the machines internally. Any ideas on how I can avoid doing this in multiple steps?
Hi @rmattier
I usually do Terraform and then Ansible, but both are manually applied and it takes two steps.
Couple of ideas:
- Using cloud-init, so that the machines install Ansible, then download and apply the playbooks themselves.
- Using golden images.
I would also like to know what is the best approach.
Hi @rmattier,
In the ideal case, Terraform should not need direct access to the machines it’s deploying, because provisioners are a last resort. Indeed, that section of the Terraform documentation alludes to this very problem as one of the justifications for seeking other approaches where possible:
Secondly, successful use of provisioners requires coordinating many more details than Terraform usage usually requires: direct network access to your servers, issuing Terraform credentials to log in, making sure that all of the necessary external software is installed, etc.
Without knowing the details of what you are setting up I can’t make specific recommendations or even be sure that there is an option other than provisioners, but hopefully the ideas in that section are helpful in giving you some other approaches that might allow the machines to self-bootstrap rather than requiring direct provisioning via Terraform.
If you aren’t sure how to make use of that advice in your specific situation then I’m happy to try to answer some more specific questions. 
Thanks for the quick response. This was my thought. I am creating 7 EC2 instances and one load balancer. Three of those EC2 instances are behind the load balancer, and another is a bastion host. I can successfully create the instances, the load balancers, target groups, etc. The VPC also has a FortiGate (VPN) appliance that is created. Currently, I first build the VPC with the FortiGate and the EC2 instances. Second, I log into the FortiGate, SSH to the bastion, then run Terraform to provision the other 6 instances. I would like to combine this into one run instead of two. I thought of attaching EIPs to each instance and provisioning them that way, but I was never successful in disassociating the EIPs from the instances.
Hi @rmattier,
When you say “provisioning” here I’m assuming you’re talking specifically about the Terraform feature of “provisioners”, rather than the general idea of preparing a virtual machine to do its work, and so my previous answer was coming from that assumption and trying to suggest using other techniques to achieve provisioning in the general sense.
To give a more specific idea to consider:
Are the EC2 instances that are “behind” the bastion built from an AMI that has cloud-init installed? (Typical generic Linux distribution images, like Amazon Linux, Debian, and CentOS, do.)
If so, can you run the same steps you were previously running with remote-exec and file provisioners using cloud-init’s ability to run arbitrary shell scripts on boot and to write out arbitrary files?
You can use the user_data argument on aws_instance to provide a user data string which software running in the instance can read. When cloud-init is installed it will be the one to read the user_data on system boot and then take actions based on the cloud-init configuration you placed in user_data.
A key advantage of this technique over Terraform provisioners is that EC2 itself serves as an intermediary for the user_data value: Terraform submits it to the EC2 API, and then EC2 provides the data to the software in the EC2 instance once it’s running, and so there’s never any direct interaction between Terraform and the software running in the EC2 instance. That means that Terraform needs no access to the VPC network and thus won’t need to make use of the bastion.
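As a rough sketch of what that could look like, assuming an AMI with cloud-init installed: the variable names, the file path, and the command below are placeholders made up for illustration, not anything taken from your configuration.

```hcl
# Hypothetical inputs: supply your own AMI and private subnet.
variable "ami_id" {
  type = string
}

variable "private_subnet_id" {
  type = string
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  subnet_id     = var.private_subnet_id

  # Terraform only sends this string to the EC2 API. cloud-init inside the
  # instance reads it on first boot, so no SSH connection is needed.
  user_data = <<-EOT
    #cloud-config
    write_files:
      - path: /etc/example/app.conf
        permissions: "0644"
        content: |
          listen_port = 8080
    runcmd:
      - [sh, -c, "echo bootstrapped > /var/tmp/bootstrap-done"]
  EOT
}
```

The `write_files` and `runcmd` sections stand in for whatever the `file` and `remote-exec` provisioners were doing before.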
This approach does have some limitations though:
- There is a maximum size limit for user_data. If you need to place large files into the remote filesystem then they may be too large to pack into the user_data field and so you might need to have Terraform instead publish those files on some other intermediate data store (e.g. an S3 bucket) and then arrange for the startup script to download the file from there. (I don’t recall what the limit is off the top of my head but you can find it in the EC2 documentation.)
- The content of user_data is stored in cleartext as part of the EC2 instance configuration and is visible via the EC2 Console and API, so it’s not a suitable way to transmit sensitive information. For sensitive information, it may be better to use something like KMS as an intermediary.
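To illustrate the S3 intermediary idea from the first point (EC2 documents the user data limit as 16 KB before base64 encoding), here is a minimal sketch. It assumes an existing bucket, an instance profile that already allows `s3:GetObject` on it, the AWS CLI being present on the AMI, and a route from the private subnet to S3 (a NAT gateway or a VPC endpoint). All names and paths are hypothetical.

```hcl
# Hypothetical inputs; none of these names come from the original setup.
variable "ami_id" { type = string }
variable "private_subnet_id" { type = string }
variable "artifact_bucket" { type = string }
variable "instance_profile_name" { type = string }

# Terraform publishes the large file to S3 instead of packing it into user_data.
resource "aws_s3_object" "bundle" {
  bucket = var.artifact_bucket
  key    = "bootstrap/app-bundle.tar.gz"
  source = "${path.module}/files/app-bundle.tar.gz"
  etag   = filemd5("${path.module}/files/app-bundle.tar.gz")
}

resource "aws_instance" "app_with_bundle" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  subnet_id            = var.private_subnet_id
  iam_instance_profile = var.instance_profile_name

  # The boot script stays small; it only fetches and unpacks the bundle.
  user_data = <<-EOT
    #!/bin/bash
    set -euo pipefail
    aws s3 cp "s3://${aws_s3_object.bundle.bucket}/${aws_s3_object.bundle.key}" /tmp/app-bundle.tar.gz
    mkdir -p /opt/app
    tar -xzf /tmp/app-bundle.tar.gz -C /opt/app
  EOT
}
```

Referencing the object's attributes in user_data also gives Terraform the dependency it needs to upload the file before launching the instance.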
You’ll notice that the common theme in the above is using various sorts of intermediaries in order to avoid Terraform ever needing to make an SSH connection to the EC2 instances, which can therefore allow everything to happen in a single step. This approach is also suitable for instances created indirectly via EC2 autoscaling, because in that case new instances can be launched at any time and Terraform wouldn’t be around to SSH into the new instance and configure it.
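For the autoscaling case, the same user data can go on a launch template. This is only a sketch under assumed names (the variables and the sizes are placeholders); note that launch templates expect the user data to be base64-encoded.

```hcl
# Hypothetical inputs for illustration only.
variable "ami_id" {
  type = string
}

variable "private_subnet_ids" {
  type = list(string)
}

resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = var.ami_id
  instance_type = "t3.micro"

  # Every instance the group launches later runs this on boot,
  # whether or not Terraform is running at the time.
  user_data = base64encode(<<-EOT
    #cloud-config
    runcmd:
      - [sh, -c, "echo bootstrapped > /var/tmp/bootstrap-done"]
  EOT
  )
}

resource "aws_autoscaling_group" "app" {
  min_size            = 1
  max_size            = 3
  desired_capacity    = 3
  vpc_zone_identifier = var.private_subnet_ids

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }
}
```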