Hi,
We need to be able to back up and restore EC2 instances and EBS volumes. We use Terraform to create the infrastructure and keep the state in an S3 backend. Is anyone using a backup solution for their Terraform-managed infrastructure? If so, how does Terraform deal with a restore? What happens to the state file, etc.?
I have looked at AWS Backup, which works fine for backing up and restoring. However, a restore creates a new instance that is not managed by Terraform.
It could be done using snapshots and AMIs, but we require a single pane of glass that gives another team a centralized view for doing restores. It would also mean changing the Terraform code to replace the volumes/AMIs.
Any advice would be appreciated.
Can you explain the intended workflow a bit better? Are you replacing the instance with a backup or creating another instance? Are the restored instances needed long term? Do you need multiple of them? Explaining the workflow would answer most of these.
Hi @gadgetmerc, I am facing the same issue with VM restores, where we have to create and destroy multiple disks. If there is a way to automate this with Terraform, it would give our SREs some relief. The current process is:
1. We have an original OS disk (disk 1).
2. Restore that from backup to a new disk (disk 2).
3. Create a snapshot of disk 2.
4. In the VM, replace disk 1 with disk 2 and destroy disk 1.
5. Create a new disk from the snapshot, and name it the same as disk 1 (disk 3).
6. In the VM, replace disk 2 with disk 3.
So multiple disks have to be juggled over a number of steps just to restore the OS disk. Then we move on to the data disks (a less complex process than for the OS disk, but each disk still has to be done one by one).
I worry that we destroy disk 1 too early in the process, removing our opportunity to roll back if anything fails later in the restore. So we probably need to take a copy of disk 1 under a new name (disk 4) before deleting disk 1 (we have to delete disk 1 before we can give disk 3 the same name).
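To make the ordering concern concrete, here is a rough Python sketch that only lists the steps in order, with the rollback copy (disk 4) taken before disk 1 is touched. It does not call Azure at all; the function name and the `-rollback`, `-restored`, and `-restored-snap` suffixes are hypothetical names chosen for illustration.

```python
def os_disk_restore_steps(disk_name: str) -> list[str]:
    """List the OS disk restore steps in an order that keeps a rollback copy.

    Purely descriptive: nothing here talks to the cloud provider.
    All derived disk names are hypothetical.
    """
    restored = f"{disk_name}-restored"        # "disk 2"
    snapshot = f"{disk_name}-restored-snap"   # snapshot of disk 2
    rollback = f"{disk_name}-rollback"        # "disk 4"
    return [
        f"restore backup to new disk {restored}",
        f"snapshot {restored} as {snapshot}",
        # Take the rollback copy before the original disk is detached or deleted.
        f"copy {disk_name} to {rollback}",
        f"swap VM OS disk from {disk_name} to {restored}",
        f"delete {disk_name}",
        # The original name is free again, so "disk 3" can reuse it.
        f"create disk {disk_name} from {snapshot}",
        f"swap VM OS disk from {restored} to {disk_name}",
    ]


for number, step in enumerate(os_disk_restore_steps("vm-os"), start=1):
    print(number, step)
```

The rollback copy only needs to be removed once the final swap has been verified.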
Of course, Azure has a native VM restore process, which is just a couple of clicks. We can’t currently use that because the new VM would have a different identifier, which would confuse Terraform. The solution to that would probably be as simple as updating the identifier in the Terraform state file.
Messing with the state file appears to be anathema to the DevOps team, but the alternative above is just not workable.
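For what it's worth, the identifier can usually be updated without hand-editing the state file, by using the `terraform state rm` and `terraform import` CLI commands. Below is a minimal Python sketch that only builds the command lines for review; it runs nothing. The resource address and VM ID in the usage lines are hypothetical placeholders, and this assumes the provider supports import for the resource type in question.

```python
import shlex


def reimport_commands(address: str, new_id: str) -> list[str]:
    """Build the Terraform CLI commands that re-point one resource address
    at a restored resource. The commands are returned, not executed."""
    addr = shlex.quote(address)
    rid = shlex.quote(new_id)
    return [
        # Keep a local copy of the current state before changing anything.
        "terraform state pull > pre-restore.tfstate",
        # Make Terraform forget the old resource without destroying it.
        f"terraform state rm {addr}",
        # Bind the same address to the restored resource.
        f"terraform import {addr} {rid}",
        # Review what still differs between the configuration and the restored resource.
        "terraform plan",
    ]


# Hypothetical address and ID, for illustration only.
for command in reimport_commands(
    "azurerm_linux_virtual_machine.app",
    "/subscriptions/0000/resourceGroups/rg/providers/Microsoft.Compute/virtualMachines/app-restored",
):
    print(command)
```

If the restored VM differs from the configuration (name, disk IDs, and so on), the final `terraform plan` will show changes, so the configuration may need adjusting before the next apply.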
Maybe there is a middle ground where restoring a VM from backup becomes a Terraform function; I don’t know.