Hello,
I use Terraform to create an HBase cluster in AWS.
With the following settings, cluster creation succeeds most of the time:
resource "aws_emr_cluster" "hbase" {
  name                              = "hbase"
  release_label                     = "emr-6.3.1"
  applications                      = ["HBase"]
  termination_protection            = false
  keep_job_flow_alive_when_no_steps = true

  ec2_attributes {
    key_name         = <removed>
    subnet_id        = <removed>
    instance_profile = aws_iam_instance_profile.emr_profile.arn
  }

  # Single master node: this is the configuration that usually succeeds.
  master_instance_group {
    instance_type  = "m1.medium"
    instance_count = 1
  }

  core_instance_group {
    instance_type  = "m1.medium"
    instance_count = 4

    # One 20 GiB gp2 EBS volume per core node.
    ebs_config {
      size                 = 20
      type                 = "gp2"
      volumes_per_instance = 1
    }
  }

  ebs_root_volume_size = 10
}
As soon as I increase the number of master nodes to three, the cluster creation fails with the error message:
Error: Error waiting for EMR Cluster state to be "WAITING" or "RUNNING": TERMINATING: BOOTSTRAP_FAILURE: On the master instance (i-), application provisioning timed out
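For reference, the failing variant presumably differs only in the master instance group. A minimal sketch, assuming the instance type and every other argument of the resource stay exactly as in the configuration above:

```hcl
# Fragment of the aws_emr_cluster "hbase" resource; assumed to be the only block that changes.
master_instance_group {
  instance_type  = "m1.medium"
  instance_count = 3 # was 1 in the configuration that usually succeeds
}
```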
I checked the documentation for aws_emr_cluster, but could not find any argument for setting a timeout.
I also checked the timeout settings for IAM roles, but the default there is one hour, which should be more than sufficient.
I get the error above every time cluster creation takes longer than about 16 minutes (16 minutes and 20 seconds, to be exact).
I also create an AWS MSK resource in the same project. That took longer than 17 minutes and still finished successfully, so this does not look like a global timeout value.
Any ideas would be much appreciated.
Best,
Denny