Datasources for large scale deployments

We have an environment where we deploy hundreds of instances using the OCI provider. Each resource is provisioned using for_each and has corresponding data sources. The problem occurs when you try to do an import or a plan once all instances are deployed or imported into the state file.

Terraform reads the data sources for every resource against all the resources defined in the tfvars, even if we run the import command against a single resource with the proper ADDR.

This adds a lot of delay, and it takes hours or even days to import all the instances into the state file. We have tried the data-only module approach and combined the common data into map variables, but that doesn’t help either. The data reads still happen every time.

Can anyone share best practices on how to reduce the delay and improve performance? Basically, how do we avoid that many data source reads every time?
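For context, a minimal sketch of the pattern described above (the resource and provider names here are illustrative, since the actual configuration wasn’t shared) — each managed instance has a matching data source keyed on the same for_each collection, so every plan or import triggers a read per instance:

```hcl
variable "instances" {
  # Hypothetical input: one entry per instance to provision.
  type = map(object({
    shape = string
  }))
}

resource "oci_core_instance" "this" {
  for_each = var.instances
  shape    = each.value.shape
  # ... other required arguments omitted for brevity
}

# A data source mirroring every managed resource: with hundreds of
# entries, Terraform must perform hundreds of API reads on every
# plan, apply, or import, regardless of which address is targeted.
data "oci_core_instance" "this" {
  for_each    = var.instances
  instance_id = oci_core_instance.this[each.key].id
}
```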

Hi @ulags.n,

Perhaps it would help to explain a little more about what you are doing, maybe with some isolated examples.

You mention you are provisioning resources and each has a corresponding data source. Data sources should only be used for resources which are not managed by the same configuration. Is this spread out over multiple root modules, or are the same resources also represented by a data source in the same configuration?

What workflow do you have where you need to import so many resources? Importing is generally a one-off action for something which needs to be managed by Terraform but was provisioned in some other way; it’s not usually part of a normal workflow.

The general answer to your primary question, though, is that you avoid the many data source reads by not having so many data sources. Figuring out why you have so many would be the first step.
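For example, if the instances are managed in the same configuration, the mirrored data sources are redundant: every attribute the data source would return is already available on the managed resource itself. A hedged sketch of the simplification (again using illustrative names, since the original configuration wasn’t shared):

```hcl
resource "oci_core_instance" "this" {
  for_each = var.instances
  shape    = each.value.shape
  # ... other required arguments omitted for brevity
}

# Instead of a parallel data "oci_core_instance" block, reference
# the managed resource's attributes directly. This costs no extra
# API reads during plan or import.
output "private_ips" {
  value = { for k, inst in oci_core_instance.this : k => inst.private_ip }
}
```

Data sources would then be reserved only for objects provisioned outside this configuration, which should reduce the per-plan read count from hundreds down to however many truly external dependencies you have.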