Like many things, “it depends”.
Each option has advantages and disadvantages, and the decision of which to use (you might choose more than one) depends on the situation at your place of work, what you are doing and personal preference. The only thing I would strongly suggest is to be consistent - don’t do the same thing in different ways without a clear reason why.
The difference between using data sources and remote state is really the strength of the link between the two root modules. With remote state it is a very strong link - one root module exposes some outputs, which other root modules can then consume. With data sources there isn’t such an explicit link.
One advantage of having an explicit link (i.e. remote state) is that you have clear links between root modules, which you can more easily understand and document. A root module has to decide to expose something, whereas a data source could reference something that the producer doesn’t realise is being used/needed.
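As a rough sketch of that explicit link, assuming an S3 backend: the bucket, key, region, CIDR ranges and resource names below are all made up for illustration, and the two halves would live in separate root modules.

```hcl
# --- Producer root module (moduleA) ---
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # hypothetical range
}

# Only values declared as outputs can be consumed by other root modules.
output "vpc_id" {
  value = aws_vpc.main.id
}

# --- Consumer root module (moduleB) ---
# Reads moduleA's state file; bucket/key/region are placeholders
# and must match wherever moduleA actually stores its state.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "eu-west-1"
  }
}

resource "aws_subnet" "app" {
  vpc_id     = data.terraform_remote_state.network.outputs.vpc_id
  cidr_block = "10.0.1.0/24" # hypothetical range
}
```

If moduleA stops exposing `vpc_id`, moduleB fails at plan time, which is the strong coupling described above.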
On the other hand, the advantage that data sources have is this looser association. This is particularly useful if something isn’t managed by Terraform at all (so remote state wouldn’t be possible), or is managed by Terraform but you shouldn’t have access to the root module’s state file. Remote state requires access to the state file (for example in an S3 bucket), which includes more than just the exposed outputs, so it could be a security issue in some situations. Data sources are also useful when exactly what is returned doesn’t matter as much - an example is where you just need the latest AMI instead of a specific one.
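For comparison, a minimal sketch of the data source approach using the AWS provider. The AMI name pattern and the `Name` tag value are assumptions for illustration, not something from your setup.

```hcl
# Latest matching AMI: we don't care which exact image we get.
data "aws_ami" "latest" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"] # assumed naming pattern
  }
}

# Look up an existing VPC by tag. Whoever created it (Terraform or not)
# has not had to expose anything, and no state file access is needed.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-network" # hypothetical tag value
  }
}

resource "aws_subnet" "app" {
  vpc_id     = data.aws_vpc.shared.id
  cidr_block = "10.0.1.0/24" # hypothetical range
}
```

Note that the owner of the VPC has no way of knowing this lookup depends on that tag, which is the looser (and less visible) association described above.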
The idea of storing things in a totally different system is fairly similar to remote state - you have to explicitly send the data to that remote system for it to be usable by someone else. One advantage is that only the shared information is available (instead of the whole state file). Another is the possibility of more granular access control, depending on what system you choose: you could expose several items of information but restrict who can use each of them differently, whereas remote state is all or nothing. The disadvantage is that there is additional complexity - you need to set up and maintain that extra system as well as any access control.
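As one possible illustration, here is a sketch that uses AWS SSM Parameter Store as the separate system. That choice is purely an assumption on my part (Consul, Vault or similar would follow the same pattern), and the parameter path and resource names are made up.

```hcl
# --- Producer root module ---
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # hypothetical range
}

# Explicitly publish just this one value to the external system.
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/shared/network/vpc_id" # hypothetical path
  type  = "String"
  value = aws_vpc.main.id
}

# --- Consumer root module ---
# Needs read access to this parameter only, not to the producer's state.
data "aws_ssm_parameter" "vpc_id" {
  name = "/shared/network/vpc_id"
}

resource "aws_subnet" "app" {
  vpc_id     = data.aws_ssm_parameter.vpc_id.value
  cidr_block = "10.0.1.0/24" # hypothetical range
}
```

Access could then be restricted per parameter path with IAM policies, which is the more granular control mentioned above.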
One other option you didn’t mention is just using hard-coded values. This could be ARNs included directly in a module as strings, or a dedicated module that contains nothing but such hard-coded string/list values. With the values contained in a module you could then use loose versioning (so that you always get the latest version of that module). This can actually work pretty well for things which are pretty fixed - for example AWS account numbers or ARNs for a transit gateway (if they change it is likely to be due to some very major adjustment, which is unlikely and would probably need other work anyway to accommodate).
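A minimal sketch of such a "values only" module. The repository URL, output names, account numbers and ARN are all placeholders.

```hcl
# --- Shared module (e.g. outputs.tf in its own repository) ---
# No resources at all, just fixed values.
output "account_ids" {
  value = {
    production  = "111111111111" # placeholder account number
    development = "222222222222" # placeholder account number
  }
}

output "transit_gateway_arn" {
  # placeholder ARN
  value = "arn:aws:ec2:eu-west-1:111111111111:transit-gateway/tgw-0123456789abcdef0"
}

# --- Any root module that needs the values ---
module "shared_values" {
  # Hypothetical repository. No ?ref= is given, so each `terraform init`
  # fetches the default branch, i.e. the loose versioning described above.
  source = "git::https://example.com/org/terraform-shared-values.git"
}

locals {
  prod_account_id = module.shared_values.account_ids.production
  tgw_arn         = module.shared_values.transit_gateway_arn
}
```

If you would rather control when changes are picked up, adding a `?ref=` to the source pins the module to a tag or commit instead.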
Finally, the other key thing you need to consider, regardless of the option you use for sharing information between different root modules, is how you perform changes. Within a single root module Terraform will produce a dependency graph and figure out what is changing and the order to do things in. When you have coupled root modules, that graph doesn’t exist across them, but there is still the same need.
You therefore need to have a way of rerunning moduleB if moduleA changes (assuming moduleB uses a value exposed by moduleA). This could be something totally manual (a process to run things in a certain order) or something more automated/complex (for example many CI systems can auto trigger other jobs once they complete).