Possible inaccuracy in aws_s3_objects documentation

ioannis · February 19, 2023, 12:45pm

Hello folks! The documentation on aws_s3_objects states the following:

The objects data source returns keys (i.e., file names) and other metadata about objects in an S3 bucket.

However, it also returns the directories as keys, e.g.:

 keys            = [
          + "kotlin-lambda-demo/",
          + "kotlin-lambda-demo/0.0.1/",
          + "kotlin-lambda-demo/0.0.1/kotlin-lambda-demo.zip",
          + "kotlin-lambda-demo/0.0.2/",
          + "kotlin-lambda-demo/0.0.2/kotlin-lambda-demo.zip",
          + "kotlin-lambda-demo/0.0.3/",
          + "kotlin-lambda-demo/0.1.0/",
          + "kotlin-lambda-demo/1.0.0/",
          + "kotlin-lambda-demo/1.0.0/kotlin-lambda-demo.zip",
        ]

Is the documentation incorrect, or is this the intended behaviour? I would have expected to only get the leaf nodes (i.e. the actual files), and I don’t see any way to filter out the directories in a way that satisfies terraform plan.

Thanks!

stuart-c · February 19, 2023, 4:28pm

Are you sure those objects don’t actually exist in the bucket?

Within S3 there is no concept of a “file” or “directory”. Everything is an object, and there is no requirement to have a containing “directory” at all. The UI shows things as a directory structure only to make them easier for people to visualise. Some tools do create “directory” objects for their own purposes (e.g. if S3 is being used to sync a filesystem and the tool needs to store permissions/ownership data), so maybe that is the case here?

ioannis · February 20, 2023, 6:41am

Hey @stuart-c, thanks for the reply. I am somewhat familiar with S3 nomenclature.

If we go back to my question, my observation was that the documentation uses the phrase “(i.e., file names)”. From this, a user could infer that the expected output contains only objects that are files.

Given that there is no clean way to filter out objects that are not files, I feel that the documentation is somewhat misleading.

I would have liked to have seen a property on aws_s3_objects that would allow me to filter out what I don’t need.

stuart-c · February 20, 2023, 6:56pm

“Object” and “file” are often used interchangeably. As far as S3 is concerned there is nothing different between an object called “a_path/a_file.zip” and “a_path/” (or “a_path”).

Remember that in S3 everything is indeed a “file” - there is nothing else that exists. In particular there is no such thing as a “directory”.

You can apply your own rules/heuristics (e.g. anything which ends in “/” isn’t a file, or anything which doesn’t have a “.” and a suffix isn’t a file), but that’s up to you and your use case. Such rules wouldn’t work, for example, if you don’t finish a “directory” object with a “/”, or if you use a “filename” which doesn’t have a suffix.
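As a minimal sketch of the first heuristic (a key ending in “/” is treated as not being a file), something like the following could work. It assumes Terraform 1.3 or later for the endswith function; the bucket name and the names "artifacts" and "file_keys" are hypothetical and only for illustration.

```hcl
# Hypothetical bucket name; the prefix matches the keys shown earlier in the thread.
data "aws_s3_objects" "artifacts" {
  bucket = "example-bucket"
  prefix = "kotlin-lambda-demo/"
}

locals {
  # Keep only keys that do not end in "/".
  # This is a naming heuristic, not something S3 itself enforces.
  file_keys = [
    for key in data.aws_s3_objects.artifacts.keys : key
    if !endswith(key, "/")
  ]
}
```

With the keys listed above, this would presumably leave only the three kotlin-lambda-demo.zip entries. It would not catch a placeholder object whose key has no trailing “/”.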

As the documentation states it is all about “keys”, which is also the name of the attribute returned. The only place filenames are mentioned is in the sentence “The objects data source returns keys (i.e., file names) and other metadata about objects in an S3 bucket.”, so other than just removing the bracketed section I’m not sure what could be adjusted?
