Possible inaccuracy in aws_s3_objects documentation

Hello folks! The documentation on aws_s3_objects states the following:

The objects data source returns keys (i.e., file names) and other metadata about objects in an S3 bucket.

However, it also returns the directories as keys, e.g.:

 keys            = [
          + "kotlin-lambda-demo/",
          + "kotlin-lambda-demo/0.0.1/",
          + "kotlin-lambda-demo/0.0.1/kotlin-lambda-demo.zip",
          + "kotlin-lambda-demo/0.0.2/",
          + "kotlin-lambda-demo/0.0.2/kotlin-lambda-demo.zip",
          + "kotlin-lambda-demo/0.0.3/",
          + "kotlin-lambda-demo/0.1.0/",
          + "kotlin-lambda-demo/1.0.0/",
          + "kotlin-lambda-demo/1.0.0/kotlin-lambda-demo.zip",

Is the documentation incorrect, or is this the intended behaviour? I would have expected to get only the leaf nodes (i.e. the actual files), and I don’t see any way to filter out the rest in a way that satisfies terraform plan.


Are you sure those objects don’t actually exist in the bucket?

Within S3 there is no concept of a “file” or “directory”. Everything is an object, and there is no requirement to have a containing “directory” at all. The UI showing things as a directory structure is just useful for people to more easily visualise things. Some tools do create “directory” objects for their own purposes (e.g. if the bucket is being used to sync a filesystem and you need to store permissions/ownership data), so maybe that is the case here?

Hey @stuart-c thanks for the reply, I am somewhat familiar with S3 nomenclature :wink:

If we go back to my question, my observation was that the documentation uses the phrase “(i.e., file names)”. From that wording, a user could reasonably infer that the expected output contains only objects that are files.

Given that there is no clean way to filter out objects that are not files, I feel that the documentation is somewhat misleading.

I would have liked to have seen a property on aws_s3_objects that would allow me to filter out what I don’t need.

“Object” and “file” are often used interchangeably. As far as S3 is concerned there is nothing different between an object called “a_path/a_file.zip” and “a_path/” (or “a_path”).

Remember that in S3 everything is indeed a “file” - there is nothing else that exists. In particular there is no such thing as a “directory”.

You can apply your own rules/heuristics (e.g. anything which ends with “/” isn’t a file, or anything which doesn’t have a “.” and a suffix isn’t a file), but that’s up to you and your use case. Such rules wouldn’t work, for example, if you don’t finish a “directory” object with a “/” or if you use a “filename” which doesn’t have a suffix.
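
As a rough sketch of the first heuristic in Terraform itself, assuming the placeholder keys always end in “/”, a `for` expression with an `if` clause can drop them from the list. The bucket name, prefix, and local/output names below are made up for illustration, and `endswith` needs Terraform 1.3 or later:

```hcl
# Hypothetical bucket and prefix, for illustration only.
data "aws_s3_objects" "artifacts" {
  bucket = "example-bucket"
  prefix = "kotlin-lambda-demo/"
}

locals {
  # Heuristic: treat any key ending in "/" as a "directory" placeholder
  # object and keep everything else. This is a convention, not an S3 rule.
  file_keys = [
    for key in data.aws_s3_objects.artifacts.keys : key
    if !endswith(key, "/")
  ]
}

output "file_keys" {
  value = local.file_keys
}
```

On older Terraform versions, a condition such as `length(regexall("/$", key)) == 0` should behave the same way. Either form only filters the list after the data source has returned it; it does not change what the data source itself reads from the bucket.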

As the documentation states it is all about “keys”, which is also the name of the attribute returned. The only place filenames are mentioned is in the sentence “The objects data source returns keys (i.e., file names) and other metadata about objects in an S3 bucket.”, so other than just removing the bracketed section I’m not sure what could be adjusted?