Coding just likes Art

This is the blog about the coding, the whole coding and nothing but the coding , so God would help me become a better developer.

SanCoder's Blog home

How S3 list objects works ?

Amazon Web Service (AWS) S3 is all based on the object naming mechanism. Any object in a S3 bucket has its name as an identity. Even though S3 console give us a view of the filing system with depth folders, it actually store the data all at the root. And it provides a way to allow us leveraging a concept of prefix to structure our files. This is the basic storage mechanism of S3.

When talking about listing (scanning) the objects in a pretty large S3 bucket, AWS forth you to get the results with pagination. There is a 1000 keys upper limit for the S3 list operation. Regarding the virtually unlimited number of keys support, obviously the operation is not atomic. So it is important to know how it works. So based on the AWS documents, there is a concept called marker which indicates the position of the pagination scrolling. I envision that S3 will order the files naturally. Then a marker could divide the whole files to two parts: one has been scanned and the other hasn't. And when you provide a marker, S3 could know which files should be put into the next page results. So there are some cases that S3 would return an empty page if at the point no file exists in unscanned part.

So assuming we are in the middle of scrolling, and we have a marker, the files in both parts could be changes in three ways:

A file has been just Put into it.
A file has been just Deleted from it.
A file has been just Updated from it.

And what will happen to the rest list results is illustrated as following:

	Put	Deleted	Updated
Scanned	Nothing	Nothing	Nothing
Unscanned	With the new file	Without the new file	With the updated file

One thing to keep in mind here is that it is all about the file names because it will determine which part should your file be divided in.

How S3 list objects works ?

Related Posts