Object storage is a storage architecture (hardware and software) that stores data as objects, each of which combines the data itself, its associated metadata, and a unique identifier. Clients access data through RESTful HTTP APIs, which also helps with scalability: adding nodes allows expansion to petabyte scale. Data and metadata are distributed and replicated across the nodes of a cluster over the network, with erasure coding available for recovery. Rich metadata removes the need for the hierarchical folder/subfolder structures used in file storage. However, although average response times are fast, latency at p99 and above can be significant: because of the distributed nature of the system, some requests are very slow.
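To make the "data + metadata + identifier, no folders" idea concrete, here is a minimal in-memory sketch. StoredObject and FlatBucket are hypothetical names made up for this note, not the API of any real object store; the point is only that keys live in one flat namespace and that "folders" are a prefix convention.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str                                      # identifier within the bucket
    data: bytes                                   # opaque payload
    metadata: dict = field(default_factory=dict)  # user-defined attributes


class FlatBucket:
    """Hypothetical in-memory bucket: one flat key -> object map, no directories."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        self._objects[key] = StoredObject(key, data, metadata)

    def get(self, key):
        return self._objects[key]

    def list(self, prefix=""):
        # "Folders" are only a naming convention: listing filters keys by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))

    def find(self, **wanted):
        # Look objects up by metadata instead of by path.
        return sorted(
            o.key
            for o in self._objects.values()
            if all(o.metadata.get(k) == v for k, v in wanted.items())
        )
```

For example, after bucket.put("logs/2024/a.txt", b"x", team="infra"), calling bucket.list("logs/") returns that key even though no "logs" directory was ever created.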
Object Storage Hedging
Send the same request several times simultaneously and use whichever response arrives first → significantly reduces p99 latency, though at increased cost because of the duplicate requests.
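A minimal sketch of hedging, assuming a thread pool and a read-only GET that is safe to duplicate. fetch_object is a hypothetical stand-in that simulates an occasional slow request; a real client call would replace it. hedged_get and its copies parameter are also names invented for this note.

```python
import concurrent.futures as cf
import random
import time


def fetch_object(key):
    """Hypothetical stand-in for a GET against an object store."""
    # Simulated latency: usually fast, occasionally a slow tail request.
    time.sleep(0.5 if random.random() < 0.05 else 0.01)
    return f"contents of {key}".encode()


def hedged_get(key, fetch=fetch_object, copies=2):
    """Issue `copies` identical requests at once and return the first success."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    pool = cf.ThreadPoolExecutor(max_workers=copies)
    futures = [pool.submit(fetch, key) for _ in range(copies)]
    try:
        last_error = None
        for fut in cf.as_completed(futures):
            try:
                return fut.result()   # first request to succeed wins
            except Exception as exc:  # keep waiting for the other copies
                last_error = exc
        raise last_error              # every copy failed
    finally:
        # Do not block on the slower duplicates (Python 3.9+ for cancel_futures).
        pool.shutdown(wait=False, cancel_futures=True)
```

Every call pays for `copies` requests, which is the cost side of the trade-off. A common variant, not shown here, sends the second request only after a short delay so that most calls still issue a single request.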
Object Storage Systems

Taking out the Trash: Garbage Collection of Object Storage at Massive Scale
Files that have been deleted in metadata must eventually be physically removed from object storage, but simple bucket policies and synchronous deletion have limitations: they can interfere with live queries, and they make recovery impossible when a deletion was made by mistake.
Distributed systems built on object storage all have one common problem: removing files that have been logically deleted either due to data expiry or compaction. We review the pros and cons of five ways to solve this problem.
https://www.warpstream.com/blog/taking-out-the-trash-garbage-collection-of-object-storage-at-massive-scale
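As an illustration of why deferring physical deletion helps with both live queries and recovery, here is a small sketch of tombstones with a grace period. This is a generic pattern and not necessarily one of the five approaches the article reviews; InMemoryStore, logical_delete, undelete, and collect_garbage are hypothetical names, and a dict stands in for the real object store.

```python
import time
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for an object store plus its metadata layer."""
    objects: dict = field(default_factory=dict)     # key -> bytes (physical data)
    tombstones: dict = field(default_factory=dict)  # key -> time of logical deletion

    def logical_delete(self, key, now=None):
        # Only metadata changes; the bytes stay readable for in-flight queries.
        if key in self.objects:
            self.tombstones[key] = time.time() if now is None else now

    def undelete(self, key):
        # A mistaken delete can be reverted until the grace period ends.
        return self.tombstones.pop(key, None) is not None

    def collect_garbage(self, grace_seconds, now=None):
        """Physically remove objects whose tombstone is older than the grace period."""
        now = time.time() if now is None else now
        expired = [k for k, t in self.tombstones.items() if now - t >= grace_seconds]
        for key in expired:
            self.objects.pop(key, None)
            del self.tombstones[key]
        return sorted(expired)
```

A background job would call collect_garbage periodically; the grace period should be longer than the longest query that might still be reading a logically deleted file.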

So you want to use Object Storage
Tips and lessons learned from building systems directly against object stores
https://spiraldb.com/post/so-you-want-to-use-object-storage
Not a File System
S3 is files, but not a filesystem
"Deep" modules, mismatched interfaces - and why SAP is so painful
https://calpaterson.com/s3.html


Seonglae Cho