Every Git object is identified by an object ID (oid), which is a hash
3 core types (annotated tags are a fourth):
- blob: file content
- tree: directory listing; a repeated sequence of entries, each made of mode + space + name + NUL + 20-byte raw SHA-1 hash
- commit: tree hash, parent hash(es), author/committer, optional signature, message
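The tree entry layout can be sketched in a few lines of Python. This is a minimal illustration, not Git's own code: `encode_tree` and `decode_tree` are hypothetical helper names, and the sketch assumes 20-byte SHA-1 hashes and leaves entry ordering to the caller (Git itself expects entries sorted by name).

```python
def encode_tree(entries):
    """Serialize (mode, name, sha1_hex) tuples into the body of a tree object.

    Assumption: the caller passes entries in the order Git expects;
    this sketch does not sort them.
    """
    body = b""
    for mode, name, sha1_hex in entries:
        # mode + space + name + NUL + 20 raw hash bytes (not hex)
        body += mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(sha1_hex)
    return body


def decode_tree(body):
    """Parse a tree object's body back into (mode, name, sha1_hex) tuples."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)     # end of the mode field
        nul = body.index(b"\x00", space)  # end of the name field
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        raw_hash = body[nul + 1:nul + 21]  # fixed 20-byte SHA-1
        entries.append((mode, name, raw_hash.hex()))
        pos = nul + 21
    return entries


entries = [
    ("100644", "README.md", "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"),  # empty blob
    ("40000", "src", "4b825dc642cb6eb9a060e54bf8d69288fbee4904"),         # empty tree
]
tree_body = encode_tree(entries)
```

Because the hash is stored as raw bytes rather than hex text, a tree body is binary and cannot be split on newlines; the parser has to walk it entry by entry.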
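Every object is serialized as "type + space + length + NUL + content", where length is the byte length of the content in decimal. The SHA-1 hash of those uncompressed bytes is the oid. The bytes are then compressed with zlib and written as a loose object under .git/objects, with the hex hash as the path (first 2 characters as directory name + remaining 38 as filename).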
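The storage format above can be reproduced with Python's standard library. This is a minimal sketch assuming SHA-1 object IDs and loose (unpacked) objects; `encode_object`, `object_id`, and `loose_path` are hypothetical helper names, not part of Git.

```python
import hashlib
import zlib


def encode_object(obj_type, content):
    """Build the uncompressed bytes: type + space + length + NUL + content."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return header + content


def object_id(obj_type, content):
    """The oid is the SHA-1 of the uncompressed header + content."""
    return hashlib.sha1(encode_object(obj_type, content)).hexdigest()


def loose_path(oid):
    """First 2 hex characters are the directory, the rest is the filename."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


content = b"hello world\n"
oid = object_id("blob", content)
compressed = zlib.compress(encode_object("blob", content))  # bytes written to disk

print(oid)
print(loose_path(oid))
```

For comparison, `echo "hello world" | git hash-object --stdin` should print the same oid, since the hash is taken before compression and does not depend on the zlib settings.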
Handcrafted reverse-engineering hash structure (git object)
Git's internal design is simple and elegant; once you understand it, implementing it isn't difficult.
Scaling Git's garbage collection | The GitHub Blog
At GitHub, we store a lot of Git data: more than 18.6 petabytes of it, to be precise. That's more than six times the size of the Library of Congress's digital collections. Most of that data comes from the contents of your repositories: your READMEs, source files, tests, licenses, and so on.
https://github.blog/2022-09-13-scaling-gits-garbage-collection


Seonglae Cho