Diffstat (limited to 'content/2023/git-objects.md')
-rw-r--r-- content/2023/git-objects.md | 30
1 files changed, 30 insertions, 0 deletions
diff --git a/content/2023/git-objects.md b/content/2023/git-objects.md
new file mode 100644
index 00000000..c62aa852
--- /dev/null
+++ b/content/2023/git-objects.md
@@ -0,0 +1,30 @@
+---
+title: "Git Objects"
+category: "software"
+abstract: How does Git store its database?
+date: 2023-04-28T22:37:57+02:00
+year: 2023
+draft: false
+tags:
+- Git
+- tutorial
+- engineering
+---
+Any Git repository has a hidden `.git` folder. If you open it, all the internals of Git are at your disposal. Today's topic is something I should have learned a long time ago: objects.
+
+First: a commit is an object. You can see it via `git cat-file -p <SHA of commit>`. The first two lines of the output will look like this:
+```
+tree b4653c20c7486d8b9e4eb10a882b79a3a9f3cfdf
+parent 5eb01813d3e6b1f2ac1c7f432d5d994a7fee9ec1
+```
+
+The parent is the SHA of the parent commit, but that's unimportant today. Instead, let's focus on the tree. You can check what's inside using the same `git cat-file -p <SHA>`, and you will see a listing of the top-level folder in the Git repository. You can also `cat-file` any of the entries listed there. Besides commits (and tags), Git has two more types of objects:
+
+- tree - a directory listing: names and modes pointing to blobs and other trees
+- blob - the content of a file (stored compressed)
+
+What does this mean? A commit references a snapshot of the entire repository at a given moment in time. The snapshot consists of whole files (blobs) and trees (directories) that point to blobs and other trees. Neat.
+
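+This is why you don't want to store big binary files in Git: each version is stored as a new blob with the full content of the file. Packfiles can delta-compress similar objects later, but large binaries rarely delta well. Not very space-efficient.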
+
+You can see each of those objects in `.git/objects`, but since they are compressed, it's much easier to use `git cat-file`. Note that blob objects don't have any filename attached - just the content. Instead, the filename is taken from a tree object. This is a benefit: the blob object will be reused when you have the same content under multiple names.