From 6132c61730ad53ab940bf052c3cf75a9e64cffb4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20M=2E=20Sapka?=
Date: Fri, 28 Apr 2023 22:43:02 +0200
Subject: feat: article for 2023-04-28

---
 content/2023/git-objects.md | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
 create mode 100644 content/2023/git-objects.md

(limited to 'content')

diff --git a/content/2023/git-objects.md b/content/2023/git-objects.md
new file mode 100644
index 0000000..c62aa85
--- /dev/null
+++ b/content/2023/git-objects.md
@@ -0,0 +1,30 @@
+---
+title: "Git Objects"
+category: "software"
+abstract: How does Git store its database?
+date: 2023-04-28T22:37:57+02:00
+year: 2023
+draft: false
+tags:
+- Git
+- tutorial
+- engineering
+---
+Every Git repository has a hidden `.git` folder. Open it, and all the internals of Git are at your disposal. Today's topic is something I should have learned a long time ago: objects.
+
+First: a commit is an object. You can see it via `git cat-file -p <commit-sha>`. The first two lines of the output will look like this:
+```
+tree b4653c20c7486d8b9e4eb10a882b79a3a9f3cfdf
+parent 5eb01813d3e6b1f2ac1c7f432d5d994a7fee9ec1
+```
+
+The `parent` line holds the SHA of the parent commit, but that's unimportant today. Instead, let's focus on the tree. You can check what's inside using the same `git cat-file -p <tree-sha>`, and you will see a listing of the top-level folder in the Git repository. You can also `cat-file` any of the listed entries. Besides commits (and annotated tags), Git has two more types of objects:
+
+- tree - a directory listing that points to other trees and blobs
+- blob - the contents of a file (compressed)
+
+What does this mean? A commit is a reference to the state of the entire repository at a given moment in time. That state consists of whole files (blobs) and directories (trees), which in turn reference other trees and blobs. Neat.
+
+This is why you don't want to store big binary files in Git: each version is stored as a complete copy of the file. Not very space-efficient.
+
+You can see each of those objects in `.git/objects`, but since they are compressed, it's much easier to use `git cat-file`. Note that blob objects don't have any filename attached - just the content. Instead, the filename comes from the tree object that references the blob. This is a benefit: the blob object is reused when the same content appears under multiple names.
-- 
cgit v1.2.3
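The article's last point, that a blob carries only content and is shared between identically filled files, can be checked without Git at all. The snippet below is a minimal sketch, not Git's own code. It assumes a repository using the default SHA-1 object format, and the helper names (`git_blob_id`, `loose_object_bytes`) and the two file names are made up for illustration. It builds the `blob <size>\0` header, hashes header plus content, and compresses the same bytes with zlib, which is how a loose object is laid out on disk.

```python
import hashlib
import zlib


def _blob_payload(content: bytes) -> bytes:
    """Prefix the content with the blob header: type, size in bytes, NUL."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return header + content


def git_blob_id(content: bytes) -> str:
    """Hypothetical helper: the ID a SHA-1 repository gives this content."""
    return hashlib.sha1(_blob_payload(content)).hexdigest()


def loose_object_bytes(content: bytes) -> bytes:
    """Hypothetical helper: zlib-compressed header plus content.

    The exact bytes can differ from Git's own output (compression level),
    but they decompress to the same payload.
    """
    return zlib.compress(_blob_payload(content))


# Two made-up paths holding identical content: the name is not hashed,
# so both map to one and the same blob.
files = {"README.md": b"hello world\n", "docs/copy.md": b"hello world\n"}
ids = {name: git_blob_id(data) for name, data in files.items()}
for name, oid in ids.items():
    print(name, oid)

# Round trip: decompressing shows the header followed by the raw content.
stored = loose_object_bytes(b"hello world\n")
print(zlib.decompress(stored))
```

Both paths print the same ID, and it should match what `git hash-object` reports for a file with that content. In a real repository the loose object lives under `.git/objects/`, in a directory named after the first two hex characters of the ID, with the remaining characters as the filename.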