Understanding the .git Folder and How Git Works Internally
Summary: Dive into what happens behind the scenes inside .git.
Version control is at the heart of modern software development, and Git stands as the de facto standard. While most developers are comfortable using git commands like commit, push, and pull, fewer understand the inner workings that make Git both powerful and efficient. At the core of every Git repository lies the mysterious .git folder, which quietly houses all the magic. This article explores the anatomy of the .git directory and how Git functions internally to manage your codebase.
What is the .git Folder?
Whenever you run git init in a directory, Git creates a hidden folder named .git. This directory contains all the metadata and object data Git needs to manage your repository's history. Everything, from commit history to branch pointers and configuration, is stored here. Without the .git folder, your project is just another collection of files and directories; it’s this folder that turns it into a Git repository.
Key Components of the .git Folder
The contents of the .git folder may initially seem arcane, but every file and subdirectory serves a specific purpose.
1. HEAD
- The HEAD file is a plain text file that points to the current branch reference.
- For example: ref: refs/heads/main
- It tracks which branch you're currently on. In a detached HEAD state it holds a commit hash instead of a branch reference.
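To make the two possible shapes of HEAD concrete, here is a minimal Python sketch that interprets the file's contents. The function name parse_head and the returned dictionary layout are illustrative choices, not part of Git; the sketch relies on the fact that HEAD holds either a line starting with "ref: " or a raw commit hash.

```python
def parse_head(text: str) -> dict:
    """Interpret the contents of a .git/HEAD file (illustrative helper)."""
    value = text.strip()
    if value.startswith("ref: "):
        # Symbolic ref: HEAD names a branch, e.g. refs/heads/main
        return {"type": "branch", "ref": value[len("ref: "):]}
    # Otherwise HEAD holds a raw commit hash ("detached HEAD")
    return {"type": "detached", "commit": value}
```

Calling parse_head(open(".git/HEAD").read()) inside a repository should tell you whether you are on a branch or detached.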
2. config
- Contains repository-specific configuration settings (like remotes, user info, aliases), separate from global Git config.
- Example:
    [core]
        repositoryformatversion = 0
        filemode = true
        bare = false
    [remote "origin"]
        url = git@github.com:user/repo.git
        fetch = +refs/heads/*:refs/remotes/origin/*
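The config file is simple enough that a few lines of Python can read the common cases. The sketch below is a deliberately simplified parser written for illustration: it assumes one key = value pair per line and ignores quoting, escapes, repeated keys, and include directives, all of which real Git handles. In practice, git config --get is the reliable way to read a setting.

```python
import re

# Matches section headers such as [core] or [remote "origin"]
SECTION = re.compile(r'^\[(\w[\w.-]*)(?:\s+"(.*)")?\]$')

def parse_git_config(text: str) -> dict:
    """Parse simple git-config text into {section: {key: value}}."""
    result, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        match = SECTION.match(line)
        if match:
            name, sub = match.group(1), match.group(2)
            # Subsections are flattened to "remote.origin" for convenience
            current = result.setdefault(f"{name}.{sub}" if sub else name, {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return result
```

With the example config above, parse_git_config(text)["remote.origin"]["url"] yields the remote's URL.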
3. description
- Used mostly by GitWeb and similar web front ends to show a short description of the repository. Git's everyday commands do not read it.
4. hooks/
- Contains scripts that Git can trigger at key events, such as before a commit (pre-commit) or, on the receiving repository, after a push has been accepted (post-receive).
- Custom logic for code quality checks, CI/CD, etc., can be executed here.
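A hook is nothing more than an executable file with the right name inside hooks/. The following sketch installs one from Python; install_hook and the PRE_COMMIT script are hypothetical names made up for this example, and the shell script is a placeholder rather than a real check.

```python
import stat
from pathlib import Path

# Placeholder hook body: a non-zero exit status would abort the commit.
PRE_COMMIT = """#!/bin/sh
echo "running pre-commit checks"
exit 0
"""

def install_hook(git_dir: str, name: str, script: str) -> Path:
    """Write a hook script into <git_dir>/hooks and mark it executable."""
    hook = Path(git_dir) / "hooks" / name
    hook.parent.mkdir(parents=True, exist_ok=True)
    hook.write_text(script)
    # Git skips hook files that are not executable
    mode = hook.stat().st_mode
    hook.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook
```

For instance, install_hook(".git", "pre-commit", PRE_COMMIT) would make Git run the script before each commit in that repository.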
5. info/
- Houses the exclude file, which can be used to ignore files locally, supplementing .gitignore. Unlike .gitignore, it is not committed or shared with other clones.
6. objects/
- The heart of Git’s content storage.
- Git stores everything—files, directories, commits—as objects using the content-addressed storage model.
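- Objects are named by the SHA-1 (or SHA-256) hash of their content, prefixed with a short header giving the object's type and size.
- There are four object types:
  - blobs: store file contents
  - trees: represent directory structures
  - commits: record a snapshot (via a tree), its parent commits, and metadata
  - tags: point to specific objects
- Structure example: objects/ab/cdef1234567890... (the first two hex characters of the hash name the subdirectory; the rest name the file)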
- This approach ensures deduplication and data integrity.
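Content addressing is easy to reproduce by hand. The sketch below computes the ID Git assigns to a blob, assuming a SHA-1 repository: the hash covers a header of the form "blob <size>" followed by a NUL byte, then the raw content. The helper names hash_object and loose_object_path are illustrative.

```python
import hashlib

def hash_object(data: bytes, obj_type: str = "blob") -> str:
    """Return the SHA-1 object ID Git would assign to this content."""
    header = f"{obj_type} {len(data)}".encode() + b"\x00"
    return hashlib.sha1(header + data).hexdigest()

def loose_object_path(object_id: str) -> str:
    """Map an object ID to its loose-object path relative to .git."""
    return f"objects/{object_id[:2]}/{object_id[2:]}"
```

hash_object(b"test content\n") gives d670460b4b4aece5915caf5c68d12f560a9fe3e4, the same ID that echo 'test content' | git hash-object --stdin prints. Because the name depends only on the content, two identical files always map to the same object.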
7. refs/
- Contains references, or pointers to commits.
  - refs/heads/: Local branches
  - refs/tags/: Tags
  - refs/remotes/: Remote-tracking branches (local copies of branches on a remote)
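Looking up a reference mostly means reading a small file. Here is a rough sketch of that lookup, with a fallback to the packed-refs file that Git uses when it consolidates many refs into one. The function name resolve_ref is an assumption for this example, and the sketch does not follow symbolic refs (a ref file that itself contains "ref: ...").

```python
from pathlib import Path
from typing import Optional

def resolve_ref(git_dir: str, ref: str) -> Optional[str]:
    """Return the object ID a ref such as refs/heads/main points to."""
    loose = Path(git_dir) / ref
    if loose.is_file():
        return loose.read_text().strip()
    packed = Path(git_dir) / "packed-refs"
    if packed.is_file():
        for line in packed.read_text().splitlines():
            # Skip the header comment, peeled-tag lines ("^<hash>"), and blanks
            if not line.strip() or line.startswith(("#", "^")):
                continue
            object_id, name = line.split(" ", 1)
            if name == ref:
                return object_id
    return None
```

Loose files take precedence over packed-refs entries, which is why the sketch checks them first.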
8. logs/
- Stores recent updates to references, used by git reflog.
- Helps recover lost commits or see previous HEAD positions.
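Each line in a log file such as logs/HEAD records one movement of a reference: the old hash, the new hash, who made the change, a Unix timestamp with timezone, and, after a tab, a message. The sketch below parses one such line; ReflogEntry and parse_reflog_line are names invented for this example.

```python
import re
from typing import NamedTuple

class ReflogEntry(NamedTuple):
    old: str        # object ID before the update
    new: str        # object ID after the update
    who: str        # "Name <email>"
    timestamp: int  # seconds since the Unix epoch
    tz: str         # timezone offset such as +0100
    message: str    # e.g. "commit: add README"

# Everything before the tab: old, new, identity, timestamp, timezone
META = re.compile(r"^([0-9a-f]+) ([0-9a-f]+) (.*) (\d+) ([+-]\d{4})$")

def parse_reflog_line(line: str) -> ReflogEntry:
    """Split one reflog line into its fields."""
    meta, _, message = line.rstrip("\n").partition("\t")
    match = META.match(meta)
    if match is None:
        raise ValueError(f"unrecognized reflog line: {line!r}")
    old, new, who, timestamp, tz = match.groups()
    return ReflogEntry(old, new, who, int(timestamp), tz, message)
```

The old value of each entry is what makes recovery possible: it still names the commit a reference pointed to before a reset or rebase moved it.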
9. index
- Also called the staging area. Keeps track of changes staged for the next commit.
- A binary file internally, tracking which content should be part of the next commit.
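Although the index is binary, its first 12 bytes are easy to decode: a 4-byte signature (DIRC), a 4-byte version number, and a 4-byte entry count, all big-endian. The sketch below reads only that header and leaves the entries themselves alone; parse_index_header is an illustrative name.

```python
import struct

def parse_index_header(data: bytes) -> dict:
    """Decode the 12-byte header at the start of a .git/index file."""
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    return {"version": version, "entries": entry_count}
```

Passing it the bytes of .git/index shows how many paths are currently tracked in the staging area.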
10. Packed files
- Git may also optimize storage using packfiles (stored in the pack subfolder within objects/).
- Packfiles combine many objects into compressed files to save space and improve performance.
How Git Works Internally
Now that you know what’s inside .git, let’s look at what happens when you use Git.
1. Adding and Staging Files
When you run git add <file>:
- Git takes a snapshot of the file’s contents.
- It creates a blob object and stores it in objects/ if it doesn't already exist (based on hash).
- It adds an entry to the index referencing this blob.
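The object-storage half of git add can be sketched in a few lines. The version below assumes a SHA-1 repository and loose (unpacked) storage: it prepends the blob header, hashes the result, and writes the zlib-compressed bytes under objects/ unless the object is already there. It does not touch the index, whose binary format is more involved, and write_blob is a name chosen for this example.

```python
import hashlib
import zlib
from pathlib import Path

def write_blob(git_dir: str, data: bytes) -> str:
    """Store data as a loose blob object and return its object ID."""
    store = b"blob " + str(len(data)).encode() + b"\x00" + data
    object_id = hashlib.sha1(store).hexdigest()
    path = Path(git_dir) / "objects" / object_id[:2] / object_id[2:]
    if not path.exists():  # identical content is stored only once
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(store))
    return object_id
```

Calling write_blob twice with the same bytes returns the same ID and writes the file only once, which is the deduplication described earlier.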
2. Committing
Running git commit:
- Git reads the index, writes a tree object for each directory, and stores them.
- It creates a commit object pointing to the top-level tree and recording the parent commits, author, date, and message.
- The commit is added to objects/.
- The current branch reference (in refs/heads/) is updated to point to the new commit.
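The tree and commit objects written during a commit have simple layouts, and a sketch can build both. The code below assumes SHA-1 and makes two simplifications that are labeled in the comments: tree entries are sorted by plain name (Git's real ordering treats directory names slightly differently), and the timezone is fixed at +0000 with the same identity used for author and committer. All function names are illustrative.

```python
import hashlib

def object_id(obj_type: str, body: bytes) -> str:
    """Hash an object body together with its '<type> <size>' header."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries) -> bytes:
    """Build a tree body from (mode, name, hex_object_id) tuples."""
    body = b""
    # Simplification: plain sort by name; fine when all entries are files
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        # Each entry: "<mode> <name>", a NUL byte, then the raw 20-byte hash
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return body

def commit_body(tree: str, parents, author: str, timestamp: int, message: str) -> bytes:
    """Build a commit body; author doubles as committer in this sketch."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {author} {timestamp} +0000")     # assumed timezone
    lines.append(f"committer {author} {timestamp} +0000")  # assumed timezone
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()
```

Hashing the commit body with object_id("commit", ...) gives the commit's ID. Since that body contains the tree ID and the parent IDs, changing any file or any ancestor changes the commit hash too.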
3. Branching
- Branches are just plain text pointer files in refs/heads/ (e.g., refs/heads/feature-x).
- Each branch file contains the hash of its latest commit.
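Because a branch is only a file holding a hash, creating one is cheap. The sketch below writes such a file directly; create_branch is a hypothetical helper, and unlike Git it performs no name validation, no locking, and no check against packed-refs, so treat it as an illustration rather than a substitute for git branch.

```python
from pathlib import Path

def create_branch(git_dir: str, name: str, commit_id: str) -> Path:
    """Create refs/heads/<name> pointing at commit_id."""
    ref = Path(git_dir) / "refs" / "heads" / name
    if ref.exists():
        raise FileExistsError(f"branch {name!r} already exists")
    # Branch names may contain slashes, which become subdirectories
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit_id + "\n")
    return ref
```

No file contents are copied when a branch is created, which is why branching in Git is nearly instantaneous.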
4. Merging & Rebasing
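- Merges are recorded as commits with more than one parent. A fast-forward merge is the exception: it only moves the branch pointer and creates no new commit.
- Rebasing recreates commits on top of a new base, which gives them new hashes, and then moves the branch pointer to the recreated commits.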
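Whether a commit is a merge can be read straight from its header lines. This sketch takes a commit body (the bytes after the object header) and collects its parent lines; parent_ids and is_merge are names made up for the example.

```python
def parent_ids(commit_body: bytes) -> list:
    """Return the parent object IDs listed in a commit body."""
    # Header lines end at the first blank line; the message follows
    headers = commit_body.split(b"\n\n", 1)[0].decode()
    return [
        line.split(" ", 1)[1]
        for line in headers.splitlines()
        if line.startswith("parent ")
    ]

def is_merge(commit_body: bytes) -> bool:
    """A merge commit is one with more than one parent."""
    return len(parent_ids(commit_body)) > 1
```

A root commit has no parent lines, an ordinary commit has one, and a merge has two or more.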
5. Fetching, Pulling, Pushing
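- Remotes: URLs and remote branch references are tracked in the .git/config file and refs/remotes/.
- Fetch: Downloads new objects from a remote and updates the remote-tracking refs in refs/remotes/.
- Pull: Runs a fetch, then merges (or rebases) the fetched commits into the current branch.
- Push: Sends new commits and refs to a remote.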
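The fetch line in the config example (+refs/heads/*:refs/remotes/origin/*) is a refspec: it says which refs on the remote map to which refs locally. The sketch below applies a refspec with at most one * wildcard to a single ref name; map_refspec is an illustrative helper and covers only this simple form.

```python
from typing import Optional

def map_refspec(refspec: str, remote_ref: str) -> Optional[str]:
    """Return the local ref a remote ref maps to, or None if no match."""
    spec = refspec.lstrip("+")  # a leading "+" permits non-fast-forward updates
    src, dst = spec.split(":", 1)
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, suffix = src.split("*", 1)
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    # The text matched by "*" is substituted into the destination pattern
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)
```

With the default refspec, the remote's refs/heads/main lands in refs/remotes/origin/main, so fetching never overwrites your own local branches.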
Visual Summary: Anatomy of a Commit
HEAD
  ↓
refs/heads/main → <commit hash>
  ↓
commit (author, date, message, parent(s))
  ↓
tree (snapshot of directory)
  /      \
blob     tree
(file)   (subdir)
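Following the chain in the diagram comes down to reading objects and decoding them. The sketch below reads a loose object and lists the entries of a tree, roughly what git cat-file does for unpacked objects. It assumes SHA-1 (20-byte hashes) and loose storage, so it will not find objects that live only in a packfile; read_object and parse_tree are names chosen for this example.

```python
import zlib
from pathlib import Path

def read_object(git_dir: str, object_id: str):
    """Return (type, body) for a loose object."""
    path = Path(git_dir) / "objects" / object_id[:2] / object_id[2:]
    raw = zlib.decompress(path.read_bytes())
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("corrupt object: size mismatch")
    return obj_type, body

def parse_tree(body: bytes) -> list:
    """Decode a tree body into (mode, name, hex_object_id) tuples."""
    entries, pos = [], 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        oid = body[nul + 1:nul + 21].hex()  # 20 raw bytes for SHA-1
        entries.append((mode, name, oid))
        pos = nul + 21
    return entries
```

Starting from the hash in a branch file, read_object gives you the commit, its tree line names the top-level tree, and parse_tree lists the blobs and subtrees beneath it.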
Why Understanding .git Matters
- Troubleshooting: If you know where and how Git stores information, you can recover lost commits, fix damaged repos, or reset references.
- Advanced Features: Enables use of tools like git reflog, custom hooks, and low-level commands (git cat-file, git fsck).
- Security: Every object is named by the hash of its content, and every commit includes the hashes of its tree and parents, so altering any past file or commit changes every later commit hash. This is why corruption and tampering are easy to detect.
Conclusion
The .git folder is much more than a hidden subdirectory; it’s the central nervous system of your repository. By understanding what’s inside and how Git builds its history through blobs, trees, commits, and references, you gain a powerful perspective over your codebase. Next time you run a Git command, you’ll have a much clearer picture of what’s happening behind the scenes, inside the humble .git directory.