logo
Community

Research Programs

BlogForum
Back to blog
How does Git store your data

May 19, 2022

Git Internals Part 1- List of basic Concepts That Power your .git Directory
byPragati VermainTips

Git is the most popular and commonly used open-source version control system in the modern-day. However, we barely focus on the basic concepts that are the building blocks of this system. 

In this article, we will learn about the basic concepts that power your .git directory.

The .git directory

Whenever we initialize a git repository, a .git directory gets created in the project’s root. This is the place where Git stores all its information. Digging a bit deeper you can see the directory structure as below:

$ ls -C .git
COMMIT_EDITMSG  MERGE_RR    config      hooks       info        objects     rr-cache
HEAD        ORIG_HEAD   description index       logs        refs

The detailed structure looks like the following:
.
|-- COMMIT_EDITMSG
|-- FETCH_HEAD
|-- HEAD
|-- ORIG_HEAD
|-- branches
|-- config
|-- description
|-- hooks
|   |-- applypatch-msg
|   |-- commit-msg
|   |-- post-commit
|   |-- post-receive
|   |-- post-update
|   |-- pre-applypatch
|   |-- pre-commit
|   |-- pre-rebase
|   |-- prepare-commit-msg
|   `-- update
|-- index
|-- info
|   `-- exclude
|-- logs
|   |-- HEAD
|   `-- refs
|-- objects
`-- refs
    |-- heads
    |-- remotes
    |-- stash
    `-- tags

Directories inside the .git directory

The .git directory consists of the following directories:

hooks:
This directory contains scripts that are executed at certain times when working with Git, such as after a commit or before a rebase.

info:
You can use this file to ignore files for this project, however, it’s not versioned like a .gitignore file would be.

logs:
Contains the history of different branches. It is most commonly used with the git reflog command.

objects:
Git’s internal warehouse of blobs, all indexed by SHAs. You can see them as following:

$ ls -C .git/objects
09  24  28  45  59  6a  77  80  8c  97  af  c4  e7  info
11  27  43  56  69  6b  78  84  91  9c  b5  e4  fa  pack

These directory names are the first two letters of the SHA1 hash of the objects stored in git.

You can enquire a little further as following:

$ ls -C .git/objects/09
6b74c56bfc6b40e754fc0725b8c70b2038b91e  9fb6f9d3a104feb32fcac22354c4d0e8a182c1

These 38 character strings are the names of the files that contain objects stored in git. They are compressed and encrypted, so it’s impossible to view their contents directly. 

rebase-apply: 

The workbench for git rebase. It contains all the information related to the changes that have to be rebased.

refs:

The master copy of all refs that live in your repository, be they for stashes, tags, remote-tracking branches, or local branches. 

You can see the existing refs in your .git directory as below:

$ ls .git/refs
heads
tags
$ ls .git/refs/heads
master
$ ls .git/refs/tags
v1
v1-beta
$ cat .git/refs/tags/v1
fa3c1411aa09441695a9e645d4371e8d749da1dc

Now, having discussed the directories inside the .git directory, let’s explore the files that reside inside the .git directory and their uses.

Files in the .git directory

  1. COMMIT_EDITMSG:

This file contains the commit message of a commit in progress or the last commit. Any commit message provided by the user (e.g., in an editor session) will be available in this file. 

If the git commit exits due to an error before generating a commit, it will be overwritten by the next invocation of git commit.

It’s there for your reference once you have made the commit and is not actually used by Git.

2. config:

This configuration file contains the settings for this repository. Project-specific configuration variables can be dumped in here including aliases. 

$ cat .git/config
[core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
    ignorecase = true
[user]
    name = Pragati Verma
    email = pragati.verma@gmail.com

This file is mostly used to define where the remote repository lives and some core settings, such as if your repository is bare or not.

3. description:

This description will appear when you see your repository or the list of all versioned repositories available while using Git web interfaces like gitweb or instaweb.

4. FETCH_HEAD:

FETCH_HEAD is a temporary ref that keeps track of what has recently been fetched from a remote repository. 

In most circumstances, git fetch is used first, which fetches a branch from the remote; FETCH_HEAD points to the branch’s tip (it stores the SHA1 of the commit, just as branches do). After that, git merge is used to merge FETCH_HEAD into the current branch.

5. HEAD:

HEAD is a symbolic reference pointing to wherever you are in your commit history. It’s the current ref that you’re looking at. 

HEAD can point to a commit, however, typically it points to a branch reference. It is attached to that branch, and when you do certain things (e.g., commit or reset), the attached branch will move along with HEAD. In most cases, it’s probably refs/heads/master. You can check it as follows:

$ cat .git/HEAD
ref: refs/heads/master

6. ORIG_HEAD:

When doing a merge, this is the SHA of the branch you’re merging into.

7. MERGE_HEAD:

When doing a merge, this is the SHA of the branch you’re merging from.

8. MERGE_MODE:

Used to communicate constraints that were originally given to git merge to git commit when merge conflicts and a separate git commit is needed to conclude it.

9. MERGE_MSG:

Enumerates conflicts that happen during your current merge.

10. index:

Git index refers to the “staging area” between the files you have on your filesystem and your commit history with meta-data such as timestamps, file names, and also SHAs of the files that are already wrapped up by Git. 

The files in your working directory are hashed and stored as objects in the index when you execute git add, making them “staged changes.”

11. packed-refs:

It solves the storage and performance issues by keeping the refs in a single file. When a ref is missing from the /refs directory hierarchy, it is searched for in this file and used if it is found.

Conclusion

In this article, we covered a brief overview of the basic concepts that make up your git directory. These are the fundamental components of Git as we know it today and use on a regular basis. We’ll be learning more about these Git internal concepts in the upcoming articles.

Keep reading. In case you want to connect with me, follow the links below:

LinkedIn | GitHub | Twitter | Dev

Bio 

Pragati Verma is a software developer and open-source enthusiast. She has also been an active writer on various platforms and has written for many organizations as a freelance writer. As a Junior Editor at Hackernoon, Pragati helps numerous writers every day to publish their content on Hackernoon.

In her spare time, Pragati loves to read books or watch movies.

Git internals

Recent Posts

ocr

October 29, 2024

How OCR Helps in Text Extraction From Multiple Images at Once?

See post

September 27, 2024

Exploring the adoption of Go and Rust among backend developers

See post

September 17, 2024

Streamlining the Chatbot Development Life Cycle with AI Integration

See post

Contact us

Swan Buildings (1st floor)20 Swan StreetManchester, M4 5JW+441612400603community@developernation.net
HomeCommunityDN Research ProgramPanel ProgramBlog

Resources

Knowledge HubPulse ReportReportsForumEventsPodcast
Code of Conduct
SlashData © Copyright 2024 |All rights reserved