Git Repository Format

The set of commits in a Git repository is stored in the form of a directed acyclic graph, or DAG. This simply means that each commit can reference one or more earlier “parent” commits (only the very first commit in a history has none), and more than one commit can refer to the same parent. The word “acyclic” refers to the fact that the structure is not allowed to contain loops: no commit can be its own ancestor, whether directly or through a chain of parents.
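As a rough illustration of this structure, the following Python sketch models commits as objects that hold references to their parents. It is a toy model only and is not how Git stores commits on disk; the names Commit and ancestors are hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Toy commit: a label plus references to its parent commits."""
    name: str
    parents: tuple = ()


def ancestors(commit):
    """Return the names of every commit reachable through parent links."""
    seen = set()
    stack = list(commit.parents)
    while stack:
        current = stack.pop()
        if current.name not in seen:
            seen.add(current.name)
            stack.extend(current.parents)
    return seen


root = Commit("A")              # the first commit has no parents
left = Commit("B", (root,))
right = Commit("C", (root,))    # B and C refer to the same parent

print(ancestors(left))
print(ancestors(right))
```

Because each toy commit is immutable and can only refer to commits that already exist when it is created, a loop cannot be constructed in this model, which mirrors the “acyclic” property described above.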

The structure of the DAG defines the repository’s history. Normally, each commit has exactly one parent, which describes the repository exactly as it was before the new commit was made. By comparing a commit to its parent, you can produce a diff, which is a precise set of changes that were applied to the parent in order to produce the new version.
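The idea of comparing a commit to its parent can be sketched in Python with the standard difflib module. The sketch assumes a simplified model in which each commit is a dictionary mapping file names to file contents; the snapshot data and the diff_snapshots helper are hypothetical, and Git's own diff machinery is considerably more involved.

```python
import difflib

# Hypothetical snapshots: file name -> file contents.
parent_snapshot = {
    "README": "hello\n",
    "main.c": "int main() { return 0; }\n",
}
child_snapshot = {
    "README": "hello\nworld\n",
    "main.c": "int main() { return 0; }\n",
}


def diff_snapshots(old, new):
    """Return unified-diff lines for every file that differs."""
    lines = []
    for path in sorted(set(old) | set(new)):
        # A file missing from one side is treated as empty (a simplification).
        before = old.get(path, "").splitlines(keepends=True)
        after = new.get(path, "").splitlines(keepends=True)
        if before != after:
            lines.extend(
                difflib.unified_diff(before, after, "a/" + path, "b/" + path)
            )
    return lines


changes = diff_snapshots(parent_snapshot, child_snapshot)
print("".join(changes))
```

Only README appears in the result, because main.c is identical in both snapshots.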

Some commits have more than one parent. These commits are called merge commits because they express a merging of two (or occasionally more) separate branches of history. If two people have a copy of a particular repository and start making commits, those two histories will start to diverge, which is called forking. Eventually, someone will need to rejoin the two histories into one, which is called merging. (As with other version control systems, you can also create additional named branches in each repository if you want. For example, you might create a maintenance branch for each major release of your software.)
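The forking and merging described above can be sketched with a small parent table in Python. The commit names and the helpers is_merge and reachable are hypothetical and serve only to show how a merge commit ties two diverged histories back together.

```python
# Hypothetical history: each key is a commit, each value lists its parents.
history = {
    "A": [],            # starting point shared by both people
    "B": ["A"],         # first person's commit
    "C": ["A"],         # second person's commit; history has now forked
    "M": ["B", "C"],    # merge commit rejoining the two histories
}


def is_merge(commit):
    """A merge commit is any commit with more than one parent."""
    return len(history[commit]) > 1


def reachable(commit):
    """Return the commit itself plus all of its ancestors."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(history[current])
    return seen


merges = [c for c in history if is_merge(c)]
shared = reachable("B") & reachable("C")

print(merges)   # commits with more than one parent
print(shared)   # history common to both forks
```

In this sketch, the commits that B and C have in common represent the point at which the two histories diverged, and M is the only commit from which both B and C are reachable.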
