A Beginner's Guide to Git
git is currently one of the most popular DVCSs (Distributed Version Control Systems) in use. Created in 2005 by Linus Torvalds, known as the "father of Linux," git not only offers powerful features but also embodies a spirit of resilience and independence. Its history reflects the humor and defiance of master developers and open-source advocates, something that has always impressed me deeply.
Below is a quote from the Chinese version of Wikipedia that describes the backstory (original source here; English version here):
In 2002, Linus Torvalds decided to use BitKeeper as the main version control system for maintaining Linux kernel code. Since BitKeeper was proprietary software, this decision was long criticized within the community. Particularly, Richard Stallman and members of the Free Software Foundation argued that an open-source tool should be used for the Linux kernel's version control. Linus considered using existing solutions like Monotone, but these tools had various issues, particularly with performance. Other systems like CVS were dismissed by Linus for their architecture.
In 2005, Andrew Tridgell wrote a simple program that could connect to BitKeeper repositories. Larry McVoy, the owner of BitKeeper, believed that Tridgell had reverse-engineered the protocol used by BitKeeper and decided to withdraw the free usage rights of BitKeeper. Negotiations between the Linux kernel development team and BitMover failed to resolve the differences. As a result, Linus decided to create his own version control system to replace BitKeeper, and in just ten days, he developed the first version of git.
git has since become more than just a version control tool for programmers; it's widely adopted and now serves as an essential collaboration tool for many projects. Whether you're gathering data or writing articles, knowing git is a valuable skill for both work and daily life.
This article follows my usual tutorial style, aiming to provide a quick introduction. For those who want to dive deeper, I highly recommend referring to the official documentation. Here are some useful resources:
Popular git hosting platforms:
Now, let's dive into the content. I hope this guide will be helpful, and feel free to email me if you spot any errors.
Table of Contents
- Step 1: Setting Up Your Identity
- Step 2: Creating or Cloning a Git Repo
- Step 3: Recording Changes
- Step 4: Viewing and Editing History
- Step 5: Managing Remote Repositories
- Conclusion
Step 1: Setting Up Your Identity
The first thing to do when using git is to set up your personal information, specifically your name and email address. This helps ensure that anyone reviewing your project can reach out if they encounter issues, and it also makes it easier to track contributions within a team. Additionally, it safeguards your work by properly attributing it to you. It's especially satisfying to see your name on open-source projects.
More importantly, the identity recorded in a commit isn't easy to change afterwards, because it becomes part of the commit itself (this matters most when contributing to large or external projects). I didn't realize this when I first started using git, so I didn't configure it. As a result, my git logs showed only an identity that git had derived automatically from my computer's user and host names, which I found quite frustrating. Here's how to set your identity:
$ git config --global user.name "your name"
$ git config --global user.email "example@example.com"
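You can also configure different information for a specific project by running the same commands with --local inside that repo:
$ git config --local user.name "your name"
$ git config --local user.email "example@example.com"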
For more options, you can refer to the manual:
$ man git-config
To view your current settings, use:
$ git config -l
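As a rough illustration of how the two levels interact, the sketch below sets a repo-local identity in a throwaway repo and reads back the effective value. The directory comes from mktemp, and "Work Name" and work@example.com are placeholder values, not anything from a real setup.

```shell
# Sketch: a repo-local identity applies only inside that repo.
repo="$(mktemp -d)"                       # throwaway directory (assumption)
git init -q "$repo"

# Placeholder identity, stored in "$repo/.git/config"
git -C "$repo" config --local user.name "Work Name"
git -C "$repo" config --local user.email "work@example.com"

# Effective value inside the repo; prints: Work Name
git -C "$repo" config user.name

# Show which config file the value was read from
git -C "$repo" config --show-origin --get user.email
```

Outside that repo, git keeps using whatever you set with --global.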
Now that your identity is configured, let's move on to actually using git.
Step 2: Creating or Cloning a Git Repo
All work with git happens inside a "git repository" (referred to as "repo" here). There are two ways to get one:
- Create a local repo
- Clone a repo from the web
Creating a Local Repo
Creating a local repo is straightforward. First, navigate to the directory where you want to create the repo:
$ cd /my/git/repo
Then, run:
$ git init
This creates a .git subdirectory containing all the necessary information. You don't need to worry about what's inside.
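To see this for yourself without touching a real project, here is a minimal sketch that initializes a repo in a temporary directory (the mktemp path is only a stand-in for /my/git/repo):

```shell
# Sketch: create a repo in a scratch directory and confirm git recognizes it.
repo="$(mktemp -d)"     # stand-in for your project directory (assumption)
cd "$repo"
git init -q

ls -a                                  # the listing now includes .git
git rev-parse --is-inside-work-tree    # prints: true
```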
Cloning from the Web
If you want to clone an existing repo from the web, start by navigating to the directory under which you want the repo to live, and then run the following, which creates a new subdirectory named after the repo:
$ git clone <url>
Either method will set up a git environment, allowing you to begin using git.
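Cloning can be tried offline too, because the URL may be a local path. The sketch below builds a tiny source repo and clones it. The directory names and the "Example" identity are made up for illustration.

```shell
# Sketch: clone a local repo; a filesystem path stands in for a web URL.
src="$(mktemp -d)"      # plays the role of the repo "on the web" (assumption)
dst="$(mktemp -d)"      # where the clone will live (assumption)

git init -q "$src"
# One empty commit so there is some history to copy; identity is a placeholder
git -C "$src" -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "first commit"

git clone -q "$src" "$dst/copy"

git -C "$dst/copy" log --oneline    # the cloned history: one commit
git -C "$dst/copy" remote -v        # clone has already registered "origin"
```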
Step 3: Recording Changes
Now we get to the core of git: recording changes. Git is a version control system (VCS), so tracking changes is its main purpose. In git, files have two states: tracked and untracked. Tracked files are under git's control, while untracked files are not. A tracked file can in turn be unmodified, modified, or staged. The file-lifecycle diagram on the official git site illustrates how a file moves between these states within a git repo.
To check the current status of your files, use:
$ git status
This command will show the status of your files. For a deeper understanding of what these statuses mean, I recommend checking out the official guide. My goal here is simply to help you get started quickly.
Back to the example: let's say we create a new file named README in our repo:
$ touch README
This file is now untracked, so if we want git to track it, we need to run:
$ git add README
This moves the file into the staged state. To finalize the change, commit it with:
$ git commit
This command opens your default editor so you can write a commit message. Once done, the file moves to the unmodified state.
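The state changes described above can be watched with the short form of git status. This is only a sketch in a scratch repo: the temporary directory and the "Example" identity are placeholders, and -m is used instead of the editor so the commands run unattended.

```shell
# Sketch: follow README through untracked -> staged -> unmodified -> modified.
repo="$(mktemp -d)"; cd "$repo"; git init -q
git config user.name "Example"                  # placeholder identity
git config user.email "example@example.com"

touch README
git status --short       # ?? README   (untracked)

git add README
git status --short       # A  README   (staged)

git commit -q -m "Add README"
git status --short       # prints nothing (unmodified)

echo "hello" >> README
git status --short       #  M README   (modified, not yet staged)
```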
From this point on, whenever you modify a file, you can repeat these steps to track changes. Typically, you'll follow this pattern:
$ git add <file>
$ git commit
In most cases, you can simplify the process with:
$ git add -A # Stages all changes (equivalent to --all)
$ git commit -m "one line commit" # Creates a simple, one-line commit
These are the commands you'll use most often as a typical git user.
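As a small sketch of what -A picks up, the example below deletes one file, edits another, and creates a third, then stages everything at once. File names and the identity are invented for the demonstration.

```shell
# Sketch: "git add -A" stages deletions, modifications, and new files together.
repo="$(mktemp -d)"; cd "$repo"; git init -q
git config user.name "Example"                  # placeholder identity
git config user.email "example@example.com"

echo one > a.txt; echo two > b.txt
git add -A
git commit -q -m "Add a and b"

rm a.txt                 # delete a tracked file
echo more >> b.txt       # modify a tracked file
touch c.txt              # create a new file

git add -A
git status --short
# D  a.txt
# M  b.txt
# A  c.txt

git commit -q -m "Update files"
```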
Step 4: Viewing and Editing History
After making several commits, you'll likely want to review the history of your changes. The git status command only shows the current state, so to view past commits, use:
$ git log
This command displays a list of all commits. If you pay attention, you'll notice each commit has a unique hash code, which acts as its identifier. To revert to a previous commit, you'll need this hash. Use the following command to reset your repo to a specific commit:
$ git reset <hash>
This command is gentle: by default it only moves the branch back to the target commit and resets the staging area, leaving the files in your working directory alone. For instance, if a file was not yet tracked at the target commit, it will become untracked again after the reset, but the file itself remains. For a more forceful reset, I often use:
$ git reset --hard <hash>
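This version also rewrites your working directory to match the target commit: files that were added in later commits are deleted, and uncommitted changes to tracked files are discarded. Be cautious: uncommitted changes lost this way cannot be recovered, although commits you reset away from can usually still be found with git reflog for a while.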
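The difference between the two resets is easier to see side by side. The following is a sketch in a scratch repo. The file names, commit messages, and identity are all made up, and the hashes are captured with git rev-parse rather than copied from git log.

```shell
# Sketch: default reset keeps files on disk; --hard removes them.
repo="$(mktemp -d)"; cd "$repo"; git init -q
git config user.name "Example"                  # placeholder identity
git config user.email "example@example.com"

echo v1 > notes.txt; git add notes.txt; git commit -q -m "first"
first="$(git rev-parse HEAD)"                   # hash of the first commit

echo extra > extra.txt; git add extra.txt; git commit -q -m "second"

git reset -q "$first"       # default reset: history moves back, files stay
git status --short          # ?? extra.txt   (still on disk, now untracked)

git add extra.txt; git commit -q -m "second again"

git reset -q --hard "$first"   # hard reset: working directory is rewritten
ls                             # notes.txt   (extra.txt is gone)
```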
Step 5: Managing Remote Repositories
When people think of git, they often think of GitHub. As mentioned earlier, GitHub is the world's most popular git hosting platform. If you're working with git, you'll likely need to use such services, making it crucial to understand remote repo management.
Connecting to an Existing Remote Repo
Let's say you have an existing repo on GitHub. First, initialize a local repo (it can be empty) and run:
$ git remote add origin <URL> # "origin" is the conventional remote name; you can choose another
$ git remote -v # Verify the remote connection
Now your local repo is linked to the remote one. Next, pull the content from the remote repo:
$ git pull origin master # The first part is the remote name, the second is the branch name
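This command fetches the content from the remote repo and merges it into your current branch. (If the remote's default branch is called main, as on newer GitHub repos, use main instead of master.) You might wonder how this differs from clone. In short, clone does everything in one step: it creates a new local repo, copies the full history, and registers the source as a remote named origin. The remote add command followed by pull achieves the same link for a local repo that already exists.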
Once you've made changes, you can push them back to the remote repo with:
$ git push origin master
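The whole round trip can be rehearsed locally, with a bare repo playing the part of GitHub. This is a sketch under several assumptions: every path comes from mktemp, the identity is a placeholder, and the branch name is read from git instead of being hard-coded, since your default may be master or main.

```shell
# Sketch: link an empty local repo to an existing remote, pull, then push.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for the GitHub repo

# Seed the "remote" with one commit from a first working copy
git init -q "$work/seed"; cd "$work/seed"
git config user.name "Example"; git config user.email "example@example.com"
branch="$(git symbolic-ref --short HEAD)" # current branch name (master or main)
echo hello > README; git add README; git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q origin "$branch"

# The scenario from the text: an empty local repo and an existing remote
git init -q "$work/local"; cd "$work/local"
git config user.name "Example"; git config user.email "example@example.com"
git remote add origin "$work/remote.git"
git remote -v                             # lists origin for fetch and push
git pull -q origin "$branch"
cat README                                # hello

# Make a change and push it back
echo more >> README
git commit -q -am "Update README"
git push -q origin "$branch"
```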
Forcing a Pull to Overwrite Local Files
Reference: Git force pull to overwrite local files
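Sometimes you need to discard your local state and make your branch match the remote exactly. In such cases, use the commands below. Be careful: the hard reset throws away local commits and uncommitted changes to tracked files. After the reset, the final pull normally has nothing left to do.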
$ git fetch --all
$ git reset --hard origin/master
$ git pull origin master
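A sketch of the effect, again with a local bare repo standing in for the remote server (paths, identity, and file contents are invented for the demonstration):

```shell
# Sketch: fetch + hard reset discards local edits in favor of the remote state.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for the remote server
git init -q "$work/local"; cd "$work/local"
git config user.name "Example"; git config user.email "example@example.com"
branch="$(git symbolic-ref --short HEAD)" # master or main, depending on setup

echo "from remote" > README
git add README; git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q origin "$branch"

echo "local edit" > README                # uncommitted local change

git fetch -q --all
git reset -q --hard "origin/$branch"      # overwrite local state
cat README                                # from remote
```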
Pushing an Existing Local Repo
If you already have a local repo, you'll need to first create an empty repo on GitHub. Then, link it to your local repo:
$ git remote add origin <URL>
$ git add .
$ git commit -m "Initial commit"
$ git push origin master
Since you'll likely be pushing often, you can have git remember the remote and branch by adding -u (short for --set-upstream) to the first push:
$ git push -u origin master
After that, future pushes require only:
$ git push
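Here is a minimal sketch of that shortcut in action. As before, a bare repo in a temporary directory plays the role of the empty GitHub repo, and the identity is a placeholder.

```shell
# Sketch: after "push -u", a bare "git push" knows where to send commits.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # stand-in for the empty GitHub repo
git init -q "$work/local"; cd "$work/local"
git config user.name "Example"; git config user.email "example@example.com"
branch="$(git symbolic-ref --short HEAD)" # master or main, depending on setup

echo hello > README
git add .
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q -u origin "$branch"           # first push records the upstream

git rev-parse --abbrev-ref '@{u}'         # shows the recorded upstream branch

echo more >> README
git commit -q -am "Second commit"
git push -q                               # no remote or branch name needed
```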
Conclusion
Congratulations! You've reached the end of this guide. The content covered here is just a basic introduction to git, aimed at helping you get started quickly. However, mastering git is a long journey, and simply digesting this guide doesn't make you an expert. As with many other Unix tools, becoming proficient requires continuous learning and practice. I wish you smooth sailing as you explore the world of git!