Git Learning Trail
Adapted from CS2103T Git Learning Trail.
RCS: Revision Control Software
Revision Control
Git is a tool used for revision control, which is the process of managing multiple versions of a piece of information. Git is the most widely used RCS today. GitHub is a web-based project hosting platform for projects using Git for revision control. Some benefits of RCS:
- Tracks the history and evolution of a project
- Makes it easier to collaborate with others
- Helps to recover from mistakes
- Helps teams work simultaneously on, and manage drift between, multiple versions of a project
Repositories
The repository is the database that stores the revision history. To apply revision control on files in a directory, you would set up a repo in the project directory, referred to as the working directory of the repo.
- For example, Git uses a hidden folder named .git inside the working directory, to store the database of the working directory’s revision history.
1. Recording Revision History
- Run git init in the project directory to initialize a repository.
$ cd /c/repos/things
$ git init
- Use git status to check the status of the newly created repo.
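The two steps above can be tried safely in a throwaway directory. This is a minimal sketch: the temporary directory and the file name notes.txt are assumptions made for illustration, not part of the original trail.

```shell
# Create a throwaway project directory (hypothetical location) and initialize a repo in it.
workdir=$(mktemp -d)
cd "$workdir"
git init -q

# Git keeps the revision history database in the hidden .git folder.
ls -d .git

# A new file that Git is not tracking yet shows up in the status output.
echo "hello" > notes.txt
status=$(git status --short)
echo "$status"
```

The short status format marks untracked files with ??, so notes.txt is listed until it is staged or ignored.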
Tracking and Ignoring
Staging and Committing: Committing saves a snapshot of the current state of the tracked files in the revision control history. When ready to commit, you first add the specific changes you want to commit to a staging area.
$ git add filename.txt
$ git commit -m "Add filename.txt"
- Use the git log command to see the commit history.
- Note the existence of something called the master branch. Git uses a mechanism called branches to facilitate evolving file content in parallel. Git auto-creates a branch named master, and commits go onto it by default (some Git installations are configured to name this default branch main instead).
$ gitk
- The gitk command opens a rudimentary graphical view of the revision graph.
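As a rough end-to-end illustration of staging, committing, and viewing the log, the sketch below records two commits in a throwaway repo. The temporary directory, the file contents, and the "Example User" identity are placeholders assumed for this example.

```shell
# Throwaway repo with a placeholder identity (both are assumptions for this sketch).
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First snapshot: stage the file, then commit it.
echo "first draft" > filename.txt
git add filename.txt
git commit -q -m "Add filename.txt"

# Second snapshot: a change must be staged again before it can be committed.
echo "second draft" > filename.txt
git add filename.txt
git commit -q -m "Update filename.txt"

# Show the history, newest commit first.
git log --oneline
commit_count=$(git rev-list --count HEAD)
latest_subject=$(git log -1 --format=%s)
```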
Undo/Delete a Commit
To undo the last commit, but keep the changes in the staging area, use the following command.
$ git reset --soft HEAD~1
To undo the last commit and remove the changes from the staging area (but not discard the changes), use --mixed instead of --soft.
$ git reset --mixed HEAD~1
To delete the last commit entirely (i.e., undo the commit and also discard the changes included in that commit), do as above but use the --hard flag instead (i.e., do a hard reset).
$ git reset --hard HEAD~1
To undo/delete the last n commits: HEAD~1 is used to tell Git you are targeting the commit one position before the latest commit. In this case the target commit is the one we want to reset to, not the one we want to undo (as the command used is reset). To undo/delete the last two commits, you can use HEAD~2, and so on.
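The difference between the three reset modes is easiest to see by checking the status after each one. The sketch below does that in a throwaway repo; the file names a.txt and b.txt and the identity are assumptions made for illustration.

```shell
# Throwaway repo with two commits (names and identity are placeholders).
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"
echo "two" > b.txt
git add b.txt
git commit -q -m "Add b.txt"

# --soft: the commit is undone, but b.txt stays in the staging area.
git reset -q --soft HEAD~1
after_soft=$(git status --short)

# Redo the commit, then --mixed: b.txt is kept on disk but is no longer staged.
git commit -q -m "Add b.txt"
git reset -q --mixed HEAD~1
after_mixed=$(git status --short)

# Redo the commit, then --hard: the commit and the file are both discarded.
git add b.txt
git commit -q -m "Add b.txt"
git reset -q --hard HEAD~1
after_hard=$(git status --short)

echo "soft:  $after_soft"
echo "mixed: $after_mixed"
echo "hard:  ${after_hard:-<clean>}"
```

Because --hard throws away the changes, it is worth double-checking the target before running it on real work.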
Omitting files from RCS
To omit files that you don’t want to revision control, such as temporary log files, configure Git to ignore those files.
- Create a file named .gitignore in the working directory root and add the following line in it.
filename.txt
- The .gitignore file tells Git which files to ignore when tracking revision history.
- To version control .gitignore itself, simply commit it as you would commit any other file.
Files recommended to be omitted from VC
- Binary files generated when building your project, e.g., *.class, *.jar, *.exe. (They can be generated again from source code, and RCS tools are optimized for tracking text-based files, not binary files.)
- Temporary files, local files, and sensitive content.
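A quick way to confirm that an ignore rule works is to compare the status output with what git check-ignore reports. This is a sketch in a throwaway repo; readme.txt is a hypothetical second file added only for contrast.

```shell
# Throwaway repo (location is an assumption for this sketch).
workdir=$(mktemp -d)
cd "$workdir"
git init -q

# Ignore filename.txt, as in the .gitignore line above.
echo "filename.txt" > .gitignore
echo "temporary data" > filename.txt
echo "keep me" > readme.txt   # hypothetical file that should still be tracked

# Ignored files do not appear in the status output.
status=$(git status --short)
echo "$status"

# check-ignore exits with status 0 when the path matches an ignore rule.
if git check-ignore -q filename.txt; then ignored=yes; else ignored=no; fi
echo "filename.txt ignored: $ignored"
```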
2. Using Revision History
RCS: Using History
RCS tools store the history of the working directory as a series of commits. You should commit after each change that you want the RCS to ‘remember’.
- Each commit in a repo is a recorded point in the history of the project, uniquely identified by an auto-generated hash, e.g., a16043703f28e5b3dab95915f5c5e5bf4fdc5fc1
- You can tag a specific commit with a more easily identifiable name, e.g., v1.0.2.
- To see what changed between two points of the history, you can ask the RCS tool to diff the two commits in concern.
- To restore the state of the working directory at a point in the past, you can check out the commit in concern, i.e., you can traverse the history of the working directory simply by checking out the commits you are interested in.
Git: tag
Each Git commit is uniquely identified by a hash, e.g., d670460b4b4aece5915caf5c68d12f560a9fe3e4.
Tags are different from commit messages, in purpose and in form. A commit message is a description of the commit that is part of the commit itself. A tag is a short name for a commit, which exists as a separate entity that points to a commit.
To add a tag to the current commit as v1.0:
$ git tag -a v1.0
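The sketch below tags a commit in a throwaway repo and checks that the tag resolves to that commit. The -m option supplies the tag message on the command line so that no editor opens; the message text, file name, and identity are assumptions for illustration.

```shell
# Throwaway repo with one commit (names and identity are placeholders).
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "release notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

# Create an annotated tag on the current commit; -m avoids opening an editor.
git tag -a v1.0 -m "Version 1.0"

# The tag is a separate object that points to the commit.
tagged_commit=$(git rev-parse "v1.0^{commit}")
head_commit=$(git rev-parse HEAD)
tags=$(git tag --list)
echo "tags: $tags"
```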
Git: show, diff
Git can show you what changed in each commit:
$ git show <part-of-commit-hash>
$ git show v1.0
Git can also show the difference between two points in the history of the repo.
- git diff: shows the uncommitted changes since the last commit that have not been staged yet (git diff --staged shows the staged ones).
- git diff 0023cdd..fcd6199: shows the changes between the points indicated by commit hashes.
  - Note that when using a commit hash in a Git command, you can use only the first few characters (e.g., first 7-10 chars) as that’s usually enough for Git to locate the commit.
- git diff v1.0..HEAD: shows changes that happened from the commit tagged as v1.0 to the most recent commit.
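To see how these commands differ, the following sketch builds a small history with one tagged commit, one later commit, and one uncommitted edit. The file name, its contents, and the identity are assumptions made for this example.

```shell
# Throwaway repo (names and identity are placeholders).
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"
git tag -a v1.0 -m "Version 1.0"

echo "line 2" >> notes.txt
git commit -q -am "Add a second line"

echo "line 3" >> notes.txt   # left uncommitted on purpose

# Show the latest commit using only a shortened hash.
short_hash=$(git rev-parse --short HEAD)
git show --stat "$short_hash"

# Uncommitted (unstaged) changes versus changes between the tag and HEAD.
uncommitted=$(git diff)
since_tag=$(git diff v1.0..HEAD)
echo "$uncommitted"
echo "$since_tag"
```

Here the first diff contains only the addition of "line 3", while the second contains only the addition of "line 2".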
Git: checkout
Git can load a specific version of the history to the working directory. Note that if you have uncommitted changes in the working directory, you need to stash them first to prevent them from being overwritten.
Use the git checkout <commit-identifier> command to change the working directory to the state it was in at a specific past commit.
- git checkout v1.0: loads the state as at the commit tagged v1.0
- git checkout 0023cdd: loads the state as at the commit with the hash 0023cdd
- git checkout HEAD~2: loads the state that is 2 commits behind the most recent commit
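A small sketch of travelling through history with checkout: it records three versions of a file, then loads older states by tag and by relative position. The file name, contents, and identity are assumptions; the branch name is read from Git rather than hard-coded, since the default may be master or main.

```shell
# Throwaway repo with three commits (names and identity are placeholders).
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Save version 1"
git tag v1.0

echo "version 2" > notes.txt
git commit -q -am "Save version 2"

echo "version 3" > notes.txt
git commit -q -am "Save version 3"

# Remember the current branch so we can return to it later.
branch=$(git symbolic-ref --short HEAD)

# Load the state as at the tagged commit.
git checkout -q v1.0
at_tag=$(cat notes.txt)

# Return to the branch, then load the state one commit behind the latest.
git checkout -q "$branch"
git checkout -q HEAD~1
one_back=$(cat notes.txt)

# Return to the latest commit on the branch.
git checkout -q "$branch"
latest=$(cat notes.txt)
echo "$at_tag / $one_back / $latest"
```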
Git: stash
You can use Git’s stash feature to temporarily shelve (or stash) changes you’ve made to your working copy so that you can work on something else, and then come back and re-apply the stashed changes later on.
- The git stash command takes your uncommitted changes (both staged and unstaged), saves them away for later use, and then reverts them from your working copy.
- You can reapply previously stashed changes with git stash pop.
  - Popping your stash removes the changes from your stash and reapplies them to your working copy.
- Alternatively, you can reapply the changes to your working copy and keep them in your stash with git stash apply.
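The stash round trip can be sketched as follows, in a throwaway repo with one committed file and one uncommitted edit. The file name, contents, and identity are assumptions for illustration.

```shell
# Throwaway repo with one commit (names and identity are placeholders).
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "committed text" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

# Make an uncommitted change, then shelve it.
echo "work in progress" >> notes.txt
git stash > /dev/null
while_stashed=$(git status --short)   # working copy is clean again

# Bring the change back and drop it from the stash.
git stash pop > /dev/null
after_pop=$(git status --short)
remaining=$(git stash list)

echo "while stashed: ${while_stashed:-<clean>}"
echo "after pop:     $after_pop"
```

Using git stash apply in place of git stash pop would restore the same change but leave the entry in git stash list.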
3. Working with Remote Repos
Remote repositories are repos hosted on remote computers that allow remote access. They are useful for sharing the revision history of a codebase and can serve as a remote backup of your codebase.
- Clone a repo to create a copy of that repo in another location on your computer.
- When you clone from a repo, the original repo is commonly referred to as the upstream repo. A repo can have multiple upstream repos.
- For example, let’s say a repo repo1 was cloned as repo2 which was then cloned as repo3. In this case, repo1 and repo2 are upstream repos of repo3.
- You can pull from one repo to another, to receive new commits in the second repo, but only if the repos have a shared history.
- If new commits are added to the upstream repo after you cloned it, you can copy them over to your own clone by pulling from the upstream repo.
You can push new commits in one repo to another repo, which copies the new commits onto the destination repo. This requires write access to the destination repo. Furthermore, you can push between repos only if the repos have a shared history (i.e., one was created by copying the other at some point in the past).
- Cloning, pushing, and pulling can be done between two local repos too, although it is more common for them to involve a remote repo.
A repo can work with any number of other repositories as long as they have a shared history e.g., repo1 can pull from (or push to) repo2 and repo3 if they have a shared history between them.
- A fork is a remote copy of a remote repo.
- Cloning creates a local copy of a repo.
- In contrast, forking creates a remote copy of a Git repo hosted on GitHub.
- Particularly useful if you want to play around with a GitHub repo but you don’t have write permissions to it; you can simply fork the repo and do whatever you want with the fork as you are the owner of the fork.
- A pull request (PR for short) is a mechanism for contributing code to a remote repo, i.e., “I’m requesting you to pull my proposed changes to your repo”. For this to work, the two repos must have a shared history. The most common case is sending PRs from a fork to its upstream repo.
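Since cloning, pushing, and pulling also work between local repos, the flow can be sketched without any hosting service. In the sketch below, a bare repo named hub.git stands in for the remote, and repo1 and repo2 are two working copies. All paths, names, and the identity are assumptions for illustration.

```shell
# Placeholder identity, set through environment variables so it applies to every repo below.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

base=$(mktemp -d)

# repo1: the original repo with one commit.
git init -q "$base/repo1"
cd "$base/repo1"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"
branch=$(git symbolic-ref --short HEAD)

# hub.git: a bare copy of repo1 that plays the role of the shared remote repo.
git clone -q --bare "$base/repo1" "$base/hub.git"
git remote add origin "$base/hub.git"

# repo2: a second working copy cloned from the hub, so all three share history.
git clone -q "$base/hub.git" "$base/repo2"

# Push a new commit from repo1 to the hub.
echo "v2" >> notes.txt
git commit -q -am "Update notes.txt"
git push -q origin "$branch"

# Pull that commit from the hub into repo2.
cd "$base/repo2"
git pull -q origin "$branch"
repo2_commits=$(git rev-list --count HEAD)
echo "repo2 now has $repo2_commits commits"
```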
4. Working with Branches
Branching is the process of evolving multiple versions of the software in parallel.
- For example, one team member can create a new branch and add an experimental feature to it while the rest of the team keeps working on another branch. Branches can be given names e.g. master, release, dev.
A branch can be merged into another branch.
- Merging usually results in a new commit that represents the changes done in the branch being merged.
Merge conflicts happen when you try to merge two branches that had changed the same part of the code and the RCS cannot decide which changes to keep. In those cases, you have to ‘resolve’ the conflicts manually.
Relevant commands: Git: branch, merge
Branch, Merging
- A Git branch is simply a named label pointing to a commit. The HEAD label indicates which branch you are on.
To start a branch:
$ git branch feature1
$ git checkout feature1
Or, to create the branch and switch to it in one step:
$ git checkout -b feature1
To switch back to the master branch:
$ git checkout master
- Always remember to switch back to the master branch before creating a new branch. If not, your new branch will be created on top of the current branch.
To sync the new branch with the master branch (e.g., after doing commits / bug fixes in the master branch):
$ git checkout feature1
$ git merge master
To merge the new branch to the master branch:
$ git checkout master
$ git merge feature1
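The branching steps above can be run end to end in a throwaway repo. This is a sketch under a few assumptions: the file names and identity are placeholders, git symbolic-ref is used only to make sure the first branch is named master regardless of local configuration, and --no-edit accepts the default merge message so that no editor opens.

```shell
# Throwaway repo whose initial branch is forced to be named master.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Add app.txt"

# Start a branch and commit a feature on it.
git checkout -q -b feature1
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile, a bug fix lands on master.
git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix bug on master"

# Sync the feature branch with master.
git checkout -q feature1
git merge -q --no-edit master

# Merge the feature branch into master.
git checkout -q master
git merge -q --no-edit feature1

git log --oneline --graph
```

In this particular history, the last merge is a fast-forward: master simply moves to the merge commit already created on feature1, so both branch labels end up on the same commit.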
Merge Conflicts
Merge conflicts happen when you try to combine two incompatible versions (e.g., merging a branch to another but each branch changed the same part of the code in a different way).
- During merging, resolve each conflicting part by editing the file.
- Stage the changes, and commit.
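A conflict can be provoked deliberately to practise these steps. In the sketch below, both branches change the same line of word.txt, so the merge stops and has to be resolved by hand. The file name, contents, commit messages, and identity are assumptions; in real use you would edit out the conflict markers in an editor instead of overwriting the file.

```shell
# Throwaway repo whose initial branch is forced to be named master.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "colour" > word.txt
git add word.txt
git commit -q -m "Add word.txt"

# Each branch changes the same line in a different way.
git checkout -q -b feature1
echo "color" > word.txt
git commit -q -am "Use US spelling"
git checkout -q master
echo "hue" > word.txt
git commit -q -am "Use a synonym"

# The merge cannot be completed automatically and exits with a non-zero status.
if git merge --no-edit feature1 > /dev/null 2>&1; then
  merge_result=clean
else
  merge_result=conflict
fi
conflicted=$(git diff --name-only --diff-filter=U)
echo "merge result: $merge_result, conflicted files: $conflicted"

# Resolve by settling the file content, then stage and commit to conclude the merge.
echo "color" > word.txt
git add word.txt
git commit -q -m "Merge feature1 into master"
second_parent=$(git rev-parse --verify -q "HEAD^2")
```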
Remote Branches
Git branches in a local repo can be linked to a branch in a remote repo so the local branch can ‘track’ the corresponding remote branch, and revision history contained in the local and the remote branch pair can be synchronized as desired.
- [A] Pushing a new branch to a remote repo: (Branch exists locally but not in remote repo)
$ git push -u origin add-intro
The -u (or --set-upstream) flag tells Git that you wish the local branch to ‘track’ the remote branch that will be created as a result of this push.
- [B] Pulling a remote branch for the first time: (Branch exists in the remote repo but not locally)
- Fetch details from the remote, list the branches, create a matching local branch and switch to it.
$ git fetch myfork
$ git branch -a
$ git switch -c branch1 myfork/branch1
- The -a flag tells Git to list both local and remote branches.
- The -c flag tells Git to create a new local branch.
- [C] Syncing branches (Push / Pull new changes in local branch to remote branch)
$ git checkout branch1
$ git push origin branch1
$ git pull origin branch1
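Cases [A], [B], and [C] can be rehearsed locally by letting a bare repo stand in for the remote. This is only a sketch: the directory names (local, other, remote.git), the file README.md, and the identity are assumptions, and the remote is called origin in both working copies here (the text above uses myfork for a second remote).

```shell
# Placeholder identity, set through environment variables so it applies to every repo below.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

base=$(mktemp -d)
git init -q "$base/local"
cd "$base/local"
echo "Project" > README.md
git add README.md
git commit -q -m "Add README"

# A bare copy plays the role of the remote repo.
git clone -q --bare "$base/local" "$base/remote.git"
git remote add origin "$base/remote.git"

# [A] Push a branch that exists only locally, and set up tracking.
git checkout -q -b add-intro
echo "Intro" >> README.md
git commit -q -am "Add intro"
git push -q -u origin add-intro
tracking=$(git rev-parse --abbrev-ref "add-intro@{upstream}")

# [B] In a second working copy, create a local branch matching the remote one.
git clone -q "$base/remote.git" "$base/other"
cd "$base/other"
git fetch -q origin
git branch -a
git switch -q -c add-intro origin/add-intro
current=$(git symbolic-ref --short HEAD)

# [C] Sync: push a new commit from one copy, pull it into the other.
echo "More intro" >> README.md
git commit -q -am "Extend intro"
git push -q origin add-intro
cd "$base/local"
git pull -q origin add-intro
```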