Searching in Git

The history tracked by Git is a rich source of information on how a project evolved. Every commit captures what changed, when it changed and by whom. In this post, we'll look at ways of searching through the captured information.

Search the commit messages

Search by a commit message with log's --grep. For example, to look for commit messages containing "darwin":

git log --grep="darwin"

By default, the pattern ("darwin" in the above example) is a basic regular expression. Adding -E tells Git to treat it as an extended regular expression, and -P treats it as a Perl-compatible regular expression (provided your Git was built with PCRE support). For example:

git log -E --grep="s(tr)e"  # Extended
git log -P --grep="\w\d"    # Perl compatible
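
To see the difference between the basic and extended modes, here is a minimal sketch in a throwaway repository. The identity, the commit messages and the temporary directory are made up for illustration, and it assumes Git's default configuration. The point is that | only acts as an alternation operator once -E is given.

```shell
#!/bin/sh
set -e
# Throwaway repository; the identity and commit messages are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Fix crash on darwin"
git commit -q --allow-empty -m "Add linux build"
git commit -q --allow-empty -m "Update docs"

# Basic regex (the default): "|" is an ordinary character, so no message matches.
basic=$(git log --format=%s --grep='darwin|linux')

# Extended regex: "|" separates alternatives, so two messages match.
extended=$(git log --format=%s -E --grep='darwin|linux')

echo "basic matches:    ${basic:-none}"
echo "extended matches:"
echo "$extended"
```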

Search the commit's diff

The log command has two options for finding commits with a specific change: -G and -S. The difference between the two:

  1. -G matches when the pattern appears in any added or removed line. -S matches only when the change alters the number of occurrences of the string. In other words, the change as a whole must add or remove a match, not merely edit a line that still contains it.
  2. -G always treats its argument as a regex, whereas -S treats it as a literal string by default. You must include --pickaxe-regex when you want -S to treat the argument as a regex.

For example, let's say we have a file with the following history (commit 1 being the first commit and commit 5 the latest):

Commit #   Content
1          value = 1
2          value = hash
3          value = hash + 1
4          value = hash + 1
           print("Line 2")
5          value = 0
           print("Line 2")

If we want to find all changes involving the assignment = hash, we could try:

git log --oneline -G '= hash'

This will output only the following commits:

  • commit 2 (= hash was added to the file)
  • commit 3 (changed line contains = hash) and
  • commit 5 (= hash was removed from the file)

It doesn't output commit 4 because that commit's diff (the newly added line 2) doesn't contain = hash.

What would happen when we use -S instead? For example:

git log --oneline -S '= hash'

The output will contain only:

  • commit 2 (the change adds = hash) and
  • commit 5 (the change removes = hash)

Commit 3 is no longer matched because the number of occurrences of = hash is the same before and after the change.
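
The five-commit history above is easy to rebuild, which makes it a handy way to check the behaviour for yourself. The sketch below assumes a file called demo.py and commit messages of the form "commit N"; both are invented for the illustration. The last command adds --pickaxe-regex so that -S counts regex matches instead of literal occurrences.

```shell
#!/bin/sh
set -e
# Throwaway repository that recreates the five-commit history from the table.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# commit <message>: stage demo.py (a made-up file name) and commit it.
commit() {
    git add demo.py
    git commit -q -m "$1"
}
printf 'value = 1\n' > demo.py;                        commit "commit 1"
printf 'value = hash\n' > demo.py;                     commit "commit 2"
printf 'value = hash + 1\n' > demo.py;                 commit "commit 3"
printf 'value = hash + 1\nprint("Line 2")\n' > demo.py; commit "commit 4"
printf 'value = 0\nprint("Line 2")\n' > demo.py;        commit "commit 5"

# -G: any added or removed line contains "= hash" -> commits 2, 3 and 5.
with_G=$(git log --format=%s --reverse -G '= hash')

# -S: the number of occurrences of "= hash" changed -> commits 2 and 5.
with_S=$(git log --format=%s --reverse -S '= hash')

# -S with a regex: the number of "value = <digit>" matches changed -> commits 1, 2 and 5.
with_S_regex=$(git log --format=%s --reverse -S 'value = [0-9]' --pickaxe-regex)

printf -- '-G:\n%s\n-S:\n%s\n-S --pickaxe-regex:\n%s\n' "$with_G" "$with_S" "$with_S_regex"
```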

Searching for a change within a diff

The diff command supports the same -G, -S and --pickaxe-regex options. The difference is that they narrow a single diff down to the files whose changes match, instead of selecting commits. For example, to find any unstaged changes involving setBarBackgroundColor we could try:

git diff -G "setBarBackgroundColor"
git diff -S "setBarBackgroundColor"
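
As a rough illustration, the sketch below makes two unstaged edits in a throwaway repository and asks diff to show only the file whose change mentions setBarBackgroundColor. The file names chart.js and report.js and their contents are hypothetical.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Two tracked files (hypothetical names and contents).
printf 'chart.draw()\n' > chart.js
printf 'title = "Report"\n' > report.js
git add chart.js report.js
git commit -q -m "Initial version"

# Two unstaged edits; only one of them mentions setBarBackgroundColor.
printf 'chart.setBarBackgroundColor("red")\nchart.draw()\n' > chart.js
printf 'title = "Annual report"\n' > report.js

# Without a filter, both files show up as changed.
all_changed=$(git diff --name-only)

# With -G, only the file whose changed lines match is listed.
matching=$(git diff --name-only -G 'setBarBackgroundColor')

echo "all changed files:"
echo "$all_changed"
echo "files matching -G: $matching"
```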

Searching for a "deleted" file

The good thing about Git is that "deleted" files are kept in your revision history (assuming the file was added and committed at one stage). You can even still ask Git for the commits related to a deleted file. For example, if pyups/arrays.py was deleted, you can still see its related commits with:

git log -- pyups/arrays.py

What if you only remember the name of the file but not the path? Use ** to look through all directories. For example, if we only remember the file was called arrays.py:

$ git log --oneline --name-status -- '**/arrays.py'
1a4be71 Remove "arrays.py"
D       pyups/arrays.py
a313485 Create initial version
A       pyups/arrays.py

Here, the output shows when arrays.py was added (the A under a313485) and deleted (the D under 1a4be71). The --diff-filter option filters by how the file changed. For example, to select only the commit where arrays.py was deleted:

$ git log --oneline --name-status --diff-filter='D' -- '**/arrays.py'
1a4be71 Remove "arrays.py"
D       pyups/arrays.py

The diff command also recognises --name-status and --diff-filter. For example, to list the files deleted over the last five commits:

$ git diff --name-status --diff-filter='D' HEAD~5 HEAD
D       pyups/arrays.py
D       tests/test_arrays.py
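
If you would like to try this without a real project, the following sketch adds and then deletes pyups/arrays.py in a throwaway repository and runs the same two log queries. The file contents and commit messages are placeholders.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Add the file, then delete it in a second commit (contents are placeholders).
mkdir pyups
printf 'def first(items):\n    return items[0]\n' > pyups/arrays.py
git add pyups/arrays.py
git commit -q -m "Create initial version"
git rm -q pyups/arrays.py
git commit -q -m "Remove arrays.py"

# Every commit that touched a file named arrays.py, in any directory.
history=$(git log --format=%s -- '**/arrays.py')

# Only the commit that deleted it.
deleted_in=$(git log --format=%s --diff-filter=D -- '**/arrays.py')

echo "history:"
echo "$history"
echo "deleted in: $deleted_in"
```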

When was a file added or renamed?

Set --diff-filter to A to find when files were added or R for when files were renamed. For example:

$ git log --oneline --name-status --diff-filter='A' -n 3  # When files were added
2170dd3 Added github actions
A       .github/workflows/ci.yml
7359578 Added a new NasaSkin to Medusa
A       src/main/java/eu/hansolo/medusa/skins/NasaSkin.java
A       src/main/resources/eu/hansolo/medusa/Estricta-Medium.otf
A       src/main/resources/eu/hansolo/medusa/Estricta-MediumItalic.otf
A       src/main/resources/eu/hansolo/medusa/Estricta-Regular.otf
A       src/main/resources/eu/hansolo/medusa/Estricta-RegularItalic.otf
0f45231 Updates to fix the build problems with gradle 6.5
A       gradle/LICENSE_HEADER

$ git log --oneline --name-status --diff-filter='R' -n 3 # When files were renamed
6c8b339 Renamed the DigitalClockSkin to SlimClockSkin
R094    src/main/java/eu/hansolo/medusa/skins/DigitalClockSkin.java     src/main/java/eu/hansolo/medusa/skins/SlimClockSkin.java
66b597b Added a demo package with two new demos
R095    src/main/java/eu/hansolo/medusa/Demo.java       src/main/java/eu/hansolo/medusa/demos/OverviewDemo.java
a3a1848 Replaced the FramedGauge with a FGauge
R061    src/main/java/eu/hansolo/medusa/FramedGauge.java        src/main/java/eu/hansolo/medusa/FGauge.java

You can also use multiple statuses at the same time. For example, to find when FlatUiColor.java was added and when it was removed:

$ git log --oneline --name-status --diff-filter='AD' -- '**/FlatUiColor.java'
a4efdd8 Removed FlatUiColor and all dependencies from lib
D       src/main/java/eu/hansolo/medusa/tools/FlatUiColor.java
4b6787f Added FlatUiColor class that contains the flat ui colors
A       src/main/java/eu/hansolo/medusa/tools/FlatUiColor.java
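
The rename case can be reproduced in a throwaway repository too. In this sketch the file names are hypothetical, and -M is passed explicitly so that rename detection is on even if it has been disabled in the local configuration. Because the file is moved without being edited, Git reports it as a 100% match (R100).

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Add a file, then rename it without touching its contents (hypothetical names).
mkdir src
printf 'class Demo {}\n' > src/Demo.java
git add src/Demo.java
git commit -q -m "Add Demo"
git mv src/Demo.java src/OverviewDemo.java
git commit -q -m "Rename Demo to OverviewDemo"

# Commits that renamed a file.
renamed_in=$(git log --format=%s -M --diff-filter=R)

# The same query with the status column, showing the old and new paths.
details=$(git log --oneline --name-status -M --diff-filter=R)

echo "renamed in: $renamed_in"
echo "$details"
```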

What commits did I make yesterday? The last n days or weeks?

Git log lets you ask for commits from a certain date. Even better, you can just ask for the commits from the last n hours, days or weeks! For example, if you want to find all commits made during the last two weeks:

git log --since '2 weeks ago'

You can also use years, months, days, hours, minutes or even seconds. For any other period, use both --since and --until. For example, for the period between 22 March 2021 and 2 April 2021:

git log --since '2021-03-22' --until '2021-04-02'

And if you want only your own commits, add --author. For example:

git log --since yesterday --author='Kah Goh'
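
Here is a small sketch for experimenting with these options. It creates empty commits with fixed dates and two made-up authors (Alice Example and Bob Example) by setting Git's GIT_AUTHOR_* and GIT_COMMITTER_* environment variables, then queries the same date range with and without --author. The commit_as helper is hypothetical and exists only for this example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# commit_as <name> <date> <message>: empty commit with a fixed identity and date.
commit_as() {
    GIT_AUTHOR_NAME="$1" GIT_AUTHOR_EMAIL="demo@example.com" \
    GIT_COMMITTER_NAME="$1" GIT_COMMITTER_EMAIL="demo@example.com" \
    GIT_AUTHOR_DATE="$2" GIT_COMMITTER_DATE="$2" \
        git commit -q --allow-empty -m "$3"
}
commit_as "Alice Example" "2021-03-10T12:00:00" "Before the range"
commit_as "Alice Example" "2021-03-25T12:00:00" "Alice, inside the range"
commit_as "Bob Example"   "2021-03-30T12:00:00" "Bob, inside the range"
commit_as "Alice Example" "2021-04-20T12:00:00" "After the range"

# Everything committed between the two dates.
in_range=$(git log --format=%s --since '2021-03-22' --until '2021-04-02')

# The same range, restricted to one author.
by_alice=$(git log --format=%s --since '2021-03-22' --until '2021-04-02' --author='Alice')

echo "in range:"
echo "$in_range"
echo "by Alice: $by_alice"
```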

Note on dates

Git matches dates by the commit date, not the author date. If the output has commits with dates outside the range, you could be seeing the author date.

What is the difference between the commit and author date? The author date is typically when the commit was first made and the commit date is when the commit was added or last edited. So, when a commit is first added, the commit date is typically the same as the author date. Amending the commit updates the commit date, but not the author date.
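
To make the distinction concrete, the sketch below creates one commit whose author date falls outside the range and whose commit date falls inside it. The dates and the message are invented. The %ad and %cd placeholders print the author and commit dates, so you can see which one --since and --until compared against.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Author date is 1 March, commit date is 28 March (as if amended later).
GIT_AUTHOR_DATE="2021-03-01T12:00:00" GIT_COMMITTER_DATE="2021-03-28T12:00:00" \
    git commit -q --allow-empty -m "Written early, amended late"

# The commit is found because its commit date is inside the range,
# even though the author date is not.
found=$(git log --date=short --format='%s | author %ad | commit %cd' \
    --since '2021-03-22' --until '2021-04-02')

echo "$found"
```

Adding --format=fuller to a plain git log shows both dates for every commit, which helps when a result looks out of range.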

Grep files at revision

Git has its own grep command. Unlike GNU grep, it searches only the files tracked in the repository by default. This is useful if you want to look for occurrences of a pattern at a particular revision! For example, the following looks for array in pyups/arrays.py as it was just before revision 1a4be71:

git grep 'array' 1a4be71^ -- pyups/arrays.py

No need to check out the revision!
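
A quick way to convince yourself is the sketch below, which deletes a file in a throwaway repository and then greps the revision just before the deletion. The file contents are placeholders, and HEAD^ stands in for 1a4be71^ because the hashes in a fresh repository will differ.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Add the file, then delete it (contents are placeholders).
mkdir pyups
printf 'def first(array):\n    return array[0]\n' > pyups/arrays.py
git add pyups/arrays.py
git commit -q -m "Create initial version"
git rm -q pyups/arrays.py
git commit -q -m "Remove arrays.py"

# The working tree no longer has the file, so a plain grep finds nothing.
# git grep exits with a non-zero status when there are no matches, hence "|| true".
now=$(git grep -n 'array' || true)

# HEAD is the commit that deleted the file, so HEAD^ still contains it.
before=$(git grep -n 'array' HEAD^ -- pyups/arrays.py)

echo "matches now: ${now:-none}"
echo "matches at HEAD^:"
echo "$before"
```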

Conclusion

We have just looked at some ways of querying Git. Whether it be a commit message or diff in the content, Git provides a number of ways to look for them. Hope you find this useful!
