This blog post is part of a series on everyday tools and strategies for code review, drawn from Drupal contribution experiences. xjm is a Drupal 8 core maintainer and release manager.
If you have spent much time developing software with others, you've probably asked yourself some of these questions at one time or another:
- How did this ever work?
- Who can explain this code to me?
- Why did we do that in the first place?
- Where did this bug come from anyway?
- If that's broken, what else is?
One of my favorite tools to answer these and other mystifying questions is git blame. It's like a time machine for that bit of code in your git repository that's making you scratch your head.
A quick example with git blame
Many IDEs and other tools provide user interfaces for the git history, but for this example, we'll use the command line. Let's say you find yourself questioning a specific design choice in a project, on lines 46 and 47 of src/Branding.php
:
increaseSize();
?>
You can use git blame
to check the history of these lines:
$ git blame src/Branding.php -L46,47
a2487ea5 (dsmith 2016-01-14 09:33:50 +0000 46) // Make the logo bigger.
a2487ea5 (dsmith 2016-01-14 09:33:50 +0000 47) $logo->increaseSize();
This shows you that these lines were most recently changed in git commit a2487ea5
by dsmith. You can inspect that commit with git show
:
$ git show a2487ea5
commit a2487ea57ea5594f5e2a54b3fec5cacb33030863
Author: dsmith <[email protected]>
Date: Thu Jan 14 09:33:50 2016 +0000
Client request for changed logo scale in ticket #329.
diff --git a/src/Branding.php b/src/Branding.php
index 3d60ff4..05c5d32 100644
--- a/src/Branding.php
+++ b/src/Branding.php
@@ -45,3 +45,8 @@
+ // Make the logo bigger.
+ $logo->increaseSize();
+
}
The commit shows you that the full change was just adding these two lines, and according to the commit message, it was at the client's request in ticket #329. So you presumably shouldn't just change it back, and you can probably refer to that ticket for more details.
Using git blame for Drupal core contribution
In my role as a Drupal 8 release manager, one of my responsibilities is reviewing Drupal 8 core patches that have been Reviewed and Tested by the Community to decide whether they are accepted for Drupal 8. I'm responsible for not only helping ensure that each change is correct in itself, but that it fits in the "big picture" for Drupal 8 and moves us toward the next release. By learning when and why a bit of code was introduced, I can check that we're not reverting something intentional, and become more confident that we're fixing the whole problem rather than just a symptom or single part.
An example: While studying the code added in #2409701: Field storage configuration is not exposed to config translation UI, I had some doubts about the parent API. An inline code comment in the existing codebase made me wonder whether there had been an earlier, similar bug (which could have meant a more complete fix was possible). So, I used git blame
to review the history of the code comment:
$ git blame core/modules/config_translation/src/ConfigEntityMapper.php -L150,154
cc7cae5f core/modules/config_translation/lib/Drupal/config_translation/ConfigEntityMapper.php (webchick 2013-11-17 18:41:04 -0800 150) // Add the list of configuration IDs belonging to this entity. We add on a
$ git show cc7cae5f
commit cc7cae5fe14ac9abe62e84467cd09534c269c913
Author: webchick <[email protected]>
Date: Sun Nov 17 18:41:04 2013 -0800
Issue #1952394 by vijaycs85, tstoeckler, webflo, Gábor Hojtsy, Schnitzel, falcon03, YesCT, kfritsche, Ryan Weal, dagmita, likin, toddtomlinson, nonsie, Kristen Pol, dawehner, tim.plunkett, penyaskito, EclipseGC, larowlan, robertdbailey, helenkim, David Hernández, EllaTheHarpy, lazysoundsystem, juanolalla, R.Hendel, Kartagis: Add configuration translation user interface module.
In Drupal core, we put the relevant Drupal.org issue ID in the commit message for each issue. This allowed me to easily locate #1952394: Add configuration translation user interface module in core. By reading that issue, I learned more about the code's history and guessed that my colleague Gábor Hojtsy, the Multilingual initiative coordinator and a subsystem maintainer for the subsystem, would be a good resource for my questions.
Gábor explained the purpose of the code and that it pre-dated the other API I would have suggested. Since Drupal 8 was very close to a release candidate, I judged that the straightforward fix was indeed the best way to address the bug (rather than suggesting changes to the parent implementation).
Changes in Drupal core are committed by a small team of branch maintainers (including me), but these maintainers commit patches from thousands of contributors. This means the commit author listed by git blame
doesn't give the full picture of who contributed. When you're deciding whom to blame, it's best to look at the relevant issue for the commit. ;) The contributors to that issue can help answer your questions about it.
Going further back
Sometimes, git blame
and other version control history tools will not give you the commit you're looking for, but instead some intermediate unrelated change. This is especially common in Drupal 8's git history, because of the extensive refactoring done since Drupal 7. Don't be perturbed when this happens -- instead, use your git time machine to go back in time to before the intermediate change was made.
In git, the ^
character points to the commit immediately before a given reference. So, in the first example above, you can get the codebase from immediately before the client's request was implemented:
git checkout a2487ea5^
Then you can explore the whole project's code from that point, perhaps to see what prompted the client's request at the time and whether there might be an alternate solution.
You can also use this strategy when code is moved between files or changed in other ways that git cannot track. In Drupal 8, it's common that the git history will appear to only go back to certain significant refactorings. For example, all automated tests in Drupal core were moved from shared module_name.test
files in Drupal 7 to many individual PSR-0 (and later PSR-4) class files for Drupal 8. To get the full history of tests written before this change, simply check out the commit immediately before the relevant PSR-0 conversion, and continue your git blame
exploration from that point back.
Full history of a line with git log -L
git log also supports its own version of the -L
option, which is a handy way to see the full history of a line at once. (Thanks to Moshe Weitzman for pointing me to this feature.) To try this example, jump back in time to the core codebase I'm using right now:
git clone --branch 8.1.x https://git.drupal.org/project/drupal.git
cd drupal
git checkout b26bac
Try exploring the history of the core Node module's help documentation, currently on line 84 of node.module
:
git log -L84,84:core/modules/node/node.module
This gives you a log of only the commits that changed that line, showing the relevant diff hunk for each. Page back through the history and see how the Node module's help has changed over the past fourteen years. Note how the context shown for the diffs expands as you go back to include the other lines changed in those hunks of the related commits.
Recommendations and further reading
git blame
and other version control features work best when you take care with your commit history. Use clear, specific commit messages and scope your commits appropriately. Where possible, include a ticket/issue reference or link in either the commit message or git notes.
Further reading:
- git-blame documentation
- git-log documentation
- Git revision selection
- Drupal.org handbook page on git blame
- Drupal.org issue #2285061: [feature regression] Bring back git blame UI
- PHPStorm: Viewing Changes History for a File or Selection (I'll share more about PHPStorm in an upcoming post.)
Have a favorite tool for exploring the git blame
history? Leave a comment!