
Mergetools: Stop doing three-way merges!

Update 2020-12-19: This post took a different direction than I intended. Thanks to Felipe Contreras and several other people on the Git mailing list there is a patch and discussion underway to make this change in upstream Git rather than in individual mergetools. As such, I've updated this post to reflect what ramifications that upstream change will have on the mergetools surveyed below. The original post is still available.

Conflict Resolution

When there is a merge conflict in Git there are several versions of the conflicted file that all represent different times in the lifecycle of that file:

The LOCAL version of the file is what the file looks like on your branch before the merge started.

The REMOTE version of the file is what the file looks like on the other branch before the merge started.

The BASE version of the file is what the file looked like at the point where your branch and the other branch diverged. It comes from the most recent common ancestor of both branches.

When there is a conflict a tool that performs conflict resolution will compare those three files against one another in order to try and resolve any conflicting changes without human intervention. Any conflicts that cannot be automatically resolved must be resolved manually by a person.

Git is one such tool that performs conflict resolution but there are also many others. In general, a conflict resolution algorithm will produce the best results by starting with all three versions of the conflicted file instead of just looking at the latest two versions. An excellent algorithm, such as the one Git uses, will do even more work.

If there is a conflict that must be resolved manually then Git will write a fourth file named MERGED which contains everything Git was able to resolve by itself and also everything that it was not able to resolve. This is the file containing conflict markers that you may already be familiar with.
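To make the idea concrete, here is a minimal sketch of a three-way merge that works line by line and leaves conflict markers where it cannot decide. This is not Git's implementation. The function name merge_lines and the LOCAL/REMOTE marker labels are hypothetical, chosen for illustration, and the sketch assumes all three versions have the same number of lines, which real merge algorithms do not require.

```python
def merge_lines(base, local, remote):
    """Naive line-by-line three-way merge (illustrative only, not Git's algorithm).

    Assumes the three inputs are lists of lines of equal length.
    """
    if not (len(base) == len(local) == len(remote)):
        raise ValueError("this sketch only handles equal-length inputs")

    merged = []
    for b, l, r in zip(base, local, remote):
        if l == r:
            # Both sides agree: either untouched or changed identically.
            merged.append(l)
        elif l == b:
            # Only the other branch changed this line, so take its version.
            merged.append(r)
        elif r == b:
            # Only our branch changed this line, so take our version.
            merged.append(l)
        else:
            # Both sides changed the line differently: leave a conflict.
            merged += ["<<<<<<< LOCAL", l, "=======", r, ">>>>>>> REMOTE"]
    return merged


base = ["one", "two", "three"]
local = ["ONE", "two", "three!"]
remote = ["one", "TWO", "three?"]

for line in merge_lines(base, local, remote):
    print(line)
```

Without BASE there would be no way to tell that the first line was changed only on our side and the second only on theirs, so all three lines would look like conflicts. With BASE, only the third line is left for a person to resolve.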

The most notable thing about MERGED is that a file containing conflict markers represents a two-way diff. Writing conflict markers is a nice, simple, and static way to represent conflicts. Conflicts may be visualized directly by just looking at the file (though it's very difficult to spot subtle differences), or those same conflicts may be visualized another way using specialized tools often called "mergetools" in the Git ecosystem.

Mergetool Categories

There are three "categories" of mergetools that I've seen in my limited travels. There are many others that I haven't seen yet, and there's every likelihood that I have miscategorized some of them, so please take this broad categorization with a grain of salt. Corrections are very welcome.

Blind Diff

Most mergetools surveyed below do not perform their own conflict resolution, nor do they make use of Git's conflict resolution, but rather they simply present the user with a diff of two or more files.

Often this is a diff of LOCAL and REMOTE. As explained above, this approach will often present the end-user with unnecessary differences that have already been resolved by Git. This forces the user to re-resolve those differences by hand.
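A rough way to see the cost of a blind diff is to count how many lines a two-way diff flags. The sketch below uses Python's difflib for that. The sample lines and the changed_line_count helper are made up for illustration, and the two "halves" are written by hand to stand in for what the files would look like once the non-conflicting changes have already been resolved.

```python
import difflib


def changed_line_count(a, b):
    """Count the lines a two-way diff of a and b would flag as different."""
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


# Hypothetical versions of a three-line file.
local = ["ALPHA", "beta", "gamma one"]    # our branch changed lines 1 and 3
remote = ["alpha", "BETA", "gamma two"]   # their branch changed lines 2 and 3

# Hand-written stand-ins for the same file after the non-conflicting
# changes (lines 1 and 2) were already merged; only line 3 still differs.
local_half = ["ALPHA", "BETA", "gamma one"]
remote_half = ["ALPHA", "BETA", "gamma two"]

print(changed_line_count(local, remote))            # 3
print(changed_line_count(local_half, remote_half))  # 1
```

In this toy case the blind diff asks the user to look at three differences when only one of them is a real conflict.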

It is also common to diff LOCAL and REMOTE and BASE. This approach will usually produce quite a lot of unhelpful visual noise and forces the end-user to perform all the same mental steps that a merge algorithm would perform -- and, again, steps Git's merge algorithm already performed.

Finally, another common configuration is to diff LOCAL and REMOTE and BASE and MERGED. This approach produces an impenetrable amount of visual noise and is effectively useless.

Some mergetools do allow the user to selectively turn off the diff comparison in order to only compare two panes at a time. This helps to reduce visual noise but still requires the end-user to manually resolve all conflicts.

Custom Merge Algorithm

More sophisticated mergetools have their own conflict resolution algorithms. Sometimes these algorithms are quite clever. Although Git's algorithm is excellent and has many options it is by no means the final word and innovation in conflict resolution algorithms is alive and well. We want other tools to compete with Git in this arena because it will have positive outcomes for everyone.

As described above, a conflict resolution algorithm will almost certainly want to start with LOCAL, REMOTE, and BASE, and any additional information about the merge or file history can help.

A mergetool with a custom conflict resolution algorithm may want to look at the result of Git's algorithm that is stored in MERGED or it may want to do its own thing entirely. Both approaches are fine -- Git does a great job but maybe somebody else can do better.

Reuse Git's Algorithm

The last category of mergetools relies entirely on the conflict resolution that Git automatically performs and stores in MERGED. They usually work by splitting MERGED into two halves and showing the end-user each half as a two-way diff.

This is a very simple approach that presents the smallest amount of visual noise to the end-user and relies on Git to do all of the hard work. (Which, it should be noted, Git is already doing anyway.)

These tools may, optionally, show the end-user additional information that could be useful in understanding the file history leading up to the conflict. This often includes temporarily showing LOCAL, REMOTE, or BASE or invoking additional Git commands to show the file history. However, the actual conflict resolution is done by resolving the two halves of MERGED that contain the minimal, remaining conflicts.
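Splitting MERGED into two halves can be sketched in a few lines. The following is an illustration of the general idea rather than the code any particular mergetool uses. The function name split_merged is hypothetical, and the sketch assumes the standard conflict markers (with an optional diff3-style base section introduced by a line of pipe characters) and ignores edge cases such as nested markers.

```python
def split_merged(lines):
    """Split conflict-marked lines into (local_half, remote_half).

    Lines outside conflict regions go to both halves. Inside a conflict,
    the part before ======= goes to the local half and the part after it
    goes to the remote half. A diff3-style base section is dropped.
    """
    local_half, remote_half = [], []
    state = "both"  # one of: "both", "local", "base", "remote"

    for line in lines:
        if line.startswith("<<<<<<<") and state == "both":
            state = "local"
        elif line.startswith("|||||||") and state == "local":
            state = "base"
        elif line.startswith("=======") and state in ("local", "base"):
            state = "remote"
        elif line.startswith(">>>>>>>") and state == "remote":
            state = "both"
        elif state == "both":
            local_half.append(line)
            remote_half.append(line)
        elif state == "local":
            local_half.append(line)
        elif state == "remote":
            remote_half.append(line)
        # Lines in the "base" section belong to neither half.

    return local_half, remote_half


merged = [
    "ONE",
    "TWO",
    "<<<<<<< LOCAL",
    "three!",
    "=======",
    "three?",
    ">>>>>>> REMOTE",
]
ours, theirs = split_merged(merged)
print(ours)    # ['ONE', 'TWO', 'three!']
print(theirs)  # ['ONE', 'TWO', 'three?']
```

The two resulting lists differ only where a real conflict remains, which is what makes the two-way diff of the halves so quiet.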

autoMerge Proposal

There is a patch and discussion underway in upstream Git to add a flag that will make the Blind Diff mergetools work more like the tools that Reuse Git's Algorithm by splitting MERGED and overwriting LOCAL and REMOTE with each half.

This flag will allow these tools to benefit without making any other changes. At the time of this writing the proposal is to enable the flag by default.

Mergetools that want to display the original versions of LOCAL and REMOTE, or tools that want to use those original versions in their own conflict resolution algorithm, may toggle this flag off. Mergetools that want the original versions of those files and the result of Git's resolution can simply disable the flag and split MERGED themselves.

Given the large prevalence of tools in that first category, defaulting to an opt-out setting will positively affect many more users than an opt-in setting would. Plus the authors of more sophisticated mergetools that prefer it to be disabled are better able to recognize the pros and cons and make an informed choice.


Mergetool Comparison

Below is a comparison of several default mergetools that ship with Git plus some other popular tools. I'll try to add others to the list over time. Fixes and contributions are welcome.

In addition, the tools surveyed below also have a before/after summary to visualize the ramifications of the new autoMerge flag proposal.

This uses a script in the diffconflicts repository that generates subtle merge conflicts. Here are some results to watch for in the comparisons:

  1. The bri1lig -> brillig conflict was automatically resolved. It should not be shown to the user.
  2. The m0me -> mome conflict was automatically resolved. It should not be shown to the user.
  3. The did -> Did conflict was automatically resolved. It should not be shown to the user.
  4. All conflicts in the second stanza were automatically resolved. They should not be shown to the user.
  5. The conflict on the first line is an "ours vs. theirs" situation. We only want theirs.
  6. The conflict on the third line is not an "ours vs. theirs" situation. We want changes from both:
    • Want the capitalization change from theirs.
    • Want the extra 'r' removal from ours.
    • Want the hanging punctuation change from ours.
  7. The conflict on the fourth line should be easily noticeable. We want the 'r'.

Araxis Merge

Category: Blind Diff — diffs LOCAL, REMOTE, & BASE.

Before autoMerge:

After autoMerge:

Summary: unnecessary conflicts are no longer shown; no adverse effects.

Suggestions for tool authors:

Beyond Compare

Category: Blind Diff — diffs LOCAL, REMOTE, & BASE.

Before autoMerge:

After autoMerge:

Summary: unnecessary conflicts are no longer shown; no adverse effects.

Suggestions for tool authors:

DiffMerge

Category: Custom Merge Algorithm

Before autoMerge:

After autoMerge:

Summary: unnecessary conflicts are no longer shown; no adverse effects on custom merge algorithm or end-result.

Suggestions for tool authors:

kdiff3

Category: Blind Diff & Custom Merge Algorithm

Before autoMerge:

Default view after autoMerge:

"Auto solve" after autoMerge:

Summary: unnecessary conflicts are no longer shown; no adverse effects on custom merge algorithm or end-result; end result is identical to builtin "auto solve" results.

Suggestions for tool authors:

Meld

Category: Blind Diff — diffs LOCAL, REMOTE, & BASE.

Before autoMerge:

After autoMerge:

Summary: unnecessary conflicts are no longer shown; no adverse effects.

Suggestions for tool authors:

Sublime Merge

Category: Custom Merge Algorithm

Before autoMerge:

The way Sublime Merge represents differences is much less visually distracting and the merge algorithm works well. It's tantalizingly close to being useful in grasping the history of the conflict at-a-glance.

After autoMerge:

Identical output. I believe Sublime Merge opens files from the repository by itself rather than integrating with Git's mergetool. This also seems true when using the CLI smerge to invoke Sublime Merge.

Summary: Identical output; no benefits; no adverse effects.

Suggestions for tool authors:

SmartGit

Category: Blind Diff & Reuse Git's Algorithm (sort of)

The tool does an excellent job of navigating a whole repository, not just resolving conflicts, and provides easy access to file history and the state of the repository. This is exactly the kind of tool that helps new programmers to see what needs to happen and helps seasoned programmers find relevant info quickly. The conflict resolution features are the weakest features (see suggestions below).

Default view before autoMerge:

"Conflict resolver" before autoMerge:

After autoMerge:

SmartGit opens files from the repository itself. I couldn't find a CLI util to use as a mergetool. (Corrections very welcome.)

Summary: Identical output; no benefits; no adverse effects.

Suggestions for tool authors:

Fork

Category: Blind Diff — diffs LOCAL, REMOTE, & BASE.

Before autoMerge:

After autoMerge:

Fork opens files from the repository itself. I couldn't find a CLI util to use as a mergetool. (Corrections very welcome.)

Summary: Identical output; no benefits; no adverse effects.

Suggestions for tool authors:

P4Merge

Category: Blind Diff — diffs LOCAL, REMOTE, & BASE.

Before autoMerge:

After autoMerge:

Summary: unnecessary conflicts are no longer shown; no adverse effects.

Suggestions for tool authors:

IntelliJ

Category: Blind Diff & Custom Merge Algorithm

Default view before autoMerge:

"Resolve simple conflicts" view before autoMerge:

After autoMerge:

Identical output. I could not figure out how to install the IntelliJ CLI util and configure it as a Git mergetool. (Corrections very welcome.)

Summary: Identical output; no benefits; no adverse effects.

Suggestions for tool authors:

Tortoise Merge

Category: Custom Merge Algorithm (?)

Before autoMerge:

Tortoise appears to automatically resolve trivial conflicts without user intervention.

After autoMerge:

Identical output. Although I'm not convinced I'm using Tortoise correctly (see below), the result is identical with and without autoMerge enabled so I think we're safe to say this is a no-harm change for Tortoise.

Summary: Identical output; no benefits; no adverse effects.

Additional notes:

I mischaracterized Tortoise on my first pass. It does appear to perform its own conflict resolution. Since I'm reviewing so many tools at once it's hard to spend more than an hour learning any one tool in depth and I don't think I'm giving Tortoise a fair shake; corrections welcome. I invoked it using Git-for-Windows with git mergetool -t tortoisemerge but I may have something misconfigured -- the bottom panel says MERGED but looks like BASE. I'm honestly not sure if repeating the contents of BASE in all three panes is intentional or a sign of that misconfiguration. The new-user thoughts below are assuming it's intentional.

Thoughts for tool authors:

WinMerge

Category: Custom Merge Algorithm

Default view before proposed autoMerge flag:

Results of built-in "auto merge" button before proposed autoMerge flag:

The built-in "auto merge" button does an admirable job of resolving conflicts although the results are not quite as good as Git's algorithm.

Results of built-in "auto merge" button after proposed autoMerge flag:

Summary: resolved additional conflict that the tool missed; no adverse effects.

Suggestions for tool authors:

tkdiff

Category: Custom Merge Algorithm

Before autoMerge:

tkdiff has an impressive conflict resolution algorithm. The two-way diff is a simple, straightforward, and effective way to view differences. It will even go so far as to recommend resolutions for all conflicts, though that is more fraught (in this example it loses wanted changes).

After autoMerge:

Summary: resolved additional conflict that the tool missed; no adverse effects.

Suggestions for tool authors:

Emacs

(I would very much appreciate help filling this subsection out!)

Emacs + Magit

Category: Reuse Git's Algorithm

Before autoMerge:

After autoMerge:

Untested

General notes:

I need to do some code diving to see how they're achieving this result but the two-way diff on the top looks great. All the resolved conflicts are missing which frees the user to resolve only the remaining conflicts. The diff highlights show just the relevant conflicts. The bottom pane is somewhat noisy but still useful context to look at when resolving in the top panes.

(Thank you to u/tech_addictede for investigating and creating the screenshot.)

vimdiff

Category: Blind Diff — diffs LOCAL, REMOTE, BASE, & MERGED

Note: the screenshots below have :diffoff on the BASE and MERGED windows for clarity and brevity. The vimdiff, vimdiff3, and vimdiff2 mergetools are not individually detailed because they're all variations on the same theme -- all need to toggle the diff between individual windows to maximize effectiveness.

Before autoMerge:

After autoMerge:

Summary: unnecessary conflicts are no longer shown; no adverse effects.

Suggestions for tool authors:

diffconflicts

Category: Reuse Git's Algorithm

Before autoMerge:

After autoMerge:

Identical output in the first tab. The second tab is now missing the LOCAL and REMOTE versions of the file from before the merge since they were overwritten. Users that reference those versions to learn the conflict history will want to disable the autoMerge flag for this tool.

Summary: Identical output; minor adverse effects.

vim-mergetool

Category: Reuse Git's Algorithm

Before autoMerge:

(Same default output as diffconflicts above.)

After autoMerge:

Identical output when using the default layout. Users that have configured another default layout will see surprising results since LOCAL and REMOTE no longer contain the expected versions. Users that make use of other layouts will want to disable the autoMerge flag for this tool.

Summary: Identical default output; minor adverse effects.

VS Code

Category: Blind Diff & Reuse Git's Algorithm (sort of)

Before autoMerge:

VS Code presents the file containing conflict markers directly, but if you click the "compare changes" button it will open a new view as a simple and effective two-way diff that makes it easy to identify differences at a glance. Unfortunately this new view is read-only and the user must return to the file containing conflict markers to resolve the conflict manually.

After autoMerge:

Identical output because LOCAL and REMOTE are not used.

Summary: Identical output; no benefits; no adverse effects.

Suggestions for tool authors:


How does Git generate MERGED?

It is worth asking how much work Git puts into creating the MERGED version of the file to appreciate how much work you lose by instead diffing LOCAL and REMOTE.

(The following snippets are from the Git manpages for version 2.28.0.)

First, find the common ancestor, the merge base, of both branches:

Given two commits A and B, git merge-base A B will output a commit which is reachable from both A and B through the parent relationship.

For example, with this topology:

        o---o---o---B
       /
---o---1---o---o---o---A

the merge base between A and B is 1.

git-merge-base
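The idea of a merge base can be sketched on a toy commit graph. This is only an illustration of the definition quoted above, not how Git computes it. The parents dictionary, the commit names, and the ancestors and merge_bases helpers are all hypothetical. The graph mirrors the topology in the diagram, with the unnamed "o" commits given made-up names.

```python
def ancestors(parents, commit):
    """Return the set containing commit and every commit reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents, a, b):
    """Return the best common ancestors of a and b.

    A common ancestor is "best" if it is not itself an ancestor of
    another common ancestor.
    """
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c in ancestors(parents, other) - {other} for other in common)
    }


# Each commit maps to its list of parent commits.
parents = {
    "0": [],
    "1": ["0"],
    "a1": ["1"], "a2": ["a1"], "a3": ["a2"], "A": ["a3"],
    "b1": ["1"], "b2": ["b1"], "b3": ["b2"], "B": ["b3"],
}

print(merge_bases(parents, "A", "B"))  # {'1'}
```

The function returns a set because a history with criss-cross merges can have more than one best common ancestor.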

Second, trace the history of all the changes to the file between the merge base and the last commit on the branch (including renames):

git merge-file incorporates all changes that lead from the <base-file> to <other-file> into <current-file>.

git-merge-file

Then, try to perform the merge:

Assume the following history exists and the current branch is "master":

      A---B---C topic
     /
D---E---F---G master

Then "git merge topic" will replay the changes made on the topic branch since it diverged from master (i.e., E) until its current commit (C) on top of master, and record the result in a new commit along with the names of the two parent commits and a log message from the user describing the changes.

      A---B---C topic
     /          \
D---E---F---G---H master

git-merge

But when there is a conflict:

During a merge, the working tree files are updated to reflect the result of the merge. Among the changes made to the common ancestor’s version, non-overlapping ones (that is, you changed an area of the file while the other side left that area intact, or vice versa) are incorporated in the final result verbatim. When both sides made changes to the same area, however, Git cannot randomly pick one side over the other, and asks you to resolve it by leaving what both sides did to that area.

git-merge

Pretty cool. By default (with the recursive merge strategy) it will follow file renames and grab changes from both sides of the merge that don't change the same part of the file. For everything else it will wrap it in conflict markers.

A sophisticated mergetool could perform similar steps given LOCAL, REMOTE, and BASE but it couldn't follow renames, nor could it follow more complex merges like when the merge base is itself a merge (thus the "recursive" strategy name) and it's necessary to follow the ancestry even farther up.

Git also has other merge strategies (resolve, recursive, octopus, ours, subtree) and can use different diff algorithms while merging (patience, minimal, histogram, myers) when useful. The recursive strategy takes further options too (ignore-space-change, ignore-all-space, ignore-space-at-eol, ignore-cr-at-eol, renormalize, rename-threshold).

So, yes, a mergetool can do some of what Git does. But Git is already a sophisticated and very configurable mergetool. It's just lacking that graphical visualization of the end-result.