April 4, 2019

Merge Bugs: Part 2

by Rodney J. Lambert

In Part 1 of this series, the topic of merge bugs was introduced. A merge bug arises when changes made on separate branches merge without conflicts, yet the merged result is not consistent with the intent of either change.

In Part 1, the following process was suggested as a method of identifying merge bugs:

  1. Find the common commit between the feature branch and the branch you intend to merge into.
  2. Find all of the files that have been modified since the common commit on your feature branch.
  3. Find all of the files that have been modified since the common commit on the merge target branch.
  4. Get the intersection of these two lists, which gives you all of the files that have been modified in both branches since they have diverged.
  5. Present diffs of the changes along the target branch, along the feature branch and a three-way diff between the common commit, feature branch and the target branch.
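Step 4 of the list above can be sketched in a few lines of C++. This is a minimal illustration rather than the BranchDiff implementation: the function name changedInBoth is hypothetical, and it assumes each list of modified files is already available as a vector of path strings.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: return the paths that appear in both lists, i.e. the
// files modified on both branches since the common commit.
// The parameters are taken by value so they can be sorted locally.
std::vector<std::string> changedInBoth(std::vector<std::string> featureFiles,
                                       std::vector<std::string> targetFiles)
{
    // std::set_intersection requires both input ranges to be sorted.
    std::sort(featureFiles.begin(), featureFiles.end());
    std::sort(targetFiles.begin(), targetFiles.end());

    std::vector<std::string> both;
    std::set_intersection(featureFiles.begin(), featureFiles.end(),
                          targetFiles.begin(), targetFiles.end(),
                          std::back_inserter(both));
    return both;
}
```

Calling changedInBoth with {"Sample.hpp", "Sample.cpp"} and {"README.md", "Sample.hpp"} leaves only Sample.hpp, which is the one file a reviewer would need to inspect for merge bugs.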

The rationale behind this process is that it will make it easier for someone doing the code review to understand the intent of the changes along each branch and then identify any artifacts of the merge that would violate this intent. While the process might seem complex, open source tools combined with the source code presented in this series make it a simple addition to your merge procedure.

The Git repository located at https://bitbucket.org/RodneyJLambert/branchdiff contains source code for an application that automates the procedure described above. This tool will be referred to as the BranchDiff application. The source code is written in C++ and uses Git command-line commands and the Meld application. The source has been compiled on both Linux and Windows computers. It should not be difficult to extend this application to other tools or to write it in another language once the details have been explained.

The BranchDiff application is executed from the command line in the root directory of a Git repository. The application expects one or two arguments, which are the names of the branches that you wish to compare. If only one branch name is provided, the BranchDiff application assumes that you want to compare the provided branch with the currently checked-out branch.
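The argument handling described above might look something like the following sketch. The BranchPair struct and resolveBranches function are hypothetical names, not taken from the BranchDiff source, and the ordering of the pair (supplied branch first, checked-out branch second) is an assumption.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical pair of branch names to compare.
struct BranchPair
{
    std::string first;
    std::string second;
};

// args holds the command-line arguments after the program name.
// currentBranch is the name of the currently checked-out branch.
BranchPair resolveBranches(const std::vector<std::string> &args,
                           const std::string &currentBranch)
{
    if (args.size() == 1)
    {
        // One argument: compare it against the checked-out branch.
        return {args[0], currentBranch};
    }
    if (args.size() == 2)
    {
        // Two arguments: compare them against each other.
        return {args[0], args[1]};
    }
    throw std::invalid_argument("usage: BranchDiff <branch1> [<branch2>]");
}
```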

We are using the example repository that was introduced in Part 1 of the series, and we are assuming that we have the AllValuesInMksUnits branch checked out. We could run the BranchDiff application in the root directory of the repository with the master branch as an argument.

>BranchDiff master

The BranchDiff application determines the current branch by executing the command:

git rev-parse --abbrev-ref HEAD

In this case the result is AllValuesInMksUnits. Then BranchDiff issues the command:

git merge-base master AllValuesInMksUnits

to determine the common base commit. The result is:

0627815e6c1010cfd30bdff809294feee8e702b7

or in short form 0627815e.
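Output captured from a command-line tool normally ends with a newline, so the hash has to be cleaned up before it is pasted into the next command. The helpers below are a sketch of that step. Both names are hypothetical, and the 40-character check assumes a repository that uses SHA-1 object names.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Remove trailing spaces, tabs, carriage returns and newlines.
std::string trimTrailingWhitespace(std::string text)
{
    const std::string::size_type last = text.find_last_not_of(" \t\r\n");
    text.erase(last == std::string::npos ? 0 : last + 1);
    return text;
}

// True when text is exactly 40 hexadecimal digits (a full SHA-1 hash).
// Anything else, such as an error message, is rejected.
bool looksLikeSha1(const std::string &text)
{
    return text.size() == 40 &&
           std::all_of(text.begin(), text.end(),
                       [](unsigned char c) { return std::isxdigit(c) != 0; });
}
```

Checking the trimmed output before using it gives the application a chance to stop with a clear message when git merge-base reports an error instead of a commit.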

 

Next, BranchDiff issues the following command to determine the names of the files that have changed along the master branch since the common commit:

git diff --name-only 0627815e6c1010cfd30bdff809294feee8e702b7..master

In most real cases this command is likely to return a long list of file names. Next the application determines all of the files that have changed on the AllValuesInMksUnits branch by executing:

git diff --name-only 0627815e6c1010cfd30bdff809294feee8e702b7..AllValuesInMksUnits
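As a rough sketch of this step, the command string can be assembled from the common commit and a branch name, and the captured output can be split into one file name per line. The names changedFilesCommand and splitLines are illustrative and are not taken from the BranchDiff source.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Build the listing command for one branch, in the same form as the
// commands shown above.
std::string changedFilesCommand(const std::string &baseCommit,
                                const std::string &branch)
{
    return "git diff --name-only " + baseCommit + ".." + branch;
}

// Split command output into one entry per non-empty line.
std::vector<std::string> splitLines(const std::string &output)
{
    std::vector<std::string> lines;
    std::istringstream stream(output);
    std::string line;
    while (std::getline(stream, line))
    {
        if (!line.empty() && line.back() == '\r')
        {
            line.pop_back(); // tolerate Windows line endings
        }
        if (!line.empty())
        {
            lines.push_back(line);
        }
    }
    return lines;
}
```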

At this point the BranchDiff application has the two lists, and it determines the intersection of those lists to get just the names of the files that changed in both branches. The user is then presented with the option to see the differences between branches on a file-by-file basis. If you choose to see the differences for a file, the application uses the git show command to create a copy of the file from any branch that is not currently checked out. For example:

git show master:Sample.hpp > /tmp/master-Sample.hpp

creates a temporary copy of Sample.hpp from master so that it can be handed off to the Meld application.
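One possible way to build that command is sketched below. The helper names are hypothetical. The sketch assumes tempDir already ends with a path separator, flattens any '/' in the branch or file path so the copy lands directly in the temporary directory, and wraps arguments in double quotes so that names containing spaces survive. It does not escape quotes or other shell metacharacters inside a name.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: "master" + "src/Sample.hpp" -> "<tempDir>master-src_Sample.hpp".
std::string tempFileName(const std::string &tempDir,
                         const std::string &branch,
                         const std::string &repoPath)
{
    std::string flat = branch + "-" + repoPath;
    std::replace(flat.begin(), flat.end(), '/', '_');
    return tempDir + flat;
}

// Wrap an argument in double quotes (no escaping of embedded quotes).
std::string quoted(const std::string &text)
{
    return "\"" + text + "\"";
}

// Build a "git show <branch>:<path> > <outputFile>" command line.
std::string showCommand(const std::string &branch,
                        const std::string &repoPath,
                        const std::string &outputFile)
{
    return "git show " + quoted(branch + ":" + repoPath) + " > " + quoted(outputFile);
}
```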

The Meld application accepts command-line options for two-way and three-way difference views of source files. Each --diff option opens its own tab. A tab with a two-way comparison has the form:

--diff filename1 filename2

or, for a tab with a three-way difference view:

--diff filename1 filename2 filename3

The BranchDiff application uses the following code to construct four tabbed views in Meld for the current file. Tab one is a three-way difference between the common commit and the two branches, the second tab is the difference between the common commit and branch1, the third tab is the difference between the common commit and branch2, and the final tab shows the difference between the file in the two branches.

std::ostringstream diffCmd;
// Tab 1: three-way view of the common commit and both branches.
diffCmd << "meld --diff " << tempBaseFileName << " " << branch1FileName << " " << branch2FileName;
// Tab 2: common commit versus branch1.
diffCmd << " --diff " << tempBaseFileName << " " << branch1FileName;
// Tab 3: common commit versus branch2.
diffCmd << " --diff " << tempBaseFileName << " " << branch2FileName;
// Tab 4: branch1 versus branch2.
diffCmd << " --diff " << branch1FileName << " " << branch2FileName;
std::system(diffCmd.str().c_str());

The BranchDiff source uses the experimental file system library (std::experimental::filesystem), the pre-standard form of the library that was standardized as std::filesystem in C++17. The dependency can be removed by setting the tempDir path manually in the source.

std::string tempDir = std::experimental::filesystem::temp_directory_path().string() + TempDirSlash;
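With a C++17 compiler, the same value can be obtained from the standardized library instead. The following is a sketch under that assumption. The function name tempDirWithSeparator is hypothetical, and the separator is appended here directly rather than through the TempDirSlash constant used in the BranchDiff source.

```cpp
#include <filesystem>
#include <string>

// Return the system temporary directory with a trailing path separator.
// Requires C++17 (std::filesystem).
std::string tempDirWithSeparator()
{
    std::string dir = std::filesystem::temp_directory_path().string();
    const char separator =
        static_cast<char>(std::filesystem::path::preferred_separator);
    if (dir.empty() || dir.back() != separator)
    {
        dir += separator;
    }
    return dir;
}
```

Some older GCC releases still need an extra filesystem library at link time even for std::filesystem, so whether the linker setting can be dropped depends on the toolchain.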

If you do remove the dependency on the file system library, you can also remove the following block from CMakeLists.txt.

if(UNIX)
  target_link_libraries(${PROJECT_NAME} stdc++fs)
endif()

Another interesting part of the source code is the exec function, which opens a pipe to the process that the system is going to execute. This allows the output of the spawned process to be returned to and used by the BranchDiff application. It is also worth noting that the pointer returned by popen or _popen is not a conventional pointer, because pclose or _pclose must be called on it when it is no longer needed. The returned pointer can still be managed by std::unique_ptr by specifying the deleter function to use (e.g. pclose or _pclose). This technique extends to many system calls where a secondary cleanup function must be called to free a resource, which makes it much easier to write clean C++ code that can throw exceptions and still free system resources correctly.

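#include <array>
#include <cstdio>
#include <iostream>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

// Run cmd in a child process and return everything it writes to standard output.
std::string exec(const std::string &cmd)
{
    std::cout << "exec cmd -> " << cmd << std::endl;
    const int BufferSize = 256;
    std::array<char, BufferSize> buffer;
    std::ostringstream result;
    // The unique_ptr calls pclose (or _pclose) when it goes out of scope,
    // including when an exception is thrown.
    // _WIN32 is predefined by compilers that target Windows.
#ifdef _WIN32
    std::unique_ptr<FILE, decltype(&_pclose)> pipe(_popen(cmd.c_str(), "r"), _pclose);
#else
    std::unique_ptr<FILE, decltype(&pclose)> pipe(popen(cmd.c_str(), "r"), pclose);
#endif
    if (!pipe)
    {
        throw std::runtime_error("popen() failed!");
    }
    // fgets returns nullptr at end of input or on error, which ends the loop.
    while (fgets(buffer.data(), static_cast<int>(buffer.size()), pipe.get()) != nullptr)
    {
        result << buffer.data();
    }
    std::cout << "exec result : " << std::endl << result.str() << std::endl;
    return result.str();
}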
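The same custom-deleter idea carries over to other C-style resources. As an illustration that is not part of the BranchDiff source, the sketch below wraps a FILE pointer from fopen so that fclose runs automatically. The names FileCloser, FilePtr and openFile are hypothetical.

```cpp
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter that closes a C stream. std::unique_ptr only invokes its deleter
// for non-null pointers, so no null check is needed here.
struct FileCloser
{
    void operator()(std::FILE *file) const noexcept
    {
        std::fclose(file);
    }
};

using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Open a file, or throw if it cannot be opened. The returned handle closes
// itself when it goes out of scope, including during stack unwinding.
FilePtr openFile(const std::string &path, const char *mode)
{
    FilePtr file(std::fopen(path.c_str(), mode));
    if (!file)
    {
        throw std::runtime_error("cannot open " + path);
    }
    return file;
}
```

Using a small function object as the deleter, rather than a function pointer, means the deleter does not have to be passed to every constructor call, and the resulting unique_ptr is typically no larger than a raw pointer.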

Hopefully the BranchDiff code can help you prevent merge bugs from finding their way into your code. At a minimum, I hope these articles spark a discussion about how relying on the merge process alone can produce less than desirable results.


 
