Rockford Lhotka

 Thursday, August 23, 2018

Git can be confusing, or at least intimidating, particularly if you end up working on a project that relies on a pull request (PR) model, and even more so if forks are involved.

This is pretty common when working on GitHub open source projects. Rarely is anyone allowed to directly update the master branch of the primary repository (repo). The way changes get into master is by submitting a PR.

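In a GitHub scenario any developer is usually interacting with three repos: the primary repo, the developer's fork of it on GitHub, and the clone of that fork on the dev workstation.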

Forks are created using the GitHub web interface, and they basically create a virtual "copy" of the primary repo in the developer's GitHub workspace. That fork is then cloned to the developer's workstation.

In many corporate environments everyone works in the same repo, but the only way to update master (or dev or a shared branch) is via a PR.

In a corporate scenario developers often interact with just two repos: the primary repo and the clone of it on the dev workstation.

The developer clones the primary repo to their workstation.

Whether from a GitHub fork or a corporate repo, cloning looks something like this (at the command line):

$ git clone

This copies the cloud repo onto the dev workstation. It also creates a connection (called a remote) to the cloud repo. By default this remote is named "origin".

Whether originally from a GitHub fork or a corporate repo, the developer does their work against the clone, what I'm calling the Dev workstation repo in these diagrams.

First though, if you are using the GitHub model where you have the primary repo, a fork, and a clone, then you'll need to add an upstream repo to your dev workstation repo. Something like this:

$ git remote add MarimerLLC

This basically creates a (read-only) connection between your dev workstation repo and the primary repo, in addition to the existing connection to your fork. In my case I've named the upstream (primary) repo "MarimerLLC".
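Here is a minimal sketch of the resulting remote layout, using throwaway local bare repos in place of GitHub. The directory names (primary.git, fork.git, workstation) are hypothetical stand-ins; only the remote name MarimerLLC comes from the text above.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Local bare repos stand in for the primary repo and the fork (hypothetical names).
# Cloning an empty repo prints a harmless warning, so stderr is silenced here.
git init -q --bare primary.git
git clone -q --bare primary.git fork.git 2>/dev/null

# Clone the fork to the "workstation"; git names this remote "origin" automatically.
git clone -q fork.git workstation 2>/dev/null
cd workstation

# Add the primary repo as a second remote, named as in the post.
git remote add MarimerLLC "$sandbox/primary.git"

# List every remote with its fetch and push URLs.
git remote -v
```

After this, the workstation repo knows about two remotes: origin (the fork) and MarimerLLC (the primary repo).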

This is important, because you are very likely to need to refresh your dev workstation repo from the primary repo from time to time.

Again, developers do their work against the dev workstation repo. They should do their work in a branch other than master. Mostly work should be done in a feature branch, usually based on some work item in VSTS, GitHub, Jira, or whatever you are using for project and issue management.

Back to creating a branch in the dev workstation repo. Personally I name my branches with the issue number, a dash, and a word or two that reminds me what I'm working on in this branch.

$ git fetch MarimerLLC
$ git checkout -b 123-work MarimerLLC/master

This is where things get a little tricky.

First, the git fetch command makes sure my dev workstation repo has the latest changes from the primary repo. You might think I'd want the latest from my fork, but in most cases what I really want is the latest from the primary repo, because that's where changes from other developers might have been merged - and I want their changes!

The git checkout command creates a new branch named "123-work" based on MarimerLLC/master. That is the real master branch from the primary repo: the one I just made sure was updated from the cloud to be current.

This means my working directory on my computer is now using the 123-work branch, and that branch is identical to master from the primary repo. What a great starting point for any new work.
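If you want to convince yourself that the new branch really starts at the primary repo's master, you can compare the two commit ids. The sketch below runs in a throwaway sandbox; the repo names (seed, primary.git, fork.git, workstation) and the Dev identity are assumptions made up for the example.

```shell
set -e
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
sandbox=$(mktemp -d)
cd "$sandbox"

# Build a stand-in primary repo with one commit on master.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo 'hello' > README.md
git add .
git commit -q -m 'Initial commit'
cd "$sandbox"
git clone -q --bare seed primary.git

# Stand-in fork, cloned to the workstation, with the primary repo as a remote.
git clone -q --bare primary.git fork.git
git clone -q fork.git workstation
cd workstation
git remote add MarimerLLC "$sandbox/primary.git"

# The two commands from the post.
git fetch -q MarimerLLC
git checkout -q -b 123-work MarimerLLC/master

# Both names should resolve to the same commit.
branch_tip=$(git rev-parse HEAD)
upstream_tip=$(git rev-parse MarimerLLC/master)
echo "123-work:          $branch_tip"
echo "MarimerLLC/master: $upstream_tip"
```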

Now the developer does any work necessary. Editing, adding, removing files, etc.

One note on moving or renaming files: if you want to keep the file's history intact as you move or rename a file it is best to use git to make the changes.

$ git mv OldFile.cs NewFile.cs
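To see the history carried across the rename, git log has a --follow option. A small sketch in a scratch repo (the demo directory, file content, and Dev identity are made up for illustration):

```shell
set -e
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo

# One commit that creates the file under its old name.
echo 'class Old {}' > OldFile.cs
git add .
git commit -q -m 'Add OldFile.cs'

# Rename with git, then commit the rename.
git mv OldFile.cs NewFile.cs
git commit -q -m 'Rename OldFile.cs to NewFile.cs'

# --follow keeps walking history past the rename, so both commits are listed.
git log --follow --oneline -- NewFile.cs
history_count=$(git log --follow --oneline -- NewFile.cs | wc -l | tr -d ' ')
```

Without --follow, the log for NewFile.cs would stop at the rename commit.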

At any point while you are doing your work you can commit your changes to the dev workstation repo. This isn't a "backup", because it is on your computer. But it is a snapshot of your work, and you can always roll back to earlier snapshots. So it isn't a bad idea to commit after you've done some work, especially if you are about to take any risks with other changes!

Personally I often use a Windows shell add-in called TortoiseGit to do my local commits, because I like the GUI experience integrated into the Windows Explorer tool. Other people like different GUI tools, and some like the command line.

At the command line a "commit" is really a two part process.

$ git add .
$ git commit -m '#123 My comment here'

The git add command adds any changes you've made into the local git index. Though it says "add", this adds all move/rename/delete/edit/add operations you've done to any files.

The git commit command actually commits the changes you just added, so they become part of the permanent record within your dev workstation repo. Note my use of the -m switch to add a comment (including the issue number) about this commit. I think this is critical! Not only does it help you and your colleagues, but putting the issue number as a tag allows tools like GitHub and VSTS to hyperlink to the issue details.
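The two steps are easier to see with git status between them. This is a sketch in a scratch repo, assuming Git 2.0 or later (where git add . also stages deletions); the file names and Dev identity are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
printf 'one\n' > Edited.cs
printf 'two\n' > Deleted.cs
git add .
git commit -q -m 'Initial commit'

# Make an edit, a delete, and an add in the working directory.
printf 'one, edited\n' > Edited.cs
rm Deleted.cs
printf 'three\n' > Added.cs

echo '--- before git add: changes exist only in the working directory ---'
git status --short

# Step 1: stage everything into the index.
git add .
echo '--- after git add: the same changes, now staged ---'
git status --short
staged_count=$(git diff --cached --no-renames --name-only | wc -l | tr -d ' ')

# Step 2: record the staged changes as a commit.
git commit -q -m '#123 My comment here'
echo '--- after git commit: nothing left to report ---'
git status --short
```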

OK, so now my changes are committed to my dev workstation repo, and I'm ready to push them up into the cloud.

If I'm using GitHub and a fork then I'll push to my personal fork. If I'm directly using a corporate repo I'll push to the corporate repo. Keep in mind though, that I'm pushing my feature branch, not master!

$ git push origin

This will push my current branch (123-work) to origin, which is the cloud-based repo I cloned to create my dev workstation repo.
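Which branch a bare git push origin sends depends on your push.default setting, so if you'd rather not rely on that, you can name the branch explicitly. A sketch using a local bare repo as a stand-in for the cloud repo (origin.git, seed, workstation, and the Dev identity are hypothetical):

```shell
set -e
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the fork (or corporate repo), with one commit on master.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo 'hello' > README.md
git add .
git commit -q -m 'Initial commit'
cd "$sandbox"
git clone -q --bare seed origin.git
git clone -q origin.git workstation
cd workstation

# Do some work on a feature branch.
git checkout -q -b 123-work
echo 'work' >> README.md
git commit -q -am '#123 My comment here'

# Name the branch explicitly so the result does not depend on push.default.
git push -q origin 123-work

# Show which branches now exist in the "cloud" repo.
git ls-remote --heads origin
```

The listing should show both master and 123-work, with master still pointing at the original commit.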

GitHub with a fork:


The 123-work in the cloud is a copy of that branch in my dev workstation repo. There are a couple of immediate benefits to having it in the cloud:

  1. It is backed up to a server
  2. It is (typically) visible to other developers on my team

I'll often push even non-working code into the cloud to enable collaboration with other people. At least in GitHub and VSTS, my team members can view my branch and we can work together to solve problems I might be facing. Very powerful!

(Even better, but more advanced than I want to get in this post: they can actually pull my branch down onto their workstation, make changes, and create a PR so I can merge their changes back into my working branch.)

At this point my work is both on my workstation and in the cloud. Now I can create a pull request (PR) if I'm ready for my work to be merged into the primary master.

BUT FIRST, I need to make sure my 123-work branch is current with any changes that might have been made to the primary master while I've been working locally. Other developers (or even me) may have submitted a PR to master in the meantime, so master may have changed.

This is where terms like "rebase" come into play. But I'm going to skip the rebase concept for now and show a simple merge approach:

$ git pull MarimerLLC master

The git pull command fetches any changes in the MarimerLLC primary repo, and then merges the master branch into my local working branch (123-work). If the merge can be done automatically it'll just happen. If not, I'll get a list of files that I need to edit to resolve conflicts. The files will contain both my changes and any changes from the cloud, and I'll need to edit them in Visual Studio or some other editor to resolve the conflicts.

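Once any conflicts are resolved I can move forward. If there were conflicts, I'll need to commit the resolved merge into my local repo. If the merge happened automatically, git has already created the merge commit for me, and the commands below will simply report that there is nothing to commit.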

$ git add .
$ git commit -m 'Merge upstream changes from MarimerLLC/master'
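The conflict case can be rehearsed safely in a sandbox. The sketch below spells git pull out as its two halves, git fetch followed by git merge, which also sidesteps the fact that recent Git versions may refuse a plain git pull on diverged branches until you pick a strategy (for example with --no-rebase). All repo names, file names, and the Dev identity are assumptions for the example.

```shell
set -e
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in primary repo with one commit on master.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo 'original line' > notes.txt
git add .
git commit -q -m 'Initial commit'
cd "$sandbox"
git clone -q --bare seed primary.git

# Stand-in fork and workstation clone, with the primary repo as "MarimerLLC".
git clone -q --bare primary.git fork.git
git clone -q fork.git workstation
cd workstation
git remote add MarimerLLC "$sandbox/primary.git"
git fetch -q MarimerLLC
git checkout -q -b 123-work MarimerLLC/master
echo 'my change' > notes.txt
git commit -q -am '#123 My change'

# Meanwhile, someone else's change to the same line lands in the primary master.
cd "$sandbox/seed"
echo 'their change' > notes.txt
git commit -q -am 'Their change'
git push -q "$sandbox/primary.git" master

# Back on the workstation: fetch, then merge. The merge stops on the conflict.
cd "$sandbox/workstation"
git fetch -q MarimerLLC
git merge -m 'Merge upstream changes from MarimerLLC/master' MarimerLLC/master || true

# List the files that still need manual resolution.
conflicted=$(git diff --name-only --diff-filter=U)
echo "Conflicted files: $conflicted"

# "Resolve" by writing the final content, then add and commit as in the post.
printf 'my change\ntheir change\n' > notes.txt
git add .
git commit -q -m 'Merge upstream changes from MarimerLLC/master'

# A merge commit has two parents: my branch and the upstream master.
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "Parents of merge commit: $parent_count"
```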

It is critical at this point that you make sure the code compiles and that your unit tests run locally! If so, proceed. If not, fix any issues, then proceed.

Push your latest changes into the cloud.

$ git push origin

With the latest code in the cloud you can create a PR. A PR is created using the web UI of GitHub, VSTS, or whatever cloud tool you are using. The PR simply requests that the code from your branch be merged into the primary master branch.

In GitHub with a fork the PR sort of looks like this:

In a corporate setting it looks like this:

In many cases submitting a PR will trigger a continuous integration (CI) build. In the case of CSLA I use AppVeyor, and of course VSTS has great build tooling. I can't imagine working on a project where a PR doesn't trigger a CI build and automatic run of unit tests.

The great thing about a CI build at this point is that you can tell that your PR builds and your unit tests pass before merging it into master. This isn't 100% proof of no issues, but it sure helps!

It is really important to understand that there is an ongoing link from the 123-work branch in the cloud to the PR. If I change anything in the 123-work branch in the cloud, that changes the PR.

The upside to this is that GitHub and VSTS have really good web UI tools for code reviews and commenting on code in a PR. And the developer can just go change their 123-work branch on the dev workstation to respond to any comments, then

  1. git add
  2. git commit
  3. git push origin

as shown above to get those changes into the cloud-based 123-work branch, thus updating the PR.

Assuming any changes requested to the PR have been made and the CI build and unit tests pass, the PR can be accepted. This is done through the web UI of GitHub or VSTS. The result is that the 123-work branch is merged into master in the primary repo.

At this point the 123-work branch can (and should) be deleted from the cloud and the dev workstation repo. This branch no longer has value because it has been merged into master. Don't worry about losing history or anything; that won't happen. Getting rid of feature branches once merged is necessary to keep the cloud and local repos all tidy.

The web UI can be used to delete a branch in the cloud. To delete the branch from your dev workstation repo you need to move out of that branch, then delete it.

$ git checkout master
$ git branch -D 123-work
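If you prefer the command line to the web UI, the cloud copy of the branch can also be removed with git push --delete. Note that -D deletes the local branch even if git thinks it is unmerged, so it is worth confirming the PR really was accepted first. The sketch below uses a local bare repo as a stand-in for the cloud repo (origin.git, seed, workstation, and the Dev identity are hypothetical).

```shell
set -e
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in cloud repo with one commit on master, cloned to the workstation.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo 'hello' > README.md
git add .
git commit -q -m 'Initial commit'
cd "$sandbox"
git clone -q --bare seed origin.git
git clone -q origin.git workstation
cd workstation

# A feature branch that has been pushed (imagine its PR has since been merged).
git checkout -q -b 123-work
echo 'work' >> README.md
git commit -q -am '#123 My comment here'
git push -q origin 123-work

# Clean up: leave the branch, delete it locally, then delete it in the cloud.
git checkout -q master
git branch -D 123-work
git push -q origin --delete 123-work

# Drop any stale remote-tracking references and show what is left.
git fetch -q --prune origin
git branch -a
```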

Now you are ready to repeat this process from the top based on the next work item in the backlog.

Thursday, August 23, 2018 4:00:41 PM (Central Standard Time, UTC-06:00)