Over a year and a half ago, I caught one of Travis Swicegood's sessions on “Getting Git.” At the time, I didn't get it, but it had interesting implications, so I bought his book “Pragmatic Version Control Using Git” and promptly forgot about it.
Fast forward a year and I found myself buried under web2project patches. At the time, all of them had to go through a few of us to make it into the core SourceForge repository, and the diffs started piling up. Needing a new approach where I could handle patches faster and more easily, I decided to spend some time and actually explore Git.
First of all, Git is a Distributed Version Control System (DVCS), so it's an odd beast. Instead of having just one central repository which everyone interacts with throughout the project, you get your own. And every team member has their own. And anyone else can make their own. The biggest benefit of this is the ability to commit locally – and get related features like branching, merging, rollbacks, and diffs – without having to share with everyone else.
Some people – like Rich Bowen at Notes in the Margin – have concerns that this could easily result in developers working disconnected from the community and suddenly sharing huge batches of unreviewed code all at once. I'll admit that I have been guilty of this once or twice, but I believe this is more of a workflow issue than a tool failure. If you explore Eli White's presentation on Code & Release Management (under the PHP Tek X heading), starting on page 17 he talks about the concept of Feature Branches. If you're using a DVCS, this is an ideal workflow, especially if the features are “small” and the codebase is well-segmented.
Next, Git allows for cheap branching, merging, and switching between branches. Coming from the SVN world, this didn't make sense until I adopted the Feature Branch approach. In my local development, if I know a feature is big or a fix is wide-ranging, I make a branch and work on that. Once I'm confident in it, I merge it back into master (trunk in SVN terms) and push it to the central repository. As a rule of thumb, once something is bigger than a couple of commits, I branch as necessary.
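A rough sketch of that feature-branch workflow, run in a throwaway repository, might look like the following. The branch name, file names, and identity are placeholders invented for illustration, and the final push is left as a comment because it assumes a central remote named origin that this sketch doesn't create.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the main branch "master", as in the post
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

echo "core" > core.txt
git add core.txt
git commit -q -m "Initial commit"

# Start the feature on its own branch (hypothetical name).
git checkout -q -b feature/report-export
echo "export step 1" > export.txt
git add export.txt
git commit -q -m "Add export skeleton"
echo "export step 2" >> export.txt
git commit -q -am "Finish export"

# Once confident in the work, merge it back into master.
git checkout -q master
git merge -q --no-ff -m "Merge feature/report-export" feature/report-export

# With a central repository configured as "origin", the last step would be:
#   git push origin master
git log --oneline --graph
```

The --no-ff flag is a matter of taste: it forces a merge commit so the feature stays visible as a unit in the history. Leaving it off lets Git fast-forward when it can.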
Of course, it's possible to get in over your head, and what you thought was a quick fix ends up interacting with all different parts of the system. Ideally, our systems aren't like that, but we live in the real world. So if something ends up being bigger than you expect, or even if you forget to branch, you can branch from an earlier commit with one command. Alternatively, if you think you've been commit-happy and made too many, you can merge them as needed.
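Both rescues can be sketched in a scratch repository. This is one possible approach under stated assumptions, not the only one: the branch name big-fix and the file are made up, and moving master back with a hard reset is only reasonable for commits that haven't been pushed anywhere. To start a branch at an arbitrary earlier commit, git checkout -b <name> <commit> does it in one step.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

# One good commit, then a "quick fix" that grew into three commits on master.
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"
for n in 1 2 3; do
    echo "fix part $n" >> app.txt
    git commit -q -am "Quick fix, part $n"
done

# Forgot to branch: label the current work, then move master back.
git branch big-fix                 # the new branch keeps all three commits
git reset -q --hard HEAD~3         # master returns to "Base" (unpushed work only)

# Too many small commits: fold them into one when merging.
git merge -q --squash big-fix      # stages the combined change without committing
git commit -q -m "Fix: combined quick fix"

git log --oneline master
```

After this, master holds the base commit plus a single combined fix, while big-fix still has the three original commits if they're ever needed.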
Next, you have to understand that a DVCS tracks individual changes, not states of files. If I were giving you directions to the grocery store, SVN would be the equivalent of giving you pictures of each intersection where you had to turn and no other information. Sure, it might get you there, but since you don't know how far it is between turns, it could get painful quickly. Alternatively, I could say “From my house, drive 1.45 miles north on First Street to Washington Ave. Turn right and drive east 2.5 miles to 3145 Washington Ave. Go in the north-facing entrance, walk 55 feet to the elevator, get in the elevator, press 3...” There's a whole new level of information available.
Fundamentally, this is why merging in other systems is usually difficult. If something changes, it can be difficult or even impossible to adjust. The first explanation (aka how SVN works) may not be enough for the system to handle even simple changes properly. The second explanation (aka how Git works) offers an excruciating level of detail that includes multiple cues on how to get to the destination. (Strictly speaking, Git stores a snapshot with every commit, but each commit also records its parents, so Git can always work out the steps between any two points in history.) Now, if you have road construction or others' commits changing everything beneath you, you have enough room to improvise and make educated guesses on the next step.
GitHub captures this all pretty well. As people fork your project and make their own changes, 90% of the time it's trivial to merge them back into your core system. In fact, in the dozens of commits I've collected from people, only a handful have been problematic, and most of those were because the same fix had already been applied by someone else.
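To see why those merges are usually trivial, here is a minimal simulation that uses two local repositories in place of GitHub. Everything in it is an assumption for illustration: the directory names core and fork, the remote name contributor, and the file contents. With a real fork, the remote would point at the contributor's repository URL instead of a local path.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# "core" stands in for the main project.
git init -q core
cd core
git symbolic-ref HEAD refs/heads/master
git config user.name "Maintainer"              # placeholder identity
git config user.email "maintainer@example.com"
printf 'one\ntwo\nthree\nfour\nfive\n' > notes.txt
git add notes.txt
git commit -q -m "Initial notes"
cd ..

# A contributor forks (here: clones) the project and fixes the first line.
git clone -q core fork
cd fork
git config user.name "Contributor"
git config user.email "contributor@example.com"
printf 'one (fixed by contributor)\ntwo\nthree\nfour\nfive\n' > notes.txt
git commit -q -am "Fix line one"
cd ..

# Meanwhile, the maintainer changes the last line of the same file.
cd core
printf 'one\ntwo\nthree\nfour\nfive (updated upstream)\n' > notes.txt
git commit -q -am "Update line five"

# Pull the contributor's work in and merge it.
git remote add contributor ../fork
git fetch -q contributor
git merge -q -m "Merge contributor fix" contributor/master

cat notes.txt
```

Both edits touch the same file, but because they change different regions and share a common ancestor commit, Git combines them without asking for help.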
Finally, Git is not perfect. For all of these useful features and concepts, the tool set is still relatively immature. I've heard numerous complaints about the Windows client, but since I'm on Ubuntu, I don't experience that. Next, the individual commands are incredibly powerful and normally well documented, but the error messages are completely unexpressive. Stack Overflow has helped quite a bit with this, but the Git community should incorporate some of that knowledge back to give us meaningful errors.
Some criticize Git for the lack of a central repository. Since a central repository is exactly how GitHub is set up by default, it seems a weak criticism at best. Prior to writing this post, my biggest criticism was completely based on ignorance. I believed the hook system was mediocre at best. I'm happy to say that after trying this thing called “searching,” I found a great deal of information, including many fully formed scripts.
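For a sense of how little a useful hook needs, here is a minimal sketch of a pre-commit hook. It is a made-up example rather than one of the scripts mentioned above. Git runs .git/hooks/pre-commit before recording a commit and aborts if the hook exits non-zero; this one relies on git diff --cached --check, which reports whitespace errors and leftover conflict markers in staged changes.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

# Install the hook. It must be executable for Git to run it.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Reject staged changes with whitespace errors or leftover conflict markers.
exec git diff --cached --check
EOF
chmod +x .git/hooks/pre-commit

# A clean file passes the hook.
printf 'clean line\n' > ok.txt
git add ok.txt
git commit -q -m "Clean commit"

# A file with trailing whitespace is rejected.
printf 'trailing space \n' > bad.txt
git add bad.txt
if git commit -q -m "Sloppy commit"; then
    echo "hook did not stop the commit"
else
    echo "commit rejected by pre-commit hook"
fi
```

Hooks live in the local repository and are not copied by a clone, so a team that wants shared checks has to distribute them some other way.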