DVCS - Musings of Unayok

2009 Jun 07

16:36 DVCS

Comments:

From: unayok
Date: 2009 Jun 16 - 02:49 (UTC)
Super ultra-short response: SVN is slow, and it's unfun to deal with branches or with anything beyond the most trivial of merges (without external assistance). It requires fiddly setup of a master repository. It pollutes your project's directory structure with .svn dirs (CVS dirs, in the case of CVS) for tracking. It manages changes on a file-by-file basis rather than a project changeset basis (even though revision numbers are "global"). And it's slow. It doesn't differentiate between local editing and merges from other branches. Getting diffs between branches is ugly at best. You can't commit to a local repository and then deal with merges from an upstream repository; you *have* to tie your local copy to the ultimate master first.

Distributed VCSs differ from more traditional VCSs in a few ways (and I'm not getting all of them, but this is enough to get started). First: the design doesn't revolve around a central master repository. Every repository could be considered authoritative. While this sounds chaotic, it really isn't, for two reasons: (1) they provide excellent tools for merging and cherry-picking updates from other "remotes", and (2) they're flexible enough for many workflows: many projects maintain an effective central authoritative repository, but this can be quickly merged with any branch on any local repository.

Because it's distributed, disconnected operation is supported and de rigueur. Your commit cycle becomes much shorter. You're working with a local repository and can commit half-finished work to your local tree. Only when you're happy with it do you publish your mergeset or "git push remote" to provide your changes for others. This, and the ability (at least with git) to have relatively lightweight branches, allows you to have separate branches for (for example) each ticket or issue. Since it's all managed in your local repository, no one outside it need see the low-level branching or commits you've performed; just the finished ones you merge into "master" (or whatever your workflow calls it).
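To make the "lightweight branch" and "commit locally, publish later" ideas concrete, here is a toy model in Python. It is only a sketch and not how git is actually implemented: the Repo class, its method names, and the "ticket-42" branch are all made up for illustration. The point is just that a branch can be nothing more than a pointer to a commit, and that a push only hands over what is reachable from the branch you choose to publish.

```python
# Toy model only; Repo and all names here are hypothetical, not git's API.
import hashlib


class Repo:
    def __init__(self):
        self.commits = {}   # commit id -> (tuple of parent ids, message)
        self.branches = {}  # branch name -> commit id (a branch is a pointer)

    def commit(self, branch, message):
        # Record a new commit on top of the branch tip and move the pointer.
        parent = self.branches.get(branch)
        parents = (parent,) if parent else ()
        cid = hashlib.sha1(repr((parents, message)).encode()).hexdigest()[:8]
        self.commits[cid] = (parents, message)
        self.branches[branch] = cid
        return cid

    def branch(self, new, source):
        # Creating a branch copies one pointer; no files are duplicated.
        self.branches[new] = self.branches[source]

    def ancestors(self, cid):
        # Every commit reachable from cid, including cid itself.
        seen, stack = set(), [cid]
        while stack:
            c = stack.pop()
            if c not in seen:
                seen.add(c)
                stack.extend(self.commits[c][0])
        return seen

    def push(self, remote, branch):
        # Publish only the commits reachable from this one branch.
        tip = self.branches[branch]
        for c in self.ancestors(tip):
            remote.commits.setdefault(c, self.commits[c])
        remote.branches[branch] = tip


local, remote = Repo(), Repo()
local.commit("master", "initial import")
local.push(remote, "master")

# Work on a ticket entirely offline, committing half-done states freely.
local.branch("ticket-42", "master")
local.commit("ticket-42", "half-done fix")
local.commit("ticket-42", "finished fix")
```

At this point the local repository holds three commits and two branches, while the remote still has only the initial import on "master". Nothing about the ticket branch leaves your machine until you decide to publish it.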

(When you push to or pull from a repository, you end up with that repository's history for the branch you're pushing/pulling. This allows for the disconnected operation; you really do have a complete repository on your system, and the shared history allows the commits you do offline to be merged safely into the remote when the next connection is made.)
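Why shared history makes offline commits safe to merge can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not git's actual algorithm: history is a plain dict from a commit name to its parents, and the helper names (ancestors, merge_bases) and the single-letter commits are invented for the example. The idea is that both sides can work out the most recent commit they have in common, and a merge only needs to reconcile what happened after that point.

```python
def ancestors(parents, commit):
    """Return the set holding commit and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def merge_bases(parents, ours, theirs):
    """Common ancestors of both tips that are not older than another common ancestor."""
    common = ancestors(parents, ours) & ancestors(parents, theirs)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}


# Hypothetical history: A and B are shared, C was committed offline,
# D and E landed on the remote in the meantime.
history = {
    "A": (),
    "B": ("A",),
    "C": ("B",),   # local tip
    "D": ("B",),
    "E": ("D",),   # remote tip
}

base = merge_bases(history, "C", "E")
```

Here the only merge base is B, so the merge has to combine the changes in C with those in D and E, and nothing earlier needs to be looked at again.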

DVCS works with changesets. It doesn't track an individual file's history separately. Thus it is trivial to follow the code as it gets moved from file/class to file/class during refactoring. There are tools (which may also be implemented in SVN/CVS, but I can only imagine how glacially slow they'd be) to bisect the changeset history to find precisely when X changed (where X is a file, or a line in some file that used to be someplace else). Git and Mercurial (and I assume others, though I have less experience with them) are both fast. Git is just mind-bogglingly fast, even under Windows. (Git also allows cryptographic signatures of changesets, which is a great idea I haven't needed to use yet.)
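The bisecting idea is just a binary search over the changeset history. Below is a minimal Python sketch of it, assuming a linear history ordered oldest to newest, a known-good first changeset, a known-bad last one, and a problem that stays present once introduced. The function name first_bad, the revision numbers, and the is_bad check are hypothetical stand-ins for "check out this changeset and run your test".

```python
def first_bad(changesets, is_bad):
    """Binary search for the earliest changeset where is_bad is true.

    Assumes changesets[0] is good, changesets[-1] is bad, and every
    changeset after the first bad one is bad as well.
    """
    lo, hi = 0, len(changesets) - 1   # lo is known good, hi is known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(changesets[mid]):
            hi = mid   # the culprit is at mid or earlier
        else:
            lo = mid   # the culprit is after mid
    return changesets[hi], tests


# Hypothetical history of 100 changesets; the change arrived in number 163.
history = list(range(100, 200))
culprit, tests = first_bad(history, lambda rev: rev >= 163)
```

With 100 changesets this takes 7 checks rather than up to 100, which is why the speed of checking out an arbitrary changeset matters so much for this kind of tool.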

This isn't a well-structured note; I'm rambling. But I hope you get at least some of the idea.

The Design section of Git's Wikipedia article may prove enlightening. Yes, it is still Wikipedia, but read as one source among several, it's a decent overview.