Subversive Thoughts

I've been a big CVS fan for the last several years.  While I was at Intel (and later at ServerEngines) I used CVS to manage the source code for four reasonably sized cross-site development efforts.  Since I've been a consultant with Verilab I've been exposed to other systems including ClearCase and custom-developed tools.  I've even come across some folks still using RCS!  Each system has its pros and cons.

CVS is free, widely used, supports cross-site development, is easy to set up, and relatively easy to use.  ClearCase is expensive, requires dedicated IT support, and can be difficult to use in a cross-site environment, but it does a great job keeping track of the changes made to all sorts of objects and makes it much easier to do branching and merging.  It allows users to track changes made to directories (which isn't possible in CVS), and provides tools such as clearmake to speed up the generation of derived objects amongst a large group of users.

I really like the flexibility provided by ClearCase, but for my own projects I have never been willing to deal with the complexity and expense involved in setting it up.  And though I'm adept at hand-modifying a CVS repository, I've always felt there must be a better, safer, and cleaner way.  Luckily, I'm not the only one frustrated with the limitations of CVS.  Subversion has been developed over the last several years to be a replacement for CVS.  Here is a list of features (taken from http://subversion.tigris.org, see that site for more details):

  • Most current CVS features.    
  • Directories, renames, and file meta-data are versioned.    
  • Commits are truly atomic.    
  • Apache network server option, with WebDAV/DeltaV protocol.
  • Standalone server option.    
  • Branching and tagging are cheap (constant time) operations
  • Natively client/server, layered library design    
  • Client/server protocol sends diffs in both directions    
  • Costs are proportional to change size, not data size    
  • Choice of database or plain-file repository implementations  
  • Versioning of symbolic links  
  • Efficient handling of binary files    
  • Parseable output    
  • Localized messages
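
To make the "truly atomic" bullet a little more concrete, here is a toy model in Python.  It is only an illustration of the idea (one repository-wide revision counter, all-or-nothing commits) and says nothing about how Subversion is actually implemented; the ToyRepo class and the file names are made up for this sketch.

```python
class ToyRepo:
    """Toy model: one repository-wide revision number, all-or-nothing commits."""

    def __init__(self):
        self.revision = 0   # a single counter for the whole repository
        self.files = {}     # path -> content at the latest revision

    def commit(self, changes):
        """Apply every change in `changes` (path -> new content), or none of them."""
        staged = dict(self.files)          # work on a copy, not the live tree
        for path, content in changes.items():
            if content is None:            # stand-in for any per-file failure
                raise ValueError(f"cannot commit {path}")
            staged[path] = content
        # Only reached if every file staged cleanly: publish in one step.
        self.files = staged
        self.revision += 1
        return self.revision


repo = ToyRepo()
repo.commit({"a.cpp": "v1", "Makefile": "v1"})      # one changeset, one new revision

try:
    repo.commit({"a.cpp": "v2", "b.cpp": None})     # the second file "fails"
except ValueError as err:
    print("rejected:", err)

# The failed commit left no trace: same revision, a.cpp unchanged, no b.cpp.
print(repo.revision, repo.files)
```

The point of the sketch is that the revision number belongs to the changeset rather than to any one file, which is the same shift in thinking a CVS user has to make.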

I've been using Subversion (and its Windows companion TortoiseSVN) for the last couple of years to manage simple things like saving Word documents, but never really had the opportunity to use the tool in anger (as Verilab's veritable CEO Tommy Kelly likes to say).  Recently I've been porting some old C++ code and decided it would be a good opportunity to take Subversion for a spin.

My first task before getting started was to read Appendix A: Subversion for CVS Users of the online Subversion book.  The appendix walked me through some of the highlights such as differences in the way file version numbers are tracked (they aren't, but changesets are), some of the new features described above, differences in the way branches and tags are created (using 'svn copy'), etc.  I also read through the documentation describing how to set up my own personal repository.  The next step was to import my existing code - three C++ files, a Makefile, and a collection of Perl scripts.  I didn't bother to do a "vendor import" (as CVS gurus will be familiar with), but looking back I wish I had.
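
For readers who want to try the same setup, the steps above might look roughly like the following.  This is a sketch, not a record of what I typed: the paths, the "trunk" layout, and the log message are assumptions, and the Python helper (setup_commands) only builds the command lines so you can inspect them before running anything.

```python
import shlex
from pathlib import Path


def setup_commands(repo_dir, source_dir, working_copy):
    """Return the commands that create a repository, import code, and check it out.

    All three paths are placeholders chosen by the caller; the /trunk layout
    is a common convention, assumed here for illustration.
    """
    trunk_url = Path(repo_dir).resolve().as_uri() + "/trunk"   # file:///.../trunk
    return [
        ["svnadmin", "create", str(repo_dir)],                              # new local repository
        ["svn", "import", str(source_dir), trunk_url, "-m", "Initial import"],
        ["svn", "checkout", trunk_url, str(working_copy)],                  # fresh working copy
    ]


# Hypothetical locations; print the commands instead of executing them.
for cmd in setup_commands("/tmp/demo-repo", "/tmp/old-code", "/tmp/demo-wc"):
    print(shlex.join(cmd))
```

To actually execute the list, each entry can be passed to subprocess.run(cmd, check=True), provided the svn and svnadmin programs are installed.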

Once all the files were checked in, I started thinking about what sorts of changes needed to be made.  The first steps were easy - I wanted to separate the C++, Makefile, and scripts into separate directories.  Piece of cake!  All I had to do was "svn mkdir" the directories, and do an "svn move" to get the files in the correct locations.  Next I wanted to split the three C++ files into several smaller files - one class per file.  To ensure I was able to trace changes back to the original imported files I did an "svn copy" from the original files to the new ones, and edited the new files to remove content I didn't need.  As the cleanup continued I was able to commit atomic changesets to keep a snapshot of my progress.  It really didn't matter much since I was the only one working on the code, but in a large team being able to commit changesets atomically is a huge plus: if there is a problem committing any one of the files, none of them is saved.
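
The reorganization described above can be sketched as a short sequence of svn commands.  The directory names (src, build, scripts) and every file name below are hypothetical, since the real ones don't matter here, and the helper restructure_commands only builds the command lines rather than running them.  I've put a commit between the moves and the copies as a cautious assumption, so the copies start from files that already exist in the repository.

```python
import shlex


def restructure_commands(cpp_files, scripts, split_plan):
    """Build the svn commands for the reorganization.

    split_plan maps an original C++ file to the per-class files carved from it.
    Directory names are illustrative, not taken from the real project.
    """
    cmds = [["svn", "mkdir", "src", "build", "scripts"]]
    cmds += [["svn", "move", f, f"src/{f}"] for f in cpp_files]
    cmds.append(["svn", "move", "Makefile", "build/Makefile"])
    cmds += [["svn", "move", s, f"scripts/{s}"] for s in scripts]
    cmds.append(["svn", "commit", "-m", "Move files into src, build, scripts"])
    for original, pieces in split_plan.items():
        # Copying, rather than creating a brand-new file, keeps each piece's
        # history linked back to the original imported file.
        cmds += [["svn", "copy", f"src/{original}", f"src/{p}"] for p in pieces]
    cmds.append(["svn", "commit", "-m", "Split sources into one class per file"])
    return cmds


# Hypothetical file names, printed for inspection only.
plan = {"engine.cpp": ["parser.cpp", "lexer.cpp"]}
for cmd in restructure_commands(["engine.cpp"], ["report.pl"], plan):
    print(shlex.join(cmd))
```

After the copies, each new file still has to be edited by hand to remove the classes it shouldn't contain; that part is ordinary editing followed by another commit.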

When I was done with my changes, I tagged everything so I could share my progress with others.  In Subversion, a directory of source files is tagged by doing an "svn copy" of the source files to another directory.  By convention that directory looks something like "{repo-root}/tags/1.0".  Branches are done the same way. 
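
As a rough sketch, a tag along these lines boils down to a single repository-side copy.  The helper name copy_command, the "trunk" source directory, the "branches" directory, and the example URL are all assumptions for illustration; the sketch also assumes the tags (or branches) directory already exists in the repository.

```python
import shlex


def copy_command(repo_root, kind, name, source="trunk"):
    """Build the 'svn copy' command for a tag or a branch.

    kind is "tags" or "branches"; both are conventions, not rules enforced by
    the tool.  Copying URL to URL with -m commits the copy immediately.
    """
    root = repo_root.rstrip("/")
    return ["svn", "copy", f"{root}/{source}", f"{root}/{kind}/{name}",
            "-m", f"Create {kind}/{name} from {source}"]


# Hypothetical repository URL; print the commands instead of running them.
print(shlex.join(copy_command("file:///tmp/demo-repo", "tags", "1.0")))
print(shlex.join(copy_command("file:///tmp/demo-repo", "branches", "cleanup")))
```

The only difference between a tag and a branch here is the destination directory and what the team agrees to do with it afterwards.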

As I worked with the tool, I did experience some problems.  I was occasionally able to get the repository in a strange state where the only way I was able to check in my files was to check out a new working copy and commit my changes there.  I also forgot at least once that I had two working copies checked out and copied older versions of files to locations where I'd meant to put newer versions.  As I became more familiar with the ins and outs of Subversion these types of problems occurred less frequently, but some time will need to be spent ramping up on the tool if you plan to use it in a production environment.  The same can be said for pretty much anything, though. 

All in all I found Subversion gave me the power and flexibility I needed to get the job done and resolved many of the limitations I'd experienced using CVS.  If you're starting a new project and plan to use an open-source version control system, Subversion should be at the top of your list!