Sunday, August 29, 2004

What I'd Like in a Version Control System

When I started the PhD I wanted to use a version control system to keep track of all the PhD-related files. So I installed CVS. But now I've come to the view that it doesn't suit my needs very well, and I'm looking into other options. And I thought the first step there would be to get it clear what I'd like a version control system to do.

When I'm writing something, especially if it's something that takes me a lot of effort to put into words, I tend to be constantly creating new files. Passages of text get cut and pasted all over the place, and files frequently get renamed. For various reasons, CVS doesn't seem to be suited to this kind of situation.

As a bit of a digression, here's a little on how my writing process seems to go. At any point in the process what I've already written will have tried out certain angles, and explored certain connections or aspects of the concepts, and at certain points it makes sense to start a new file.

I might start a new file if it seems more productive to explore something a bit different (because it's more of a "fresh start" and seems to keep things a bit clearer).

Or I might start a new file to "start over again", as what I've already written might give me a better sense of how "it all fits together", and it's usually much more efficient to just start with a new file than to go through and edit what I've already written.

I put some information about the chronology of files into their filenames, which helps manage things. And at some point I'll go through the earlier files and see if there's anything in them that I should take out and use in the later files (though this description makes it sound a lot more straightforward than it is in reality).

Back to version control systems. Here's what I'd like one to do (keep in mind this is just a wish list):

  • Seamlessly integrate with the file system and other tools. Pretty obvious, at least as a high level goal, though perhaps not as obvious in the details:
    • being aware of file-system events and the ability to automatically respond to them:
      • whenever a new file/directory is created, automatically put it under version control
      • whenever a file/directory is deleted, no longer keep track of it in the version control system, but retain the older versions of it
      • dealing with file and directory renaming and moving
      • etc
  • Be able to automatically commit changes, such as when a file is saved or when the computer is shut down. Optionally present a dialog box for entering comments when the file is committed.
A major theme is that I want version control that doesn't require me to do anything beyond the way I usually manage the files. Because I'm frequently creating, renaming and removing files, there's too much overhead with a system like CVS, which requires manual commits and the like.

I understand there are reasons why you would want the ability to have manual control, and what I'm saying is that I'd like a system that lets you choose what sort of behaviour -- manual or automatic -- you'd like.

And ideally, the system would have this functionality built in or bundled with it, rather than requiring you to write your own tool that watches for file-system events and makes the manual calls to the version control system itself.
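To give a rough idea of what that kind of home-made tool would involve, here's a minimal sketch in Python. It's only an illustration under some assumptions: it polls the directory and compares snapshots rather than listening for real file-system events, it guesses that a file was renamed when a vanished file's contents turn up under a new name, and it only builds the Subversion commands (add, delete, commit) rather than running them. The function names `snapshot`, `diff_snapshots` and `plan_commands` are made up for this example.

```python
import hashlib
import os


def snapshot(root):
    """Map each file's path (relative to root) to a hash of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(full, root)] = digest
    return state


def diff_snapshots(old, new):
    """Compare two snapshots and return a list of event tuples."""
    added = set(new) - set(old)
    removed = set(old) - set(new)
    events = []
    for gone in sorted(removed):
        # Assumption: identical contents under a new name means a rename.
        match = next((p for p in sorted(added) if new[p] == old[gone]), None)
        if match is not None:
            events.append(("renamed", gone, match))
            added.discard(match)
        else:
            events.append(("deleted", gone))
    events.extend(("created", p) for p in sorted(added))
    events.extend(
        ("modified", p) for p in sorted(set(old) & set(new)) if old[p] != new[p]
    )
    return events


def plan_commands(events, message="automatic commit"):
    """Turn events into the svn command lines a wrapper tool might run."""
    commands = []
    for event in events:
        kind = event[0]
        if kind == "created":
            commands.append(["svn", "add", event[1]])
        elif kind == "deleted":
            commands.append(["svn", "delete", event[1]])
        elif kind == "renamed":
            # The file has already moved on disk, so record a delete plus an add.
            commands.append(["svn", "delete", event[1]])
            commands.append(["svn", "add", event[2]])
    if events:
        # Modified files need no extra command; the commit picks them up.
        commands.append(["svn", "commit", "-m", message])
    return commands
```

A real tool would take a snapshot every so often (or on save or shutdown), diff it against the previous one, and pass each planned command to something like `subprocess.run`. It would also need to skip the version control system's own administrative directories, which this sketch doesn't do. Recording a rename as a delete plus an add is part of why built-in support would be preferable to a bolt-on script like this.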

I was going to write a bit about the investigation into version control systems I've done, but I need to finish up this post now... so I'll just briefly say that so far Subversion seems to be the best freely available one. Here's some more info: a book on Subversion; a version-control system comparison; and an article on the newer version-control systems out there.



What I'd really like is a system that does version control down to the level of individual characters, and keeps track of them when they're cut and pasted within and between documents. This is one of the things the Xanadu people wanted to do, and while they had trouble getting their ideas into working systems, there is a system that I recently saw a talk about, called TeNDaX, which actually does this for documents you're editing. From what I could tell from the talk and demo, it seemed pretty good.