Friday, November 07, 2008

Subversion – Introduction

I have been working with Subversion for a while now, but I started out of an immediate need and hence without any conceptual background. This time, however, the lack of some basic understanding hindered my learning, so I took some time to skim through the most reliable source of information: the Subversion book, titled "Version Control with Subversion", which can be downloaded here. It is time-consuming to go through a 400-page book to learn about something secondary in importance to my (or any developer's) work, so I thought it might be useful to summarize things as I read them.

The following is a summary of what I extracted from the first chapter of the book.

A version control system tries to enable collaborative editing and sharing of data. Subversion is one such system. Different systems use different techniques to implement this collaborative environment; even Subversion supports a couple of different methods. It can manage any sort of file collection (not only source code). At its core, just like any other version control system, there is a repository; it stores information in the form of a file-system tree of files and directories (just like a typical file server). Clients connect to the repository to read and write files. The operations are synchronized, and each client sees the latest version of the files stored in the repository.

What distinguishes it from a file server, however, is its ability to remember all the changes ever made to any of the files or directories, as well as changes to the directory structure itself, such as the addition and deletion of files. The fundamental problem faced by all version control systems is this: how will the system allow users to share information, but prevent them from accidentally stepping on each other's feet? It's all too easy for users to accidentally overwrite each other's changes in the repository.

In sum, the problem reduces to the following question: how should the latest version of a file reflect the changes made by several writers while someone else is reading it? Two solutions have been proposed to this problem:

The Lock-Modify-Unlock Solution

Three problems:

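  1. Locking may cause administrative problems: you lock a file and go on vacation, and everyone else is blocked until an administrator releases the lock.
  2. Locking may cause unnecessary serialization: two people who need to modify different portions of the same file still have to take turns.
  3. Locking may create a false sense of security: one user locks and edits A while another locks and edits B, but if A and B depend on each other, the two sets of changes can still be incompatible.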
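To make the serialization problem concrete, here is a minimal sketch of a lock-modify-unlock repository. It is a toy model of my own, not how Subversion implements locks; the class `LockingRepository` and its methods are hypothetical names, and Harry and Sally are borrowed from the book's examples.

```python
class LockingRepository:
    """Toy repository where a file must be locked before it can be changed."""

    def __init__(self, files):
        self.files = dict(files)
        self.locks = {}  # path -> name of the user currently holding the lock

    def lock(self, path, user):
        holder = self.locks.get(path)
        if holder is not None and holder != user:
            raise PermissionError(f"{path} is locked by {holder}")
        self.locks[path] = user

    def write(self, path, user, content):
        # Only the lock holder may write the file back.
        if self.locks.get(path) != user:
            raise PermissionError(f"{user} does not hold the lock on {path}")
        self.files[path] = content

    def unlock(self, path, user):
        if self.locks.get(path) == user:
            del self.locks[path]


repo = LockingRepository({"A": "intro\nbody\n"})

repo.lock("A", "Harry")
try:
    # Sally only wants to touch the body, but she still has to wait.
    repo.lock("A", "Sally")
    sally_blocked = False
except PermissionError:
    sally_blocked = True

repo.write("A", "Harry", "intro (edited by Harry)\nbody\n")
repo.unlock("A", "Harry")
repo.lock("A", "Sally")  # succeeds only after Harry has unlocked
```

If Harry never calls `unlock` (the vacation case), Sally stays blocked until someone removes the lock for him.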

The Copy-Modify-Merge Solution (Used by Subversion and many other systems)

In the fourth action, Harry fails to write the file back to the repository because he was modifying an older version than the one currently in the repository. This is where the concept of “merge” comes in. Once Harry downloads the latest repository version, he merges the changes he made to the older version into it and writes the merged file back to the repository (A* above). Sally can then read this updated file, which contains her modifications as well as Harry’s. This solution is preferred over the Lock-Modify-Unlock solution in most cases; the exception is files that cannot be meaningfully merged, such as sound files or other binary files, where it becomes almost impossible to reconcile changes made by multiple clients at the same time.
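The sequence above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: a single file, a single revision counter, and a merge done by hand. The names `Repository` and `OutOfDateError` are hypothetical and not part of Subversion.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on an older revision than the repository's."""


class Repository:
    """Toy single-file repository that rejects commits from stale copies."""

    def __init__(self, lines):
        self.revision = 1
        self.lines = list(lines)

    def checkout(self):
        # Hand out the current revision number together with a private copy.
        return self.revision, list(self.lines)

    def commit(self, base_revision, new_lines):
        if base_revision != self.revision:
            raise OutOfDateError("working copy is out of date; update first")
        self.revision += 1
        self.lines = list(new_lines)
        return self.revision


repo = Repository(["line one", "line two"])

# Harry and Sally both copy the same revision and edit different lines.
harry_rev, harry_copy = repo.checkout()
sally_rev, sally_copy = repo.checkout()
sally_copy[0] = "line one (Sally)"
harry_copy[1] = "line two (Harry)"

repo.commit(sally_rev, sally_copy)  # Sally publishes first

try:
    repo.commit(harry_rev, harry_copy)  # Harry's copy is now stale
    harry_rejected = False
except OutOfDateError:
    harry_rejected = True

# Harry fetches the latest version, merges his edit into it, and commits again.
latest_rev, latest = repo.checkout()
merged = [latest[0], harry_copy[1]]
repo.commit(latest_rev, merged)
```

The merge here is trivial because the two edits touch different lines; when they overlap, a human has to decide which text wins.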


Repository URLs

You can access Subversion repositories through many different methods: on local disk or through various network protocols, depending on how your administrator has set things up for you. A repository location, however, is always a URL. Table 1.1, “Repository access URLs” describes how different URL schemes map to the available access methods.

Working Copies

Subversion has the concept of a working copy, which is essentially your own private copy of the project source code that you download (or check out) from the underlying repository. If the repository holds multiple projects, you’ll need to give the exact URL of the project subdirectory when issuing a checkout command to SVN, something like this:

svn checkout

Now, there are two possibilities:

  1. You can check out (download) the source code of my_project into an ordinary directory; you’ll need an SVN client to do so. I use TortoiseSVN, which can be downloaded here.
  2. Alternatively, you can check out the source code directly into a project’s source folder in the workspace of the IDE you use for development. I use Eclipse Ganymede these days, and the Subclipse v1.4 plugin does the job for me; it can be downloaded here.
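Whichever client you use, the scheme of the repository URL decides how the repository is reached. As a rough sketch of the mapping the book's table describes, the snippet below classifies a URL by its scheme. The descriptions are my paraphrase of the five schemes as I understand them, and the host `svn.example.com` and the function `describe_repository_url` are hypothetical.

```python
from urllib.parse import urlsplit

# URL scheme -> access method (paraphrased, not copied from the book's table).
ACCESS_METHODS = {
    "file": "direct repository access on local disk",
    "http": "access via WebDAV to an Apache server",
    "https": "same as http, but with SSL encryption",
    "svn": "access via the custom protocol of an svnserve server",
    "svn+ssh": "same as svn, but tunnelled through SSH",
}


def describe_repository_url(url):
    """Return a short description of how the given repository URL is accessed."""
    scheme = urlsplit(url).scheme
    if scheme not in ACCESS_METHODS:
        raise ValueError(f"not a recognised repository URL scheme: {scheme!r}")
    return ACCESS_METHODS[scheme]


# Hypothetical URL of the my_project subdirectory inside a larger repository.
print(describe_repository_url("https://svn.example.com/repos/my_project"))
```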

It’s easy to get used to the synchronization mechanisms adopted by Subversion. Most of the time, the only operations a developer uses are Commit, Update, Merge, Compare, Restore, etc. I won’t get into the details of each command here (respecting the order of the book). Once you’ve checked out the source code, you can play around with it just as you would with any local project. The important thing to note is that whatever you modify has no effect on the original source code in the repository, hence the name ‘working copy’. At some point, though, you’ll need to incorporate your changes into the original files in the project repository. This operation is known as a commit, and to perform it you’ll execute a command like this:

svn commit -m "Fixed a bug in main class"

If all goes well, you’ll be shown a confirmation that your changes have been committed and the repository has been updated. The next time you check out my_project from the repository, you’ll see your updated code. There is just one last thing that needs to be understood. Consider the following case:

  1. You and another developer check out my_project at the same time (and hence the same version).
  2. You make changes to methodX of a class and commit the code.
  3. The other developer then changes methodY of the same class and tries to commit. This commit, however, will fail.

Reason: he/she is trying to commit a modified but NOT up-to-date version of the file to the repository. It is no longer up to date, because the latest version in the repository is YOURS.

To get over this problem, the other developer will issue an update command, like this:

svn update

svn will then automatically TRY to update this developer’s working copy by incorporating your changes (or any other changes in the latest version). The developer will, however, have to resolve things manually if he/she was also changing the same method (i.e. methodX) or the same section of code.
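The difference between the automatic case and the manual case can be sketched as a tiny three-way merge. Keep in mind this is a simplification of my own: real svn compares lines of text, not methods, and writes conflict markers into the file. The function `three_way_merge` and the method bodies are hypothetical.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of a class, given the version both started from.

    Each argument maps a method name to its source text. Returns the merged
    mapping and the list of method names that need manual resolution.
    """
    merged, conflicts = {}, []
    for name in base:
        if mine[name] == theirs[name]:
            merged[name] = mine[name]      # identical on both sides
        elif mine[name] == base[name]:
            merged[name] = theirs[name]    # only the other side changed it
        elif theirs[name] == base[name]:
            merged[name] = mine[name]      # only I changed it
        else:
            conflicts.append(name)         # both changed it differently
            merged[name] = mine[name]      # keep the local text for now
    return merged, conflicts


base = {"methodX": "x: original", "methodY": "y: original"}
yours = {"methodX": "x: your fix", "methodY": "y: original"}   # already committed
other = {"methodX": "x: original", "methodY": "y: their fix"}  # still uncommitted

# The other developer updates: different methods were touched, so no conflict.
merged, conflicts = three_way_merge(base, other, yours)

# Had the other developer edited methodX as well, the update would need a human.
clash = {"methodX": "x: their rewrite", "methodY": "y: original"}
_, clash_conflicts = three_way_merge(base, clash, yours)
```

In the first call both fixes end up in the merged result; in the second, methodX is reported as a conflict and nothing is decided automatically.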

Switching to Carrot2.. for now, over & out.
