Chapter 1. Revision Control & File Compression/Packaging

Ted Choc

Ryan Seekely

Revision History
Revision 2.1    2004-05-11    TC
Lab revised for summer '04 semester
Revision 2.0    2003-05-08    JC4
Lab revised for summer semester. Section on File Compression/Packaging.
Revision 1.1    2003-01-16    RG
Added section on diff and additional links, corrections.
Revision 1.0    2003-01-16    RG
Initial release in DocBook format.

Table of Contents

Overview of Revision Control
How does it work?
What software provides revision control?
Preparation
Rules of the road
Installing CVS
Starting the Repository
Using CVS
CVS and your EDITOR
CVS command structure
Starting a project
Checking out a project
Adding files and directories
Committing files
Updating your code
Removing files
Retrieving older versions of code
Tagging
Reverting changes
Displaying differences (diff)
Using CVS in the real world
Remote access to CVS
Using CVS with groups in the CoC
Additional Resources
Additional readings
CVS-related tools
CVS and your editor/IDE
CVS keywords
File Compression/Packaging
tar/gzip
jar - The Java ARchiver
What you need to do
CVS
tar
gzip
jar
Turnin
How your lab will be graded.
Point Values
Important

Due on WebCT by 2004/08/27 16:00:00

Note

All due dates for this class will be in international standard date and time notation. (Wed 00:00 == Tue 24:00) != Tue 12:00. Read ISO 8601 for more information.

Overview of Revision Control

Using revision control allows you to keep a history of all of your source files. One of the main benefits of this type of source control is the ability to easily revert to previous version(s) of your code. For example, when an unexplainable bug is discovered, you can check earlier editions of your files in order to determine which set of changes induced the bug. Another benefit of using revision control is the ability to coordinate changes made to files by multiple users. If you were to simply keep one directory of the current working code, user A and user B could both make substantial changes to the source files, possibly writing over each other's changes. Talk about a way to get an irate group member!

Revision control provides an automated (well, almost automated) mechanism for logging all changes to your source code. This log can never be rolled back. Everything that happens to the code is logged, even when you delete a file! By keeping exact logs of every change that ever happens, we have the power to selectively go back and look at the source tree at any point in time, or to see differences between files, or even to remove the changes introduced in one version of a file.

You may not believe us now, but revision control is used everywhere in the real world. Every major software development shop uses revision control for all of their software packages. Every group (and even single individuals) can benefit from the options that revision control provides. We *strongly* recommend that you set up a CVS repository for your group as you begin to work on the group projects. CVS will not help your grade, but it will help alleviate some of the last-minute headaches that would hurt your grade.

How does it work?

Most revision control software packages do not store individual copies of each revision that you submit. Instead, they usually keep a file (or other structure) that lists the changes you have made relative to previous versions. This can greatly decrease the amount of disk space used. Typically, the main functions you will use when interfacing with the version control software will include commands that:

  1. Add new file(s)

  2. Check out existing file(s)

  3. Check in modified file(s)

  4. Do other functions, such as diffing files or checking history logs
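The space savings from storing changes instead of whole copies can be sketched with the ordinary diff utility. This is only an illustration of the idea under made-up assumptions (the file names and contents below are invented for the demonstration); the format CVS itself uses to store revisions differs in detail.

```shell
# Scratch directory so the demonstration leaves nothing behind in your account.
work=$(mktemp -d)

# "Revision 1": a 200-line file.
seq 1 200 | sed 's/^/line /' > "$work/rev1.txt"

# "Revision 2": the same file with only line 100 changed.
sed '100s/.*/line 100 (fixed a bug here)/' "$work/rev1.txt" > "$work/rev2.txt"

# Store only the change. diff exits non-zero when the files differ,
# which is exactly what we expect here.
diff "$work/rev1.txt" "$work/rev2.txt" > "$work/delta.txt" || true

# Compare the cost of keeping a second full copy with keeping the delta.
full_size=$(wc -c < "$work/rev2.txt")
delta_size=$(wc -c < "$work/delta.txt")
echo "full copy of revision 2: $full_size bytes"
echo "delta from revision 1:   $delta_size bytes"
```

The delta holds only the one changed line plus a little bookkeeping, so it is far smaller than a second full copy of the file.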

What software provides revision control?

There are many competing packages that provide revision control. Some of the most popular include:

For the purposes of this lab, you will be performing a variety of tasks using the revision control package known as CVS (Concurrent Versions System). CVS stores all of the files and directories currently under version control in the CVS repository. The repository should not be directly accessed. Instead, users check out their own copies of the repository, edit their own copy, and update the master files through the use of CVS commands.

Think of it like this--the repository is where the "master copies" are kept. The master files are kept in a format that you can't understand, as they are logs of everything that has changed in the file since it was introduced into the project. You will ask the CVS repository to give you a copy of the latest versions of the files, which you can work on just like ANY OTHER FILE on your computer, because the files CVS gives you aren't special. They're normal C/Java/whatever files just like anything else. When you are done working, you will tell CVS to send your changes back to the master repository so everyone else can see the good work you have done.

Preparation

Rules of the road

When using CVS, there are a few guidelines that will help you maximize the benefits achieved by using revision control. Here are a few of our recommendations:

  1. The first, and most important rule, is that you MUST use CVS regularly if you are going to see any real benefit! This means you must check your files in VERY frequently. You may have hundreds of revisions for a single file by the end of a long project, but that's OK. The more revisions, the easier it is to pick out small changes that might have broken something and roll them back out.

  2. Commit useful log messages. When you're trying to track down a bug, there's nothing worse than 50 revisions each with a log that says "Hello, world!"

  3. Don't commit non-compiling code to the repository. Nobody hates anything more than someone sending in code that breaks their working copy when they update. Commit only when you have a working build (make sure you check)! Having to manually roll back consumes time you might not have (since almost every team does their project at the last minute).

  4. Always update your local copy before a development session begins. This will help you reduce the amount of duplicated effort and conflicts you will have to resolve. Every time you sit down at a terminal to start coding, update. This way you can see what everyone else has done since you last worked on the code.

  5. Arrange who will be working on which file, and when. Even though CVS does all of the automatic change merging for you, if two people make changes to the same portion of the same file, CVS doesn't know what to do, and will ask you to clarify. Normally this isn't an issue if you've split your project up well, but failure to do so could be a problem, as you will have to manually go back and merge the two different versions of the file together.

  6. When you get to a major point in time where parts are working, take a few seconds to tag the tree. Tagging the source will help you get back to that point later, should you ever need to start over or get a quick version of what was turned in again to check for a bug. Many professional developers prefer to set milestones for functionality, and when a milestone is reached, the code is tagged before work continues. Projects in this class aren't long enough to really benefit from well-defined milestones, but you should have smaller goals in mind and schedule those goals well before coding begins.

Installing CVS

CVS binaries can be downloaded from several web sites. Many distributions of Linux come with CVS already installed; if yours does not, installing CVS is very simple. CVS can be obtained from its homepage at http://www.cvshome.org/. The Sun Solaris and RedHat Linux machines in the College of Computing all have CVS pre-installed. To check the version of CVS installed, type cvs -v in a shell. You should see something like this:

        [68(F) tedchoc@helsinki tedchoc]$ cvs -v

        Concurrent Versions System (CVS) 1.11.2 (client/server)

        Copyright (c) 1989-2001 Brian Berliner, david d `zoo' zuhn,
                                Jeff Polk, and other authors

        CVS may be copied only under the terms of the GNU General Public License,
        a copy of which can be found with the CVS distribution kit.

        Specify the --help option for further information about CVS
        

The versions will change from machine to machine, but all CVS versions in the CoC interoperate without any problems, so don't worry. If you can't get this working, get a TA to help you.

Starting the Repository

The first thing you have to do before you can work with CVS is tell CVS where the repository is. A repository is a place where CVS will keep all of the data about all of the projects you're working on. A single repository can have as many projects (aka modules) in it as you wish. There is no need to start a new CVS repository for each project you work on in this class; you can use the same one over and over.

The repository can be any directory that you have write access to (in the CoC this is normally only your home directory and /tmp, and we do not advise you to EVER create anything worthwhile in /tmp, ESPECIALLY your CVS repositories). CVS will store everything it needs to maintain the revision control there, and it is probably a good idea to use an empty directory for the repository so you don't get non-CVS files mixed up with CVS files. You tell CVS what directory the repository lives in with the environment variable CVSROOT. Here are some examples (to determine which shell you are using, type finger -m username and look at the "Shell:" line).

Example 1.1. Setting your CVSROOT

          For sh/bash shells:
            % export CVSROOT=/dev/null

          For csh/tcsh shells:
            % setenv CVSROOT /dev/null
          
Note

The above example will NOT work for you, as /dev/null is a special "empty" device that discards all data, not a directory. You need to make a new directory somewhere to keep your personal repository. If you have made a folder but don't know the absolute path to the folder, cd into that folder and type pwd to see the absolute path to the current folder.

Important

For the purpose of this lab you should call your cvsroot mycvsroot. However, you may put it wherever you want (though putting it in your home directory is probably easiest).

Now that CVS knows where it has to live, you can tell CVS to set up the repository for use with the simple command:

Example 1.2. 

            cvs init
          

That's it! Your CVS repository is ready to use!
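Putting the pieces of this section together, a working setup might look like the sketch below. It assumes bash/sh syntax and assumes you keep the repository at mycvsroot directly inside your home directory; both are choices you can change, and csh/tcsh users would use setenv instead of export.

```shell
# Assumed location: a directory named mycvsroot in your home directory.
repo="$HOME/mycvsroot"

# The repository directory must exist before CVS can use it.
mkdir -p "$repo"

# Tell CVS where the repository lives (bash/sh syntax).
export CVSROOT="$repo"

# Initialize the repository; this creates the CVSROOT directory inside it.
if command -v cvs >/dev/null 2>&1; then
    cvs init
else
    echo "cvs not found on this machine; ask a TA for help" >&2
fi
```

The export only lasts for the current shell session, so you will need to set CVSROOT again each time you log in (or add the export line to your shell startup file).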

Warning

CVS will make a directory called CVSROOT inside of wherever you told it to keep the repository. Don't touch the things in the repository directory unless you know what you're doing, as it is critical to CVS operation (not to mention rather cryptic).

Using CVS

CVS and your EDITOR

From time to time CVS will give you the option to enter text, normally to describe changes made to a file or files. By default CVS will use the editor specified by the EDITOR environment variable. When you first log in, this is not set, and CVS will default to the system editor, vi. If you do not know vi, now is a great time to learn. However, if you find vi too cumbersome, you can override the EDITOR. On CoC systems, an alternative editor is available called pico. It is very simple to use, and may be a good choice if you are not familiar with vi or UNIX/Linux. To use pico instead of vi, set the variables as follows (please be sure to use the lines that match your shell and the machine you are using):

Example 1.3. 

            # bash/sh shells
            export EDITOR="/usr/bin/pico"       # RedHat machines
            export VISUAL="/usr/bin/pico"       
          
            export EDITOR="/usr/local/bin/pico" # Solaris machines
            export VISUAL="/usr/local/bin/pico" 
          
            # csh/tcsh shells
            setenv EDITOR pico         # RedHat and Solaris machines
            setenv VISUAL pico
          
Note

You may wish to use Emacs. Emacs can be started quickly in a console window by setting your editor to "emacs -nw".

If just setting the EDITOR environment variable does not change your editor, set the VISUAL environment variable in the same manner shown above. CVS should then use the editor you want.
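If you move between RedHat and Solaris machines, you may prefer not to remember which path applies where. The following is a minimal sketch for bash/sh that tries both pico locations mentioned above and falls back to vi when neither exists; the loop variable name is made up for this example.

```shell
# Try the two pico locations in turn and use the first one that exists.
for candidate in /usr/bin/pico /usr/local/bin/pico; do
    if [ -x "$candidate" ]; then
        EDITOR="$candidate"
        VISUAL="$candidate"
        break
    fi
done

# If pico was not found and nothing was set already, fall back to vi.
: "${EDITOR:=vi}"
: "${VISUAL:=$EDITOR}"
export EDITOR VISUAL

echo "CVS will open log messages in: $EDITOR"
```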

If you get stuck in vi and can't get out, hit escape a few times, then type ":q!" (no quotes) and hit enter (you should see ":q!" at the bottom of your screen when you are typing).

CVS command structure

All of the CVS commands follow a similar pattern:

          cvs [global cvs options] [command] [command specific options]
        

Let's look at the first command we will run:

          .     +----+---+------+-------------------+
          .     |cvs |   |import|lab2 gburdell start|
          .     +-+--+-+-+--+---+--------+----------+
          .       |    |    |            |
          .       |    |  command        |  
          . cvs binary |            command options
          .            |
          .          global options
        

In the example above, our command is import, which adds projects to the repository (located via CVSROOT). The 'lab2' is the name that cvs will use to keep track of this project (aka module). Anything before the command name (in this case "import") is an option to CVS itself. Anything after the command is an option to that command only. Watch out, as this can get tricky when two options share the same name (e.g. -d for cvs and -d for checkout).

Starting a project

For CVS to be of any use it must have at least one project (aka module) created in it. Generally you only create a single CVS project for each large program/lab you are working on. Suppose we have a project named bigProj that's in a base directory called bigProjDir. Simply cd into bigProjDir and type:

Example 1.4. 

            % cvs import bigProj <insert your username here> start
          

The above command will then import everything in the current directory (recursively!) into the CVS project/module named bigProj.

If you are starting a brand-new project (as is the case with this lab), it is standard practice to make a new empty directory and import from within that empty directory. Importing an empty directory will simply make CVS start a new module in the CVS repository with the given name, and the new module will have no files.

Warning

DO NOT IMPORT FROM YOUR HOME DIRECTORY! If you run the CVS import from your home directory, CVS will attempt to import ALL files in your home directory! CVS will continue to fill your repository with useless information and copies of EVERY file in your home directory until your quota is exhausted!

It is suggested that you make a directory called bigProjDir and then change into bigProjDir to run the import command (as shown above).

Make sure you change the <insert your username here> to whatever your CoC login is. CVS will ask you for some comments to document adding this project to the repository. When you save the changes and exit the editor, CVS will make the new directories required to host this project in the repository and start new versions of all of the files in the current directory (and all of its child directories).

The second and third options are the vendor and release tags. They are utilized for advanced CVS features--you don't have to worry about what they mean. You just have to give CVS two words to make it happy. We recommend using your username as the vendor and start as the release, but this is just a suggestion.

Once you have completed the CVS import, do NOT continue to work on the files in the directory that you imported from, as CVS is not tracking those files. To begin to work on your project/module, you MUST do a CVS checkout into another directory. If you are wondering whether a file is editable or being tracked by CVS, a quick way is to look for a "CVS" directory in the same directory as the file you are editing. If the directory is not there, CVS does not know about the file!

Note

Even if a CVS directory is present, if the file has not been added to the project, CVS does not know about the file. You can use the command:

            cvs status <filename>
          

to check the status of a file. If it says "Unknown" then the file is not being tracked by CVS.
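The "look for a CVS directory" check can be wrapped in a small shell function. This is a sketch only; the function name is made up for this example, and it tells you whether a directory belongs to a checked-out working copy, not whether any particular file in it has been added (cvs status remains the way to check an individual file).

```shell
# Succeeds if the given directory (default: the current one) contains the
# CVS bookkeeping directory, i.e. it is part of a checked-out working copy.
is_cvs_working_dir() {
    [ -d "${1:-.}/CVS" ]
}

if is_cvs_working_dir .; then
    echo "This directory is part of a CVS working copy."
else
    echo "No CVS directory here: CVS is not tracking these files."
fi
```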

Checking out a project

Once you have put something into any code repository system, you will want to get it back out to make sure that it worked. After that, you probably want to remove your original code and use only the CVS-tracked code so that you don't mess anything up and confuse the repository system.

So, now we have a project in the repository. In order to check out a project from a CVS repository you use the CVS "checkout" command followed by the name of the project:

Example 1.5. 

            cvs checkout bigProj
          
Note

For the purposes of this lab, it is probably a good idea to be in your home directory when you run the above command.

This will check out bigProj and put a working copy of it in a directory called bigProj. You can now look through that project and make changes in the files. Remember, the files that are in bigProj are normal files, just like any other file you might create. You can edit them in any editor you want. CVS will create CVS directories inside of every subdirectory of your project. Don't touch these directories or the files inside of them, or CVS will break and you won't be happy.

Adding files and directories

Adding a file to a CVS repository takes two steps. The first step is scheduling it to be added:

Example 1.6. 

            cvs add filename
          

If you are trying to add a file that is NOT plain text or source code, you should tell CVS to store the file as a binary. To tell CVS that a file is a binary and avoid possible corruption, use the -kb flag to the add command:

Example 1.7. 

            cvs add -kb binary_filename
          

The next step is when the file is ACTUALLY added to the repository, as opposed to just being marked to be added:

Example 1.8. 

            cvs commit filename
          

Adding directories works in a similar fashion, only you don't have to commit the directory. When making new directories, you have to add the directory BEFORE you can add the files in the directory to the repository. Adding a directory has identical syntax to adding a file:

Example 1.9. 

            cvs add <directoryname>
          

Committing files

When it is time to commit, simply issue the "commit" command followed by the files you want to commit. If you don't supply any filenames, CVS will attempt to commit all changes in all files and directories including and below the current directory. This is generally NOT a good idea. Whenever you commit a file, CVS asks for a log message, which will be brought up in your EDITOR that we set above. Log messages should briefly describe the changes you made since the last commit. This way, when it's time to look at who broke what, you have some idea of what you did when you submitted your 400+ lines of changes.

Keep log messages as brief as possible, without losing important details. Here is the command to commit a file to the repository:

Example 1.10. 

            cvs commit <filename>
          

Updating your code

If someone else has committed changes to the repository and you want to bring your working copy up to date, all you have to do is type:

Example 1.11. 

            cvs update
          

The cvs update command brings in any changes that affect the directory you are in or any of its subdirectories. However, if your partner has made new directories in the repository, this WON'T get the new directories. To update and get all new directories, use the -d flag to the "update" command, as follows:

Example 1.12. 

            cvs update -d
          

Removing files

From time to time you'll end up with some cruft left over in your CVS tree that isn't needed. Removing files works just like adding files, with one extra step: you must first delete the file from the disk, and only THEN can you remove it from the CVS repository. So first you delete the file from the directory, then you cvs "remove" the file (you must specify a filename), then you cvs "commit" the nonexistent file. Below is the sequence of commands to remove a file:

Example 1.13. 

            rm filename
            cvs remove filename
            cvs commit filename

Remember earlier, when we said CVS never forgets anything, even deletions? The same is true for your deleted file. CVS still has a copy of that file around, should you need it. More on this later.

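Removing directories from CVS is similar. Delete, cvs "remove", and commit the files inside the directory first; after that, running cvs remove on the directory itself needs no separate commit. CVS keeps the directory in the repository for the sake of history, so use cvs update -P if you want empty directories pruned from your working copy: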

Example 1.14. 

            cvs remove directoryname
          

Retrieving older versions of code

Let's say you've been working on a piece of code for an hour or so, and at the end of the hour, due to lack of proper planning or fate or both, you decide it's best to just discard your changes. But you don't want to lose the file, so how do you get the old version back? This is simple enough. Simply issue a CVS update command, but using the -r flag to specify what version you want to retrieve (hint: this works for deleted files too!).

Example 1.15. 

            cvs update -r 1.3 filename

The CVS update command will only update your file if it is up to date with the tree of previously specified version. Once you specify the -r flag, a "sticky" tag is associated with that file until you manually clear it. This means that the file will be stuck in time at the version you specified and will NOT be updated to newer versions on subsequent update commands (unless explicitly specified). Also note that you CANNOT CHANGE a file that has been given a sticky version number, as you have requested CVS to freeze that file in time. If you want to get the older version of a file and wish to begin making changes to that file, see the section on Reverting changes.

Note

If you want to clear any sticky versions or tags on the file, run this command:

            cvs update -A

The -A flag will clear all sticky versions/tags associated with a file and bring it up to the current HEAD version.

Tagging

Another useful CVS command is the "tag" command. At the beginning of the lab we talked about making a "moment in time" for your source code. This moment in time we talk about is called a "tag". Tags are ways to quickly retrieve a certain state of the software. Tagging is a very simple process, and can be very useful. Before you tag, make up some sort of name for this tag that you can remember later. Then to tag, simply go to the top directory of the project and use the "tag" command.

Example 1.16. 

            cvs tag lab9-turnin

That command would assign every current version of a file in the repository to be a member of the "lab9-turnin" tag. If you wanted to get your project or a file back to whatever state it was in at the time you made the tag, use the name of the tag with the "-r" option.

Example 1.17. 

            cvs checkout -r lab9-turnin projectname
            cvs update -r lab9-turnin filename

Reverting changes

From time to time you may find it necessary to roll back changes made in a file to a previous revision. To do this you need to use the -j switch to the update command. For instance, let's say we had a file called test.c which is currently at revision 1.5. However, we want to revert this file to version 1.2. To do this we simply execute this sequence of cvs commands:

Example 1.18. 

            cvs update -j 1.5 -j 1.2 test.c
            cvs commit -m "Reverting to 1.2 code." test.c
          

The second command (the commit command) is optional, but probably a good idea so everyone knows that you rolled back to the 1.2 code before you continued to make changes.

Displaying differences (diff)

To "diff a file" is a term derived from the Unix command diff(1), which compares two files and displays any differences between them. CVS has a similar command that will compare a file to the version in the repository.

Example 1.19. 

            cvs diff myfile
          

The previous command will compare the version of myfile in the current directory to the one in the repository. If no differences exist, nothing will be displayed.

When run without a file name, all files in the current directory (and, recursively, all files in all sub-directories) will be compared and their differences listed. This is very useful if several files have been modified during a programming session and you are not certain that you have committed all of them.
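If you have never read diff output before, it may help to try plain diff(1) on two tiny files first. The sketch below uses invented file names and contents; cvs diff wraps similar output in some extra header lines of its own.

```shell
# Two small files that differ only on their second line.
work=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$work/old.txt"
printf 'alpha\nBETA\ngamma\n' > "$work/new.txt"

# Identical files: diff prints nothing and succeeds.
diff "$work/old.txt" "$work/old.txt" && echo "no differences"

# Differing files: diff prints the changes and exits non-zero,
# so "|| true" keeps that from being treated as an error.
diff_output=$(diff "$work/old.txt" "$work/new.txt" || true)
echo "$diff_output"
```

The first line of the output, 2c2, says that line 2 of the first file was changed into line 2 of the second. Lines starting with < come from the first file and lines starting with > come from the second.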

Using CVS in the real world

Remote access to CVS

So CVS sounds very useful, but what happens if you want to use your CVS project somewhere other than the CoC? To do this, you have to tell CVS just a little more about where exactly you want it to look for the repository and how you want to connect to it.

The first thing to set is the CVS_RSH variable. When accessing CVS remotely you will almost always use SSH. SSH provides a sufficiently secure manner of logging into the server and encrypting data between your computer and the repository. To tell CVS to use ssh, use the command that matches your shell:

Example 1.20. 

            % export CVS_RSH=ssh #bash
            % setenv CVS_RSH ssh #csh/tcsh
          

The next variable is the CVSROOT directory. We used CVSROOT before with just a directory name when the repository was on the local machine. When asking CVS to connect to a remote machine, we have to be a little more explicit about how we need to get to the repository. We want to tell CVS that we are using an external program (ssh) to manage the connection and then we need to give CVS our user name to connect with, the machine where the repository is located, and (of course) the directory. Remote CVSROOT settings follow this pattern:

          :AUTHENTICATION:USERNAME@MACHINE:REPOSITORY_DIRECTORY

Example 1.21. 

            export CVSROOT=":ext:landfill@oscar.cc.gatech.edu:/tmp/cvsroot" #bash
            setenv CVSROOT ":ext:landfill@oscar.cc.gatech.edu:/tmp/cvsroot" #csh/tcsh
          

After you have done this you should be able to use the regular CVS commands just as if the repository were on the same machine, assuming that permissions are set up correctly on the host machine. You will be prompted to log in before CVS will do anything unless you've set it up otherwise (which is outside the scope of this class, consult your resident Unix guru if you want to know more).
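To see how the pieces of a remote CVSROOT line up with the :AUTHENTICATION:USERNAME@MACHINE:REPOSITORY_DIRECTORY pattern, you can take one apart with plain shell string operations. This is only an illustrative sketch; the variable names are made up, and CVS does its own parsing internally.

```shell
# The remote CVSROOT from the example in this section.
root=":ext:landfill@oscar.cc.gatech.edu:/tmp/cvsroot"

rest="${root#:}"          # drop the leading colon
method="${rest%%:*}"      # AUTHENTICATION: everything up to the next colon
rest="${rest#*:}"
cvs_user="${rest%%@*}"    # USERNAME: everything before the @
rest="${rest#*@}"
cvs_host="${rest%%:*}"    # MACHINE: everything before the next colon
repo_dir="${rest#*:}"     # REPOSITORY_DIRECTORY: whatever is left

echo "authentication: $method"
echo "username:       $cvs_user"
echo "machine:        $cvs_host"
echo "directory:      $repo_dir"
```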

Using CVS with groups in the CoC

CVS is possibly most useful when more than one person is working on a project. Since a good portion of this class is about groups, you'll probably want to use CVS with a group at some point. To do this, all you have to do is make the repository writable by everyone. Yes, those of you who know about security will know that this is a HORRIBLE idea in theory, but CNS does not leave us with any other choice if you wish to use CVS with groups on your CoC account.

To make your repository world-writable, do this:

          chmod -R ugo+w repository-directory

It's probably a good idea to make your home directory execute only so people can still get to the repository without being able to see your files. To do this, execute this command:

          chmod go-rwx ~
          chmod go+x ~

Those should provide sufficient permissions to your cvs repository so other group members can access it and make changes to your files.
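If you would like to see what those chmod commands do before running them on your real home directory, you can try them on scratch directories first. In this sketch the directory names are stand-ins invented for the demonstration.

```shell
# Scratch directories standing in for your home directory and the
# repository inside it.
fake_home=$(mktemp -d)
mkdir "$fake_home/mycvsroot"

# Repository: writable by everyone (the trade-off discussed above).
chmod -R ugo+w "$fake_home/mycvsroot"

# Home directory: group and others may pass through it (x) but may not
# list its contents (r) or create files in it (w).
chmod go-rwx "$fake_home"
chmod go+x "$fake_home"

# Show the resulting permission strings.
home_mode=$(ls -ld "$fake_home" | cut -c1-10)
repo_mode=$(ls -ld "$fake_home/mycvsroot" | cut -c1-10)
echo "home: $home_mode"
echo "repo: $repo_mode"
```

In the output, the group and other columns for the stand-in home directory should show only x, while the repository shows w for user, group, and other.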

Additional Resources

Additional readings

This lab is a brief introduction to some of the more basic features of CVS. The topics we covered are sufficient to get you started, but they're far from being a complete list of things you can do. To learn more about CVS, you can visit its homepage at the following URI:

http://www.cvshome.org/

A better resource is "Open Source Development With CVS", a book available at your favorite bookseller. The author (Karl Fogel) has been kind enough to post the content related to CVS online for free. You can read the CVS chapters online; they provide an excellent resource to help you learn about everything CVS can do for you. The URI for the book is:

http://cvsbook.red-bean.com/

CVS-related tools

Below is a brief list of CVS-related resources that you may find useful. They range from diff-style tools to GUIs to use CVS. None of these resources are required for the purposes of this lab, but you may find them useful when you begin to use CVS on your own.

          URI: http://www.tortoisecvs.org/
          Name: TortoiseCVS
          Description: A great Windows tool that integrates CVS into Windows
          Explorer.  Makes CVS a breeze for Windows users.
        
          URI: http://www.cvsgui.org/
          Name: WinCVS/MacCVS/gCVS
          Description: A collection of interfaces to CVS for Windows, Mac, and
          GNOME (Unix) users.
        
          URI: http://www.lincvs.org/
          Name: LinCVS
          Description: Another Linux GUI frontend to CVS, based off of the Qt
          libraries.
        
          URI: http://xxdiff.sourceforge.net/
          Name: xxdiff
          Description: A replacement for diff(1).  Graphically shows
          differences between two files.
        
          URI: http://winmerge.sourceforge.net/
          Name: WinMerge
          Description: A diff-style tool for Windows users, can be used instead
          of normal cvs diff with Windows CVS UIs.
        

CVS and your editor/IDE

Now that you know all about CVS, you might be wishing it was a little easier to integrate with the way you work in the class. Luckily for you, there are a number of editors available that will seamlessly integrate with CVS, so you can check your changes in and see differences without ever having to leave your editor. NOTE: Editors and CVS integration will not be supported by your TA. If it doesn't work for you, ask your resident technical support "friend" to help you out.

If you're an Emacs person, you can easily start using CVS from within Emacs. Simply add the following line to your .emacs file (located in your home directory):

(require 'vc)

Now, when editing a file, use the C-x v v key sequence to check in changes, C-x v = to see a diff against the last version, and more. Explore the options for "Version Control" that appear in your "Tools" menu.

In the future you will be spending several weeks using the Java programming language. Here is a quick list of Java-centric editors that offer seamless CVS integration:

          URI:         http://www.netbeans.org/
          Name:        netBeans
          License:     Various (all free for professional or personal use)
          Description: One of the most powerful Java IDEs available,
                       integration with Tomcat, JSP, HTML, XML, Java,
                       plain text editing. Also features an integrated
                       CVS client, GUI development tools, powerful inline
                       automatic code completion, and more.
        
          URI:         http://www.intellij.com/
          Name:        IntelliJ IDEA
          License:     Commercial ($99 Academic License)
          Description: A popular commercial Java IDE that features an
                       integrated CVS client.  Visit the website for
                       more details.
        
          URI:         http://www.eclipse.org/
          Name:        Eclipse
          License:     CPL (Free)
          Description: An industry (IBM, Rational, Borland, Merant, RedHat,
                       and others) supported IDE.  Offers Java development
                       and CVS integration.  C/C++ is also supported through
                       a module; look under Projects -> Eclipse Tools for
                       the C/C++ IDE. Other languages/modules are available
                       from third parties; see http://eclipse-plugins.2y.net/
                       for more information.
        
          URI:         http://java.sun.com/j2se/1.4.1/download.htm
          Name:        SunONE (formerly: "Forte For Java")
          License:     "Community Edition" available free of charge
                       (with some restrictions)
          Description: SunONE/Forte for Java is a nice IDE for Java
                       which includes GUI-based CVS interaction,
                       support for editing ANT files, working with
                       RMI and has several sample/template files
                       available for those new to RMI and ANT.
                       SunONE/Forte is based on NetBeans.
          Availability:Version 3.0 is installed on most Sun and
                       Linux systems in the CoC. Use "ffj" to
                       run it.
        

While we're on the topic of editors, here is a quick list of places you can find additional information about other Java editors:

JavaWorld Developer Tool Guide - IDE
Open Directory Project's Java IDE listing

CVS keywords

So what happens if you want to know what version a file is? How do you know, other than running cvs status on it manually? Luckily, for those of you who like quick ways to identify your files, there is an answer. Putting the keyword $Id$ in your file causes CVS to expand it into a string such as $Id: lab1.xml,v 1.3 2004/05/13 02:13:33 tedchoc Exp $, which identifies the current revision of the file and is automatically updated every time you commit the file. Be sure to put the keyword inside a comment, as it definitely won't compile otherwise. There are other keywords with special meaning, such as $Revision$ (which expands to something like $Revision: 1.3 $); read your handy CVS manpage to learn more.

File Compression/Packaging

tar/gzip

History

tar was originally written as a utility for copying files stored on a local hard disk to a tape backup. In fact tar is short for "Tape ARchiver" (this simple fact explains some of the idiosyncrasies you may notice when using tar).

Over time tar has evolved into the preeminent file archiving tool used in the unix world. On many platforms (such as FreeBSD and RedHat Linux) tar has become integrated with various compression utilities (such as gzip, bzip2, or compress).

Format of a tar command

The tar command string may look confusing at first but it is really quite simple. In fact it is very similar to the command strings used for CVS.

            .     +----+---+----------+---------+
            .     |tar |  c|vf <file> | [<dir>] |
            .     +-+--+-+-+------+---+----+----+
            .       |    |        |        |
            .       |  command    |     [directory]
            . tar binary        options
            .
            .
            . Note: The directory is only included when creating a
            .       new archive.
          

Common commands include:

            . c - Create a new archive
            . x - Extract files from an existing archive
            . t - List files contained within an archive
          

Common options include (use them in this order):

            . z        - Use gzip to compress/uncompress the file
            .            (does NOT work under Solaris)
            .
            . v        - Run in "verbose" mode, causes tar to print out
            .            the names of all files as it archives or
            .            extracts them (you probably want to use this
            .            whenever you work with tar)
            .
            . f [file] - Perform all operations using the specified file
            .            (without this option tar assumes it is using a
            .            tape drive - so don't forget it)
          

Be sure to notice that the options always follow the commands and that the file option is always last. Also, there is no space between the options/commands. (It is possible to use the "gnu style flags" [e.g. "-z"], however this is uncommon and does not translate over to using jar, so it will not be covered in this lab.)

Using tar to recursively archive a directory

The most useful feature of tar is its ability to recursively archive the entire contents of a directory (e.g. "myDirectory") into a single file (e.g. "myArchive.tar"). The command to do so looks like this:

Example 1.22. 

              % tar cvf myArchive.tar myDirectory
            

The ".tar" extension is the standard for a tar archive (sometimes called a "tarball"). Make sure to use this extension on all tar files.

Listing the files contained in a tarball

To list the contents of a tar file (e.g. "myArchive.tar") simply use the "t" command:

Example 1.23. 

              % tar tvf myArchive.tar
            

Extracting all files from a tarball

To extract the contents of a tar file (e.g. "myArchive.tar") simply use the "x" command:

Example 1.24. 

              % tar xvf myArchive.tar
            

This will extract the entire archive to the current directory, preserving the directory structure that had existed when the tar was created. (In the case of the example here, it will create the directory "myDirectory".)
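The create, list, and extract commands above can be combined into a quick round-trip check. The following is only a sketch, assuming a POSIX-style shell (sh or bash) rather than the shell behind the "%" prompt; the sample files and the "restore" directory are hypothetical names chosen for illustration.

```shell
# Build a small, hypothetical directory tree to archive.
mkdir -p myDirectory/src
echo "hello" > myDirectory/README
echo "class A {}" > myDirectory/src/A.java

# Create the archive, then list it to confirm what went in.
tar cf myArchive.tar myDirectory
tar tf myArchive.tar

# Extract into a separate scratch directory so the original is untouched.
mkdir -p restore
(cd restore && tar xf ../myArchive.tar)

# diff -r prints nothing and exits 0 when the two trees are identical.
diff -r myDirectory restore/myDirectory && echo "round trip OK"
```

Extracting into a scratch directory rather than on top of the original is a deliberate choice: it lets you compare the two trees, which is a cheap way to confirm an archive is complete before you submit it.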

Using gzip compression on a tar archive - FreeBSD/RedHat Linux

On FreeBSD or RedHat Linux one can just add the "z" option onto all commands and tar will automatically use gzip. For example:

Example 1.25. 

              % tar czvf myArchive.tar.gz myDirectory

              % tar tzvf myArchive.tar.gz

              % tar xzvf myArchive.tar.gz
            

Note that it is important to use the ".tar.gz" extension when working with a gzip-compressed tar file. (Some people substitute ".tgz" for ".tar.gz" which is a widely-accepted alternative.)

Using gzip compression on a tar archive - Solaris

Solaris's version of tar does not support the "z" option (this is due in part to licensing restrictions on gzip). So you will have to run gzip separately when working in Solaris.

Running gzip to compress a file (e.g. "myfile.tar") is simple:

Example 1.26. 

              % gzip myfile.tar
            

This will gzip the file (replacing the original with the gzipped version). Note that gzip will automatically append the ".gz" extension onto the file name indicating that it has been compressed with gzip.

Uncompressing a gzipped file is simply the reverse...

Example 1.27. 

              % gunzip myfile.tar.gz
            

This will replace the file (e.g. "myfile.tar.gz") with an uncompressed version (e.g. "myfile.tar").
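When the "z" option is unavailable, tar and gzip can also be joined with a pipe, so that no intermediate ".tar" file is written and the compressed file is left in place when reading. This is a sketch of that common idiom, assuming a POSIX-style shell; it relies on "-" as the archive name (standard input/output) and on gzip's -d (decompress) and -c (write to standard output) options. The directory and file names are hypothetical.

```shell
# A small, hypothetical directory to work with.
mkdir -p myDirectory
echo "sample data" > myDirectory/notes.txt

# Create: tar writes the archive to standard output ("-"), gzip compresses it.
tar cf - myDirectory | gzip > myArchive.tar.gz

# List: gzip -dc decompresses to standard output and keeps the .gz file.
gzip -dc myArchive.tar.gz | tar tf -

# Extract into a scratch directory the same way.
mkdir -p scratch
(cd scratch && gzip -dc ../myArchive.tar.gz | tar xf -)
```

Unlike running gunzip first, this leaves "myArchive.tar.gz" untouched, which is convenient when you only want to inspect a file you are about to submit.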

jar - The Java ARchiver

jar is the standard packaging utility used in the Java community. In addition to packaging files (like tar) with compression (like gzip - though jar actually uses a version of zip compression), jar has the ability to make the archive "executable" by java (through use of a manifest file - this will be explained in detail in Lab2).

The jar syntax is identical to the tar syntax:

          .     +----+---+----------+---------+
          .     |jar |  c|vf <file> | [<dir>] |
          .     +-+--+-+-+------+---+----+----+
          .       |    |        |        |
          .       |  command    |     [directory]
          . jar binary        options
          .
          .
          . Note: The directory is only included when creating a
          .       new archive.
        
[Note]Note

The current version of jar is always kept under the ~cs2335 directory on the CoC systems:

            . On RedHat systems:  ~cs2335/java/j2sdk1.4.2_03-linux/bin/jar
          

Common jar commands include "c", "x" and "t", all of which have the same meaning as the identically-named tar commands. Also in common between jar and tar are the "v" and "f [file]" options.

Sample usage:

Example 1.28. 

            % jar cvf myJar.jar myDirectory

            % jar tvf myJar.jar

            % jar xvf myJar.jar
          

All Java Archives are named using the ".jar" extension.

[Note]Note

If you do not specify a manifest file to use (which will be explained in Lab2), jar will automatically add a basic one into your Java Archive when the archive is created (it will be named "META-INF/MANIFEST.MF"). Just ignore it for now.

[Warning]Warning

Unless otherwise stated in the lab writeup, all CS2335 assignments must be compressed/archived using one of the following:

        . tar ......... A simple tar archive which must be named with a
        .               ".tar" extension (note that gzip has not been
        .               used here).
        . tar + gzip .. A gzip-compressed tar archive which must be named
        .               with either a ".tar.gz" extension or a ".tgz"
        .               extension.
        . jar ......... A simple jar archive which must be named with a
        .               ".jar" extension.
        . zip ......... A zip-compressed archive which must be named with
        .               a ".zip" extension. (For details on using zip
        .               see the man page, zip(1).)
      

NOTE: Always, always, always make sure you use the appropriate file name extension. If you use the wrong extension the TAs will not know how to unarchive/uncompress the file and you will get a ZERO on the lab.

No other type of file archival/compression will be accepted! If you submit another type of file (e.g. a .rar file) for any lab you will get a ZERO - you have been warned.
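Because a mismatched extension is so costly, it can be worth checking a file before submitting it. The helper below is a hypothetical sketch (check_archive is not a course tool), written for a POSIX-style shell and covering only the tar-based formats; it uses gzip's -t option to test whether a file really contains gzip data. It does not detect other compression formats, and it ignores ".jar" and ".zip" files.

```shell
# check_archive is a hypothetical helper (not a course tool). It returns 0
# only when the file's contents match what its extension promises.
check_archive() {
    case "$1" in
        *.tar.gz|*.tgz)
            # Must be valid gzip data AND contain a readable tar archive.
            gzip -t < "$1" 2>/dev/null &&
                gzip -dc < "$1" | tar tf - > /dev/null 2>&1 ;;
        *.tar)
            # Must NOT be gzip data, and must be readable by tar directly.
            ! gzip -t < "$1" 2>/dev/null &&
                tar tf "$1" > /dev/null 2>&1 ;;
        *)
            echo "not checked by this sketch: $1" >&2
            return 2 ;;
    esac
}

# Demonstration with hypothetical file names.
mkdir -p demo && echo "data" > demo/file.txt
tar cf demo.tar demo
gzip -c demo.tar > demo.tar.gz      # -c keeps demo.tar in place
cp demo.tar.gz wrong.tar            # gzip data hiding behind a .tar name

check_archive demo.tar    && echo "demo.tar: ok"
check_archive demo.tar.gz && echo "demo.tar.gz: ok"
check_archive wrong.tar   || echo "wrong.tar: extension does not match contents"
```

The explicit gzip test in the ".tar" branch is there because some tar implementations detect compression automatically when reading a file, which would otherwise hide a mislabeled archive.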

What you need to do

For this Lab you will have to create a CVS repository, use it and then archive/compress it using tar/gzip/jar/zip.

CVS

Setup

While this part is not going to be graded directly, you will not be able to complete the rest of the lab if you omit it.

  1. Set the necessary environment variables (including CVSROOT, EDITOR, VISUAL, etc., as appropriate).

    [Note]Note

    You should name your cvsroot "mycvsroot".

  2. Initialize a new CVS repository

  3. Create and import a new project

  4. Check out your project

Working with CVS

You must work with your CVS repository for a bit, and in doing so you should complete each of the following tasks at least once:

  1. Edit and commit files

  2. Add new files (BOTH text and binary ones)

  3. Remove files from CVS

  4. Log comments

  5. Roll back changes

  6. View CVS logs

  7. View a diff between two versions

tar

You must create a tarball of your "mycvsroot" directory. Be sure to name it appropriately.

For those of you using FreeBSD or RedHat Linux you can combine the "tar" requirement with the "gzip" requirement (listed next) if you use the "z" option. But make sure your file name is appropriate.

gzip

You must gzip the tarball you created in the previous section. Again, make sure to name it appropriately.

jar

You must make a directory called "MyLab1" and put your gzipped tarball in it. Then use jar to add the "MyLab1" directory to a new Java Archive.

Turnin

Submit the Java Archive you just created to WebCT before the deadline. Details on using WebCT will be posted to the class newsgroup once WebCT has been set up for the semester.

How your lab will be graded.

After submitting your lab to WebCT, you should retrieve it and perform the following tasks (this is what your TA will do to grade it):

  1. Make a directory to work in and download your WebCT submission to it.

  2. Extract the files from the Java Archive and change into the "MyLab1" directory.

                  % jar xvf <jar file name>
                  % cd MyLab1
                
  3. Uncompress/extract the files from the gzipped tarball:

    In Solaris:

                  % gunzip <gzipped tarball name>
                  % tar xvf <tarball name>
                

    and in RedHat Linux:

                  % tar xzvf <gzipped tarball name>
                
  4. Now set your CVSROOT variable to point to the "mycvsroot" directory you just extracted.

  5. Check out the project from the repository and work with it.

Point Values

          . Use of jar  - 10%
          . Use of gzip - 10%
          . Use of tar  - 10%
          . Use of CVS  - 70%
        
[Warning]Warning

If for any reason your TA cannot extract your cvsroot using the method outlined above, you will get NO CREDIT for that portion of the lab. So it is highly advisable that you retrieve your submission from WebCT and follow the procedure outlined above.