Software Engineering: Building with Ant

Software Engineering
Software Configuration Management

Automate your Java Build Process Using CVS, Ant, and a Nightly Build System

Summary

Software development is full of best practices that are often talked about but rarely followed.

Software configuration management is the task of tracking and controlling changes in the software. Configuration management practices include revision control and the establishment of baselines.

Generally, a baseline may be a single work product or a set of work products that can be used as a logical basis for comparison. A baseline may also be established, from work products that meet certain criteria, as the basis for selected subsequent activities.

The Concurrent Versions System (CVS), also known as the Concurrent Versioning System, implements a version control system: it keeps track of all work and all changes in a set of files, typically the implementation of a software project, and allows several (potentially widely separated) developers to collaborate. CVS uses a client-server architecture: a server stores the current version(s) of the project and its history, and clients connect to the server to check out a complete copy of the project, work on this copy, and later check in their changes. Typically, client and server connect over a LAN or over the Internet, but both may run on the same machine if CVS only has to track the version history of a project with local developers.

Ant is a powerful XML-based scripting tool that lets you craft build processes around your code requirements using predefined tasks, and it can be extended to handle more difficult tasks. It automates mundane work so that you can concentrate on your business rules and code development. A build process is by nature an overhead task that accompanies a development effort. A defined build process ensures that the software in your development project is built in exactly the same manner each time a build is executed. As the build process becomes more complex (for example, with EJB builds or additional tasks), such standardization becomes more necessary. You should establish, document, and automate the exact series of steps as much as possible.

Jenkins is an open source continuous integration tool written in Java.
Jenkins provides continuous integration services for software development. It is a server-based system running in a servlet container such as Apache Tomcat. It supports SCM tools including CVS, Subversion, Git, Mercurial, Perforce and ClearCase, and can execute Apache Ant and Apache Maven based projects as well as arbitrary shell scripts and Windows batch commands.
Builds can be started by various means, including being triggered by commit in a version control system, scheduling via a cron-like mechanism, building when other builds have completed, and by requesting a specific build URL.

Continuous integration is the practice of running an automated build-and-test cycle. An integration server checks out code from the version control system at set intervals or at certain times of the day, builds that code, runs unit tests on the build, and reports the build or test results back to developers. With early build and test reports, developers can fix problems quickly, so failures do not block the progress of the team for long and the project can move forward in an agile fashion.

Agenda

CVS
- Creating Project CVS Repository
- Initialize a Software Project Repository
- Using CVS
- Create a Project from the CVS Repository
Ant
Continuous Integration
Create a Nightly Build System
Comparison of open source software hosting facilities

Creating Project CVS Repository

The CVS repository directory for the project prj_name is:

prj_name/cvsroot

That is the directory we will use in this example:

# cd prj_name
# mkdir cvsroot

# chgrp cvsdev cvsroot
# ls -l 
total 2
drwxr-xr-x   2 root     cvsdev       512 Jun 28 16:12 cvsroot

# chmod g+srwx cvsroot
# ls -l
total 2
drwxrwsr-x   2 root     cvsdev       512 Jun 28 16:12 cvsroot

# cvs -d prj_name/cvsroot init
# ls -la cvsroot
total 6
drwxrwsr-x   3 root     cvsdev       512 Jun 28 16:12 .
drwxr-xr-x   3 root     other        512 Jun 18 12:24 ..
drwxrwsr-x   3 root     cvsdev      1024 Jun 28 16:13 CVSROOT

# chown -R cvs cvsroot
# ls -l
total 2
drwxrwsr-x   3 cvs      cvsdev       512 Jun 23 16:12 cvsroot

# setenv CVSROOT prj_name/cvsroot

Let's now examine the above commands:

A Unix group (named cvsdev in this example) is used for group read/write access to the repository. This allows project team members who have accounts on this machine to access the repository.
We set the SGID bit on the repository directory, so that files created in this directory get the same group ID as the directory. (This is a very important step that can save a lot of headaches later!)
We made the directory group writable/readable/executable.
We created the CVS repository using the command:
cvs -d repository_root_directory init
For security reasons, we made a dedicated user (named cvs in this example; it could be the project leader) the owner of the repository and the administrative files, by running chown on the repository's root directory and administrative files.
To use CVS, tell the Unix environment the location of your project repository by setting CVSROOT (insert that line in your shell configuration file).
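
The effect of the SGID bit can be checked outside CVS. The following is a minimal sketch, assuming a POSIX shell; the scratch directory created with mktemp and the sub-directory name module are illustrative only and are not part of the setup above. It also shows the Bourne-shell counterpart of the C-shell setenv line.

```sh
#!/bin/sh
# Sketch: show what "chmod g+srwx" does to a directory (names are illustrative).
scratch=$(mktemp -d)            # throwaway directory, not a real repository
repo="$scratch/cvsroot"
mkdir "$repo"

chmod g+srwx "$repo"            # same mode change as in the transcript above

# "test -g" is true when the set-group-ID bit is set
if [ -g "$repo" ]; then
    echo "SGID bit is set on $repo"
fi

# Entries created inside the directory get the directory's group
mkdir "$repo/module"
ls -ld "$repo" "$repo/module"

# Bourne-shell (sh, bash, ksh) equivalent of "setenv CVSROOT ..."
CVSROOT="$repo"
export CVSROOT
```

In a real setup the export line would go into the shell startup file, with the actual repository path in place of the scratch directory.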

Initialize a Software Project Repository

To start a source repository:

cd prj_name
cvs import -d prj_name vendor_name initial

where prj_name is a descriptive name for the project, vendor_name can be anything, and initial is the tag we use for the initial set of sources. If everything worked, you can remove the original sources. (Don't try to check out the repository sources into the original source directory, as this usually causes endless problems.)

Check Out Sources

Check out the sources from the CVS repository with the following command:

cd sandbox_dir
cvs co -P prj_name

which creates a sub-directory named prj_name containing the sources. Each directory has a sub-directory named CVS that contains information about the repository sources. Once you've checked out the sources, you do not need to define $CVSROOT to work within the local sources: all the cvs commands work when invoked within the local source directories, provided the repository is accessed on the local host.
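
The reason $CVSROOT is no longer needed is that each CVS sub-directory stores the repository location in small text files, notably CVS/Root and CVS/Repository. The following sketch prints them; the function name sandbox_info and the demonstration paths are made up for illustration, and the demonstration runs on a hand-made directory rather than a real checkout.

```sh
#!/bin/sh
# Sketch: print the repository and module a CVS working directory belongs to.
# CVS/Root and CVS/Repository are administrative files written by checkout.
sandbox_info() {
    dir=${1:-.}
    if [ ! -f "$dir/CVS/Root" ]; then
        echo "$dir is not a CVS working directory" >&2
        return 1
    fi
    echo "root: $(cat "$dir/CVS/Root")"
    echo "repository: $(cat "$dir/CVS/Repository")"
}

# Demonstration on a hand-made directory (the paths below are made up)
demo=$(mktemp -d)
mkdir "$demo/CVS"
echo "/projects/prj_name/cvsroot" > "$demo/CVS/Root"
echo "prj_name" > "$demo/CVS/Repository"
sandbox_info "$demo"
```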

Compare Local Changes

Suppose you modified any of the checked-out or local sources. To compare the changes you've made to the repository sources:

cvs diff [source_file]

where you can give one or more optional source_file names; otherwise cvs compares all files in the current directory and all subdirectories.

However, be aware that this gives no information about changes that others have checked in since your checkout. It shows only the differences between the local source file and the revision it was checked out from.

History of Changes

To look at the history of changes:

cvs log [source_file]

Status of Changes

To check the current status of a source_file or all the files:

cvs status [source_file]

A couple of useful C-shell aliases to create are:

cvsstat

shows just the status of all files

alias   cvsstat 'cvs status \!* |& grep Status:'

cvswhat

shows the status of files that are not "Up-to-date"

alias   cvswhat 'cvs status \!* |& grep Status: |& grep -v "to-date"'
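
For users of sh, bash, or ksh, the same helpers can be written as shell functions. This is a sketch that mirrors the two csh aliases above and makes no other assumptions; it needs the cvs command only when the functions are called.

```sh
# Bourne-shell functions equivalent to the csh aliases (a sketch).
# "$@" passes any file names through; 2>&1 plays the role of csh's |&.
cvsstat() {
    cvs status "$@" 2>&1 | grep 'Status:'
}

cvswhat() {
    cvs status "$@" 2>&1 | grep 'Status:' | grep -v 'Up-to-date'
}
```

Both functions accept optional file names, for example cvswhat src/Main.java (a hypothetical path).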

Remove a File

To remove a file from the repository:

rm source_file             # must first remove it locally
cvs rm source_file         # schedules it for removal

Add a File

To add a file to the repository:

vi source_file             # create the file first
cvs add source_file        # schedules it to be added

Move a File

This cannot be done cleanly at the local level. The best way to do this with CVS, if you want to keep the history of changes, is to go to the cvsroot repository and move the file or directory within the repository itself. The repository keeps every file in its RCS form, named filename,v. The next cvs update will reflect the move in your working copy.

Check In Local Changes

Once you've made all the changes for the current batch, check them in with either of:

cvs ci [source_file]
cvs commit [source_file]

which checks in the changes and updates the repository sources. CVS will pop up an editor session where you can describe the changes made; the description appears in the log of each file affected.

Update Local Sources

If many people are working on the repository, you can obtain any changes in the repository that have been made since you've checked out the sources with:

cvs update [source_file]

and if there are conflicts, CVS will notify you and flag them in the sources. On the Crays, I've noticed that CVS can't use the ``patch'' facility and therefore defaults to copying; this is not a problem, so ignore such messages.

Tagging Sources

You can tag the current set of changes (revisions) with:

cvs tag tag_name

This set of local sources can then be recovered with this tag_name.

Another option is to tag the repository sources with

cvs rtag tag_name prj_name

which you want to do for each release of the code, so you can always trace any bug back to the version released to the users.
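
CVS tag names must start with a letter and may contain only letters, digits, hyphens, and underscores, so a dotted version such as 3.1.0 cannot be used as a tag directly. The following sketch derives a tag in the prj_3_1_0 style used later in this module; the function name release_tag is hypothetical.

```sh
# Sketch: derive a CVS tag name from a project prefix and a dotted version.
# Dots are not allowed in tag names, so they are mapped to underscores.
release_tag() {
    project=$1
    version=$2
    echo "${project}_$(echo "$version" | tr '.' '_')"
}

release_tag prj 3.1.0      # prints prj_3_1_0
```

The result can then be passed to the tagging command, for example cvs rtag "$(release_tag prj 3.1.0)" prj_name.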

Creating Patches

You can create a patch file of changes with

cvs rdiff -u -r tag_name -r initial prj_name > patch_file

which will have all the changes you've made between the tag_name version and the initial version. You can also create patch files between any two tags.

You can also create a patch file of your local changes with:

cvs diff -N -u -r tag_name > patch_file

Backing Out Changes

Suppose you modify a file, but don't want to keep the changes:

rm source_file                  # remove it from local sources
cvs update source_file          # get a new copy from the repository

Using Branches

Working with branches is one of the more difficult concepts to master with CVS, but it is one of the most useful for an active development project.

The concept is that the software project has made a release, say version 3.1.0, and work is now progressing on version 3.2. However, a bug was discovered in the released 3.1.0 version, which you want to fix. Suppose that the project was tagged with prj_3_1_0. Also, it will be assumed that it wasn't marked as a branch (-b).

Tag the released sources as a branch with

cvs rtag -b -r prj_3_1_0 prj_3_1_0_branch project_name

Check out the given tagged version into a directory named prj.3.1.0 with
```
cvs checkout -d prj.3.1.0 -r prj_3_1_0_branch project_name
```
Get into the prj.3.1.0 directory for further work.
Make whatever changes are needed to the sources; the result will be identified as version 3.1.1.
Check in the changes for this branch with
```
cvs commit
```
Tag this version with
```
cvs tag prj_3_1_1
```
Make a tar ball for distribution, and remove the branch project directory, which is no longer needed.
If any of the fixes should be merged into the main development branch, get into a checked-out project directory for that branch (not the branch directory, which should have been removed). This only works well if the differences between this branch and the development branch are fairly small.
Merge the branch changes with the main development branch with
```
cvs update -j prj_3_1_1
```
Carefully note the output, resolve any conflicts, and test the changes.
Note that merges can be incorporated into other branches by applying them to whatever checked-out version.

Sticky Tags!

Generally, what happens when a tagged version is checked out:

cvs checkout -d prj.3.1.0 -r prj_3_1_0 project_name

CVS records the tag in the CVS administrative directories of the working copy, which makes the tag ``sticky'': no changes can be updated or checked in. An attempt to cvs commit any local changes usually results in a message saying the ``sticky'' tag is not a branch!

The tag needs to be made into a branch with
```
cvs tag -b -r prj_3_1_0 prj_3_1_0_branch
```
The -b option is the key here: it is what makes the tag a branch.
Update the current working version as a branch with:
```
cvs update -r prj_3_1_0_branch
```
This will not affect the source files, only the CVS/Entries files will be updated to a different ``sticky'' tag ... a branch in this case.
The changes can now be checked in to that branch with
```
cvs commit
```
Changes in this branch can be merged into the development branch. (See the latter part of Using Branches for more details.)
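
The per-directory sticky tag is kept in a file named CVS/Tag inside the working copy. As described in the CVS manual, its first character is T for a branch tag, N for a non-branch tag, or D for a date, followed by the tag name or date. The sketch below reads that file; the function name sticky_tag is hypothetical, and the demonstration uses a hand-made directory rather than a real checkout.

```sh
# Sketch: report the sticky tag of a CVS working directory, if any.
sticky_tag() {
    tagfile=${1:-.}/CVS/Tag
    if [ ! -f "$tagfile" ]; then
        echo "no sticky tag"
        return 0
    fi
    IFS= read -r line < "$tagfile"
    case $line in
        T*) echo "branch tag: ${line#T}" ;;
        N*) echo "non-branch tag: ${line#N}" ;;
        D*) echo "date: ${line#D}" ;;
        *)  echo "unrecognized entry: $line" ;;
    esac
}

# Demonstration on a hand-made directory (not a real checkout)
demo=$(mktemp -d)
mkdir "$demo/CVS"
echo "Nprj_3_1_0" > "$demo/CVS/Tag"
sticky_tag "$demo"          # prints "non-branch tag: prj_3_1_0"
echo "Tprj_3_1_0_branch" > "$demo/CVS/Tag"
sticky_tag "$demo"          # prints "branch tag: prj_3_1_0_branch"
```

A working copy that reports a non-branch tag is the one that refuses commits; after the update to the branch tag, commits go to that branch.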

Merging Revisions

Normally, it's best to edit files in the directory that you're using for checkouts. This way, cvs will automatically take care of merging in changes, just by running cvs update. However, in some cases that might not always be possible.

Hypothetical situation: you took a copy of MyFile.java home, and did some work on it. In the meantime, your fellow developers have committed changes to the file. The dilemma: you'd like to incorporate what you've done, but your copy of the file is now out of date. Of course, you also don't want to undo work that others have done. Here's a way to deal with this situation.

Find out what revision your copy of the file is based on. This will be the revision number in the $Id$ or $Revision$ keywords. If you can't determine the revision, this approach won't work, and you'll need to do a manual merge.
Run cvs update to refresh your working copy.
Run cvs log MyFile.java (in the appropriate directory) to get the revision number of the copy that you just checked out of the repository.

For the sake of illustration, let's say that the copy of MyFile.java that you were working on at home is revision 1.6, and the current repository version is 1.10.

Copy the MyFile.java that you worked on at home to your checkout directory. We now have the following arrangement:

Repository version is 1.10, which you've just checked out. As far as cvs is concerned, your local copy is up to date.
The actual file in your checkout area is revision 1.6 + changes.
You're missing the differences from 1.7 - 1.10. (Note: this is why you don't want to commit the file yet. Doing so would remove anything done between 1.7 and 1.10).

To pick up the modifications committed as revisions 1.7 through 1.10, you need to merge the difference between 1.6 (the revision your copy is based on) and 1.10:

cvs update -j 1.6 -j 1.10 MyFile.java

In cvs-speak, this means "take the changes between revision 1.6 and revision 1.10, and apply them to the local copy of the file." Assuming that there were no merge conflicts, examine the results:

cvs diff -w MyFile.java

Make sure it compiles, then commit.

If things didn't go well, you'll need to examine the results and resolve any conflicts that happened as a result of the merge.

On a related note, update -j ... can also be used to back out a bad commit, simply by reversing the revision order.
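
Conceptually, a two-revision merge is a three-way merge: the changes between an older and a newer revision are applied on top of your working file. The idea can be tried outside CVS with the diff3 utility from GNU diffutils. This is only an analogy under that assumption (CVS performs its merge internally), and the file names base, mine, and theirs are made up.

```sh
#!/bin/sh
# Sketch of a three-way merge with diff3; all file names are illustrative.
work=$(mktemp -d)

printf '%s\n' a b c d e > "$work/base"      # common ancestor (think: the older revision)
printf '%s\n' A b c d e > "$work/mine"      # your edit: first line changed
printf '%s\n' a b c d E > "$work/theirs"    # newer repository revision: last line changed

# Apply the base-to-theirs changes on top of mine.
# Exit status 0 means the merge had no conflicts.
diff3 -m "$work/mine" "$work/base" "$work/theirs" > "$work/merged"
echo "diff3 exit status: $?"
cat "$work/merged"
```

Because the two edits touch different parts of the file, the merged result contains both. Swapping the roles of base and theirs removes changes instead of adding them, which is the same trick as reversing the revision order.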

Resolving Conflicts

Eventually, something like this will happen:

$ cvs commit foo.java
cvs commit: Up-to-date check failed for `foo.java'
cvs [commit aborted]: correct above errors first!

Here, you've made changes to foo.java, but someone else has already committed a new version to the repository (that is, the repository version has a higher number than your local copy). Before you can commit the file, you'll need to update your working copy.

If you and the other developer were working on different areas of the file, cvs is pretty intelligent about merging the changes together; it might see that the last set of modifications are in lines 75-100, and your changes are in lines 12-36. In this situation, the file can be patched and your work is unaffected.

However, if the two of you changed the same area of the file, it's possible to have conflicts:

$ cvs update foo.java
RCS file: /home/srevilak/c/mymodule/foo.java,v
retrieving revision 1.1
retrieving revision 1.2
Merging differences between 1.1 and 1.2 into foo.java
rcsmerge: warning: conflicts during merge
cvs update: conflicts found in foo.java
C foo.java

Oh dear! What do we do now? The answer is "fix the merge". Two things have been done to help you with this.

A pre-merge copy of the file has been made.
```
$ ls -a .#*
   1 .#foo.java.1.1
```
However, being a dotfile, its presence isn't immediately obvious.

cvs has inserted a conflict marker in your working copy.

<<<<<<< foo.java
  static final int MYCONST = 3;
=======
  static final int MYCONST = 2;
>>>>>>> 1.2

The conflict lies between the rows of greater than and less than signs. The thing to do now is decide what version is right, remove the conflict markers, and commit the file.
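
Before committing, it can help to confirm that no conflict markers were left behind. The following sketch searches files for lines that look like the markers shown above; the function name find_conflicts and the demonstration files are made up for illustration.

```sh
#!/bin/sh
# Sketch: list files that still contain CVS conflict markers.
# The pattern matches lines made of seven <, = or > characters as in the example.
find_conflicts() {
    grep -l -E '^(<<<<<<< |=======$|>>>>>>> )' "$@" 2>/dev/null
}

# Demonstration with a made-up file reproducing the conflict above
demo=$(mktemp -d)
cat > "$demo/foo.java" <<'EOF'
<<<<<<< foo.java
  static final int MYCONST = 3;
=======
  static final int MYCONST = 2;
>>>>>>> 1.2
EOF
echo "class Bar {}" > "$demo/Bar.java"

find_conflicts "$demo"/*.java       # prints only the path of foo.java
```

The function exits with a non-zero status when no markers are found, so it can also be used as a guard in a script.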

Backing out a Bad Commit

Let's suppose that you've committed a file, but this ended up breaking something horribly. Here's how to undo your commit:

Get the version number from after the commit. You can use an $Id$ tag within the file, or cvs status. Let's say that the new version is 1.5.
Get the version number from before the commit. Typically, this will be one lower than the current version. Let's say that the old version is 1.4.

Now do this:

cvs update -j 1.5 -j 1.4 filename
cvs commit filename

The above is an example of a merge. You've asked cvs to take the difference between versions 1.5 and 1.4 and apply it to your working copy. The ordering of version numbers is significant: think of it as removing changes, or going backward in version history.

Other Useful Commands

There are a variety of other useful cvs commands. Here are a few examples:

`cvs diff` filename	Shows differences between your local copy of filename and the repository revision it was checked out from.
`cvs diff -r 1.2` filename	Shows differences between your local copy of filename and version 1.2 of filename.
`cvs diff -r 1.2 -r 1.3` filename	Shows differences between versions 1.2 and 1.3. (regardless of what version your local copy is).
`cvs log` filename	Show the commit log for filename (like rlog does with rcs).
`cvs annotate` filename	Shows each line of filename, prefixed with the version number where the line was added, and the name of the person who added it. Useful for seeing who made a particular set of changes.

More Info

To get more usage info:

cvs --help                      # usage info and general cvs-options
cvs --help-commands             # list & description of commands
cvs --help-options              # general cvs-options
cvs --help command              # command specific usage & command options

man cvs                         # gives an overview

CVS Quick Reference Card (PDF) and more references at the end of this module

Why do I need a defined build process?

A defined build process is an essential part of any development cycle because it helps close the gap between
- the development,
- integration,
- test, and
- production environments.
A build process alone will speed the migration of software from one environment to another.
It also removes many issues related to
- compilation,
- classpath, or
- properties that cost many projects time and money.

Here are some useful tasks that are built into the Ant distribution.

Command	Description
Ant	Used to execute another ant process from within the current one.
Copydir	Used to copy an entire directory.
Copyfile	Used to copy a single file.
Cvs	Handles packages/modules retrieved from a CVS repository.
Delete	Deletes either a single file or all files in a specified directory and its sub-directories.
Deltree	Deletes a directory with all its files and subdirectories.
Exec	Executes a system command. When the os attribute is specified, then the command is only executed when Ant is run on one of the specified operating systems.
Get	Gets a file from an URL.
Jar	Jars a set of files.
Java	Executes a Java class within the running (Ant) VM or forks another VM if specified.
Javac	Compiles a source tree within the running (Ant) VM.
Javadoc/Javadoc2	Generates code documentation using the javadoc tool.
Mkdir	Makes a directory.
Property	Sets a property (by name and value), or set of properties (from file or resource) in the project.
Rmic	Runs the rmic compiler for a certain class.
Tstamp	Sets the DSTAMP, TSTAMP, and TODAY properties in the current project.
Style	Processes a set of documents via XSLT.

While other tools are available for doing software builds, Ant is easy to use and can be mastered within minutes. In addition, Ant lets you create expanded functionality by extending some of its classes.

Installing Ant

1) Download Ant from the Apache Ant Project at http://ant.apache.org/

The binary distribution of Ant consists of the following directory layout:

  ant
   +--- bin  // contains launcher scripts
   |
   +--- lib  // contains Ant jars plus necessary dependencies
   |
   +--- docs // contains documentation
   |      +--- ant2    // a brief description of ant2 requirements 
   |      |
   |      +--- images  // various logos for html documentation
   |      |
   |      +--- manual  // Ant documentation (a must read ;-)
   |
   +--- etc // contains xsl goodies to:
            //   - create an enhanced report from xml output of various tasks. 
            //   - migrate your build files and get rid of 'deprecated' warning
            //   - ... and more ;-)

Only the bin and lib directories are required to run Ant. To install Ant, choose a directory and copy the distribution file there. This directory will be known as ANT_HOME.

Setup

Before you can run ant there is some additional set up you will need to do:

Add the bin directory to your path (use %ANT_HOME%\bin on windows or $ANT_HOME/bin on unix).
Set the ANT_HOME environment variable to the directory where you installed Ant. On some operating systems the ant wrapper scripts can guess ANT_HOME (Unix dialects and Windows NT/2000) - but it is better to not rely on this behavior.
Optionally, set the JAVA_HOME environment variable.

Example Scenario
This example scenario should help show you the value of Ant and provide insight into its benefits and how you can use it.

Source Code Example

Simple build process with Ant (build.xml)

<project name="Greeter" default="compile" basedir="C:\java-cs3365\ass1" >

  <target name="init">
    <property name="source" value="src" />
    <property name="package0" value="edu\ttu\cs\greeter" />
    <property name="package1" value="greeters" />
    <property name="package2" value="greet" />
    <property name="outputDir" value="classes" />
    <property name="classpath" value="classes" />
  </target>

  <target name="clean" depends="init">
    <deltree dir="${outputDir}" />
  </target>

  <target name="prepare" depends="clean">
    <mkdir dir="${outputDir}" />
  </target>

  <target name="compile" depends="prepare">
    <ant antfile="${source}\${package0}\build.xml" />
    <ant antfile="${source}\${package1}\build.xml" />
    <ant antfile="${source}\${package2}\build.xml" />
  </target>

</project>

The first line contains information about the overall project that is to be built.

<project name="Greeter" default="compile" basedir="C:\java-cs3365\ass1" >

The most important attributes of the project line are the default and the basedir.

The default attribute references the default target that is to be executed. Because Ant is a command-line build tool, it is possible to execute only a subset of the targets in the Ant file. For example, you could run the following command:

% ant init

That command runs only the init target of build.xml (together with any targets it depends on). When no target is named on the command line, Ant runs the default target, which in this example is compile.

Either of the following commands runs Ant against the test.xml file and executes that file's default target:

% ant -buildfile test.xml
% ant -f test.xml

The basedir attribute is fairly self-explanatory as it is the base directory from which the relative references contained in the build file are retrieved. Each project can have only one basedir attribute so you can choose to either include the fully qualified directory location or break the large project file into smaller project files with different basedir attributes.

The next line of interest is the target line. Two different versions are shown here:

<target name="init">
<target name="clean" depends="init">

The target element contains four attributes: name, if, unless, and depends. Ant requires the name attribute, but the other three attributes are optional.

Using depends, you can stack the Ant targets so that a dependent target is not started until the target that it depends on has completed. In the above example, the clean target will not start until the init target has completed. The depends attribute may also contain a comma-separated list naming several targets that the target in question depends on.

The if and unless attributes let you make a target conditional on a property. A target with an if attribute is executed only when the named property is set, and a target with an unless attribute is executed only when the named property is not set. You can use the available task to set those properties as shown below:

<available classname="org.whatever.Myclass" property="Myclass.present"/>
sets the Myclass.present property to the value "true" if the class org.whatever.Myclass is found in the classpath.

The init target from the simple example consists of property lines of the following form (the example above names its source property source; the discussion below uses sourceDir):
<property name="sourceDir" value="src" />

These property lines let you specify commonly used directories or files. A property is a simple name value pair that allows you to refer to the directory or file as a logical entity rather than a physical one.

If you wanted to reference the sourceDir variable later in the Ant file, you could simply use the following syntax to alert Ant to obtain the value for this tag: ${sourceDir}.

Two other commands present in the above buildfile are:

<deltree dir="${outputDir}" />
<mkdir dir="${outputDir}" />

These commands ensure that there are no extraneous files in the outputDir (the classes directory, once the property is dereferenced as described above). The first command removes the entire tree contained under the outputDir. The second command creates the directory again.

The last line of major interest to the developer is the following compilation line:

<javac srcdir="${sourceDir}" destdir="${outputDir}" />

The javac command requires a source directory (the input location of the .java files) and a destination directory (the output location of the .class files). It is important to note that all directories must either exist before the ant command is run or be created using the mkdir command. Ant does not create directories based upon intuition, so you must create the outputDir with the mkdir command before the compilation step above.

While it took several lines to explain the example, it should be evident that Ant is an easy-to-use tool. Using this buildfile as a starting point, you should be able to incorporate Ant into your development effort.

Continuous Integration

Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day.
Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible.
Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.
- Building a Feature with Continuous Integration
- Practices of Continuous Integration
  - Maintain a Single Source Repository.
  - Automate the Build
  - Make Your Build Self-Testing
  - Everyone Commits Every Day
  - Every Commit Should Build the Mainline on an Integration Machine
  - Keep the Build Fast
  - Test in a Clone of the Production Environment
  - Make it Easy for Anyone to Get the Latest Executable
  - Everyone can see what's happening
  - Automate Deployment
More: Continuous Integration

Benefits of Continuous Integration

Reduced risk - the trouble with deferred integration is that it's very hard to predict how long it will take to do, and worse it's very hard to see how far you are through the process.
Continuous Integration completely finesses this problem - you completely eliminate the blind spots.
Continuous Integration doesn't get rid of bugs, but it does make them dramatically easier to find and remove. In this respect it's rather like self-testing code.
Bugs are also cumulative. The more bugs you have, the harder it is to remove each one. It's also psychological: people have less energy to find and get rid of bugs when there are many of them, a phenomenon that the Pragmatic Programmers call the Broken Windows syndrome.
As a result, projects with Continuous Integration tend to have dramatically fewer bugs, both in production and during development. However, it should be stressed that the degree of this benefit is directly tied to how good your test suite is.
If you have continuous integration, it removes one of the biggest barriers to frequent deployment. Frequent deployment is valuable because it allows your users to get new features more rapidly, to give more rapid feedback on those features, and generally become more collaborative in the development cycle. This helps break down the barriers between customers and development - barriers which are the biggest barriers to successful software development.

Continuous Integration Anti-patterns (patterns of what not to do)

Five o'clock check-in
Don't check code in after five p.m. Instead, have another look at your code with a fresh mind in the morning.
You need to have a culture where it's OK for a developer to break the build on occasion, but then that developer is also responsible for fixing the breakage as soon as possible.
Spoiled fruit
When you generate code, you often place the generated artifacts in the same directory as the hand-written code. That's very convenient, since build tools such as Ant can just use the current directory for code generation output. But remember that the integration server also must build that code and run the tests on that code. If your build process does not clean up all generated artifacts, the outdated generated code lingering around will break your build. That old generated code is like spoiled fruit on a dinner table.
The solution is to place all generated code in a separated directory. For instance, if your source code resides in the source top-level project directory, you might want generated code to be placed in a generated directory, and cause the build script to completely remove that directory before a clean build.
It's a small change
A developer thinks that there is no way a tiny change can break the build. Who would think that reformatting code, or adding a comment can break the build? Because you don't think your change can break anything, you don't do a clean build before checking the small change in. Yet, small changes are frequent causes of build breakage.
I am the senior developer so I can bypass established policies

Create a Nightly Build System

Jenkins:
http://jenkins-ci.org
Ant is available from the Apache Website:
http://ant.apache.org/
Download Tomcat from the Jakarta Website:
http://jakarta.apache.org/tomcat/
Anthill:
http://www.urbancode.com/projects/anthill/download.jsp
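
A tool such as Jenkins or Anthill normally drives the nightly build. The cycle described in the summary (check out, build, report) can nevertheless be sketched as a plain shell script started by cron. Everything below is an illustration under stated assumptions: MODULE, WORKDIR, LOG, the function names, and the crontab path are hypothetical, CVSROOT is assumed to be exported already, and the module is assumed to have a build.xml at its top level. Each run starts from an empty directory, which also avoids stale generated files.

```sh
#!/bin/sh
# Sketch of a nightly build driver: fresh checkout, build, one report line per step.
MODULE=${MODULE:-prj_name}
WORKDIR=${WORKDIR:-${TMPDIR:-/tmp}/nightly}
LOG=${LOG:-$WORKDIR.log}

# Run one step, send its output to the log, and print OK or FAILED.
run_step() {
    step=$1
    shift
    if "$@" >> "$LOG" 2>&1; then
        echo "$step: OK"
    else
        echo "$step: FAILED (see $LOG)"
        return 1
    fi
}

fresh_checkout() {
    # Start from an empty directory so leftovers from earlier builds cannot leak in
    rm -rf "$WORKDIR" && mkdir -p "$WORKDIR" &&
        ( cd "$WORKDIR" && cvs -Q checkout -P "$MODULE" )
}

build_module() {
    # Runs the default target of the module's build.xml
    ( cd "$WORKDIR/$MODULE" && ant )
}

nightly_build() {
    : > "$LOG"                      # truncate the log from the previous night
    run_step checkout fresh_checkout && run_step build build_module
}

# Example crontab entry (assumed path), running at 02:00 every night:
#   0 2 * * * /path/to/nightly.sh
# Uncomment the next line to make this file the script that cron runs:
# nightly_build
```

The build step is skipped when the checkout fails, and the printed report lines can be mailed to the team by cron or by an extra step at the end of nightly_build.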
