February 22, 2009

Using ant to manage larger Java projects

In this post I'll give a brief introduction to how I manage larger Java projects with ant while sticking to the DRY principle.

One of the first steps in setting up a Java project is surely the careful planning of the build system. A build system's purpose is to perform the repetitive tasks that occur throughout a project's life cycle. The obvious ones are creating temporary directories, calling the compiler, and packaging everything into a jar. As the complexity of the project grows, it is quite certain that the requirements on the build system grow as well. Surely we want to include unit testing and javadoc generation. And what about automatic coverage analysis? Or ordering a pizza? Yes, it should be able to cope with these challenges (maybe not the pizza thing, but I would love that...).

There are roughly two ways to get a build system in Java: either rely on the functionality your favored IDE provides, or use one of the widely accepted build systems like ant or maven. In my opinion there are a lot of reasons not to choose the first one. Most importantly, a project should not depend on a particular IDE. Maybe there's a common agreement in your company to use a single IDE, but that is not necessarily the common case. As with every other aspect of software development, you should try to avoid dependencies, or at least the unnecessary ones. On the other hand it is good to avoid complexity as well. If your source code builds in a shell, then it will most likely build everywhere else. Hence, let's use an established build system like ant or maven.

I have been using ant for about eight years now, and I am quite happy with it. My experience has been so good that I have so far avoided climbing the full learning curve of maven. But this post is not about ant versus maven. Actually I want to talk about how to maintain more complex projects. Let's start small. A common ant project will most likely have a (very simplified) structure like

/root 
    /src 
    /build 
        /bin 
        /jars
    build.xml
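
A build file for this layout could be sketched roughly as follows. The project name and the jar file name demo.jar are placeholders chosen for this illustration, not taken from a real project:

    <project name="Simple Demo" default="jar" basedir=".">

        <!-- directories from the layout above -->
        <property name="dir.src" location="src"/>
        <property name="dir.bin" location="build/bin"/>
        <property name="dir.jars" location="build/jars"/>

        <!-- create the output directories -->
        <target name="init">
            <mkdir dir="${dir.bin}"/>
            <mkdir dir="${dir.jars}"/>
        </target>

        <!-- compile everything below src into build/bin -->
        <target name="compile" depends="init">
            <javac srcdir="${dir.src}" destdir="${dir.bin}"/>
        </target>

        <!-- pack the compiled classes into build/jars/demo.jar (placeholder name) -->
        <target name="jar" depends="compile">
            <jar basedir="${dir.bin}" destfile="${dir.jars}/demo.jar"/>
        </target>

    </project>
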
Your source code lives in src; it gets compiled to build/bin and packed into a jar in build/jars. You would have some targets, let's say init, compile and jar, which create some directories, compile the sources and, who would have guessed, pack them. But reality tends to be more complex than Mickey Mouse examples. If your project is larger, it is good advice to break your code up into different libraries and, even more important, to have clear policies for what goes into which library. Several aspects affect this decision. Since your architecture is most likely organized into components, it is wise to reflect this structure in your source code too. You should already have some policies for the dependencies between your components, like
  • more specialized components may depend on more general ones, but not the other way around
  • don't rely on transitive dependencies
  • don't permit cyclic dependencies
These rules are also valid on the actual library level.

But how should we reflect this in the directory layout? I've had good experiences with the following approach:

/root
    /libs
        /common
            /src
            /build
                ...
            build.xml
        /derived
            /src
            /build
                ...
            build.xml
But now we are getting to the root of the matter: all build files will more or less do the same thing. If we just copied them we would surely violate the DRY principle, and that's something we should avoid. Let's define what we need: we don't want to specify the build process redundantly. Hence we want something like a template which we always use but which, and that's important, leaves enough freedom to make local modifications. Why? Let's assume we have such a template mechanism specified somewhere and we build our libraries with it. This is fine as long as the libraries always have the same requirements. But what if one particular library differs from these requirements? Let's say it consists of generated code and therefore has to call a source code generator after init but before compile. We could put this special target into the template as well and somehow ignore it as long as there's nothing to generate. But in my opinion this violates the open/closed principle (OCP): the template would have to be modified for every special case. It would be better to provide some "hooks" in the template where a library can do its special things.

Luckily ant has a nice tool in its toolbox to overcome this problem: the import task. It lets us specify a super build file, import its targets and *ta-da* override some of them (if we want to). Let's think of import as ant's way of polymorphism: the super class (build file!) defines some abstract behaviour, and the derived class adapts it to its specific needs. Therefore we will define the general build steps in our template file and put wisely chosen hooks into it, which can be overridden by the importing build file. A very simplified super build file might look like

    <project name="Complex Ant Demo Master Libs">

        <!-- general library related properties -->


        <property name="dir.src" location="${dir.library}/src"/>
        <property name="dir.build" location="${dir.library}/build"/>

        <property name="dir.bin" location="${dir.build}/bin"/>
        <property name="dir.jars" location="${dir.build}/jars"/>

        <property name="file.lib.jar" value="${dir.jars}/${name.library}.jar"/>


        <target name="hook.init.pre"/>
        <target name="hook.init.post"/>

        <target name="init" description="Initializes the Library">

            <antcall target="hook.init.pre" />

            <mkdir dir="${dir.build}"/>    
            <mkdir dir="${dir.bin}"/>
            <mkdir dir="${dir.jars}"/>

            <antcall target="hook.init.post" />

        </target>

        <target name="hook.compiler.pre"/>
        <target name="hook.compile.post"/>

        <target name="compile" description="Compiles Sources" depends="init">
            <antcall target="hook.compiler.pre"/>

            <javac srcdir="${dir.src}" destdir="${dir.bin}"/>

            <antcall target="hook.compile.post" />
        </target>

        <target name="hook.jar.pre"/>
        <target name="hook.jar.post"/>

        <target name="jar" description="Creates Jars" depends="compile">
            <antcall target="hook.jar.pre" />

            <!-- package only the compiled classes; dir.build would also pull in the jars directory -->
            <jar basedir="${dir.bin}" destfile="${file.lib.jar}"/>

            <antcall target="hook.jar.post" />
        </target>    

        <target name="hook.clean.pre"/>
        <target name="hook.clean.post"/>

        <target name="clean" description="Cleans Up">

            <antcall target="hook.clean.pre"/>

            <delete dir="${dir.build}" />

            <antcall target="hook.clean.post"/>
        </target>

    </project> 
That's a long example, but let's go through it, focussing on the interesting steps. First of all, we define some properties which we will use later. Please note that we implicitly assume there are properties called ${dir.library} and ${name.library}; we will see later where they come from. It gets interesting in line 15, where we define the empty targets hook.init.pre and hook.init.post. Their purpose becomes clear in the init target below: first we call the (possibly empty) pre hook, then we do our init action, and finally we call the post hook. This mechanism is used in each target. (To be honest, I first saw this idea in the NetBeans build system...)
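
One thing to keep in mind: antcall runs the called target in a new ant project, so properties set inside a hook are not visible to the calling target afterwards. If that matters, the hooks can be chained through depends instead. The following is only a sketch of that variant, cut down to the init target; the target name -do-init and the changed project name are my own choices for this illustration:

    <project name="Complex Ant Demo Master Libs (depends variant)">

        <!-- dir.library is still expected to be set by the importing build file -->
        <property name="dir.build" location="${dir.library}/build"/>
        <property name="dir.bin" location="${dir.build}/bin"/>
        <property name="dir.jars" location="${dir.build}/jars"/>

        <!-- empty hooks, meant to be overridden by the importing build file -->
        <target name="hook.init.pre"/>
        <target name="hook.init.post"/>

        <!-- the actual work; the leading hyphen keeps it off the command line -->
        <target name="-do-init">
            <mkdir dir="${dir.build}"/>
            <mkdir dir="${dir.bin}"/>
            <mkdir dir="${dir.jars}"/>
        </target>

        <!-- depends is processed from left to right: pre hook, work, post hook -->
        <target name="init" description="Initializes the Library"
                depends="hook.init.pre, -do-init, hook.init.post"/>

    </project>

With this variant everything runs in the same project, so a property defined in hook.init.pre can be used by -do-init and by later targets. The price is one extra target per build step.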

Let's assume we have a library called "common" that needs no special treatment. Its build file is now very simple:

    <project name="Common Lib" basedir=".">

        <property name="dir.library" location="${basedir}"/>
        <property name="name.library" value="common"/>
        <property name="dir.project-root" location="${basedir}/../.."/>

        <import file="${dir.project-root}/build-master-lib.xml"/>

    </project> 
We simply specify those two implicit properties, define the location of the super build file and then import it. Now we can call all the imported targets on common's build file. For instance

    ant jar

would result in the appropriate jar file.

Finally, consider the specialized case described earlier. Do you remember? We want to call the code generator before compile. This is how we do it:

    <project name="A Generated Lib" basedir=".">

        <property name="dir.library" location="${basedir}"/>
        <property name="name.library" value="generated"/>
        <property name="dir.project-root" location="${basedir}/../.."/>

        <import file="${dir.project-root}/build-master-lib.xml"/>

        <property name="dir.gen" location="${dir.src}"/>

        <!-- overrides target from build-master-lib.xml -->
        <target name="hook.init.post">
            <mkdir dir="${dir.gen}"/>
        </target>

        <target name="hook.compiler.pre">
            <echo>Calling the source code generator</echo>
        </target>

        <target name="hook.clean.post">
            <!-- remove the generated sources again; dir.gen is this library's src directory -->
            <delete dir="${dir.gen}"/>
        </target>

    </project>
We simply redefined the hooks to do all the housekeeping we need.
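
The echo in hook.compiler.pre is only a placeholder. As a rough sketch, a real generator could be started with ant's java task. The class name com.example.CodeGenerator, the jar location tools/generator.jar and the convention that the generator takes its output directory as first argument are all made up for this illustration:

    <project name="A Generated Lib" basedir=".">

        <property name="dir.library" location="${basedir}"/>
        <property name="name.library" value="generated"/>
        <property name="dir.project-root" location="${basedir}/../.."/>

        <import file="${dir.project-root}/build-master-lib.xml"/>

        <property name="dir.gen" location="${dir.src}"/>

        <!-- overrides the empty hook from build-master-lib.xml -->
        <target name="hook.compiler.pre">
            <!-- hypothetical generator; fails the build if it exits with an error -->
            <java classname="com.example.CodeGenerator" fork="true" failonerror="true">
                <arg value="${dir.gen}"/>
                <classpath path="${dir.project-root}/tools/generator.jar"/>
            </java>
        </target>

    </project>
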

Of course this is only the first step in setting up a full-blown, generically managed ant project. But this should be enough for now.