Tuesday, April 10, 2007

From maven to mvn : Part 6 -- Classloader Madness & A Custom Plugin

Wow. I had no idea it has been a month since my last post. I would plead busyness but that's cheating. After all, we're all busy...

Tonight's post is going to be a bit eclectic & is the result of considerable (mythical) man-hours of work. Much of that time was spent figuring out why things were broken rather than actually fixing them. The fixing part was really quite short though it was interrupted from time to time by more figuring parts.

The Setup:

I have a utility that I wrote quite some time ago that implements the notion of Evolutionary Database Design. It's an internal tool that I may or may not be able to publish someday but that's not really relevant right now. What is relevant is that it is simply a jar that relies on a number of dependencies, most notably Spring. It runs happily from the command line, maven 1 and Eclipse.

The Plan:

Create a maven 2 plugin around my utility by firing its Main class with the appropriate parameters.

The Pain:

Classloaders. Man I hate classloaders.

It's a long and unpleasant tale and I won't bore you with the details. You can read some about it here though the rafb posts are long gone.

The short version is that maven creates a custom classloader for itself (a reasonable thing to do) and puts a pom's dependencies into it. In my case, one of those is Spring and when the context begins loading classes it gets tripped up in classloader evilness and begins to believe that [org.springframework.beans.factory.xml.SimplePropertyNamespaceHandler] does not implement the NamespaceHandler interface, which is patently not true.
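For anyone who has not been bitten by this before, here is a minimal sketch of one common way to get that kind of "does not implement the interface" symptom. I can't say for certain it is exactly what happened inside maven, so treat it as an illustration only. The Handler and SimpleHandler types are hypothetical stand-ins for the Spring classes, and the sketch assumes it runs on a JDK (it uses the system Java compiler to produce two tiny class files).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class ClassIdentityDemo {

    /** Compiles two tiny hypothetical types into a fresh temp directory. */
    static Path compileDemoTypes() {
        try {
            Path dir = Files.createTempDirectory("identity-demo");
            Path handler = dir.resolve("Handler.java");
            Path impl = dir.resolve("SimpleHandler.java");
            Files.write(handler,
                "public interface Handler {}".getBytes(StandardCharsets.UTF_8));
            Files.write(impl,
                "public class SimpleHandler implements Handler {}".getBytes(StandardCharsets.UTF_8));

            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("A JDK (not just a JRE) is required");
            }
            int status = compiler.run(null, null, null,
                "-d", dir.toString(), handler.toString(), impl.toString());
            if (status != 0) {
                throw new IllegalStateException("Compilation failed");
            }
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a loader with no parent, so it defines the demo types itself. */
    static ClassLoader newIsolatedLoader(Path classesDir) {
        try {
            // File.toURI() ends the URI with a slash for an existing directory.
            URL url = classesDir.toFile().toURI().toURL();
            return new URLClassLoader(new URL[] { url }, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads a class by name, wrapping the checked exception. */
    static Class<?> load(ClassLoader loader, String className) {
        try {
            return loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = compileDemoTypes();
        ClassLoader a = newIsolatedLoader(dir);
        ClassLoader b = newIsolatedLoader(dir);
        Class<?> handlerFromA = load(a, "Handler");
        // Same loader on both sides: the implementation is recognised.
        System.out.println(handlerFromA.isAssignableFrom(load(a, "SimpleHandler")));
        // Interface from one loader, implementation from another: it is not.
        System.out.println(handlerFromA.isAssignableFrom(load(b, "SimpleHandler")));
    }
}
```

The class names are identical in both loaders, but the JVM treats a class as the pair of its name and its defining classloader. Once an interface arrives through one loader and its implementation through another, the "implements" check fails even though the source code says otherwise.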

The Solution:

I can't explain exactly why but it took me *hours* to figure out that the problem was classloader related and *more* hours to work out the solution. I can only say that it was most likely a Monday (and possibly a Tuesday) and hope that you will forgive me for being somewhat dense.

What you really want to know, though, is how I solved the problem. It is my expectation that someone else may very well run into the same thing so here we go...

First we have my mojo:
import java.io.File;
import java.util.List;

import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;

/**
 * Invoke EDD to generate SQL from XML.
 *
 * @goal generateSql
 * @phase compile
 * @requiresDependencyResolution runtime
 */
public class GenerateSqlMojo extends AbstractMojo
{
    /**
     * Where to get the *.xml/*.sql files to be processed.
     *
     * @parameter expression="${project.build.directory}/classes"
     * @required
     */
    private String inputDirectory;

    /**
     * Prefix for the files created by EDD.
     *
     * @parameter expression="${project.build.directory}/classes/${project.artifactId}"
     * @required
     */
    private String outputPrefix;

    /**
     * The classpath elements of the project. We need this so that we can
     * get the JDBC driver for the project.
     *
     * @parameter expression="${project.runtimeClasspathElements}"
     * @required
     * @readonly
     */
    private List classpathElements;

    public void execute() throws MojoExecutionException
    {
        try
        {
            // Make sure the directory that will hold the generated files exists.
            new File(outputPrefix).getParentFile().mkdirs();
            new UtilityExecutor().execute(inputDirectory,
                                          outputPrefix, classpathElements);
        }
        catch (final Exception cause)
        {
            throw new MojoExecutionException("Failed to invoke EDD", cause);
        }
    }
}
OK... Not very exciting... In particular we see nothing about classpaths or classloaders other than the classpathElements parameter/attribute. All of that evilness is hidden in my UtilityExecutor. I'll tell you now: It could be done better! A more robust / reusable solution would extract the classloader rot out of UtilityExecutor and into a more generic helper class. That's a great idea... I'll do it later.

So, let's see UtilityExecutor:
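import java.io.File;
import java.lang.reflect.Constructor;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

public class UtilityExecutor
{
    public UtilityExecutor()
    {
        super();
    }

    // The convenience overloads called by the mojos are omitted; they delegate here.
    public void execute(final String inputDirectory, final String outputPrefix,
                        final Boolean saveOnlyMode,
                        final List<String> classpathElements)
        throws ...
    {
        // Build a classloader from the mojo's own URLs plus the project's classpath.
        final URL[] urls = buildUrls(classpathElements);
        final URLClassLoader cl =
            new URLClassLoader(urls, ClassLoader.getSystemClassLoader());
        Thread.currentThread().setContextClassLoader(cl);

        // Load this class again through the new classloader and run its "work" constructor.
        final Class clazz = cl.loadClass(getClass().getName());
        final Constructor ctor =
            clazz.getConstructor(
                new Class[] { String.class, String.class, Boolean.class });
        ctor.newInstance(new Object[] { inputDirectory, outputPrefix, saveOnlyMode });
    }

    // The "do some work" constructor: configures and launches the utility.
    public UtilityExecutor(final String inputDirectory,
                           final String outputDirectory,
                           final Boolean saveOnlyMode)
    {
        final Main main = new Main();
        main.setSaveOnlyMode(saveOnlyMode.booleanValue());
        main.setSaveTarget(outputDirectory);
        main.execute(new String[] { inputDirectory });
    }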

So let's break this down...

1) GenerateSqlMojo fires new UtilityExecutor().execute(...) (a convenience method which simply delegates to the execute(...) method shown).

2) execute(...) builds a URL[] for our custom classloader (more on that in a minute).

3) The custom classloader is created.

Now it gets a bit weird...

4) We ask the custom classloader to load the Class for ourselves. This causes the class to be loaded from the custom classloader against whatever classpath we built in #2 then

5) fetch the "do some work" constructor for ourself and

6) finally invoke the constructor to launch the utility. (I chose to do the "work" in the constructor to reduce the need for reflection BTW.)
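As for the "more generic helper class" I promised to write later, one piece of it might look something like the sketch below. This is not what my plugin does today: the names ContextClassLoaderRunner and callWith are made up for illustration, and unlike the code above it puts the original context classloader back when the work is done.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: runs a task with a temporary context classloader. */
final class ContextClassLoaderRunner {

    private ContextClassLoaderRunner() {
    }

    /**
     * Installs the given loader as the thread's context classloader, runs the
     * task, and restores the previous loader even if the task throws.
     */
    static <T> T callWith(ClassLoader loader, Callable<T> task) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return task.call();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Wrap checked exceptions so callers are not forced to declare them.
            throw new IllegalStateException("Task failed", e);
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        ClassLoader custom = new java.net.URLClassLoader(new java.net.URL[0], null);
        ClassLoader seen = callWith(custom,
            () -> Thread.currentThread().getContextClassLoader());
        System.out.println("task saw the custom loader: " + (seen == custom));
    }
}
```

Restoring the previous loader in a finally block matters inside a long-running build: whatever runs after the mojo on the same thread should not inherit a classloader that was only meant for the utility.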


Now a note about the buildUrls(...) method. There isn't anything really magic about what we're doing here. Simply take the URLs from our current classloader (the classloader that loaded the mojo) and combine them with the classpath elements provided from our execution environment. These classpath elements are the dependencies of the pom in which the plugin is being invoked. In my case I need this because the JDBC driver to be used is specific to the client of the plugin rather than to the plugin itself. I also need these because the utility expects to find some of its runtime configuration in a properties file in ${basedir}/target/classes.

Pay particular attention to the file.isDirectory() check... If your classpath element is a directory you need to append the trailing slash or your resources there (e.g. -- ${basedir}/target/classes) will not be found. This was something of a painful and frustrating lesson...
    private URL[] buildUrls(final List<String> classpathElements) throws MalformedURLException
    {
        // Start with the URLs of the classloader that loaded the mojo...
        final URL[] mojoUrls = ((URLClassLoader) getClass().getClassLoader()).getURLs();
        final URL[] urls = new URL[mojoUrls.length + classpathElements.size()];
        int ndx = 0;
        for (final URL url : mojoUrls) urls[ndx++] = url;
        // ...then add the classpath elements of the project using the plugin.
        for (String cpe : classpathElements)
        {
            final File file = new File(cpe);
            // A directory URL must end with a slash or its resources will not be found.
            if (file.isDirectory()) cpe += "/";
            urls[ndx++] = new URL("file:/" + cpe);
        }
        return urls;
    }
}
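If you want to see the trailing slash problem for yourself, here is a small standalone sketch. It has nothing to do with maven or my utility: the temp directory stands in for ${basedir}/target/classes and edd.properties is a made-up resource name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TrailingSlashDemo {

    /** Creates a temp directory holding one resource (stand-in for target/classes). */
    static Path createClassesDir() {
        try {
            Path dir = Files.createTempDirectory("classes");
            Files.write(dir.resolve("edd.properties"),
                "key=value\n".getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a file: URL for the directory, with or without the trailing slash. */
    static URL directoryUrl(Path dir, boolean trailingSlash) {
        String uri = dir.toFile().toURI().toString();
        while (uri.endsWith("/")) {
            uri = uri.substring(0, uri.length() - 1);
        }
        if (trailingSlash) {
            uri += "/";
        }
        try {
            return URI.create(uri).toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if a URLClassLoader built from the URL can find the resource. */
    static boolean canSeeResource(URL classpathEntry, String resourceName) {
        // A null parent keeps the application classpath out of the lookup.
        try (URLClassLoader cl = new URLClassLoader(new URL[] { classpathEntry }, null)) {
            return cl.getResource(resourceName) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = createClassesDir();
        System.out.println("with slash:    "
            + canSeeResource(directoryUrl(dir, true), "edd.properties"));
        System.out.println("without slash: "
            + canSeeResource(directoryUrl(dir, false), "edd.properties"));
    }
}
```

The reason is in the URLClassLoader documentation: a URL ending in "/" is assumed to be a directory, anything else is assumed to be a JAR file. It is also worth knowing that File.toURI() appends the slash for you when the file is an existing directory, which may be a tidier way to build these URLs than concatenating strings.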

BTW, I also have an UpdateSchemaMojo that is basically the same as GenerateSqlMojo but invokes a different convenience method on UtilityExecutor.


A Custom Lifecycle:

Now the point of my plugin is to create SQL from XML and optionally apply it to your schema. Great. That doesn't really fit with the typical maven build lifecycle. Never one to try the shallow end of the pool first I decided to jump in the deep end (with lead weights) and create a custom package type for my plugin. It turns out not to be too difficult.

To create a custom lifecycle you only need to create a plexus components.xml file. In my case it looks like this:
<component-set>
  <components>
    <component>
      <role>org.apache.maven.lifecycle.mapping.LifecycleMapping</role>
      <role-hint>sa</role-hint>
      <implementation>org.apache.maven.lifecycle.mapping.DefaultLifecycleMapping</implementation>
      <configuration>
        <phases>
          <process-resources>
            org.apache.maven.plugins:maven-resources-plugin:resources
          </process-resources>
          <compile>
            myco.util:myco-util-edd-plugin:generateSql
          </compile>
          <package>
            org.apache.maven.plugins:maven-jar-plugin:jar
          </package>
          <install>
            org.apache.maven.plugins:maven-install-plugin:install,
            myco.util:myco-util-edd-plugin:updateSchema
          </install>
          <deploy>
            org.apache.maven.plugins:maven-deploy-plugin:deploy
          </deploy>
        </phases>
      </configuration>
    </component>
    <component>
      <role>org.apache.maven.artifact.handler.ArtifactHandler</role>
      <role-hint>sa</role-hint>
      <implementation>org.apache.maven.artifact.handler.DefaultArtifactHandler</implementation>
      <configuration>
        <type>sa</type>
        <extension>zip</extension>
        <packaging>sa</packaging>
      </configuration>
    </component>
  </components>
</component-set>

It isn't really that bad... All I'm doing here is defining the stages of the lifecycle I'm interested in and the plugins to fire at each one. Notice that in install I specify both the standard install plugin as well as my own updateSchema plugin. This means that after my sa artifact is published to the local repo updateSchema will be fired to (optionally) update my database for me. (In case you're wondering, UpdateSchemaMojo has a Boolean parameter that will enable/disable the database update.)

The type, extension and packaging tags are there to tell maven about my custom packaging type and what to call it when it is installed. We will see what this means in just a bit.


The only thing left is the pom.xml for my plugin. There is nothing magic here either:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">

  <modelVersion>4.0.0</modelVersion>

  <version>1.4-SNAPSHOT</version>
  <groupId>myco.util</groupId>
  <artifactId>myco-util-edd-plugin</artifactId>
  <packaging>maven-plugin</packaging>

  <name>EDD Maven Mojo</name>

  <dependencies>
    ... As necessary
  </dependencies>
</project>

Using The Plugin:

OK, so now we know everything there is to know about the plugin. All that's left is to use it. Since we're introducing a new packaging type there is a detail we need to be aware of.

The pom starts out as usual:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">

  <modelVersion>4.0.0</modelVersion>

  <parent>
    ... as necessary
  </parent>

  <name>contacts database</name>
  <artifactId>contacts-database</artifactId>
then we specify our custom packaging type:
  <packaging>sa</packaging>
and then define the plugin. There are two things to note here: (a) the extensions tag tells maven that the plugin extends the standard behavior and (b) the enableEDD property that enables/disables the UpdateSchemaMojo.
  <build>
    <plugins>
      <plugin>
        <groupId>myco.util</groupId>
        <artifactId>myco-util-edd-plugin</artifactId>
        <version>1.4-SNAPSHOT</version>
        <extensions>true</extensions>
        <configuration>
          <enableEDD>${enableEDD}</enableEDD>
        </configuration>
      </plugin>
    </plugins>
In my case the plugin needs some runtime configuration from a property file so I use a resource tag to be sure that gets copied into ${basedir}/target/classes. I also need the Oracle JDBC driver so that the utility can connect to the database to update the schema.
    <resources>
      ... As necessary
    </resources>
  </build>

  <dependencies>
    <dependency>
      <!--
        We have an Oracle database so we need to have
        this in our classpath to pickup the JDBC driver.
      -->
      <groupId>oracle</groupId>
      <artifactId>ojdbc14</artifactId>
      <version>10.2.0.2</version>
    </dependency>
  </dependencies>
Modifying the schema with every build is probably not a good idea so we disable the update by default and invoke 'mvn -DenableEDD=true install' on demand. You could also control this with your ~/.m2/settings.xml and profiles.
  <properties>
    <enableEDD>false</enableEDD>
  </properties>

</project>

Postfix:

That's it my friend. It is a bit of a long post (hopefully making up for the previous one's shortness) but there was rather a lot to cover. Let's review what I've got here:

A) How to create a Java Mojo implementing your custom Maven 2 plugin functionality.

B) How to create a custom classloader to deal with oddball classloader issues.

C) A clever, IMO, way to isolate the classloader issues and minimize reflection. (Though, honestly, UtilityExecutor could be a bit more clever and less hackish.)

D) How to specify a custom package type with its own custom lifecycle.

E) How to use your shiny new plugin and its custom packaging type.


Yes, it took me a bit of time to get everything working. That's one of the problems with jumping into the deep end of the pool when you're first learning something new. On the other hand, I'm reasonably confident that I've faced many (hopefully most) of the odd edge-cases when creating plugins. With any luck this will be a true statement and I can lean on these experiences when I create my next, and hopefully simpler, plugin.

As always, thanks for reading. Feel free to post feedback & questions. Peace y'all.
