Tag: java



Once again, jasypt (“Java Simplified Encryption”) makes me smile.

Java comes with a comprehensive set of encryption utilities: JCE. I had to do some decryption the other day and ended up with this code.

public class Decryptor {
    private static final String ALGORITHM = "PBEWithMD5AndDES";
    private final Base64 base64 = new Base64();
    private final Cipher cipher;
    public Decryptor(String password) {
        try {
            cipher = Cipher.getInstance(ALGORITHM);
            // Just generate an empty salt as we don't want to use one.
            byte[] salt = new byte[cipher.getBlockSize()];
            Arrays.fill(salt, (byte) 0);
            // Set up the cipher.
            PBEKeySpec pbeKeySpec = new PBEKeySpec(password.toCharArray());
            SecretKeyFactory factory = SecretKeyFactory.getInstance(ALGORITHM);
            SecretKey key = factory.generateSecret(pbeKeySpec);
            PBEParameterSpec parameterSpec = new PBEParameterSpec(salt, 1000);
            cipher.init(Cipher.DECRYPT_MODE, key, parameterSpec);
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        } catch (NoSuchPaddingException e) {
            throw new RuntimeException(e);
        } catch (InvalidKeySpecException e) {
            throw new RuntimeException(e);
        } catch (InvalidKeyException e) {
            throw new RuntimeException(e);
        } catch (InvalidAlgorithmParameterException e) {
            throw new RuntimeException(e);
    public String decrypt(String cipherText) {
        try {
            // cipherText is base64, so will always be ASCII.
            byte[] cipherBytes = base64.decode(cipherText.getBytes("US-ASCII"));
            byte[] decryptedMessage = cipher.doFinal(cipherBytes);
            // Assume encrypted text is UTF-8.
            return new String(decryptedMessage, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        } catch (IllegalBlockSizeException e) {
            throw new RuntimeException(e);
        } catch (BadPaddingException e) {
            throw new RuntimeException(e);

This does also include conversion from base64 using Apache commons-codec, but it’s a lot of code. Even though I’ve tried to make things simpler (attempting to not use a salt [and I am aware of the consequences]), there’s still a lot of baggage.

Which is why I’m so grateful to have found jasypt. Essentially, all the above code is wrapped up for you in a nice little API. The above comes down to:

  SimplePBEConfig config = new SimplePBEConfig();
  config.setSaltGenerator(new ZeroSaltGenerator());
  StandardPBEStringEncryptor encryptor = new StandardPBEStringEncryptor();

So why did I have to write the code in the first place? Well, we need to get a 3rd party to implement it, and I can’t enforce dependencies upon them. But the difference in the amount of code is shocking.

Embedding the maven version number

If you build a jar file with maven, it helpfully embeds a properties file so that you can pull out information about the build environment. You just need some code like this.

  String path = "/META-INF/maven/groupId/artifactId/pom.properties"
  InputStream stream = getClass().getResourceAsStream(path);
  Properties props = new Properties();
  System.out.println("my version is" + props.get("version"));

Now, this is all well and good, but it only works when you’ve built the artifact1. In development, it simply doesn’t exist, and you have to work around that in the code.

It would be much simpler if you could just compile in the version as a constant.

Thankfully, when I asked the question on the maven-users list, Lee Meador showed me an example. In a nutshell, you use ant to generate an extra source file containing the version number.

First, you need a little ant script to emit some Java code.

  <project basedir=".." default="generate-version">
    <target name="generate-version">
    	<mkdir dir="${src.dir}/com/mycompany/myproject"/>
        <echo file="${src.dir}/com/mycompany/myproject/Version.java">
            package com.mycompany.myproject;
            /** Automatically generated by ant. */
            public class Version {
              public static final String VERSION = "${version}";

Then, you need to add the antrun plugin to your pom.xml.

            <ant antfile="etc/build-version.xml" inheritAll="false" inheritRefs="false">
              <property name="version" value="${project.version}" />
              <property name="src.dir" value="${project.build.directory}/generated-sources" />

This binds the antrun:run goal to the “generate-sources” phase (which happens right at the start of the build). It runs the little ant script above, passing in the correct properties. Then, it adds a new source directory, so maven knows to compile that one file that we’re generating.

When I initially tried this in Eclipse, it couldn’t find the Version class. However, running “Maven → Update Project Configuration” added target/generated-sources as a new source folder. And everything worked just fine after that. If you don’t have that option, you need to update to a new version of m2eclipse. Deleting and reimporting the project should sort it out.

This might seem like a fairly complicated way of solving the problem of getting the maven version number. But it’s more reliable than any of the other methods I have seen, so I’m pretty happy with it.

1 Yeah, I know. American spelling. I’ve been polluted by years of exposure.

AntUnit inside Maven

I’m in the middle of converting jslint4java to use maven as its build system (yes, really). Part of this is ensuring that the antunit tests I wrote continue to work. Maven has the antrun plugin, but it’s not 100% obvious how to use an antlib inside it.

Normally, to run an antlib extension as part of your build, you first dump the jar in ~/.ant/lib (or on the command line using a -lib flag) and then reference it in a namespace. e.g.

  <project name="foo"

This relies on ant being able to pick out org/apache/ant/antunit/antlib.xml from the classpath.

When doing this in maven through the antrun plugin, you can’t just dump stuff into ~/.ant/lib however. You need to tell ant where everything is. Originally, I was following the example of the rat ant task. This attempts to invoke a second copy of ant with the correct -lib argument. It’s ugly though. Why should we fork a second copy of the JVM? And ant may not even be installed (in maven, it’s just jar files in the local repository).

Eventually, I looked closer at the antlib documentation and found Load antlib from inside of the buildfile. This shows how a typedef is the key: You can associate a given antlib URI with a particular classpath entry. Thankfully, maven-antrun-plugin provides ${maven.test.classpath} (amongst others) which contains every entry that we need. So the solution now looks something like this.

Firstly, pom.xml.

                <ant antfile="${build.testOutputDirectory}/antunit/tests.xml"
                  inheritAll="false" inheritRefs="false">
                  <property name="test.classpath" refid="maven.test.classpath" />
          <!-- Force an upgrade to a newer ant version.  Required by antunit. -->

So, when we get to the test phase, we automatically run antunit/tests.xml, passing in the correct classpath. This is what the ant file looks like:

  <project name="jslint-antunit"
      <taskdef uri="antlib:org.apache.ant.antunit"
               classpath="${test.classpath}" /></project>

This appears to work quite well, and runs my tests a sight quicker than before. If anybody from the apache rat project is listening, you might want to update your POMs. 🙂

REST Web Services in Java

I’ve been taking a look at jersey today. It’s an implementation of JSR 311, which is a proposed spec for implementing REST-like web services in Java. I started with the tutorial, which is quite frankly pretty bloody simple.

  public class HelloWorldResource {
    public String getClichedMessage() {
      return "Hello World";

The javadoc for the annotations explains what they do in some more detail (Path, GET, Produces).

The tutorial also includes a method of running that standalone using the grizzly HTTP connector. But I’m a stick-in-the-mud. I want it to run under tomcat. So how do I do that? Again, it turns out to be fairly simple. Just add the jersey-supplied integration servlet to your web.xml.


I specified an extra parameter to say which package I want scanned for resources. Otherwise Jersey will scan everything in /WEB-INF/classes and /WEB-INF/lib by default. Anyway, this works a treat:

% curl -i http://localhost:8080/jerseytest/hello
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/plain
Transfer-Encoding: chunked
Date: Sat, 13 Sep 2008 20:44:41 GMT

hello world

I’m going to play with this some more. The API feels quite nice. Particularly when compared to Restlets, which I found a little bit too uniform. Mind you, the latest restlet code has its own version of JSR 311 support, so it may still be useful. It’s also actually worth checking out the JSR-311 spec, as it’s quite readable.

Sitemap components in Cocoon 2.2

For the cocoon 2.1 project I did last year, I wrote a few components in Java (mostly Generators and one InputModule). It’s a bit of a pain because it’s built on the out-of-date and intrusive avalon framework. Anyway the end result is that you can write things in your cocoon sitemap like:

  <map:match pattern="foo">
    <map:generate type="foo" />
    <map:transform type="xslt" src="{my:something}" />
    <map:serialize type="html" />

Now with the newer cocoon 2.2 project, we needed to reuse the InputModule. In order to get an Avalon component working in cocoon 2.2, you need:

  • src/main/java/com/myco/MyInputModule.java
  • src/main/resources/META-INF/cocoon/avalon/my-input-module.xconf

The latter is the old style avalon configuration file and looks like this:


But cocoon 2.2 is meant to be Spring based. So we should be able to write a Spring bean that does the same thing. Unfortunately, this doesn’t appear to be documented anywhere. But there’s always the source code for cocoon itself.

As an aside, I’m hugely grateful to Jukka Zitting for making available git mirrors of apache projects. This let me download all of cocoon’s source, including full history in a very short time. It’s a helluva useful resource, and I hope it gets proper support from the ASF.

So, I started looking through the Cocoon source code for Spring config files that might reference InputModules.

⇒ find . -name '*.xml' | grep META-INF/cocoon/spring | xargs egrep InputModule
./core/cocoon-servlet-service/cocoon-servlet-service-components/src/main/resources/META-INF/cocoon/spring/cocoon-servlet-linkRewritingReader.xml:        <property name="inputModule" ref="org.apache.cocoon.components.modules.input.InputModule/servlet"/>
./core/cocoon-servlet-service/cocoon-servlet-service-components/src/main/resources/META-INF/cocoon/spring/cocoon-servlet-service-complete-path-module.xml:    <bean name="org.apache.cocoon.components.modules.input.InputModule/servlet"
./core/cocoon-servlet-service/cocoon-servlet-service-components/src/main/resources/META-INF/cocoon/spring/cocoon-servlet-service-complete-path-module.xml:        <property name="blockPathModule" ref="org.apache.cocoon.components.modules.input.InputModule/block-path"/>
./core/cocoon-servlet-service/cocoon-servlet-service-components/src/main/resources/META-INF/cocoon/spring/cocoon-servlet-service-path-module.xml:    <bean name="org.apache.cocoon.components.modules.input.InputModule/block-path"
./core/cocoon-servlet-service/cocoon-servlet-service-components/src/main/resources/META-INF/cocoon/spring/cocoon-servlet-service-property-module.xml:	<bean name="org.apache.cocoon.components.modules.input.InputModule/block-property"

That seems to suggest that by creating a Spring bean whose name is the interface followed by a slash and then a name, you should be able to get at it in the sitemap. And it does indeed work like that. This is the bean name I ended up with.

  <bean name="org.apache.cocoon.components.modules.input.InputModule/my"
    <property name="message" value="aardvark" />

In order for it to not be an Avalon component, I also removed the base class I was using and just implemented the InputModule interface directly. And to my surprise, I could use the bean directly in the sitemap.

But we have the source, so we can see why it’s working. Did you notice the ROLE constant in InputModule? We can grep for that in the source. That quickly led me to PreparedVariableResolver.java, which contains:

  InputModule module;
  try {
    module = (InputModule) this.manager.lookup(InputModule.ROLE + '/' + moduleName);
  } catch (ServiceException e) {
    throw new PatternException("Cannot get module named '" + moduleName +
    "' in expression '" + this.originalExpr + "'", e);

The manager field is an instance of ServiceManager, which just happens to be AvalonServiceManager. And lookup() is a standard Spring construct:

  try {
    return this.beanFactory.getBean(role);
  } catch (BeansException be) {
    throw new ServiceException("AvalonServiceManager",
      "Exception during lookup of component with '" + role + "'.", be);

Further nosing around reveals that Generators are looked up in Spring in a similar manner inside AbstractProcessingPipeline:

  try {
    this.generator = (Generator) this.newManager.lookup(Generator.ROLE + '/' + role);
  } catch (ServiceException ce) {
    throw ProcessingException.throwLocated("Lookup of generator '" + role + "' failed", ce, getLocation(param));

I’m guessing that transformers, serializers and readers follow a similar pattern.

So, that’s how to do Cocoon components in Spring. It’s a bit simpler than before, which can’t be a bad thing. I just wish it was documente, but that’s a big problem for Cocoon anyway…

jslint4java 1.2

I’ve finally gotten around to finishing off the code that I’ve had sitting around in google code for over a year, and released jslint4java 1.2. The changes are actually pretty small:

  • Update to jslint 2008-09-04. This adds several new options.
  • Several updates to the ant task:
    • Move the antlib definition to “antlib:net.happygiraffe.jslint” (incompatible change).
    • Default to failing the build if validation fails (incompatible change).
    • Allow use of the fileset element to be more flexible about specifying where your javascript files are. This replaces several attributes on the jslint task (incompatible change).
    • Add a formatter sub-element, which can output either XML or plain text, either to the console or to a file.
    • See the documentation for some examples of the new usage.
  • Allow access to the HTML report produced by JSLint through an API.

I’ve also uploaded the javadoc directly into subversion on google code, so it’s permanently online. The google-collections guys seem to do this, so it can’t be a bad idea, right?

Typical. Five minutes after I release, I notice that it’s been compiled with 1.6 instead of 1.5. So, I’ve just released v1.2.1…

What's in an IP address?

What is an IP address? To many people, it’s just a sequence of four digits separated by dots1. But really it’s just a 32 bit integer. So, my IP address right now is As a 32 bit integer, that’s:

  • 0xC0A80134 (hexadecimal)
  • 11000000 10101000 00000001 00110100 (binary)

An IP address on it’s own isn’t terribly useful. It’s normally part of a network. That means that a certain part of the IP address designates the network and a certain part designates the “host” (your computer). Conventionally, this is represented through a network “mask” — a series of bits stating which bits identify the network and which bits don’t. My netmask is presently displayed as Again, as an integer that is:

  • 0xffffff00 (hexadecimal)
  • 11111111 11111111 11111111 00000000 (binary)

Combining an IP address and a netmask, you can find out just the network. You have to use a “binary AND” to work it out.

  11000000 10101000 00000001 00110100 (
& 11111111 11111111 11111111 00000000 (
= 11000000 10101000 00000001 00000000 (

Effectively, that’s the lowest address in the network. Likewise, it’s fairly simple to work out the highest address in the network. You use a “binary OR” with the inverse of the netmask. i.e.

  11111111 11111111 11111111 00000000 (
~ 00000000 00000000 00000000 11111111 (
| 11000000 10101000 00000001 00110100 (
= 11000000 10101000 00000001 11111111 (

Thankfully, these are both far more expressible in high level programming languages.

  lowest  = address &  mask;
  highest = address | ~mask;

Now you may have noticed that the netmask is a contiguous range of bits. It’s common to express the netmask as just the number of bits in the netmask. So my IP address and netmask are expressible together as This is known as CIDR format (and the number of bits is known as a prefix length).

Here’s some sample Java code for representing an IP address2. Doing this in Java is particularly entertaining because Java only has signed integers, unlike C. So there are some shenanigans involved in making the netmask.

  public class IP implements Comparable<ip> {
    private static final int MAX_MASK = 32;
    private static final int MIN_MASK = 0;
    private final int addr;
    private final int prefixLength;
    public IP(int addr) {
      this(addr, MAX_MASK);
    public IP(int addr, int prefixLength) {
      this.addr = addr;
      if (prefixLength < MIN_MASK || prefixLength > MAX_MASK)
        throw new IllegalArgumentException("mask must be >= 0 && <= 32");
      this.prefixLength = prefixLength;
    public int compareTo(IP o) {
      if (o == null)
        throw new NullPointerException("can't compare to null");
      if (addr < o.addr)
        return -1;
      if (addr > o.addr)
        return 1;
      return prefixLength - o.prefixLength;
    public boolean equals(Object obj) {
      if (!(obj instanceof IP)) {
        return false;
      IP other = (IP) obj;
      return prefixLength == other.prefixLength
          && (addr & getMask()) == (other.addr & other.getMask());
    public IP getLowerBound() {
      return new IP(addr & getMask());
    public int getMask() {
      // note: the >>> operator on 32-bit ints only considers the
      // five lowest bits of the shift count, so 32 shifts would
      // actually perform 0 shift!
      return prefixLength == 32 ? -1 : ~(-1 >>> prefixLength);
    public IP getUpperBound() {
      return new IP(addr | ~getMask());
    public int hashCode() {
      return addr ^ getMask();
    private String intToString(int i) {
      return (i >> 24 & 0xFF) + "."
           + (i >> 16 & 0xFF) + "."
           + (i >> 8  & 0xFF) + "."
           + (i & 0xFF);
    public String toString() {
      StringBuilder sb = new StringBuilder();
      if (prefixLength != MAX_MASK) {
      return sb.toString();

This is fairly basic (it could use a contains() method, for example). But it’s been quite useful for me over the last couple of weeks for representing CIDR addresses. By treating an IP address as a CIDR range of /32, it’s handy for the general case.

I hope that this is of use to somebody who is tempted to use a String to store an IP address in future.

Of course, whilst it’s useful, you also need a parser of sorts to go from the String form to an instance of the IP class. This is what I’ve been using, which I “borrowed” from limewire’s IP.java.

  public class IpParser {
    private int parseMask(String maskStr) {
      return Integer.parseInt(maskStr);
    public IP parseCidr(String addr) {
      int slash = addr.indexOf('/');
      if (slash == -1) {
        return parseSingleAddr(addr);
      } else if (addr.lastIndexOf('/') == slash) {
        return parseCidrOnly(addr);
      } else {
        throw new IllegalArgumentException("Can't parse " + addr);
    private IP parseCidrOnly(String addr) {
      int slash = addr.indexOf('/');
      String maskStr = addr.substring(slash + 1);
      String addrStr = addr.substring(0, slash);
      return new IP(stringToInt(addrStr), parseMask(maskStr));
    private IP parseSingleAddr(String addr) {
      return new IP(stringToInt(addr));
    private int stringToInt(final String ipStr) {
      int ip = 0;
      int numOctets = 0;
      int length = ipStr.length();
      // loop over each octet
      for (int i = 0; i < length; i++, numOctets++) {
        int octet = 0;
        // loop over each character making the octet
        for (int j = 0; i < length; i++, j++) {
          char c = ipStr.charAt(i);
          if (c == '.') { // finished octet?
            // can't be first in octet, and not ending 4th octet.
            if (j != 0 && numOctets < 3) {
              break; // loop to next octet
          } else if (c >= '0' && c <= '9' && j <= 2) {
            octet = octet * 10 + c - '0';
            // check it's not a faulty addr.
            if (octet <= 255) {
          throw new IllegalArgumentException("Could not parse " + ipStr);
        ip = ip << 8 | octet;
      // if the address had less than 4 octets, push the ip over suitably.
      if (numOctets < 4) {
        ip <<= (4 - numOctets) * 8;
      return ip;

1 Ignoring IPv6 for now, as I don’t have any need for it.

2 Don’t get me started on how non-useful Java’s InetAddress class is.

Logging in Java

Logging has long been a pet peeve of mine. I find it intensely irritating to arrive at a tomcat installation and see a catalina.out file hundreds of megs large because some fool of a developer thought that System.out.println() was a good logging tool. I recently gave a presentation at work about logging, and this is the contents that it was based on (mildly rewritten).

Why use logging?

Well, you could use System.out.println(). But as alluded to above, this is a very coarse tool. There’s no way to turn it off without altering the code. With a logging system, you get much greater control, including the ability to control what gets logged and where it ends up. And you get to control it at runtime without altering your code.

Generally, proper logging gives you better visibility into your project and its behaviour.

Perhaps most importantly you should do it in order to not piss off your sysadmins by generating a very large non-rotateable file.

Which logging system to use?

There are quite a few out there.

In addition, there are “abstraction libraries,” which allow you to choose different logging systems to actually use. For example:

My advice is that unless you’re writing a library that you expect 3rd parties to use (over whom you have no control), then you should stick to log4j. It’s extremely widely used and well known. It works very well. It’s also wonderfully configurable.

If you do have to write a library for 3rd parties, then commons-logging is the more popular choice. Whilst it certainly has it’s flaws, it does appear to work for me. Caveat emptor.

But what about JDK Logging?

JDK Logging was patterned after log4j. Only they made the configuration a lot less flexible. The main flaw (as I see it) is that you can only configure it via a system property, which means it’s inherently global. When you’re running several webapps in the same JVM (via tomcat), this is a fatal flaw.

Actually, I’ve since learned that tomcat works around this by providing it’s own replacement implementation of LogManager. Regardless, the configuration is still very anaemic when compared with log4j.

What should I log?

  • At DEBUG, log whatever the hell you like so you know you’ve been down a particular code path, or what a certain piece of data looks like.
  • At INFO, support should be able to trace through transactions and figure out what’s going on.
  • At WARN, note things that could constitute a problem.
  • At ERROR, note things are are a problem.
  • At FATAL, note why you cannot continue (“my CPU is on fire”).

All messages above DEBUG must be readable by non-developers — plain English, please!

Remember to include as much context as possible. Don’t just note that something failed. Say what the conditions were that caused that failure.

Beginning log4j

First, create a very simple log4j.properties (you can use log4j.xml but it’s more verbose and the extra features it makes available are rarely needed).

  log4j.rootLogger=WARN, stdout

NB: I set the default logging level to WARN and left our code logging at INFO. This is in order to reduce junk output from 3rd party libraries like Spring and Hibernate. Of course, you can always enable that if needed.

Place this somewhere it will end up at the root of your classpath.

  • For maven, put it in src/main/resources/log4j.properties.
  • For an ant project, put it in src/log4j.properties.

Now, create a logger and use it.

  package com.me;
  public class Foo {
    // Non-static allows use of getClass().
    private Logger LOG = Logger.getLogger(getClass());
    public void logSomething() {

That should end up on your console. Hurrah!

NB: Always using a consistent field name for your logger makes it much easier to grep for log messages later on. It’s a good candidate for an Eclipse template.

How do I enable DEBUG for a single class/package?

Add this to log4j.properties.


Everything else will continue to be logged at the level of the rootLogger, but Spring stuff will come out more verbosely.

To understand how this works, you have to understand the idea of “logger hierarchies.” If you have configured the rootLogger as WARN and the “com.me.proj2” logger as INFO, this is how the logger hierarchy could end up looking.


How do I customize the output format?

Configure a different Layout in log4j.properties. e.g. PatternLayout.

 log4j.appender.stdout.layout.conversionPattern=%d{ISO8601} %c{2} %p %m%n

That configures each line with the following fields:

  • The date & time in standard ISO 8601 format.
  • The last two components of the classname which made the log message.
  • The priority level.
  • The message itself.
  • A newline.

How does log4j get configured?

The configuration process is fairly simple. The first time you call log4j, it looks on the classpath for log4j.xml and then log4j.properties. That’s all there is to it, unless you change it.

There are several ways to impose your own order on things.

  • You can specify a log4j.configuration system property pointing a file. But as we discussed earlier, system properties are generally best avoided.
  • You can use the log4j API (PropertyConfigurator or DOMConfigurator) to find the config file in the location you want. You have to execute this code before the rest of your app starts up.

If it still doesn’t seem to be doing what you expect, you can (as a one-off) specify the log4j.debug system property on the command line. That will show you where log4j is looking for its configuration on stderr.

How do I configure logging in my tests?

Generally you don’t want logging in tests. They’re verbose enough already, and JUnit will let you know when something’s wrong. You can enforce this by creating a second log4j.properties that lives in the classpath of your tests, which sets the rootLogger to WARN (and nothing else more detailed).

  • For maven, put it in src/test/resources/log4j.properties.
  • For an ant project, put it in test/log4j.properties.

How should I configure log4j in a webapp?

As early as possible. This lets you see debug output from the rest of your app starting up.

That is, in a special ServletContextListener. Trying to configure log4j from within a Spring applicationContext is far, far too late (this appears to be a common mistake).

Spring has a special purpose Log4jConfigListener for this purpose. You can write your own, however. Here’s an example of the web.xml configuration.


How should log4j be setup for development / production in a webapp?

Don’t have a default /log4j.properties on the classpath. Instead, have the Log4jConfigListener set up in the web.xml, pointing at /WEB-INF/log4j.properties (as shown above). That file should contain your development logging setup (INFO level output to the console) as well as commented out examples of production logging setup (WARN level output to a rotated file).

Create a file META-INF/context.xml (for tomcat), which contains a commented out override for the log4jConfigLocation context-param. e.g.

   <parameter override="false"

When packaged as a war, the two files are easily extracted and can be used to override the logging behaviour.

 % unzip -p ~/foo.war META-INF/context.xml > $sitecode/conf/Catalina/localhost/foo.xml
 % vi $CATALINA_BASE/conf/Catalina/localhost/foo.xml
 % unzip -p ~/foo.war WEB-INF/log4j.properties > /etc/myapp/log4j.properties
 % vi /etc/myapp/log4j.properties

This is probably the most flexible solution. It means that you can deploy the same war file anywhere, and just have a small amount of one-time setup to get it going.

How do I get my log files rotated?

Log4j comes with a builtin DailyRollingFileAppender which will give you a new log file every day. However, I’ve tended to prefer the DatedFileAppender addon, which works in a very similar manner to cronolog. That is, each log file is named with the current date in it.

This is how it’s configured in log4j.properties.

log4j.rootLogger = WARN, logfile
log4j.appender.logfile.layout.ConversionPattern=%d{ISO8601} %p %c{2} - %m%n

What about exceptions?

Lots of people tend to log that an exception occurred and then move on as if nothing happened.

  // FAIL
  try {
  } catch (SomeNastyException e) {

This is bad because it loses all the context associated with the exception, as well as failing to stop the course of the program. As I noted before, you’re much better off rethrowing it for something higher up to deal with.

  try {
  } catch (SomeNastyException e) {
    throw new RuntimeException(e);

Whilst this doesn’t work in all cases, it’s a far better default stance to take.

You might also want to look at Best Practises For Exception Handling.

How do I get a unique per-request ID in my logs?

Tools like postfix and sendmail include the message-id in the log files. This is invaluable for tracing a message over several lines of a log file. I needed to do this in a webapp and found log4j’s Nested Diagnostic Context. This is how I implemented it.

  public class AddIdListener implements ServletRequestListener {
    private static long requestNum = 0;
    private static synchronized long getNextRequestNum() {
      return requestNum++;
    public void requestInitialized(ServletRequestEvent sre) {
    public void requestDestroyed(ServletRequestEvent sre) {

With this configured in the web.xml, you need to use the PatternLayout to get it into the logs.

 log4j.appender.stdout.layout.conversionPattern=%d{ISO8601} %x %c{2} %p %m%n

Now, all requests with the same value in the %x field belong to the same request.

This may not be a perfect solution, but it works quite well for me.

A performance tip

Sometimes, a call to a logger can be expensive. Particularly in the middle of a tight loop. For this, you can ask log4j if it’s even going to bother using the output, which is quicker.

  for (BigHairyObject bho : bigHairyObjects) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("here, bho=" + bho);

SAX EntityResolver

I was trying to resolve entities (&weirdChar;) in an XML file. Easy enough, use a validating parser. But here’s the tricky bit: get the entity definitions from the classpath. This should still be easy, as SAX provides an EntityResolver.

Unfortunately, the interactions between JAXP and SAX make life complicated. I found that you have to ignore the SAXParser (from JAXP) and instead focus on the XMLReader interface (part of plain old SAX).

This is what I came up with. First, a small driver.

  public void parseIt() {
    SAXParserFactory spf = SAXParserFactory.newInstance();
    XMLReader reader = spf.newSAXParser().getXMLReader();
    reader.setEntityResolver(new MyResolver());
    // Look for test.xml on the classpath.
    InputStream testXmlStream = App.class.getClassLoader().getResourceAsStream("test.xml");
    reader.parse(new InputSource(testXmlStream));

That references the EntityResolver implementation I wrote:

  class MyResolver implements EntityResolver2 {
    public InputSource resolveEntity(String name, String publicId, String baseURI, String systemId)
      throws SAXException, IOException {
      InputStream stream = getClass().getClassLoader().getResourceAsStream(systemId);
      return new InputSource(stream);

Actually, I had to use EntityResolver2 for reasons I don’t entirely understand.

On top of this, I found that I had to include xerces 2.8 explicitly as a dependency. The version bundled with Java 1.5 is Xerces 2.6.2, which has a bug: It passes the entity resolver an absolutized systemId. Which makes it very difficult to resolver further. What a pain in the arse.

But it does now work, and I can successfully resolve entities off the classpath.

Logging in Cocoon 2.2

I’ve had to try and understand logging in Cocoon 2.2 for a project at work recently. It’s been “interesting,” so I thought I’d blog the process in case anybody else needs to o this…

Normally, logging in Java is quite simple: you add log4j to your classpath, then create a log4j.properties to say what gets logged. If you’re running as part of a webapp, you use something like Spring’s Log4jConfigListener to ensure that the configuration gets applied as soon as the webapp is active.

Cocoon is different. By default (in development) you run it using a combination of the cocoon-maven-plugin and mvn jetty:run. This is quite cunning (as Grzegorz explained in a comment a while back), it lets you edit all sorts of stuff and have it dynamically reloaded. In order to make the cocoon “block” work with jetty, the maven plugin creates things like web.xml for you automatically, so there’s no chance to edit things. Drat.

Now, if you follow the documentation for logging in Cocoon, it advises:

The usual Cocoon web application sets up Log4j through the Cocoon Spring Configurator.

Lovely advice, except that by the time Spring has started up, read your logging configuration and applied it, a great deal of interesting events have already occurred. You really need to enable logging as early as possible using a ServletContextListener.

Thankfully, it’s possible to do so, even when using mvn jetty:run.

First, you need to create a log4j configuration that’s suitable. I think it has to be XML, which is a shame as it’s more complicated than the properties file. I wanted to change the defaults to get my FlowScript calls to log to the console. This is what I ended up with in etc/log4j.xml.

  <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
  <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
    <appender name="stdout" class="org.apache.log4j.ConsoleAppender">
      <param name="target" value="System.err"/>
      <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d{ISO8601} %c{2} %p - %m%n"/>
    <logger name="cocoon">
      <level value="INFO" />
    <logger name="org.apache.cocoon.components.flow.javascript.fom">
      <level value="INFO" />
      <priority value="WARN"/>
      <appender-ref ref="stdout"/>

Note that we shut everything up (WARN) by default, and then explicitly enable messages for things we want to see (INFO). I’ve found that even in development, this helps to tell the wood from the trees.

With that in place, you have to edit pom.xml in order to tell the cocoon-maven-plugin to use that instead if its default. The default pom should have a build/plugins/plugin section, to which this stanza needs adding.

    <!-- Gets copied to target/.../WEB-INF/log4j.xml -->

Finally, you need to arrange for the auto-generated web.xml to be patched with a reference to Log4jConfigListener. This is done through Cocoon’s slightly arcane mechanism, xpatch. Create a file src/main/resources/META-INF/cocoon/xpatch/log4j.xweb which looks like this.

  <xweb xpath="/web-app"
        unless="comment()[contains(., 'Log4j Configuration')]"
    <!--Log4j Configuration-->

Now if you run mvn jetty:run in your block and inspect the generated web.xml, you should see the above patched in to place. Also, you should be able to generate messages on the console from within FlowScript by doing:

  cocoon.log.info("hello world");

The procedure above is a hassle. But the benefit of being able to see logging messages coming out on the console in front of you is significant.

One final point to note. When you do run mvn jetty:run, you’ll see a few log4j errors, i.e.

log4j:WARN No appenders could be found for logger (org.apache.commons.configuration.ConfigurationUtils).
log4j:WARN Please initialize the log4j system properly.

log4j:WARN No appenders could be found for logger (org.apache.commons.jci.stores.MemoryResourceStore).
log4j:WARN Please initialize the log4j system properly.

As far as I can tell these are completely ignorable, just very annoying. They appear to happen before jetty itself starts up, and are irrelevant to the web app (as far as I can see).