mod_perl 1 blows chunks

At $WORK, I’m looking at a web service built on mod_perl 1 / Apache 1. The service takes XML as input and returns XML as output. So far, so good.

Unfortunately, whilst I was testing with curl, I found something odd:

  curl -s -v -T- -H'Content-Type: text/xml;charset=UTF-8' http://localhost/api < input.xml

That -T- says to do a PUT request from stdin. It failed, and my code returned “no input”.

But when I did this, things worked:

  curl -s -v -Tinput.xml -H'Content-Type: text/xml;charset=UTF-8' http://localhost/api

That reads directly from the file. The only difference between the two requests is that the latter includes a Content-Length header whilst the former has Transfer-Encoding: chunked instead.

This is the code that was reading the request.

    my $content;
    if ( $r->header_in( 'Content-Type' ) ) {
        # Read exactly Content-Length bytes of the request body.
        $r->read( $content, $r->header_in( 'Content-Length' ) );
    }
    return $content;

So, if there’s no Content-Length, what should we do? My first stop is always the venerable eagle book. There’s a little footnote next to read():

At the time of this writing, HTTP/1.1 requests which do not have a Content-Length header, such as one that uses chunked encoding, are not properly handled by this API.

Marvellous. Now, I had a look around in the source code and noticed a function called new_read(). Unfortunately, that failed to work. It stopped chunked reads, but failed to work for ordinary ones.

I did see a post on the mod_perl mailing list which reckoned you could loop and read all input. But I was unable to get that to work either.

So I just decided to disallow chunked input. That’s fairly easy to do, and HTTP has a special status code for it: 411 Length Required. It’s not ideal, but unless this project gets upgraded to Apache 2 (unlikely, quite frankly), it seems to be the best option.
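
The check itself is simple enough to sketch in plain Perl. This is only an illustration, not the code from the service: body_status() is a made-up helper that takes the request headers as a hash ref and returns a numeric status. In a real mod_perl 1 handler the header values would come from $r->header_in() and the status would be returned from handler().

```perl
use strict;
use warnings;

# Hypothetical helper: decide the HTTP status for a request body,
# given a hash ref of request headers as sent by the client.
sub body_status {
    my ($headers) = @_;

    # Header names are case-insensitive, so normalise them first.
    my %h = map { lc($_) => $headers->{$_} } keys %$headers;

    # Chunked bodies carry no Content-Length; refuse them outright.
    return 411
        if defined $h{'transfer-encoding'}
        && $h{'transfer-encoding'} =~ /\bchunked\b/i;

    # No usable length at all: also 411 Length Required.
    return 411
        unless defined $h{'content-length'}
        && $h{'content-length'} =~ /^\d+$/;

    return 200;
}

print body_status( { 'Content-Length'    => 42 } ),        "\n";   # 200
print body_status( { 'Transfer-Encoding' => 'chunked' } ), "\n";   # 411
```

With this in place, the curl -T- request from above would be answered with 411 instead of a misleading “no input”.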


mod_perl 2 not ready yet

I’ve spent nearly three days this week trying to port one of our sites to Apache 2.2 and mod_perl 2.0.2 (from Apache 1.3.33). It should have been a relatively simple exercise, thanks to the porting notes that are available.

Yet sadly, there are still a lot of problems with mod_perl 2. Firstly, much CPAN software still hasn’t adjusted. In my case it was SOAP-Lite. But I also noticed that libapreq wasn’t classed as “ready” by Mason, so we had to fall back to there.

But the real killer is that they managed to completely break environment variables, in the name of thread safety. Unfortunately, our application uses Inline::Java from inside Apache to talk to Lucene. Now Inline::Java spawns a JVM to run a JAR file and be the server. So that the JVM can find the JAR (as well as the Lucene JAR and our code), it sets $ENV{CLASSPATH}. Except that change never shows up, so the JVM just says “unknown class” and exits.

Basically, mod_perl2 breaks system. This is not clever.
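
If the problem is that changes to %ENV never reach child processes, one possible workaround for code you control is to hand the variables to the child explicitly through env(1). The sketch below is plain Perl, and I have not tried it under mod_perl 2. run_with_env() and the path are made up for illustration. It also does nothing for a module such as Inline::Java that spawns its own process.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command with extra environment variables
# passed explicitly through env(1), rather than relying on %ENV being
# copied into the child. Returns whatever the command wrote to stdout.
sub run_with_env {
    my ( $env, @cmd ) = @_;
    my @pairs = map { "$_=$env->{$_}" } sort keys %$env;

    # List form of open: no shell is involved, so no quoting issues.
    open my $fh, '-|', 'env', @pairs, @cmd
        or die "cannot run env: $!";
    local $/;               # slurp all of the child's output
    my $out = <$fh>;
    close $fh;
    return $out;
}

# The child here is just perl echoing the variable back.
my $seen = run_with_env(
    { CLASSPATH => '/opt/app/lucene.jar' },    # made-up path
    $^X, '-e', 'print $ENV{CLASSPATH}',
);
print "$seen\n";
```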

So we won’t be upgrading to Apache 2.2 for a while. This is a shame, as it has a bug fix we really need (support for files larger than 4GB). Instead, we switched to using FTP. Yuck.


Servlets vs mod_perl

I’ve been looking at the Java servlets stuff for a couple of days now and it’s clear that it bears a remarkable similarity to mod_perl.

  • In mod_perl, you define handler() which gets passed a $r object to deal with the request and send a response. Servlets make things a little more explicit. You write doGet() or doPost() methods, which get passed separate request and response objects. But together they’re basically equivalent to $r.
  • mod_perl and servlets both have filters, but they work in slightly different ways. Servlets are much more explicit; you have to forward the call on to the next servlet when you’re done. mod_perl handles things nearly implicitly (with help from Apache::Filter).
  • mod_perl exposes the Apache httpd API, so you get access to the different request phases. Servlets run in a completely different environment, so that just isn’t applicable to them.
  • What’s also incredibly similar about both pieces of software is that most people think they’re far too low level and have come up with higher level “frameworks” on top. The standard one for Java appears to be JSP, which on close inspection bears a marked similarity to HTML::Mason. As with mod_perl though, no one framework covers everybody’s needs, so there are lots of similar, competing ones.
  • One thing that’s markedly different between the two is deployment and configuration. Most mod_perl deployments tend towards the messy, unless the application is packaged up like a CPAN module. But configuration is always manually done. Servlets come with a nicely defined “deployment descriptor” (aka “config file”—they love big words in the Java world). In addition, there’s a standardised directory layout and packaging format making it easy to deploy your application. In theory. I haven’t actually got as far as deploying an application in a production environment yet, but it looks reasonable to me.
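
To make the first comparison concrete, here is a rough sketch in plain Perl. A single handler() receives one request object and branches on the HTTP method by hand, which is roughly the dispatch a servlet container does for you when it calls doGet() or doPost(). MockRequest is a made-up stand-in so the example runs without Apache; it is not the real $r API.

```perl
use strict;
use warnings;

# Stand-in for the $r object; hypothetical, not the Apache API.
package MockRequest;

sub new {
    my ( $class, %args ) = @_;
    return bless { %args, out => '' }, $class;
}
sub method { $_[0]{method} }
sub print  { my $self = shift; $self->{out} .= join '', @_; return 1 }
sub output { $_[0]{out} }

package main;

# One entry point; the branching on the method is done by hand.
sub handler {
    my ($r) = @_;
    my %dispatch = (
        GET  => sub { $_[0]->print('got a GET') },     # like doGet()
        POST => sub { $_[0]->print('got a POST') },    # like doPost()
    );
    my $code = $dispatch{ $r->method }
        or return 405;                                 # Method Not Allowed
    $code->($r);
    return 200;
}

my $r      = MockRequest->new( method => 'GET' );
my $status = handler($r);
print "$status ", $r->output, "\n";
```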