PUTting an URL

Today, somebody at work wanted to PUT some data to an URL. It’s part of an internal web service we use. Should be easy to write. In Perl, it’s pretty trivial.

use LWP::UserAgent;
use HTTP::Request::Common qw(PUT);

my $ua = LWP::UserAgent->new();
my $req = PUT '', Content => $ARGV[0];
my $resp = $ua->request( $req );
die $resp->status_line, "\n" unless $resp->is_success;

That takes the first command line argument and PUTs it to the URL. Pretty simple stuff. But compare this to the Java way, which is what my colleague was attempting.

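import java.io.*;
import java.net.*;
public class HttpPutTest {
    public static void main(String[] args) throws IOException {
        URL dest = new URL("");
        HttpURLConnection conn = (HttpURLConnection) dest.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true); // tell the connection to expect a request body
        try (OutputStream out = conn.getOutputStream()) { out.write(args[0].getBytes("UTF-8")); }
        if (conn.getResponseCode() < 200 || conn.getResponseCode() > 299)
            throw new IOException(conn.getResponseMessage());
    }
}
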
Firstly, as with all Java, the sheer verbosity is staggering. But there are two things of interest to me here.

  1. The request and response are conflated into one object, a URLConnection (or variation thereof). There’s no way to retrieve header values on the request once they’ve been set.
  2. Setting the content is done through the most contorted route. Firstly, we have to tell the connection that it’s to expect output (why? I’m doing output things below, it should work it out). Secondly, we have to fetch an OutputStream, marshal our input into bytes and then write those bytes to the stream.

It’s that OutputStream that gets me. Oh, I can see why you’d need it occasionally. But 90% of the time when you’re doing a PUT or a POST, the amount of data is tiny, and you have it all up front. So you don’t need to stream it to the server. You’re paying the price for it 100% of the time, even though it’s only needed 1% of the time, if that. Really, how hard would a setContent() method be?
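
For illustration, here’s roughly what such a convenience could look like if you wrap it yourself. It’s only a sketch: SimplePut, encode and put are names I’ve made up rather than anything in the JDK, and it assumes the body is text that should go over the wire as UTF-8.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of the JDK): hides the OutputStream
// dance for the common case where the whole body is known up front.
class SimplePut {
    // Marshal the content into bytes; assumes UTF-8 is what the server wants.
    public static byte[] encode(String content) {
        return content.getBytes(StandardCharsets.UTF_8);
    }

    // PUT the content to dest and return the HTTP status code.
    public static int put(URL dest, String content) throws IOException {
        byte[] body = encode(content);
        HttpURLConnection conn = (HttpURLConnection) dest.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // The length is known in advance, so say so instead of buffering.
        conn.setFixedLengthStreamingMode(body.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        return conn.getResponseCode();
    }
}
```

With something like that in place, the caller is back to a single line: pass a URL and a string, check the status code that comes back.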


Java Unicode Characters

Working on jenx, I’ve started looking at Characters. In particular, Astral characters. My first question was “how do I create one in a string literal?” Well I still don’t know. But my research has shown that doing anything outside the Basic Multilingual Plane (BMP) requires JDK 5. Drat. That kind of limits the usefulness of this library. But I really need the stuff in JSR 204.
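
For what it’s worth, one approach that should work is to spell the character out as a surrogate pair of Unicode escapes, or to build the string from the code point with Character.toChars (which is one of the JDK 5 additions). A small sketch follows; the class name is made up and U+1D11E (MUSICAL SYMBOL G CLEF) is just a convenient example of an astral character.

```java
// Hypothetical demo class; U+1D11E (MUSICAL SYMBOL G CLEF) is just an example.
class AstralLiteral {
    // Written as a surrogate pair of Unicode escapes in the source.
    public static final String FROM_ESCAPES = "\uD834\uDD1E";

    // Built from the code point at runtime (needs JDK 5 or later).
    public static final String FROM_CODE_POINT =
            new String(Character.toChars(0x1D11E));

    public static void main(String[] args) {
        System.out.println(FROM_ESCAPES.equals(FROM_CODE_POINT));
        System.out.println(FROM_ESCAPES.length());
        System.out.println(FROM_ESCAPES.codePointCount(0, FROM_ESCAPES.length()));
    }
}
```

Note that length() counts chars rather than characters, so it reports 2 for this one-character string, while codePointCount reports 1.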

Which is a story in itself. It’s a good thing that Java can handle the full Unicode range. But the support is (to be quite frank) a bit crap. Mostly down to the fact that char is a UTF-16 code unit, not a “Unicode Character.” I personally don’t find it helpful that they’ve propagated the C problem of confusing char and int, and generally allowing the two to roam freely amongst each other. Plus, JSR 204 looks like it was extremely careful to avoid breaking backwards compatibility, which is always a noble goal, but in this case makes the end result incredibly difficult to use. I shouldn’t have to test each char to see whether or not it’s a surrogate. Really. This is an OO language, I should be able to get the next Character object from the String. Shocking, I know.

It strikes me that Python is pretty much the only language I know that got Unicode right by making an entirely separate object, the “Unicode String.”

Update: Oh alright. In my whinging, I managed to miss String.codePointAt, which does what I need.
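
A minimal sketch of how that looks in practice, walking a String one code point at a time. CodePoints and of are names I’ve invented for the example; codePointAt and Character.charCount are the JDK 5 methods doing the work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: walk a String one code point at a time.
class CodePoints {
    public static List<Integer> of(String s) {
        List<Integer> result = new ArrayList<Integer>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);      // joins a surrogate pair if there is one
            result.add(cp);
            i += Character.charCount(cp);   // advance by 1 or 2 chars
        }
        return result;
    }

    public static void main(String[] args) {
        // "a", then U+1D11E as a surrogate pair, then "b"
        System.out.println(of("a\uD834\uDD1Eb"));
    }
}
```

No explicit surrogate test is needed in the loop; the only wrinkle is remembering to step by charCount rather than by one.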



Recently, in yet another fit of distraction from my existing projects, I’ve started working on jenx. It’s an XML writer for Java along similar lines to GenX. At the moment, I’m just at the stage of banning invalid characters that go through it. So it’s extremely fortuitous that I’ve just seen a link to HOWTO Avoid Being Called a Bozo When Producing XML.
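
To give an idea of what “banning invalid characters” involves, here’s a rough sketch of the check based on the Char production in the XML 1.0 specification. It isn’t the jenx code; XmlChars and its method names are invented for the example, and it leans on codePointAt, so it assumes JDK 5 or later.

```java
// Hypothetical helper, following the Char production in the XML 1.0 spec:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
class XmlChars {
    public static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the index of the first offending char, or -1 if the text is clean.
    public static int firstInvalid(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) return i;
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

An unpaired surrogate comes back from codePointAt as its own value, which falls in the gap between 0xD7FF and 0xE000 and so gets rejected along with the control characters.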

I’ve mostly been basing it on the GenX source code and it’s certainly made me realise quite how complicated a job it is to produce well-formed XML reliably. In particular, namespaces add a very large amount of complexity.

This just underlines how important it is to have good libraries to produce this sort of thing.

Hmmm, looking through that HOWTO makes me realise that I need to check that XML::Genx correctly supports astral characters. I’m sure it does, but I’d better double check in a test… For that matter, does Perl support them?



The ever interesting Ajaxian Blog highlighted this very interesting example implementing AJAX autocompletion for Wicket. I had a (very) brief look at Wicket a couple of months ago, and it seems very promising. It’s a really nice design that cleanly enforces separation of presentation and content. In particular, I am very enamoured of the fact that it doesn’t prevent you from generating valid XHTML templates, unlike Tapestry.

Anyway, that article really confirms the view that it’s a very nice framework, seeing how easy it is to extend it. Of course, it’s not as simple as Rails, but it’s pretty good for Java. 🙂


Perl CruiseControl Integration

Today and yesterday have been spent fighting with CruiseControl (CC). It’s a nice bit of software for doing continuous builds, like Mozilla’s tinderbox. Anyway, I got it running with a single project, the Money example from Test Driven Development by Example. Things I have learned about CruiseControl:

  • ant and junit suck. Suckage can be avoided by using the ant included with CC.
  • The latest release doesn’t pass its tests. So I used the prebuilt version instead.
  • The latest release also includes a built-in Jetty server, which is neat. However, it’s not perfect yet and you will need to symlink the webapps directory into your build area.
  • CC gives weird ClassCastException errors if you mistype element names in the config.xml because it tries to load classes dynamically based upon the element name.
  • The JUnit Ant task can output XML. This format is totally undocumented, but appears to be simple to understand. I need to double check the source to be certain there aren’t any weird edge cases. And what is it with people who stuff all their data into attributes?

Anyway, the ultimate aim of all this is to get Perl builds integrated into CC. In particular, I want CC to check out my CPAN modules, build them and run the tests. And I want those tests reported in the same manner as the JUnit ones inside CC. That’s not too much to ask, is it?

So. I need to knock up an enhancement to Test::Harness, which has the ability to save its results in XML format in a specified directory. Just like the JUnit formatter. For the classname I shall use the name of the test file that’s being executed and for the name of the test, just use the number (plus description if applicable). I don’t know how to handle SKIP and TODO tests (I don’t think that JUnit has them), but if I just put out the description, that should be enough for now.
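
To make that mapping concrete, here’s a rough sketch (in Java, since that’s what I’m staring at these days) of turning one test result into a JUnit-style testcase element. The element and attribute names (testcase, classname, name, failure, message) are what the Ant formatter appears to emit; given that the format is undocumented, treat them as assumptions. JUnitXml and its methods are made up for the example.

```java
// Hypothetical sketch: render one TAP-style test result as a JUnit-style
// <testcase> element. Element/attribute names are assumed from observed
// Ant JUnit output, not from any documented schema.
class JUnitXml {
    // Escape the characters that are unsafe inside an attribute value.
    public static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // testFile becomes the classname; number plus optional description the name.
    public static String testcase(String testFile, int number,
                                  String description, boolean ok) {
        String name = (description == null || description.isEmpty())
                ? String.valueOf(number)
                : number + " - " + description;
        String open = "<testcase classname=\"" + escape(testFile)
                + "\" name=\"" + escape(name) + "\"";
        return ok ? open + "/>"
                  : open + "><failure message=\"not ok\"/></testcase>";
    }
}
```

SKIP and TODO tests aren’t handled here; as above, they’d simply show up through whatever description gets passed in.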

Once I’ve got that, I need to be able to shoehorn it arbitrarily into the build process. Probably the easiest way to do that is to import it by saying PERL5OPT=-MMy::Test::Harness.

After that, we should be set. My main concern is that the XML log that CruiseControl uses won’t be able to make much sense of the build process that Perl uses, but so long as it can understand stdout / stderr I don’t think it’ll be a problem.


Servlets vs mod_perl

I’ve been looking at the Java servlets stuff for a couple of days now and it’s clear that it bears a remarkable similarity to mod_perl.

  • In mod_perl, you define handler() which gets passed a $r object to deal with the request and send a response. Servlets make things a little more explicit. You make doGet() or doPost() methods, which get passed separate request and response. But they’re basically equivalent to $r.
  • mod_perl and servlets both have filters, but they work in slightly different ways. Servlets are much more explicit; you have to forward the call on to the next filter or servlet in the chain when you’re done. mod_perl handles things nearly implicitly (with help from Apache::Filter).
  • mod_perl exposes the Apache httpd API, so you get access to the different request phases. Servlets run in a completely different environment, so that just isn’t applicable to them.
  • What’s also incredibly similar about both pieces of software is that most people think they’re far too low level and have come up with higher level “frameworks” on top. The standard one for Java appears to be JSP, which on close inspection bears a marked similarity to HTML::Mason. As with mod_perl though, no one framework covers everybody’s needs, so there are lots of similar, competing ones.
  • One thing that’s markedly different between the two is deployment and configuration. Most mod_perl deployments tend towards the messy, unless the application is packaged up like a CPAN module. But configuration is always manually done. Servlets come with a nicely defined “deployment descriptor” (aka “config file”—they love big words in the Java world). In addition, there’s a standardised directory layout and packaging format making it easy to deploy your application. In theory. I haven’t actually got as far as deploying an application in a production environment yet, but it looks reasonable to me.
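
For the curious, a minimal deployment descriptor looks something like the sketch below. It lives at WEB-INF/web.xml inside the application, alongside WEB-INF/classes and WEB-INF/lib, and the whole lot gets zipped up as a .war file. The servlet name, class and URL pattern here are placeholders I’ve made up, and the version shown assumes the Servlet 2.4 flavour of the format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: names, class and URL pattern are placeholders. -->
<web-app xmlns="http://java.sun.com/xml/ns/j2ee" version="2.4">
  <servlet>
    <servlet-name>hello</servlet-name>
    <servlet-class>com.example.HelloServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>hello</servlet-name>
    <url-pattern>/hello</url-pattern>
  </servlet-mapping>
</web-app>
```

The rough mod_perl equivalent would be a Location block with a PerlHandler line in httpd.conf, except that here the mapping travels with the application rather than with the server.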

Learning Java, Again

After a period of nearly 8 years, I’m coming back to Java. I looked at it in 1996 when I was in college. I bought Java 1.0 in a nutshell. Times have definitely changed. I picked up Java 1.4 in a nutshell and, yikes, it’s at least 3 times the size.

But anyway, the employer is directing me at Java, after 4 years of Perl. It’s time to learn about servlets, which thankfully appear to be quite similar to mod_perl. It’s not totally surprising given that they cover much of the same ground. But, as seems to be the case with everything in Java, it’s far more wordy and formal. I fully understand those who call it the COBOL of the future.

Of course, it’s not just Java to learn. It’s also the environment that goes with it. I’ve been using Unix for a long time and know my way around compilers, linkers, makefiles and so on. But now I have to contend with Eclipse, NetBeans, Ant, Tomcat and a host of other things. Lots to learn, but hopefully rewarding.