Go Readability

If you haven’t seen them, do take a look at these slides on Go readability, from a member of Google’s Go readability group.

Readability is an important process at Google. In theory, it’s about ensuring the style guide for a language is applied. In practice, it’s also about ensuring that idiomatic code is produced. This is highly language-specific, and not something that can easily be done with tooling.

In the case of Go readability, it feels like a mentoring process over a series of code reviews (other languages take a more “big-bang” approach). The end result is that I have a better idea of not just how to write Go, but how we like Go to be written at Google. I really appreciate the strong emphasis on simplicity in Go code. Hopefully, that comes through in the slides.


Google Analytics in XHTML

I’ve been attempting to get Google Analytics to work correctly in both Firefox and IE6 for a site at $WORK. This is not normally a problem, apart from the fact that we’re serving up pages to Firefox as application/xhtml+xml in order to get MathML support.

Now, the sample code from Google is pretty gnarly.

var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));

try {
  var pageTracker = _gat._getTracker("UA-xxxxxx-x");
} catch(err) {}

This fails when the page is parsed as XHTML, because document.write() isn’t available there.

I tried a number of ways to get this to work.

  • Replace document.write() with some jQuery code to insert a script tag.
    • This didn’t work in IE6, as the second script block ended up getting called before the newly inserted script tag had loaded.
    • But I did find out that jQuery will replace script tags with Ajax calls for you. That means you don’t end up with a script tag in the DOM tree, which is highly confusing when you’re looking for it in Firebug.
  • Replace document.write() with native DOM calls to insert a script tag.
    • I did find the neat idea of adding an id to the script tag you’re currently in, so you know where to insert new DOM elements.
    • But it still failed, and for the same reason as above.

I was just about to start implementing something evil involving setInterval(), when I realised…

… this site will never use SSL!

So I replaced the code to generate a script tag with the script tag.

<script src="http://www.google-analytics.com/ga.js" type="text/javascript"></script>

try {
  var pageTracker = _gat._getTracker("UA-xxxxxx-x");
} catch(err) {}

Tada! If only I’d thought of this a few hours earlier… The moral is to be more aware of the context in which you’re doing something. Keep an eye on the “big picture”, to use a particularly horrible metaphor.


dependency complexity

I love the google-collections library. It’s got some really nice features. But, it’s not stable yet. They’ve explicitly stated that until they hit 1.0 it’s not going to be a stable API. So there are changes each release. Nothing major, but changes.

As an example, in the jump from 0.9 to 1.0rc1, the static methods on the Join class became the fluent API on the Joiner class.

(as an aside, could we have some tags, please?)

Following this change is simple.

@@ -310,7 +310,7 @@
         } catch (KeyStoreException e) {
             throw new RuntimeException(e);
         }
-        return Join.join(" | ", principals);
+        return Joiner.on(" | ").join(principals);


But the knock-on effect comes when you start getting lots of things which have google-collections dependencies. At $WORK, I’ve got a project whose dependencies look like this.


I wanted to extract a part of DC2 into its own library, commslib. This was pretty easy as the code was self-contained. Naturally, I wanted it to use the latest version of everything, so I upgraded google-collections to 1.0rc1. Again, fairly simple.

This is what I ended up with.


Except that now there’s a problem.

  • commslib uses Joiner, so it’ll blow up unless I upgrade DC2’s google-collections to 1.0rc1.
  • GSK uses Join, so it’ll blow up if I upgrade DC2’s google-collections to 1.0rc1.

And thus have I painted myself into a corner. 🙂

As it happens, DC2 had a dependencyManagement section forcing everything to use google-collections 0.8. → Instant BOOM.

The solution is to upgrade all my dependencies to use google-collections 1.0rc1. But this turns out to be a much larger change than I had originally envisaged, as now I have to create releases for two dependent projects. This isn’t too much of a hassle in this case (yay for the maven-release-plugin), but it could be a large undertaking if either of those projects is not presently in a releasable state.

I’m not trying to pick on google-collections (I still love it). I’m just marvelling at how quickly complexity can blossom from something so simple.


Google Collections to the rescue

A few days ago, I was writing a piece of code that turned a line at a time into an Object. And it was using iterators. I had a RecordStream, which wrapped a LineStream (just a thin veneer over LineNumberReader).

Then I discovered that there was a terminating record at the end of each file. And it was in a completely different format to all the other lines. Bother.

Ok, I know, I’ll insert another iterator in the middle, which specifically ignores that record. Well, easier said than done as it turns out. I spent the best part of a day trying to create an Iterator which reads the next value and pretends that it’s not there. It turns out to have an awful lot of state.

Eventually I managed the task, and it worked. But boy, was it ugly. And it was long—about two pages of code.
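
To give a feel for where all that state comes from, here’s a minimal sketch of a look-ahead filtering iterator in plain Java. It is not the code from this project: the class name SkippingIterator, the dropTerminator() helper, the "END" marker and the use of java.util.function.Predicate are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical helper: wraps an iterator and hides elements that fail the predicate.
class SkippingIterator<T> implements Iterator<T> {
    // Made-up terminator value, only used by the demo below.
    static final String TERMINATOR = "END";

    private final Iterator<T> source;
    private final Predicate<? super T> keep;
    private T buffered;     // the element we have read ahead, if any
    private boolean ready;  // true when 'buffered' holds a value not yet handed out

    SkippingIterator(Iterator<T> source, Predicate<? super T> keep) {
        this.source = source;
        this.keep = keep;
    }

    // Read ahead until an acceptable element is found or the source runs dry.
    private void advance() {
        while (!ready && source.hasNext()) {
            T candidate = source.next();
            if (keep.test(candidate)) {
                buffered = candidate;
                ready = true;
            }
        }
    }

    @Override
    public boolean hasNext() {
        advance();
        return ready;
    }

    @Override
    public T next() {
        advance();
        if (!ready) {
            throw new NoSuchElementException();
        }
        T result = buffered;
        buffered = null;
        ready = false;
        return result;
    }

    // Demo helper: copy the lines, leaving out every terminator line.
    static List<String> dropTerminator(List<String> lines) {
        Iterator<String> it =
                new SkippingIterator<>(lines.iterator(), line -> !TERMINATOR.equals(line));
        List<String> kept = new ArrayList<>();
        while (it.hasNext()) {
            kept.add(it.next());
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(dropTerminator(Arrays.asList("a,b,c", "d,e,f", TERMINATOR)));
    }
}
```

Even in this stripped-down form, hasNext() has to read ahead, and next() has to remember whether that has already happened. That bookkeeping is what balloons once real-world concerns get layered on top.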

Then the light bulb went on. I remembered that Google Collections had some tools for dealing with Iterators. In particular, there’s a function filter(), which takes a Predicate. And look! The Predicates class contains some handy builtins!

After about 5 minutes’ work, my two pages of code boiled down to three lines of code.

    import static com.google.common.base.Predicates.*;

    private static final String END_RECORD = "END RECORD,END RECORD,END RECORD";

    public Iterator<T> iterator() {
        // Produce an iterator that returns one line at a time.
        Iterator<String> lines = new LineStream(reader).iterator();
        // A predicate to return all records which are not the end record.
        Predicate<String> notEndRecord = not(isEqualTo(END_RECORD));
        // Apply the predicate to the iterator.
        final Iterator<String> it = Iterators.filter(lines, notEndRecord);
        return new Iterator<T>() { … };
    }

Marvellous and powerful stuff. It’s seriously worth checking out in case you haven’t played with it before. My favourites are the static factory methods, e.g.

  // Before
  Map<String, String> myMap = new HashMap<String,String>();

  // After
  Map<String, String> myMap = Maps.newHashMap();

Isn’t it lovely how the compiler just figures it all out for you? Anything that can save space like that has to be a Good Thing™.
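
The trick behind this is a generic static method whose type parameters are inferred from whatever you assign the result to. Here’s a minimal sketch of the idea. MyMaps is a made-up stand-in class, not the library’s own source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a static factory class such as Maps.
class MyMaps {
    // K and V are never passed in; the compiler infers them at the call site.
    static <K, V> HashMap<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    public static void main(String[] args) {
        // Inferred as <String, Integer> from the declared type on the left.
        Map<String, Integer> counts = MyMaps.newHashMap();
        counts.put("lines", 3);
        System.out.println(counts);
    }
}
```

On JDK 7 and later, the diamond operator (new HashMap<>()) covers much of the same ground for constructors.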

There are a whole bunch of other useful things in there.

  • Preconditions.checkNotNull() is a compact way of validity checking your arguments.
  • Join.join()—I don’t know how many times I’ve written this by hand (usually badly). Much better to have somebody else do it for me.
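
For comparison, this is roughly what those two helpers look like when written by hand. It’s only a sketch in plain Java: the HandRolled class and its method bodies are my own assumptions, not the library’s implementation.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical helpers, written out by hand to show what the library calls save you.
class HandRolled {
    // Throws NullPointerException on null, otherwise returns the argument unchanged.
    static <T> T checkNotNull(T reference) {
        if (reference == null) {
            throw new NullPointerException();
        }
        return reference;
    }

    // Joins the parts with the separator, taking care not to emit a trailing one.
    static String join(String separator, Iterable<?> parts) {
        checkNotNull(separator);
        checkNotNull(parts);
        StringBuilder sb = new StringBuilder();
        Iterator<?> it = parts.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(separator);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(" | ", Arrays.asList("alice", "bob", "carol")));
    }
}
```

The trailing-separator check is exactly the bit that tends to go wrong when this gets rewritten for the umpteenth time.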

Do yourself a favour and go check them out. You won’t regret it.


Google Code Hosting – svn import

I’ve just started a new project on Google Code Hosting (of which more later). I’ve been developing it in a local svn repository, and I’d like to transfer it up to the Google svn server.

This isn’t easy. Google help has “How do I import an existing Subversion repository?”, but that’s only half the answer. The problem is that I develop many projects in one single repository (I find it easier to manage). So I wanted to export a subset of my repository to Google.

Sadly, svnsync doesn’t support that.

The workaround is simple (yet tiresome). You have to create a mini-repository containing just the subset of the original repository you want, then send that to Google. This is what I came up with to export just /project.

  % svnadmin create /tmp/project-repos
  % svnadmin dump -q /home/svn/public |
  > svndumpfilter include /project --renumber-revs --drop-empty-revs |
  > svnadmin load -q /tmp/project-repos

Of course, now that I have the subset isolated, the path structure is wrong. Everything is living under /project/trunk instead of /trunk. So, we have to fix that.

  % svn mv file:///tmp/project-repos/project/trunk file:///tmp/project-repos/ -m 'Move trunk to the root.'
  % svn rm file:///tmp/project-repos/project -m 'No longer needed.'

Finally, I can use svnsync to send the changes to Google:

  % svnsync init --username USERNAME GOOGLE_SVN_URL file:///tmp/project-repos
  % svnsync sync --username USERNAME GOOGLE_SVN_URL file:///tmp/project-repos

Phew. What a palaver. It would have been nice if they could accept a file containing the output of svnadmin dump instead…


Google Reader

Like many other people in the blogosphere, I’ve been playing with Google Reader, and I’ve found it to be a pretty good aggregator. But it’s lacking one big, important feature: The “catchup” button. I want to be able to mark all my feeds read with a single keystroke.

What would be really nice would be the ability to mark all articles from a selected feed read. But I think that’s difficult in the current interface because it seems to melt all articles into the same pot.

Before I sound too critical, I have to point out that what Google Reader does, it does very well. I found it quick, responsive and easy on the eyes. Good work, chaps!

Oh, and there’s one really, really important innovation: It brings vi-style keystrokes to the people. Yay Google!