subversion diff

Normally, Subversion uses a built-in diff to show you your changes. This works pretty well most of the time, except that you can’t tell it to ignore whitespace-only changes. How to do this isn’t spelled out 100% clearly in the FAQ, so here’s what I found out:

  1. Make a shell script ~/bin/svn_diff. It should look like this:
    #!/bin/sh
    # svn appends its own arguments (labels and the two file names); pass them all through
    exec diff -bu "$@"
  2. chmod +x ~/bin/svn_diff
  3. Put these lines in ~/.subversion/config (diff-cmd belongs in the [helpers] section):
    [helpers]
    diff-cmd = /home/dom/bin/svn_diff

Naturally, you can adjust the diff flags to taste in the svn_diff script.


Java Unicode Characters

Working on jenx, I’ve started looking at characters. In particular, astral characters. My first question was “how do I create one in a string literal?” Well I still don’t know. But my research has shown that doing anything outside the Basic Multilingual Plane (BMP) requires JDK 5. Drat. That kind of limits the usefulness of this library. But I really need the stuff in JSR 204.
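For what it’s worth, here is a minimal sketch of two ways that should produce an astral character on JDK 5 or later: spelling out the UTF-16 surrogate pair as two Unicode escapes, or building the string from the code point with Character.toChars. The class name AstralLiteral and the choice of U+1D11E (the musical G clef) are just mine for illustration.

```java
class AstralLiteral {
    // U+1D11E MUSICAL SYMBOL G CLEF, written as its UTF-16 surrogate pair
    // (high surrogate D834, low surrogate DD1E).
    static final String ESCAPED = "\uD834\uDD1E";

    // The same character built from its code point; Character.toChars
    // returns one char for BMP characters and two for astral ones.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String built = fromCodePoint(0x1D11E);
        // Both routes give the same two-char string holding one code point.
        System.out.println(ESCAPED.equals(built));
        System.out.println(ESCAPED.length());
        System.out.println(ESCAPED.codePointCount(0, ESCAPED.length()));
    }
}
```

Neither is exactly pretty: the first makes you work out the surrogates by hand, and the second isn’t a literal at all.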

Which is a story in itself. It’s a good thing that Java can handle the full Unicode range. But the support is (to be quite frank) a bit crap. Mostly down to the fact that char is a UTF-16 code unit, not a “Unicode Character.” I personally don’t find it helpful that they’ve propagated the C problem of confusing char and int, and generally allowing the two to roam freely amongst each other. Plus, JSR 204 looks like it was extremely careful to avoid breaking backwards compatibility, which is always a noble goal, but in this case makes the end result incredibly difficult to use. I shouldn’t have to test each char to see whether or not it’s a surrogate. Really. This is an OO language, I should be able to get the next Character object from the String. Shocking, I know.

It strikes me that Python is pretty much the only language I know that got Unicode right by making an entirely separate object, the “Unicode String.”

Update: Oh alright. In my whinging, I managed to miss String.codePointAt, which does what I need.
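A rough sketch of how that looks in practice, assuming JDK 5 or later: read the code point at the current index with String.codePointAt, then step forward by Character.charCount so that a surrogate pair is skipped as a unit. The class and method names (CodePointWalk, codePoints) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointWalk {
    // Collects the Unicode code points of s, one entry per character,
    // whether it takes one char or a surrogate pair.
    static List<Integer> codePoints(String s) {
        List<Integer> out = new ArrayList<Integer>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);      // reads a full pair if one starts at i
            out.add(cp);
            i += Character.charCount(cp);   // advance by 1 or 2 chars
        }
        return out;
    }

    public static void main(String[] args) {
        // 'a', then U+1D11E as a surrogate pair, then 'b'
        String s = "a\uD834\uDD1Eb";
        for (int cp : codePoints(s)) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
        }
    }
}
```

No explicit surrogate test needed, although you still get an int back rather than a Character object.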