For a project at $WORK, we want to implement Solr’s spelling suggestions. When you ask Solr to provide suggestions, it comes back with something like this (the original search was spinish englosh):
… 1 19 26 spanish 1 27 34 english 1 60 67 spanish …
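In the XML response, those values sit inside nested lst elements. Roughly like this, assuming the usual layout of Solr’s spellcheck output; the name attributes on the inner lst elements are my reconstruction from the query, so treat them as illustrative:

```xml
<!-- … -->
<lst name="spellcheck">
  <lst name="suggestions">
    <!-- one entry per misspelled token; a word can show up more than once -->
    <lst name="spinish">
      <int name="numFound">1</int>
      <int name="startOffset">19</int>
      <int name="endOffset">26</int>
      <arr name="suggestion">
        <str>spanish</str>
      </arr>
    </lst>
    <lst name="englosh">
      <int name="numFound">1</int>
      <int name="startOffset">27</int>
      <int name="endOffset">34</int>
      <arr name="suggestion">
        <str>english</str>
      </arr>
    </lst>
    <lst name="spinish">
      <int name="numFound">1</int>
      <int name="startOffset">60</int>
      <int name="endOffset">67</int>
      <arr name="suggestion">
        <str>spanish</str>
      </arr>
    </lst>
  </lst>
</lst>
<!-- … -->
```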
What we want to do is transform this into:
Did you mean spanish english?
As it turns out, this is a non-trivial task in XSLT. It’s doable in XSLT 1, but significantly easier in XSLT 2, which does away with the XSLT 1 restrictions on result tree fragments.
The first problem to solve is getting the data into a sensible data structure for further processing. In a real language, I’d want a list of (from, to) pairs. In XSLT, sequences are always flat. The way to simulate this is to construct an element for the pair.
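Something along these lines would do it. This is only a sketch: it assumes the standard Solr XML response layout (a root response element with the spellcheck list inside it), and the suggestion element with its from and to attributes is a name I picked for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- One <suggestion from="…" to="…"/> element per misspelled word.
       Caveat: when a word is listed more than once, only its first
       entry (and that entry's first suggestion) is used. -->
  <xsl:variable name="suggestions" as="element(suggestion)*">
    <xsl:for-each-group
        select="/response/lst[@name = 'spellcheck']/lst[@name = 'suggestions']/lst"
        group-by="@name">
      <!-- Inside the group, the context item is the group's first <lst>. -->
      <suggestion from="{@name}" to="{arr[@name = 'suggestion']/str[1]}"/>
    </xsl:for-each-group>
  </xsl:variable>

  <!-- Debugging aid only: dump the pairs to see what was collected. -->
  <xsl:template match="/">
    <suggestions>
      <xsl:copy-of select="$suggestions"/>
    </suggestions>
  </xsl:template>

</xsl:stylesheet>
```

Grouping by @name is what collapses repeated words into a single pair; selecting only lst children also skips any non-list entries that may sit alongside them.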
Note the commented caveat: we always pick the first suggestion for any given name. In my (limited) experience, this isn’t an issue, as the suggestions for a given word are always identical.
This results in $suggestions containing a sequence of elements looking like this:
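For the example search, and using a hypothetical suggestion element with from and to attributes (names chosen purely for illustration), the sequence would hold one pair per distinct misspelled word:

```xml
<suggestion from="spinish" to="spanish"/>
<suggestion from="englosh" to="english"/>
```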
Now one of the nice things about XSLT 2 is that you can define functions which are visible to XPath. So we can write a fairly simple recursive function to do the search and replace.
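A minimal sketch of such a function follows. The my prefix, its namespace URI, the function name replace-all and the suggestion element with from/to attributes are all placeholders of mine, not anything XSLT or Solr prescribes.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- Apply every (from, to) pair to $text, one pair per recursion step. -->
  <xsl:function name="my:replace-all" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="pairs" as="element(suggestion)*"/>
    <xsl:choose>
      <!-- Base case: no pairs left, so the text is finished. -->
      <xsl:when test="empty($pairs)">
        <xsl:sequence select="$text"/>
      </xsl:when>
      <!-- Recursive case: apply the head, then recurse on the tail. -->
      <xsl:otherwise>
        <xsl:sequence select="my:replace-all(
            replace($text, $pairs[1]/@from, $pairs[1]/@to),
            $pairs[position() gt 1])"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

</xsl:stylesheet>
```

With pairs mapping spinish to spanish and englosh to english, my:replace-all('spinish englosh', $pairs) returns spanish english.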
There are a few things to note:
- You have to give your function a namespace prefix.
- xsl:param’s are matched by position (not by name), and their number determines the arity of the function.
- The as attributes aren’t necessary, but the idea of types in XSLT is growing on me. I’d rather know about type problems as soon as possible.
- The notion of cdr (tail) in XSLT is rather odd: the sequence of all nodes in the sequence whose position is greater than one.
- Even though I’m using replace(), I’m not escaping regex metacharacters in the search or replacement strings. I’m certain that these won’t occur given my data.
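If the data were less well behaved, the escaping could be handled by a pair of small helpers. This is a sketch under my own assumptions: the my prefix, its namespace URI and both function names are made up for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- Backslash-escape regex metacharacters so the text matches literally. -->
  <xsl:function name="my:escape-pattern" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:sequence select="replace($text,
        '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))', '\\$1')"/>
  </xsl:function>

  <!-- In a replacement string, only backslash and dollar are special. -->
  <xsl:function name="my:escape-replacement" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:sequence select="replace($text, '(\\|\$)', '\\$1')"/>
  </xsl:function>

</xsl:stylesheet>
```

For example, my:escape-pattern('c++') returns c\+\+. The search and replace step would then read replace($text, my:escape-pattern($from), my:escape-replacement($to)), where $from and $to stand for the two halves of a pair.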
So finally, we end up with:
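Pulled together into one stylesheet, it might look like the sketch below. The assumptions are mine: the original query is read from the echoed q parameter in the response header (you may need to pass it in some other way), and the my prefix, replace-all and the suggestion element are illustrative names.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Assumed location of the original query in the Solr response. -->
  <xsl:variable name="query" as="xs:string"
      select="string((/response/lst[@name = 'responseHeader']
                      /lst[@name = 'params']/str[@name = 'q'])[1])"/>

  <!-- (from, to) pairs; the first entry wins when a word is repeated. -->
  <xsl:variable name="suggestions" as="element(suggestion)*">
    <xsl:for-each-group
        select="/response/lst[@name = 'spellcheck']/lst[@name = 'suggestions']/lst"
        group-by="@name">
      <suggestion from="{@name}" to="{arr[@name = 'suggestion']/str[1]}"/>
    </xsl:for-each-group>
  </xsl:variable>

  <!-- Recursive search and replace over the pairs. -->
  <xsl:function name="my:replace-all" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="pairs" as="element(suggestion)*"/>
    <xsl:choose>
      <xsl:when test="empty($pairs)">
        <xsl:sequence select="$text"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:sequence select="my:replace-all(
            replace($text, $pairs[1]/@from, $pairs[1]/@to),
            $pairs[position() gt 1])"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Only print the hint when Solr actually suggested something. -->
  <xsl:template match="/">
    <xsl:if test="exists($suggestions)">
      <xsl:text>Did you mean </xsl:text>
      <xsl:value-of select="my:replace-all($query, $suggestions)"/>
      <xsl:text>?</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

This needs an XSLT 2.0 processor such as Saxon. If the echoed q parameter is literally spinish englosh, the output is: Did you mean spanish english?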
I don’t think all this will win any awards for elegance, but it does work. 🙂