Categories
Uncategorized

Solr’s Lucene Source

I’m debugging a plugin for Solr. I’ve just about got the magic voodoo set up so that I can make Eclipse talk to tomcat and stick breakpoints in and so on. But I’ve immediately run into a problem.

Even though Solr itself comes with -sources jars, the bundled copy of lucene that they’ve used doesn’t. Needless to say, this is a bit of a hindrance.

Thankfully, the apache people have set up git.apache.org, which makes this situation a lot less annoying than it could be.

First, I checked out copies of lucene & solr.

$ git clone git://git.apache.org/solr.git
$ git clone git://git.apache.org/lucene.git

Now, I need to go into solr and figure out which version of lucene is in use. Unfortunately, it’s not a released version, it’s a snapshot of the lucene trunk at a point in time.

$ cd …/solr
$ git branch -r
  origin/HEAD -> origin/trunk
  origin/branch-1.1
  origin/branch-1.2
  origin/branch-1.3
  origin/sandbox
  origin/solr-ruby-refactoring
  origin/tags/release-1.1.0
  origin/tags/release-1.2.0
  origin/tags/release-1.3.0
  origin/trunk
$ git whatchanged origin/tags/release-1.3.0 lib
…
commit 904e378b7b4fd18232f657c9daf484a3e63b272c
Author: Yonik Seeley 
Date:   Wed Sep 3 20:31:42 2008 +0000

    lucene update 2.4-dev r691741

    git-svn-id: https://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.3@691758 13f79535-47bb-0310-9956-ffa450edef68

:100644 100644 a297b74... 54442dc... M  lib/lucene-analyzers-2.4-dev.jar
:100644 100644 596625b... 5c6e003... M  lib/lucene-core-2.4-dev.jar
:100644 100644 db13718... f0f93a7... M  lib/lucene-highlighter-2.4-dev.jar
:100644 100644 50c8cb4... a599f43... M  lib/lucene-memory-2.4-dev.jar
:100644 100644 aef3fb8... 79feaef... M  lib/lucene-queries-2.4-dev.jar
:100644 100644 1c733b9... 440fa4e... M  lib/lucene-snowball-2.4-dev.jar
:100644 100644 0195fa2... b5ff08b... M  lib/lucene-spellchecker-2.4-dev.jar
…

So, the last change to lucene was taking a copy of r691741 of lucene’s trunk. So, lets go over there. And see what that looks like.

$ cd …/lucene
$ git log --grep=691741

Except that doesn’t return anything. Because there was no lucene commit at that revision in the original repository (it was something to do with geronimo). So we need to search backwards for the commit nearest to that revision. Thankfully, git svn includes the original subversion revision numbers of each commit.

$ cd …/lucene
$ git log | perl -lne 'if (m/git-svn-id:.*@(d+)/ && $1 <= 691741){print $1; exit}'
691694

So now we can go back and find the git commit id that corresponds.

$ cd …/lucene
$ git log --grep=691694
commit 71afff2cebd022fe63bdf2ec4b87aaa0cee41dc8
Author: Michael McCandless 
Date:   Wed Sep 3 17:34:29 2008 +0000

    LUCENE-1374: fix test case to close reader/writer in try/finally; add assert b!=null in RAMOutputStream.writeBytes (matches FSIndexOutput which hits NPE)

    git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@691694 13f79535-47bb-0310-9956-ffa450edef68

Hurrah! Now I can checkout the same version of Lucene that’s in Solr. But, probably more useful for Eclipse, is just to zip it up somewhere.

$ cd …/lucene
$ git archive --format=zip 71afff2 >/tmp/lucene-2.4-r691741.zip

Excellent. Now I can resume my debugging session. 🙂

NB: I could have just used subversion to check out the correct revision of Lucene. But, I find it quicker to use git to clone the repository, and I get the added benefit that I now have the whole lucene history available. So I can quickly see why something was changed.

4 replies on “Solr’s Lucene Source”

I am banging my head trying to setup eclipse to debug solr.
Could you please share how you got this setup.
Thanks

@Sudarshan Gaikaiwari There are three steps that I used:

First, ensure that tomcat gets started with CATALINA_OPTS=-agentlib:jdwp=transport=dt_socket,address=localhost:8000,server=y,suspend=y. When you restart tomcat it will pause, waiting for you to connect a debugger to it.
Next, you need a project in eclipse, which contains a dependency on solr-core, complete with source attachment. I did this using maven and m2eclipse.
Finally, you need to create a “remote debugger connection”. If you go to Run → Debug Configurations… and make a new “Remote Java Application”. Make it connect to localhost:8000 (as specified in CATALINA_OPTS above) and associate it with the project you made above.

This is probably worth a post in itself. 🙂 This particular approach is more customized towards my situation of debugging a plugin. You may find it easier to import the Solr war file as an eclipse project, then run it in debug mode with WTP.

Annoyingly, I’ve just discovered that this doesn’t include the output of JavaCC. So you don’t get source for things like BooleanQuery and so on. Bah.

Leave a Reply to dom Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s