Solr's Lucene Source

I’m debugging a plugin for Solr. I’ve just about got the magic voodoo set up so that I can make Eclipse talk to tomcat and stick breakpoints in and so on. But I’ve immediately run into a problem.

Even though Solr itself comes with -sources jars, the bundled copy of lucene that they’ve used doesn’t. Needless to say, this is a bit of a hindrance.

Thankfully, the apache people have set up git.apache.org, which makes this situation a lot less annoying than it could be.

First, I checked out copies of lucene & solr.

$ git clone git://git.apache.org/solr.git
$ git clone git://git.apache.org/lucene.git

Now I need to go into solr and figure out which version of lucene is in use. Unfortunately, it’s not a released version; it’s a snapshot of the lucene trunk at a point in time.

$ cd solr
$ git branch -r
  origin/HEAD -> origin/trunk
  origin/branch-1.1
  origin/branch-1.2
  origin/branch-1.3
  origin/sandbox
  origin/solr-ruby-refactoring
  origin/tags/release-1.1.0
  origin/tags/release-1.2.0
  origin/tags/release-1.3.0
  origin/trunk
$ git whatchanged origin/tags/release-1.3.0 lib
…
commit 904e378b7b4fd18232f657c9daf484a3e63b272c
Author: Yonik Seeley <yonik@apache.org>
Date:   Wed Sep 3 20:31:42 2008 +0000
 
    lucene update 2.4-dev r691741
 
    git-svn-id: https://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.3@691758 13f79535-47bb-0310-9956-ffa450edef68
 
:100644 100644 a297b74... 54442dc... M  lib/lucene-analyzers-2.4-dev.jar
:100644 100644 596625b... 5c6e003... M  lib/lucene-core-2.4-dev.jar
:100644 100644 db13718... f0f93a7... M  lib/lucene-highlighter-2.4-dev.jar
:100644 100644 50c8cb4... a599f43... M  lib/lucene-memory-2.4-dev.jar
:100644 100644 aef3fb8... 79feaef... M  lib/lucene-queries-2.4-dev.jar
:100644 100644 1c733b9... 440fa4e... M  lib/lucene-snowball-2.4-dev.jar
:100644 100644 0195fa2... b5ff08b... M  lib/lucene-spellchecker-2.4-dev.jar
…

So the last Lucene update took a copy of r691741 of Lucene’s trunk. Let’s go over there and see what that looks like.
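Pulling the revision number out of that commit subject can be scripted. This is only a minimal sketch: the function name `lucene_rev_from_subjects` is made up for illustration, and it assumes that other Lucene jar updates in Solr’s history use the same `lucene update <version> r<revision>` subject as the commit above, which I haven’t checked.

```shell
# Hypothetical helper: read commit subjects on stdin (newest first) and print
# the Subversion revision from the first "lucene update <version> r<rev>" line.
# Assumption: the subject format matches the commit shown above.
lucene_rev_from_subjects() {
    sed -n 's/^lucene update .* r\([0-9][0-9]*\)$/\1/p' | head -n 1
}
```

From inside the solr clone it would be fed with something like `git log --format=%s origin/tags/release-1.3.0 -- lib | lucene_rev_from_subjects`.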

$ cd ../lucene
$ git log --grep=691741

Except that doesn’t return anything, because there was no Lucene commit at that revision. The Apache Subversion repository is shared between projects, so revision numbers are global, and r691741 was something to do with Geronimo. So we need to search backwards for the nearest Lucene commit at or before that revision. Thankfully, git svn records the original Subversion revision number in each commit message.

$ cd ../lucene
$ git log | perl -lne 'if (m/git-svn-id:.*@(\d+)/ && $1 <= 691741){print $1; exit}'
691694

So now we can go back and find the git commit id that corresponds.

$ cd ../lucene
$ git log --grep=691694
commit 71afff2cebd022fe63bdf2ec4b87aaa0cee41dc8
Author: Michael McCandless <mikemccand@apache.org>
Date:   Wed Sep 3 17:34:29 2008 +0000
 
    LUCENE-1374: fix test case to close reader/writer in try/finally; add assert b!=null in RAMOutputStream.writeBytes (matches FSIndexOutput which hits NPE)
 
    git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@691694 13f79535-47bb-0310-9956-ffa450edef68
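The search for the nearest revision and the lookup of its commit id can probably be folded into a single pass over `git log`. Here is a rough sketch, assuming the default `git log` output format and a git-svn history where every commit carries a `git-svn-id:` line; the function name `nearest_commit_for_rev` is hypothetical.

```shell
# Hypothetical helper: given a target Subversion revision as $1 and plain
# `git log` output on stdin (newest first), print the git commit id and the
# Subversion revision of the first commit at or before the target.
nearest_commit_for_rev() {
    awk -v target="$1" '
        # Header lines are unindented; message lines are indented by git log.
        /^commit [0-9a-f]+/ { hash = $2 }
        /git-svn-id:/ {
            # The revision is the number right after the last "@".
            n = split($0, parts, "@")
            split(parts[n], rest, " ")
            rev = rest[1] + 0
            if (rev <= target + 0) { print hash, rev; exit }
        }
    '
}
```

Run from the lucene clone as `git log | nearest_commit_for_rev 691741`, it should print the commit id followed by the revision it matched.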

Hurrah! Now I can check out the same version of Lucene that’s in Solr. But probably more useful for Eclipse is just to zip it up somewhere.

$ cd ../lucene
$ git archive --format=zip 71afff2 >/tmp/lucene-2.4-r691741.zip
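If Eclipse is happier with a source attachment whose root is the top of the package tree, `git archive` can also zip just one subdirectory of a commit by naming it as `<commit>:<path>`. A small sketch follows; the helper name `archive_subtree` is made up, and the `src/java` path in the usage line is my assumption about where the core sources sit at that revision.

```shell
# Hypothetical helper: zip one subdirectory of a commit so that the archive
# root is that subdirectory rather than the repository root.
#   $1: commit id   $2: directory inside the repository   $3: output zip file
archive_subtree() {
    git archive --format=zip -o "$3" "$1:$2"
}
```

From the lucene clone, that would be something like `archive_subtree 71afff2 src/java /tmp/lucene-core-2.4-r691741-sources.zip`.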

Excellent. Now I can resume my debugging session. 🙂

NB: I could have just used subversion to check out the correct revision of Lucene. But, I find it quicker to use git to clone the repository, and I get the added benefit that I now have the whole lucene history available. So I can quickly see why something was changed.

Comments 4

  1. Sudarshan Gaikaiwari wrote:

    I am banging my head trying to set up eclipse to debug solr.
    Could you please share how you got this set up?
    Thanks

    Posted 18 Jul 2009 at 03:15
  2. dom wrote:

    @Sudarshan Gaikaiwari There are three steps that I used:

    1. First, ensure that tomcat gets started with CATALINA_OPTS=-agentlib:jdwp=transport=dt_socket,address=localhost:8000,server=y,suspend=y. When you restart tomcat it will pause, waiting for you to connect a debugger to it.
    2. Next, you need a project in eclipse, which contains a dependency on solr-core, complete with source attachment. I did this using maven and m2eclipse.
    3. Finally, you need to create a “remote debugger connection”. Go to Run → Debug Configurations… and make a new “Remote Java Application”. Make it connect to localhost:8000 (as specified in CATALINA_OPTS above) and associate it with the project you made above.

    This is probably worth a post in itself. 🙂 This particular approach is more customized towards my situation of debugging a plugin. You may find it easier to import the Solr war file as an eclipse project, then run it in debug mode with WTP.

    Posted 18 Jul 2009 at 15:22
  3. dom wrote:

    Annoyingly, I’ve just discovered that this doesn’t include the output of JavaCC. So you don’t get source for things like BooleanQuery and so on. Bah.

    Posted 17 Aug 2009 at 23:31
  4. dom wrote:

    No, I’m talking crap. The BooleanQuery source is definitely there.

    Posted 17 Aug 2009 at 23:32