I was trying to resolve entities (&weirdChar;) in an XML file. Easy enough, use a validating parser. But here’s the tricky bit: get the entity definitions from the classpath. This should still be easy, as SAX provides an EntityResolver.
Unfortunately, the interactions between JAXP and SAX make life complicated. I found that you have to ignore the SAXParser (from JAXP) and instead focus on the XMLReader interface (part of plain old SAX).
This is what I came up with. First, a small driver.
    public void parseIt() throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setValidating(true);

        // Skip the JAXP SAXParser and work with the underlying SAX XMLReader.
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setEntityResolver(new MyResolver());

        // Look for test.xml on the classpath.
        InputStream testXmlStream = App.class.getClassLoader().getResourceAsStream("test.xml");
        reader.parse(new InputSource(testXmlStream));
    }
That references the EntityResolver implementation I wrote:
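    class MyResolver implements EntityResolver2 {
        public InputSource resolveEntity(String name, String publicId, String baseURI, String systemId)
                throws SAXException, IOException {
            // Treat the systemId as the name of a classpath resource.
            InputStream stream = getClass().getClassLoader().getResourceAsStream(systemId);
            // Returning null lets the parser fall back to its default resolution.
            return stream == null ? null : new InputSource(stream);
        }
        // The other two interface methods, needed for the class to compile.
        public InputSource getExternalSubset(String name, String baseURI) { return null; }
        public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
            return resolveEntity(null, publicId, null, systemId);
        }
    }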
Actually, I had to use EntityResolver2 for reasons I don’t entirely understand.
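To try the approach without setting up a classpath, here is a minimal self-contained sketch. It is not the code from the project: the class name ResolverDemo, the RESOURCES map and the file name chars.dtd are all made up for illustration, and the map merely stands in for resources that would normally be loaded with getResourceAsStream. It extends DefaultHandler2, which already implements EntityResolver2, so only the four-argument resolveEntity needs overriding.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

final class ResolverDemo {
    // Hypothetical stand-in for classpath resources, keyed by systemId.
    static final Map<String, String> RESOURCES = new HashMap<>();
    static {
        RESOURCES.put("chars.dtd",
                "<!ELEMENT doc (#PCDATA)>\n<!ENTITY weirdChar \"&#x2603;\">");
    }

    /** Parses the XML with validation on and returns its character data. */
    static String parse(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public InputSource resolveEntity(String name, String publicId,
                                             String baseURI, String systemId) {
                // Assumes the parser hands over the systemId as written in the document.
                String body = RESOURCES.get(systemId);
                // Returning null falls back to the parser's default resolution.
                return body == null ? null : new InputSource(new StringReader(body));
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // surface validation errors instead of ignoring them
            }
        };

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setValidating(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setEntityResolver(handler);
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }
}
```

Calling ResolverDemo.parse with a document such as `<!DOCTYPE doc SYSTEM "chars.dtd"><doc>snow: &weirdChar;</doc>` should return the text with the entity expanded, provided the parser passes the systemId through unchanged.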
On top of this, I found that I had to include Xerces 2.8 explicitly as a dependency. The version bundled with Java 1.5 is Xerces 2.6.2, which has a bug: it passes the entity resolver an absolutized systemId, which makes it very difficult to resolve further. What a pain in the arse.
But it does now work, and I can successfully resolve entities off the classpath.
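If upgrading Xerces is not an option, one possible workaround for the absolutized systemId is to strip it back to a bare file name before the classpath lookup. This is only a sketch, and not what I ended up doing: the SystemIds class and toResourceName method are hypothetical names, and it assumes the entity files sit at the root of the classpath, since any directory part of an absolute systemId is discarded.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SystemIds {
    /**
     * Returns the last path segment of an absolute systemId,
     * or the systemId unchanged if it is relative or cannot be parsed.
     */
    static String toResourceName(String systemId) {
        try {
            URI uri = new URI(systemId);
            if (!uri.isAbsolute() || uri.getPath() == null) {
                return systemId; // relative or opaque: leave as-is
            }
            String path = uri.getPath();
            return path.substring(path.lastIndexOf('/') + 1);
        } catch (URISyntaxException e) {
            return systemId; // not a valid URI: leave as-is
        }
    }
}
```

The resolver would then call getResourceAsStream(SystemIds.toResourceName(systemId)) instead of using the systemId directly.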