Problems with the DOMParser (s4s-elt-character Error)
I was messing around with the Apache Xerces-based XML DOMParser class (from the com.sun.org.apache.xerces.internal.impl.xs.dom package) for the JTwitt project and I noticed some quirky behavior. I used the following snippet of code:
DOMParser parser = new DOMParser();
parser.parse(new InputSource(xmlStream));
Document d = parser.getDocument();
Pretty straightforward stuff - in fact, you can probably find the same few lines in just about every DOMParser tutorial out there. The xmlStream is an InputStream instance holding the XML data. Where do I get it from? I pull it off Twitter as I described here. I tested this code before and got the XML to print out in the console, so my InputStream is not the issue here. Every time I called the parse method I got a few dozen errors like this:
s4s-elt-character: Non-whitespace characters are not allowed in schema elements other than ‘xs:appinfo’ and ‘xs:documentation’
It was basically one error for each node that had text data as a child. I googled this message for hours and it seems that no one has a clue what causes it. I’m definitely not the first person to get it, but I have yet to see a working solution.
In the end I decided to abandon DOMParser. There are about a bazillion different ways to parse XML files in Java, so I simply switched to the JAXP parser (javax.xml.parsers). Now my code looks like this:
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document d = builder.parse(xmlStream);
Both snippets are essentially equivalent and achieve the same thing, except that the second one actually works. So as far as I’m concerned DocumentBuilder > DOMParser. Still, if anyone has a clue what that s4s-elt-character error is all about, please leave a note in the comments so that future generations do not have to suffer because of it.
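For anyone who wants to try the JAXP route in isolation, here is a minimal self-contained sketch. The class name JaxpParseSketch, the firstText helper and the sample XML string are all made up for illustration (the real stream came from Twitter); only the DocumentBuilderFactory and DocumentBuilder calls match the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class JaxpParseSketch {

    // Parse an XML stream into a DOM Document with the JAXP DocumentBuilder.
    public static Document parse(InputStream xmlStream) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(xmlStream);
    }

    // Hypothetical helper: text of the first element with the given tag name,
    // or null if the document contains no such element.
    public static String firstText(Document d, String tag) {
        NodeList nodes = d.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data standing in for the stream pulled from Twitter.
        String xml = "<statuses><status><text>hello</text></status></statuses>";
        InputStream xmlStream =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));

        Document d = parse(xmlStream);
        System.out.println(firstText(d, "text")); // prints "hello"
    }
}
```

Elements with text children parse without complaint here, which is exactly the case that produced the errors with the other parser.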
June 3rd, 2007 at 5:33 am (4654)
Hey, have you tried NanoXML? It’s small, very small. I’ve used it to parse config files written in XML and it works great.
June 3rd, 2007 at 3:12 pm (4660)
I haven’t used it. Both the Xerces and JAXP parsers ship with the core set of Java libraries - at least in 5.0. This way there is no need to bundle any 3rd party JAR files with my releases. I didn’t really look into stuff that was not built in - but I will check it out.
Thanks,
June 4th, 2007 at 1:49 pm (4670)
For XML manipulation, I always use JDOM. I find that it’s the best all-around XML library.