How to retrieve XML CDATA text contents with org.xml.sax.ext.DefaultHandler2?

For example I have a XML tag

<script>
<![CDATA[
My script is here
]]>
</script>

I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
file. How do I retrieve my script contents?




What shall I do in these two methods?
@Override
public void startElement(String uri, String localName, String qName, 
Attributes attributes)
throws SAXException
{
	if (qName.equals("script"))
	{
		// How to retrieve my script contents?
	}
}
@Override
public void endElement(String uri, String localName, String qName)
throws SAXException
{
	if (qName.equals("script"))
	{
		// How to retrieve my script contents?
	}
}



Below two methods have no print out at all
@Override
public void endCDATA()
{
	System.out.println("End of CDATA");
}
	
@Override
public void startCDATA()
{
	System.out.println("Start of CDATA");
}

Thank you very much in advance!
4/30/2009 4:02:36 PM
comp.lang.java.programmer

RC wrote:
> For example I have a XML tag
>
> <script>
> <![CDATA[
> My script is here
> ]]>
> </script>
>
> I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
> file. How do I retrieve my script contents?

Via the 'characters()' method.

> What shall I do in these two methods?

Mark the beginning and end of each element so that your parser knows
where it is in the parse process.

> @Override
> public void startElement(String uri, String localName, String qName,
> Attributes attributes)
> throws SAXException
> {
>         if (qName.equals("script"))
>         {
>                 // How to retrieve my script contents?

Not here.  What do the Javadocs tell you about the purpose of this
method and the event it handles?

>         }
> }
>
> @Override
> public void endElement(String uri, String localName, String qName)
> throws SAXException
> {
>         if (qName.equals("script"))
>         {
>                 // How to retrieve my script contents?

Not here.  What do the Javadocs tell you about the purpose of this
method and the event it handles?

>         }
>
> }
>
> Below two methods have no print out at all

Did you read the Javadocs?

> @Override
> public void endCDATA()
> {
>         System.out.println("End of CDATA");
> }
>
> @Override
> public void startCDATA()
> {
>         System.out.println("Start of CDATA");
> }

The Javadocs will tell you:
> The contents of the CDATA section will be reported through the regular
> characters event; this event is intended only to report the boundary.

While not always enough, the API Javadocs are always a good place to
start, and often will completely answer your questions.

--
Lew
lew (2468)
4/30/2009 4:56:20 PM
Thu, 30 Apr 2009 12:02:36 -0400, /RC/:

> For example I have a XML tag
> 
> <script>
> <![CDATA[
> My script is here
> ]]>
> </script>
> 
> I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
> file. How do I retrieve my script contents?

You retrieve it as ordinary text content delivered through 
'characters' events to your ContentHandler.  Whether the text is 
written as a CDATA section or not in the source is purely a 
syntactic detail that shouldn't concern you.
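
A minimal sketch of that approach, assuming the JDK's bundled JAXP 
parser.  The class name ScriptTextCollector and the parse() helper are 
made up for illustration.  Text is buffered because a parser may 
deliver one run of text in several 'characters' calls:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: collects the text of the most recent <script> element.
final class ScriptTextCollector extends DefaultHandler2 {
    private final StringBuilder text = new StringBuilder();
    private boolean inScript;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        if (qName.equals("script")) {
            inScript = true;
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // CDATA content and ordinary text both arrive here.
        if (inScript) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("script")) {
            inScript = false; // the buffer now holds the complete contents
        }
    }

    String scriptText() {
        return text.toString();
    }

    // Illustrative helper: parses an XML string and returns the script text.
    static String parse(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        ScriptTextCollector handler = new ScriptTextCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.scriptText();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<script><![CDATA[\nMy script is here\n]]></script>";
        System.out.println(parse(xml).trim());
    }
}
```

The same handler works unchanged if the script is written as escaped 
ordinary text instead of a CDATA section.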

> Below two methods have no print out at all
> @Override
> public void endCDATA()
> {
>     System.out.println("End of CDATA");
> }
>     
> @Override
> public void startCDATA()
> {
>     System.out.println("Start of CDATA");
> }
> 
> Thank you very much in advance!

You need to set the "lexical-handler" [1] property of the parser 
with the reference to your handler in addition to setting it as a 
'contentHandler':

     XMLReader parser;
     DefaultHandler2 myHandler;
     ...
     parser.setContentHandler(myHandler);
     parser.setProperty("http://xml.org/sax/properties/"
                        + "lexical-handler", myHandler);

[1]  SAX2 Standard Handler and Property IDs 
<http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html>
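
A fuller, self-contained sketch of the registration above, assuming 
the JDK's bundled JAXP parser, which is expected to recognise the 
property.  The class name CdataBoundaryDemo and the parse() helper are 
made up for illustration; the handler only records which events arrive:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: records the order of CDATA and text events.
final class CdataBoundaryDemo extends DefaultHandler2 {
    final List<String> events = new ArrayList<>();

    @Override
    public void startCDATA() {
        events.add("startCDATA");
    }

    @Override
    public void endCDATA() {
        events.add("endCDATA");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add("characters");
    }

    // Illustrative helper: parses xml, optionally registering the handler
    // as the lexical handler as well as the content handler.
    static List<String> parse(String xml, boolean registerLexicalHandler)
            throws Exception {
        XMLReader parser =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        CdataBoundaryDemo handler = new CdataBoundaryDemo();
        parser.setContentHandler(handler);
        if (registerLexicalHandler) {
            parser.setProperty(
                "http://xml.org/sax/properties/lexical-handler", handler);
        }
        parser.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<script><![CDATA[My script is here]]></script>";
        System.out.println("without property: " + parse(xml, false));
        System.out.println("with property:    " + parse(xml, true));
    }
}
```

Without the property only 'characters' events are recorded; with it, 
the boundary events should appear around them.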

-- 
Stanimir
s7an10 (279)
5/4/2009 1:01:03 PM
Stanimir Stamenkov wrote:
> You need to set the "lexical-handler" [1] property of the parser with 
> the reference to your handler in addition to setting it as a 
> 'contentHandler':

Are you sure about that?

-- 
Lew
noone7 (4050)
5/4/2009 1:39:44 PM
In article <gtmr70$e6j$1@news.albasani.net>, Lew <noone@lewscanon.com> 
wrote:

> Stanimir Stamenkov wrote:
> > You need to set the "lexical-handler" [1] property of the parser 
> > with the reference to your handler in addition to setting it as a 
> > 'contentHandler':
> 
> Are you sure about that?

I was surprised to see that the default value of lexical-handler is 
unspecified [1]. On closer reading, I see that the LexicalHandler 
interface is optional [2]. The API suggests setting the property and 
handling any SAXNotRecognizedException to determine if the feature is 
implemented.

[1]<http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html>
[2]<http://www.saxproject.org/apidoc/org/xml/sax/ext/LexicalHandler.html>
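
One way that check might look, as a sketch only.  The class and method 
names (LexicalHandlerSupport, tryRegister) are invented for 
illustration; the property URI is the standard one from [1].  Catching 
SAXNotSupportedException as well is my own addition, since setProperty 
declares both:

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;

// Hypothetical utility: probes whether a reader accepts a lexical handler.
final class LexicalHandlerSupport {
    static final String LEXICAL_HANDLER =
        "http://xml.org/sax/properties/lexical-handler";

    private LexicalHandlerSupport() {
    }

    // Returns true if the handler was registered, false if the reader
    // does not recognise or does not support the property.
    static boolean tryRegister(XMLReader reader, LexicalHandler handler) {
        try {
            reader.setProperty(LEXICAL_HANDLER, handler);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }
}
```

A caller that gets false back still receives the CDATA text through 
'characters'; it only loses the boundary events.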

-- 
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>
nospam59 (11089)
5/4/2009 3:04:41 PM
Mon, 04 May 2009 09:39:44 -0400, /Lew/:
> Stanimir Stamenkov wrote:
>
>> You need to set the "lexical-handler" [1] property of the parser with 
>> the reference to your handler in addition to setting it as a 
>> 'contentHandler':
> 
> Are you sure about that?

Yes.  As you've suggested, one may consult the API docs, a 
reference to which I've supplied.  If you perform a simple test 
you'll see for yourself, too.  Note I've meant one needs to set a 
"lexical-handler" only to detect CDATA section boundaries, i.e. to 
receive 'startCDATA' and 'endCDATA' events, not as requirement to 
read the content of CDATA sections (if that wasn't clear).
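
To illustrate the distinction, here is a sketch, assuming the JDK's 
bundled parser, of a handler that reads all text through 'characters' 
and uses the boundary events only to tell which part came from a 
CDATA section.  CdataAwareHandler and its fields are hypothetical 
names:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: separates CDATA text from ordinary text.
final class CdataAwareHandler extends DefaultHandler2 {
    final StringBuilder cdataText = new StringBuilder();
    final StringBuilder plainText = new StringBuilder();
    private boolean inCdata;

    @Override
    public void startCDATA() {
        inCdata = true;
    }

    @Override
    public void endCDATA() {
        inCdata = false;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The content itself always arrives here; the flag only says
        // whether we are currently between the CDATA boundaries.
        (inCdata ? cdataText : plainText).append(ch, start, length);
    }

    // Illustrative helper: registers the handler for both roles and parses.
    static CdataAwareHandler parse(String xml) throws Exception {
        XMLReader parser =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        CdataAwareHandler handler = new CdataAwareHandler();
        parser.setContentHandler(handler);
        parser.setProperty(
            "http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        CdataAwareHandler h =
            parse("<script>before <![CDATA[a < b]]> after</script>");
        System.out.println("CDATA: " + h.cdataText);
        System.out.println("plain: " + h.plainText);
    }
}
```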

-- 
Stanimir
s7an10 (279)
5/4/2009 9:07:15 PM
Stanimir Stamenkov wrote:
> Note I've meant one needs to set a "lexical-handler" 
> only to detect CDATA section boundaries, i.e. to receive 'startCDATA' 
> and 'endCDATA' events, not as requirement to read the content of CDATA 
> sections (if that wasn't clear).

Thanks, that wasn't.

-- 
Lew
noone7 (4050)
5/4/2009 11:33:49 PM