XML problem with special characters like "<" and ">"

Hello!

I prepare my XML document this way:

-------------------------------------------------------
PrintWriter writer;
Document domDocument;
Element domElement;

// Root tag
domElement = domDocument.createElement ("ROOT_TAG");
domDocument.appendChild (domElement);

// XML from an external source as a "String"
Text data = domDocument.createTextNode (externalXML);
domElement.appendChild (data);

writer.println (...);
-------------------------------------------------------

As you can see, I create a normal Root-Node and then I get an XML
stream from an external source. For the external XML I use the
function "createTextNode" because it is a text in some way.

The problem is the output when I write all together to the PrintWriter
object. It looks like this for this example:

--------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>

<ROOT_TAG>

&lt;DATA&gt;
   &lt;AFL&gt;
      &lt;AFLNR&gt;XX&lt;/AFLNR&gt;
      &lt;BENENNUNG&gt;MY TEST&lt;/BENENNUNG&gt;
      &lt;LA_VER&gt;&lt;/LA_VER&gt;
      &lt;FA_KR&gt;&lt;/FA_KR&gt;
      &lt;POL_COD&gt;&lt;/POL_COD&gt;
      &lt;FA_KZ&gt;&lt;/FA_KZ&gt;
      &lt;G_KZ&gt;&lt;/G_KZ&gt;
      &lt;AFL_KZ&gt;1&lt;/AFL_KZ&gt;
   &lt;/AFL&gt;
&lt;/DATA&gt;
</ROOT_TAG>
--------------------------------------------------------------

Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
is being replaced by "&gt;", but only for the XML coming from the
external source.

Does anybody know this problem or can think about a solution? Should I
use another function than "createTextNode" or do I have to change the
special characters manually?

Thank you for every hint!

Best regards,
Christian Schmidbauer
purzel30 (13)
7/28/2004 7:25:01 AM
comp.lang.java.programmer

On 28 Jul 2004 00:25:01 -0700, Christian Schmidbauer wrote:

> The sign "<" is being replaced by "&lt;" 

&lt; is the (proper) way to encode < if
you want it to appear in a web page/HTML.  

That way the UA knows to treat it as a 
presentational character, rather than the
closing char of an HTML tag.

> Strange, isn't it!? 

No.

-- 
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (5478)
7/28/2004 7:35:22 AM
On Wed, 28 Jul 2004 07:35:22 GMT, Andrew Thompson
<SeeMySites@www.invalid> wrote or quoted :

>&lt; is the (proper) way to encode < if
>you want it to appear in a web page/HTML.  
>
>That way the UA knows to treat it as a 
>presentational character, rather than the
>closing char of an HTML tag.
>
>> Strange, isn't it!? 

But perhaps thoughtless or inconsiderate.  Ideally you would arrange
things so that quoting would almost never be needed.

The problem is HTML grew up without ever knowing it would be merged
with Java.  If we had this all to do over, HTML would use some rare
character to mark its tags like ~ or ` or !. Alternatively you would
generate your HTML with methods.  It would not have reserved
characters.

-- 
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming. 
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
look-on (4215)
7/28/2004 7:41:29 AM
On 28-7-2004 9:25, Christian Schmidbauer wrote:

> Hello!
> 
> I prepare my XML document like this way:
> 
> -------------------------------------------------------
> PrintWriter writer;
> Document domDocument;
> Element domElement;
> 
> // Root tag
> domElement = domDocument.createElement ("ROOT_TAG");
> domDocument.appendChild (domElement);
> 
> // XML from an external source as a "String"
> Text data = domDocument.createTextNode (externalXML);
> domElement.appendChild (data);
> 
> writer.println (...);
> -------------------------------------------------------
> 
> As you can see, I create a normal Root-Node and then I get an XML
> stream from an external source. For the external XML I use the
> function "createTextNode" because it is a text in some way.
                                                  ^^^^^^^^^^^

> 
> The problem is the output when I write all together to the PrintWriter
> object. It looks like this for this example:
> 
> --------------------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> 
> <ROOT_TAG>
> 
> &lt;DATA&gt;
>    &lt;AFL&gt;
>       &lt;AFLNR&gt;XX&lt;/AFLNR&gt;
>       &lt;BENENNUNG&gt;MY TEST&lt;/BENENNUNG&gt;
>       &lt;LA_VER&gt;&lt;/LA_VER&gt;
>       &lt;FA_KR&gt;&lt;/FA_KR&gt;
>       &lt;POL_COD&gt;&lt;/POL_COD&gt;
>       &lt;FA_KZ&gt;&lt;/FA_KZ&gt;
>       &lt;G_KZ&gt;&lt;/G_KZ&gt;
>       &lt;AFL_KZ&gt;1&lt;/AFL_KZ&gt;
>    &lt;/AFL&gt;
> &lt;/DATA&gt;
> </ROOT_TAG>
> --------------------------------------------------------------
> 
> Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
> is being replaced by "&gt;", but only for the XML coming from the
> external source.

It isn't strange: you are treating the external XML not as XML but as text (as a string). Upon 
output, characters with a special meaning in XML will be replaced by an entity reference (< becomes 
&lt; etc.)


> 
> Does anybody know this problem or can think about a solution? Should I
> use another function than "createTextNode" or do I have to change the
> special characters manually?
> 

You'll need to parse your external piece as a (partial) XML DOM tree and insert that into your 
domDocument. I don't think the standard API allows you to parse a partial XML document (i.e. without 
an <?xml ...?> declaration and a root element), so probably you'll have to add the declaration and 
a root element to the string representing the external piece.

If you need more info on parsing and manipulating XML/DOM, see 
<http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/dom/index.html>.
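
A minimal sketch of the parse-and-insert approach described above, using only the standard JAXP/DOM API. It assumes the external string is already well-formed with a single root element (as the <DATA> example in the question is); in that case the XML declaration is optional and nothing has to be added to the string. The class name ImportExternalXml and the method embed are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ImportExternalXml {
    // Hypothetical helper: wraps externalXML in ROOT_TAG as real elements, not as text.
    static String embed(String externalXML) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Target document with the root tag, as in the original post.
        Document domDocument = builder.newDocument();
        Element domElement = domDocument.createElement("ROOT_TAG");
        domDocument.appendChild(domElement);

        // Parse the external string into its own DOM tree.
        Document external = builder.parse(new InputSource(new StringReader(externalXML)));

        // Nodes belong to the document that created them, so copy the tree over first.
        Node imported = domDocument.importNode(external.getDocumentElement(), true);
        domElement.appendChild(imported);

        // Serialize with an identity transform.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(domDocument), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(embed("<DATA><AFL><AFLNR>XX</AFLNR></AFL></DATA>"));
    }
}
```

Because the external markup is inserted as element nodes, the serializer writes real tags for it instead of escaped text.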

> Thank you for every hint!
> 
> Best regards,
> Christian Schmidbauer

HTH,
Z.
zoopy (134)
7/28/2004 12:49:53 PM
I don't want to show it within a web page! I definitely want to have
the real characters "<" and ">". How can I avoid the "&gt;"
and "&lt;" signs?

By the way, the XML is given back to the user.

Thank you,
Christian


Andrew Thompson <SeeMySites@www.invalid> wrote in message news:<ldp99srajv24.1b5hd3qo4g11i$.dlg@40tude.net>...
> On 28 Jul 2004 00:25:01 -0700, Christian Schmidbauer wrote:
> 
> > The sign "<" is being replaced by "&lt;" 
> 
> &lt; is the (proper) way to encode < if
> you want it to appear in a web page/HTML.  
> 
> That way the UA knows to treat it as a 
> presentational character, rather than the
> closing char of an HTML tag.
> 
> > Strange, isn't it!? 
> 
> No.
purzel30 (13)
7/28/2004 2:20:26 PM
On 28 Jul 2004 07:20:26 -0700, Christian Schmidbauer wrote:

(Please do not top-post Christian, 
as I find it most confusing..
<http://www.physci.org/codes/javafaq.jsp#netiquette>)

See further replies inline..

> Andrew Thompson <SeeMySites@www.invalid> wrote in message news:<ldp99srajv24.1b5hd3qo4g11i$.dlg@40tude.net>...
>> On 28 Jul 2004 00:25:01 -0700, Christian Schmidbauer wrote:
>> 
>>> The sign "<" is being replaced by "&lt;" 
>> 
>> &lt; is the (proper) way to encode < if
>> you want it to appear in a web page/HTML.  
...
> I don't want to show it within a web page! I definitely want to have
> the real characters "<" and ">". How can I avoid the "&gt;"
> and "&lt;" signs?
> 
> By the way, the XML is given back to the user.

I suspect you will find that the conversion
back to '<' happens on *read*, so your user
will get back exactly what they expect. But
it seems you are still not getting the basic 
concept that (AFAIU) these symbols cannot be 
written as a literal '<' inside XML text:
they would make the XML invalid.
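
A small sketch of that round trip, assuming the standard JAXP parser and an identity Transformer for output: a text node containing markup characters is escaped when written and comes back unchanged when the document is parsed again. The class name EscapeRoundTrip and its two helper methods are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class EscapeRoundTrip {
    // Writes a ROOT_TAG element whose only child is a text node holding 'text'.
    static String write(String text) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("ROOT_TAG");
        doc.appendChild(root);
        root.appendChild(doc.createTextNode(text));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parses the serialized document and returns the text content of its root element.
    static String read(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String original = "<DATA>XX</DATA>";
        String serialized = write(original);
        System.out.println(serialized);                        // the '<' appears as &lt; here
        System.out.println(read(serialized).equals(original)); // the parser undoes the escaping
    }
}
```

This only helps if the receiver wants the external XML as a string inside ROOT_TAG. If the receiver expects real child elements, the string has to be parsed and inserted as nodes instead.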

-- 
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (5478)
7/28/2004 2:48:34 PM
purzel30@web.de (Christian Schmidbauer) wrote in 
news:46ca2aaa.0407272325.32c1ffee@posting.google.com:

> I prepare my XML document like this way:

> // XML from an external source as a "String"
> Text data = domDocument.createTextNode (externalXML);
> domElement.appendChild (data);

> <ROOT_TAG>
> 
> &lt;DATA&gt;
> 
> Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
> is being replaced by "&gt;", but only for the XML coming from the
> external source.

For clarity's sake: suppose that externalXML is the string:

"I like women with weight <55kg and height>170cm"

Now DOM is shielding you from considering the text <55kg and height>
as an XML element, which in this case it definitely isn't.

DOM is right; guess who is wrong! :)

I suspect you might find help in the DocumentFragment class,
which seems to me near to your needs.
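
One way the DocumentFragment idea could look, as a sketch rather than a definitive recipe. It assumes the external string carries no <?xml ...?> declaration or DOCTYPE of its own, so it can be wrapped in a temporary element before parsing; that also lets it contain several top-level elements. The class name FragmentInsert, the method build, and the element name "wrapper" are all invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FragmentInsert {
    // Hypothetical helper: returns ROOT_TAG with the external markup as real child nodes.
    static Element build(String externalXML) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        Document domDocument = builder.newDocument();
        Element root = domDocument.createElement("ROOT_TAG");
        domDocument.appendChild(root);

        // Wrap the external piece so the parser sees exactly one root element.
        String wrapped = "<wrapper>" + externalXML + "</wrapper>";
        Document external = builder.parse(new InputSource(new StringReader(wrapped)));

        // Copy every child of the temporary wrapper into a fragment owned by domDocument.
        DocumentFragment fragment = domDocument.createDocumentFragment();
        NodeList children = external.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            fragment.appendChild(domDocument.importNode(children.item(i), true));
        }

        // Appending a fragment moves its children into the target; the fragment itself vanishes.
        root.appendChild(fragment);
        return root;
    }

    public static void main(String[] args) throws Exception {
        Element root = build("<A>1</A><B>2</B>");
        System.out.println(root.getChildNodes().getLength());
    }
}
```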

-- 
  Andrea Spinelli - IT&T srl aspinelli@no-spam-plase-imteam.it
  Via Sigismondi, 40 - 24018 Villa d'Alme' (BG)
  tel: +39+035636029 - fax: +39+035638129   
  http://www.imteam.it/
aspinelli (8)
7/28/2004 5:17:42 PM
purzel30@web.de (Christian Schmidbauer) wrote in message news:<46ca2aaa.0407272325.32c1ffee@posting.google.com>...
> Hello!
> 
> I prepare my XML document like this way:
[....] 
> Text data = domDocument.createTextNode (externalXML);
> domElement.appendChild (data);
[....]
> 
> Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
> is being replaced by "&gt;", but only for the XML coming from the
> external source.

That is not quite true. The replacement would have occurred even if the
string was a literal. Try domDocument.createTextNode("5<7");

> Does anybody know this problem 

This is the semantics of createTextNode. How would you otherwise
put the string "The symbol for greater-than is >" in an XML
document using this method? The element name ROOT_TAG has been put
inside "<" and ">" for similar reasons.

> or can think about a solution? Should I use another function than 
> "createTextNode" 

You could use importNode, but as others have pointed out, the external
document will have to be parsed first. importNode will not work while
externalXML is still a String (rather than a parsed XML Document).

> or do I have to change the special characters manually?

I can think of a kludge. Instead of inserting the variable
externalXML as a text node, insert some other marker there and
later replace that marker with externalXML in the serialized output
(String.replace?). This assumes that externalXML is not going to
break the final document; otherwise the receiving side might reject
it.
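
A rough sketch of that kludge, with the same caveat: nothing checks that the external string is well-formed, so the result may not be valid XML. The marker text, the class name MarkerKludge, and the method embed are made up for this example, and the marker must not contain characters that the serializer would escape.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class MarkerKludge {
    // Hypothetical placeholder; it must never occur in the real data.
    static final String MARKER = "@@EXTERNAL_XML@@";

    static String embed(String externalXML) throws Exception {
        Document domDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = domDocument.createElement("ROOT_TAG");
        domDocument.appendChild(root);

        // Insert the marker as text instead of the external XML.
        root.appendChild(domDocument.createTextNode(MARKER));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(domDocument), new StreamResult(out));

        // Plain string substitution on the serialized output; no validation happens here.
        return out.toString().replace(MARKER, externalXML);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(embed("<DATA>XX</DATA>"));
    }
}
```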
hemalpandya (141)
7/28/2004 9:48:59 PM