Hello,
What are the Java APIs out there that can simply correct malformed
HTML code, i.e. take an input stream of badly formed HTML and produce
an output stream of clean HTML code (parsable by the Swing HTML
parser)?
sunrise1647 (4)
6/9/2004 1:03:20 PM

MCP wrote:
> What are the Java APIs out there that can simply correct malformed
> HTML code, i.e. take an input stream of badly formed HTML and produce
> an output stream of clean HTML code (parsable by the Swing HTML
> parser)?
Maybe JTidy can help: http://jtidy.sourceforge.net/
I have no idea whether it fulfills all your requirements.
/Thomas
nobody89 (1419)
6/9/2004 1:28:31 PM

On 9 Jun 2004 06:03:20 -0700, sunrise@cliffhanger.com (MCP) wrote or
quoted :
>What are the Java APIs out there that can simply correct malformed
>HTML code, i.e. take an input stream of badly formed HTML and produce
>an output stream of clean HTML code (parsable by the Swing HTML
>parser)?
I have been bugging the HTMLValidator people to write such a beast. I
figured it could save me a ton of work if it did simple, unambiguous
corrections like inserting a missing </li> or converting a stray & to &amp;.
His fear is making a change that the user did not want. He did not
want to be morally liable for messing up the source.
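As a rough illustration of the stray-ampersand correction described above, here is a minimal Java sketch. It is not taken from HTMLValidator or any other tool; the class and method names are made up, and it assumes that an & counts as "stray" whenever it does not begin a named, decimal, or hexadecimal character reference ending in a semicolon. It does not treat script blocks or comments specially, so it is only safe on plain markup.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class StrayAmpersandFixer {

    // Matches an '&' that is NOT followed by a complete character reference
    // such as "amp;", "#38;" or "#x26;".
    private static final Pattern STRAY_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    static String escapeStrayAmpersands(String html) {
        // The replacement text contains no '$' or '\', so it is used literally.
        return STRAY_AMP.matcher(html).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // prints: Fish &amp; chips &amp; peas
        System.out.println(escapeStrayAmpersands("Fish & chips &amp; peas"));
    }
}
```

Existing references such as &amp; or &#38; are left alone, so running the fix twice does not double-escape anything.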
I have done a number of one-shot programs to clean up various problems
in my website. They do it all with indexOf and substring. If you are
just trying to correct a single problem at a time, it can be pretty
simple.
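The one-shot programs themselves are not shown in this thread, so the following is only a guess at the general shape of such a fix: a scan with indexOf that copies the untouched text with substring and swaps in a replacement at each hit. The class name, method name, and the <br> example are all invented for illustration.

```java
// Hypothetical one-shot cleanup, for illustration only.
final class OneShotFix {

    // Replaces every occurrence of a literal string using only indexOf and substring.
    static String replaceLiteral(String text, String target, String replacement) {
        if (target.isEmpty()) {
            return text; // nothing sensible to search for
        }
        StringBuilder out = new StringBuilder(text.length());
        int from = 0;
        int hit;
        while ((hit = text.indexOf(target, from)) >= 0) {
            out.append(text.substring(from, hit)); // text before the match
            out.append(replacement);
            from = hit + target.length();          // continue after the match
        }
        out.append(text.substring(from));          // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: one<br />two<br />three
        System.out.println(replaceLiteral("one<br>two<br>three", "<br>", "<br />"));
    }
}
```

For a plain literal swap, String.replace does the same job; the explicit loop is mainly useful once the fix needs to look at the text around each hit before deciding what to do.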
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
look-on (3298)
6/9/2004 8:54:17 PM

On Wed, 09 Jun 2004 20:54:17 GMT, Roedy Green wrote:
> ..it could save me a ton of work if it did simple unambiguous
> corrections like insert missing </li>
(whispers) W3C definition for the <li>
is that it does not require a closing </li>..
<http://www.w3.org/TR/1999/REC-html401-19991224/struct/lists.html#didx-list>
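One way to see the optional end tag in practice is to feed both spellings to the Swing HTML parser that the original question mentions. The sketch below is an assumption-laden illustration, not part of anyone's post: the class and method names are invented, and it simply counts the <li> start tags that javax.swing.text.html.parser.ParserDelegator reports for a fragment.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class, for illustration only.
final class OptionalEndTagDemo {

    // Counts the <li> start tags the Swing parser reports for the given HTML.
    static int countListItems(String html) {
        final int[] count = {0};
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.LI) {
                    count[0]++;
                }
            }
        };
        try {
            // 'true' tells the parser to ignore any charset declaration in the input.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countListItems("<ul><li>one<li>two</ul>"));
        System.out.println(countListItems("<ul><li>one</li><li>two</li></ul>"));
    }
}
```

Both calls should report the same number of list items, since the parser closes an open <li> implicitly when the next one starts.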
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
6/10/2004 4:03:36 AM

On Thu, 10 Jun 2004 04:03:36 GMT, Andrew Thompson
<SeeMySites@www.invalid> wrote or quoted :
>(whispers) W3C definition for the <li>
>is that it does not require a closing </li>..
what about </td> and </tr>?
Anyway I like to have the HTML consistent.
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
look-on (3298)
6/10/2004 6:14:58 AM

On Thu, 10 Jun 2004 06:14:58 GMT, Roedy Green wrote:
> On Thu, 10 Jun 2004 04:03:36 GMT, Andrew Thompson
> <SeeMySites@www.invalid> wrote or quoted :
>
>>(whispers) W3C definition for the <li>
>>is that it does not require a closing </li>..
>
> what about </td> and </tr>?
I am pretty sure they need to be
explicitly closed. (shrugs) If in doubt,
leave one out and throw it at the validator
(which is usually quicker than finding the
element on W3C's site)
> Anyway I like to have the HTML consistent.
;-) I know what you mean, it has taken
some training to *prevent* myself from
typing </p> and </li>..
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
6/10/2004 7:41:26 AM

On Thu, 10 Jun 2004 18:37:46 GMT, arne thormodsen wrote:
>> ;-) I know what you mean, it has taken
>> some training to *prevent* myself from
>> typing </p> and </li>..
>>
>
> Why bother? All new browsers..
...not all browsers are new, not all users
can update, not all sites can afford to
turn away customers just because their
browser is not flavour of the month.
That's why.
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
6/10/2004 6:18:09 PM

>
> ;-) I know what you mean, it has taken
> some training to *prevent* myself from
> typing </p> and </li>..
>
Why bother? All new browsers interpret XHTML properly, so you might
as well make your HTML well-formed as XML too. Then you can use XML
tools to process it.
--arne
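To make the "use XML tools" point concrete, here is a small sketch using the standard JAXP DOM parser. The class and method names are invented for illustration. It assumes the markup is a well-formed fragment without a DOCTYPE, and returns -1 when the input is not well-formed XML.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, for illustration only.
final class XhtmlAsXml {

    // Returns how many elements with the given tag name the markup contains,
    // or -1 if the markup cannot be parsed as XML.
    static int countElements(String markup, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            return doc.getElementsByTagName(tagName).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return -1; // not well-formed, or the parser could not be set up
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<ul><li>one</li><li>two</li></ul>", "li"));
        System.out.println(countElements("<ul><li>one<li>two</ul>", "li"));
    }
}
```

The second call fails because the unclosed <li> elements make the fragment invalid as XML, which is exactly why closing every tag matters once XML tooling is involved.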
arneDOTthormodsen (21)
6/10/2004 6:37:46 PM

>
> Maybe this can help http://jtidy.sourceforge.net/ No idea if it fulfills
> all your requirements.
>
I've used it extensively in the past. It works pretty well.
--arne
> /Thomas
arneDOTthormodsen (21)
6/10/2004 6:38:34 PM

Andrew Thompson wrote:
> On Thu, 10 Jun 2004 18:37:46 GMT, arne thormodsen wrote:
>
>>> ;-) I know what you mean, it has taken
>>> some training to *prevent* myself from
>>> typing </p> and </li>..
>>>
>>
>> Why bother? All new browsers..
>
> ..not all browsers are new, not all users
> can update, not all sites can afford to
> turn away customers just because their
> browser is not flavour of the month.
>
> That's why.
>
I'm pretty sure even Netscape 4.7 or Lynx interprets </p> and </li>
correctly. Even pure XHTML should pose no problem for those, as long as you
write the empty elements like <br> as <br /> instead of <br/>. Any browser better
than those (that's all of the currently used browsers :) should have no
problems if you close your tags.
As it says in the spec, the closing tags are not *required*; it doesn't say
that they shouldn't be present. And the advantages of writing XML-compatible
HTML are bigger than those of adjusting to the lowest common
denominator IMHO.
Have you got any example of a browser which breaks when you add the optional
closing tags?
--
Kind regards,
Christophe Vanfleteren
c.v4nfl3t3r3n (486)
6/10/2004 8:17:16 PM

Christophe Vanfleteren <c.v4nfl3t3r3n@pandora.be> wrote:
> I'm pretty sure even Netscape 4.7 or Lynx interprets </p> and </li>
> correctly.
I can confirm that both do. I always use <p></p> and <li></li> in my HTML.
--
JustThe.net Internet & New Media Services, http://JustThe.net/
Steven J. Sobol, Geek In Charge / 888.480.4NET (4638) / sjsobol@JustThe.net
PGP Key available from your friendly local key server (0xE3AE35ED)
Apple Valley, California Nothing scares me anymore. I have three kids.
sjsobol (486)
6/10/2004 9:29:41 PM

On Thu, 10 Jun 2004 20:17:16 GMT, Christophe Vanfleteren wrote:
> Andrew Thompson wrote:
>> On Thu, 10 Jun 2004 18:37:46 GMT, arne thormodsen wrote:
>>
>>>> ;-) I know what you mean, it has taken
>>>> some training to *prevent* myself from
>>>> typing </p> and </li>..
...
>>> Why bother? All new browsers..
>>
>> ..not all browsers are new,
....
> I'm pretty sure even Netscape 4.7 or Lynx interprets </p> and </li>
> correctly. Even pure XHTML should pose no problem for those, when you write
> the empty elements like <br> as <br /> instead of <br/>.
Oh, alright,.. I suppose I tuned out at
the 'new browsers' comment.
I had rejected XHTML earlier for some reason
...no 'target' for 'href's.. no applet tags or
something.. I do not quite remember.
Maybe I should take another look..
[ ..but damn-it, if it does not work on
my NN 4.08, it is *out*! ;-) ]
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
6/11/2004 12:43:24 AM