For Pentecost, I've written a Java applet that displays the Jesus Prayer,
Lord Jesus Christ, son of God,
have mercy on me, a sinner.
in various languages:
http://www.albany.net/~hello/jp2.htm
So far I have:
English, French, German, Latin, Italian, Spanish, Cherokee,
Dutch, Finnish, Swedish, Serbian, Vietnamese, Slovak, Old Church
Slavonic, Russian, Hebrew, Hawaiian, Greek, Polish, Czech, Maori,
Swahili, Albanian, Portuguese, Plautdietsch, Romanian, Haitian,
Japanese, Tagalog, Danish, Icelandic, Hungarian, Ukrainian, Norse,
Irish, Hindi, Thai, Malay
I would appreciate being e-mailed versions in languages I don't have, as
well as corrections to any mistakes in the languages I do have. Since
Java has Unicode, I would like the prayer (in two lines, as above) in
native characters as well as in Latin transcriptions.
Some may find these variations of the applet useful:
http://www.albany.net/~hello/jp3.htm
http://www.albany.net/~hello/jp4.htm
as well as a very different version written as a JApplet (so not everyone can see it):
http://www.albany.net/~hello/cp1a.htm
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
5/30/2004 2:37:26 PM
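The request above asks for the prayer in native characters, while the Forth source later in this thread stores the Hebrew text as \uXXXX escapes. A minimal sketch of converting one form to the other follows. The class and method names are hypothetical and are not taken from the applet; characters outside the Basic Multilingual Plane would come out as two escapes, one per UTF-16 code unit.

```java
final class UnicodeEscapes {
    /** Returns s with every non-ASCII UTF-16 code unit written as a six-character escape. */
    static String toEscapes(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.append(c);                                   // plain ASCII passes through
            } else {
                out.append(String.format("\\u%04x", (int) c));   // four lowercase hex digits
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The last word of the Finnish version posted in this thread.
        System.out.println(toEscapes("syntist\u00e4"));
    }
}
```

ASCII punctuation and spaces are left untouched, which matches the mixed form used in the Forth listing.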
Leo Wong <hello@albany.net> says...
>
>For Pentecost, I've written a Java applet that displays the Jesus Prayer,
Go back and ask Jesus to cure you of spamming posts about Java into
newsgroups that are about Forth.
Guy
5/30/2004 3:19:23 PM
In article <nJGdnW_-G_MJbCTdRVn-vg@thebiz.net>,
Leo Wong <hello@albany.net> wrote:
>For Pentecost, I've written a Java applet that displays the Jesus Prayer,
>
>Lord Jesus Christ, son of God,
>have mercy on me, a sinner.
>
>in various languages:
>
>http://www.albany.net/~hello/jp2.htm
>
>So far I have:
>
>English, French, German, Latin, Italian, Spanish, Cherokee,
>Dutch, Finnish, Swedish, Serbian, Vietnamese, Slovak, Old Church
>Slavonic, Russian, Hebrew, Hawaiian, Greek, Polish, Czech, Maori,
>Swahili, Albanian, Portuguese, Plautdietsch, Romanian, Haitian,
>Japanese, Tagalog, Danish, Icelandic, Hungarian, Ukrainian, Norse,
>Irish, Hindi, Thai, Malay
So you are studying for pope then?
You have to go a long way before you beat the current pope.
>Leo Wong
>--
>http://www.albany.net/~hello/
--
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
One man-hour to invent,
One man-week to implement,
One lawyer-year to patent.
albert37 (2989)
5/30/2004 5:24:34 PM
Leo Wong wrote:
> For Pentecost, I've written a Java applet that displays the Jesus Prayer,
>
> Lord Jesus Christ, son of God,
> have mercy on me, a sinner.
>
> in various languages:
>
> http://www.albany.net/~hello/jp2.htm
>
> So far I have:
>
> English, French, German, Latin, Italian, Spanish, Cherokee,
> Dutch, Finnish, Swedish, Serbian, Vietnamese, Slovak, Old Church
> Slavonic, Russian, Hebrew, Hawaiian, Greek, Polish, Czech, Maori,
> Swahili, Albanian, Portuguese, Plautdietsch, Romanian, Haitian,
> Japanese, Tagalog, Danish, Icelandic, Hungarian, Ukrainian, Norse,
> Irish, Hindi, Thai, Malay
>
Added Yoruba.
Leo
--
http://www.albany.net/~hello/
hello988 (120)
5/30/2004 5:24:50 PM
Leo Wong <hello@albany.net> says...
>Added Yoruba.
Still spamming posts about Java into a Forth newsgroup, I see...
Repent, you sinner! God hates spammers!
Guy
5/30/2004 6:10:35 PM
Leo Wong <hello@albany.net> scribbled the following
on comp.lang.java.programmer:
> For Pentecost, I've written a Java applet that displays the Jesus Prayer,
> Lord Jesus Christ, son of God,
> have mercy on me, a sinner.
> in various languages:
> http://www.albany.net/~hello/jp2.htm
> So far I have:
> English, French, German, Latin, Italian, Spanish, Cherokee,
> Dutch, Finnish, Swedish, Serbian, Vietnamese, Slovak, Old Church
> Slavonic, Russian, Hebrew, Hawaiian, Greek, Polish, Czech, Maori,
> Swahili, Albanian, Portuguese, Plautdietsch, Romanian, Haitian,
> Japanese, Tagalog, Danish, Icelandic, Hungarian, Ukrainian, Norse,
> Irish, Hindi, Thai, Malay
I would like to proofread the German, Finnish and Swedish versions.
Also, can I submit a version in Klingon?
--
/-- Joona Palaste (palaste@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"We sorcerers don't like to eat our words, so to say."
- Sparrowhawk
palaste (2323)
5/30/2004 6:35:32 PM
\ JP2PJ.f Leo Wong, May 24, 2004 +
\ The Hebrew comes out backwards in some browsers.
\ I can't prevent that but was curious how to reverse
\ the Unicode escapes, given that spaces and punctuation
\ are not written as escapes.

\ Store ca u as a counted string at s
: place ( ca u s -- )
   2DUP 2>R  CHAR+ SWAP CHARS MOVE  2R> C! ;

\ Reserve space for and store a string
: string, ( ca u -- )
   HERE  OVER 1+ CHARS ALLOT  place ;

\ Look for first c in ca1 u1
: scan ( ca1 u1 c -- ca2 u2 )
   >R
   BEGIN  DUP WHILE  OVER C@ R@ <> WHILE  1 /STRING  REPEAT THEN
   R> DROP ;

\ Return the address of char in ca1 u, or 0 if it is not there
: cin ( ca1 u char -- ca2|0 )
   scan 0<> AND ;

\ Add char to contents of ca
: c+! ( char ca -- )
   DUP >R  C@ +  R> C! ;

\ Add ca u to the counted string at s
: append ( ca u s -- )
   2DUP 2>R  COUNT CHARS + SWAP CMOVE  2R> c+! ;

CREATE delims  S" \,. " string,   \ etc.

\ Look for last instance of delims in string ca1 u1
\ Return ca1 unchanged and u2, one past the offset of that
\ delimiter, or 0 if none is found
: <delim-scan ( ca1 u1 -- ca1 u2 )
   BEGIN  DUP
   WHILE  1-  2DUP CHARS + C@  delims COUNT ROT cin
   UNTIL  1+  THEN ;

\ From ca u return code or delim ca2 u2 and remaining ca u1
\ (u1 still counts the delimiter that starts ca2 u2)
: <code ( ca u -- ca u1 ca2 u2 )
   2DUP 2>R  <delim-scan DUP  2R> ROT 1- /STRING ;

\ The string is kept on one line because S" parses only to the end of the line.
CREATE JesusPrayer
S" \u05d0\u05d3\u05d5\u05df \u05d9\u05e9\u05d5\u05e2 \u05d4\u05de\u05e9\u05d9\u05d7, \u05d1\u05df \u05d0\u05dc\u05d5\u05d4\u05d9\u05dd \u05ea\u05e8\u05d7\u05dd \u05e2\u05dc\u05d9, \u05d0\u05e0\u05d9 \u05d7\u05d5\u05d8\u05d0" string,

\ Buffer for the reversed prayer, the same size as JesusPrayer
CREATE reyarPsuseJ  JesusPrayer C@ 1+ CHARS ALLOT

: jp2pj ( -- )   \ should be generalized
   JesusPrayer COUNT
   0 reyarPsuseJ C!
   BEGIN  DUP WHILE  <code reyarPsuseJ append  1-  REPEAT  2DROP ;

: test
   jp2pj
   reyarPsuseJ COUNT CR TYPE ;
Leo Wong wrote:
>
> http://www.albany.net/~hello/jp2.htm
>
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
7/30/2004 7:15:12 PM
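Since the applet itself is written in Java, the same token-wise reversal as the Forth words above can be sketched in Java. This is a rough equivalent under stated assumptions, not the author's code: the class and method names are hypothetical, and a token is taken to be either a six-character \uXXXX escape or any other single character.

```java
final class EscapeReverser {
    /** Reverses the order of tokens: six-character escapes stay whole, other characters count singly. */
    static String reverseTokens(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int end = s.length();                 // exclusive end of the part not yet consumed
        while (end > 0) {
            int start = end - 1;              // default: a one-character token
            if (end >= 6
                    && s.charAt(end - 6) == '\\'
                    && s.charAt(end - 5) == 'u'
                    && isHex(s, end - 4, end)) {
                start = end - 6;              // the tail is a full escape, take all six characters
            }
            out.append(s, start, end);
            end = start;
        }
        return out.toString();
    }

    /** True if every character in s[from, to) is a hexadecimal digit. */
    private static boolean isHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Literal backslashes: the input is escape text, not Hebrew characters.
        System.out.println(reverseTokens("\\u05d0\\u05d3, \\u05d5"));
    }
}
```

Scanning from the end mirrors the Forth version, which repeatedly peels off the text that follows the last delimiter.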
On 30 May 2004 18:35:32 GMT, Joona I Palaste <palaste@cc.helsinki.fi>
wrote or quoted :
>> For Pentecost, I've written a Java applet that displays the Jesus Prayer,
I have a similar secular version.
It counts in words in a variety of languages.
see http://mindprod.com/inwords.html
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
look-on (3298)
5/30/2004 8:51:45 PM
Joona I Palaste wrote:
> Leo Wong <hello@albany.net> scribbled the following
> on comp.lang.java.programmer:
>
>>For Pentecost, I've written a Java applet that displays the Jesus Prayer,
>
>
>>Lord Jesus Christ, son of God,
>>have mercy on me, a sinner.
>
>
>>in various languages:
>
>
>>http://www.albany.net/~hello/jp2.htm
>
>
>>So far I have:
>
>
>>English, French, German, Latin, Italian, Spanish, Cherokee,
>>Dutch, Finnish, Swedish, Serbian, Vietnamese, Slovak, Old Church
>>Slavonic, Russian, Hebrew, Hawaiian, Greek, Polish, Czech, Maori,
>>Swahili, Albanian, Portuguese, Plautdietsch, Romanian, Haitian,
>>Japanese, Tagalog, Danish, Icelandic, Hungarian, Ukrainian, Norse,
>>Irish, Hindi, Thai, Malay
>
>
> I would like to proofread the German, Finnish and Swedish versions.
> Also, can I submit a version in Klingon?
>
If you can run:
http://www.albany.net/~hello/jp4.htm
you will be able to choose the language you want to proofread.
You had better post your Klingon version here.
All best,
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
5/30/2004 11:41:36 PM
Roedy Green wrote:
>
> see http://mindprod.com/inwords.html
Nice! How would you handle the problem of browsers handling Hebrew (and
Arabic?) differently -- some displaying Unicode left to right and
others right to left?
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
5/31/2004 1:50:33 AM
On Sun, 30 May 2004 21:50:33 -0400, Leo Wong <hello@albany.net> wrote
or quoted :
>Nice! How would you handle the problem of browsers handling Hebrew (and
>Arabic?) differently -- some displaying Unicode left to right and
>others right to left?
I don't know. I don't know either language. The only one I have
tackled is getting Esperanto accents.
I don't know if you can make some fields Arabic, some English and
others Hebrew.
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
look-on (3298)
5/31/2004 2:37:25 AM
Roedy Green wrote:
> On Sun, 30 May 2004 21:50:33 -0400, Leo Wong <hello@albany.net> wrote
> or quoted :
>
>
>>Nice! How would you handle the problem of browsers handling Hebrew (and
>>Arabic?) differently -- some displaying Unicode left to right and
>>others right to left?
>
>
> I don't know. I don't know either language. The only one I have
> tackled is getting Esperanto accents.
>
> I don't know if you can make some fields Arabic, some English and
> others Hebrew.
>
I think it is a browser question. It would be nice if there were a
standard (maybe there is one?). I suppose there could be a "reverse"
button that would only apply to certain languages, but that seems ugly.
Anyway, thanks, and good-bye. It's ordinary time.
Leo (and so Forth)
--
http://www.albany.net/~hello/
hello988 (120)
5/31/2004 2:49:01 AM
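On the question of a standard: the Unicode Bidirectional Algorithm (Unicode Standard Annex #9) defines how mixed left-to-right and right-to-left text should be ordered for display, and Java 1.4 and later expose it through java.text.Bidi. A minimal sketch, assuming a 1.4 or later runtime, of asking whether a string needs reordering at all; the class and method names are hypothetical. Whether a particular browser and VM combination then renders the text correctly is a separate matter that this check cannot answer.

```java
import java.text.Bidi;

final class BidiProbe {
    /** True if the text contains right-to-left characters and so needs bidi processing. */
    static boolean needsReordering(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    /** True if the base direction, taken from the first strong character, is right to left. */
    static boolean isRightToLeftParagraph(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }

    public static void main(String[] args) {
        String hebrew = "\u05d0\u05d3\u05d5\u05df";   // first word of the Hebrew text in this thread
        System.out.println(needsReordering(hebrew));
        System.out.println(isRightToLeftParagraph(hebrew));
        System.out.println(needsReordering("Lord Jesus Christ"));
    }
}
```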
Leo Wong <hello@albany.net> scribbled the following:
> Joona I Palaste wrote:
>> Leo Wong <hello@albany.net> scribbled the following
>> on comp.lang.java.programmer:
>>>For Pentecost, I've written a Java applet that displays the Jesus Prayer,
>>
>>>Lord Jesus Christ, son of God,
>>>have mercy on me, a sinner.
>>
>>>in various languages:
>>
>>>http://www.albany.net/~hello/jp2.htm
>>
>>>So far I have:
>>
>>>English, French, German, Latin, Italian, Spanish, Cherokee,
>>>Dutch, Finnish, Swedish, Serbian, Vietnamese, Slovak, Old Church
>>>Slavonic, Russian, Hebrew, Hawaiian, Greek, Polish, Czech, Maori,
>>>Swahili, Albanian, Portuguese, Plautdietsch, Romanian, Haitian,
>>>Japanese, Tagalog, Danish, Icelandic, Hungarian, Ukrainian, Norse,
>>>Irish, Hindi, Thai, Malay
>>
>>
>> I would like to proofread the German, Finnish and Swedish versions.
>> Also, can I submit a version in Klingon?
> If you can run:
> http://www.albany.net/~hello/jp4.htm
> you will be able to choose the language you want to proofread.
> You had better post your Klingon version here.
Currently I can't run *any* applet because my browser refuses to have
the Java plugin installed. Can you just post the appropriate texts here,
or e-mail them to me?
--
/-- Joona Palaste (palaste@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"To doo bee doo bee doo."
- Frank Sinatra
palaste (2323)
5/31/2004 5:05:52 AM
On 31 May 2004 05:05:52 GMT, Joona I Palaste <palaste@cc.helsinki.fi>
wrote or quoted :
>Currently I can't run *any* applet because my browser refuses to have
>the Java plugin installed. Can you just post the appropriate texts here,
>or e-mail them to me?
Try Opera. It is very good about accepting whatever Java you have
installed without any fuss.
See http://mindprod.com/jgloss/opera.html
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
look-on (3298)
5/31/2004 5:30:59 AM
On Sun, 30 May 2004 21:50:33 -0400, Leo Wong wrote:
> Roedy Green wrote:
...
>> see http://mindprod.com/inwords.html
>
> Nice! How would you handle the problem of browsers handling Hebrew (and
> Arabic?)
Since you mention it, an example..
<http://www.physci.org/launcher.jsp?class=/codes/eg/JArabicInUnicode>
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
5/31/2004 7:50:34 AM
Op Sun, 30 May 2004 17:24:34 GMT schreef albert@spenarnc.xs4all.nl
(Albert van der Horst):
>So you are studying for pope then?
>You have to go a long way before you beat the current pope.
>
Ha, I just saw (Ned 1, RKK) Michael I, self-acclaimed Pope since 1990.
He and his 30 followers say that since Pius XII there have been only
heretic popes in Rome. Up to Belvue, America!
Coos
j.j.haak (137)
5/31/2004 11:20:58 AM
Joona I Palaste wrote:
> Currently I can't run *any* applet because my browser refuses to have
> the Java plugin installed. Can you just post the appropriate texts here,
> or e-mail them to me?
Jesus Christus, Sohn Gottes, erbarme Dich meiner, des Sünders.
Herra Jeesus Kristus, Jumalan Poika, armahda minua syntistä.
Herre Jesus Kristus, Guds Son, benåda mig, en syndare.
Hope the characters get through.
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
5/31/2004 2:24:26 PM
Leo Wong <hello@albany.net> scribbled the following:
> Joona I Palaste wrote:
>> Currently I can't run *any* applet because my browser refuses to have
>> the Java plugin installed. Can you just post the appropriate texts here,
>> or e-mail them to me?
> Jesus Christus, Sohn Gottes, erbarme Dich meiner, des Sünders.
> Herra Jeesus Kristus, Jumalan Poika, armahda minua syntistä.
> Herre Jesus Kristus, Guds Son, benåda mig, en syndare.
> Hope the characters get through.
The characters are in UTF-8 while I'm using ISO-8859-1, but I can guess
their intended meaning well enough. I suppose "Son of God" is supposed
to be capitalised regardless of language. In this case I can vouch for
the correctness of the Finnish version, and can not find any fault in
the German and Swedish versions either. Any native speakers here?
I have not yet come up with a Klingon version but will do so when I
have more free time.
--
/-- Joona Palaste (palaste@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"Roses are red, violets are blue, I'm a schitzophrenic and so am I."
- Bob Wiley
palaste (2323)
5/31/2004 2:28:23 PM
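The UTF-8 versus ISO-8859-1 mismatch mentioned above can be reproduced in a few lines. This is a minimal sketch with hypothetical class and method names; it uses java.nio.charset.StandardCharsets, which needs Java 7 or later and so postdates this thread. Each accented letter sent as two UTF-8 bytes shows up as two Latin-1 characters on the receiving side.

```java
import java.nio.charset.StandardCharsets;

final class CharsetMismatch {
    /** Encodes text as UTF-8, then decodes the same bytes as ISO-8859-1. */
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Undoes the mistake: re-encodes as ISO-8859-1 and decodes as UTF-8. */
    static String repair(String garbled) {
        byte[] latin1 = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String german = "S\u00fcnders";            // last word of the German version above
        String garbled = utf8ReadAsLatin1(german);
        System.out.println(garbled);               // u-umlaut becomes A-tilde plus a fraction sign
        System.out.println(repair(garbled).equals(german));
    }
}
```

The round trip is lossless here because ISO-8859-1 maps every byte value to a character, which is why the reader could still guess the intended letters.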
Andrew Thompson wrote:
> Since you mention it, an example..
> <http://www.physci.org/launcher.jsp?class=/codes/eg/JArabicInUnicode>
Hi Andrew,
I don't read Arabic, but isn't your example reading left to right?
-Paul
goodhill.REMOVE (78)
5/31/2004 2:31:45 PM
Joona I Palaste <palaste@cc.helsinki.fi> scribbled the following:
> Leo Wong <hello@albany.net> scribbled the following:
>> Joona I Palaste wrote:
>>> Currently I can't run *any* applet because my browser refuses to have
>>> the Java plugin installed. Can you just post the appropriate texts here,
>>> or e-mail them to me?
>> Jesus Christus, Sohn Gottes, erbarme Dich meiner, des Sünders.
>> Herra Jeesus Kristus, Jumalan Poika, armahda minua syntistä.
>> Herre Jesus Kristus, Guds Son, benåda mig, en syndare.
>> Hope the characters get through.
> The characters are in UTF-8 while I'm using ISO-8859-1, but I can guess
> their intended meaning well enough. I suppose "Son of God" is supposed
> to be capitalised regardless of language. In this case I can vouch for
> the correctness of the Finnish version, and can not find any fault in
> the German and Swedish versions either. Any native speakers here?
> I have not yet come up with a Klingon version but will do so when I
> have more free time.
Here we go... This is a very rough draft, because I have only learned
Klingon from one book. It might have some mistakes.
yeSuS QIStuS pIn'a' puqloD jIHDaq yempu'bogh pung yInobneS
Translation: Jesus Christ, Son of God, do me the honour of giving
mercy to me, who have sinned.
--
/-- Joona Palaste (palaste@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"A friend of mine is into Voodoo Acupuncture. You don't have to go into her
office. You'll just be walking down the street and... ohh, that's much better!"
- Stephen Wright
palaste (2323)
5/31/2004 2:51:40 PM
On Mon, 31 May 2004 08:31:45 -0600, P.Hill wrote:
> Andrew Thompson wrote:
>> <http://www.physci.org/launcher.jsp?class=/codes/eg/JArabicInUnicode>
...
> I don't read Arabic, but isn't your example reading left to right?
Not that I am aware of.
...Did you scroll down? The launchable
example shows it being constructed
part by part (from right to left).
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
5/31/2004 2:53:28 PM
On 31 May 2004 14:28:23 GMT, Joona I Palaste wrote:
> I have not yet come up with a Klingon version but will do so when I
> have more free time.
...what about Vulcan, ..Borg?
Or for that matter Auk, ..Elven? ;-)
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
5/31/2004 2:55:21 PM
>>isn't your example reading left to right?
Andrew Thompson wrote:
> ..Did you scroll down? The launchable
> example shows it being constructed
> part by part (from right to left).
Yes, I did. Sorry, my mistake. I see what you are saying: each line
differs from the previous one on the LEFT.
Never mind,
-Paul
goodhill.REMOVE (78)
5/31/2004 3:25:18 PM
Andrew Thompson wrote:
> On Sun, 30 May 2004 21:50:33 -0400, Leo Wong wrote:
> Since you mention it, an example..
> <http://www.physci.org/launcher.jsp?class=/codes/eg/JArabicInUnicode>
>
Thank you.
I am talking about a Java applet in which "Aslam Alykm" displays correctly
in some browsers and as "mkylA malsA" in others.
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
5/31/2004 3:30:52 PM
On Mon, 31 May 2004 11:30:52 -0400, Leo Wong wrote:
> Andrew Thompson wrote:
>> On Sun, 30 May 2004 21:50:33 -0400, Leo Wong wrote:
....
>> <http://www.physci.org/launcher.jsp?class=/codes/eg/JArabicInUnicode>
....
> I am talking about a Java applet in which "Aslam Alykm" displays correctly
> in some browsers and as "mkylA malsA" in others.
I still do not quite understand.
Are you saying that some browser/JRE
combinations will display the phrase
I am trying to write, incorrectly?
[ In the example I point to, you
should be able to see the string
being constructed character by
character - from right to left. ]
Have you got an example that displays
the problem you mention?
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
5/31/2004 3:52:58 PM
Andrew Thompson wrote:
> Have you got an example that displays
> the problem you mention?
>
Here is "Peace Be Upon You" in Arabic (after your program):
http://www.albany.net/~hello/jptest.htm
It seems right on IE6 and Firefox with JVM 1.4, wrong on IE6 with MS
JVM. Others can probably report on how it displays on other browser-JVM
combinations.
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
5/31/2004 4:31:22 PM
On Mon, 31 May 2004 12:31:22 -0400, Leo Wong wrote:
> Andrew Thompson wrote:
>> Have you got an example that displays
>> the problem you mention?
....
> Here is "Peace Be Upon You" in Arabic (after your program):
>
> http://www.albany.net/~hello/jptest.htm
Wow.. Thanks!
> It seems right on IE6 and Firefox with JVM 1.4,
My IE 6 w 1.4 JRE..
<http://www.physci.org/codes/eg/arright.png>
Yep. That is as I expect.
>..wrong on IE6 with MS JVM.
...hmmm. OK, flips to MSVM..
<http://www.physci.org/codes/eg/arwrong.png>
Ooooohhhh, yeah.. Typical ehh? :-(
What's the bet it renders correctly in 1.1.8?
Fortunately, the version at PhySci.codes
is 'safe' from the MSVM.. It uses Swing. ;-)
>..Others can probably report on how it displays on other browser-JVM
> combinations.
Yes, I would be interested to hear from
other people if this is a problem for
any VM other than the MSVM.
We have the actual images of both, throw
your browser at both the link above and
(if you have a 1.2+ VM) my version at..
<http://www.physci.org/launcher.jsp?class=/codes/eg/JArabicInUnicode>
Report back whether either/both are
rendering correctly.
--
Andrew Thompson
http://www.PhySci.org/ Open-source software suite
http://www.PhySci.org/codes/ Web & IT Help
http://www.1point1C.org/ Science & Technology
SeeMySites (3836)
5/31/2004 5:03:34 PM
Joona I Palaste wrote:
> Leo Wong <hello@albany.net> scribbled the following:
>
>>Joona I Palaste wrote:
>
>
>>>Currently I can't run *any* applet because my browser refuses to have
>>>the Java plugin installed. Can you just post the appropriate texts here,
>>>or e-mail them to me?
>
>
>>Jesus Christus, Sohn Gottes, erbarme Dich meiner, des Sünders.
>>Herra Jeesus Kristus, Jumalan Poika, armahda minua syntistä.
>>Herre Jesus Kristus, Guds Son, benåda mig, en syndare.
>
>
>>Hope the characters get through.
>
>
> The characters are in UTF-8 while I'm using ISO-8859-1, but I can guess
> their intended meaning well enough. I suppose "Son of God" is supposed
> to be capitalised regardless of language. In this case I can vouch for
> the correctness of the Finnish version, and can not find any fault in
> the German and Swedish versions either. Any native speakers here?
> I have not yet come up with a Klingon version but will do so when I
> have more free time.
>
Thank you very much. Is there a Usenet standard for character coding?
Should I be using ISO-8859-1?
Leo
--
http://www.albany.net/~hello/
hello988 (120)
5/31/2004 6:03:43 PM
"Guy Macon" <http://www.guymacon.com> wrote in message
news:jeSdndNU-rEWvifdRVn-hA@speakeasy.net...
>
> Leo Wong <hello@albany.net> says...
>
> >Added Yoruba.
>
> Still spamming posts about Java into a Forth newsgroup, I see...
>
> Repent, you sinner! God hates spammers!
>
Although I'm somewhat curious why he chose to post to the Forth newsgroup,
Mr. Wong has the right to do so. Mr. Guy Macon also has the right to express
a bitter attitude towards Christians and God. And we all have the right to
*PLONK* people who inconsiderately mock our faith using abusive speech.
nospam52 (1479)
5/31/2004 10:44:46 PM
DM McGowan II <nospam@nospam.net> wrote:
> "Guy Macon" <http://www.guymacon.com> wrote in message
> news:jeSdndNU-rEWvifdRVn-hA@speakeasy.net...
>>
>> Leo Wong <hello@albany.net> says...
>>
>> >Added Yoruba.
>>
>> Still spamming posts about Java into a Forth newsgroup, I see...
>>
>> Repent, you sinner! God hates spammers!
>>
> Although I'm somewhat curious why he chose to post to the Forth newsgroup,
> Mr. Wong has the right to do so.
Eh? Since when? It's blatantly off topic.
> Mr. Guy Macon also has the right to express a bitter attitude
> towards Christians and God.
Looks to me like he has a bitter attitude towards spammers.
Andrew.
andrew29 (3681)
5/31/2004 11:32:09 PM
DM McGowan II wrote:
>
> Although I'm somewhat curious why he chose to post to the Forth newsgroup,
> Mr. Wong has the right to do so. Mr. Guy Macon also has the right to express
> a bitter attitude towards Christians and God. And we all have the right to
> *PLONK* people who inconsiderately mock our faith using abusive speech.
>
Because I thought I had friends here who had knowledge of languages. I
thought that OT sufficiently announced "off-topic", and in fact I used
"OT" for the posts to Java groups, since I was not asking for Java help.
However, only the Forth group responded abusively. Live and learn.
Leo
--
http://www.albany.net/~hello/
hello988 (120)
6/1/2004 12:42:15 AM
DM McGowan II <nospam@nospam.net> says...
>
>"Guy Macon" <http://www.guymacon.com> wrote...
>>
>> Still spamming posts about Java into a Forth newsgroup, I see...
>>
>> Repent, you sinner! God hates spammers!
>
>Although I'm somewhat curious why he chose to post to the Forth newsgroup,
>Mr. Wong has the right to do so. Mr. Guy Macon also has the right to express
>a bitter attitude towards Christians and God. And we all have the right to
>*PLONK* people who inconsiderately mock our faith using abusive speech.
You are wrong on several levels.
[1] Mr. Wong didn't just post to a Forth newsgroup. He posted JAVA
to a Forth newsgroup. That's rude.
[2] Unless some tenet of your religion calls for posting Java to a
Forth newsgroup, mocking Mr. Wong is not the same as mocking your
faith.
[3] Mr. Wong has no "right" to post. Neither do you. Neither do I.
Usenet is not the USA. He has *permission* to post. So do
you. So do I. My criticizing him for his post is no different
from you criticizing me for my post.
[4] My "attitude" is against off-topic posts, not towards Christians
or God.
Guy
6/1/2004 3:28:21 AM
On Mon, 31 May 2004 20:42:15 -0400, Leo Wong <hello@albany.net> wrote
(more or less):
>DM McGowan II wrote:
>
>>
>> Although I'm somewhat curious why he chose to post to the Forth newsgroup,
>> Mr. Wong has the right to do so. Mr. Guy Macon also has the right to express
>> a bitter attitude towards Christians and God. And we all have the right to
>> *PLONK* people who inconsiderately mock our faith using abusive speech.
>>
>
>Because I thought I had friends here who had knowledge of languages. I
>thought that OT sufficiently announced "off-topic", and in fact I used
>"OT" for the posts to Java groups, since I was not asking for Java help.
> However, only the Forth group responded abusively. Live and learn.
There was a marginal (tho' poor) justification for posting to Java
groups - you were seeking help with a Java applet.
No such tenuous "[OT]" justification helps when posting to a Forth
newsgroup about help with your Java applet.
--
Cheers,
Euan
Gawnsoft: http://www.gawnsoft.co.sr
Symbian/Epoc wiki: http://html.dnsalias.net:1122
Smalltalk links (harvested from comp.lang.smalltalk) http://html.dnsalias.net/gawnsoft/smalltalk
xlucid (70)
6/1/2004 3:30:10 AM
Leo Wong <hello@albany.net> says...
>Because I thought I had friends here
If you want to have friends, don't post about Java in a Forth newsgroup.
>I thought that OT sufficiently announced "off-topic",
So if you put in "PORN" then that would make it O.K. to post
porn in a Forth newsgroup?
The fact of the matter is that you were advertising your religious
Java applet that displays the Jesus Prayer in a Forth newsgroup.
How is that different from someone who likes Porn and wishes to
advertise a pornographic Java applet that displays dirty stories
in a Forth newsgroup?
You may be thinking that your Jesus prayer isn't offensive the way
porn is, but it is offensive to atheists and to the particular branch
of Christianity that I happen to belong to (Quaker), which does not
believe in rote prayers. Not that that matters - comp.lang.forth is
for discussing Forth, not religion.
Guy
6/1/2004 3:44:56 AM
Guy Macon wrote:
> Leo Wong <hello@albany.net> says...
>
>>For Pentecost, I've written a Java applet that displays the Jesus Prayer,
>
>
> Go back and ask Jesus to cure you of spamming posts about Java into
> newsgroups that are about Forth.
Regular and long-time denizens of a newsgroup are usually permitted some
leeway in what they may ask of their old friends. You know not of whom
you write.
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/1/2004 4:22:59 AM
Jerry Avins <jya@ieee.org> says...
>Regular and long-time denizens of a newsgroup are usually
>permitted some leeway in what they may ask of their old friends.
I agree with this practice, and was unaware that Leo Wong was a
regular or long-time denizen of comp.lang.forth. If I had known
such a thing I would have let his post pass without comment.
A Google search says that he has posted five times in 2004 and
twenty-six times in all of 2003, which explains why I did not
realize that he was a regular.
I am sure that you agree that it looked like typical off-topic
crossposted advertising, not a conversation with old friends.
Guy
6/1/2004 6:22:58 AM
"DM McGowan II" <nospam@nospam.net> wrote in message news:<0t-dneSvNs0cKCbdRVn-hA@comcast.com>...
> Although I'm somewhat curious why he chose to post to the Forth newsgroup,
Of course, as the quote on Leo's site says, "But it should not be
overlooked that even in outwardly imperfect works, whether originally
so or having become so through damage, the image may remain intact;
for in the first case the image, which was not of the artist's own
invention but inherited, can still be recognized in its imperfect
embodiment."
When we have sorted out how to do this in Forth, I'm sure that Leo
will do it in Forth as well.
agila61 (3956)
6/1/2004 9:37:05 AM
Guy Macon wrote:
...
> I am sure that you agree that it looked like typical off-topic
> crossposted advertising, not a conversation with old friends.
Sure. Being so easily fooled is why, when I'm sober, I usually hold my
comments for a few rounds.
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/1/2004 2:20:49 PM
On Sun, 30 May 2004 08:19:23 -0700, Guy Macon
<http://www.guymacon.com> wrote:
>Go back and ask Jesus to cure you of spamming posts about Java into
>newsgroups that are about Forth.
I had hoped the Taliban had cured everybody of religious
intolerance. Obviously not.
kochenburger (40)
6/1/2004 3:01:34 PM
Andreas Kochenburger <kochenburger@gmx.de> says...
>>Go back and ask Jesus to cure you of spamming posts about
>>Java into newsgroups that are about Forth.
>
>I had hoped the Taliban had cured everybody of religious
>intolerance. Obviously not.
I had hoped that by the 21st century everyone would have been cured
of the "chip on my shoulder" practice of interpreting every comment
mentioning religion as being religious intolerance. Obviously not.
I am against off-topic advertising in inappropriate newsgroups,
not religion.
Guy
6/1/2004 3:20:59 PM
Leo Wong wrote:
> DM McGowan II wrote:
>> Although I'm somewhat curious why he chose to post to the Forth
>> newsgroup, Mr. Wong has the right to do so. Mr. Guy Macon also has the
>> right to express a bitter attitude towards Christians and God. And we
>> all have the right to *PLONK* people who inconsiderately mock our faith
>> using abusive speech.
> Because I thought I had friends here who had knowledge of languages. I
> thought that OT sufficiently announced "off-topic", and in fact I used
> "OT" for the posts to Java groups, since I was not asking for Java help.
> However, only the Forth group responded abusively. Live and learn.
Was it one poster on the Forth group who responded abusively? Two?
I thought it better not to defend you since the result is to make a
continued little flamewar of no particular value.
I consider your explanation quite clear, and I hope you won't feel the
need to keep explaining when you get flamed further.
Probably if you do something similar again -- say at Christmas -- no one
here then will object. Or maybe they will, clf seems to have gotten a
bit more abrasive in the last few years.
|
|
0
|
|
|
|
Reply
|
j2thomas (670)
|
6/1/2004 3:45:45 PM
|
|
Jerry Avins wrote:
> Guy Macon wrote:
>
>> Leo Wong <hello@albany.net> says...
>>
>>> For Pentecost, I've written a Java applet that displays the Jesus
>>> Prayer,
>>
>>
>>
>> Go back and ask Jesus to cure you of spamming posts about Java into
>> newsgroups that are about Forth.
>
>
> Regular and long-time denizens of a newsgroup are usually permitted some
> leeway in what they may ask of their old friends. You know not of whom
> you write.
>
> Jerry
You're being tooooooo easy on Mr. Macon who obviously never
investigated Leo's site [ http://www.albany.net ] nor considered the
customized Forth routines he has posted to assist newbies, such as
myself. Nor did he investigate that there was once a long running
thread titled "Re: Is LISP dying?(are Christians Good?)" (among others).
Mr. Macon,
posters and lurkers here have a *BROAD* range of interests.
My personal opinion is that those attracted to Forth use similar
methods to attack problems in a wide range of fields.
Don't bother asking me to support that statement. It is just my "gut
feel".
|
|
0
|
|
|
|
Reply
|
rowlett10 (1881)
|
6/1/2004 4:57:31 PM
|
|
Andreas Kochenburger wrote:
> On Sun, 30 May 2004 08:19:23 -0700, Guy Macon
> <http://www.guymacon.com> wrote:
>
>>Go back and ask Jesus to cure you of spamming posts about Java into
>>newsgroups that are about Forth.
>
>
> I had hoped the Taliban had cured everybody from religious
> intolerance. Obviously not.
Decidedly not. A recent news release from planned parenthood:
In an outrageous move to stop women from preventing unintended
pregnancies, some pharmacists in Wisconsin, Texas, and New York have
refused to fill prescriptions for birth control pills and emergency
contraception (a high dosage of birth control pills).
If that isn't imposing one's religious views on others, neither is
pushing girls back into a burning building because they don't have
their head scarves.
Jerry
--
Engineering is the art of making what you want from things you can get.
�����������������������������������������������������������������������
|
|
0
|
|
|
|
Reply
|
jya (12870)
|
6/1/2004 4:58:27 PM
|
|
Guy Macon wrote:
> Andreas Kochenburger <kochenburger@gmx.de> says...
>
>
>>>Go back and ask Jesus to cure you of spamming posts about
>>>Java into newsgroups that are about Forth.
>>
>>I had hoped the Taliban had cured everybody from religious
>>intolerance. Obviously not.
>
>
> I had hoped that by the 21st century everyone would have been cured
> of the "chip on my shoulder" practice of interpreting every comment
> mentioning religion as being religious intolerance. Obviously not.
>
> I am against off-topic advertising in inappropriate newsgroups,
> not religion.
You need a better way to distinguish between advertising and a request
for OT help. And you need to exercise more tolerance toward what you
don't at first like. A more accurate notion of what constitutes spam
would be useful too.
Jerry
--
Engineering is the art of making what you want from things you can get.
�����������������������������������������������������������������������
|
|
0
|
|
|
|
Reply
|
jya (12870)
|
6/1/2004 5:03:39 PM
|
|
Jerry Avins <jya@ieee.org> says...
>You need a better way to distinguish between advertising and a request
>for OT help.
I disagree. The original poster needs to realize that his post was
indistinguishable from the dozens and dozens of spam posts that I see
every week advertising some web page. If you look like a duck, quack
like a duck and fly like a duck you shouldn't be terribly offended if
someone mistakes you for a duck.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/1/2004 7:39:24 PM
|
|
Guy Macon wrote:
> Jerry Avins <jya@ieee.org> says...
>
>
>>You need a better way to distinguish between advertising and a request
>>for OT help.
>
>
> I disagree. The original poster needs to realize that his post was
> indistinguishable from the dozens and dozens of spam posts that I see
> every week advertising some web page. If you look like a duck, quack
> like a duck and fly like a duck you shouldn't be terribly offended if
> someone mistakes you for a duck.
As I wrote, you need a better distinguisher. One way to start would have
been to check the level of cross posting, noting from the form of the
URL that the cited web site is not commercial, and from the site's
contents. If those things are not worth doing -- often they are not --
then it behooves you to withhold critical comment. Far worse than going
off half cocked is firing a full load in the wrong direction.
Jerry (http://users.rcn.com/jyavins/)
--
Engineering is the art of making what you want from things you can get.
�����������������������������������������������������������������������
|
|
0
|
|
|
|
Reply
|
jya (12870)
|
6/1/2004 8:58:40 PM
|
|
Joona I Palaste <palaste@cc.helsinki.fi> wrote:
> Here we go... This is a very rough draft, because I have only learned
> Klingon from one book. It might have some mistakes.
>
> yeSuS QIStuS pIn'a' puqloD jIHDaq yempu'bogh pung yInobneS
>
> Translation: Jesus Christ, Son of God, do me the honour of giving
> mercy to me, who have sinned.
Very close. The word order is a little off, and you're probably not
aware of a bit of vocabulary that would do well here. Oh, and
{QIStoS} is closer to the Greek for "anointed one". Here's what I'd
suggest:
yeSuS QIStoS, Qun'a' puqloD, jIyempu'bogh jIHvaD pung yInobneS.
{Qun} means "supernatural being". {jIH} "me" is the subject of {yem}
"sin", so it has to come after the verb. {-Daq} is the "locative"
suffix, but {-vaD} "beneficiary" is more appropriate for the recipient
of {nob} "give".
Using the honorific {-neS} is perfect, though. Well done.
--
Alan Anderson, professional programmer and amateur Klingonist
proud member of the Klingon Language Institute since 1995
qo'mey poSmoH Hol -- language opens worlds -- http://www.kli.org/
|
|
0
|
|
|
|
Reply
|
aranders (202)
|
6/1/2004 9:27:24 PM
|
|
Jerry Avins <jya@ieee.org> says...
>As I wrote, you need a better distinguisher.
I would love to have one. Let's see how yours works:
>One way to start would have been to check the level of cross posting,
It was typical of spammers. They tend to crosspost to between one
and five newsgroups to escape filters, and to crosspost to groups
unrelated except that they are next to each other in a sorted
alphabetical list. He crossposted to:
comp.lang.forth,
comp.lang.java.help,
comp.lang.java.programmer,
comp.lang.java
Typical spammer crossposting practice.
>noting from the form of the URL that the cited web site is not commercial,
Another poor identifier. Lots of spam is religious instead of commercial.
>and from the site's contents.
Religious and uses ActiveX which I would be an idiot to enable.
Once again, typical of religious spammers.
....and let us not forget the typical religion spam title...
>If those things are not worth doing -- often they are not --
>then it behooves you to withhold critical comment. Far worse than going
>off half cocked is firing a full load in the wrong direction.
I did not go off half cocked. The post was indistinguishable from
the millions of other religion spams that were sent today. And it
was and still is off-topic in comp.lang.forth.
I did right when I criticized him for being off-topic.
Your proposed distinguishers fail to identify religion spammers.
I don't believe that he wanted help. I believe that he wanted
to promote his religion website through off-topic posting.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/2/2004 11:07:37 AM
|
|
Guy Macon <http://www.guymacon.com> wrote in message news:<IqqdnfmQ3ZHFRiHdRVn-hA@speakeasy.net>...
> Jerry Avins <jya@ieee.org> says...
> >You need a better way to distinguish between advertising and a request
> >for OT help.
> I disagree. The original poster needs to realize that his post was
> indistinguishable from the dozens and dozens of spam posts that I see
> every week advertising some web page.
It was easy for me to distinguish. It had the author of the WINK,
LF, "Simple Forth: Rudiments of a Programming Language" forth tutorial,
etc. in the "from" line.
If you had enough time to dash off a response, and not enough time
to google for
"Leo Wong" Forth
you didn't have enough time to dash off a well-considered response.
|
|
0
|
|
|
|
Reply
|
agila61 (3956)
|
6/2/2004 12:39:23 PM
|
|
The intolerance displayed by the christians for Guy's humorously
expressed opinions, and the savageness of their gang-attack on him, are
shocking and sickening, but not really surprising.
DM McGowan II wrote:
>people who inconsiderately mock our faith using abusive speech
Christians feel they have the right to demean non-christians with such
terms as "pagan", "heathen", and "infidel", and to assert that
non-christians are so "wicked" that they deserve to be tortured in hell.
But when a non-christian dares jokingly to display his feelings about
the christian superstition, he is "abusive" and is behaving
"inconsiderately".
"The priesthood have, in all ancient nations, nearly monopolized
learning.... And, even since the Reformation, when or where has existed
a Protestant or dissenting sect who would TOLERATE a free inquiry? The
blackest billingsgate, the most ungentlemanly insolence, the most
yahooish brutality is patiently endured, countenanced, propagated, and
applauded. But touch a solemn truth in collision with a dogma of a sect,
though capable of the clearest proof, and you will soon find you have
disturbed a nest, and the HORNETS WILL SWARM about your legs and hands,
and fly into your face and eyes."
--- John Adams, letter to John Taylor, 1814
"Experience witnesseth that ecclesiastical establishments, instead of
maintaining the purity and efficacy of religion, have had a contrary
operation. During almost fifteen centuries has the legal establishment
of Christianity been on trial. What has been its fruits? More or less,
in all places, pride and indolence in the clergy; ignorance and
servility in the laity; in both, superstition, BIGOTRY, AND PERSECUTION."
--- James Madison
"Whenever we read the obscene stories, the voluptuous debaucheries, the
cruel and tortuous executions, the UNRELENTING VINDICTIVENESS with which
more than half the Bible is filled, it would be more consistent that we
call it the WORD OF A DEMON than the word of god. It is a history of
wickedness that has served to corrupt and BRUTALIZE mankind; and, for my
part, I sincerely detest it, as I detest everything that is cruel."
"As to the book called the bible, it is blasphemy to call it the Word of
God. It is a book of lies and contradictions and a history of bad times
and bad men."
--- Thomas Paine
"Millions of innocent men, women and children, since the introduction of
Christianity, have been BURNT, TORTURED, FINED AND IMPRISONED...."
"I can never join Calvin in addressing his god. He was indeed an
Atheist, which I can never be; or rather his religion was Daemonism. If
ever man worshipped a false god, he did."
"I concur with you strictly in your opinion of the comparative merits of
atheism and demonism, and really see nothing but the latter in the being
worshipped by many who think themselves Christians."
"As you say of yourself, I too am an Epicurian. I consider the genuine
(not the imputed) doctrines of Epicurus as containing everything
rational in moral philosophy which Greece and Rome have left us."
---Thomas Jefferson
Would that there were more true Epicurians in our midst, and fewer
of those intolerant bullies, those christians!
|
|
0
|
|
|
|
Reply
|
w_a_x_man (2782)
|
6/2/2004 5:26:39 PM
|
|
Dr. Bruce R. McFarling <agila61@netscape.net> says...
>
>Guy Macon <http://www.guymacon.com> wrote...
>
>> >You need a better way to distinguish between advertising and a request
>> >for OT help.
>
>> I disagree. The original poster needs to realize that his post was
>> indistinguishable from the dozens and dozens of spam posts that I see
>> every week advertising some web page.
>
>It was easy for me to distinguish. It had the author of the WINK,
>LF, "Simple Forth: Rudiments of a Programming Language" forth tutorial,
>etc. in the "from" line.
>
>If you had enough time to dash off a response, and not enough time
>to google for
>
>"Leo Wong" Forth
>
>you didn't have enough time to dash off a well-considered response.
My response *was* well-considered. I criticized him for advertising
of a religious Java applet that displays the Jesus Prayer in a Forth
newsgroup. You may think that to be acceptable behavior. I don't.
I don't care who he is. I don't have the time or the inclination to
google every person who spams his advertisements into inappropriate
newsgroups on the off chance that this particular spammer is a well
respected author. Being an author doesn't make advertising a
religious Java program in a Forth newsgroup acceptable. Being a
regular participant in said newsgroup is reason to cut someone some
slack in this area, but writing a book is not. Also, spammers often
use other people's names. Should I not respond to the many spams that
are supposedly written by Bill Gates or George W. Bush?
You seem to be very keen on defending the advertising of a religious
Java applet that displays the Jesus Prayer in a Forth newsgroup. Would
you also defend Mr. Wong if he decided to advertise a pornographic Java
applet that displays dirty stories in a Forth newsgroup? Is there some
hidden rule that says that if you write a forth tutorial that you can
now advertise your religious Java applet in a newsgroup that isn't about
religion or Java?
Go ahead and defend him with whatever plea for special privilege you
wish to be applied. He was still off-topic, and your desire that
nobody comment on that fact is unrealistic.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/2/2004 6:43:03 PM
|
|
Guy Macon wrote:
...
> I don't believe that he wanted help. I believe that he wanted
> to promote his religion website through off-topic posting.
You are wrong, like many dogmatic religious activists. (Leo is not one
of those.)
Jerry
--
Engineering is the art of making what you want from things you can get.
�����������������������������������������������������������������������
|
|
0
|
|
|
|
Reply
|
jya (12870)
|
6/2/2004 9:41:37 PM
|
|
Guy Macon wrote:
...
> You seem to be very keen on defending the advertising of a religious
> Java applet that displays the Jesus Prayer in a Forth newsgroup.
Nonsense! We defend a long-time and respected friend. It begins to seem
that it is Leo's religious content that irks you. For shame! I don't
begrudge Leo his religion any more than he begrudges me the lack of one.
Enough of this.
Jerry
--
Engineering is the art of making what you want from things you can get.
�����������������������������������������������������������������������
|
|
0
|
|
|
|
Reply
|
jya (12870)
|
6/2/2004 9:48:14 PM
|
|
William James wrote:
> The intolerance displayed by the christians for Guy's humorously
> expressed opinions, and the savageness of their gang-attack on him, are
> shocking and sickening, but not really surprising.
Surely not _the_ William James whose works I've read, brother of Henry?
Jerry
--
Engineering is the art of making what you want from things you can get.
�����������������������������������������������������������������������
|
|
0
|
|
|
|
Reply
|
jya (12870)
|
6/2/2004 9:56:00 PM
|
|
Richard Owlett wrote:
> My personal opinion is that those attracted to Forth use similar methods
> to attack problems in a wide range of fields.
The original of the Java program was a Forth program posted to clf
in 2003 in the thread "Encouraging Newbies -- A MODEST Proposal [
apologies to J. Swift ;]". See hny.f. The Forth program has about
70 languages, the Java applet only about 40 (now including Welsh). I
need help.
Leo
--
http://www.albany.net/~hello/
Quote for Quakers:
"It is well to use a single sentence, repeated over and over and
over again, such as this: "Be Thou my will. Be Thou my will," or
"I open all before Thee. I open all before Thee," or "See earth
through heaven. See earth through heaven." - Thomas R. Kelly, A
Testament of Devotion
|
|
0
|
|
|
|
Reply
|
hello988 (120)
|
6/2/2004 10:17:30 PM
|
|
jonah thomas wrote:
> I consider your explanation quite clear, and I hope you won't feel the
> need to keep explaining when you get flamed further.
>
I don't and won't.
> Probably if you do something similar again -- say christmas -- no one
> here then will object. Or maybe they will, clf seems to have gotten a
> bit more abrasive in the last few years.
>
No need to wait.
\ jp.f Leo Wong 02 June 04 +
\ Jesus Prayer in Forth
\ With apologies for any solecisms.
80 CONSTANT ncols
12 CONSTANT middle-row
CREATE prayers 100 CELLS ALLOT
: emits ( ca u -- ) 0 ?DO COUNT EMIT 100 MS LOOP DROP 1000 MS ;
: ejaculate ( n a -- )
@ COUNT 2>R ncols R@ - 2/ SWAP 1 AND 1 XOR middle-row + AT-XY
2R> emits ;
: now ( n a -- n++ a++ ) >R 1+ R> CELL+ ;
: rest ( -- )
0 middle-row 2DUP AT-XY ncols SPACES 1+ AT-XY ncols SPACES ;
: pray ( # -- )
2* 1+ DUP CELLS prayers + 2DUP ejaculate now ejaculate rest ;
: string, ( a u ) DUP C, 0 ?DO COUNT C, LOOP DROP ;
: puts ( a1 a2 u -- a3 )
DUP IF HERE >R string, R> OVER ! CELL+ ELSE 2DROP THEN ;
: gets ( -- )
prayers CELL+
BEGIN 0 PARSE 2DUP s" done" COMPARE
WHILE puts REFILL 0= ABORT" not done?"
REPEAT 2DROP
prayers - 1 CELLS / 1- prayers ! ;
gets
Unelanvhi Tsisa Galonedv, Unelanvhi Uwetsi,
uha adadolisdi nahna asgani nasgiyai ayv.
Lord Jesus Christ, Son of God,
have mercy on me, a sinner.
Herra Jeesus Kristus, Jumalan Poika,
armahda minua syntistä.
Seigneur, Jésus-Christ, Fils de Dieu,
prends pitié de moi, pécheur.
Heer Jezus Christus, Zoon van God,
heb medelijden met mij, zondaar.
Jesus Christus, Sohn Gottes,
erbarme Dich meiner, des Sünders.
Kyrie Iesou Christe, Yie Theou,
eleyson me, ton amartolon.
Haku Iesu Kristo, ke Keiki a ke Akua,
e aloha mai oe ia'u i ka mea i hewa.
Adon Yeshua Hamashiach, Ben Elohim,
terachem alay, ani choteh.
He PrabhuYeshu Mashi Parmeshwar
Ke Putra Mujh papi par daya kar.
A Thiarna Íosa Christ, mac Dé,
dean trocare ormse peacadh.
Signore Gesù Cristo, Figlio di Dio,
abbi pietà di me peccatore.
Shu Iesu Kristo, Kami-no on-ko, Watashitachi,
tumibito wo awarende kudasai.
Iesu Christe, Fili Dei, Domine,
miserere mei peccatoris.
Tuhan Yesus Kristus, Anak Allah,
kasihanilah aku, se orang yang berdosa.
E te Ariki Ihi Karaiti, et te Tama a te Atua,
kia aroha ki ahau, te tangata hara.
Old Church Slavonic Hospodi Isuse Christe,
Syne Božij, pomiluj mja hrišnaho
Herr Jesus Christus, du Gottes Saen,
erboarm die aewa mie, en Sinda.
Gospode Isuse Hriste, Sine Bozhiji,
pomiluj me, greshnog.
Señor Jesucristo, Hijo de Dios,
ten piedad de mi, pecador.
Herre Jesus Kristus, Guds Son,
benåda mig, en syndare.
Pane Ježišu Kriste, Synu Boží,
zmiluj sa nado mnou hriešnym
Mheshimiwa Yesu Kristo, Mwana wa Mungu,
unionee huruma mimi mwenye dhambi.
Panginoong Hesukristo, Anak ng Diyos,
maawa ka sa aming makasalanan.
Arglwydd Iesu Grist, Fab Duw,
trugarha wrthyf bechadur.
Jesu Kristi, Omo Olorun,
see rere ati anu fun mi, emi elese.
done
\ The Java applet also has:
\ Albanian, Croatian, Czech, Danish, Greek in Greek, Haitian,
\ Hebrew in Hebrew, Indian in Devanagari, Hungarian, Icelandic,
\ Norse, Polish, Russian in Cyrillic, Serbian in Cyrillic,
\ Thai, Ukrainian in Cyrillic, Vietnamese
\ Random number generator from Brodie, Starting Forth
VARIABLE rnd TIME&DATE + + + + + rnd !
: random ( -- u ) rnd @ 31421 * 6927 + DUP rnd ! ;
: choose ( u - 0...u-1) random UM* NIP ;
: JP ( -- )
PAGE prayers @ 2/
BEGIN DUP choose pray 1000 MS KEY? UNTIL DROP ;
JP
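
For readers more at home in Java, the random-number words above can be sketched roughly as follows. This is an illustration only and not part of jp.f: the class name `BrodieRandom` and its method names are made up here, and the sketch assumes 16-bit cells, as in Brodie's original. On a Forth with wider cells the arithmetic wraps at the cell width instead.

```java
final class BrodieRandom {
    private int state;                 // holds a 16-bit value, like a 16-bit Forth cell

    BrodieRandom(int seed) {
        state = seed & 0xFFFF;
    }

    // rnd @ 31421 * 6927 + DUP rnd !   (arithmetic wraps at 16 bits)
    int random() {
        state = (state * 31421 + 6927) & 0xFFFF;
        return state;
    }

    // random UM* NIP : keep the high 16 bits of the product, so 0 <= result < u
    int choose(int u) {
        return (random() * u) >>> 16;
    }

    public static void main(String[] args) {
        BrodieRandom rng = new BrodieRandom((int) System.currentTimeMillis());
        for (int i = 0; i < 5; i++) {
            System.out.println(rng.choose(26));   // index of the next prayer to show
        }
    }
}
```

Taking the high half of the product, rather than a remainder, scales the 16-bit random value into the range 0 to u-1 without a division.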
Leo
--
http://www.albany.net/~hello/
|
|
0
|
|
|
|
Reply
|
hello988 (120)
|
6/2/2004 10:21:04 PM
|
|
"Jerry Avins" <jya@ieee.org> wrote in message
news:40be4cf7$0$2983$61fed72c@news.rcn.com...
> William James wrote:
>
> > The intolerance displayed by the christians for Guy's humorously
> > expressed opinions, and the savageness of their gang-attack on him, are
> > shocking and sickening, but not really surprising.
>
> Surely not _the_ William James whose works I've read, brother of Henry?
The same thought struck me while reading through that
long (and somewhat vitriolic) post!
William James (the famous one) doesn't appear to be too
well known these days despite being perhaps one of the
keenest minds of the time.
Ed
|
|
0
|
|
|
|
Reply
|
nospam358 (1421)
|
6/3/2004 4:15:06 AM
|
|
Leo Wong <hello@albany.net> says...
>Quote for Quakers:
>
>"It is well to use a single sentence, repeated over and over and
>over again, such as this: "Be Thou my will. Be Thou my will," or
>"I open all before Thee. I open all before Thee," or "See earth
>through heaven. See earth through heaven." - Thomas R. Kelly, A
>Testament of Devotion
Your direct attack on Quaker beliefs is off topic in a Forth
newsgroup. I respectfully ask you to either post about Forth or
go elsewhere.
Even if it wasn't off-topic, Quaker beliefs are that if someone
attacks your religion you should stand silently and offer no
defense.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/3/2004 4:58:49 AM
|
|
Jerry Avins <jya@ieee.org> says...
>> You seem to be very keen on defending the advertising of a religious
>> Java applet that displays the Jesus Prayer in a Forth newsgroup.
>
>Nonsense! We defend a long-time and respected friend.
....who is posting about a Java applet in a Forth newsgroup.
This isn't about religion, no matter how hard you try to
make it be about religion. This is about off-topic posts.
I am still waiting for your answer as to whether it's OK to
post porn or other off-topic material.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/3/2004 5:03:41 AM
|
|
Jerry Avins <jya@ieee.org> says...
>You are wrong, like many dogmatic religious activists.
I will not dignify that with a response. Today, flame wars
do not interest me. Start without me.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/3/2004 5:04:49 AM
|
|
Leo Wong <hello@albany.net> says...
>\ Jesus Prayer in Forth
Thank you for posting something about Forth in a Forth newsgroup.
I will give your question the serious consideration it deserves in
the hope that it will encourage you to post on-topic.
>The Forth program has about 70 languages, the Java applet
>only about 40
Is there a reason why you have not ported the languages in the
Forth program to the Java applet?
Is your goal to simply get the Jesus Prayer translated into more
languages or are there programming issues involved?
>Lord Jesus Christ, Son of God,
>have mercy on me, a sinner.
These phrases are all found in the New Testament ("have mercy on
me, a sinner" is in Luke 18:13, for example) and Wycliffe Bible
Translations ( http://www.wycliffe.org ) has translated the New
Testament into 578 languages and has 1,246 partial translations
in progress. Why not get copies of all of those translations
and concatenate the individual phrases that make up your prayer?
How are you handling right-to-left text, vertical text, foreign
character sets, etc? Does Java handle those or do you have to
code it yourself?
Your sample code seems to consist of phonetic approximations of the
various languages using western script. Is there a reason why you
are not using the proper alphabets? Does Java support Unicode?
Is it your intention that the phrases be said aloud? If so, you
might wish to consider adding diacritical marks. Also, this limits
your language choices, excluding click languages and tonal languages.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/3/2004 5:47:29 AM
|
|
Guy Macon <http://www.guymacon.com> wrote in message news:<10bs7t4iacek081@corp.supernews.com>...
> You seem to be very keen on defending the advertising of a religious
> Java applet that displays the Jesus Prayer in a Forth newsgroup.
You seem to be hung up on whether it is religious or not and
ignoring whether a link to a Java applet translation of
well-known Forth example code is an appropriate post in a
Forth newsgroup.
And it should be obvious that if the initial response was, "in
what way is this appropriate in a Forth newsgroup", then the
reply would have been, "that's the Javification of the original
Forth code, available if you look on the website, but copied
below", and the whole silly religious wars angle would have
been avoided.
Of course, if the code was modified by other people to
spout the foundation of faith of Dar Islam, the middle
way of the Buddha, and a favourite passage from Lao Tzu,
I would be even happier. I see nothing wrong with
proselytising Forth among Christians, though I would
not want that to be the only religious group where
people are proselytising Forth.
|
|
0
|
|
|
|
Reply
|
agila61 (3956)
|
6/3/2004 7:26:34 AM
|
|
According to Guy Macon <http://www.guymacon.com>:
> Does Java support Unicode?
It depends on what you call "supporting Unicode".
There are several "Unicode" capabilities:
1. being able to copy and move around Unicode strings without crashing;
2. handling correct Unicode string rendering, including combining
diacritics and bidirectional output;
3. handling appropriate Unicode string input (e.g., line editing);
4. being able to compare and sort Unicode strings (collation) in a way
which is meaningful and "correct" with regards to a configurable set
of languages.
The capability "1" is very easy: virtually all software which can handle
latin-1 strings can use UTF-8 encoded Unicode strings. As long as only
copying is concerned, and search & match is limited to ASCII characters,
Unicode characters other than ASCII are handled as a bunch of bytes with
values between 128 and 255. Forth has this capability, just like most
C implementations, and Java too.
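
As an illustration of capability 1, here is a small Java sketch. The class and method names are hypothetical and do not come from any post in this thread. The UTF-8 bytes are copied without being interpreted, and the text survives the round trip; each non-ASCII character appears as bytes whose unsigned values lie between 128 and 255.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Copy {
    // Copy the bytes blindly, the way a Latin-1-only program would.
    static byte[] copyBytes(byte[] src) {
        byte[] dst = new byte[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // Encode to UTF-8, copy the raw bytes, decode again.
    static String roundTrip(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return new String(copyBytes(encoded), StandardCharsets.UTF_8);
    }

    // Count the bytes with the high bit set (unsigned value 128..255).
    static int highBytes(String text) {
        int n = 0;
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0x80) != 0) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String line = "Se\u00F1or Jesucristo, Hijo de Dios,";
        System.out.println(roundTrip(line).equals(line));
        System.out.println(highBytes(line));
    }
}
```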
"2" is much more difficult. In order to render Unicode strings, you
must have the correct font and implement the "BiDi" algorithm. If your
program runs in some sort of terminal, the font is a terminal property,
and your application does not have to worry about it; however, a Java
applet runs in a graphic context and names fonts explicitly. Hence, those
fonts must be installed. This can be done in Java, see for instance:
http://java.sun.com/j2se/1.3/docs/guide/intl/addingfonts.html
"BiDi" is a more complex matter. Right now, it is unclear whether
BiDi should be handled by the terminal or by the application. Current
Unicode-enabled terminals in the Unix world do not handle BiDi
themselves (they expect UTF-8 encoded characters in their order of
graphical appearance). A Java applet runs in its own graphical context
and must reorder the Unicode characters according to the BiDi algorithm
prior to handing them to the display system. This means that Java must
recognize individual characters. The Java native character handling type
is "char", a 16-bit unsigned integer, which is defined to contain a
Unicode character value from the first "plane" (which includes western
alphabets, Georgian, Japanese, Thai, Runic... but not Ugaritic). Recent
Java versions (1.4+) include java.text.Bidi, a class which implements
the BiDi algorithm.
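
A rough sketch of what using that class looks like, assuming a Java version that ships it. The class is spelled `java.text.Bidi` in the library; the wrapper class `BidiRuns` and the sample string are invented for illustration. The sketch only reports the directional runs; it does not reorder or draw anything.

```java
import java.text.Bidi;

final class BidiRuns {
    // List each directional run as "start-limit:LTR" or "start-limit:RTL".
    static String describe(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(bidi.getRunStart(i)).append('-').append(bidi.getRunLimit(i));
            // Even embedding levels are left-to-right, odd levels right-to-left.
            sb.append(bidi.getRunLevel(i) % 2 == 0 ? ":LTR" : ":RTL");
        }
        return sb.toString();
    }

    static boolean isMixed(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).isMixed();
    }

    public static void main(String[] args) {
        // Latin text, then four Hebrew letters, then Latin text again.
        String mixed = "abc \u05D0\u05D3\u05D5\u05DF xyz";
        System.out.println(isMixed(mixed));
        System.out.println(describe(mixed));
    }
}
```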
Therefore, Java supports "2", provided that the correct fonts are
installed, and that a recent version of Java is used (unfortunately, the
Microsoft Java VM is _not_ recent). Character handling is limited to the
first Unicode plane (those characters which have a 16-bit value).
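
To make the 16-bit limit concrete, here is a small sketch with an invented class name, `Planes`. It relies on the code-point methods (`Character.toChars`, `String.codePointCount`) that were added in J2SE 5.0, so it does not run on the Java versions discussed above. A character outside the first plane, such as an Ugaritic letter, occupies two `char` values (a surrogate pair) even though it is a single code point.

```java
final class Planes {
    // Number of 16-bit char units the string occupies.
    static int charUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points the string contains.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String thai = "\u0E01";                                  // first plane: one char
        String ugaritic = new String(Character.toChars(0x10380)); // outside the first plane
        System.out.println(charUnits(thai) + " " + codePoints(thai));
        System.out.println(charUnits(ugaritic) + " " + codePoints(ugaritic));
    }
}
```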
"3" is somewhat similar to "2", except that things are even more complex
and unclear. What happens, for instance, when you strike the "right"
arrow key ? Basically, you would like the cursor to graphically move to
the right, but the cursor may presently be in a portion of text where
characters are ordered right-to-left. In order to fully support Unicode,
the input system (line editing and so on) must be BiDi-aware; it must
also know which characters are "combining diacritics" (which are just
"additions" to another character, and, as such, must be skipped when
moving the cursor). As far as I know, the Java GUI systems (even the
recent ones) have only very limited support for that. Of course, that is
not a problem for a program which outputs prayers but inputs nothing.
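
The combining-diacritic part of the problem can be sketched with `java.text.BreakIterator`, which finds the boundaries between user-visible characters. The helper class `CursorSteps` is hypothetical, and the sketch covers only skipping over combining marks, not bidirectional cursor movement.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class CursorSteps {
    // Return the next cursor position after pos, skipping combining marks.
    static int nextCursorPosition(String text, int pos) {
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        it.setText(text);
        int next = it.following(pos);
        return next == BreakIterator.DONE ? text.length() : next;
    }

    public static void main(String[] args) {
        // 'e' followed by a combining acute accent, then 'x'.
        String text = "e\u0301x";
        int pos = 0;
        while (pos < text.length()) {
            pos = nextCursorPosition(text, pos);
            System.out.println(pos);
        }
    }
}
```

Stepping by `char` index instead would leave the cursor between the base letter and its accent.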
"4" is traditionally ill-supported. The problem is that two strings
may be ordered differently, depending on the current "language". The
Java solution is to use java.text.Collator, which is locale-driven. If
you want to order strings in the operating system natural order, then
things are likely to work (Java will use the system default locale,
supposedly correct). If you want to handle other languages as well,
then you'd better have the correct locale present; you may also use
java.text.RuleBasedCollator, which can be configured at runtime to
implement new collators even without the corresponding locale support.
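
A minimal sketch of the runtime-configured approach, with a made-up rule string and class name (`CustomOrder`). The rules treat "ch" as a single letter that sorts after "c", in the style of traditional Spanish ordering; they are an illustration, not the rules of any real locale.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class CustomOrder {
    // Illustrative rules only: "ch" is one collation element placed after "c".
    static final String RULES =
        "< a < b < c < ch < d < e < f < g < h < i < j < k < l < m "
      + "< n < o < p < q < r < s < t < u < v < w < x < y < z";

    static int compare(String left, String right) {
        try {
            return new RuleBasedCollator(RULES).compare(left, right);
        } catch (ParseException e) {
            throw new IllegalStateException("bad collation rules", e);
        }
    }

    public static void main(String[] args) {
        // Plain code-unit order puts "chico" first; the custom rules put it after "cosa".
        System.out.println("chico".compareTo("cosa") < 0);
        System.out.println(compare("chico", "cosa") > 0);
    }
}
```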
In brief: the short answer is that Java supports Unicode, albeit not
as fully as could be wished. But it is not clear which parts of
Unicode should be handled by application code and which parts should
be managed automatically by the operating system and the programming
environment.
Forth is much simpler: it does nothing per se. Every Unicode support
you may dream of MAY be added to Forth with appropriate code, but
is NOT present by default.
--Thomas Pornin
|
|
0
|
|
|
|
Reply
|
pornin (75)
|
6/3/2004 8:06:26 AM
|
|
Dr. Bruce R. McFarling <agila61@netscape.net> says...
>You seem to be hung up on whether it is religious or not
The fact that you continue to claim this despite me clearly
stating on several occasions that my objection was and is
about it being Java instead of Forth says a lot more about
you than it does about me.
As soon as he started talking about Forth, I was only too
happy to have a discussion with him. I can't be sure because
his question was ill-defined, but I may have even been able
to make some suggestions that could help him. The religious
content of his posts don't bother me even a tiny bit - as long
as he is talking about Forth and not Java.
Try to wrap your mind around the foreign concept that there
are people in this world who do not tell lies. Try to at
least consider the possibility that when I tell you that it's
about being off-topic and not about religion, that I might
possibly be one of those people who tell the truth.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/3/2004 9:44:53 AM
|
|
Thomas Pornin <pornin@nerim.net> says...
>
>According to Guy Macon <http://www.guymacon.com>:
>
>> Does Java support Unicode?
[Excellent discussion of Java/Unicode snipped to save space]
I am looking forward to Mr. Wong clarifying what he is trying
to accomplish. If it's just making the number of languages
the prayer is displayed in bigger and he is happy with using
a western alphabet to do it, he doesn't need Unicode at all.
If he wants to display the prayer properly rendered in such
languages as Chinese and Hebrew, the Forth code he posted
won't do the job. As I said, I am looking forward to Mr. Wong
clarifying what he is trying to accomplish.
|
|
0
|
|
|
|
Reply
|
Guy
|
6/3/2004 10:34:27 AM
|
|
I'm on my way to work, so I will be brief, but I hope not inaccurate.
Guy Macon wrote:
> Leo Wong <hello@albany.net> says...
>
>
>>\ Jesus Prayer in Forth
>
>
> Thank you for posting something about Forth in a Forth newsgroup.
> I will give your question the serious consideration it deserves in
> the hope that it will encourage you to post on-topic.
>
>
>>The Forth program has about 70 languages, the Java applet
>>only about 40
>
>
> Is there a reason why you have not ported the languages in the
> Forth program to the Java applet?
Because I don't know how to say:
Lord Jesus Christ, Son of God,
have mercy on me, a sinner.
in them.
>
> Is your goal to simply get the Jesus Prayer translated into more
> languages or are there programming issues involved?
>
>
>>Lord Jesus Christ, Son of God,
>>have mercy on me, a sinner.
>
>
There are several unresolved Java issues that I'm still working on. The
Forth may also have an issue I'll look at today.
> These phrases are all found in the New Testament ("have mercy on
> me, a sinner" is in Luke 18:13, for example) and Wycliffe Bible
> Translations ( http://www.wycliffe.org ) has translated the New
> Testament into 578 languages and has 1,246 partial translations
> in progress. Why not get copies of all of those translations
> and concatenate the individual phrases that make up your prayer?
>
One problem is the grammar. I've made several errors. Another problem
is "have mercy on me" appears more than once (Luke 18:13, Mark 10:47,
Matthew 8:29) and is not always translated the same way. A third
problem is that I want standard ways of saying:
Lord Jesus Christ, Son of God,
have mercy on me, a sinner.
which is an old prayer, not just something made up, and I was hoping that
some people knew it as a prayer. As it is now, most of what I have is
what I or others have made up ("This is how I would say....").
> How are you handling right-to-left text, vertical text, foreign
> character sets, etc? Does Java handle those or do you have to
> code it yourself?
The right-to-left is a problem in Java -- or I should say in some
browsers/jvms. The Hebrew sometimes appears backwards. Please see this
thread in comp.lang.java.programmer. I haven't decided about vertical
text. I'll see some Chinese and Japanese friends soon, and I'll ask
for their advice. Rendering will be one of the questions. I'll probably
go horizontal, but I haven't decided.
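On the Hebrew appearing backwards: Unicode text is stored in logical (reading) order, and it is up to the renderer to lay right-to-left runs out visually. The following is a minimal Python sketch of that distinction, using only the standard `unicodedata` module. The sample word and the helper name `rtl_letters` are illustrative assumptions, not anything taken from the applet.

```python
import unicodedata

def rtl_letters(text):
    """Return the characters whose Unicode bidi class is right-to-left."""
    # 'R' is the class of Hebrew letters; 'AL' is the class of Arabic letters.
    return [ch for ch in text if unicodedata.bidirectional(ch) in ("R", "AL")]

# "shalom" in logical (stored) order: shin, lamed, vav, final mem.
shalom = "\u05e9\u05dc\u05d5\u05dd"

# Every letter carries the right-to-left class, so a bidi-aware renderer
# draws the first stored letter (shin) at the right-hand edge.
print(len(rtl_letters(shalom)))  # 4
print(rtl_letters("abc"))        # []

# A display that only draws left to right would need the run reversed
# beforehand; fed the stored order, it shows the word backwards.
visual_for_ltr_only_display = shalom[::-1]
```

Reversing the whole string is only adequate for a simple run like this one; mixed-direction text needs the full bidirectional algorithm.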
>
> Your sample code seems to consist of phonetic approximations of the
> various languages using western script. Is there a reason why you
> are not using the proper alphabets? Does Java support Unicode?
> Is it your intention that the phrases be said aloud? If so, you
> might wish to consider adding diacritical marks. Also, this limits
> your language choices, excluding click languages and tonal languages.
>
I don't know how to do Unicode in ANS Standard Forth. Java supports
Unicode up to 0xffff. Not everything shows up in all fonts. Please see:
http://www.albany.net/~hello/jp4.htm
(WARNING: PORNOGRAPHY)
Now, can you help me say:
Lord Jesus Christ, Son of God,
have mercy on me, a sinner?
in a language I don't have, or make corrections to versions I have?
Btw, Thomas R. Kelly is my favorite Quaker writer.
God bless,
Leo
--
http://www.albany.net/~hello/
|
hello988 (120)
|
6/3/2004 11:21:27 AM
|
|
Ed wrote:
> "Jerry Avins" <jya@ieee.org> wrote in message
> news:40be4cf7$0$2983$61fed72c@news.rcn.com...
>
>>William James wrote:
>>
>>
>>>The intolerance displayed by the christians for Guy's humorously
>>>expressed opinions, and the savageness of their gang-attack on him, are
>>>shocking and sickening, but not really surprising.
>>
>>Surely not _the_ William James whose works I've read, brother of Henry?
>
>
> The same thought struck me while reading through that
> long (and somewhat vitriolic) post!
>
> William James (the famous one) doesn't appear to be too
> well known these days despite being perhaps one of the
> keenest minds of the time.
>
> Ed
>
>
>
>
>
Have you read Barzun's From Dawn to Decadence, p. 666-668 (bit of
trivia: my name appears on p. xii of the paperback edition) or his
Stroll with William James? Of course, we in our generation have all
read several of William's and Henry's books. I haven't found any Forth
code in them, though.
Leo
--
http://www.albany.net/~hello/
|
hello988 (120)
|
6/3/2004 11:25:58 AM
|
|
I am not happy with using just the Western alphabet.
Leo Wong
http://www.albany.net/~hello/
|
hello988 (120)
|
6/3/2004 12:57:59 PM
|
|
Dear Leo,
> I haven't decided about vertical
> text. I'll see some Chinese and Japanese friends soon, and I'll ask
> for their advice.
Traditionally, both Chinese and Japanese are written top to bottom,
right to left. I. e., each line is written top to bottom, with the next
line, if any, to the left of its previous one. On occasion, for
instance, in technical writing, Japanese is written left to right, top
to bottom. Chinese I am not so sure about. I know it can be written
right to left, top to bottom, but I have heard that that may be changing
to left to right, top to bottom.
Best regards,
Bill
|
bspight1 (63)
|
6/3/2004 1:22:08 PM
|
|
Jerry
<clerihew>
Jerry Avins
Is counted among the Forth mavens.
As there are very few,
He merits a clerihew.
</clerihew>
said: Enough of this.
Perhaps an explanation (not apology, except in the Newman sense) from me
will end this fruitless thread.
Having posted about Forth since the GEnie days, I had gotten comfortable
with clf. I have corresponded with some Forthers by e-mail, have given
money to some, and have talked with some, have met some. I knew them to
be interesting and intelligent people, knowledgeable about many things,
including languages, and mostly tolerant of my foibles. So when I posted
here asking for off-topic help, I did not think that formalities were
required. I did not reckon with Guy Macon, Albert van der Horst,
"andrew", "William James," "Gawnsoft." I did not know that some people
equated "OT" with "PORN".
I didn't advertise a Java applet to a Forth newsgroup. I must have more
than a hundred Java applets on my site. I don't remember mentioning one
of them to this group, except perhaps as an aside (will someone please
correct me if I'm wrong?). I haven't done the demographics, but is
advertising a Java applet to a Forth group very effective?
The Java applet had to be mentioned because it shows the languages I have
in the format I use. This is still the best way to view what I have:
http://www.albany.net/~hello/jp4.htm
It will then be obvious why I am using Java and not Forth (did somebody
say Unicode?). How else can what I have be corrected?
I don't consider my site a religious site, though I am glad that it is
thought so. "Stories by Mary Murphy" are not primarily religious.
Jacques Barzun is not famous for his church going. It would make me happy
if someone went through the Forth pages and found religion, but that is
not why I wrote them. My Java is for Jesus, but that is because I'm a
lousy Java programmer, and the Java applets are only justified by their
dedicatee. The Gospel Scene contains my best writing--Jacques Barzun
called it "elegant and eloquent" (but then, he's a friend)-- plus a nice
essay by John Jay Chapman, an anti-Catholic bigot whose writings I love.
I recommend my Gospel Scenes as literature to anyone who enjoys English
prose (remember though, it's somewhat dated--written by someone who went
to Columbia College in the 60s).
I am Roman Catholic, but not particularly Christian or hierarchical. I
have it against the American Bishops that they did not vigorously oppose
the war in Iraq. I have friends of all kinds. The friend I most love
being with is an atheist. The friend I see most often is homosexual.
Despite my opposition to the Iraq war, I maintain a friendly
correspondence with the wife of a person very much involved in that war
(you would all know his name). I do love the Pope, but for some reason I
haven't finished any of his books. My family would support me if I
declined to be his successor.
I pride myself on my tact, so I resent it when people tell me how to make
friends or how I should write, but I recognize that pride and resentment
are faults. I'm taken aback when someone who is honest takes me for a
liar. People who know me know how bad I am at dissembling. I'm surprised
when a Quaker thinks that my quoting Thomas R. Kelly, a famous Quaker all
of whose books and whose biography I have read, is an attack against
Quakers, a sect I love more than Roman Catholicism (though how can one not
love what produced Palestrina, Victoria, Bernini, Fra Angelico, Dante,
Berlioz [who may have died an atheist, but who wrote a Requiem, a Te Deum,
an Infant Christ], while hating its evils?).
So when I asked clf for versions of:
Lord Jesus Christ, Son of God,
have mercy on me, a sinner.
I really was asking for off-topic help.
I still am.
Leo Wong, off-topic
http://www.albany.net/~hello/
|
hello988 (120)
|
6/3/2004 2:37:49 PM
|
|
I found 18 more posts on this topic today, close to a third of the
unread posts in the group.
We're spamming ourselves about this trivial matter far, far more than
Leo might hypothetically have spammed us in the first place.
Please stop.
|
j2thomas (670)
|
6/3/2004 2:52:11 PM
|
|
I briefly went to Chinese school, so I know how Chinese is traditionally
written, though I have no idea how to say:
Lord Jesus Christ, Son of God,
have mercy on me, a sinner.
in Chinese in any direction.
I have friends from Taiwan who could easily give me the Cantonese and the
Mandarin and perhaps some others, but they're not Christian, so I've been
shy about asking them and have been waiting to get home to ask my family.
When there I'll also try to get Japanese and Korean and other Pacific Rim
languages. It would be nice, though, to get these and other languages now
and forget about Java and Forth while on vacation.
I now regret I posted to clf, not because the original post was off-topic,
but because it resulted in ill-feeling and no additional languages (from
this group). I should thank the friends who spoke up for me here. And so
I do: Thanks!
It's ironic I'm irenic.
Leo Wong, off topic
http://www.albany.net/~hello/
|
hello988 (120)
|
6/3/2004 3:00:28 PM
|
|
Bill Spight <bspight@pacXbell.net> says...
>Traditionally, both Chinese and Japanese are written top to bottom,
>right to left. I. e., each line is written top to bottom, with the next
>line, if any, to the left of its previous one. On occasion, for
>instance, in technical writing, Japanese is written left to right, top
>to bottom. Chinese I am not so sure about. I know it can be written
>right to left, top to bottom, but I have heard that that may be changing
>to left to right, top to bottom.
Japanese written left to right, top to bottom (same order as English)
is what I see most often on commercials and television shows when I
visit Japan.
|
Guy
|
6/3/2004 3:15:08 PM
|
|
ok
|
hello988 (120)
|
6/3/2004 3:17:41 PM
|
|
ok
|
hello988 (120)
|
6/3/2004 3:18:03 PM
|
|
LeoWong <hello@albany.net> says...
>I did not know that some people equated "OT" with "PORN".
They don't. A serious question was asked, and nobody has attempted
to give an answer. The question is this: Given the argument presented
that it is OK for Leo Wong to post about a Java applet that displays the
Jesus Prayer in a Forth newsgroup, would it be OK for Leo Wong to post
about a Java applet that displays pornographic stories in a Forth
newsgroup? If the answer is "no porn", then I must conclude that
the person presenting the argument thinks that some off-topic posts
are OK but others are not. If the answer is "porn is OK too", then I
must conclude that the person presenting the argument thinks that all
off-topic posts are OK.
>I'm surprised when a Quaker thinks that my quoting Thomas R.
>Kelly, a famous Quaker all of whose books and whose biography
>I have read, is an attack against Quakers, a sect I love more
>than Roman Catholicism
I must tread carefully here, because my religious convictions
forbid me defending my religion when I believe it to be attacked,
but I think that it's fair that you know why I think that what
you wrote was an attack.
I wrote:
"...the particular branch of Christianity that I happen to
belong to (Quaker), which does not believe in rote prayers."
to which you replied:
"Quote for Quakers:
'It is well to use a single sentence, repeated over and over
and over again, such as this: "Be Thou my will. Be Thou
my will," or "I open all before Thee. I open all before
Thee," or "See earth through heaven. See earth through
heaven." - Thomas R. Kelly, A Testament of Devotion
In my opinion, that was a direct attack on the religious belief
that I expressed, just as surely as if you had expressed the
commonly held Roman Catholic view that the pope is the head of
the church and I quoted some famous Catholic who believes the
exact opposite. The quote you chose is in direct opposition
to what most Quakers believe:
"Prayer and thanksgiving are an important part of worship.
May they be offered in spirit and in truth, with a right
understanding, seasoned with grace. When engaged therein,
avoid many words and repetitions, and be cautious of too
often repeating the High and Holy Name of God."
-Online faith & practice (Quaker book of discipline)
As I said, my religious convictions forbid me from defending
the Quaker view on this subject, but I think that it's OK for
me to explain why I think what you wrote was an attack.
|
Guy
|
6/3/2004 3:52:31 PM
|
|
Guy Macon wrote:
> Leo Wong <hello@albany.net> says...
>
>
>>Quote for Quakers:
>>
>>"It is well to use a single sentence, repeated over and over and
>>over again, such as this: "Be Thou my will. Be Thou my will," or
>>"I open all before Thee. I open all before Thee," or "See earth
>>through heaven. See earth through heaven." - Thomas R. Kelly, A
>>Testament of Devotion
>
>
> Your direct attack on Quaker beliefs is off topic in a Forth
> newsgroup. I respectfully ask you to either post about Forth or
> go elsewhere.
>
> Even if it wasn't off-topic, Quaker beliefs are that if someone
> attacks your religion you should stand silently and offer no
> defense.
Guy,
I must conclude either that you are ignorant of usenet practice, or you
have lost your sense of proportion. I assume the former and offer these
facts:
There can be no expectation that a sig, that part of a message below the
'-- ', bear any relation to the topic of the rest of it. The quote which
you construed as an attack was a sig.
Thomas R. Kelly was an eminent Quaker activist and philosopher, as well
as a prolific author. Quoting him in a sig is probably an indication of
approval or admiration. One I use occasionally, "Conventionality is not
morality. Self-righteousness is not religion", is not intended to attack
Charlotte Bronte.
Jerry
--
If I am not for myself, who is for me? If I am only for myself,
what am I? If not now, when? --- Rabbi Hillel
|
jya (12870)
|
6/3/2004 4:13:18 PM
|
|
Guy Macon wrote:
> Thomas Pornin <pornin@nerim.net> says...
>
>>According to Guy Macon <http://www.guymacon.com>:
>>
>>
>>>Does Java support Unicode?
>
>
> [Excellent discussion of Java/Unicode snipped to save space]
>
> I am looking forward to Mr. Wong clarifying what he is trying
> to accomplish. If it's just making the number of languages
> the prayer is displayed in bigger and he is happy with using
> a western alphabet to do it, he doesn't need Unicode at all.
> If he wants to display the prayer properly rendered in such
> languages as Chinese and Hebrew, the Forth code he posted
> won't do the job. As I said, I am looking forward to Mr. Wong
> clarifying what he is trying to accomplish.
It may please you to learn that Leo is an eclectic pixie. I gather that
he programs professionally in some capacity for an agency of New York
State, and amuses himself and others by creating programs, clerihews and
other minor and major works worthy of permanence. Those who are not amused
need not read.
Jerry
--
Engineering is the art of making what you want from things you can get.
|
jya (12870)
|
6/3/2004 4:25:16 PM
|
|
LeoWong wrote:
> I briefly went to Chinese school, so I know how Chinese is traditionally
> written, though I have no idea how to say:
>
> Lord Jesus Christ, Son of God,
> have mercy on me, a sinner.
>
> in Chinese in any direction.
>
> I have friends from Taiwan who could easily give me the Cantonese and the
> Mandarin and perhaps some others, but they're not Christian, so I've been
> shy about asking them and have been waiting to get home to ask my family.
> When there I'll also try to get Japanese and Korean and other Pacific Rim
> languages. It would be nice, though, to get these and other languages now
> and forget about Java and Forth while on vacation.
Go ahead and ask them. If they take offense at being asked to use their
knowledge to help you in your quest, they're hardly friends anyway.
> I now regret I posted to clf, not because the original post was off-topic,
> but because it resulted in ill-feeling and no additional languages (from
> this group). I should thank the friends who spoke up for me here. And so
> I do: Thanks!
Let's hope that the ill feeling has dissipated.
> It's ironic I'm irenic.
Ironic? Turning the other cheek often results in a second slap!
Jerry
--
Engineering is the art of making what you want from things you can get.
|
jya (12870)
|
6/3/2004 4:36:16 PM
|
|
Guy Macon wrote:
> ... A serious question was asked, and nobody has attempted
> to give an answer. The question is this: Given the argument presented
> that it is OK for Leo Wong to post about a Java applet that displays the
> Jesus Prayer in a Forth newsgroup, would it be OK for Leo Wong to post
> about a Java applet that displays pornographic stories in a Forth
> newsgroup? ...
Given that I know who Leo is from associating with him here and from
reading some of what he's written, it is OK with me for him to post
whatever he chooses; not because pornography would be OK, but because of
what I know he would not choose.
Jerry
--
If I am not for myself, who is for me? If I am only for myself,
what am I? If not now, when? --- Rabbi Hillel
|
jya (12870)
|
6/3/2004 5:11:03 PM
|
|
Jerry Avins <jya@ieee.org> says...
>I must conclude either that you are ignorant of usenet practice, or you
>have lost your sense of proportion. I assume the former
You would do better to assume the latter. I am demonstrably far from
being ignorant of Usenet practice, but like all humans I am capable of
losing my sense of proportion. You are, however, committing a logical
fallacy by offering only two choices.
>There can be no expectation that a sig, that part of a message below the
>'-- ', bear any relation to the topic of the rest of it. The quote which
>you construed as an attack was a sig.
This is true in the normal case, but in this case Mr. Wong went out of
his way to make it clear that this .sig did bear a relation to the topic
under discussion.
[1] He usually posts without a .sig.
[2] He clearly labeled the .sig "Quote for Quakers:".
[3] The quote chosen was a direct contradiction to what I had
posted a few articles previously.
There can be no expectation that a sig, that part of a message
below the '-- ', *not* bear any relation to the topic of the
rest of it either. You have to determine that by the content
of the .sig and of the thread.
--
Quote for comp.lang.forth posters with the initials JA, ieee.org
email addresses, and posts that bring the following to mind:
"Usenet is like a herd of performing elephants with
diarrhea -- massive, difficult to redirect, awe-inspiring,
entertaining, and a source of mind-boggling amounts of
excrement when you least expect it." -Gene Spafford,1992
|
Guy
|
6/3/2004 8:01:33 PM
|
|
Leo Wong <hello@albany.net> says...
>> Is there a reason why you have not ported the languages in the
>> Forth program to the Java applet?
>
>Because I don't know how to say:
>
>Lord Jesus Christ, Son of God,
>have mercy on me, a sinner.
>
>in them.
But you said previously that "The Forth program has about 70 languages,
the Java applet only about 40." How did you manage to program the Jesus
Prayer into the Forth program in 70 languages without knowing how to say
the Jesus Prayer in 70 languages?
|
Guy
|
6/3/2004 8:13:14 PM
|
|
Jerry Avins <jya@ieee.org> says...
>
>Guy Macon wrote:
>
>> ... A serious question was asked, and nobody has attempted
>> to give an answer. The question is this: Given the argument presented
>> that it is OK for Leo Wong to post about a Java applet that displays the
>> Jesus Prayer in a Forth newsgroup, would it be OK for Leo Wong to post
>> about a Java applet that displays pornographic stories in a Forth
>> newsgroup? ...
>
>Given that I know who Leo is from associating with him here and from
>reading some of what he's written, it is OK with me for him to post
>whatever he chooses; not because pornography would be OK, but because of
>what I know he would not choose.
This does not answer my question. I didn't ask whether Leo Wong
would choose to post about a Java applet that displays pornographic
stories in a Forth newsgroup. I can guess the answer to that question.
I asked whether it would be OK for Leo Wong to post about a Java applet
that displays pornographic stories in a Forth newsgroup should he
choose to do so. The question I actually asked remains unanswered.
|
Guy
|
6/3/2004 8:19:14 PM
|
|
Guy Macon <http://www.guymacon.com> says...
>
>Jerry Avins <jya@ieee.org> says...
>
>>I must conclude either that you are ignorant of usenet practice, or you
>>have lost your sense of proportion. I assume the former
>
>You would do better to assume the latter. I am demonstrably far from
>being ignorant of Usenet practice, but like all humans I am capable of
>losing my sense of proportion. You are, however, committing a logical
>fallacy by offering only two choices.
>
>>There can be no expectation that a sig, that part of a message below the
>>'-- ', bear any relation to the topic of the rest of it. The quote which
>>you construed as an attack was a sig.
>
>This is true in the normal case, but in this case Mr. Wong went out of
>his way to make it clear that this .sig did bear a relation to the topic
>under discussion.
>
>[1] He usually posts without a .sig.
>
>[2] He clearly labeled the .sig "Quote for Quakers:".
>
>[3] The quote chosen was a direct contradiction to what I had
> posted a few articles previously.
>
>There can be no expectation that a sig, that part of a message
>below the '-- ', *not* bear any relation to the topic of the
>rest of it either. You have to determine that by the content
>of the .sig and of the thread.
>
>--
>Quote for comp.lang.forth posters with the initials JA, ieee.org
>email addresses, and posts that bring the following to mind:
>
> "Usenet is like a herd of performing elephants with
> diarrhea -- massive, difficult to redirect, awe-inspiring,
> entertaining, and a source of mind-boggling amounts of
> excrement when you least expect it." -Gene Spafford,1992
If you think that the above refers to you, I must conclude either
that you are ignorant of usenet practice, or you have lost your
sense of proportion.
|
Guy
|
6/3/2004 8:35:08 PM
|
|
"LeoWong" <hello@albany.net> wrote in message news:<6ba535d6bc31f0c7bdb11750dc24c935@localhost.talkaboutprogramming.com>...
>
> I now regret I posted to clf, not because the original post was off-topic,
> but because it resulted in ill-feeling and no additional languages (from
> this group). I should thank the friends who spoke up for me here. And so
> I do: Thanks!
>
You did warn us all it was Off Topic and you weren't broadcasting to a
bazillion usenet groups.
Actually, posting here wasn't a bad idea since some very knowledgeable
people hang out here. Arcane computer questions (especially non-PC
related) are generally welcome if you don't know who else to ask.
Politics and religion seem to be hot buttons, though.
I would be interested in using Forth to handle Unicode. Sounds like a
Windows API thing.
--
Brad
|
nospaambrad1 (568)
|
6/3/2004 8:46:44 PM
|
|
"Brad Eckert" <nospaambrad1@tinyboot.com> wrote in message
news:7d4cc56.0406031246.7d6e5019@posting.google.com...
> I would be interested in using Forth to handle Unicode. Sounds like a
> Windows API thing.
So would I; there are some stumbling blocks (some of which have been aired
on clf before).
UTF encodings were obviously not considered at that time; the timeline at
www.unicode.org suggests a date of around 1990-92 for Unicode 1.0, which
predates ANS Forth by a couple of years. The standard precludes any sensible
workarounds for a lot of this; the standard looks back to ANSI X3.4-1974 and
ISO 646-1983. This is not a criticism of the standard, btw, just an
observation...
1. Windows is UTF-16 based; UTF-8 is barely tolerated, if at all.
2. What happens to C@ C! CHAR COUNT etc. My preference is for U@ (or UC@)
etc, and leave Cx alone. However, the non-normative annex "A.3.1.2 Character
types" suggests that C@ and C! deal with true "characters"; I find the whole
section confusing in the light of, say, a UTF-16 implementation of Forth.
3. Lengths, for instance READ-LINE WRITE-LINE ; are they bytes or words if
UTF-16? Do they differ from READ-FILE and WRITE-FILE ? The ANS standard
x-FILE and x-LINE words specifically mention characters.
Windows is pretty schizoid on this subject, even with the support of type
TCHAR; some functions return bytes, others characters (ie words for Unicode
code). Strong typing doesn't always help.
4. Block support; probably academic on Windows anyhow.
Given the above, I'd say that under Windows UTF-8 is the way to go; I know
(1) is a problem, but translating to UTF-16 isn't too difficult, and Cx and
friends could remain byte based, with some Ux words to help. For instance,
COUNT counts bytes, UCOUNT treats the string as UTF-8 and adds 1 extra for
every byte with the leftmost bit set.
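The arrangement described above (UTF-8 bytes inside the program, converted to UTF-16 only at the Windows API boundary) can be sketched in a few lines. This is an illustration in Python rather than Forth, and the helper name `utf8_to_utf16le` and the sample text are assumptions made for the example. It relies on the fact that Windows wide-character APIs take little-endian UTF-16.

```python
def utf8_to_utf16le(raw: bytes) -> bytes:
    """Transcode UTF-8 bytes to UTF-16 little-endian bytes, without a BOM."""
    return raw.decode("utf-8").encode("utf-16-le")

# "Kyrie eleison" with an accented e (U+00E9), held internally as UTF-8.
text_utf8 = "Kyrie el\u00e9ison".encode("utf-8")
wide = utf8_to_utf16le(text_utf8)

byte_count = len(text_utf8)   # what a byte-based COUNT would report
unit_count = len(wide) // 2   # 16-bit units a wide-character API would see

print(byte_count, unit_count)  # 14 13
```

The two counts differ because the accented letter takes two bytes in UTF-8 but a single 16-bit unit in UTF-16, which is exactly the bytes-versus-characters ambiguity raised in points 2 and 3.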
I can't say I can see how a clean implementation would work, though; I would
appreciate a discussion of annex "A.3.1.2 Character types" based on this
requirement.
--
Regards
Alex McDonald
|
alex_mcd (751)
|
6/3/2004 9:56:57 PM
|
|
On Thu, 3 Jun 2004 21:56:57 +0000 (UTC), "Alex McDonald"
<alex_mcd@btopenworld.com> wrote:
>"Brad Eckert" <nospaambrad1@tinyboot.com> wrote in message
>news:7d4cc56.0406031246.7d6e5019@posting.google.com...
>> I would be interested in using Forth to handle Unicode. Sounds like a
>> Windows API thing.
>
>So would I; there are some stumbling blocks (some of which have been aired
>on clf before).
These issues have been debated before. At the time of ANS/ISO
Forth, I was tasked to do the internationalisation proposal, but
didn't finish in time. The draft proposals for internationalisation
and wide character sets are available from:
www.mpeltd.demon.co.uk/arena.htm
These were done with considerable help from Peter Knaggs, Nick
Nelson and Willem Botha. Nick and Willem have done Japanese and
Chinese. A sample implementation of the proposal is provided
with all versions of VFX Forth in:
Lib\International.fth
Stephen
--
Stephen Pelc, stephenXXX@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads
|
stephenXXX1 (459)
|
6/4/2004 12:53:04 AM
|
|
According to Alex McDonald <alex_mcd@btopenworld.com>:
> 2. What happens to C@ C! CHAR COUNT etc. My preference is for U@
> (or UC@) etc, and leave Cx alone. However, the non-normative annex
> "A.3.1.2 Character types" suggests that C@ and C! deal with true
> "characters"; I find the whole section confusing in the light of, say,
> a UTF-16 implementation of Forth.
Actually, UTF-8 is a trick used to encode arbitrary Unicode characters
into what looks like an inconspicuous byte-oriented character string.
I have done some programming work related to UTF-8 in C. C is byte
oriented(*) and provides some basic string manipulation functions such
as strlen(), which you really want to use because they are often much
more efficient than hand-coded versions. However, X.509 and other
security-related standards seem to love Unicode strings, so it makes
sense to handle such strings UTF-8 encoded.
It so happens that most communication protocols which can be used to
transfer arbitrary strings use ASCII characters for management: for
instance, in an e-mail, or for SMTP, or NNTP, header lines begin with
an _ASCII_ name, followed by a colon and then arbitrary bytes, up to
the CR+LF which ends the line. UTF-8 is such that ASCII characters
are encoded unmodified, and all Unicode characters from U+0080 upward
are encoded as sequences of octets with values between 128 and 253.
Therefore, a simple ASCII-matcher can work with UTF-8: that several
successive bytes with values of 128 or above contribute to one or
several Unicode characters is of no importance. The software just has
to handle them as a sequence of bytes of undetermined semantics. Hence,
nothing needs to be done in order to handle UTF-8 as long as "important"
characters (those upon whose value the software has to make decisions)
are ASCII.
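A small sketch of this point, in Python for brevity: the header splitter below looks only at ASCII-valued bytes (the colon, space, CR and LF) and passes everything else through untouched. The function name `split_header` and the sample header are assumptions made for illustration, not part of any mail or news library.

```python
def split_header(line: bytes):
    """Split b'Name: value\\r\\n' by looking only at ASCII-valued bytes."""
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("not a header line")
    # Strip the space after the colon and the CR+LF terminator.
    return name, value.strip(b" \r\n")

# The value is Greek text encoded as UTF-8; the splitter never decodes it.
greek = "\u039a\u03cd\u03c1\u03b9\u03b5".encode("utf-8")
name, value = split_header(b"Subject: " + greek + b"\r\n")

print(name)                            # b'Subject'
print(all(b >= 0x80 for b in value))   # True
```

Because every byte of a multi-byte UTF-8 sequence is 128 or above, none of them can be mistaken for the colon or the line terminator.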
Therefore, in my C software, the only UTF-8-aware functions I had to
program are a format checker (just to see if this is valid UTF-8) and
a character counter (because in ASN.1, size constraints on UTF8String
apply to the number of characters and not to the byte length). All
other string manipulations use standard functions such as strlen() or
sprintf().
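The two UTF-8-aware helpers mentioned above are small. The following Python sketch shows one possible shape for them, not the original C code; the function names are hypothetical. The checker leans on Python's strict built-in decoder, and the counter counts every byte that is not a continuation byte (bit pattern 10xxxxxx).

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Format checker: True if raw decodes as well-formed UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def utf8_char_count(raw: bytes) -> int:
    """Count characters by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in raw if (b & 0xC0) != 0x80)

# n, a, i-diaeresis, v, e, space, one CJK character: 7 characters, 10 bytes.
sample = "na\u00efve \u4e2d".encode("utf-8")

print(is_valid_utf8(sample), len(sample), utf8_char_count(sample))  # True 10 7
print(is_valid_utf8(b"\xc3"))  # False
```

The counter is only meaningful on input that has already passed the checker.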
Characters are a graphic notion; hence, Unicode characters and their
properties (graphical width and height, combining characters, visual
ordering...) must be taken into account when dealing with input and
output within a user interface. What I am trying to say in this message
is the following: if a user interface is NOT involved, then Unicode
characters very rarely need to be recognized; they MAY be handled UTF-8
encoded, as opaque chunks. In that way, most Forth programs which run
on octet-oriented architectures and work with strings are already as
Unicode-aware as they need be, through UTF-8 encoding.
Some ANS Forth words deal with graphical I/O (e.g., AT-XY) and these
would have to be modified in order to fully support Unicode.
UTF-16 is to so-called "BMP" what UTF-8 is to ASCII. What I call "BMP"
is a string encoding where each character is represented as a 16-bit
value. This works with the first plane of Unicode. This is what is done
in Java. This is also what was done in Windows, when Unicode had only
one plane. When Unicode got extended, UTF-16 was designed (and provides
access to a bit more than one million possible character values) and
Microsoft adopted it in its system. This is a good illustration:
conversion from BMP to UTF-16 was easy and they could adopt it
rather effortlessly and without compromising binary compatibility with
applications.
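To make the compatibility point concrete: a first-plane character occupies one 16-bit unit in UTF-16, exactly as in the older fixed-width scheme, while a character beyond U+FFFF is split into a surrogate pair. A minimal Python sketch, with helper names chosen here for illustration:

```python
def utf16_units(ch: str):
    """Return the 16-bit code units that encode ch in UTF-16."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

def surrogate_pair(cp: int):
    """Compute the surrogate pair for a code point above U+FFFF by hand."""
    v = cp - 0x10000                      # 20 bits remain
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

# Hebrew alef is in the first plane: one unit, same value as the code point.
print([hex(u) for u in utf16_units("\u05d0")])       # ['0x5d0']

# U+1D11E (musical G clef) lies outside the first plane: two units.
print([hex(u) for u in utf16_units("\U0001d11e")])   # ['0xd834', '0xdd1e']
print([hex(u) for u in surrogate_pair(0x1D11E)])     # ['0xd834', '0xdd1e']
```

Software that treats each 16-bit unit as opaque keeps working with such text, in the same way that byte-oriented software keeps working with UTF-8.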
Forth architectures which already have 16-bit characters with Unicode
values can then use UTF-16 in quite the same way.
> Given the above, I'd say that under Windows UTF-8 is the way to go
The main distinct advantage of UTF-8 over UTF-16 is that UTF-8 has
no endianness problem: hence, it can be used for exchanging data with
a minimal amount of problems.
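The endianness point can be shown with a single character. In this Python sketch (the variable names are illustrative), the same Hebrew letter has two possible UTF-16 byte layouts but only one UTF-8 layout, and reading UTF-16 with the wrong byte order silently yields a different character.

```python
alef = "\u05d0"  # Hebrew letter alef

big = alef.encode("utf-16-be")     # high byte first
little = alef.encode("utf-16-le")  # low byte first
utf8 = alef.encode("utf-8")        # a byte sequence, no byte order to choose

print(big, little, utf8)  # b'\x05\xd0' b'\xd0\x05' b'\xd7\x90'

# Decoding big-endian bytes as little-endian gives U+D005 instead of U+05D0.
misread = big.decode("utf-16-le")
print(hex(ord(misread)))  # 0xd005
```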
--Thomas Pornin
(*) I cheat a bit here. In the C world, "byte" means "unsigned char,
at least 8 bits", which is not the same as "octet". C compilers with
non-octet bytes have existed and still exist on some architectures
(mostly micro-controllers for embedded systems), but this breaks too
much "desktop computer" code; therefore, C compilers on Windows or Unix
machines are very unlikely to have bytes which are not octets. For
instance, when you deal with Internet TCP connections, you exchange
octets.
|
pornin (75)
|
6/4/2004 7:56:02 AM
|
|
"Alex McDonald" <alex_mcd@btopenworld.com> wrote in message news:<c9o6r9$f38$1@sparta.btinternet.com>...
> UTF encodings were obviously not considered at that time; the timeline at
> www.unicode.org suggests a date of around 1990-92 for Unicode 1.0, which
> predates ANS Forth by a couple of years. The standard precludes any sensible
> workarounds for a lot of this; the standard looks back to ANSI X3.4-1974 and
> ISO 646-1983. This is not a criticism of the standard, btw, just an
> observation...
Yet while the standard assumes that characters are coherent, uniform
units, it does not assume that characters are byte wide.
> 1. Windows is UTF-16 based; UTF-8 is barely tolerated, if at all.
So a 16-bit char would go with this.
> 2. What happens to C@ C! CHAR COUNT etc. My preference is for U@ (or UC@)
> etc, and leave Cx alone. However, the non-normative annex "A.3.1.2 Character
> types" suggests that C@ and C! deal with true "characters"; I find the whole
> section confusing in the light of, say, a UTF-16 implementation of Forth.
It gets confusing with UTF-8, sure. But with UTF16, C@ fetches a
16-bit wide character, C! stores one, COUNT extracts a 16-bit wide
char-sized count from a string of 16-bit wide characters. With the
previous byte-char standards embedded in Unicode, that's
straightforward ... the standard characters have the same values, they
just occupy more space than before.
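A minimal sketch of that reading, modelled in Python rather than Forth. Everything here is an assumption made for illustration: memory is a flat byte array, a character is two address units stored little-endian, and c_store, c_fetch, count and place are stand-ins for C!, C@, COUNT and PLACE, not code from any real system. Only BMP characters fit in one unit.

```python
CHAR_SIZE = 2  # address units (bytes) per character; assumed for this sketch

def c_store(mem, addr, value):
    """Model of C!: store one 16-bit character at addr."""
    mem[addr:addr + CHAR_SIZE] = value.to_bytes(CHAR_SIZE, "little")

def c_fetch(mem, addr):
    """Model of C@: fetch one 16-bit character from addr."""
    return int.from_bytes(mem[addr:addr + CHAR_SIZE], "little")

def count(mem, addr):
    """Model of COUNT: the count is itself one char wide."""
    return addr + CHAR_SIZE, c_fetch(mem, addr)

def place(mem, addr, text):
    """Store text as a counted string of 16-bit characters (BMP only)."""
    c_store(mem, addr, len(text))
    for i, ch in enumerate(text):
        c_store(mem, addr + CHAR_SIZE * (i + 1), ord(ch))

mem = bytearray(64)
place(mem, 0, "\u03a9k")  # GREEK CAPITAL LETTER OMEGA followed by "k"
start, length = count(mem, 0)
chars = [c_fetch(mem, start + i * CHAR_SIZE) for i in range(length)]
print(start, length, [hex(c) for c in chars])
```

The standard character "k" keeps its value 0x6B; it simply takes two address units instead of one.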
> 3. Lengths, for instance READ-LINE WRITE-LINE ; are they bytes or words if
> UTF-16? Do they differ from READ-FILE and WRITE-FILE ? The ANS standard
> x-FILE and x-LINE words specifically mention chracters.
Words. Don't differ, all characters.
We've had this discussion before ... if you stick to the ANS Forth
standard, you can work in characters in the abstract and cells in the
abstract, but you do not have an octet in particular. So it would be
O@, O!, OCOUNT, etc. required to disentangle chars-as-chars and
chars-as-octets.
> Windows is pretty schizoid on this subject, even with the support of type
> TCHAR; some functions return bytes, others characters (ie words for Unicode
> code). Strong typing doesn't always help.
More headaches for the implementor of a Windows Forth. There but for
the grace of God goes you or I ... but that's Windows. Conversion
between chars and octets would be required for those functions that
are a legacy of byte=char days.
> 4. Block support; probably academic on Windows anyhow.
1000 chars per block, 2K blocks. A block of standard character codes
has to be portable, so it's size that gives way.
> Given the above, I'd say that under Windows UTF-8 is the way to go;
Basically, the above that points toward UTF-8 is implicitly a heavy
reliance on the size of characters being exactly 8 bits, everything else
points to UTF-16.
> I know (1) is a problem, but translating to UTF-16 isn't too difficult,
> and Cx and friends could remain byte based, with some Ux words to help.
I dunno. I reckon if the code is well-written, it would be easier
to provide some Ox words to cover for Cx words that are talking
about small integers instead of characters as such, and leave
character strings as simple strings of uniform sized units. The
UTF-16 port is a matter of doing a scan for all char-using words
and deciding if they are actually characters or are small integers,
and if so whether it is necessary to convert them to bytes, while
the UTF-8 port has to introduce juggling between the length of
strings and the number of chars in a string.
> I can't say I can see how a clean implementation would work, though; I would
> appreciate a discussion of annex "A.3.1.2 Character types" based on this
> requirement.
I think that a clean implementation would either:
- rely on UTF-16 and the ANS Forth abstractions
- rely on UTF-8 and come up with suitable abstractions for the
variable-sized character problem.
Indeed, it may be that for processing UTF-16 is better anyway,
and UTF-8 is more for file/stream input and output, in which
case the extension might be more focused on extending the
file access modes to open files for UTF8 access, doing
the conversion during the file reads and writes.
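A rough Python sketch of that last idea: UTF-8 on the file side, uniform 16-bit units in memory, with the conversion done inside the read and write words. The names read_line_units and write_line_units are hypothetical, and an in-memory stream stands in for a file opened in the proposed UTF-8 access mode.

```python
import io

def to_units(text):
    """Split text into 16-bit UTF-16 code units."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

def from_units(units):
    """Rebuild text from a list of 16-bit UTF-16 code units."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le")

def read_line_units(stream):
    """Hypothetical UTF-8 access mode: octets in, 16-bit units out."""
    line = stream.readline().decode("utf-8")
    return to_units(line.rstrip("\n"))

def write_line_units(stream, units):
    """Hypothetical UTF-8 access mode: 16-bit units in, octets out."""
    stream.write(from_units(units).encode("utf-8") + b"\n")

source = io.BytesIO("na\u00efve \u03c0\n".encode("utf-8"))
units = read_line_units(source)

sink = io.BytesIO()
write_line_units(sink, units)

print(len(source.getvalue()), "octets in the file,", len(units), "characters in memory")
```

The application only ever sees a character count; the octet count is a detail of the file layer.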
agila61 (3956)
6/4/2004 8:19:07 AM

"Stephen Pelc" <stephenXXX@INVALID.mpeltd.demon.co.uk> wrote in message
news:40bfc681.441653609@192.168.0.1...
> On Thu, 3 Jun 2004 21:56:57 +0000 (UTC), "Alex McDonald"
> <alex_mcd@btopenworld.com> wrote:
>
> >"Brad Eckert" <nospaambrad1@tinyboot.com> wrote in message
> >news:7d4cc56.0406031246.7d6e5019@posting.google.com...
> >> I would be interested in using Forth to handle Unicode. Sounds like a
> >> Windows API thing.
> >
> >So would I; there are some stumbling blocks (some of which have been
> >aired on clf before).
>
> These issues have been debated before. At the time of ANS/ISO
> Forth, I was tasked to do the internationalisation proposal, but
> didn't finish in time. The draft proposals for internationalisation
> and wide character sets are available from:
> www.mpeltd.demon.co.uk/arena.htm
>
> These were done with considerable help from Peter Knaggs, Nick
> Nelson and Willem Botha. Nick and Willem have done Japanese and
> Chinese. A sample implementation of the proposal is provided
> with all versions of VFX Forth in:
> Lib\International.fth
>
Thank you. I shall review them. I've never downloaded VFX (or SwiftForth)
for fear of "stealing" code into Win32Forth. Without downloading it, can you
tell me what the copyright is on Lib\International.fth?
--
Regards
Alex McDonald
alex_mcd (751)
6/4/2004 8:46:44 AM

On Fri, 4 Jun 2004 08:46:44 +0000 (UTC), "Alex McDonald"
<alex_mcd@btopenworld.com> wrote:
>Thank you. I shall review them. I've never downloaded VFX (or SwiftForth)
>for fear of "stealing" code into Win32Forth.
Thank you. That makes a change.
> Without downloading it, can you
>tell me what the copyright is on Lib\International.fth?
From the file header:
((
Copyright (c) 2001
MicroProcessor Engineering
133 Hill Lane
Southampton SO15 5AF
England
tel: +44 (0)23 8063 1441
fax: +44 (0)23 8033 9691
net: mpe@mpeltd.demon.co.uk
tech-support@mpeltd.demon.co.uk
web: www.mpeltd.demon.co.uk
From North America, our telephone and fax numbers are:
011 44 23 8063 1441
011 44 23 8033 9691
PLEASE NOTE THAT OUR PHONE NUMBER CHANGED IN APRIL 2000
You are free to use this code in any way, as long as the MPE
copyright notice in this section is retained.
This code is an implementation of the draft ANS internationalisation
specification available from the download area of the MPE web site.
The implementation provides more functionality than is required by
the ANS draft standard and provides enough hooks to be the basis of
a practical system.
To do
=====
Change history
==============
20031210 SFP003 Updated for VFX Forth 3.60
20010425 SFP002 Added GET-ESCAPE
20010326 SFP001 First release
))
Stephen
--
Stephen Pelc, stephenXXX@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads
stephenXXX1 (459)
6/4/2004 12:17:28 PM

In article <10bv1teju352h72@corp.supernews.com>,
Guy Macon <http://www.guymacon.com> wrote:
<SNIP>
>
>This does not answer my question. I didn't ask whether Leo Wong
>would choose to post about a Java applet that displays pornographic
>stories in a Forth newsgroup. I can guess the answer to that question.
>I asked whether it would be OK for Leo Wong to post about a Java applet
>that displays pornographic stories in a Forth newsgroup should he
>choose to do so. The question I actually asked remains unanswered.
Let me answer that question. If a Forth regular would put up a
request to have his scanned images of Penthouse completed
with some 1997 issues, I would likewise have responded in a
mildly ironic fashion, like I did.
I wouldn't be offended and don't expect others to be.
I am a disbeliever myself.
Groetjes Albert
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
One man-hour to invent,
One man-week to implement,
One lawyer-year to patent.
albert37 (2989)
6/4/2004 2:16:00 PM

In article <c8cbc925.0406040019.3376b3f3@posting.google.com>,
Dr. Bruce R. McFarling <agila61@netscape.net> wrote:
>
>I think that a clean implementation would either:
>- rely on UTF-16 and the ANS Forth abstractions
>- rely on UTF-8 and come up with suitable abstractions for the
>variable-sized character problem.
>
>Indeed, it may be that for processing UTF-16 is better anyway,
>and UTF-8 is more for file/stream input and output, in which
>case the extension might be more focused on extending the
>file access modes to open files for UTF8 access, doing
>the conversion during the file reads and writes.
I think the either is embarrassing.
IMO a clean implementation of Forth wouldn't presume either
character mode. It would use address units internally,
and also for such things as READ-FILE.
Unfortunately MOVE is the only word that can access address units.
I would like to plug in a selection of UTF-8 or UTF-16.
I would like to use the address units to implement byte based
protocols, such as network packets, independent of the
character type. Imagine, even COUNT changes between UTF-8 and
UTF-16 !
Groetjes Albert
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
One man-hour to invent,
One man-week to implement,
One lawyer-year to patent.
albert37 (2989)
6/4/2004 2:30:21 PM

The Forth program prints Happy New Year in 70 languages.
This weekend I'll add Cornish to the Java program.
Leo Wong
http://www.albany.net/~hello/
hello988 (120)
6/4/2004 3:03:01 PM

On 4 Jun 2004 01:19:07 -0700, agila61@netscape.net (Dr. Bruce R.
McFarling) wrote:
>> 1. Windows is UTF-16 based; UTF-8 is barely tolerated, if at all.
>
>So a 16-bit char would go with this.
This misses the point in Forth. There are three character sets
which are simultaneously available:
DCS - Development Character Set used by the underlying Forth.
Usually ISO Latin 1. The DCS is usually 8 bits to avoid breaking
code that makes the assumption char=byte=octet.
ACS - Application Character Set used by the Forth *application*
for display when running.
OCS - Operating system Character Set used by the underlying
operating system.
For an example of this distinction consider an application written
using VFX Forth (DCS) set up for a Russian speaker (ACS) running on a
Chinese version of Windows (OCS). Yes, this is not an uncommon
requirement for some of our clients.
Stephen
--
Stephen Pelc, stephenXXX@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads
stephenXXX1 (459)
6/4/2004 4:31:50 PM

agila61@netscape.net (Dr. Bruce R. McFarling) wrote in message news:<c8cbc925.0406040019.3376b3f3@posting.google.com>...
> > 3. Lengths, for instance READ-LINE WRITE-LINE ; are they bytes or words if
> > UTF-16? Do they differ from READ-FILE and WRITE-FILE ? The ANS standard
> > x-FILE and x-LINE words specifically mention chracters.
>
> Words. Don't differ, all characters.
The standard clearly states 'characters' not 'words' in 11.6.1.2090
READ-LINE ( char-adr u1 fileid -- u2 flag ior )
"u2 is the number of characters"
On a 32-bit cell based Forth with byte addressing a cell would be 4
address units and if Unicode was being used a character would be
2 address units. Words and characters are not the same size unless
it happened to be a 16-bit implementation.
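The arithmetic can be spelled out with a toy model, here in Python. The class name ForthModel and its methods are invented for this sketch; chars and cells merely mimic what CHARS and CELLS would return on a system with the stated sizes.

```python
class ForthModel:
    """Sizes of address unit, character and cell, all in bits (assumed)."""

    def __init__(self, au_bits, char_bits, cell_bits):
        # This sketch assumes chars and cells are whole numbers of address units.
        assert char_bits % au_bits == 0 and cell_bits % au_bits == 0
        self.au_bits = au_bits
        self.char_bits = char_bits
        self.cell_bits = cell_bits

    def chars(self, n):
        """Address units occupied by n characters (what CHARS would give)."""
        return n * self.char_bits // self.au_bits

    def cells(self, n):
        """Address units occupied by n cells (what CELLS would give)."""
        return n * self.cell_bits // self.au_bits

# 32-bit cells, byte addressing, 16-bit characters.
wide = ForthModel(au_bits=8, char_bits=16, cell_bits=32)
# A 16-bit implementation, where a character and a cell coincide.
small = ForthModel(au_bits=8, char_bits=16, cell_bits=16)

print(wide.chars(80), wide.cells(1), small.chars(1), small.cells(1))
```

So an 80-character line buffer on the first system takes 160 address units, and a u2 returned by READ-LINE has to be passed through CHARS before it is used as an address offset.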
> We've had this discussion before ... if you stick to the ANS Forth
> standard, you can work in characters in the abstract and cells in the
> abstract, but you do not have an octet in particular. So it would be
> O@, O!, OCOUNT, etc. required to disentangle chars-as-chars and
> chars-as-octets.
So when he asked what u2 is supposed to be in those words why did
you reply words instead of characters? Maybe I misread your two
sentences. "Words." looked like your answer to his question.
"Don't differ" has no stated subject, perhaps meant as an
imperative, but then "all characters." seemed to contradict
your first statement that u2 is a count in words.
> I think that a clean implementation would either:
> - rely on UTF-16 and the ANS Forth abstractions
> - rely on UTF-8 and come up with suitable abstractions for the
> variable-sized character problem.
So cells can only be one size in an implementation but characters
can have variable size. That does complicate the conversion
between cells, characters and addressing units.
Best Wishes
fox21 (1833)
6/4/2004 5:08:18 PM

agila61@netscape.net (Dr. Bruce R. McFarling) wrote in message news:<c8cbc925.0406040019.3376b3f3@posting.google.com>...
> "Alex McDonald" <alex_mcd@btopenworld.com> wrote in message news:<c9o6r9$f38$1@sparta.btinternet.com>...
>
> > UTF encodings were obviosly not considered at that time; the timeline at
> > www.unicode.org suggests a date of around 1990-92 for Unicode 1.0, which
> > predates ANS Forth by a couple of years. The standard precludes any sensible
> > workarounds for a lot of this; the standard looks back to ANSI X3.4-1974 and
> > ISO 646-1983. This is not a criticism of the standard, btw, just an
> > observation...
>
> Yet while the standard assume that characters are coherent, uniform
> units, it does not assume that characters are byte wide.
>
> > 1. Windows is UTF-16 based; UTF-8 is barely tolerated, if at all.
>
> So a 16-bit char would go with this.
There's also UTF-32 to think about. But that could be handled the same
way.
> I think that a clean implementation would either:
> - rely on UTF-16 and the ANS Forth abstractions
> - rely on UTF-8 and come up with suitable abstractions for the
> variable-sized character problem.
>
> Indeed, it may be that for processing UTF-16 is better anyway,
> and UTF-8 is more for file/stream input and output, in which
> case the extension might be more focused on extending the
> file access modes to open files for UTF8 access, doing
> the conversion during the file reads and writes.
I wrote code for UTF-8 file i/o a while back.
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&safe=off&selm=8muchg%24jn1%241%40nnrp1.deja.com&rnum=1
One potential problem is that UTF-8 can encode UTF-16 or UTF-32.
I'm not sure if there's a way to tell this without scanning
through the entire file. Perhaps the "safe" thing would be
to assume UTF-32 and "compress" to UTF-16 if you're concerned
about saving space?
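On the scanning question, a hedged Python sketch. It relies on one property of well-formed UTF-8: code points above U+FFFF are exactly the ones encoded as four-byte sequences, and those start with a lead byte of 0xF0 or higher. A scan is still needed, but it never has to decode anything. The function name is made up for this example, and the input is assumed to be valid UTF-8.

```python
def needs_more_than_16_bits(utf8_bytes):
    """True if valid UTF-8 data holds any code point above U+FFFF."""
    # Lead bytes 0xF0 and above only occur at the start of 4-byte sequences.
    return any(b >= 0xF0 for b in utf8_bytes)

bmp_only = "na\u00efve \u20ac".encode("utf-8")      # Latin-1 and the euro sign
with_astral = "clef \U0001D11E".encode("utf-8")     # U+1D11E is outside the BMP

print(needs_more_than_16_bits(bmp_only), needs_more_than_16_bits(with_astral))
```

If the scan comes back false, one 16-bit unit per character is enough for that file; if it comes back true, either surrogate pairs or 32-bit units are needed.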
Regards,
John M. Drake
jmdrake_98 (317)
6/4/2004 8:24:42 PM

"Jeff Fox" <fox@ultratechnology.com> wrote in message
news:4fbeeb5a.0406040908.7f7b4e65@posting.google.com...
> agila61@netscape.net (Dr. Bruce R. McFarling) wrote in message
news:<c8cbc925.0406040019.3376b3f3@posting.google.com>...
> > > 3. Lengths, for instance READ-LINE WRITE-LINE ; are they bytes or
> > > words if UTF-16? Do they differ from READ-FILE and WRITE-FILE ? The ANS
> > > standard x-FILE and x-LINE words specifically mention characters.
> >
> > Words. Don't differ, all characters.
>
> The standard clearly states 'characters' not 'words' in 11.6.1.2090
> READ-LINE ( char-adr u1 fileid -- u2 flag ior )
> "u2 is the number of characters"
>
> On a 32-bit cell based Forth with byte addressing a cell would be 4
> address units and if unicode was being used a character would be
> 2 address units. Words and characters are not the same size unless
> it happened to be a 16-bit implemenation.
>
To clarify; I meant word as in 16 bits, 2 bytes (the "standard" definition of
a word in Windows; references to dwords are common when 4 bytes are meant).
Does that change the argument any? Certainly, being only able to read or
write an even number of bytes seems to be an issue, and unnatural when the
target of these words is a stream of (normally) octets.
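One way to picture the even-byte concern, sketched in Python: when the source is a stream of octets, a read can stop in the middle of a 16-bit unit, so whatever sits between the stream and the character-level words has to hold the odd octet until its partner arrives. The sketch below uses the incremental decoder from Python's standard codecs module; the function name decode_chunks is made up, and the chunk boundaries are contrived to split a character.

```python
import codecs

def decode_chunks(chunks):
    """Decode UTF-16LE octets arriving in arbitrarily sized chunks."""
    decoder = codecs.getincrementaldecoder("utf-16-le")()
    pieces = []
    for chunk in chunks:
        # An odd trailing octet is buffered inside the decoder, not lost.
        pieces.append(decoder.decode(chunk))
    # final=True raises an error if a dangling octet is left at the end.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)

# "h" followed by U+00E9, split so the second character straddles two reads.
text = decode_chunks([b"h\x00\xe9", b"\x00"])
print(len(text))
```

A Forth file layer working in 16-bit characters would need the same kind of one-octet holding buffer, or would have to refuse odd-length transfers outright.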
--
Regards
Alex McDonald
alex_mcd (751)
6/4/2004 8:42:57 PM

"jmdrake" <jmdrake_98@yahoo.com> wrote in message
news:e20a4a47.0406041224.450429ae@posting.google.com...
===snipped
>
> I wrote code for UTF-8 file i/o a while back.
>
>
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&safe=off&selm=8muchg%24jn1%241%40nnrp1.deja.com&rnum=1
Thanks. The link to the website appears to be dead. Presumably it's the same
information as in chapter 3 of the Unicode standard; there's a table that
documents the 8->16->32 encodings.
--
Regards
Alex McDonald
alex_mcd (751)
6/4/2004 11:36:54 PM

"Stephen Pelc" <stephenXXX@INVALID.mpeltd.demon.co.uk> wrote in message
news:40c068fb.483248203@192.168.0.1...
> On 4 Jun 2004 01:19:07 -0700, agila61@netscape.net (Dr. Bruce R.
> McFarling) wrote:
>
> >> 1. Windows is UTF-16 based; UTF-8 is barely tolerated, if at all.
> >
> >So a 16-bit char would go with this.
>
> This misses the point in Forth. There are three character sets
> which are simultaneously available:
>
> DCS - Development Character Set used by the underlying Forth.
> Usually ISO Latin 1. The DCS is usually 8 bits to avoid breaking
> code that makes the assumption char=byte=octet.
>
> ACS - Application Character Set used by the Forth *application*
> for display when running.
7 bit DCS part? Or is the divorce between the ACS display attributes and the
underlying Forth absolute? In other words, does the DCS part of this Forth
support, for instance, TYPE , or is that part of the ACS? Or (more likely)
am I misunderstanding?
--
Regards
Alex McDonald
alex_mcd (751)
6/4/2004 11:36:55 PM

"Stephen Pelc" <stephenXXX@INVALID.mpeltd.demon.co.uk> wrote in message
news:40c06796.482890765@192.168.0.1...
> On Fri, 4 Jun 2004 08:46:44 +0000 (UTC), "Alex McDonald"
> <alex_mcd@btopenworld.com> wrote:
>
> >Thank you. I shall review them. I've never downloaded VFX (or SwiftForth)
> >for fear of "stealing" code into Win32Forth.
> Thank you. That makes a change.
No problem. Ideas I copy; I prefer my own code to anyone else's.
>
> > Without downloading it, can you
> >tell me what the copyright is on Lib\International.fth?
> From the file header:
>
===snipped
I'll read the docs first. Thanks for the "open source" licence; I may allow
myself to borrow that one file...
--
Regards
Alex McDonald
alex_mcd (751)
6/4/2004 11:44:59 PM

Albert van der Horst <albert@spenarnc.xs4all.nl> says...
>
>Guy Macon wrote:
>>
>>This does not answer my question. I didn't ask whether Leo Wong
>>would choose to post about a Java applet that displays pornographic
>>stories in a Forth newsgroup. I can guess the answer to that question.
>>I asked whether it would be OK for Leo Wong to post about a Java applet
>>that displays pornographic stories in a Forth newsgroup should he
>>choose to do so. The question I actually asked remains unanswered.
>
>Let me answer that question. If a Forth regular would put up a
>request to have his scanned images of Penthouse completed
>with some 1997 issues, I would likewise have responded in a
>mildly ironic fashion, like I did.
>
>I wouldn't be offended and don't expect others to be.
I wouldn't have been offended in either case, and given the lack of
recent participation and the content being indistinguishable from
typical spam, I had no way of knowing that he was a Forth regular,
so I would have asked the
Guy
6/5/2004 4:34:55 AM

"Alex McDonald" <alex_mcd@btopenworld.com> wrote in message news:<c9qmsh$a4g$1@sparta.btinternet.com>...
> To clarify; I meant word as in 16bits, 2bytes (the "standard" definition of
> a word in Windows; references to dwords is common when 4bytes are meant).
> Does that change the argument any? Certainly, being only able to read or
> write an even number of bytes seems to be an issue, and unnatural when the
> target of these words is a stream of (normally) octets.
OK. I was probably just going a little overboard in following up
on Dr. McFarling's posts and trying to bridge the gap between 1980
views and 2004 views. I was also thinking of word addressing machines
where words could be 20-bits or 24-bits or 32-bits or whatever and
where 16-bit characters might end up getting packed in various ways.
I like the ANS notion of referring to cells and characters. I think
the idea that cell size must be fixed for an implementation is fine
although some machines have different addressing units in different
address spaces. And if a system can have some characters of different
sizes in one implementation such as variable bit size, 7-bit, 8-bit,
16-bit, 20-bit or whatever then there are complications regarding
converting between character counts and addressing, but what can
one do other than address those as they come up?
Best Wishes
fox21 (1833)
6/5/2004 5:16:27 AM

On Fri, 4 Jun 2004 23:36:55 +0000 (UTC), "Alex McDonald"
<alex_mcd@btopenworld.com> wrote:
>7 bit DCS part? Or is the divorce between the ACS display attributes and the
>underlying Forth absolute? In other words, does the DCS part of this Forth
>support, for instance, TYPE , or is that part of the ACS? Or (more likely)
>am I misunderstanding?
In all Forths that I have seen, the character words such as C@
and TYPE follow the DCS, which is fixed. The ACS may vary during
run-time, e.g. in a translation program. For all practical
purposes, the OCS is fixed for one run-time of the application,
but may vary from machine to machine, as when running the same
application in different countries.
See the internationalisation and wide character set papers on
our website
www.mpeltd.demon.co.uk/arena.htm
Stephen
--
Stephen Pelc, stephenXXX@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads
stephenXXX1 (459)
6/5/2004 12:26:52 PM

Guy Macon wrote:
> Albert van der Horst <albert@spenarnc.xs4all.nl> says...
>
>>Guy Macon wrote:
>>
>>>This does not answer my question. I didn't ask whether Leo Wong
>>>would choose to post about a Java applet that displays pornographic
>>>stories in a Forth newsgroup. I can guess the answer to that question.
>>>I asked whether it would be OK for Leo Wong to post about a Java applet
>>>that displays pornographic stories in a Forth newsgroup should he
>>>choose to do so. The question I actually asked remains unanswered.
>>
>>Let me answer that question. If a Forth regular would put up a
>>request to have his scanned images of Penthouse completed
>>with some 1997 issues, I would likewise have responded in a
>>mildly ironic fashion, like I did.
>>
>>I wouldn't be offended and don't expect others to be.
>
>
> I wouldn't have been offended in either case, and given the lack of
> recent participation and the content being indistinguishable from
> typical spam, I had no way of knowing that he was a Forth regular,
> so I would have asked the
>
This is not intended as flame bait, and I hope is not taken as such.
Your post seems to have been cut off at "the".
Then you and Albert van der Horst agree that OT=pornography and equally
acceptable from a Forth regular?
Or perhaps, leaving out OT=pornography:
OT and pornography are equally acceptable from a Forth regular.
-- Guy Macon and Albert van der Horst
Again, please don't respond with a flame.
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
6/5/2004 3:01:50 PM

Thomas Pornin wrote:
...
> (*) I cheat a bit here. In the C world, "byte" means "unsigned char,
> at least 8 bits", which is not the same as "octet". C compilers with
> non-octet bytes have existed and still exist on some architectures
> (mostly micro-controllers for embedded systems), but this breaks too
> much "desktop computer" code; therefore, C compilers on Windows or Unix
> machines are very unlikely to have bytes which are not octets. For
> instance, when you deal with Internet TCP connections, you exchange
> octets.
The definition of a byte in ANSI C depends on the architecture of the
machine that the compiler targets and the size of a character's
representation. It is not optional. A byte is the largest of
* the smallest addressable unit of memory
* the number of bits used to represent a character
* 8 bits
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/5/2004 3:21:37 PM

Leo Wong wrote:
> OT and pornography are equally acceptable from a Forth regular.
> -- Guy Macon and Albert van der Horst
>
It may be that some here consider pornography OT, so a better
formulation might be:
Religion and pornography are equally acceptable from a Forth regular
-- Guy Macon and Albert van der Horst
It may be that Mr Macon's post after the "the" gives the answer, in
which case, please re-post.
Or, to take this away from comp.lang.forth, please post it to a group
where the answer would be on topic, and let me know the group's name.
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
6/5/2004 3:49:29 PM

Leo Wong <hello@albany.net> says...
>
>Leo Wong wrote:
>> OT and pornography are equally acceptable from a Forth regular.
>> -- Guy Macon and Albert van der Horst
>
>It may be that some here consider pornography OT, so a better
>formulation might be:
>
>Religion and pornography are equally acceptable from a Forth regular
> -- Guy Macon and Albert van der Horst
The use of "--" implies that the above is a quote from me rather than
Leo Wong's interpretation of what I wrote.
I do not find *any* off-topic posts to be acceptable, so "religion and
pornography posts in a Forth newsgroup are equally unacceptable - both
are Off-Topic" would be more accurate.
Despite my position that off-topic posts are unacceptable I do
agree that someone who regularly contributes high-quality on-topic
posts to a newsgroup should be allowed more leeway in the area of
off-topic posting (which means that I still dislike the off-topic
post but am likely to not say anything about it) than a non-contributor
or someone who posts low-quality material (off-topic posts, flame wars,
spam, etc.).
Guy
6/5/2004 11:46:42 PM

"Alex McDonald" <alex_mcd@btopenworld.com> wrote in message news:<c9r12l$6bf$1@hercules.btinternet.com>...
> "jmdrake" <jmdrake_98@yahoo.com> wrote in message
> news:e20a4a47.0406041224.450429ae@posting.google.com...
> ===snipped
>
> >
> > I wrote code for UTF-8 file i/o a while back.
> >
> >
> http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&safe=off&selm=8muchg%24jn1%241%40nnrp1.deja.com&rnum=1
>
> Thanks. The link to the website appears to be dead.
Well you can still hit it from the "Internet Wayback Machine".
http://web.archive.org/web/20030621133334/http://www.czyborra.com/utf/
> Presumably it's the same
> information as in chapter 3 of the Unicode standard; there's a table that
> documents the 8->16->32 encodings.
Quite likely.
Regards,
John M. Drake
jmdrake_98 (317)
6/6/2004 2:45:48 AM

Guy Macon wrote:
...
> I do not find *any* off-topic posts to be acceptable, ...
From the context, that seems to include even messages clearly so
marked. That seems quite unreasonable to me; there certainly is no
freedom from digression in comp.lang.forth or any other newsgroup I
frequent. I fear you must expect and accept frequent disappointment.
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/6/2004 3:19:57 AM

Guy Macon wrote:
>
> Despite my position that off-topic posts are unacceptable I do
> agree that someone who regularly contributes high-quality on-topic
> post to a newsgroup should be allowed more leeway in the area of
> off-topic posting (which means that I still dislike the off-topic
> post but am likely to not say anything about it) than a non-contributor
> or someone who posts low-quality material (off-topic posts, flame wars,
> spam, etc.).
>
Then we are at peace (though not necessarily agreeing on particulars),
and you are at liberty to deny that I am a regular contributor of high
quality posts.
God bless,
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
6/6/2004 3:27:09 AM

Jerry Avins wrote:
> Guy Macon wrote:
>
> ...
>
>> I do not find *any* off-topic posts to be acceptable, ...
>
>
> From the context, that seems to include even messages clearly so
> marked. That seems quite unreasonable to me; there certainly is no
> freedom from digression in comp.lang.forth or any other newsgroup I
> frequent. I fear you must expect and accept frequent disappointment.
>
> Jerry
I take it that Mr. Macon will not voice his disappointment without first
checking on the quality of the poster's previous contributions to clf.
I hope his notion of "regular" is not exacting since there are several
of us who don't post as often as we once did, and a few who only post
when they have something to say.
As to the vast number of off-topic musings posted in response to
supposedly on-topic subjects, we have I suppose become inured to them,
and indeed have found in some of them unexpected pleasure or food for
thought. I myself shall continue to adhere to the motto of the Abbey of
Theleme.
Leo
--
http://www.albany.net/~hello/
hello988 (120)
6/6/2004 3:59:50 AM

Leo Wong wrote:
> As to the vast number of off-topic musings posted in response to
> supposedly on-topic subjects, we have I suppose become inured to them,
> and indeed have found in some of them unexpected pleasure or food for
> thought. I myself shall continue to adhere to the motto of the Abbey of
> Theleme.
>
You mean adhere to the rules? I don't remember - was there a motto
as well?
etchernevnono (23)
6/6/2004 4:14:51 AM

Leo Wong <hello@albany.net> says...
>...you are at liberty to deny that I am a regular contributor
>of high quality posts.
If you expect people to think "Leo Wong is a regular contributor
of high quality posts - we should cut him some slack when he posts
Off-Topic" you must provide some way for them to know that Leo Wong
is a regular contributor of high quality posts. Common clues that
would tip someone off would be a Forth related .sig or email address
or a mention of Forth before launching into the post about Java. The
most common way would, of course, be for the reader to have seen some
recent high quality posts from Leo Wong. Let's assume that all Forth
posts by Leo Wong are high-quality and look at the record:
Previous Month:
http://groups.google.com/groups?q=group:comp.lang.forth+author:Leo+author:Wong&num=100&hl=en&lr=&ie=UTF-8&scoring=d&as_drrb=b&as_mind=5&as_minm=5&as_miny=2004&as_maxd=5&as_maxm=6&as_maxy=2004&filter=0
*No* comp.lang.forth posts about Forth by Leo Wong in the last month.
Previous six months:
http://groups.google.com/groups?q=group:comp.lang.forth+author:Leo+author:Wong&num=100&hl=en&lr=&ie=UTF-8&scoring=d&as_drrb=b&as_mind=5&as_minm=6&as_miny=2003&as_maxd=5&as_maxm=6&as_maxy=2004&filter=0
8 comp.lang.forth posts about Forth by Leo Wong in the last six months.
It appears that you stopped being a regular contributor of posts about Forth
at least six months ago.
Guy
6/6/2004 4:49:56 AM

Jerry Avins <jya@ieee.org> says...
>
>Guy Macon wrote:
>
>> I do not find *any* off-topic posts to be acceptable, ...
>
> From the context, that seems to include even messages clearly so
>marked. That seems quite unreasonable to me; there certainly is no
>freedom from digression in comp.lang.forth or any other newsgroup I
>frequent. I fear you must expect and accept frequent disappointment.
I have asked the following question of you several times and have yet
to get a straight answer out of you:
Are Off-Topic posts about pornography OK if they are clearly marked?
Or is it only *some* kinds of Off-Topic posts that are OK with you?
I don't expect you to answer the question asked this time either,
so I will make my own assumptions about what your answer is.
Guy
6/6/2004 4:51:08 AM
Guy Macon wrote:
> Leo Wong <hello@albany.net> says...
>
>
>>...you are at liberty to deny that I am a regular contributor
>>of high quality posts.
>
>
> If you expect people to think "Leo Wong is a regular contributor
> of high quality posts - we should cut him some slack when he posts
> Off-Topic" you must provide some way for them to know that Leo Wong
> is a regular contributor of high quality posts. Common clues that
> would tip someone off would be a Forth related .sig or email address
> or a mention of Forth before launching into the post about Java. The
> most common way would, of course, be for the reader to have seen some
> recent high quality posts from Leo Wong. Let's assume that all Forth
> posts by Leo Wong are high-quality and look at the record:
>
> Previous Month:
> http://groups.google.com/groups?q=group:comp.lang.forth+author:Leo+author:Wong&num=100&hl=en&lr=&ie=UTF-8&scoring=d&as_drrb=b&as_mind=5&as_minm=5&as_miny=2004&as_maxd=5&as_maxm=6&as_maxy=2004&filter=0
> *No* comp.lang.forth posts about Forth by Leo Wong in the last month.
>
> Previous six months:
> http://groups.google.com/groups?q=group:comp.lang.forth+author:Leo+author:Wong&num=100&hl=en&lr=&ie=UTF-8&scoring=d&as_drrb=b&as_mind=5&as_minm=6&as_miny=2003&as_maxd=5&as_maxm=6&as_maxy=2004&filter=0
> 8 comp.lang.forth posts about Forth by Leo Wong in the last six months.
>
> It appears that you stopped being a regular contributor of posts about Forth
> at least six months ago.
>
>
BAHH
Macon -- *NO POSTS* prior ~6 months [ 41 total, 26% this topic( 2
threads)]
Wong -- >900 posts over 10 years!
Who is regular?
rowlett10 (1881)
6/6/2004 7:18:47 AM
Richard Owlett <rowlett@atlascomm.net> says...
>Macon -- *NO POSTS*
>Who is regular?
Nobody is claiming that Macon can freely post off-topic on
the basis of being a "regular."
Guy
6/6/2004 8:18:11 AM
Elko Tchernev wrote:
> Leo Wong wrote:
>
>> As to the vast number of off-topic musings posted in response to
>> supposedly on-topic subjects, we have I suppose become inured to them,
>> and indeed have found in some of them unexpected pleasure or food for
>> thought. I myself shall continue to adhere to the motto of the Abbey
>> of Theleme.
>>
> You mean adhere to the rules? I don't remember - was there a motto
> as well?
"Do as you will"
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/6/2004 2:59:27 PM
Guy Macon wrote:
> Jerry Avins <jya@ieee.org> says...
>
>>Guy Macon wrote:
>>
>>
>>>I do not find *any* off-topic posts to be acceptable, ...
>>
>>From the context, that seems to include even messages clearly so
>>marked. That seems quite unreasonable to me; there certainly is no
>>freedom from digression in comp.lang.forth or any other newsgroup I
>>frequent. I fear you must expect and accept frequent disappointment.
>
>
> I have asked the following question of you several times and have yet
> to get a straight answer out of you:
>
> Are Off-Topic posts about pornography OK if they are clearly marked?
> Or is it only *some* kinds of Off-Topic posts that are OK with you?
>
> I don't expect you to answer the question asked this time either,
> so I will make my own assumptions about what your answer is.
I answered you, but you rejected my answer. To go a step further, if a
post were marked "OFF TOPIC PORN" and not part of a repeated pattern, I
would not report the author to his/her ISP. It would certainly affect my
opinion of him/her. All computer-related OT messages (as well as those
about hotel accommodations in cities known to participants here) are
well within the framework long established here. That's how it is.
Refraining from imagining the worst of motives when there are gaps in your
knowledge will probably lower your blood pressure. Your assumption that
Leo was advertising his web site and your subsequent justification of
the rudeness that that assumption engendered probably raised it. Peace!
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/6/2004 3:17:52 PM
Richard Owlett <rowlett@atlascomm.net> wrote in message news:<10c5h4u3qjntl3d@corp.supernews.com>...
> BAHH
> Macon -- *NO POSTS* prior ~6 months [ 41 total, 26% this topic( 2
> threads)]
>
> Wong -- >900 posts over 10 years!
>
> Who is regular?
I would like to point out that Mr. Macon has been a regular
contributor to comp.arch.embedded for quite a while and posted
hundreds of interesting posts there with quite a few interesting
references to Forth. I found his posts some of the most informative
even before I noticed that he had interesting information to
report about Forth.
He contributed a couple of very amusing posts on the subject of
usenet trolls there recently that make for an amusing read. He
has some interesting and educational stories at his website and
recently posted a pointer to a very interesting article about
'the trance state conjecture.' Good reading.
I think content is more important than the number of posts,
although those who judge the popularity of languages by the
number of posts in their associated usenet groups obviously
value the number of posts more than I. Trolling and flamewars
and off-topic threads do increase the number of total posts in
comp.lang.forth and help keep the post count higher than it
would be if people only talked about Forth here. ;-)
It is usenet and people will talk about who is full of it and
who is regular. :-)
Best Wishes
fox21 (1833)
6/6/2004 6:20:43 PM
Richard Owlett wrote:
> Wong -- >900 posts over 10 years!
Hmmm. How many of the >900 were off topic?
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
6/6/2004 10:43:51 PM
Guy Macon wrote:
>
> If you expect people to think "Leo Wong is a regular contributor
> of high quality posts - we should cut him some slack when he posts
> Off-Topic" you must provide some way for them to know that Leo Wong
> is a regular contributor of high quality posts. Common clues that
> would tip someone off would be a Forth related .sig or email address
> or a mention of Forth before launching into the post about Java. The
> most common way would, of course, be for the reader to have seen some
> recent high quality posts from Leo Wong. Let's assume that all Forth
> posts by Leo Wong are high-quality and look at the record:
>
> Previous Month:
> http://groups.google.com/groups?q=group:comp.lang.forth+author:Leo+author:Wong&num=100&hl=en&lr=&ie=UTF-8&scoring=d&as_drrb=b&as_mind=5&as_minm=5&as_miny=2004&as_maxd=5&as_maxm=6&as_maxy=2004&filter=0
> *No* comp.lang.forth posts about Forth by Leo Wong in the last month.
>
> Previous six months:
> http://groups.google.com/groups?q=group:comp.lang.forth+author:Leo+author:Wong&num=100&hl=en&lr=&ie=UTF-8&scoring=d&as_drrb=b&as_mind=5&as_minm=6&as_miny=2003&as_maxd=5&as_maxm=6&as_maxy=2004&filter=0
> 8 comp.lang.forth posts about Forth by Leo Wong in the last six months.
>
> It appears that you stopped being a regular contributor of posts about Forth
> at least six months ago.
>
>
I think I shall ignore you from now on.
God bless,
Leo Wong
--
http://www.albany.net/~hello/
hello988 (120)
6/6/2004 10:48:11 PM
Jerry Avins says...
>
>Guy Macon wrote:
>
>> Jerry Avins says...
>>
>>>Guy Macon wrote:
>>
>> I have asked the following question of you several times and have yet
>> to get a straight answer out of you:
>>
>> Are Off-Topic posts about pornography OK if they are clearly marked?
>> Or is it only *some* kinds of Off-Topic posts that are OK with you?
>>
>> I don't expect you to answer the question asked this time either,
>> so I will make my own assumptions about what your answer is.
>
>I answered you, but you rejected my answer.
You answered another question that I didn't ask and failed to
answer the question I did ask. Are off-topic posts about pornography
OK with you, or is it only *some* kinds of Off-Topic posts that are OK
with you?
>To go a step further, if a
>post were marked "OFF TOPIC PORN" and not part of a repeated pattern, I
>would not report the author to his/her ISP.
Nor would I, but that's not what I asked. ISPs don't care whether you
post off-topic. Are off-topic posts about pornography OK with you, or
is it only *some* kinds of Off-Topic posts that are OK with you?
>It would certainly affect my opinion of him/her.
Mine too (as would posting about Java in a Forth newsgroup), but
that's not what I asked. Are off-topic posts about pornography OK
with you, or is it only *some* kinds of Off-Topic posts that are OK
with you?
>All computer-related OT messages (as well as those
>about hotel accommodations in cities known to participants here) are
>well within the framework long established here. That's how it is.
I disagree. This isn't comp.lang.forth-and-other-computer-related-topics,
nor is it comp.lang.forth-and-hotel-accommodations. You have still not
answered the question I asked. Are off-topic posts about pornography
OK with you, or is it only *some* kinds of Off-Topic posts that are OK
with you?
Based on your answers to questions I never asked, I will now assume
that off-topic posts about finding more porn for inclusion in a computer
program that displays pornography are OK with Jerry Avins, just as
off-topic posts about finding more prayers for inclusion in a computer
program that displays prayers are OK with Jerry Avins.
I would prefer to have a direct answer from you, but you seem to be
unable or unwilling to answer the actual question asked, so I must
make assumptions. If this displeases you, you can rectify the
situation by deciding to answer the actual question asked.
Guy
6/7/2004 2:44:26 AM
Leo Wong <hello@albany.net> says...
>I think I shall ignore you from now on.
Please do. I prefer to talk about Forth-related topics in a
Forth newsgroup and your recent output has been about other topics.
Guy
6/7/2004 2:46:32 AM
Jerry Avins wrote:
> Elko Tchernev wrote:
>
>> Leo Wong wrote:
>>
>>> As to the vast number of off-topic musings posted in response to
>>> supposedly on-topic subjects, we have I suppose become inured to
>>> them, and indeed have found in some of them unexpected pleasure or
>>> food for thought. I myself shall continue to adhere to the motto of
>>> the Abbey of Theleme.
>>>
>> You mean adhere to the rules? I don't remember - was there a motto
>> as well?
>
>
> "Do as you will"
>
That's the rules. Was there a motto too?
etchernevnono (23)
6/7/2004 4:28:42 AM
Elko Tchernev <etchernevnono@acm.org> says...
>
>Jerry Avins wrote:
>
>> Elko Tchernev wrote:
>>
>>> Leo Wong wrote:
>>>
>>>> As to the vast number of off-topic musings posted in response to
>>>> supposedly on-topic subjects, we have I suppose become inured to
>>>> them, and indeed have found in some of them unexpected pleasure or
>>>> food for thought. I myself shall continue to adhere to the motto of
>>>> the Abbey of Theleme.
>>>>
>>> You mean adhere to the rules? I don't remember - was there a motto
>>> as well?
>>
>> "Do as you will"
>>
> That's the rules. Was there a motto too?
There never was a motto, but many people have taken four words out
of the rules of the Abbey of Theleme and improperly turned them
into a motto.
In the book _Letter from Gargantua to his son Pantagruel_, Chapter 52
François Rabelais wrote the following (Translated): "In their rules there
was only one clause: do what you will [Original: Fay ce que vouldras]
because people who are free, well-born, well-bred, and easy in honest
company have a natural spur and instinct which drives them to virtuous
deeds and deflects them from vice; and this they called honour."
Those who only quote "do what you will" out of context miss the point
entirely. The point is voluntary self restraint; choosing to not do
what you have liberty to do. It's the reason why polite people avoid
top-posting, off-topic posting, posting HTML and any number of other
ways that are known to annoy most of those who read Usenet posts.
--
Guy Macon, Electronics Engineer & Project Manager for hire.
Remember Doc Brown from the _Back to the Future_ movies? Do you
have an "impossible" engineering project that only someone like
Doc Brown can solve? My resume is at http://www.guymacon.com/
Guy
6/7/2004 8:12:56 AM
Dear Guy,
> In the book _Letter from Gargantua to his son Pantagruel_, Chapter 52
> François Rabelais wrote the following (Translated): "In their rules there
> was only one clause: do what you will [Original: Fay ce que vouldras]
> because people who are free, well-born, well-bred, and easy in honest
> company have a natural spur and instinct which drives them to virtuous
> deeds and deflects them from vice; and this they called honour."
>
> Those who only quote "do what you will" out of context miss the point
> entirely. The point is voluntary self restraint; choosing to not do
> what you have liberty to do.
I suspect that the quote was from Aleister Crowley, who was not known
for his self restraint.
Best regards,
Bill
bspight1 (63)
6/7/2004 8:35:21 AM
fox@ultratechnology.com (Jeff Fox) wrote in message news:<4fbeeb5a.0406042116.21a6062e@posting.google.com>...
> "Alex McDonald" <alex_mcd@btopenworld.com> wrote in message news:<c9qmsh$a4g$1@sparta.btinternet.com>...
> > To clarify; I meant word as in 16bits, 2bytes (the "standard"
> > definition of a word in Windows;
And yes, it IS more likely that a 32-bit (or other system with
a word size larger than 16-bits) system will have a Unicode
environment, so yes, if one wanted to quibble, most often when
you would use two octets to hold a character, the word size
would be bigger than two octets. I don't understand the urgent
desire to quibble, but in the sense of the term that I originally
learned in 1980, I did not means per se, but about
double octets.
Of course, as I read the original, that was the connotation being
given to "words".
agila61 (3956)
6/7/2004 9:58:22 AM
stephenXXX@INVALID.mpeltd.demon.co.uk (Stephen Pelc) wrote in message news:<40c068fb.483248203@192.168.0.1>...
> On 4 Jun 2004 01:19:07 -0700, agila61@netscape.net (Dr. Bruce R.
> McFarling) wrote:
>
> >> 1. Windows is UTF-16 based; UTF-8 is barely tolerated, if at all.
> >
> >So a 16-bit char would go with this.
> This misses the point in Forth. There are three character sets
> which are simultaneously available:
> DCS - Development Character Set used by the underlying Forth.
> Usually ISO Latin 1. The DCS is usually 8 bits to avoid breaking
> code that makes the assumption char=byte=octet.
Ah yes, standardising internationalisation (I18N) brought these
distinctions to a head, but of course since it involved something
that some implementers had had to handle, so had existing practice
to fall back on.
Of course, the standard permits UTF16 or UTF32 as DCS ... but it's
not surprising that it's legacy code that gives the past-binding
behaviour.
Of course, there are two assumptions lurking around that can break
legacy code:
char=byte=octet
OCS=ACS
The first really kicks in if you start using Blocks as workspaces to
store ACS characters, but then you either parametrise that for
I18N or you rewrite for each different case.
That distinction is also handy for the Palm Pilot, where as near
as I understand it, in anything I would be doing, OCS = ASCII7
subset of ACS = OCS <> ISO Latin 1.
agila61 (3956)
6/7/2004 10:20:44 AM
"LeoWong" <hello@albany.net> wrote in message news:<b7004b188f5e8146c14a25dc4dab2250@localhost.talkaboutprogramming.com>...
> The Forth program prints Happy New Year in 70 languages.
>
> This weekend I'll add Cornish to the Java program.
>
> Leo Wong
> http://www.albany.net/~hello/
Where's fny.f?
And does it work equally well for Chinese New Year? Well, DOES IT? IS
THERE AN OCCIDENTAL BIAS IN THIS PROGRAM?
No, wait a minute, it does work equally well for Chinese New Year.
Never mind.
agila61 (3956)
6/7/2004 10:46:03 AM
Leo Wong <hello@albany.net> wrote in message news:<y_mdnalqr8SaA17dRVn-hw@thebiz.net>...
> Richard Owlett wrote:
>
> > Wong -- >900 posts over 10 years!
>
> Hmmm. How many of the >900 were off topic?
>
>
> Leo Wong
I dunno, but it's creeping up. Posting a musing
on the ratio of your own off topic posts would
be off topic.
You should have wondered how many of the >900
were ON topic, and then in the privacy of your
own home taken the ratio and subtracted it from 1.
agila61 (3956)
6/7/2004 10:51:00 AM
According to Dr. Bruce R. McFarling <agila61@netscape.net>:
> And yes, it IS more likely that a 32-bit (or other system with
> a word size larger than 16-bits) system will have a Unicode
> environment, so yes, if one wanted to quibble, most often when
> you would use two octets to hold a character, the word size
> would be bigger than two octets. I don't understand the urgent
> desire to quibble
As far as things like Java or the Win32 API are concerned (both use
16-bit characters), they were defined at a time when Unicode was not
really expected to exceed 65000 characters. That both use 16-bit
characters instead of 32-bit characters is not really petty quibbling,
rather lack of foresight. Note that even going to 16-bit triggered
gigabytes of flamewars about that unacceptable waste. What a flamewar
needs in order to thrive and grow is not a rational, logical basis, but
fresh and regular supplies of minds to set aflame.
When Unicode went to 1000000+ characters, UTF-16 was designed: it is a
way to encode arbitrary such characters into either 1 or 2 16-bit words
(just like UTF-8 is a way to encode Unicode characters into 1 to 4 8-bit
bytes). Unicode characters from the first plane remained unchanged in
UTF-16. Windows switched to UTF-16. I expect Java to switch also to
UTF-16 at some time in the near future.
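To make the "1 or 2 16-bit words" point concrete, here is a minimal Python sketch of the UTF-16 rule described above. The function name `utf16_units` is made up for illustration, and the sketch does not reject invalid input such as lone surrogates or values above U+10FFFF.

```python
def utf16_units(code_point):
    """Encode one Unicode code point as a list of 16-bit UTF-16 code units."""
    if code_point < 0x10000:
        # First-plane characters are unchanged: one 16-bit unit.
        return [code_point]
    # Characters beyond the first plane are split into a surrogate pair.
    offset = code_point - 0x10000           # 20 significant bits remain
    high = 0xD800 | (offset >> 10)          # top 10 bits
    low = 0xDC00 | (offset & 0x3FF)         # bottom 10 bits
    return [high, low]


for cp in (0x41, 0x20AC, 0x1D11E):
    units = utf16_units(cp)
    utf8_len = len(chr(cp).encode("utf-8"))
    print(f"U+{cp:04X}: UTF-16 units {[hex(u) for u in units]}, UTF-8 bytes {utf8_len}")
```

A first-plane character such as U+0041 or U+20AC stays a single unit, while U+1D11E needs two units in UTF-16 and four bytes in UTF-8.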
C has a notion of "wide characters" with the <wchar.h> header, wide
characters literal strings and wide characters in the "wchar_t" type.
The wchar_t type is defined in most modern Unix-like systems (e.g.
Linux, FreeBSD or Solaris) and it usually is a 32-bit type. Hence, under
those systems, there is no real quibbling and 32-bit characters have
won. Somehow. Note that the base Unix kernel uses zero-terminated octet
strings (which may be UTF-8 or latin-1 or whatever -- the kernel does
not mind), hence the definition of "wchar_t" is really a matter of local
convention between applications and their libraries.
Interestingly, the Gtk graphical toolkit (an emergent standard under
Linux) has defined that all strings are exchanged as UTF-8 octet
strings, even though wchar_t was available and ready to use.
--Thomas Pornin
PS: the Unix kernel expects string arguments with a terminal 0, for
instance when opening a file (the file name is thus provided). In Forth,
strings do not have that terminating 0. When implementing a Forth
system running under Unix, how is this usually solved ? By requiring
application code to add (or at least leave room for) the terminating 0,
or with local buffers where file names are temporarily copied ?
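As an illustration of the second option only (a local buffer), here is a rough Python sketch. It is not a description of what any particular Forth system does. The name `to_c_string` and the idea of modelling Forth memory as a `bytes` object are assumptions made for the example.

```python
def to_c_string(memory, addr, length):
    """Copy a Forth-style (addr, length) string and append a terminating 0."""
    name = memory[addr:addr + length]
    if len(name) != length:
        raise ValueError("string runs past the end of memory")
    if 0 in name:
        # An embedded 0 would make the kernel see a shorter file name.
        raise ValueError("embedded 0 byte in file name")
    return name + b"\x00"


memory = b"xxxxfoo.txtyyyy"           # "foo.txt" sits at offset 4, length 7
print(to_c_string(memory, 4, 7))      # the copy ends with a 0 byte
```

The caller's string is left untouched. The cost is one copy per call, plus whatever limit the temporary buffer imposes.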
pornin (75)
6/7/2004 11:39:24 AM
Guy Macon wrote:
...
> In the book _Letter from Gargantua to his son Pantagruel_, Chapter 52
> François Rabelais wrote the following (Translated): "In their rules there
> was only one clause: do what you will [Original: Fay ce que vouldras]
> because people who are free, well-born, well-bred, and easy in honest
> company have a natural spur and instinct which drives them to virtuous
> deeds and deflects them from vice; and this they called honour."
>
> Those who only quote "do what you will" out of context miss the point
> entirely. The point is voluntary self restraint; choosing to not do
> what you have liberty to do. It's the reason why polite people avoid
> top-posting, off-topic posting, posting HTML and any number of other
> ways that are known to annoy most of those who read Usenet posts.
You captured the essence here, putting into words what I couldn't have
expressed as well. Leo is one of those well bred people of honor to whom
I feel perfectly at ease saying "Fay ce que vouldras".
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/7/2004 3:22:48 PM
Guy Macon wrote:
> Jerry Avins says...
...
> You answered another question that I didn't ask and failed to
> answer the question I did ask. Are off-topic posts about pornography
> OK with you, or is it only *some* kinds of Off-Topic posts that are OK
> with you?
Infrequent off-topic threads like this one waste bandwidth and patience,
but being easily avoidable, are harmless. Avoidable pornography is
probably harmless too; the difference is bad taste. I don't like porn. I
don't like boiled squash, either, but I don't make an issue of it.
> .... Are off-topic posts about pornography OK
> with you, or is it only *some* kinds of Off-Topic posts that are OK
> with you?
Posts in bad taste are not OK with me. I was guilty of that once here.
Not everyone shares my sense of what is humorous. I've reformed.
>>All computer-related OT messages (as well as those
>>about hotel accommodations in cities known to participants here) are
>>well within the framework long established here. That's how it is.
>
>
> I disagree. This isn't comp.lang.forth-and-other-computer-related-topics,
> nor is it comp.lang.forth-and-hotel-accommodations. You have still not
> answered the question I asked. Are off-topic posts about pornography
> OK with you, or is it only *some* kinds of Off-Topic posts that are OK
> with you?
Only some kinds.
> Based on your answers to questions I never asked, I will now assume
> that off-topic posts about finding more porn for inclusion in a computer
> program that displays pornography are OK with Jerry Avins, just as
> off-topic posts about finding more prayers for inclusion in a computer
> program that displays prayers are OK with Jerry Avins.
>
> I would prefer to have a direct answer from you, but you seem to be
> unable or unwilling to answer the actual question asked, so I must
> make assumptions. If this displeases you, you can rectify the
> situation by deciding to answer the actual question asked.
Some would call certain images commonly used for image-processing
studies pornographic. "Lena" could be so characterized by priggish
zealots. Asking here where to find such images would be appropriate
whatever language the questioner might plan to use. Lena was alive and
well last I heard. Asking where to find her might be appropriate too.
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/7/2004 3:44:39 PM
On 07/06/2004 4:35 AM, Bill Spight wrote:
> Dear Guy,
>
>
>>In the book _Letter from Gargantua to his son Pantagruel_, Chapter 52
>>François Rabelais wrote the following (Translated): "In their rules there
>>was only one clause: do what you will [Original: Fay ce que vouldras]
>>because people who are free, well-born, well-bred, and easy in honest
>>company have a natural spur and instinct which drives them to virtuous
>>deeds and deflects them from vice; and this they called honour."
>>
>>Those who only quote "do what you will" out of context miss the point
>>entirely. The point is voluntary self restraint; choosing to not do
>>what you have liberty to do.
>
>
> I suspect that the quote was from Aleister Crowley, who was not known
> for his self restraint.
>
I think his version was something like, "Do what thou wilt; thus is the
whole of the law."
I'm pretty sure Crowley meant much the same as Guy's comment, above.
I.e., you are free to do the right thing as well as the wrong thing.
It's up to the individual to practice the self-restraint (or not) as
they see fit.
In Crowley's case, deciding to not practice self-restraint was not
necessarily a bad thing. As long as you have considered the choices
before you, then you are correct. The responsibility for your actions
still remains, but it is not _necessarily_ the basis or criteria for
"correct."
I had to consider if contributing to this thread was correct or not. I
live with my decision.
-- cm
clvrmnky-uunet (585)
6/7/2004 5:14:26 PM
Guy Macon <http://www.guymacon.com> wrote in message news:<10c88rlsaccv64b@corp.supernews.com>...
> "In their rules there
> was only one clause: do what you will [Original: Fay ce que vouldras]
> because people who are free, well-born, well-bred, and easy in honest
> company have a natural spur and instinct which drives them to virtuous
> deeds and deflects them from vice; and this they called honour."
Ah, but this is usenet! free? well-born? well-bred? easy in honest
company? honour? usenet?
> Those who only quote "do what you will" out of context miss the point
> entirely.
This is usenet! I hope you realize the real deed that did incite
the mob lest you should err again.
> The point is voluntary self restraint; choosing to not do
> what you have liberty to do. It's the reason why polite people avoid
> top-posting, off-topic posting, posting HTML and any number of other
> ways that are known to annoy most of those who read Usenet posts.
CYRANO:
Ah no! Young blade! That was a trifle short!
You might have said at least a hundred things
By varying the tone. . .like this, suppose, . . .
/ /
fox21 (1833)
6/7/2004 5:26:40 PM
Guy Macon wrote:
> [Original: Fay ce que vouldras]
Je pense que vous n'orthographiez pas bien.
And my French is 40+ years in past.
rowlett10 (1881)
6/7/2004 6:23:56 PM
Richard Owlett wrote:
> Guy Macon wrote:
>
>> [Original: Fay ce que vouldras]
>
>
> Je pense que vous n'orthographiez pas bien.
>
> And my French is 40+ years in past.
Rabelais's French is even older than that!
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/7/2004 6:57:22 PM
Jerry Avins <jya@ieee.org> says...
>Infrequent off-topic threads like this one waste bandwidth and patience,
>but being easily avoidable, are harmless. Avoidable pornography is
>probably harmless too; the difference is bad taste. I don't like porn. I
>don't like boiled squash, either, but I don't make an issue of it.
I do make an issue of it. Post about boiled squash, adding porn to
your collection or adding prayers to your collection and I will
politely yet firmly request that you not post off-topic.
Perhaps you may wish to add my requesting that people not post
off-topic to your list of things that you don't like but don't
make an issue of. Without your making an issue of it, this
thread would consist of little other than my original request
and the off-topic poster's refusal to comply.
Guy
6/7/2004 7:06:32 PM
Jerry Avins wrote:
> Richard Owlett wrote:
>
>> Guy Macon wrote:
>>
>>> [Original: Fay ce que vouldras]
>>
>>
>>
>> Je pense que vous n'orthographiez pas bien.
>>
>> And my French is 40+ years in past.
>
>
> Rabelais's French is even older than that!
>
> Jerry
Yeah but Macon's French appeared pseudo-phonetic
I'm trying to alert him that he goes GOBBLE GOBBLE
but rest of group is tooooooooo polite
Then again he may be a terminal case ;}
P.S. I'm posting as one who has been considered to be ...
[ fill in your own adjectives ;]
rowlett10 (1881)
6/7/2004 7:33:54 PM
Richard Owlett wrote:
> Jerry Avins wrote:
>
>> Richard Owlett wrote:
>>
>>> Guy Macon wrote:
>>>
>>>> [Original: Fay ce que vouldras]
>>>
>>>
>>>
>>>
>>> Je pense que vous n'orthographiez pas bien.
>>>
>>> And my French is 40+ years in past.
>>
>>
>>
>> Rabelais's French is even older than that!
>>
>> Jerry
>
>
>
> Yeah but Macon's French appeared pseudo-phonetic
>
> I'm trying to alert him that he goes GOBBLE GOBBLE
> but rest of group is tooooooooo polite
>
> Then again he may be a terminal case ;}
>
> P.S. I'm posting as one who has been considered to be ...
> [ fill in your own adjectives ;]
I was trying to alert you to the quote being from Old French. Rabelais
lived 1494-1553. From http://www.pantagruelion.com/p/s/10020.html:
Abbey of Theleme
The Abbey of Theleme was presented to Friar John by Gargantua, who built
it to reward his assistance in the defeat of King Picrochole. Friar John
desired his religious order to be exactly contrary to all others, so he
had the Abbey built without walls and without clocks. He decreed that
there should be no women where there were no men, and no men where there
were no women; and instead of the vows of chastity, poverty, and
obedience, he decreed that anyone could marry, get rich, and live at
liberty. The rules contained one clause: "Do what you will" ("Fay ce que
vouldras")
Jerry
--
Engineering is the art of making what you want from things you can get.
jya (12870)
6/7/2004 8:09:57 PM
Jerry Avins wrote:
> [snip]
>
> I was trying to alert you to the quote being from Old French. Rabelais
> lived 1494-1553. From http://www.pantagruelion.com/p/s/10020.html:
>
> Abbey of Theleme
>
> The Abbey of Theleme was presented to Friar John by Gargantua, who built
> it to reward his assistance in the defeat of King Picrochole. Friar John
> desired his religious order to be exactly contrary to all others, so he
> had the Abbey built without walls and without clocks. He decreed that
> there should be no women where there were no men, and no men where there
> were no women; and instead of the vows of chastity, poverty, and
> obedience, he decreed that anyone could marry, get rich, and live at
> liberty. The rules contained one clause: "Do what you will" ("Fay ce que
> vouldras")
>
> Jerry
Interesting ;]
I was influenced by a "pet peeve" of some of my French teachers. They
despised (and that be mild) conversational courses with NO reading
requirement.
I suspect "Fay ce que vouldras" would be something like "Fait ce que
vouldrais" in modern spelling. But now we are getting into the history
of orthography, and we all know how Mr Macon hates 'off topic' posts.
BTW, has anyone noticed he has had no objection to the "I'm not
Finnish" thread.
My favorite children's story is "Emperor's New Clothes" [snicker snicker]
rowlett10 (1881)
6/7/2004 11:31:55 PM
Richard Owlett <rowlett@atlascomm.net> says...
>I suspect "Fay ce que vouldras" would be something like "Fait ce que
>vouldrais" in modern spelling. But now we are getting into the history
>of orthography, and we all know how Mr Macon hates 'off topic' posts.
>
>BTW, has anyone noticed he has had no objection to the "I'm not
>Finnish" thread.
So now I am being flamed for *NOT* complaining about off-topic posts?
It is clear to me that you wish to have a flame war rather than talk
about Forth, so I suppose that I should give you what you want:
You swine. You vulgar little maggot. You worthless bag of filth. As we
say in Texas, you couldn't pour water out of a boot with instructions
printed on the heel. You are a canker, an open wound. I would rather
kiss a lawyer than be seen with you. You took your last vacation in
the Islets of Langerhans.
You're a putrescent mass, a walking vomit. You are a spineless little
worm deserving nothing but the profoundest contempt. You are a jerk, a
cad, and a weasel. I take that back; you are a festering pustule on a
weasel's rump. Your life is a monument to stupidity. You are a stench,
a revulsion, a big suck on a sour lemon.
I will never get over the embarrassment of belonging to the same
species as you. You are a monster, an ogre, a malformity. I barf at
the very thought of you. You have all the appeal of a paper cut.
Lepers avoid you. You are vile, worthless, less than nothing. You are
a weed, a fungus, the dregs of this earth. You are a technicolor yawn.
And did I mention that you smell?
You are a squeaking rat, a mistake of nature and a heavy-metal bagpipe
player. You were not born. You were hatched into an unwilling world
that rejects the likes of you. You didn't crawl out of a normal egg,
either, but rather a mutant maggot egg rejected by an evil scientist
as being below his low standards. Your alleged parents abandoned you
at birth and then died of shame in recognition of what they had done
to an unsuspecting world. They were a bit late.
Try to edit your responses of unnecessary material before attempting
to impress us with your insight. The evidence that you are a
nincompoop will still be available to readers, but they will be able
to access it ever so much more rapidly. If cluelessness were crude
oil, your scalp would be crawling with caribou.
You are a thick-headed trog. I have seen skeet with more sense than
you have. You are a few bricks short of a full load, a few cards short
of a full deck, a few bytes short of a full core dump, and a few
chromosomes short of a full human. Worse than that, you top-post. God
created houseflies, cockroaches, maggots, mosquitos, fleas, ticks,
slugs, leeches, and intestinal parasites, then he lowered his
standards and made you. I take it back; God didn't make you. You are
Satan's spawn. You are Evil beyond comprehension, half-living in the
slough of despair. You are the entropy which will claim us all. You
are a green-nostriled, cross-eyed, hairy-livered inbred
trout-defiler. You make Ebola look good.
You are weary, stale, flat and unprofitable. You are grimy, squalid,
nasty and profane. You are foul and disgusting. You're a fool, an
ignoramus. Monkeys look down on you. Even sheep won't have sex with
you. You are unreservedly pathetic, starved for attention, and lost in
a land that reality forgot. You are not ANSI compliant and your markup
doesn't validate. You have a couple of address lines shorted together.
You should be promoted to Engineering Manager.
Do you really expect your delusional and incoherent ramblings to be
read? Everyone plonked you long ago. Do you fantasize that your
tantrums and conniption fits could possibly be worth the $0.000000001
worth of electricity used to send them? Your life is one big
W.O.M.B.A.T. and your future doesn't look promising either. We need to
trace your bloodline and terminate all siblings and cousins in order
to cleanse humanity of your polluted genes. The good news is that no
normal human would ever mate with you, so we won't have to go into the
sewers in search of your git.
You are a waste of flesh. You have no rhythm. You are ridiculous and
obnoxious. You are the moral equivalent of a leech. You are a living
emptiness, a meaningless void. You are sour and senile. You are a
loathsome disease, a drooling inbred cross-eyed toesucker. You make
Quakers shout and strike Pentecostals silent. You have a version 1.0
mind in a version 6.12 world. Your mother had to tie a pork chop
around your neck just to get your dog to play with you. You think
that HTTP://WWW.GUYMACON.COM/FUN/INSULT/INDEX.HTM is the name of a
rock band. You believe that P.D.Q. Bach is the greatest composer who
ever lived. You prefer L. Ron Hubbard to Larry Niven and Jerry
Pournelle. Hee-Haw is too deep for you. You would watch test patterns
all day if the other inmates would let you.
On a good day you're a half-wit. You remind me of drool. You are
deficient in all that lends character. You have the personality of
wallpaper. You are dank and filthy. You are asinine and benighted.
Spammers look down on you. Phone sex operators hang up on you.
Telemarketers refuse to be seen in public with you. You are the source
of all unpleasantness. You spread misery and sorrow wherever you go.
May you choke on your own foolish opinions. You are a Pusillanimous
galactophage and you wear your sister's training bra. Don't bother
opening the door when you leave - you should be able to slime your
way out underneath. I hope that when you get home your mother runs
out from under the porch and bites you.
You smarmy lagerlout git. You bloody woofter sod. Bugger off, pillock.
You grotty wanking oik artless base-court apple-john. You clouted
boggish foot-licking half-twit. You dankish clack-dish plonker. You
gormless crook-pated tosser. You bloody churlish boil-brained clotpole
ponce. You craven dewberry pisshead cockup pratting naff. You cockered
bum-bailey poofter. You gob-kissing gleeking flap-mouthed coxcomb. You
dread-bolted fobbing beef-witted clapper-clawed flirt-gill. May your
spouse be blessed with many bastards.
You are so clueless that if you dressed in a clue skin, doused yourself
in clue musk, and did the clue dance in the middle of a field of horny
clues at the height of clue mating season, you still would not have a
clue. If you were a movie you would be a double feature;
_Battlefield_Earth_ and _Moron_Movies_II_. You would be out of focus.
You are a fiend and a sniveling coward, and you have bad breath. You
are the unholy spawn of a bandy-legged hobo and a syphilitic camel.
You wear strangely mismatched clothing with oddly placed stains. You
are degenerate, noxious and depraved. I feel debased just knowing that
you exist. I despise everything about you, and I wish you would go
away. You are jetsam who dreams of becoming flotsam. You won't make
it. I beg for sweet death to come and remove me from a world which
became unbearable when you crawled out of a harpy's lair.
It is hard to believe how incredibly stupid you are. Stupid as a stone
that the other stones make fun of. So stupid that you have traveled
far beyond stupid as we know it and into a new dimension of stupid.
Meta-stupid. Stupid cubed. Trans-stupid stupid. Stupid collapsed to
a singularity where even the stupons have collapsed into stuponium.
Stupid so dense that no intelligence can escape. Singularity stupid.
Blazing hot summer day on Mercury stupid. You emit more stupid in one
minute than our entire galaxy emits in a year. Quasar stupid. It cannot
be possible that anything in our universe can really be this stupid.
This is a primordial fragment from the original big stupid bang. A pure
extract of stupid with absolute stupid purity. Stupid beyond the laws
of nature. I must apologize. I can't go on. This is my epiphany of
stupid. After this experience, you may not hear from me for a while.
I don't think that I can summon the strength left to mock your moronic
opinions and malformed comments about boring trivia or your other
drivel. Duh.
The only thing worse than your logic is your manners. I have snipped
away most of what you wrote, because, well ... it didn't
really say anything. Your attempt at constructing a creative flame was
pitiful. I mean, really, stringing together a bunch of insults among a
load of babbling was hardly effective... Maybe later in life, after
you have learned to read, write, spell, and count, you will have more
success. True, these are rudimentary skills that many of us "normal"
people take for granted that everyone has an easy time of mastering.
But we sometimes forget that there are "challenged" persons in this
world who find these things to be difficult. If I had known that this
was true in your case then I would never have exposed myself to
what you wrote. It just wouldn't have been "right." Sort of like
parking in a handicap space. I wish you the best of luck in the
emotional and social struggles that seem to be placing such a
demand on you.
P.S.: You are hypocritical, greedy, violent, malevolent, vengeful,
cowardly, deadly, mendacious, meretricious, loathsome, despicable,
belligerent, opportunistic, barratrous, contemptible, criminal,
fascistic, bigoted, racist, sexist, avaricious, tasteless, idiotic,
brain-damaged, imbecilic, insane, arrogant, deceitful, demented, lame,
self-righteous, byzantine, conspiratorial, satanic, fraudulent,
libelous, bilious, splenetic, spastic, ignorant, clueless, EDLINoid,
illegitimate, harmful, destructive, dumb, evasive, double-talking,
devious, revisionist, narrow, manipulative, paternalistic,
fundamentalist, dogmatic, idolatrous, unethical, cultic, diseased,
suppressive, controlling, restrictive, malignant, deceptive, dim,
crazy, weird, dyspeptic, stifling, uncaring, plantigrade, grim,
unsympathetic, jargon-spouting, censorious, secretive, aggressive,
mind-numbing, arassive, poisonous, flagrant, self-destructive,
abusive, socially-retarded, puerile, and Generally Not Good.
I hope this helps...
|
Guy
|
6/7/2004 11:48:44 PM
|
|
Jerry Avins says...
>
>Richard Owlett wrote:
>
>> Jerry Avins wrote:
>>
>>> Richard Owlett wrote:
>>>
>>>> Guy Macon wrote:
>>>>
>>>>> [Original: Fay ce que vouldras]
>>>>
>>>> Je pense que vous n'orthographiez pas bien.
>>>>
>>>> And my French is 40+ years in past.
>>>
>>> Rabelais's French is even older that that!
>>
>> Yeah but Macon's French appeared psudeo-phonetic
>>
>> I'm trying to alert him that he goes GOBBLE GOBBLE
>> but rest of group is tooooooooo polite
>>
>> Then again he may be a terminal case ;}
>
>I was trying to alert you to the quote being from Old French.
>Rabelais lived 1494-1553.
...which is exactly how I understood your comment.
Richard Owlett, on the other hand, is an idiot. He just flamed
what he thought was my French when what I wrote was a direct quote
from the J. M. Cohen 1955 translation of _The Histories of Gargantua
and Pantagruel_. One can only admire the spectacle of someone being
so invested in having a flame war that he steps in it while grasping
at any straw in his silly quest to pick a fight.
BTW, I *always* rely on translations. I am British and do not speak
French.
|
Guy
|
6/7/2004 11:59:17 PM
|
|
Guy Macon wrote:
...
> Richard Owlett, on the other hand, is an idiot. He just flamed
> what he thought was my French when what I wrote was a direct quote
> from the J. M. Cohen 1955 translation of _The Histories of Gargantua
> and Pantagruel_.
In much the same spirit that you criticized Leo without understanding
the ramifications. Most of us do that once in a while. BTW: How long did
it take you to type out your flame? May I copy it with attribution?
> One can only admire the spectacle of someone being
> so invested in having a flame war that he steps in it while grasping
> at any straw in his silly quest to pick a fight.
Richard is quick to leap to the defense of his friends. I'm glad that we
have no reason to be at odds.
Jerry
--
Winning isn't everything, it doesn't even matter.
|
jya (12870)
|
6/8/2004 2:00:54 AM
|
|
Jerry Avins wrote:
> LeoWong wrote:
>
>>
>> I have friends from Taiwan who could easily give me the Cantonese and the
>> Mandarin and perhaps some others, but they're not Christian, so I've been
>> shy about asking them....
> Go ahead and ask them. If they take offense at being asked to use their
> knowledge to help you in your quest, they're hardly friends anyway.
>
I asked, thanks, and got something back. I wasn't thinking that they
would take offense: I just didn't want to make them uncomfortable.
Last evening I attended a performance of Mozart's Requiem at an event
remembering D-Day.
Leo
--
http://www.albany.net/~hello/
Lord Jesus Christ, Son of God, have mercy on me, a sinner.
|
hello988 (120)
|
6/8/2004 2:21:13 AM
|
|
pornin@nerim.net (Thomas Pornin) wrote in message news:<ca1k5c$2qvh$1@biggoron.nerim.net>...
> Interestingly, the Gtk graphical toolkit (an emergent standard under
> Linux) has defined that all strings are exchanged as UTF-8 octet
> strings, even though wchar_t was available and ready to use.
Of course, part of it may be whether the focus is on text processing
or text display. How wide and how high is this string on the screen
are the kinds of things that (from the outside looking in) would seem
to be easier to do with a character-stream perspective, while
fuzzy indexing would gain greater simplifications from being able to
view a text string as the conventional vector of characters.
And the advantage for the string exchange problem is that you don't
have to worry about whether you are in the first slice of Unicode
or out in the full (so far) Unicode ... once you get over the
#char <-> #octets conversions for the one, the solution scales.
If you are actually doing string processing, it seems like there
would be no serious problems if it is all sequential, but if it
heads to a more general indexed access, uniform character sizes
start to become handy.
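The trade-off described above can be sketched in Forth. This is only an illustration under stated assumptions: one Forth char is one 8-bit byte, the string is well-formed UTF-8, and n is smaller than the number of characters in the string. The word names utf8-char+, utf8-nth and fixed-nth are made up for this sketch and come from no particular system.

```forth
\ Sketch: indexed access with variable-width vs. fixed-width characters.
\ Assumptions: 1 CHARS is one 8-bit byte; input is well-formed UTF-8;
\ n is less than the number of characters in the string.

: utf8-char+ ( c-addr -- c-addr' )   \ step over one UTF-8 character
   CHAR+
   BEGIN DUP C@ 192 AND 128 = WHILE CHAR+ REPEAT ;  \ skip 10xxxxxx bytes

: utf8-nth  ( c-addr n -- c-addr' )  \ sequential scan, cost grows with n
   0 ?DO utf8-char+ LOOP ;

: fixed-nth ( c-addr n -- c-addr' )  \ uniform width: one address calculation
   CHARS + ;
```

With uniform character sizes the nth character is a single address calculation; with UTF-8 every lookup walks the string from the start, which is harmless for sequential processing but adds up for general indexed access.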
|
agila61 (3956)
|
6/8/2004 3:18:17 AM
|
|
Jerry Avins <jya@ieee.org> says...
>How long did it take you to type out your flame?
>May I copy it with attribution?
You may quote it *without* attribution. See the permissions section
of my insult page at [ http://www.guymacon.com/FUN/INSULT/INDEX.HTM ]
Here is a portion of that file:
---------------------------------------------------------------------
ABOUT THE INSULT FILE:
This document is a collection of insults gathered from many years of
BBS and Usenet use, so the real credit goes to the many fine flamers
who have had their work added to this document over the years. I am
but an editor who has gathered the works of others into one document.
When I started writing this file, I had no idea that it would become
so popular. I just wanted a humorous way to defuse the kind of
arguments BBS users sometimes get into. A Monty Python skit about
people complaining about how bad they had it while growing up inspired
me to put together an insult file. Whenever I found a couple of
people online trading nasty insults, I posted the insult file with the
question "Do I Win?" at the bottom. This often resulted in the flame
war dissolving into laughter.
In the years since then, I have refined it and improved the quality of
the insults, but most of the credit is not mine; the real authors are
the scores of flamers who have contributed insults to the file.
HOW TO USE:
For full effect, I *strongly* advise using the full insult file. Yes,
I know that it goes on and on. That's what makes it funny. Trust me on
this one. One insult is insulting. A *bunch* of insults are funny!
PERMISSIONS:
You are free to use this for any purpose, including web pages,
newsgroup posts, emails, and letters to the Los Angeles Times. I do
*not* require you to give me credit if you use this in an email or
newsgroup post - it is more effective without it. Just cut and paste it
as is, tell anybody who asks where you got it, and refer them to this
paragraph if they think you stole it. I would prefer credit if you put
this on your web page, but feel free to ignore that preference if the
page works better that way. I would appreciate it if you don't remove
the hidden reference to the web page URL and version number - that's
how my fellow DNRC members know they have the latest version.
CREDITS:
I tried to make a list of who the original author of each bit was, but
I keep running into cases where more than one person claimed to be the
original author. In some cases I have found that the supposed original
author stole it himself. The best way to solve this is to do a web and
newsgroup search on any phrase that you are particularly interested in,
and look for the earliest published occurrence.
LATEST VERSION:
Now that the file has become popular, there are many hacked-up and
outdated versions of it floating around the 'Net. If you see one,
please let people know that they can always find the latest version
at [ HTTP://WWW.GUYMACON.COM/INSULT/ ].
COMMENTS WELCOME:
If you have an insult you would like included, please send it to
[ guymacon+insult03@spamcop.net ] with the word "insult" or
"flame" in the title.
|
Guy
|
6/8/2004 5:35:24 AM
|
|
Guy Macon <http://www.guymacon.com> wrote in message news:<10c9f565l0gjf8b@corp.supernews.com>...
> Jerry Avins <jya@ieee.org> says...
> >Infrequent off-topic threads like this one waste bandwidth and patience,
> >but being easily avoidable, are harmless. Avoidable pornography is
> >probably harmless too; the difference is bad taste. I don't like porn. I
> >don't like boiled squash, either, but I don't make an issue of it.
> I do make an issue of it. Post about boiled squash, adding porn to
> your collection or adding prayers to your collection and I will
> politely yet firmly request that you not post off-topic.
Whoa there, pardner, if somebody is boiling squash with Forth,
I want to read about it here first. Why should the computerised
squash management newsgroup have all the fun?
|
agila61 (3956)
|
6/8/2004 7:04:39 AM
|
|
Guy Macon <http://www.guymacon.com> wrote in message news:<10c88rlsaccv64b@corp.supernews.com>...
> Those who only quote "do what you will" out of context miss the point
> entirely. The point is voluntary self restraint;
So it is do what you WILL yourself to do, not (as some would misread
it) whatever you might have an urge to do.
Fortunately I have been among the ozzies too long to worry about
being lumbered with that particular taxing stricture. Yet I would
be happy to place that kind of pressure on Leo. Talk about ingratitude,
after learning so much playing around with MF.F when I have the spare
time.
|
agila61 (3956)
|
6/8/2004 7:14:24 AM
|
|
> I suspect "Fay ce que vouldras" would be something like "Fait ce que
> vouldrais" in modern spelling.
"Fais ce que tu voudras"
Amicalement,
Astrobe
|
astrobe (260)
|
6/8/2004 11:15:39 AM
|
|
jonah thomas <j2thomas@cavtel.net> wrote in message news:<ZG1vc.56$4Z.18126@news.uswest.net>...
> Was it one poster on the Forth group who responded abusively? Two?
>
> I thought it better not to defend you since the result is to make a
> continued little flamewar of no particular value.
But then you did anyway. Did you decide that it was safe once
enough others joined into the flaming for you to contribute?
> I consider your explanation quite clear, and I hope you won't feel the
> need to keep explaining when you get flamed further.
>
> Probably if you do something similar again -- say christmas -- no one
> here then will object.
How can you know that?
> Or maybe they will, clf seems to have gotten a
> bit more abrasive in the last few years.
Now don't get huffy!
|
fox21 (1833)
|
6/10/2004 2:57:29 AM
|
|
Leo Wong <hello@albany.net> writes:
>I
>thought that OT sufficiently announced "off-topic",
It does, but that does not make it ok. Letting an on-topic thread
degenerate into an off-topic one is bad enough, but starting an
off-topic thread from scratch is inexcusable. Any flamage you get for
that is well-deserved, no matter what merits you may have earned in
the group.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
anton (5254)
|
6/13/2004 2:23:52 PM
|
|
Anton Ertl wrote:
> Leo Wong <hello@albany.net> writes:
>
>>I
>>thought that OT sufficiently announced "off-topic",
>
>
> It does, but that does not make it ok. Letting an on-topic thread
> degenerate into an off-topic one is bad enough, but starting an
> off-topic thread from scratch is unexcusable. Any flamage you get for
> that is well-deserved, no matter what merits you may have earned in
> the group.
>
I didn't intend to start a thread: I asked that the help be e-mailed to
me. I don't wish to be excused.
I said "live and learn," and it seems that some have learned, since no
one has criticized:
OT: Wil Baden
This group seems to have forgotten: Who is my neighbor?
Leo Wong
--
http://www.albany.net/~hello/
He lived. He loved. He prayed.
|
hello988 (120)
|
6/14/2004 2:24:38 AM
|
|
Leo Wong <hello@albany.net> says...
>I don't wish to be excused.
>since no one has criticized:
Do you have something to say that has anything at all to do with
Forth, or is it your intent to keep posting about other things?
I want to know because my time is valuable and I need to decide
whether to killfile you.
|
Guy
|
6/14/2004 3:25:04 AM
|
|
Guy Macon wrote:
> Leo Wong <hello@albany.net> says...
>
>
>>I don't wish to be excused.
>
>
>>since no one has criticized:
>
>
> Do you have something to say that has anything at all to do with
> Forth, or is it your intent to keep posting about other things?
> I want to know because my time is valuable and I need to decide
> whether to killfile you.
>
6/7:
This is perhaps the Hebrew:
224 \ SwiftForth?
many: enums
alef bet gimel dalet he vav zayin het tet yod fkaf kaf
lamed fmem mem fnun nun samekh ayin fpe pe ftsadi tsadi
qof resh shin tav ; DROP
\ lay down shalom aleichem
CREATE Shalom_Aleichem 0 ,
shin lamed vav fmem BL ayin lamed yod kaf fmem
dEPTH dup Shalom_Aleichem ! -,s
Leo Wong
6/8:
...
Since the FIG-UK leadership are not regulars on comp.lang.forth, I
myself will put in a plug for them here:
http://www.fig-uk.org/
The site has pdf versions of most of the recent issues of Forthwrite.
Among the small benefits of getting the publication is that words like
>IN are spelled consistently -- the big benefit is the discussion.
The January 2003 issue has an article on "Using Wordlists for Many[".
Leo Wong
and:
\ I'm at a conference.
\ This probably needs polishing and generalizing.
\ d2h.f Leo Wong June 8, 2004 +
\ Find decimal unicode in html source,
\ return their hex values for pasting into Java code
: scan ( ca1 u1 c -- ca2 u2 )
   >R
   BEGIN DUP WHILE OVER C@ R@ <> WHILE 1 /STRING REPEAT THEN
   R> DROP ;
\ Isolate decimal unicode
: unicode> ( ca u -- ca1 u1 ca2 u2 )
   2 /STRING 2DUP 2>R [CHAR] ; scan DUP 2R> ROT - ;
\ (.) as implemented by Wil Baden
: (.) ( n -- str len )
   DUP ABS 0 <# #S ROT SIGN #> ;
: 0U.R ( n # -- )
   >R (.) R> OVER - 0 MAX 0 ?DO [CHAR] 0 EMIT LOOP TYPE ;
: >java ( ca u -- )
   EVALUATE BASE @ >R ." \u" HEX 4 0U.R R> BASE ! ;
: get-unicodes ( a u -- )
   BEGIN S" &#" SEARCH WHILE unicode> >java REPEAT 2DROP ;
0 VALUE infile
: opens ( -- )
   S" chinese.htm" R/O OPEN-FILE THROW TO infile ;
: closes ( -- )
   infile CLOSE-FILE THROW ;
1024 CONSTANT maxin
\ READ-LINE may store up to two line-terminator chars past maxin
CREATE inpad maxin 2 + CHARS ALLOT
: reads ( -- )
   BEGIN inpad DUP maxin infile READ-LINE THROW
   WHILE get-unicodes
   REPEAT 2DROP ;
: fyj opens reads closes ;
\ Sample output:
\ \u8036\u7A4C\u57FA\u7763\uFE50\u4E0A\u5E1D\u4E4B\u5B50
\ \u8ACB\u6190\u61AB\u6211\uFE50\u8F09\u7F6A\u4E4B\u4EBA
\ Leo Wong
and:
Brad Eckert wrote:
> Hi all,
>
> I tried www.tinyboot.com/ANS/color.f under a few different Forths:
Very good. Thank you.
Leo Wong
6/9:
Should start from the right margin.
33 CONSTANT linewidth
: c++ ( n ca u -- n+1 ca u ) 2>R 1+ 2R> ;
: unis ( ca u -- )
linewidth 0 2SWAP 0 >R
BEGIN \u SEARCH WHILE get-uni uni>sf >R c++
BEGIN punct 3 PICK C@ cin
WHILE OVER C@ >R c++ 1 /STRING
REPEAT
REPEAT 2DROP
- 0 MAX SPACES
BEGIN R> DUP WHILE EMIT REPEAT DROP ;
Leo Wong
And about as many more. They come in flurries.
Jerry
--
Engineering is the art of making what you want from things you can get.
|
jya (12870)
|
6/14/2004 3:52:23 AM
|
|
I was sent some Thai, apparently (not sure) in ISO-8859-11. What I
received looked like: =A2=E9=D2=E1, etc.
Fortunately ISO and Unicode for Thai seem to be in the same order, so I
was able to write a Forth program to translate the ISO codes into Java
Unicode escape sequences.
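A minimal sketch of that kind of translation follows. It is not the program referred to above, which was not posted. It assumes the input is already raw ISO-8859-11 bytes (the =A2 quoted-printable form has been decoded), that one Forth char is one byte, and that the Thai range maps to Unicode by a constant offset: bytes A1..FB hex become U+0E01..U+0E5B, an offset of 0D60 hex (3424 decimal). The word names are hypothetical.

```forth
\ Sketch: ISO-8859-11 bytes -> Java \uXXXX escapes.
\ Assumption: bytes above A0h (160) are Thai and map to Unicode by
\ adding 0D60h (3424); bytes up to A0h are the same in both sets.

: thai>unicode ( byte -- codepoint )
   DUP 160 > IF 3424 + THEN ;

: .java-escape ( codepoint -- )
   BASE @ >R HEX
   ." \u"
   0 <# # # # # #> TYPE          \ exactly four hex digits, zero padded
   R> BASE ! ;

: thai-bytes>java ( c-addr u -- )
   0 ?DO DUP C@ thai>unicode .java-escape CHAR+ LOOP DROP ;

\ The four bytes quoted above: A2 E9 D2 E1
CREATE thai-sample  162 C, 233 C, 210 C, 225 C,
: thai-demo ( -- )  thai-sample 4 thai-bytes>java ;
\ thai-demo prints \u0E02\u0E49\u0E32\u0E41
```

Because the two code tables are in the same order, no lookup table is needed; a character set without that property would need one.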
Leo Wong
http://www.albany.net/~hello/
Kha Tae Pra Sawami Jesu Christ to Chao, Butr Pra Chao,
Prod Metta Khaphachao Khon Bab Duai thoet.
|
hello988 (120)
|
6/16/2004 10:42:19 PM
|
|
pornin@nerim.net (Thomas Pornin) writes:
>As far as things like Java or the Win32 API are concerned (both use
>16-bit characters), they were defined at a time when Unicode was not
>really expected to exceed 65000 characters. That both use 16-bit
>characters instead of 32-bit characters is not really petty quibbling,
>rather lack of foresight.
...
>When Unicode went to 1000000+ characters, UTF-16 was designed: it is a
>way to encode arbitrary such characters into either 1 or 2 16-bit words
>(just like UTF-8 is a way to encode Unicode characters into 1 to 4 8-bit
>bytes). Unicode characters from the first plane remained unchanged in
>UTF-16. Windows switched to UTF-16. I expect Java to switch also to
>UTF-16 at some time in the near future.
UTF-16 may be the future for stuff that started with 16-bit
characters, but otherwise it seems obsolete, because it gives neither
the benefits of fixed-width characters nor the benefit of fitting into
the char-is-byte framework, and it usually costs more space than
UTF-8, too. Java might just decide to ignore the parts of Unicode
that don't fit in 16 bits, and keep the fixed-width advantage.
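To make the loss of fixed width concrete, here is a small sketch of how UTF-16 splits a code point above U+FFFF into two 16-bit units. It assumes a Forth with cells of at least 32 bits and an argument in the range 10000..10FFFF hex; the word name >surrogates is invented for the example.

```forth
\ Sketch: UTF-16 surrogate pair for a code point in 10000h..10FFFFh.
\ Assumption: cells are at least 32 bits wide.

: >surrogates ( codepoint -- high low )
   65536 -                       \ subtract 10000h, leaving a 20-bit value
   DUP 10 RSHIFT 55296 +         \ top 10 bits    + D800h
   SWAP 1023 AND 56320 + ;       \ bottom 10 bits + DC00h

\ Usage, typed at the interpreter:
\   HEX 1D11E >surrogates SWAP U. U. DECIMAL
\ prints D834 DD1E
```

Anything outside the first plane therefore occupies two 16-bit units, so code that indexes by 16-bit unit no longer indexes by character.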
For programming languages (in particular, Forth) the reasonable
options are:
1) Fixed-width characters. For Unicode this would currently fit
in 32 bits, but who knows if they won't extend it to 64 bits or more
in the future. ANS Forth has the char data type, and fortunately its
width is not defined (other than being at least 8 bits), and the most
straightforward fit of Unicode to ANS Forth is to have each Unicode
character being a char; i.e., a char would be at least 32 bits in such
a system.
Unfortunately there are many programs around that assume that 1 CHARS
equals 1, and a very straightforward implementation of 32-bit
characters on a system would break such programs.
One way around this would be to make the address unit also 32 bits.
Benefit: does not break the programs mentioned above. Cost: on
byte-addressed hardware this would require a scaling operation on
every memory access; good compilers may be able to optimize some of
that away, but I suspect that this would require full-blown data-flow
analysis to be effective (AFAIK neither VFX nor iForth, and of course
nothing else does this).
2) Variable-width characters, where each Unicode character may consist
of several Forth chars (e.g., UTF-8). This would break Forth programs
that manipulate individual characters (e.g., anagram and palindrome
programs), but not programs that use C@/C! to deal with blocks/strings
of characters.
A Forth system that used UTF-8 could still be argued to conform to ANS
Forth, since it works correctly for the characters supported by ANS
Forth (ASCII characters, which are a single byte in UTF-8, too).
However, I consider it problematic to extend ANS Forth in this way
(i.e., by saying that UTF-8 is supported in extended-ANS-Forth), since
a correct ANS Forth program would not necessarily be a correct
extended-ANS-Forth program.
OTOH, I believe that fewer programs would be broken by such an
extension, and in fewer places, than with the wide-character variant
(but the broken programs in this case are ANS Forth programs, whereas
they are not pure ANS Forth programs in the wide-character variant).
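The "1 CHARS equals 1" dependency mentioned under option 1 looks like this in practice. The sketch assumes nothing beyond ANS Forth core words; the word names are invented for the example.

```forth
\ Sketch: the "1 CHARS equals 1" environmental dependency.

: 3rd-char-dep ( c-addr -- char )   2 + C@ ;         \ assumes 1 CHARS = 1
: 3rd-char     ( c-addr -- char )   2 CHARS + C@ ;   \ portable ANS Forth

: chars-demo ( -- )   S" Forth" DROP 3rd-char EMIT ; \ prints r
```

On a byte-addressed system where a char is one address unit, both words behave the same. On a byte-addressed system with 32-bit chars, 3rd-char-dep would fetch from the middle of the first character, which is the breakage a wide-character system has to worry about.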
Concerning the questions of what C@, C!, and the counts in CMOVE,
COUNT etc. refer to, of course they would refer to (fixed-width) Forth
chars, not Unicode characters. EMIT and KEY would have to buffer
stuff, converting sequences of Forth chars to/from Unicode characters.
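As a sketch of the distinction between Forth chars and Unicode characters under option 2: the count that CMOVE or COUNT sees is the number of bytes, while the number of Unicode characters has to be computed. The code assumes one Forth char is one 8-bit byte and well-formed UTF-8 input; the word names are hypothetical.

```forth
\ Sketch: bytes (Forth chars) versus Unicode characters in UTF-8.
\ Assumptions: 1 CHARS is one 8-bit byte; input is well-formed UTF-8.

: utf8-lead? ( byte -- flag )   192 AND 128 <> ;   \ false for 10xxxxxx

: utf8-length ( c-addr u -- n )
   0 ROT ROT                     \ keep a running count under the string
   0 ?DO
      DUP C@ utf8-lead? IF SWAP 1+ SWAP THEN
      CHAR+
   LOOP DROP ;

\ e-acute followed by "a", encoded as UTF-8: C3 A9 61
CREATE utf8-sample  195 C, 169 C, 97 C,
: utf8-demo ( -- )  utf8-sample 3 utf8-length . ;  \ prints 2
```

A program that only moves or compares whole strings never needs utf8-length; a palindrome checker that reverses the string byte by byte would.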
Originally I leaned towards option 1, because it fits perfectly with
the ANS Forth standard, i.e., works without change for all existing
standard programs. However, in the meantime I am more pragmatist than
purist in this area, so I also consider it important to keep existing
programs with common environmental dependencies working, and I now
think that the ANS Forth code broken by UTF-8 is not as much as I used
to think, so now I am leaning more towards option 2.
Has anyone done a Forth system with UTF-8 and can report experiences?
Hmm, I have used Fedora Core 1 for some months now (which supposedly
uses UTF-8), but have not noticed any problems with Gforth yet (but I
may not be dealing with UTF-8 stuff due to my setup).
>Interestingly, the Gtk graphical toolkit (an emergent standard under
>Linux) has defined that all strings are exchanged as UTF-8 octet
>strings, even though wchar_t was available and ready to use.
Sure, why should they use wchar_t?
>PS: the Unix kernel expects string arguments with a terminal 0, for
>instance when opening a file (the file name is thus provided). In Forth,
>strings do not have that terminating 0. When implementing a Forth
>system running under Unix, how is this usually solved ? By requiring
>application code to add (or at least leave room for) the terminating 0,
>or with local buffers where file names are temporarily copied ?
Gforth copies the strings to temporary buffers that are 0-terminated.
However, for the more important stuff there are calls that take memory
blocks (e.g., write/fwrite).
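The copy-to-a-temporary-buffer approach can be sketched as follows. This is not Gforth's actual code; the buffer size, the silent truncation of over-long names, and the names /zbuf, zbuf and >zstring are all assumptions made for the illustration.

```forth
\ Sketch: copy a Forth string (c-addr u) into a 0-terminated buffer.
\ Assumptions: a single static buffer is acceptable and over-long
\ strings may be truncated silently.

256 CONSTANT /zbuf
CREATE zbuf  /zbuf CHARS ALLOT

: >zstring ( c-addr u -- z-addr )
   /zbuf 1- MIN >R               \ leave room for the terminator
   zbuf R@ CMOVE                 \ copy the characters
   0 zbuf R> CHARS + C!          \ append the terminating 0
   zbuf ;
```

The returned address can then be handed to whatever system-specific word calls the OS. The application never has to reserve space for the terminator, at the cost of one copy per call.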
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
anton (5254)
|
6/18/2004 8:48:48 AM
|
|
On Fri, 18 Jun 2004 08:48:48 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote (more or less):
...
>Originally I leaned towards option 1, because it fits perfectly with
>the ANS Forth standard, i.e., works without change for all existing
>standard programs. However, in the meantime I am more pragmatist than
>purist in this area, so I also consider it important to keep existing
>programs with common environmental dependences working, and I now
>think that the ANS Forth code broken by UTF-8 is not as much as I used
>to think, so now I am leaning more towards option 2.
>
>Has anyone done a Forth system with UTF-8 and can report experiences?
>
>Hmm, I have used Fedora Core 1 for some months now (which supposedly
>uses UTF-8), but have not noticed any problems with Gforth yet (but I
>may not be dealing with UTF-8 stuff due to my setup).
Surely you'll never experience any problems unless you encounter some
extended charset characters?
If you only encounter ASCII chars, UTF-8 and ASCII are interchangeable,
surely?
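One way to check whether a given piece of text falls in that safe subset is to look for any byte above 127. This is a sketch only, assuming one Forth char is one 8-bit byte; the word name ascii-only? is made up, and TRUE and FALSE come from the CORE EXT word set.

```forth
\ Sketch: true if every byte is 7-bit ASCII, in which case the
\ ASCII and UTF-8 encodings of the text are byte-for-byte identical.
\ Assumption: 1 CHARS is one 8-bit byte.

: ascii-only? ( c-addr u -- flag )
   0 ?DO
      DUP C@ 127 > IF DROP FALSE UNLOOP EXIT THEN
      CHAR+
   LOOP DROP TRUE ;
```

If this returns true for everything a program reads and writes, running under a UTF-8 locale cannot show any difference.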
--
Cheers,
Euan
Gawnsoft: http://www.gawnsoft.co.sr
Symbian/Epoc wiki: http://html.dnsalias.net:1122
Smalltalk links (harvested from comp.lang.smalltalk) http://html.dnsalias.net/gawnsoft/smalltalk
|
xlucid (70)
|
6/18/2004 11:05:09 AM
|
|
On Fri, 18 Jun 2004 08:48:48 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:
>Has anyone done a Forth system with UTF-8 and can report experiences?
MPE has clients who support internationalised apps and have
used variable width character encoding. The discussions in
the draft proposals on internationalisation and wide character
sets are available from
http://www.mpeltd.demon.co.uk/arena.htm
and are the result of years of experience. The file
INTERNATIONAL.FTH is supplied as part of VFX Forth for
Windows from the same location. This is an implementation
of what is in the drafts.
The real issue is NOT how the underlying Forth works, but
what the relationship is between the
* host Forth - usually ISO Latin1
* application's current language
* operating system
My standard example for this is a Russian engineer using
an application written in South Africa running on a Chinese
version of Windows. It happens!
Stephen
--
Stephen Pelc, stephenXXX@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads
|
stephenXXX1 (459)
|
6/18/2004 12:05:43 PM
|
|
stephenXXX@INVALID.mpeltd.demon.co.uk (Stephen Pelc) writes:
>On Fri, 18 Jun 2004 08:48:48 GMT, anton@mips.complang.tuwien.ac.at
>(Anton Ertl) wrote:
>
>>Has anyone done a Forth system with UTF-8 and can report experiences?
>
>MPE has clients who support internationalised apps and have
>used variable width character encoding.
Such experiences should be transferable to UTF-8. What I wonder about
is: What are the experiences with using UTF-8 or other variable-width
character encodings with code that was not designed for variable-width
characters from the start? How much breakage was there, how hard was
it to find the breakage, and to fix it?
I get the impression that variable-width character encodings were not
used for the DCS (developer character sets).
> The discussions in
>the draft proposals on internationalisation and wide character
>sets are available from
> http://www.mpeltd.demon.co.uk/arena.htm
>and are the result of years of experience.
Not sure what you mean, but I looked at
http://www.mpeltd.demon.co.uk/arena/i18n.widechar.v7.PDF and did find
a proposal, but no experience report.
>The real issue is NOT how the underlying Forth works, but
>what the relationship is between the
> * host Forth - usually ISO Latin1
> * application's current language
> * operating system
>My standard example for this is a Russian engineer using
>an application written in South Africa running on a Chinese
>version of Windows. It happens!
If you want different character sets/encodings for each of them, this
is messy, confusing, and probably error-prone. In particular the
division between host/development and application eliminates one of
the advantages of Forth: the absence of such a division.
But Unicode offers us a different perspective: Everyone uses the same
character set/encoding in the source code and in the application; the
OS should be less of a problem, we have well-defined interface words
for that (well, except for stuff like FILE-POSITION, which forced the
OS view of characters on Gforth).
That still leaves the problem of using different languages in the
various output text strings and other locale stuff (e.g., number and
currency formatting), but at least we don't get the mess at the
character level.
Looking at other programming languages, I don't see them sporting a
DCS, OCS, and ACS (or maybe I did not look close enough). I doubt
that they are used any less for international applications than Forth.
Supporting different DCS, OCS, and ACS could be an option (that I
don't intend to implement), but my question was about having UTF-8 as
a common CS.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
|
0
|
|
|
|
Reply
|
anton (5254)
|
6/18/2004 5:27:47 PM
|
|
On Fri, 18 Jun 2004 17:27:47 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:
>Such experiences should be transferable to UTF-8. What I wonder about
>is: What are the experiences with using UTF-8 or other variable-width
>character encodings with code that was not designed for variable-width
>characters from the start? How much breakage was there, how hard was
>it to find the breakage, and to fix it?
>
>I get the impression that variable-width character encodings were not
>used for the DCS (developer character sets).
The papers are draft proposals based on techniques from years of
experience at CCS and Micross, with additional input from others.
Using variable width encodings for the developer character set
(DCS) breaks too much code and introduces kernel complexity, so
nobody wants to do that. Hence the application character set (ACS)
which now can be defined separately from the DCS. This permits
you to avoid most of the issues of COUNT and CMOVE etc.
>If you want different character sets/encodings for each of them, this
>is messy, confusing, and probably error-prone. In particular the
>division between host/development and application eliminates one of
>the advantages of Forth: the absence of such a division.
The drafts reflect reality: the vast majority of existing hosted
Forths assume char=byte=address-unit. We still see application
programmers using COUNT to step through memory.
>But Unicode offers us a different perspective: Everyone uses the same
>character set/encoding in the source code and in the application;
At the time the drafts were written there were languages for which
Unicode encodings did not exist. UTF-16 is not adequate and even
some embedded systems with 16 bit Forths need run-time selection
of display language.
>Looking at other programming languages, I don't see them sporting a
>DCS, OCS, and ACS (or maybe I did not look close enough). I doubt
>that they are used any less for international applications than Forth.
C effectively recognises DCS and ACS by having chars and wide
chars. All that we did was to realise that the operating system
can use another encoding, OCS. Again, this just reflects reality
at the time the drafts were written.
>Supporting different DCS, OCS, and ACS could be an option (that I
>don't intend to implement), but my question was about having UTF-8 as
>a common CS.
UTF-8 is fine as a transfer standard. But any variable width
encoding scheme will be incompatible with common practice for
fixed-width encoding. We just recognised that and accepted the
consequences.
Stephen
--
Stephen Pelc, stephenXXX@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads
|
|
0
|
|
|
|
Reply
|
stephenXXX1 (459)
|
6/19/2004 10:09:51 AM
|
|
"Anton Ertl" <anton@mips.complang.tuwien.ac.at> wrote in message
news:2004Jun18.104848@mips.complang.tuwien.ac.at...
===snipped
>
> 2) Variable-width characters, where each Unicode character may consist
> of several Forth chars (e.g., UTF-8). This would break Forth programs
> that manipulate individual characters (e.g., anagram and palindrome
> programs), but not programs that use C@/C! to deal with blocks/strings
> of characters.
>
> A Forth system that used UTF-8 could still be argued to conform to ANS
> Forth, since it works correctly for the characters supported by ANS
> Forth (ASCII characters, which are a single byte in UTF-8, too).
>
> However, I consider it problematic to extend ANS Forth in this way
> (i.e., by saying that UTF-8 is supported in extended-ANS-Forth), since
> a correct ANS Forth program would not necessarily be a correct
> extended-ANS-Forth program.
That I can't see; can you explain? As a correct ANS Forth program would be a
subset (in character terms) of an extended-ANS-Forth program, why would an
extended-ANS-system not correctly process it?
>
> OTOH, I believe that fewer programs would be broken by such an
> extension, and in fewer places, than with the wide-character variant
> (but the broken programs in this case are ANS Forth programs, whereas
> they are not pure ANS Forth programs in the wide-character variant).
>
> Concerning the questions of what C@, C!, and the counts in CMOVE,
> COUNT etc. refer to, of course they would refer to (fixed-width) Forth
> chars, not Unicode characters. EMIT and KEY would have to buffer
> stuff, converting sequences of Forth chars to/from Unicode characters.
>
Again, A.3.1.2 Character types; 5) For the purposes of input (KEY, ACCEPT,
etc.) and output (EMIT, TYPE, etc.), the encoding between numbers and
human-readable symbols is ISO646/IRV (ASCII) within the range from 32 to 126
(space to ~).
So anything outside that range has an environmental dependency; standard
programs should be fine in a UTF-8 system?
>
> Originally I leaned towards option 1, because it fits perfectly with
> the ANS Forth standard, i.e., works without change for all existing
> standard programs.
If COUNTED-BUFF 1+ is considered non-standard (or 1 \STRING for addr-len
buffers). There's lots of code out there that doesn't use CHARS or CHAR+ ; I
know that 3.3.3.1 says that "Adding or subtracting an arbitrary number to an
address can produce an unaligned address that shall not be used to fetch or
store anything", but how much code abides by that restriction? Any
wide-character system will suffer from this problem, but the advantage of
UTF-8 is that programs that are non-standard on this one point only will
continue to run correctly.
> However, in the meantime I am more pragmatist than
> purist in this area, so I also consider it important to keep existing
> programs with common environmental dependences working, and I now
> think that the ANS Forth code broken by UTF-8 is not as much as I used
> to think, so now I am leaning more towards option 2.
>
> Has anyone done a Forth system with UTF-8 and can report experiences?
FILE-POSITION as you note (separate thread in reply to Stephen Pelc) on
non-BIN files might be an issue, but given that line endings are not defined
and are variable length (CR/LF/CRLF) then the issue is moot; it's almost
impossible to use on a text file and remain ANS unless the result from
FILE-POSITION is treated as opaque. UTF-8 adds no complications to this.
I'm leaning towards UTF-8 myself; W32F should be able to compile UTF-8
source with no to little change. I think that the only change might be an
increase to the length of the SOURCE buffer, and a check in OPEN-FILE for
hex EF BB BF (which I believe to be the UTF-8 lead-in sequence).
Frustratingly, the editor I currently use supports ASCII, Latin-1 and UTF-16
only; are there any decent text editors for Windows that support UTF-8?
Notepad supports UTF-8, but it's not really suitable.
--
Regards
Alex McDonald
|
|
0
|
|
|
|
Reply
|
alex_mcd (751)
|
6/19/2004 11:16:44 AM
|
|
According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
> pornin@nerim.net (Thomas Pornin) writes:
> >Interestingly, the Gtk graphical toolkit (an emergent standard under
> >Linux) has defined that all strings are exchanged as UTF-8 octet
> >strings, even though wchar_t was available and ready to use.
>
> Sure, why should they use wchar_t?
UTF-8 is a way to use Unicode on machines which only handle ASCII or
latin-1. C includes a basic character processing system, with the type
"char" and support functions, which is mapped on ASCII or some other
octet-based charset on most machines (at least those on which Gtk may
run).
C also has support for "extended" characters, with the "wchar_t" type,
wide-character string literals (such as L"hello") and wide-character
constants (L'H'), and string functions which use wide characters
(wprintf() and co). On modern machines (those targeted by Gtk),
"wchar_t" is mapped on Unicode. As far as the C standard is concerned,
"wchar_t" is the way of the future for Unicode handling.
The fact that Gtk does not use wchar_t, but prefers UTF-8, shows that
the Gtk designers have deemed wchar_t an inappropriate way to handle
Unicode. This may be an indication of obsolescence. Thus, I find this
fact interesting.
--Thomas Pornin
|
|
0
|
|
|
|
Reply
|
pornin (75)
|
6/20/2004 9:53:31 AM
|
|
According to Alex McDonald <alex_mcd@btopenworld.com>:
> hex EF BB BF (which I believe to be the UTF-8 lead-in sequence)
It is the encoding of U+FEFF, the "zero-width no-break space". That is a
Unicode character with no graphical effect. When used as the first
character of a string, it is called a BOM ("byte order mark") because
it is an unambiguous way to recognize encoding endianness, for UTF-16
and UTF-32.
UTF-8 has no endianness problem (it is defined as a stream of octets)
and, as such, does not need a BOM. Previous versions of Unicode forbade
BOM for UTF-8; current versions allow it but do not mandate it. The
bottom line is that many UTF-8 texts do NOT begin with EF BB BF. At my
office, we recently had some problems with an emacs which added a BOM
to an otherwise ASCII-only XML file, which made some other tool very
unhappy.
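Since the BOM is optional, a loader can only treat it as something to skip when present, never as something to require. A minimal Forth sketch, assuming 1 CHARS is one byte and that the file contents are already in memory; SKIP-BOM is a hypothetical name:

```forth
\ Sketch only; assumes 1 CHARS = 1 byte.
\ Skip a leading UTF-8 BOM (decimal 239 187 191 = hex EF BB BF) if present;
\ otherwise return the string unchanged.
: skip-bom ( c-addr u -- c-addr' u' )
   dup 3 < if  exit  then
   over           c@ 239 <> if  exit  then
   over 1 chars + c@ 187 <> if  exit  then
   over 2 chars + c@ 191 <> if  exit  then
   3 /string ;
```

/STRING comes from the ANS String word set; text without a BOM passes through untouched.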
> Frustratingly, the editor I currently use supports ASCII, Latin-1
> and UTF-16 only; are there any decent text editors for Windows that
> support UTF-8?
Both vim and emacs have Windows versions, and both may handle UTF-8
relatively cleanly. I am not sure about bidirectional output, though.
--Thomas Pornin
|
|
0
|
|
|
|
Reply
|
pornin (75)
|
6/20/2004 10:03:11 AM
|
|
stephenXXX@INVALID.mpeltd.demon.co.uk (Stephen Pelc) writes:
>On Fri, 18 Jun 2004 17:27:47 GMT, anton@mips.complang.tuwien.ac.at
>(Anton Ertl) wrote:
>>What are the experiences with using UTF-8 or other variable-width
>>character encodings with code that was not designed for variable-width
>>characters from the start? How much breakage was there, how hard was
>>it to find the breakage, and to fix it?
....
>Using variable width encodings for the developer character set
>(DCS) breaks too much code and introduces kernel complexity, so
>nobody wants to do that.
That's the stuff I was looking for. Could you elaborate on that?
> Hence the application character set (ACS)
>which now can be defined separately from the DCS. This permits
>you to avoid most of the issues of COUNT and CMOVE etc.
I don't see any issues with COUNT, CMOVE, etc. They would just refer
to the fixed-width Forth chars (usually bytes), not to Unicode
characters.
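To illustrate the distinction, here is a minimal sketch that takes a length in Forth chars (as COUNT and CMOVE would see it) and reports the number of UTF-8 characters. It assumes 1 CHARS is one byte and well-formed input; UTF8-LENGTH is a hypothetical name.

```forth
\ Sketch only; assumes 1 CHARS = 1 byte and well-formed UTF-8.
\ Count characters by counting every byte that is not a
\ continuation byte (binary 10xxxxxx).
: utf8-length ( c-addr u -- n )
   0 rot rot                          ( n c-addr u )
   0 ?do                              ( n c-addr )
      dup i chars + c@ 192 and 128 <> if  swap 1+ swap  then
   loop
   drop ;
```

For pure ASCII text the result equals u; it is smaller whenever multi-byte characters are present.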
> We still see application
>programmers using COUNT to step through memory.
While not especially good style, that's standard-compliant and may be
appropriate for speed on simple Forth systems.
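The idiom in question looks roughly like this (a sketch; .CHARS is a hypothetical name). COUNT fetches the char at the address and returns the next address, so it doubles as a fetch-and-advance step:

```forth
\ Sketch only: print each Forth char of a string as a number,
\ using COUNT ( c-addr -- c-addr+1 char ) to step through memory.
: .chars ( c-addr u -- )
   0 ?do  count .  loop  drop ;
```

On a UTF-8 system this still steps through Forth chars (bytes), not Unicode characters.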
>>Looking at other programming languages, I don't see them sporting a
>>DCS, OCS, and ACS (or maybe I did not look close enough). I doubt
>>that they are used any less for international applications than Forth.
>C effectively recognises DCS and ACS by having chars and wide
>chars.
Hmm, thinking a little bit about this, C has a DCS different from the
ACS, because it has a strict separation of compile time and run-time.
The C source is in the DCS and the strings manipulated at run-time are
in the ACS; both the char type and the wchar_t type are for expressing
application characters (i.e., ACS).
However, looking at languages without this strict separation
(typically interactive languages, but probably also stuff like Java
reflection), I don't see a separation between DCS and ACS, either.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
|
0
|
|
|
|
Reply
|
anton (5254)
|
6/26/2004 6:47:47 PM
|
|
"Alex McDonald" <alex_mcd@btopenworld.com> writes:
>"Anton Ertl" <anton@mips.complang.tuwien.ac.at> wrote in message
>news:2004Jun18.104848@mips.complang.tuwien.ac.at...
>
>===snipped
>
>>
>> 2) Variable-width characters, where each Unicode character may consist
>> of several Forth chars (e.g., UTF-8). This would break Forth programs
>> that manipulate individual characters (e.g., anagram and palindrome
>> programs), but not programs that use C@/C! to deal with blocks/strings
>> of characters.
>>
>> A Forth system that used UTF-8 could still be argued to conform to ANS
>> Forth, since it works correctly for the characters supported by ANS
>> Forth (ASCII characters, which are a single byte in UTF-8, too).
>>
>> However, I consider it problematic to extend ANS Forth in this way
>> (i.e., by saying that UTF-8 is supported in extended-ANS-Forth), since
>> a correct ANS Forth program would not necessarily be a correct
>> extended-ANS-Forth program.
>
>That I can't see; can you explain? As a correct ANS Forth program would be a
>subset (in character terms) of an extended-ANS-Forth program, why would an
>extended-ANS-system not correctly process it?
Consider the following word:
: palindrome? ( addr u -- f )
    chars over + begin ( addr1 addr2 )
        2dup u< while
            1 chars -
            over c@ over c@ <> if
                2drop false exit
            endif
            swap char+ swap
    repeat
    2drop true ;
This works nicely with fixed-width characters, e.g., with the ANS
Forth specified ASCII character set. If we extended ANS Forth with
UTF-8, this program would still work correctly for the original ASCII
characters (because in the ASCII subset of UTF-8 each character still
is fixed-width and fits in one byte), but it would fail to work as
intended for strings containing multi-byte UTF-8 characters (i.e.,
everything but ASCII).
So in some sense this program would not be a correct
extended-ANS-Forth program. In that sense extending the standard with
UTF-8 would not be an extension, but a restriction.
>> Concerning the questions of what C@, C!, and the counts in CMOVE,
>> COUNT etc. refer to, of course they would refer to (fixed-width) Forth
>> chars, not Unicode characters. EMIT and KEY would have to buffer
>> stuff, converting sequences of Forth chars to/from Unicode characters.
>>
>
>Again, A.3.1.2 Character types; 5) For the purposes of input (KEY, ACCEPT,
>etc.) and output (EMIT, TYPE, etc.), the encoding between numbers and
>human-readable symbols is ISO646/IRV (ASCII) within the range from 32 to 126
>(space to ~).
>
>So anything outside that range has an environmental dependency; standard
>programs should be fine in a UTF-8 system?
Yes, but only as long as you keep them restricted to ASCII.
>> Originally I leaned towards option 1, because it fits perfectly with
>> the ANS Forth standard, i.e., works without change for all existing
>> standard programs.
>
>If COUNTED-BUFF 1+ is considered non-standard (or 1 \STRING for addr-len
>buffers).
COUNTED-BUFF 1+: non-standard (COUNTED-BUFF not defined).
1 \STRING: non-standard (\STRING not defined).
OTOH, 's" foo" 1 /STRING' is standard, and works perfectly even if 1
CHARS does not equal 1.
> There's lots of code out there that doesn't use CHARS or CHAR+
Yes. Probably more than code like PALINDROME?.
>Any
>wide-character system will suffer from this problem, but the advantage of
>UTF-8 is that programs that are non-standard on this one point only will
>continue to run correctly.
Yes, as long as it does not treat characters individually, like
PALINDROME?.
>FILE-POSITION as you note (separate thread in reply to Stephen Pelc) on
>non-BIN files might be an issue, but given that line endings are not defined
>and are variable length (CR/LF/CRLF) then the issue is moot; it's almost
>impossible to use on a text file and remain ANS unless the result from
>FILE-POSITION is treated as opaque.
11.2 says:
|file position:
| The character offset from the start of the file.
While there may be some way to interpret this as allowing opaque file
positions, we decided not to do this in Gforth (it led to various
problems under Windows), and changed it to have a 1:1 mapping between
file contents and memory contents (i.e., no translation between CRLF
on-file and LF in memory, unlike earlier Gforth versions). This
caused some minor other problems, but overall I am happy with the
decision.
>UTF-8 adds no complications to this.
Right, as long as we don't try to translate on I/O (e.g., from
internal fixed-width to external UTF-8).
>I'm leaning towards UTF-8 myself; W32F should be able to compile UTF-8
>source with no to little change.
Same for Gforth. The only thing that comes to mind that needs
changing is the positioning of the pointers for pointing out the error
location. ^^^^^^^^ like this
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
|
0
|
|
|
|
Reply
|
anton (5254)
|
6/26/2004 7:09:10 PM
|
|
pornin@nerim.net (Thomas Pornin) writes:
>The fact that Gtk does not use wchar_t, but prefers UTF-8, shows that
>the Gtk designers have deemed wchar_t an inappropriate way to handle
>Unicode. This may be an indication of obsolescence. Thus, I find this
>fact interesting.
Obsolescence of what? Of Gtk or wchar_t? I lean towards the latter.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
|
0
|
|
|
|
Reply
|
anton (5254)
|
6/26/2004 7:45:56 PM
|
|
"Anton Ertl" <anton@mips.complang.tuwien.ac.at> wrote in message
news:2004Jun26.210910@mips.complang.tuwien.ac.at...
> "Alex McDonald" <alex_mcd@btopenworld.com> writes:
> >"Anton Ertl" <anton@mips.complang.tuwien.ac.at> wrote in message
> >news:2004Jun18.104848@mips.complang.tuwien.ac.at...
===snipped
> >
> >That I can't see; can you explain? As a correct ANS Forth program would be a
> >subset (in character terms) of an extended-ANS-Forth program, why would an
> >extended-ANS-system not correctly process it?
>
> Consider the following word:
>
> : palindrome? ( addr u -- f )
>     chars over + begin ( addr1 addr2 )
>         2dup u< while
>             1 chars -
>             over c@ over c@ <> if
>                 2drop false exit
>             endif
>             swap char+ swap
>     repeat
>     2drop true ;
>
> This works nicely with fixed-width characters, e.g., with the ANS
> Forth specified ASCII character set. If we extended ANS Forth with
> UTF-8, this program would still work correctly for the original ASCII
> characters (because in the ASCII subset of UTF-8 each character still
> is fixed-width and fits in one byte), but it would fail to work as
> intended for strings containing multi-byte UTF-8 characters (i.e.,
> everything but ASCII).
>
> So in some sense this program would not be a correct
> extended-ANS-Forth program. In that sense extending the standard with
> UTF-8 would not be an extension, but a restriction.
That was my point; it's not an extended-ANS-Forth program; but it is an
ANS-Forth program that will run correctly on an extended-ANS-Forth _system_.
No restriction?
===much snipped
>
> >Any
> >wide-character system will suffer from this problem, but the advantage of
> >UTF-8 is that programs that are non-standard on this one point only will
> >continue to run correctly.
>
> Yes, as long as it does not treat characters individually, like
> PALINDROME?.
Agreed; but again, it's not an extended-ANS-Forth program; but it is an
ANS-Forth program that will run correctly on an extended-ANS-Forth system.
What an extended-ANS-Forth PALINDROME program looks like, I'm not yet quite
sure.
Is an ENVIRONMENT variable in order here; say, UTF-EXT and UTF-TYPE which
returns "UTF-8" "UCS-2" "UTF-16" etc per the accepted standard encodings?
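A minimal sketch of how a program might use such a query. The query string "UTF-TYPE" is the name floated above; no system is known to define it, and its returning a string ( c-addr u ) is an assumption. REQUIRE-UTF8 is a hypothetical name.

```forth
\ Sketch only. ENVIRONMENT? returns false for an unknown query, so a
\ system without the (hypothetical) "UTF-TYPE" query also aborts here.
: require-utf8 ( -- )
   s" UTF-TYPE" environment? if       ( c-addr u )  \ assumed result type
      s" UTF-8" compare 0= if  exit  then
   then
   true abort" UTF-8 support required" ;
```

COMPARE comes from the ANS String word set and returns 0 when the two strings are equal.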
>
> >FILE-POSITION as you note (separate thread in reply to Stephen Pelc) on
> >non-BIN files might be an issue, but given that line endings are not defined
> >and are variable length (CR/LF/CRLF) then the issue is moot; it's almost
> >impossible to use on a text file and remain ANS unless the result from
> >FILE-POSITION is treated as opaque.
>
> 11.2 says:
>
> |file position:
> | The character offset from the start of the file.
>
> While there may be some way to interpret this as allowing opaque file
> positions, we decided not to do this in Gforth (it led to various
> problems under Windows), and changed it to have a 1:1 mapping between
> file contents and memory contents (i.e., no translation between CRLF
> on-file and LF in memory, unlike earlier Gforth versions). This
> caused some minor other problems, but overall I am happy with the
> decision.
>
> >UTF-8 adds no complications to this.
>
> Right, as long as we don't try to translate on I/O (e.g., from
> internal fixed-width to external UTF-8).
[Which is the correct terminology here? Is a file handle an opaque or a
transparent object? I thought opaque values are not inspectable and can't be
manipulated; but that transparent values can be manipulated. So the output
from FILE-POSITION is unusable except to pass to REPOSITION-FILE ? In which
case, we agree; 1:1 is the only sensible option.]
Presumably READ-LINE and WRITE-LINE can be permitted that luxury. R/O R/W
BIN might be extendable with FILE-IS-UTF (for instance) honoured on
byte-stream I/O words. This I've not thought through yet.
>
> >I'm leaning towards UTF-8 myself; W32F should be able to compile UTF-8
> >source with no to little change.
>
> Same for Gforth. The only thing that comes to mind that needs
> changing is the positioning of the pointers for pointing out the error
> location. ^^^^^^^^ like this
Aha! W32F doesn't bother. That's about the only simplicity we have...
Error: AHA! is undefined
--
Regards
Alex McDonald
|
|
0
|
|
|
|
Reply
|
alex_mcd (751)
|
6/26/2004 9:30:54 PM
|
|
Stephen Pelc wrote:
...
> Using variable width encodings for the developer character set
> (DCS) breaks too much code and introduces kernel complexity, so
> nobody wants to do that. Hence the application character set (ACS)
> which now can be defined separately from the DCS. This permits
> you to avoid most of the issues of COUNT and CMOVE etc.
...
Oh, acronyms! www.lacdcs.org/reorganization/ReorgCharter.doc
Jerry
--
Engineering is the art of making what you want from things you can get.
|
|
0
|
|
|
|
Reply
|
jya (12870)
|
6/26/2004 9:48:21 PM
|
|
"Alex McDonald" <alex_mcd@btopenworld.com> writes:
>"Anton Ertl" <anton@mips.complang.tuwien.ac.at> wrote in message
[PALINDROME?]
>> This works nicely with fixed-width characters, e.g., with the ANS
>> Forth specified ASCII character set. If we extended ANS Forth with
>> UTF-8, this program would still work correctly for the original ASCII
>> characters (because in the ASCII subset of UTF-8 each character still
>> is fixed-width and fits in one byte), but it would fail to work as
>> intended for strings containing multi-byte UTF-8 characters (i.e.,
>> everything but ASCII).
>>
>> So in some sense this program would not be a correct
>> extended-ANS-Forth program. In that sense extending the standard with
>> UTF-8 would not be an extension, but a restriction.
>
>That was my point; it's not an extended-ANS-Forth program; but it is an
>ANS-Forth program that will run correctly on an extended-ANS-Forth _system_.
>No restriction?
Maybe, but not a real extension either. A change is only a real
extension of ANS Forth if compliant ANS Forth programs are also
compliant programs for ANS-Forth-with-the-change.
>Agreed; but again, it's not an extended-ANS-Forth program; but it is an
>ANS-Forth program that will run correctly on an extended-ANS-Forth system.
>What an extended-ANS-Forth PALINDROME program looks like, I'm not yet quite
>sure.
We would need words for dealing with UTF-8 characters or strings. In
the current case an UTF8@ that produces the character that includes
the given address would be sufficient (just use it instead of C@), but
in general I guess we have to gain some experience until we know a
good set of words.
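A rough sketch of what such a UTF8@ could look like, and of PALINDROME? with UTF8@ substituted for C@. It assumes 1 CHARS is one byte and well-formed UTF-8 that starts on a character boundary. UTF8-CONT?, UTF8-START, UTF8-SIZE and UTF8-PALINDROME? are hypothetical helper names, not words from any existing system.

```forth
\ Sketch only; assumes 1 CHARS = 1 byte and well-formed UTF-8.

\ True if byte is a continuation byte (binary 10xxxxxx).
: utf8-cont? ( byte -- flag )  192 and 128 = ;

\ Back up from any address inside a character to its first byte.
: utf8-start ( c-addr -- c-addr' )
   begin  dup c@ utf8-cont?  while  1 chars -  repeat ;

\ Number of bytes in a character, given its first byte.
: utf8-size ( lead -- n )
   dup 128 < if  drop 1 exit  then
   dup 224 < if  drop 2 exit  then
   240 < if  3  else  4  then ;

\ Fetch the code point of the character that includes c-addr.
: utf8@ ( c-addr -- u )
   utf8-start  dup c@  dup utf8-size   ( c-addr lead n )
   dup 1 = if  drop nip exit  then     \ ASCII: the byte is the code point
   >r  127 r@ rshift and               ( c-addr acc )  \ payload bits of lead
   r> 1 ?do                            \ fold in the n-1 continuation bytes
      6 lshift  over i chars + c@  63 and  +
   loop
   nip ;

\ PALINDROME? with UTF8@ in place of C@; u is still a count of Forth chars.
: utf8-palindrome? ( c-addr u -- flag )
   chars over +  begin                 ( addr1 addr2 )
      2dup u<  while
      1 chars -
      over utf8@ over utf8@ <> if  2drop false exit  then
      swap char+ swap
   repeat
   2drop true ;
```

The loop still advances one Forth char at a time, so each multi-byte character is decoded several times; that is simple but not efficient.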
>Is an ENVIRONMENT variable in order here; say, UTF-EXT and UTF-TYPE which
>returns "UTF-8" "UCS-2" "UTF-16" etc per the accepted standard encodings?
What would you use these queries for?
One thing that would be useful is having a standard way to document
whether a program works with variable-width character sets or not
(with the default being "undocumented", so both "known to work with
variable-width characters" and "known not to work with variable-width
characters" would have to be documented explicitly). An environmental
query might be a way to do that (it is also the way to document that a
wordset is required), or we might just add it to the program
documentation requirements.
>> 11.2 says:
>>
>> |file position:
>> | The character offset from the start of the file.
....
>[Which is the correct terminology here? Is a file handle an opaque or a
>transparent object? I thought opaque values are not inspectable and can't be
>manipulated;
Opaque types can only be inspected and manipulated through
type-specific access words, not through general data access words like
@ and move. At least that's the meaning of opaque in Modula-2. This
idea is also known as abstract data type. File-ids in Forth are
opaque.
>So the output
>from FILE-POSITION is unusable except to pass to REPOSITION-FILE ? In which
>case, we agree; 1:1 is the only sensible option.]
File positions are specified as ud, so I assume that you can compute
with them. I.e., they are not opaque. That's why 1:1 is the only
thing that works (it's the only guarantee that one (CHARS) character
in memory is also a character in the file system).
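As an illustration of computing with file positions, here is a minimal sketch that skips u chars forward from the current position. It relies on the 1:1 mapping described above, with one char in memory being one char in the file; SKIP-CHARS is a hypothetical name.

```forth
\ Sketch only: advance the file position by u chars, treating the
\ ud from FILE-POSITION as a number that can be computed with.
: skip-chars ( u fileid -- ior )
   >r  r@ file-position                ( u ud ior )
   ?dup if  nip nip nip  r> drop exit  then
   rot 0 d+                            ( ud' )  \ add u as a double
   r> reposition-file ;
```

D+ comes from the ANS Double-Number word set. If file positions were opaque, this kind of arithmetic would not be allowed.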
If file positions were opaque in Forth, then arbitrary translations on
I/O would be ok.
>Presumably READ-LINE and WRITE-LINE can be permitted that luxury. R/O R/W
>BIN might be extendable with FILE-IS-UTF (for instance) honoured on
>byte-stream I/O words. This I've not thought through yet.
Unless you want to do translation, there is no need for FILE-IS-UTF
(and no need for BIN, either).
>> Same for Gforth. The only thing that comes to mind that needs
>> changing is the positioning of the pointers for pointing out the error
>> location. ^^^^^^^^ like this
>
>Aha! W32F doesn't bother. That's about the only simplicity we have...
>Error: AHA! is undefined
What about other errors (e.g., stack underflows), with several
instances of the same word on the command line?
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
|
0
|
|
|
|
Reply
|
anton (5254)
|
6/27/2004 7:10:35 AM
|
|
On Sat, 26 Jun 2004 18:47:47 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:
>stephenXXX@INVALID.mpeltd.demon.co.uk (Stephen Pelc) writes:
>>On Fri, 18 Jun 2004 17:27:47 GMT, anton@mips.complang.tuwien.ac.at
>>(Anton Ertl) wrote:
>>>What are the experiences with using UTF-8 or other variable-width
>>>character encodings with code that was not designed for variable-width
>>>characters from the start? How much breakage was there, how hard was
>>>it to find the breakage, and to fix it?
>...
>>Using variable width encodings for the developer character set
>>(DCS) breaks too much code and introduces kernel complexity, so
>>nobody wants to do that.
>
>That's the stuff I was looking for. Could you elaborate on that?
COUNT for a start. CMOVE and friends in variable width encodings
require character counts whereas buffers are normally defined
in address units. The *experience* of people with
internationalised apps is that working in terms of memory
size is safer for application programmers.
In terms of the DCS, people use C@ COUNT CMOVE and so on for
memory manipulation, and the byte=char=address-unit assumption
is widespread, especially in embedded systems. You change that
at great peril which will bring great cost.
Most applications (that I know of) that require wide character
sets require more than one language. This brings another set
of problems for internationalisation. How much screen space
will the string take? What are the time and date formats?
Even a simple output like:
Your balance at %time% on %date% was %balance%.
involves yet more problems such as that in some languages
the conventional presentation involves a different parameter
order.
In general, trying to merge the DCS with a generalised ACS is
solving the wrong problem. In the main you want a wide
character set to deal with multiple languages. Dealing
with multiple languages requires consideration of many
other issues than just character size.
Stephen
--
Stephen Pelc, stephenXXX@INVALID.mpeltd.demon.co.uk
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeltd.demon.co.uk - free VFX Forth downloads
|
|
0
|
|
|
|
Reply
|
stephenXXX1 (459)
|
6/27/2004 8:03:44 AM
|
|
anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>"Alex McDonald" <alex_mcd@btopenworld.com> writes:
>>I'm leaning towards UTF-8 myself; W32F should be able to compile UTF-8
>>source with no to little change.
>
>Same for Gforth. The only thing that comes to mind that needs
>changing is the positioning of the pointers for pointing out the error
>location. ^^^^^^^^ like this
I just did a little test. There is another, more serious problem:
Editing the command line. Other than that, Gforth was fine.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
|
|
0
|
|
|
|
Reply
|
anton (5254)
|
6/27/2004 8:31:03 AM
|
|
"Anton Ertl" <anton@mips.complang.tuwien.ac.at> wrote in message
news:2004Jun27.091035@mips.complang.tuwien.ac.at...
> "Alex McDonald" <alex_mcd@btopenworld.com> writes:
> >"Anton Ertl" <anton@mips.complang.tuwien.ac.at> wrote in message
===snipped
>
> We would need words for dealing with UTF-8 characters or strings. In
> the current case an UTF8@ that produces the character that includes
> the given address would be sufficient (just use it instead of C@), but
> in general I guess we have to gain some experience until we know a
> good set of words.
>
> >Is an ENVIRONMENT variable in order here; say, UTF-EXT and UTF-TYPE which
> >returns "UTF-8" "UCS-2" "UTF-16" etc per the accepted standard encodings?
>
> What would you use these queries for?
>
> One thing that would be useful is having a standard way to document
> whether a program works with variable-width character sets or not
> (with the default being "undocumented", so both "known to work with
> variable-width characters" and "known not to work with variable-width
> characters" would have to be documented explicitly). An environmental
> query might be a way to do that (it is also the way to document that a
> wordset is required), or we might just add it to the program
> documentation requirements.
>
Badly thought through; but a query for various UTF and UCS types would allow
a certain flexibility of implementation; say, with conditional UTF-16?
[UNDEFINED] [IF] .( Requires UTF-16 support) ABORT [THEN] .
> >> 11.2 says:
> >>
> >> |file position:
> >> | The character offset from the start of the file.
> ...
> >[Which is the correct terminology here? Is a file handle an opaque or a
> >transparent object? I thought opaque values are not inspectable and can't
> >be manipulated;
>
> Opaque types can only be inspected and manipulated through
> type-specific access words, not through general data access words like
> @ and move. At least that's the meaning of opaque in Modula-2. This
> idea is also known as abstract data type. File-ids in Forth are
> opaque.
>
> >So the output
> >from FILE-POSITION is unusable except to pass to REPOSITION-FILE ? In
> >which case, we agree; 1:1 is the only sensible option.]
>
> File positions are specified as ud, so I assume that you can compute
> with them. I.e., they are not opaque. That's why 1:1 is the only
> thing that works (it's the only guarantee that one (CHARS) character
> in memory is also a character in the file system).
>
> If file positions were opaque in Forth, then arbitrary translations on
> I/O would be ok.
>
> >Presumably READ-LINE and WRITE-LINE can be permitted that luxury. R/O R/W
> >BIN might be extendable with FILE-IS-UTF (for instance) honoured on
> >byte-stream I/O words. This I've not thought through yet.
>
> Unless you want to do translation, there is no need for FILE-IS-UTF
> (and no need for BIN, either).
Yes, to translate; for instance (to use Stephen Pelc's nomenclature) the ACS
might be UTF-16. The native OCS for Windows NT and above is UTF-16 (although
some of Windows is not surrogate pair capable, and hence UCS-2 would seem to
be a better description).
>
> >> Same for Gforth. The only thing that comes to mind that needs
> >> changing is the positioning of the pointers for pointing out the error
> >> location. ^^^^^^^^ like this
> >
> >Aha! W32F doesn't bother. That's about the only simplicity we have...
> >Error: AHA! is undefined
>
> What about other errors (e.g., stack underflows), with several
> instances of the same word on the command line?
W32F makes you work for it right now.
>
> - anton
> --
> M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
> comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
Posted by alex_mcd on 6/27/2004 10:20:47 AM
According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
> Obsolescence of what? Of Gtk or wchar_t? I lean towards the latter.
Obsolescence of wchar_t, that's what I meant.
--Thomas Pornin
Posted by pornin on 6/27/2004 6:10:55 PM
Anton Ertl wrote:
> Same for Gforth. The only thing that comes to mind that needs
> changing is the positioning of the pointers for pointing out the error
> location. ^^^^^^^^ like this
A possible solution would be to include the marking in the output of the
error line, i.e.
  1 2 + + .
  the terminal:Stack underflow: 1 2 + >>>+<<< .
This adds the benefit that you could add vt100 highlighting escape sequences
(like underline, bold, red, or inverse) with no problem.
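A minimal sketch of that inline marking, written in standard Forth. The word name .MARKED and its stack interface are assumptions made for illustration; this is not the code of Gforth or any other system. Offsets and lengths are in bytes, so nothing has to line up in columns and the line may contain multi-byte characters.

```forth
\ Print the line c-addr u with the len bytes at byte offset "offset"
\ wrapped in >>> and <<<.  Hypothetical helper, not part of any system.
: .marked ( c-addr u offset len -- )
  >r >r                  \ R: len offset
  over r@ type           \ text before the offending word
  r> /string             \ drop the prefix from the string
  ." >>>"
  over r@ type           \ the offending word itself
  ." <<<"
  r> /string type ;      \ whatever follows it

s" 1 2 + + ." 6 1 .marked cr   \ prints 1 2 + >>>+<<< .
```

Replacing the two ." strings with escape sequences would give the highlighting variant without changing the structure.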
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
Posted by bernd.paysan on 6/29/2004 10:55:30 AM
stephenXXX@INVALID.mpeltd.demon.co.uk (Stephen Pelc) writes:
>On Sat, 26 Jun 2004 18:47:47 GMT, anton@mips.complang.tuwien.ac.at
>(Anton Ertl) wrote:
>
>>stephenXXX@INVALID.mpeltd.demon.co.uk (Stephen Pelc) writes:
>>>On Fri, 18 Jun 2004 17:27:47 GMT, anton@mips.complang.tuwien.ac.at
>>>(Anton Ertl) wrote:
>>>>What are the experiences with using UTF-8 or other variable-width
>>>>character encodings with code that was not designed for variable-width
>>>>characters from the start? How much breakage was there, how hard was
>>>>it to find the breakage, and to fix it?
>>...
>>>Using variable width encodings for the developer character set
>>>(DCS) breaks too much code and introduces kernel complexity, so
>>>nobody wants to do that.
>>
>>That's the stuff I was looking for. Could you elaborate on that?
>COUNT for a start. CMOVE and friends in variable width encodings
>require character counts whereas buffers are normally defined
>in address units. The *experience* of people with
>internationalised apps is that working in terms of memory
>size is safer for application programmers.
Certainly. The "characters" used in the various Forth words should be
bytes (for byte-addressed machines), not UTF-8 characters. I don't
see any problems with COUNT and CMOVE in this context.
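The byte-oriented view can be illustrated with a small sketch in standard Forth. The names CONT?, XC-COUNT, GREETING and BUF are hypothetical, chosen only for this example. COUNT and CMOVE work in bytes and pass UTF-8 data through untouched; counting code points is a separate operation that has to be asked for explicitly.

```forth
hex
\ True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
: cont? ( b -- flag )  0C0 and 80 = ;

\ Number of code points in the UTF-8 string c-addr u (u counts bytes).
: xc-count ( c-addr u -- n )
  0 rot rot  over + swap  ?do
    i c@ cont? 0= if 1+ then      \ every non-continuation byte starts a character
  loop ;

\ Counted string holding "Gruesse" spelled with u-umlaut and sharp s:
\ G r C3BC C39F e = 7 bytes, 5 code points.
create greeting  7 c,  47 c, 72 c, 0C3 c, 0BC c, 0C3 c, 9F c, 65 c,
decimal

create buf 8 allot
greeting buf 8 cmove           \ byte-wise copy: count byte plus 7 data bytes
buf count nip .                \ prints 7 (bytes)
buf count xc-count .           \ prints 5 (code points)
```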
>Most applications (that I know of) that require wide character
>sets require more than one language. This brings another set
>of problems for internationalisation.
Yes, but that's another can of worms that can and should be separated
from the character-set issues.
> How much screen space
>will the string take?
That's a character set issue, though, and we probably should have a
word for that. I found the UTF-8 and Unicode FAQ
<http://www.cl.cam.ac.uk/~mgk25/unicode.html> most informative; it's
targeted for Unix, but since they have the same problems as we have,
it's very applicable to Forth.
BTW, it would help readability if you left a blank line between cited
text and your reply.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
EuroForth 2004: http://www.complang.tuwien.ac.at/anton/euroforth2004/
Deadline (refereed): August 28, 2004; Conference: November 19-21, 2004
Posted by anton on 7/25/2004 10:44:15 AM
Anton wrote
> That's a character set issue, though, and we probably should have a
> word for that. I found the UTF-8 and Unicode FAQ
> <http://www.cl.cam.ac.uk/~mgk25/unicode.html> most informative; it's
> targeted for Unix, but since they have the same problems as we have,
> it's very applicable to Forth.
>
> BTW, it would help readability if you left a blank line between cited
> text and your reply.
>
> - anton
I looked at the above link, and this paragraph seemed to explain enough
(in a nutshell) for me. So here it is for those who did not venture
there.
What are combining characters?
Some code points in UCS have been assigned to combining characters. These
are similar to the non-spacing accent keys on a typewriter. A combining
character is not a full character by itself. It is an accent or other
diacritical mark that is added to the previous character. This way, it is
possible to place any accent on any character. The most important accented
characters, like those used in the orthographies of common languages, have
codes of their own in UCS to ensure backwards compatibility with older
character sets. They are known as precomposed characters. Precomposed
characters are available in UCS for backwards compatibility with older
encodings that have no combining characters, such as ISO 8859. The
combining-character mechanism allows one to add accents and other
diacritical marks to any character. This is especially important for
scientific notations such as mathematical formulae and the International
Phonetic Alphabet, where any possible combination of a base character and
one or several diacritical marks could be needed.
Combining characters follow the character which they modify. For example,
the German umlaut character Ä ("Latin capital letter A with diaeresis") can
either be represented by the precomposed UCS code U+00C4, or alternatively
by the combination of a normal "Latin capital letter A" followed by a
"combining diaeresis": U+0041 U+0308. Several combining characters can be
applied when it is necessary to stack multiple accents or add combining
marks both above and below the base character. The Thai script, for example,
needs up to two combining characters on a single base character.
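To make the two representations concrete, here is a rough sketch of a UTF-8 encoder in standard Forth. UTF8!+, .BYTES and BUF are hypothetical names, and the encoder only handles code points below U+10000. It shows that the precomposed and the combining form of the same visible character are different byte strings of different length.

```forth
hex
\ Store the UTF-8 encoding of code point xc (below U+10000) at c-addr;
\ return the address just past the stored bytes.
: utf8!+ ( xc c-addr -- c-addr' )
  over 80 u< if  tuck c! 1+ exit  then            \ one byte
  over 800 u< if                                  \ two bytes
    over 6 rshift 0C0 or over c! 1+
    swap 3F and 80 or over c! 1+ exit
  then
  over 0C rshift 0E0 or over c! 1+                \ three bytes
  over 6 rshift 3F and 80 or over c! 1+
  swap 3F and 80 or over c! 1+ ;

\ Print the bytes of c-addr u in the current number base.
: .bytes ( c-addr u -- )  over + swap ?do i c@ . loop ;

create buf 8 allot

0C4 buf utf8!+  buf tuck - .bytes cr                  \ prints C3 84
041 buf utf8!+  308 swap utf8!+  buf tuck - .bytes cr \ prints 41 CC 88
decimal
```

A byte-wise comparison of the two results reports them as different, so code that needs to treat them as the same character has to normalise first.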
JaP
Posted by japeters1 on 7/25/2004 5:39:00 PM
Anton Ertl wrote:
>> How much screen space
>>will the string take?
>
> That's a character set issue, though, and we probably should have a
> word for that.
You can avoid most of the variable width character set problems in e.g.
the command line editor if you always step back to the beginning of the
line and reprint the whole line. Escape sequences like "save cursor
position" and "restore cursor position" should help you to put the
cursor where you want. With Unicode, you pretty much have to do that,
because you can't just retype the changed rest of the text, you *have
to* retype all of it, since there are combining characters.
For "fixed"-sized fonts in Unicode, it's somehow possible to calculate
the size with a relatively simple decoder: The size of all combining
characters is 0, the size of all CJK characters is 2, and most of the
rest is size 1. The main difficulty for cursor positioning is writing
direction, i.e. if you have Arabic or Hebrew text. The text size alone
still can be calculated as is, though (writing direction doesn't
matter).
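A sketch of such a decoder in standard Forth, under loud assumptions: the names UTF8@+, CP-WIDTH and STR-WIDTH are made up for this example, the input must be well-formed UTF-8, and the ranges below are a deliberately small subset (one combining block, a few wide blocks). A real table is considerably longer.

```forth
hex
\ Decode one well-formed UTF-8 sequence at c-addr (no validation).
: utf8@+ ( c-addr -- c-addr' xc )
  count                                   \ fetch the lead byte
  dup 80 u< if exit then                  \ plain ASCII
  dup 0E0 u< if  1F and 1  else
  dup 0F0 u< if  0F and 2  else  07 and 3  then then
  0 ?do  6 lshift  over c@ 3F and or  swap 1+ swap  loop ;

\ Columns used by one code point: 0, 1 or 2.  Incomplete ranges, for illustration.
: cp-width ( xc -- n )
  dup 300 370 within if drop 0 exit then  \ combining diacritical marks
  dup 2E80 0A4D0 within                   \ CJK radicals through Yi
  over 0AC00 0D7A4 within or              \ Hangul syllables
  over 0FF00 0FF61 within or              \ fullwidth forms
  nip if 2 else 1 then ;

\ Columns used by the UTF-8 string c-addr u.
: str-width ( c-addr u -- n )
  over + >r  0 swap                       \ ( n c-addr ) R: end
  begin dup r@ u< while
    utf8@+ cp-width rot + swap
  repeat drop r> drop ;

\ A, combining diaeresis, U+4E2D, b: 7 bytes, 4 code points.
create sample  41 c, 0CC c, 88 c, 0E4 c, 0B8 c, 0AD c, 62 c,
decimal
sample 7 str-width .                      \ prints 4
```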
The rest of the Unicode terminal handling is also pretty trivial:
backspace must erase a complete Unicode character, i.e. delete all
10xxxxxx bytes plus the additional byte (don't check for sanity when
you delete, since you have to be able to delete corrupted inputs,
too). Moving the cursor backwards uses the same algorithm, whereas
forward could be simpler; but for robustness I suggest moving one byte
forward and then skipping all the 10xxxxxx bytes.
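The two cursor movements can be sketched in standard Forth as follows. CONT?, CHAR-BACK and CHAR-FWD are hypothetical names for this example, not words from any existing line editor. Neither word validates the input, so corrupted sequences can still be stepped over and deleted.

```forth
hex
\ True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
: cont? ( b -- flag )  0C0 and 80 = ;
decimal

\ Step back from addr to the start of the previous character,
\ never going below start.
: char-back ( start addr -- addr' )
  begin
    2dup u< 0= if nip exit then     \ reached the start of the buffer
    1-  dup c@ cont? 0=             \ stop on the first non-continuation byte
  until nip ;

\ Step forward: one byte, then past any continuation bytes, never beyond end.
: char-fwd ( addr end -- addr' )
  >r
  dup r@ u< if 1+ then
  begin dup r@ u< while dup c@ cont? while 1+ repeat then
  r> drop ;

hex
\ a, u-umlaut (C3 BC), euro sign (E2 82 AC): 6 bytes, 3 characters.
create sample  61 c, 0C3 c, 0BC c, 0E2 c, 82 c, 0AC c,
decimal

sample dup 6 + char-back  sample - .    \ prints 3
sample dup 3 + char-back  sample - .    \ prints 1
sample 1+ sample 6 + char-fwd  sample - .   \ prints 3
```

Backspace is then a matter of removing the bytes between the address returned by CHAR-BACK and the old cursor position.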
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
Posted by bernd.paysan on 7/25/2004 9:16:05 PM