


The Robotic Scientist

Researchers in the Computational Biology research group at the
University of Wales, Aberystwyth, have developed a laboratory robot
capable of not only executing experiments but developing hypotheses as
well. The robot performs genetic testing on batches of mutated yeast
and alters their genes systematically to determine which genes affect
which enzyme.

http://www.syntheticthought.com/st/artificial-intelligence/37-academia/109-the-robotic-scientist
0
5/13/2009 1:39:55 PM

On 13 May, 14:39, "chuck.ril...@gmail.com" <chuck.ril...@gmail.com>
wrote:
> Researchers in the Computational Biology research group at the
> University of Wales, Aberystwyth, have developed a laboratory robot
> capable of not only executing experiments but developing hypotheses as
> well. The robot performs genetic testing on batches of mutated yeast
> and alters their genes systematically to determine which genes affect
> which enzyme.
>
> http://www.syntheticthought.com/st/artificial-intelligence/37-academi...

This is in effect an event, but the real significance lies in the
complexity of the logic the robot has to go through. The real essence,
of course, is how we determine which genes affect which enzyme.
Calculations of this type have in fact been done many times before.

Evolutionary links (not confirmed by the fossil record) strike me as
being far harder calculations.

To me the most significant event has been the Indus valley script.

http://news.softpedia.com/news/AI-Machine-Identifies-4-000-Year-Old-Language-Code-110044.shtml

Computer analysis of the grammar of the Indus valley text. The script
cannot yet be read. The computer indicates that the text was a spoken
language. The writing is like Chinese, that is to say there is one
pictogram per word. The computer, using statistical techniques, was able
to work out a grammar and the way of writing words. We now know what
the part of speech of each word is, but not the meaning.

This indicates that AI is going to be an essential tool in linguistic
scholarship.

The Indus valley strikes me as being the hardest calculation to date,
or at any rate among the hardest. This is easy by comparison.


  - Ian Parker
0
Ian
5/13/2009 7:04:29 PM
I am English too, I will not tell a lie to you. Your mail is
ianparker2@gmail.com, I have sent messages to mail. Just waste your 2
minutes, please open your mail: ianparker2@gmail.com. Dear Ian Parker,
I need your help! I have sent some messages to your mail on google
groups. Please open your mail!
http://sites.google.com/site/aitranslationproject/

Ian Parker!Ian Parker!Ian Parker!Ian Parker!Ian Parker!Ian Parker!
0
oki239pcl
5/14/2009 12:14:08 PM
On May 14, 7:14 am, oki239...@gmail.com wrote:
> I am English too, I will not tell a lie to you. Your mail is
> ianpark...@gmail.com, I have sent messages to mail. Just waste your 2
> minutes, please open your mail: ianpark...@gmail.com. Dear Ian Parker,
> I need your help! I have sent some messages to your mail on google
> groups. Please open your mail!http://sites.google.com/site/aitranslationproject/
>
> Ian Parker!Ian Parker!Ian Parker!Ian Parker!Ian Parker!Ian Parker!Ian
> Parker!Ian Parker!Ian Parker!Ian Parker!Ian Parker!Ian Parker!Ian
> Parker!Ian Pa

Dear Sir/Madam:

Please cease and desist this type of activity.  You're wasting
precious Usenet space.  Someday we may want to use Usenet for
something useful.  Thank you in advance for your cooperation.

- The Management
0
Don
5/15/2009 12:47:38 PM
Ian Parker wrote:

> To me the most significant event has been the Indus valley script.
> 
> http://news.softpedia.com/news/AI-Machine-Identifies-4-000-Year-Old-Language-Code-110044.shtml
> 
> Computer analysis of the grammar of the Indus valley text. The script
> cannot yet be read. The computer indicates that the text was a spoken
> language. The writing is like Chinese, that is to say there is one
> pictogram per word. The computer using statistical techniques was able
> to work out a grammar and the way of writing words. We now know what
> the parts of speech of each word was, but not the meaning.

From the webpage given above one reads only "The next step is to
create a grammar from the data that we have.", i.e. working out
a grammar "remains" to be done, in contradiction to what you wrote
above.

M. K. Shen
0
Mok
5/16/2009 3:01:28 PM
On 16 May, 16:01, Mok-Kong Shen <mok-kong.s...@t-online.de> wrote:
> Ian Parker wrote:
> > To me the most significant event has been the Indus valley script.
>
> >http://news.softpedia.com/news/AI-Machine-Identifies-4-000-Year-Old-L...
>
> > Computer analysis of the grammar of the Indus valley text. The script
> > cannot yet be read. The computer indicates that the text was a spoken
> > language. The writing is like Chinese, that is to say there is one
> > pictogram per word. The computer using statistical techniques was able
> > to work out a grammar and the way of writing words. We now know what
> > the parts of speech of each word was, but not the meaning.
>
> From the webpage given above one reads only "The next step is to
> create a grammar from the data that we have.", i.e. working out
> a grammar "remains" to be done, in contradiction to what you wrote
> above.
>
Sorry, you are quite right. However the point I want to put forward is
that what we should really be looking at is the mathematical content
of the task, not anything else.

What I think has emerged is that AI should be viewed as doing
different things from a human, not simply doing human tasks.

In some of my postings on "robotic ethics", where I discussed Hamid
Karzai and "blasphemy", I hinted at this. We think of AI as being able
to translate. Yet Arabic -> English implies an Arabic genome which is
traceable in the same way DNA is.


  - Ian Parker
0
Ian
5/17/2009 10:59:44 AM
Ian Parker wrote:
> Mok-Kong Shen wrote:
[snip]

> Sorry, you are quite right. However the point I want to put forward is
> that what we should really be looking at is the mathematical content
> of the task, not anything else.
> 
> What I think has emerged is that AI should be viewed as doing
> different things from a human, not simply doing human tasks.

What I also consider to be a good and general principle to be
adopted in AI is that human and machine should each do what
each is best capable of, and cooperate in the most rational and
economical way possible. After over 50 years of R&D, it seems
too much, e.g., to unconditionally want MT to achieve (or
even surpass?) the human level. A hunter doesn't expect or
want his dog to perform the same function as he himself does (nor
is he ever capable of doing what his dog does), or does he?

BTW, I find it difficult to imagine that a "pure" statistical
method could determine the parts of speech of an "entirely"
unknown language. (By pure I mean in particular without any
help from morphology.) For the purpose of discussion, consider the
hypothetical case where the language is one of two types,
namely SV and VS. Now S and V are equally frequent in the data.
Which ones are S and which ones are V?
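The SV/VS puzzle can be made concrete with a small sketch (a toy two-word corpus of my own invention, not real data): a purely distributional method recovers two word classes, but nothing in the statistics labels them.

```python
from collections import defaultdict

# Toy corpus of two-word sentences in an unknown language.  Under an
# SV reading the first word of each sentence is the subject; under a
# VS reading it is the verb.  The data are identical either way.
corpus = [("ka", "tu"), ("ka", "mo"), ("ri", "tu"),
          ("ri", "mo"), ("po", "tu"), ("po", "mo")]

# Group words purely by their positional distribution.
first_slot = defaultdict(int)
second_slot = defaultdict(int)
for a, b in corpus:
    first_slot[a] += 1
    second_slot[b] += 1

class_A = sorted(first_slot)   # words occurring in the first slot
class_B = sorted(second_slot)  # words occurring in the second slot

print(class_A)  # ['ka', 'po', 'ri']
print(class_B)  # ['mo', 'tu']
```

The two classes are recoverable from the text alone, but whether class_A holds the subjects (SV) or the verbs (VS) is exactly the underdetermination the question points at.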

M. K. Shen
0
Mok
5/18/2009 6:42:52 AM
On 18 May, 07:42, Mok-Kong Shen <mok-kong.s...@t-online.de> wrote:
> Ian Parker wrote:
> > Mok-Kong Shen wrote:
>
> [snip]
>
> > Sorry, you are quite right. However the point I want to put forward is
> > that what we should really be looking at is the mathematical content
> > of the task, not anything else.
>
> > What I think has emerged is that AI should be viewed as doing
> > different things from a human, not simply doing human tasks.
>
> What I also consider to be a good and general principle to be
> adopted in AI is that human and machine should each do what
> each is best capable of, and cooperate in the most rational and
> economical way possible. After over 50 years of R&D, it seems
> too much, e.g., to unconditionally want MT to achieve (or
> even surpass?) the human level. A hunter doesn't expect or
> want his dog to perform the same function as he himself does (nor
> is he ever capable of doing what his dog does), or does he?
>
> BTW, I find it difficult to imagine that a "pure" statistical
> method could determine the parts of speech of an "entirely"
> unknown language. (By pure I mean in particular without any
> help from morphology.) For the purpose of discussion, consider the
> hypothetical case where the language is one of two types,
> namely SV and VS. Now S and V are equally frequent in the data.
> Which ones are S and which ones are V?
>
I don't. Let us look at this in a Kolmogorov way and look at the
result of minimization. Let us take Arabic as an example. In Arabic
there are prefixes and suffixes which are added onto a stem. Tim
Buckwalter has composed a list of compatibilities. The stem has a
morphology type, and there are 3 tables which list the prefixes you can
have with a given stem morphology, the suffixes, and the compatibility
between different prefixes and suffixes.

This is very much Kolmogorov. These tables not only disambiguate but
also reduce the entropy and hence the compressible size.

bsm can mean either besm ("In the name of") or bi-sama ("by poison").
But NOT bi-sim ("by characteristics"), despite the Positivist ring of
bi-sim Allah: you know God by his characteristics.

All these examples form a countable set; the number of grammatical
utterances forms a smaller set. It should be possible to find the
intersection, that is to say the combinations that do not occur.

In German we have "wenn man krank ist": the verb goes at the end of a
subordinate clause. We have wenn, weil, etc., which signify that a class
of objects is placed at the end.

BTW - despite what Kurzweil said, there are no grammatical heuristics
in Google Translate, which is based on n-grams and is in effect a
glorified phrase book. We have grammar when we have generalizations.
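The entropy-reduction point can be illustrated with a rough sketch (a toy grammar of my own, nothing to do with Buckwalter's actual tables): text obeying an ordering constraint compresses to fewer bytes than the same words with the constraint destroyed.

```python
import random
import zlib

random.seed(0)

# An invented "grammar" of three fixed noun-verb clauses.
clause_types = ["Hund ist", "Mann kommt", "Arzt bleibt"]

# Constrained text: 500 clauses drawn from the tiny grammar.
grammatical = " ".join(random.choice(clause_types) for _ in range(500))

# Same words, same frequencies, but the ordering rules destroyed.
words = grammatical.split()
random.shuffle(words)
scrambled = " ".join(words)

g = len(zlib.compress(grammatical.encode()))
s = len(zlib.compress(scrambled.encode()))
print(g, s)
# g comes out smaller than s: the grammar's ordering rules lower the
# entropy and hence the compressed size, the Kolmogorov-style effect
# described above.
```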


  - Ian Parker

0
Ian
5/18/2009 11:37:50 AM
Ian Parker wrote:
> Mok-Kong Shen wrote:

>> BTW, I find it difficult to imagine that a "pure" statistical
>> method could determine the parts of speech of an "entirely"
>> unknown language. (By pure I mean in particular without any
>> help from morphology.) For the purpose of discussion, consider the
>> hypothetical case where the language is one of two types,
>> namely SV and VS. Now S and V are equally frequent in the data.
>> Which ones are S and which ones are V?
>>
> I don't. Let us look at this in a Kolmogorov way and look at the
> result of minimization. Let us take Arabic as an example. In Arabic
> there are prefixes and suffixes which are added onto a stem. Tim
> Buckwalter has composed a list of compatibilities. The stem has a
> morphology type, and there are 3 tables which list the prefixes you can
> have with a given stem morphology, the suffixes, and the compatibility
> between different prefixes and suffixes.
[snip]

But here you have overlooked just my very point. I said that in my
hypothetical case (it is assumed that) there is "no" morphological
information. In fact, in Chinese there is nothing corresponding
to the prefixes and suffixes of e.g. many European languages, if one
considers each Chinese ideogram as a unit. Anyway, since I want to
discuss the capability of a "purely" statistical method, one can also
assume that any available information from prefixes and suffixes
is ignored (not exploited). Now, once again the question: Without
the help of information from morphology, could math determine the
parts of speech of an unknown language? (I guess the answer is no.)

M. K. Shen
0
Mok
5/18/2009 7:29:53 PM
> But here you have overlooked just my very point. I said that in my
> hypothetical case (it is assumed that) there is "no" morphological
> information. In fact, in Chinese there is nothing corresponding
> to the prefixes and suffixes of e.g. many European languages, if one
> considers each Chinese ideogram as a unit. Anyway, since I want to
> discuss the capability of a "purely" statistical method, one can also
> assume that any available information from prefixes and suffixes
> is ignored (not exploited). Now, once again the question: Without
> the help of information from morphology, could math determine the
> parts of speech of an unknown language? (I guess the answer is no.)
> 
> M. K. Shen


I only have basic knowledge of Mandarin, very far from expert.
I agree, Mandarin generally does *not* have the prefixes/suffixes of 
European languages or English.

But on the other hand:

1 - Mandarin does have some suffixes e.g. -le for past tense, and -de 
for possessive (wode gangbi = my pen, nide gangbi = your pen, zhe shi 
wode = this is mine, wo = I/me, wode = my/mine, ni = you, nide = yours, etc)

2 - Mandarin depends on word order, generally subject verb object, e.g. 
wo yao tang = I want soup, *not* tang yao wo = soup wants me (artificial 
example). Quite different to Japanese, where tag words (-wa, -no) 
indicate subject and object, and word order is secondary. e.g. Watashi-wa 
Hira-no ani desu (I am Hira's brother) or Hira-no watashi-wa ani desu 
(Hira's brother I am).

This is oversimplifying, but in Mandarin the word order is often similar 
to English SVO, with suffixes for possessive and past tense, whereas in 
Japanese word order is very flexible, with suffixes (or tagwords) 
indicating the subject & object.

HTH

0
Brian
5/19/2009 12:53:19 PM
Brian Martin wrote:
> 
> I agree, Mandarin generally does *not* have the prefixes/suffixes of 
> European languages or English.
> 
> But on the other hand:
> 
> 1 - Mandarin does have some suffixes e.g. -le for past tense, and -de 
> for possessive (wode gangbi = my pen, nide gangbi = your pen, zhe shi 
> wode = this is mine, wo = I/me, wode = my/mine, ni = you, nide = yours, 
> etc)

Since the "de" is itself an ideogram, wouldn't it be hard to find out
its grammatical function, supposing one doesn't know the language at all?

> 2 - Mandarin depends on word order, generally subject verb object, e.g. 
> wo yao tang = I want soup, *not* tang yao wo = soup wants me (artificial 
>  example). Quite different to Japanese where tag words (-wa, -no) 
> indicate subject and object, and word order is secondary. e.g Watashi-wa 
>  Hira-no ani desu (I am Hira's brother) or Hira-no watashi-wa ani desu 
> (Hira's brother I am).
> 
> This is oversimplifying, but in Mandarin the word order is often similar 
> to English SVO, with suffixes for possessive and past tense, whereas in 
> Japanese word order is very flexible, with suffixes (or tagwords) 
> indicating the subject & object.

I don't have any knowledge of Japanese. I remember having read
somewhere that it is SOV. Could that be right?

0
Mok
5/19/2009 10:32:37 PM
Mok-Kong Shen wrote:
> 
> Since the "de" is itself an ideogram, wouldn't it be hard to find out
> its grammatical function, suppose one doesn't know the language at all?
> 



True. However, even if you have no knowledge, say, of Chinese ideograms 
(characters), it becomes self-evident visually that the ideograms for 
more common everyday words are much simpler, with fewer strokes and less 
geometric complexity, than those for less common words used, say, in 
politics, economics, etc.

For example, the characters for up, down, mountain, large, small are 
quite simple, presumably since they would have been developed earlier, 
and are used more often, so become simplified for convenience.

I don't mean the difference between traditional ideograms (Taiwan) and 
simplified ideograms (China), I mean the difference between say a simple 
everyday word like "house", vs a more abstract word like "architecture".

OK you still can't tell the part of speech, but you can expect that if 
the language has connector words (like on, in, at, etc), or 
prefix/suffix type words, or place marker words (like Japanese), they 
will probably be smaller words, or simpler ideograms, and probably occur 
more frequently.
0
Brian
5/20/2009 3:05:00 AM
On 19 May, 23:32, Mok-Kong Shen <mok-kong.s...@t-online.de> wrote:
> Brian Martin wrote:
>
> > I agree, Mandarin generally does *not* have the prefixes/suffixes of
> > European languages or English.
>
> > But on the other hand:
>
> > 1 - Mandarin does have some suffixes e.g. -le for past tense, and -de
> > for possessive (wode gangbi = my pen, nide gangbi = your pen, zhe shi
> > wode = this is mine, wo = I/me, wode = my/mine, ni = you, nide = yours,
> > etc)
>
> Since the "de" is itself an ideogram, wouldn't it be hard to find out
> its grammatical function, suppose one doesn't know the language at all?
>
> > 2 - Mandarin depends on word order, generally subject verb object, e.g.
> > wo yao tang = I want soup, *not* tang yao wo = soup wants me (artificial
> > example). Quite different to Japanese where tag words (-wa, -no)
> > indicate subject and object, and word order is secondary. e.g Watashi-wa
> > Hira-no ani desu (I am Hira's brother) or Hira-no watashi-wa ani desu
> > (Hira's brother I am).
>
> > This is oversimplifying, but in Mandarin the word order is often similar
> > to English SVO, with suffixes for possessive and past tense, whereas in
> > Japanese word order is very flexible, with suffixes (or tagwords)
> > indicating the subject & object.
>
> I don't have any knowledge of Japanese. I remember to have read
> sometime somewhere that it is SOV. Could that be right?

Just because a language conveys meaning by its word order, that does
not in itself mean that the grammar is impossible to deduce. It simply
means that the grammar resides in the permutation group. This does have
statistical meaning. If I have "huang", or yellow, it comes before a
noun in both Chinese and English. In Arabic "mSfr" (muSofar) goes after.

The rules of precedence of words reduce the compressed size in
Kolmogorov terms and do constitute a "grammar". What does the Indus
valley script in fact say? This is quite an important question. Other
ancient records are accountancy lists: so many units of
wheat/rice/barley. In English and Chinese we have number, unit, crop IN
THAT ORDER. A statistical examination would count the number of terms
xy compared with the number of terms yx. If words for number, unit and
crop appear jumbled together, we can therefore construct sets for
numbers, units and crops.
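The xy-versus-yx counting test can be sketched on an invented accountancy corpus (hypothetical tokens, not actual Indus valley signs); the method is given the records but not the categories.

```python
from collections import Counter, defaultdict

# Toy accountancy records, each silently following a number-unit-crop
# template.  The tokens are invented; the categories are hidden.
records = [
    ("three", "measures", "wheat"),
    ("seven", "measures", "barley"),
    ("two", "jars", "rice"),
    ("five", "jars", "wheat"),
]

# The "xy vs yx" tally: for every ordered pair of words, count how
# often the first precedes the second within a record.
prec = Counter()
for rec in records:
    for i in range(len(rec)):
        for j in range(i + 1, len(rec)):
            prec[(rec[i], rec[j])] += 1

# No pair ever occurs in both orders, so the precedence relation is
# consistent and words can be grouped by the slot they occupy.
both_orders = [p for p in prec if prec[(p[1], p[0])] > 0]
print(both_orders)  # []

slot = defaultdict(set)
for rec in records:
    for i, w in enumerate(rec):
        slot[i].add(w)

print(sorted(slot[0]))  # candidate "number" words
print(sorted(slot[1]))  # candidate "unit" words
print(sorted(slot[2]))  # candidate "crop" words
```

This recovers the sets, though as the follow-up posts point out, it does not by itself say which set means numbers, which means units, and which means crops.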


  - Ian Parker
0
Ian
5/20/2009 3:53:40 PM
Ian Parker wrote:
[...]
> Just because a language conveys meaning by its word order, that does
> not in itself mean that the grammar is impossible to deduce. It simply
> means that the grammar resides in the permutation group. This does have
> statistical meaning. If I have "huang", or yellow, it comes before a
> noun in both Chinese and English. In Arabic "mSfr" (muSofar) goes after.
> 
> The rules of precedence of words reduce the compressed size in
> Kolmogorov terms and do constitute a "grammar". What does the Indus
> valley script in fact say? This is quite an important question. Other
> ancient records are accountancy lists: so many units of
> wheat/rice/barley. In English and Chinese we have number, unit, crop IN
> THAT ORDER. A statistical examination would count the number of terms
> xy compared with the number of terms yx. If words for number, unit and
> crop appear jumbled together, we can therefore construct sets for
> numbers, units and crops.
> 
> 
>   - Ian Parker

You can't make a statistical analysis of word order without first 
recognising words. You can't make a statistical analysis of syntactic 
order without first recognising which words are in which syntactic 
categories. IOW, a statistical analysis of SOV vs SVO (for example) 
works only if you already know the language. In which case you don't 
need that statistical analysis.

A few comments that I hope complicate the issue enough to show why 
merely statistical analysis of word sequences isn't enough. ;-) I 
comment only on syntax, but a similar argument applies to semantics.

All human languages express the same syntactic categories, and have the 
same limited means to do so. Spoken language is more than a simple 
stream (or string) of symbols. Juncture is used to combine and separate 
groups of symbols. Stress, tone, and length differentiate what would 
otherwise be identical symbols, or symbol groups.

Most importantly, "intonation" (patterns of stress, tone, and length) 
impose syntactic meanings on symbol streams. Without intonation there is 
no syntax. Intonation patterns make different utterances out of 
superficially identical phrases and sentences. Noam Chomsky had a tin 
ear, which is the reason he was (AFAIK still is) bemused by "surface" 
and "deep" structures of language -- he _looked_ at the written symbols, 
instead of _listening_ to speech.

Punctuation (which includes the space) is an inadequate attempt to 
indicate intonation patterns.

Etc.

cheers

wolf k.
0
Wolf
5/20/2009 4:24:33 PM
Brian Martin <brianNOSPAM@futuresoftware.com.auNOSPAM> wrote:
> Mok-Kong Shen wrote:
> >
> > Since the "de" is itself an ideogram, wouldn't it be hard to find out
> > its grammatical function, suppose one doesn't know the language at all?
> >
>
> True. However, even if you have no knowledge, say, of Chinese ideograms
> (characters), it becomes self-evident visually that the ideograms for
> more common everyday words are much simpler, with fewer strokes and less
> geometric complexity, than those for less common words used, say, in
> politics, economics, etc.
>
> For example, the characters for up, down, mountain, large, small are
> quite simple, presumably since they would have been developed earlier,
> and are used more often, so become simplified for convenience.
>
> I don't mean the difference between traditional ideograms (Taiwan) and
> simplified ideograms (China), I mean the difference between say a simple
> everyday word like "house", vs a more abstract word like "architecture".
>
> OK you still can't tell the part of speech, but you can expect that if
> the language has connector words (like on, in, at, etc), or
> prefix/suffix type words, or place marker words (like Japanese), they
> will probably be smaller words, or simpler ideograms, and probably occur
> more frequently.

Well, I have no expertise in language study, but a few things seem obvious.

Natural language is produced by humans for the purpose of manipulating
other humans (or themselves).  Because of our own limitations in producing
and receiving coded messages with language, the types of patterns that
show up in our language are somewhat predictable.  And because a written
language can be assumed to have evolved from a spoken language, more
assumptions about its nature can be made.

The types of rules that humans can abstract, and use to communicate, are
very limited.  They can't speak in DES-encrypted binary, for one example
of an encoding far beyond what the human mind can create on the fly in a
spoken language.  As such, it seems reasonable to me to believe that by
simple study of written text out of context, the grammar rules of the
language could be abstracted.

You should be able to identify the parts of speech, but not the _meaning_
of each part of speech.  That is, when we say a word is a verb, that is not
just a statement about the grammar (where and how words which are verbs can
be placed in a sentence).  It's a statement about the _meaning_ of that
part of the grammar as well, since we know that verbs _mean_ actions.  The
purely syntactic nature of the word order and word modifications (adding s,
etc.) should be statistically abstractable from the text alone.  Such
analysis should be able to show that two verbs were the same part of
speech, but not what the meaning of that part of speech was.

However, in addition to understanding the syntactical nature of the
grammar, much of the semantic nature can probably be abstracted as well
given enough examples of the text because we know that it was produced by
humans for reasons that are typical of why humans produce language.  That
is, what a human tends to talk and write about is not random.  It's
statistically biased by what is important to humans.  So when you take that
into account, we can make loose guesses as to the semantics of the encoded
messages and then test all the text we have to see if it's consistent with
reality.  We expect the language to be talking about the things that exist
on the planet Earth that the person who wrote the text cares about.  So
simply having a good statistical understanding of what exists on the Earth
that humans tend to care about, allows us to place that against the
statistical facts of the syntax, and start to guess at the semantics.  We
can then evolve that guess by testing all the messages we have examples of
to see if they are meaningful.  If we think some word represents "bird" and
we see text talking about how the bird was taken out of the lake, we start
to wonder if that word was really "fish".

So though there is limited information to work with in the text alone,
because we can add to that the fact it was created by humans, for the
purpose of talking to other humans, about the things humans care about, we
can overlay our vast understanding of humans and their environment (the
earth) on the statistical nature of the language, and end up with what is
probably a fairly good basic understanding of the language, simply by
studying a large collection of written text.

And because text exists in a context (where it was found, like on a wall
in a pottery factory, in a picture of humans marching to war, or in a
picture of farmers harvesting a crop), that context adds large additional
statistical hints about the meaning of the language.  But even without the
context (which is the prime tool for decoding lost languages), the pure
statistical nature of the text alone, matched with our knowledge of
humans, gives us a lot to work with if there is a large enough body of
sample text to work with.

-- 
Curt Welch                                            http://CurtWelch.Com/
curt@kcwc.com                                        http://NewsReader.Com/
0
curt
5/20/2009 4:50:30 PM
On 20 May, 17:24, Wolf K <weki...@sympatico.ca> wrote:
>
> You can't make a statistical analysis of word order without first
> recognising words. You can't make a statistical analysis of syntactic
> order without first recognising which words are in which syntactic
> categories. IOW, a statistical analysis of SOV vs SVO (for example)
> works only if you already know the language. In which case you don't
> need that statistical analysis.
>
But the Indus valley words ARE recognized.

  - Ian Parker
0
Ian
5/20/2009 7:55:56 PM
Ian Parker wrote:
> On 20 May, 17:24, Wolf K <weki...@sympatico.ca> wrote:
>> You can't make a statistical analysis of word order without first
>> recognising words. You can't make a statistical analysis of syntactic
>> order without first recognising which words are in which syntactic
>> categories. IOW, a statistical analysis of SOV vs SVO (for example)
>> works only if you already know the language. In which case you don't
>> need that statistical analysis.
>>
> But the Indus valley words ARE recognized.
> 
>   - Ian Parker


That is a vague statement. What's recognised is that the script represents 
a language, so presumably its signs (or groups of signs) represent words 
in that language. That isn't enough to decide anything about the syntax 
and semantics of the language. IOW, you can't tell whether the language 
is SVO or SOV or VSO. You can't tell whether it's analytic, synthetic, 
or agglutinative. Etc. IOW, even though you can show that the signs are 
a script, you can't tell anything about the language it represents. 
That's the trouble with unknown languages: you need a key. (When human 
babies learn language, the key is supplied by the surrounding language 
environment, which includes all kinds of non-linguistic gestures to 
alert the infant to the links between the sounds/sound-groups, and 
entities in its environment. Etc.)

Bir'acada 'an gutembi lloca 'an. Ni pak tu'uti ssa no terimbi. Maq 'na 
tegumba calango paa' tass. Strenino ni ga'uto paa' no mmik 'ombi.

There you are - there's a short message in a language I just made up. 
What's its syntax? FWIW, I did use a few ad hoc rules in creating the 
words. But it's _not_ a translation from English - it's just a stream of 
symbols. If you want to read it out loud, give the vowel letters the 
standard Latin values, open and short. Doubled letters represent a 
lengthened sound. The ' is a glottal stop (I like glottal stops.) I can 
make up languages all day. ;-)

cheers,

wolf k.
0
Wolf
5/20/2009 10:07:12 PM
Brian Martin wrote:
> Mok-Kong Shen wrote:
>>
>> Since the "de" is itself an ideogram, wouldn't it be hard to find out
>> its grammatical function, suppose one doesn't know the language at all?
>>
> True. However, even if you have no knowledge, say, of Chinese ideograms 
> (characters), it becomes self-evident visually that the ideograms for 
> more common everyday words are much simpler, with fewer strokes and less 
> geometric complexity, than those for less common words used, say, in 
> politics, economics, etc.
> 
> For example, the characters for up, down, mountain, large, small are 
> quite simple, presumably since they would have been developed earlier, 
> and are used more often, so become simplified for convenience.
> 
> I don't mean the difference between traditional ideograms (Taiwan) and 
> simplified ideograms (China), I mean the difference between say a simple 
> everyday word like "house", vs a more abstract word like "architecture".
> 
> OK you still can't tell the part of speech, but you can expect that if 
> the language has connector words (like on, in, at, etc), or 
> prefix/suffix type words, or place marker words (like Japanese), they 
> will probably be smaller words, or simpler ideograms, and probably occur 
> more frequently.

I would think that connector words complicate the matter in that
there is one more grammatical category to recognize, and thus the
complexity, when the different categories are combined into
sentences, increases.

I suppose the history of the Rosetta Stone sufficiently indicates the
difficulty of studying an "entirely" unknown language. Note that, as
the quote below points out, languages can be astonishingly different
from one another (a surprise for those of us who are familiar only
with the more popular languages).

M. K. Shen

------------------------------------------------------------

Quoted from M. Tomasello in M. Tomasello (ed.), The New Psychology
of Language, vol.2, Mahwah, 2003. p.5:

There are many languages that simply do not have one or the other
of relative clauses, sentential complements, passive constructions,
grammatical markers for tense, grammatical markers of evidentiality,
ditransitives, topic markers, a copula (to be), case marking of
grammatical roles, subjunctive mood, definite and indefinite articles,
incorporated nouns, plural markers, and on and on. Typological
research has also established beyond a reasonable doubt that not only
are specific grammatical constructions not universal, but basically
none of the so-called minor word classes of English that help to
constitute particular constructions (e.g. prepositions, auxiliary
verbs, conjunctions, articles, adverbs, complementizers, and the like)
are universal across languages either.


0
Mok
5/21/2009 6:59:19 AM
Mok-Kong Shen wrote:
[...]

> I would think that connector words complicate the matter, in that
> there is one more grammatical category to recognize, and thus the
> complexity increases when the different categories are combined into
> sentences.
> 
> I suppose the history of the Rosetta stone sufficiently indicates the
> difficulty of studying an "entirely" unknown language. Note that,
> as the quote below points out, languages can be astonishingly
> different from one another (a surprise for those of us who are familiar
> only with the more popular languages).
> 
> M. K. Shen
> 
> ------------------------------------------------------------
> 
> Quoted from M. Tomasello in M. Tomasello (ed.), The New Psychology
> of Language, vol.2, Mahwah, 2003. p.5:
> 
> There are many languages that simply do not have one or the other
> of relative clauses, sentential complements, passive constructions,
> grammatical markers for tense, grammatical markers of evidentiality,
> ditransitives, topic markers, a copula (to be), case marking of
> grammatical roles, subjunctive mood, definite and indefinite articles,
> incorporated nouns, plural markers, and on and on. Typological
> research has also established beyond a reasonable doubt that not only
> are specific grammatical constructions not universal, but basically
> none of the so-called minor word classes of English that help to
> constitute particular constructions (e.g. prepositions, auxiliary
> verbs, conjunctions, articles, adverbs, complementizers, and the like)
> are universal across languages either.


Well yes, but so what?  Tomasello's list is like pointing out that 
plants have spores, nuts, drupes, berries, etc etc. But they all have 
seeds. On the one hand, Tomasello exhibits the depressing parochialism 
of American scholars, who too often ignore the fact that other people 
have already invented the wheel. On the other hand, Tomasello exhibits a 
common fault of psychologists and sociologists: he is so dazzled by the 
wonderful variety of human beings that he doesn't notice that none of 
them are horses.

IOW, the better question is not "What grammatical categories does this 
language exhibit?", but "How does this language express number? time? 
person? actor? acted upon? place/time/etc? condition? process? type? 
counterfactuals? etc?" IOW, ask about which categories are expressed 
syntactically, which morphologically, which semantically. (The fact that 
these three concepts are not easy to distinguish should give one pause, 
BTW.)

The fact is that Tomasello's non-universal grammatical categories 
couldn't be recognised if there were not some way of expressing those 
meanings in every language. What differs is not the range of concepts 
that a language can express, but the selection of these concepts that 
are expressed "grammatically" (i.e., are expressed by means of one or 
another item on Tomasello's list), and those that are expressed 
"semantically" (i.e., by some collocation of "words"). IOW, the 
difference between languages is that of optional and compulsory 
expression of concepts. And that in turn is a matter of culture. In 
fact, it's a clue to a culture's values, and, as the language changes, a 
clue to changing values.

Consider gender, for example. Not so long ago gender had to be expressed 
in English not only via pronouns, but also via inflections on articles, 
nouns, and adjectives (as it still is in most IE languages). Now gender 
is only expressed when it's perceived as necessary. That perception is 
diminishing -- e.g., in the last 100 years or so, the feminine suffix 
has been disappearing. Adding -ess to a noun is now felt to be 
old-fashioned or worse. E.g., the vernacular uses "them" as a 
gender-neutral pronoun (a trend that started well before the recent 
attempts to degender English, please note.) Does this mean that "gender" 
is not a language universal? Of course not -- gender is expressible in 
all languages.

Or consider social relationship/status. In all societies, relative 
social status must be expressed, (even if, as in current 
American/Canadian, most of the time "no difference" is expressed -- that 
is the American/Canadian form of courtesy. But "title" is still usable, 
though its use is increasingly ironic, i.e., polite forms used to 
express rudeness). However, how social status is expressed varies 
enormously. It may be enough to adopt a deferential tone or posture. 
Which reminds us that "extra-linguistic" gestures are also a part of 
language. Which is one of the reasons I am getting increasingly 
impatient with the symbol-processing focus of most AI 
research/discussion, including almost every comment in this thread.

Bottom line: true though it is that humans have invented wonderfully 
diverse ways of expressing what they want to say, what they want to say 
can be said in any language.

cheers,

wolf k.
0
Wolf
5/21/2009 1:58:06 PM
On 20 May, 23:07, Wolf K <weki...@sympatico.ca> wrote:

> > But the Indus valley words ARE recognized.
>
> >   - Ian Parker
>
> That is a vague statement. What's recognised is that the script represents
> a language, so presumably its signs (or groups of signs) represent words
> in that language. That isn't enough to decide anything about the syntax
> and semantics of the language. IOW, you can't tell whether the language
> is SVO or SOV or VSO.


You apply basic clustering techniques, as I have explained earlier. You
get precedence sets.

Let's take something else. Suppose we have a bath of acid or alkali (it
does not really matter which) and a DC amplifier which is incredibly
sensitive. With any pair of metals in the bath, it swings fully one way
or fully the other. If the words in our language are pairs of letters
(occasionally one letter), such as Au, Ag, Fe, C, Zn, Pb, we can arrange
them in order. This gives you your parts of speech.
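Ian's bath-of-metals analogy amounts to recovering a linear order from pairwise comparisons, the way an electrochemical series orders metals. A minimal sketch (the pair data below is invented for illustration) using a topological sort:

```python
from collections import defaultdict, deque

def order_from_pairs(pairs):
    """Recover a linear order from pairwise 'a precedes b' observations
    via Kahn's topological sort over the precedence graph."""
    succ = defaultdict(set)
    pred_count = defaultdict(int)
    items = set()
    for a, b in pairs:
        items.update((a, b))
        if b not in succ[a]:
            succ[a].add(b)
            pred_count[b] += 1
    queue = deque(sorted(x for x in items if pred_count[x] == 0))
    out = []
    while queue:
        x = queue.popleft()
        out.append(x)
        for y in sorted(succ[x]):
            pred_count[y] -= 1
            if pred_count[y] == 0:
                queue.append(y)
    return out

# Invented pairwise observations (the amplifier swinging one way per pair):
pairs = [("Zn", "Fe"), ("Fe", "Pb"), ("Pb", "C"), ("C", "Ag"),
         ("Ag", "Au"), ("Zn", "Pb"), ("Fe", "Ag")]
print(order_from_pairs(pairs))  # -> ['Zn', 'Fe', 'Pb', 'C', 'Ag', 'Au']
```

The same machinery would apply to word-precedence sets: each observed "word class A precedes word class B" pair is an edge, and the sort recovers a candidate ordering.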

> You can't tell whether it's analytic, synthetic,
> or agglutinative. Etc.

You count the number of different variations of a word. In fact, with
Indus valley you might well find that you have only one person and
only one tense. A lot of the records, as I said, are accounts. If they
are accounts then deciphering them will be easier.

> IOW, even though you can show that the signs are
> a script, you can't tell anything about the language it represents.
> that's the trouble with unknown languages: you need a key. (When human
> babies learn language, the key is supplied by the surrounding language
> environment, which includes all kinds of non-linguistic gestures to
> alert the infant to the links between the sounds/sound-groups, and
> entities in its environment. Etc.)
>
> Bir'acada 'an gutembi lloca 'an. Ni pak tu'uti ssa no terimbi. Maq 'na
> tegumba calango paa' tass. Strenino ni ga'uto paa' no mmik 'ombi.
>
> There you are - there's a short message in a language I just made up.
> What's its syntax?

There are not enough words, you need repetition.

>FWIW, I did use a few ad hoc rules in creating the
> words. But it's _not_ a translation from English - it's just a stream of
> symbols. If you want to read it out loud, give the vowel letters the
> standard Latin values, open and short. Doubled letters represent a
> lengthened sound. The ' is a glottal stop (I like glottal stops.) I can
> make up languages all day. ;-)
>
I feel we really ought to be thinking in Kolmogorov terms. There is a
Hutter prize for the compression of Wikipedia.

http://cs.fit.edu/~mmahoney/compression/cs200516.pdf
http://cs.fit.edu/~mmahoney/compression/text.html#1323

Most programs work by context mixing, a context being generated by a
bit string. A hash table generates the contexts. Now, according to
Kolmogorov, compression is the thing that defines intelligence.

Will a compressor help us to decode an unknown language? Well, SVO, SOV,
etc. represent bit patterns. You know immediately which words and which
combinations of words are allowed and which are not. If you know
a little bit more, or you have a hypothesis (say accounts), you are in
a good position to decode.
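As a rough illustration of why a predictive model helps here: a crude adaptive context model (a toy stand-in for the context-mixing compressors, not any actual Hutter prize entry) assigns fewer bits to a stream with rigid, accounts-like structure than to the same tokens shuffled:

```python
import math
import random
from collections import defaultdict

def order1_bits(tokens):
    """Code length (bits) of a token stream under an adaptive order-1
    context model with add-one smoothing."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set(tokens)
    bits, prev = 0.0, None
    for t in tokens:
        ctx = counts[prev]
        p = (ctx[t] + 1) / (sum(ctx.values()) + len(vocab))
        bits += -math.log2(p)
        ctx[t] += 1     # adapt: update the context's counts
        prev = t
    return bits

# A rigid, accounts-like stream is far more predictable (fewer bits)
# than the same tokens in random order:
ordered = ["item", "qty", "price"] * 200
shuffled = ordered[:]
random.seed(0)
random.shuffle(shuffled)
print(order1_bits(ordered) < order1_bits(shuffled))  # structured stream wins
```

If a corpus of undeciphered tablets compresses much better under an "accounts" hypothesis than under alternatives, that is quantitative evidence for the hypothesis, which is Ian's point.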


  - Ian Parker
0
Ian
5/21/2009 3:25:29 PM
Ian Parker <ianparker2@gmail.com> wrote:

> I feel we really ought to be thinking in Kolmogorov terms. There is a
> Hutter prize for the compression of Wikipedia.
>
> http://cs.fit.edu/~mmahoney/compression/cs200516.pdf
> http://cs.fit.edu/~mmahoney/compression/text.html#1323
>
> Most programs work by context mixing. A context being generated by a
> bit string. A hash table generates the contexts. Now according to K
> compression is the thing that defines intelligence.
>
> Will a compressor help us to decode an unknown language? Well SVO, SOV
> etc. represent bit patterns. You know immediately what words and what
> combination of words is allowed and which are not allowed. If you know
> a little bit more, or you have a hypothesis (say accounts) you are in
> a good position to decode.
>
>   - Ian Parker

Compression plays a very important role in intelligence but it alone is not
intelligence.  It would be a fairly long stretch to argue that the
program gzip was a good example of artificial intelligence.

I like to argue that reinforcement learning is intelligence, but that
leaves out the implementation details.  One of the most important missing
pieces, which explains the difference between what the brain can do and
what our current RL programs can do, is compression.

Reinforcement learning explains why we (or any intelligence) do what we
do.  But it doesn't explain how it works.

That is, in order for RL to be applied to large state space problems, the
space has to be modeled efficiently.  The machine can't learn to avoid
something dangerous, or seek out something good, if it can't first learn to
recognize that thing.  The machine must "understand" the environment it is
placed in, by parsing it up into things - objects and actions.

The job of taking raw sensory data, and extracting the "things" it
represents, is the compression problem you speak of.  If we can't take raw
data from the environment, and parse it up into a concise and much smaller
set of "things" to react to, then we can't learn how to react.

The problem of compressing Wikipedia (or any data stream) is one of taking
a long list of bits, and extracting from that the correct model to best
represent the same information in a far shorter form.  It has to be changed
from a list of billions of bits, to web pages, with titles, and paragraphs,
and sentences and words, and parts of speech, and diagrams.  That is, the
recurring patterns that exist in the bits must be recognized, and given an
internal representation.

The problem with that Hutter prize is that even though I agree that
developing better compression algorithms is a fundamental part of solving
AI, I'm not so sure that the form of the problem is going to be all that
applicable.  That is, AI is not about compression alone, it's about
learning complex behavior.  It's the complexity and quality of our learned
behaviors that shows our true intelligence - that is, how effective are
our behaviors at producing higher rewards for us?  That's intelligence.

The way one might structure a compression algorithm to compress a large
bit string like Wikipedia, I suspect, is not going to have a lot in common
with how one would structure an algorithm which interacts with a complex
environment and learns by reinforcement.  The first is just compression,
the second is AI.

-- 
Curt Welch                                            http://CurtWelch.Com/
curt@kcwc.com                                        http://NewsReader.Com/
0
curt
5/21/2009 5:28:08 PM
On May 21, 11:25 am, Ian Parker <ianpark...@gmail.com> wrote:
> On 20 May, 23:07, Wolf K <weki...@sympatico.ca> wrote:
>
> > > But the Indus valley words ARE recognized.
>
> > >   - Ian Parker
>
> > That is a vague statement. What's recognised is that the script represents
> > a language, so presumably its signs (or groups of signs) represent words
> > in that language. That isn't enough to decide anything about the syntax
> > and semantics of the language. IOW, you can't tell whether the language
> > is SVO or SOV or VSO.
>
> You apply basic clustering techniques as I have explained earlier. You
> get precedence sets.
>
> Lets take something else. Suppose we have a bath of acid or alkali,
> does not really matter which and we have a DC amplifier which is
> incredibly sensitive. When you have any pair of metals in the bath it
> goes fully one way or fully the other way. If the words in our
> language are pairs of letters (occasionally one letter) such as Au,
> Ag, Fe, C, Zn, Pb we can arrange them in order. This is giving you
> your parts of speech.
>
> > You can't tell whether it's analytic, synthetic,
> > or agglutinative. Etc.
>
> You count the number of different variations in a word. In fact with
> Indus valley you might well find that you have only one person and
> only one tense. A lot of records as I said are about accounts. If they
> are accounts then decyphering them will be easier.
>
> > IOW, even though you can show that the signs are
> > a script, you can't tell anything about the language it represents.
> > that's the trouble with unknown languages: you need a key. (When human
> > babies learn language, the key is supplied by the surrounding language
> > environment, which includes all kinds of non-linguistic gestures to
> > alert the infant to the links between the sounds/sound-groups, and
> > entities in its environment. Etc.)
>
> > Bir'acada 'an gutembi lloca 'an. Ni pak tu'uti ssa no terimbi. Maq 'na
> > tegumba calango paa' tass. Strenino ni ga'uto paa' no mmik 'ombi.
>
> > There you are - there's a short message in a language I just made up.
> > What's its syntax?
>
> There are not enough words, you need repetition.
>
> >FWIW, I did use a few ad hoc rules in creating the
> > words. But it's _not_ a translation from English - it's just a stream of
> > symbols. If you want to read it out loud, give the vowel letters the
> > standard Latin values, open and short. Doubled letters represent a
> > lengthened sound. The ' is a glottal stop (I like glottal stops.) I can
> > make up languages all day. ;-)
>
> I feel we really ought to be thinking in Kolmogorov terms. There is a
> Hutter prize for the compression of Wkipaedia.
>
> http://cs.fit.edu/~mmahoney/compression/cs200516.pdf
> http://cs.fit.edu/~mmahoney/compression/text.html#1323
>
> Most programs work by context mixing. A context being generated by a
> bit string. A hash table generates the contexts. Now according to K
> compression is the thing that defines intelligence.
>
> Will a compressor help us to decode an unknown language?

   Compression doesn't help much in decoding a known language,
   since it's mostly media hype, rather than a Language.
   Which is also why people with actual progressive engineering brains
   invented MP3, MPEG, CD+RW, CD-ROM, DVD-ROM, Optical Computers,
   Flat Screen HDTV Debuggers, Thermo-Electric Cooling, Microwave
   Cooling, On-Line Publishing, On-Line Shopping, USB, XML,
   Holographics, C++, Digital Fiber Optics Links, Cell Phones,
   Self-Assembling Robots, PV Cell Energy Arrays, Digital-Terrain
   Mapping, Compact Fluorescent Lighting, AUVs, and Light Sticks,
   rather than the idiot Media.

> Well SVO, SOV
> etc. represent bit patterns. You know immediately what words and what
> combination of words is allowed and which are not allowed. If you know
> a little bit more, or you have a hypothesis (say accounts) you are in
> a good position to decode.
>
>   - Ian Parker

0
zzbunker
5/22/2009 2:31:09 AM
> You can't make a statistical analysis of word order without first
> recognising words. You can't make a statistical analysis of syntactic
> order without first recognising which words are in which syntactic
> categories. IOW, a statistical analysis of SOV vs SVO (for example)
> works only if you already know the language. In which case you don't
> need that statistical analysis.

I don't believe this is the case.  There are a lot of things about
languages that seem hard wired into our brains that, if we could
discover them, we might be able to figure an unknown language out
statistically.

Most languages have a concept of a verb and nouns.  This is a basic
classification problem.  We also know that most languages use an SVO or
SOV or sometimes SV order.  S and O are usually nouns, so we can
probably, with enough data, start seeing patterns where some words occur
once in a sentence and others occur more than once (i.e. fish eat fish).
Any three-symbol sentence would imply that two of the words have a
higher probability of being nouns and another word has a higher
probability of being a verb.  So we could look at a three-symbol
sentence and give each word a 2/3 chance of being a noun and a 1/3
chance of being a verb.  After analyzing enough 3-symbol sentences, some
words would start having a higher probability of being nouns and others
of being verbs.  We might start comparing word similarity as well and
come up with theories of common affixes.  Affixes that occur on word
types that have a high probability of being verbs probably occur on
classes of similar verbs.  Others are probably morphologies on nouns or
adjectives.
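The 2/3-noun / 1/3-verb bookkeeping described above can be sketched in a few lines. The sentences and the particular iteration scheme below are invented for illustration, with doubled words ("fish eat fish") seeding the noun class:

```python
from collections import defaultdict

def noun_scores(sentences, passes=3):
    """Sketch of the scheme above: start every word at the 2/3 noun
    prior (two of three slots in an SVO/SOV sentence are nouns), then
    iterate, letting each sentence assign its two noun slots to the
    words currently most noun-like. The iteration is one guess at how
    the probabilities might 'start shifting'."""
    score = defaultdict(lambda: 2 / 3)
    for s in sentences:            # a doubled word is almost surely the noun
        for w in s:
            if s.count(w) == 2:
                score[w] = 1.0
    for _ in range(passes):
        evidence = defaultdict(list)
        for s in sentences:
            words = set(s)
            if len(words) == 2:    # doubled word fills both noun slots
                nouns = {w for w in words if s.count(w) == 2}
            else:
                ranked = sorted(words, key=lambda w: (-score[w], w))
                nouns = set(ranked[:2])
            for w in words:
                evidence[w].append(1.0 if w in nouns else 0.0)
        score = defaultdict(lambda: 2 / 3,
                            {w: sum(v) / len(v) for w, v in evidence.items()})
    return dict(score)

# Invented three-symbol "sentences":
sentences = [("fish", "eat", "fish"), ("fish", "eat", "bread"),
             ("dogs", "eat", "bread"), ("dogs", "chase", "cats")]
scores = noun_scores(sentences)
print(scores["fish"], scores["eat"])  # fish looks like a noun, eat like a verb
```

Words that occur only once (like "chase" here) stay ambiguous, which is honest: the method needs repetition, exactly as noted elsewhere in the thread.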

After running the above tests we probably have a bunch of words we are
"pretty sure" are nouns and some we are "pretty sure" are verbs.  This
would be enough information to start determining the SOV or SVO order of
the language.  Words that aren't nouns or verbs could be analyzed for
other patterns.  Determiners, in languages that have them, modify
nouns.  Some languages have adjectives which commonly modify nouns as
well.  Sometimes these have morphological versions that agree in
person and number with the nouns.  We could do a pattern search to see
if this is the case and then come up with a new category: "possible
adjectives".

The same technique could be applied to adverbs.  Looking through our
bin of words we could probably identify other groups of words by
placement or morphology as well.  Given enough text we could make
assumptions on those categories and prove them.  For instance, in
English we have conjunctions that, among other things, can join two
independent phrases.  Given our pattern of assumed nouns and verbs we
could search for patterns that might indicate a conjunction of this
type.  If we don't find any, we can either assume that the language
doesn't have conjunctions, or even better, lower the probability of
our model that they exist. (until other evidence might be found).  If
we do find a few possible conjunctions we can analyze other phrases
where these words occur and learn more about them.

Of course all of these techniques make for an astoundingly large
search space to determine the language but they rely ONLY on a basic
understanding of how syntax works in other languages.  By analyzing
known languages and looking for patterns in the unknown languages that
match, we might be able to learn quite a bit about an unknown language
with statistical techniques.

-Scott Frye
"Those who say it's impossible should stay out of the way of those of
us doing it." - anonymous
0
Scott
5/22/2009 2:43:29 PM
In Through the Looking-Glass Humpty Dumpty said that when he used a word
it meant exactly what he said it meant, no more no less. What can we
define "Intelligence" as? To be sure, we do not want to discuss
philosophical questions (like consciousness) unless we absolutely have
to. I am saying that "Intelligence" may be defined in the Kolmogorov
way as complexity. Complexity is defined as the length of a minimum
binary string. What I am now going to do is give some examples. I will
seek to show that Kolmogorov's definition is not as far removed as you
might think from the normal lay definition of intelligence.

INTELLIGENCE TESTS - These have now largely been discredited;
certainly the claims made by people like Sir Cyril Burt have been
shown to be unfounded. You can train for intelligence tests,
intelligence is a function of socioeconomic circumstances, and the
difference between black and white (in America) is completely
explained in socioeconomic terms. However, let us take

  3,8,13,18  ... what is the next number?

5n-2 (or equivalently 5n+3, depending on whether we start counting at 1
or at 0) is the compressed (Kolmogorov) representation of the string.
Kolmogorov can in fact be said to DEFINE the mathematical tests. What
about verbal reasoning?
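Before moving on: the "program" that compresses 3, 8, 13, 18 is just the common difference, so the item can be checked mechanically. A throwaway sketch:

```python
def next_term(seq):
    """Kolmogorov-style: the shortest description of 3, 8, 13, 18 is
    'start at 3, add 5' (5n - 2 for n = 1, 2, ...); for an arithmetic
    sequence that program is just the common difference."""
    diffs = {b - a for a, b in zip(seq, seq[1:])}
    assert len(diffs) == 1, "not an arithmetic sequence"
    return seq[-1] + diffs.pop()

seq = [3, 8, 13, 18]
# The formula 5n - 2 (n from 1) regenerates the whole string:
assert all(5 * n - 2 == v for n, v in enumerate(seq, start=1))
print(next_term(seq))  # -> 23
```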

{Primavera, Resorte, Mamanthal, Salsa} son flores

"Primavera" is here worth 2 bits, one DNA base! I will discuss
Evolution later! Thus Kolmogorov and compression seems to rule verbal
reasoning as well. If we are translating "spring" into Spanish we have
to make choices depending on context. Sir Cyril Burt in fact
incorporated this into his test. Hence verbal reasoning, compression
and translation are all related concepts.

SEARCHING - In essence this involves matching a key to a list. Google
for example will accept our search keys and map them to websites.
Wolfram and Jeopardy are a match of questions to answers, but the
basic principle remains the same. Let us have n keys and m possible
responses. If we match every key against every response we have nm
possibilities. If each key has only q plausible answers, we have reduced
prompt and response to nq possibilities. If nq << nm we have Kolmogorov
compression. Our search algorithm is "intelligent".
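The nq-versus-nm point is essentially the argument for an inverted index: store only the q plausible responses per key instead of scanning all m. A toy sketch with invented documents:

```python
from collections import defaultdict

# Invented toy corpus; doc ids and text are made up for illustration.
docs = {
    1: "robot runs experiments on yeast",
    2: "yeast genome accounts ledger",
    3: "indus valley script grammar",
}

index = defaultdict(set)           # key -> its q plausible responses
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)

def search(*keys):
    """Intersect posting lists: roughly n*q work instead of an n*m scan."""
    hits = [index[k] for k in keys]
    return sorted(set.intersection(*hits)) if hits else []

print(search("yeast"))            # -> [1, 2]
print(search("yeast", "genome"))  # -> [2]
```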

EVOLUTION - The human genome is 800MB, the Chimpanzee genome is also
800MB. What is the minimal representation? Not 1.6GB since a lot of
both genomes is common. At least 98% is common, so the minimal
representation of the 2 genomes is little more than that of one. I am
now going to ask another question. What is the minimal DNA
representation of life? Now that is going to represent a descent tree.
We came from the same stock as Chimps. This is implicit in terms of
minimal representation.  A minimal representation of life therefore
contains a descent tree which can be compared to the fossil record.
Minimal representation in fact reproduces the fossil record and does a
lot more as well.
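The minimal-representation argument can be demonstrated with any off-the-shelf compressor: two nearly identical sequences compress jointly to much less than the sum of their separate compressed sizes. A toy sketch, with random strings standing in for genomes (the 800MB and 98% figures above are the post's, not reproduced here):

```python
import random
import zlib

# Toy genomes: "chimp" is "human" with ~2% point mutations.
random.seed(42)
human = "".join(random.choice("ACGT") for _ in range(20000))
chimp = list(human)
for i in random.sample(range(len(chimp)), len(chimp) // 50):
    chimp[i] = random.choice("ACGT")
chimp = "".join(chimp)

separate = (len(zlib.compress(human.encode(), 9))
            + len(zlib.compress(chimp.encode(), 9)))
together = len(zlib.compress((human + chimp).encode(), 9))

# The joint encoding exploits the shared structure, so it is far
# smaller than compressing each sequence on its own:
print(together < separate)
```

The saving on the joint encoding is exactly the "descent tree" intuition in miniature: shared ancestry shows up as shared structure the compressor can reuse.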

If we define "intelligence" as the ability to see connection between
different species we thus have the Kolmogorov definition. I have
defined similarities in terms of DNA rather than morphology, but the
same principles apply. In fact DNA has shown that occasionally
morphology gives a wrong classification as morphological similarity
may simply mean evolution to a niche.

UNKNOWN LANGUAGE - Compression represents our ability to predict what
the next word will be. Grammar can in fact be defined as a set of
predictive rules. A minimum compression of accounts is going to show
that the text has got certain characteristics, the characteristics of
accounts. We ask what is possible? We are here in the realm of what is
computable and what is not. What has interested me about the Indus
valley has been the fact that AI has got further than anyone or
anything else in decoding it. If P = NP (I don't personally believe it
does) we will be able to get out a least the numbers. If we are adding
up a column of figures we are coming close to computability.

2 + 2 = 4, 4 + 5 = 9

We know in fact that there is a polynomial solution to this. Say x + x
= y, y + z = w. This is a matrix and in fact cubic.

SCHOLARSHIP - I am going to ask a very similar question. What is the
minimal representation of a number of ancient languages: Qur'anic
Arabic, Syriac, Modern Standard Arabic?  If I find some fragments of
an old Qur'an, in Yemen say, can I compress them? This compression
represents the most probable interpretation. Scholarship has in fact
got a striking similarity to Evolution (as investigated by DNA). There
is one improvement that compression will have over scholarship. It can
be made to give a numerical probability. Without this each scholar is
going to give an interpretation that seems probable. We will sometimes
need to compare different interpretations.

THE ROLE OF MATHEMATICS - Ian Stewart has remarked that different
scientific disciplines have the same problems, when expressed in
mathematical terms, but a different language for expressing them.
Matrix theory is ubiquitous. If we are dealing with information we
have Maths which is the same as that used for thermodynamics. We
should really refer to context in terms of Hamiltonians (viz.
exp(-E/kT)). Doing things this way will tell us how much "intelligence" we
are adding. How good in strictly quantitative terms each discriminant
really is.

Ian Stewart has said that Mathematics should be out to sell. With the
advent of computers even the more abstract of concepts has a role to
play. We need to devise generic solution for similar types of problem.
It may well be that some problems do not have computable solutions.
Mathematics however is in the business of providing computability. P =
NP is STILL unsolved. As I said my interest lies in what the current
boundaries of computability are. The boundaries of computability are,
of course, the boundaries of AI. This is the essential question.

If someone seriously claims that the Indus valley is NOT computable,
that accounts cannot be deciphered, I suggest they write to Clay! If P
= NP any unknown language is computable.

Scott Frye, whose post crossed mine, gives a method similar to what I
have stated earlier. As far as Kolmogorov is concerned, it can be seen
at once that sets of parts of speech represent a compression. You have
to work out the permutations and combinations.


  - Ian Parker
0
Ian
5/22/2009 7:17:37 PM
Wolf K wrote:

[snip]
> ..... IOW, the 
> difference between languages is that of optional and compulsory 
> expression of concepts. And that in turn is a matter of culture.
[snip]

Doesn't this imply that people of different native languages "think"
differently?

M. K. Shen
0
Mok
5/22/2009 11:51:19 PM
Mok-Kong Shen wrote:
> Wolf K wrote:
> 
> [snip]
>> ..... IOW, the difference between languages is that of optional and 
>> compulsory expression of concepts. And that in turn is a matter of 
>> culture.
> [snip]
> 
> Doesn't this imply that people of different native languages "think"
> differently?
> 
> M. K. Shen


That's the Whorf hypothesis, and it's been taken up mostly by literary 
critics, who've found it congenial to their observations about style. 
Psychologists however have not been able to find a definitive answer to 
this question. It may be unanswerable.

Literary critics also point out that writers differ in their "sensorium" 
-- a writer's style (choice of words, imagery) suggests which senses 
dominate in his or her experience. Read Helen Keller, for example. She 
was deaf and blind. You'll find that apart from conventional cliches she 
does not use visual imagery.

Bi-/multi-lingual people (I'm one) claim that speaking/thinking in 
different languages "feels different." That's my experience, too, but 
it's very difficult to explain -- because presumably a thought 
expressible in A can't, by definition, be expressed in B.... ;-)

cheers,

wolf k.
0
Wolf
5/23/2009 12:54:09 PM
On 23 May, 13:54, Wolf K <weki...@sympatico.ca> wrote:

> > Doesn't this imply that people of different native languages "think"
> > differently?
>
> > M. K. Shen
>
> That's the Whorf hypothesis, and it's been taken up mostly by literary
> critics, who've found it congenial to their observations about style.
> Psychologists however have not been able to find a definitive answer to
> this question. It may be unanswerable.
>
> Literary critics also point out that writers differ in their "sensorium"
> -- a writer's style (choice of words, imagery) suggests which senses
> dominate in his or her experience. Read Helen Keller, for example. She
> was deaf and blind. You'll find that apart from conventional cliches she
> does not use visual imagery.
>
> Bi-/multi-lingual people (I'm one) claim that speaking/thinking in
> different languages "feels different." That's my experience, too, but
> it's very difficult to explain -- because presumably a thought
> expressible in A can't, by definition, be expressed in B.... ;-)
>
There are in fact 2 questions here. Do people speaking different
languages, brought up in (broadly) the same culture, think
differently? I think the answer seems to be "no". There is another
question: "Do languages reflect culture?" The answer to that question
must be "yes". The Bushmen of the Kalahari have a language based on
clicks. This suits their environment: they are hunter/gatherers and
their language reflects the "fittest" hunting language. There is a
question of "survival of the fittest" in terms of the evolution of a
language.

BTW - If you look in Kaye and Laby you will find that high pitched
sounds are attenuated more rapidly. For a person living in a
"civilised" environment, this does not matter, although for a hunter
it is of the utmost importance.


  - Ian Parker
0
Ian
5/23/2009 5:26:27 PM
Ian Parker <ianparker2@gmail.com> wrote:
> In Alice in Wonderland Humpty Dumpty said that when he used a word it
> meant exactly what he said it meant, no more no less. What can we
> define "Intelligence" as. To be sure we do not want to discuss
> philosophical questions (like consciousness) unless we absolutely have
> to. I am saying that "Intelligence" may be defined in the Kolmogorov
> way  as complexity. Complexity is defined as being a minimum binary
> string. What I am now going to do is give some examples. I will seek
> to show that Kolmogorov's definition is not as far removed as you
> might think from the normal lay definition of intelligence.

Have you read Hutter's book?

Universal Artificial Intelligence: Sequential Decisions Based On
Algorithmic Probability (Hardcover)

http://www.amazon.com/gp/product/3540221395

You might like it if you haven't.  It's a formal mathematical approach to
defining general intelligence based on concepts such as  Kolmogorov
Complexity.

You mentioned the Hutter prize so it seems you might already know about it.

> INTELLIGENCE TESTS - These have now largely been discredited,
> certainly the claims made by people like Sir Cyril Burt have been
> shown to be unfounded. You can train for intelligence tests,
> intelligence is a function of socioeconomic circumstances and the
> difference between black and white (in America) is completely
> explained in socioeconomic terms.

Yes, intelligence is hard to test for and IQ tests only measure one
dimension of a complex issue.  IQ tests are certainly still valid.  They
are still used to test students - they just don't call them IQ tests
anymore.

They do attempt to test a fundamental and important aspect of intelligence
- which is our ability to abstract.  Sadly, it's hard to create such a test
without creating a social bias at the same time.

> However let us take
>
>   3,8,13,18  ... what is the next number?
>
> 5n-2 is the compressed (Kolmogorov) value of the string 5n+3 depending
> on whether we denote our start by 0 or 1. Kolmogorov can in fact be
> said to DEFINE the mathematical tests. What about verbal reasoning?

Yes, data compression in the Kolmogorov sense requires the ability to
abstract out (aka find) common patterns over time (in a sequence). It's a
fundamental and important part of intelligence.
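The quoted 3, 8, 13, 18 example can be made concrete in a few lines of
Python - a toy sketch (the rule and the values are just the ones from the
example above):

```python
# Toy illustration of Kolmogorov-style compression: the rule "5n - 2"
# (with n starting at 1) regenerates the whole sequence, so the rule is
# a shorter description than any long prefix of the data itself.

def rule(n):
    """The compressed description of the sequence 3, 8, 13, 18, ..."""
    return 5 * n - 2

# Expand the rule back into explicit data.
sequence = [rule(n) for n in range(1, 5)]
print(sequence)        # [3, 8, 13, 18]
print(rule(5))         # 23, the predicted next number
```

The program text of `rule` is a fixed size no matter how far the sequence
is extended, which is the sense in which it "compresses" the string.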

> {Primavera, Resorte, Mamanthal, Salsa} son flores
>
> "Primavera" is here worth 2 bits, one DNA base! I will discuss
> Evolution later! Thus Kolmogorov and compression seems to rule verbal
> reasoning as well. If we are translating "spring" into Spanish we have
> to make choices depending on context. Sir Cyril Burt in fact
> incorporated this into his test. Hence verbal reasoning, compression
> and translation are all related concepts.

Yes, I agree.

> SEARCHING - In essence this involves matching a key to a list. Google
> for example will accept our search keys and map them to websites.

Map them to an ordered list of web sites actually. :)

> Wolfram

Yeah, they are so stupid to give their search engine a 12 character name!
I see you have chosen to shorten it to 7!  If it becomes more popular,
people will create an even shorter name for it and over time, that name
will become the standard way to talk about the site.  (Like Federal Express
became shortened to FedEx in popular culture and then the company ended up
changing their name to the short version).

Starting off with a 12 character name is like admitting your site will
never be very popular! :)

> and Jeopardy are a match of questions to answers, but the
> basic principle remains the same. Let us have n keys and m possible
> responses. Now if we match the two together we have nm possibilities.
> If each key has only q plausible answers, the prompt-and-response space
> is reduced to nq possibilities. If nq << nm we have Kolmogorov. Our
> search algorithm is "intelligent".

Well, I think your logic is a real stretch for that example.

> EVOLUTION - The human genome is 800MB, the Chimpanzee genome is also
> 800MB. What is the minimal representation? Not 1.6GB since a lot of
> both genomes is common. At least 98% is common, so the minimal
> representation of the 2 genomes is little more than that of one. I am
> now going to ask another question. What is the minimal DNA
> representation of life? Now that is going to represent a descent tree.
> We came from the same stock as Chimps. This is implicit in terms of
> minimal representation.  A minimal representation of life therefore
> contains a descent tree which can be compared to the fossil record.
> Minimal representation in fact reproduces the fossil record and does a
> lot more as well.

Well, the process of evolution is one which follows an optimal path by its
very nature. It takes the first option it finds which works, which creates
some expectation of compression since, by its nature, it is more likely to
find the simpler (shorter/most compressed) options first.
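That shared-stock argument is easy to demonstrate with an off-the-shelf
compressor. A toy sketch, assuming random stand-in "genomes" and zlib as
the compressor (nothing here models real DNA):

```python
import random
import zlib

random.seed(0)

# Stand-in "genomes": two sequences that share ~98% of their content,
# like the human and chimp genomes discussed above (scaled down hugely).
bases = "ACGT"
genome_a = "".join(random.choice(bases) for _ in range(10_000))

# Mutate ~2% of positions to get the second genome.
genome_b = list(genome_a)
for i in random.sample(range(len(genome_b)), 200):
    genome_b[i] = random.choice(bases)
genome_b = "".join(genome_b)

separate = (len(zlib.compress(genome_a.encode())) +
            len(zlib.compress(genome_b.encode())))
together = len(zlib.compress((genome_a + genome_b).encode()))

# The joint representation is noticeably smaller than two independent
# ones, because the compressor reuses the shared 98%.
print(separate, together)
```

So the minimal representation of the pair is little more than that of one
genome, exactly as the post argues.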

> If we define "intelligence" as the ability to see connection between
> different species we thus have the Kolmogorov definition. I have
> defined similarities in terms of DNA rather than morphology, but the
> same principles apply. In fact DNA has shown that occasionally
> morphology gives a wrong classification as morphological similarity
> may simply mean evolution to a niche.
>
> UNKNOWN LANGUAGE - Compression represents our ability to predict what
> the next word will be.

That is true for everything.  That is, compression is how we store the
information about what to expect next from the environment. Our environment
"talks to us" as much as any human talks to us.

> Grammar can in fact be defined as a set of
> predictive rules.

Yeah, that works for me.
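A minimal sketch of "grammar as a set of predictive rules" - a bigram
model over a made-up corpus (the corpus and function names are mine,
purely illustrative):

```python
from collections import Counter, defaultdict

# A grammar, in the sense above, as predictive rules: from a tiny
# corpus, learn which word most often follows each word.
corpus = "the cat sat on the mat the cat ate the fish".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict(word):
    """Most likely next word, or None if the word was never seen."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict("the"))   # 'cat' - "the" is followed by cat, cat, mat, fish
```

The better the predictor, the fewer bits each next word costs to encode,
which is the link between grammar and compression being discussed.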

> A minimum compression of accounts is going to show
> that the text has got certain characteristics, the characteristics of
> accounts. We ask what is possible? We are here in the realm of what is
> computable and what is not. What has interested me about the Indus
> valley has been the fact that AI has got further than anyone or
> anything else in decoding it. If P = NP (I don't personally believe it
> does) we will be able to get out at least the numbers. If we are adding
> up a column of figures we are coming close to computability.
>
> 2 + 2 = 4, 4 + 5 = 9
>
> We know in fact that there is a polynomial solution to this. Say x + x
> = y, y + z = w. This is a matrix and in fact cubic.
>
> SCHOLARSHIP - I am going to ask a very similar question. What is the
> minimal representation of a number of ancient languages. Qu'ranic
> Arabic, Syriac, Modern Standard Arabic?  If I find some fragments of
> an old Qu'ran, in Yemen say, can I compress them. This compression
> represents the most probable interpretation.

Well, let me throw out an idea for you to think about there.

If we see written text in a language we don't know, and study it, we will
notice how some patterns seem to repeat in the marks we see on the page.
We will learn to recognize letters, and words, based on the fact that these
shape marks tend to repeat themselves.  So our ability to recognize
letters, and words in the marks on the paper, will be a form of compression
as you talk about (as Wolf has already noted for verbal language).

But something else very important happens at the same time.  If we wanted
to compress the data, we wouldn't just remember the letters and words, we
would remember every little pixel of detail on the paper (as would happen
in a jpg image which was compressed).

What we do in addition to compression, is generally ignore data that we
think is not important - the brain uses a system of highly lossy
compression.

For this to work, some decision has to be made about what to keep, and what
to throw away.  This need to assign worth to the data so that _most_ of it
can be thrown away is incompatible with simple Kolmogorov Complexity, which
assumes nothing is lost.

Have you thought about how value is assigned to the data so that the brain
knows what to keep, and what to throw away?
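Here's a toy sketch of that keep/throw-away decision. The scoring scheme
and the `keep_fraction` parameter are pure assumptions on my part - the
point is only that lossy storage needs some externally supplied notion of
value:

```python
# A sketch of lossy, value-weighted storage: each observation gets a
# "value" score (here, a hypothetical reward-association weight), and
# only the highest-value fraction survives. The scores are illustrative
# assumptions, not a model of the brain.

def lossy_store(observations, value_of, keep_fraction=0.25):
    """Keep only the top keep_fraction of observations by assigned value."""
    ranked = sorted(observations, key=value_of, reverse=True)
    kept = max(1, int(len(ranked) * keep_fraction))
    return ranked[:kept]

# Toy scene: (item, reward-association) pairs for an art collector.
scene = [("antique painting", 0.9), ("bookshelf", 0.2),
         ("carpet", 0.1), ("signed print", 0.8)]
remembered = lossy_store(scene, value_of=lambda item: item[1],
                         keep_fraction=0.5)
print(remembered)   # the two high-value items survive
```

Everything below the threshold is simply gone - which is exactly what a
lossless Kolmogorov description is not allowed to do.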

> Scholarship has in fact
> got a striking similarity to Evolution (as investigated by DNA). There
> is one improvement that compression will have over scholarship. It can
> be made to give a numerical probability. Without this each scholar is
> going to give an interpretation that seems probable. We will sometimes
> need to compare different interpretations.
>
> THE ROLE OF MATHEMATICS - Ian Stewart has remarked that different
> scientific disciplines have the same problems, when expressed in
> mathematical terms, but a different language for expressing them.
> Matrix theory is ubiquitous. If we are dealing with information we
> have Maths which is the same as that used for thermodynamics. We
> should really refer to context in terms of Hamiltonians. (viz exp(-E/
> kT)). Doing things this way will tell us how much "intelligence" we
> are adding. How good in strictly quantitative terms each discriminant
> really is.

Beyond me... :)

> Ian Stewart has said that Mathematics should be out to sell. With the
> advent of computers even the more abstract of concepts has a role to
> play. We need to devise generic solution for similar types of problem.
> It may well be that some problems do not have computable solutions.
> Mathematics however is in the business of providing computability. P =
> NP is STILL unsolved. As I said my interest lies in what the current
> boundaries of computability are. The boundaries of computability are,
> of course, the boundaries of AI. This is the essential question.
>
> If someone seriously claims that the Indus valley is NOT computable,
> that accounts cannot be deciphered I suggest they write to Clay! If P
> = NP any unknown language is computable.
>
> Scot Frye whose post crossed mine gives a method similar to what I
> have stated earlier. As far as Kolmogorov is concerned it can be seen
> at once that sets of parts of speech represent a compression. You have
> to work out the permutations and combinations.
>
>   - Ian Parker

Multiple people seem to feel the draw to approach an understanding of
intelligence from the perspective of compression which they feel creates a
basic understanding of the data.  I agree with that, but it's only half the
problem.

Intelligence is about decision making.  We see it at the very high level as
we make our daily decisions in life, but for the brain, it extends all the
way down to the micro level of the generation of pulses to make our body
move.  Each pulse the brain sends out is a decision made by the brain.

Understanding the environment alone (compression alone), can't explain how
decisions are made, which is the core problem of intelligence.  Why is one
decision "intelligent" whereas another decision is not intelligent?

I can build a robot with behaviors generated by a random number generator.
This robot is making "decisions" for each command it sends to its wheels
and arms, but why would we say this robot was not intelligent?  It's not
because it was lacking compression.

Compression can help an AI understand that if it swings an ax towards its
finger, the finger is most likely going to be separated from the hand.
It will be able to predict that even before it happens, because of
compression - because it can identify the most likely common patterns that
exist between things it's seen in the past (like chopping carrots), with
what is likely to happen with the finger.

But what compression can't do, is explain why the robot decided to cut the
carrots to make a snack, and not cut the fingers off its own hand to make
some finger food.

What compression alone can't do, is explain why one decision (carrot
cutting) is intelligent, and another decision (finger cutting) is not
intelligent.

To explain intelligent decisions, we have to talk about goals of some type.
And the most generic type of goal, the one and only type of goal that is a
super set of all human goals, is the problem of reinforcement learning -
the most generic type of goal in which all other goals can be framed.  It's
the goal of maximizing _future_ rewards.  Compression plays a very
important role in achieving this goal, but it's not the goal itself.  We
could talk about maximal compression as being a goal, but that goal doesn't
explain the finger vs carrot problem.

Reinforcement learning, is the problem of maximizing future rewards, and to
solve that problem, the agent must assign values to everything - that value
being the association between that thing and past rewards.  This is how a
machine gains values, and purpose.  It's why it's able to throw out most of
the data, and keep only what is most important to it.  It's how it's able
to prioritize the importance of the data it is compressing and how it can
gain more value, by using its limited storage to only represent the state
of the environment which is most important to it.
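A minimal sketch of that value-assignment idea in code: tabular
reinforcement learning on a made-up two-action problem. The environment
and reward function are illustrative assumptions, not a model of the
brain:

```python
import random

random.seed(1)

# Minimal value learning on a hypothetical 2-action problem: the agent
# learns a value for each action purely from the rewards it receives.
# Action 1 pays off; action 0 does not.
q = [0.0, 0.0]          # value estimates for actions 0 and 1
alpha = 0.1             # learning rate

def reward(action):
    return 1.0 if action == 1 else 0.0

for _ in range(500):
    a = random.randrange(2)             # explore both actions
    q[a] += alpha * (reward(a) - q[a])  # move estimate toward the reward

print(q)   # action 1's value ends near 1.0, action 0's stays at 0.0
```

The learned values are exactly "the association between that thing and
past rewards", and choosing the higher-valued action is the carrot-not-
finger decision that compression alone cannot supply.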

It's why a compressed jpg image of a scene may take up 1 MB of data, but an
"intelligent" compression of the same data may only be 100 bytes of data -
it's because a correctly trained intelligent agent only needs to decode the
small part of the scene which is important to it.

If a person walks into a room, studies it for 10 seconds, and then is asked
what they remember, what will they be able to recall?

It's not the same for everyone, because everyone is not using a single
"compression" algorithm to represent the contents of the room.  Everyone
has their own compression algorithm which is trained to remember the things
that are most important to them.

If someone is a collector of antique art, and there is a lot of antique art
in the room, that person will remember all sorts of details about the art.
If another person is an avid reader of science fiction, and there is a
large collection of science fiction books in a book case, they will
recognize and remember facts about that.  The second person may only
remember "there were some pictures on the wall", whereas the first
remembers how many, who they were by, approximately what they are worth,
and a long list of other details.  But the first person only remembers,
"there were some books on a book shelf - I think".

It's because the compression algorithm at work in each person's brain has
been trained and optimized to best represent those things that are
important to that person.

Compression explains how we are able to do so much, with what is relatively
a small amount of hardware, but it doesn't explain the fundamental problem
of intelligence - which is a decision making problem.

Intelligence is about making the right decisions.  Compression simply gives
a huge boost to the quality of the decisions that can be made.

We know how to make intelligent machines already.  That is, we know how to
make machines that make wise decisions in response to being trained by
rewards.  But what we don't know how to do, is harness the power of
compression to boost those programs out of the toy domains they are
currently limited to, and into the very high dimension complex domains of
real life that the brain works so well in.  Compression is what's missing,
but it's not the core of what intelligence is all about.

-- 
Curt Welch                                            http://CurtWelch.Com/
curt@kcwc.com                                        http://NewsReader.Com/
0
curt
5/23/2009 8:29:20 PM
Mok-Kong Shen <mok-kong.shen@t-online.de> wrote:
> Wolf K wrote:
>
> [snip]
> > ..... IOW, the
> > difference between languages is that of optional and compulsory
> > expression of concepts. And that in turn is a matter of culture.
> [snip]
>
> Doesn't this imply that people of different native languages "think"
> differently?

I don't know what Wolf was implying, but it seems quite obvious to me that
people of different native languages do think differently.

In order to correctly communicate with a language, our brain must be shaped
to fit it.  That is, how we perceive our world, is heavily influenced by
our language.  We train our brains to see the world, using the concepts,
and structures, of our native language.

To learn a second language we must build a more complex understanding of
the environment so that our understanding can fit both languages.  I
suspect very few people really manage to do that well.  I suspect most
people that learn a second language tend to think in terms of their main
language, and translate those thoughts to the second, instead of ever
learning to think in multiple languages.

I suspect in order to think in multiple languages, one would have to use
them all with nearly equal frequency to keep the mind shaped to deal with
all of the concepts equally.  Maybe kids that grow up in a mixed language
family and a mixed language society learn to do that?

> M. K. Shen

-- 
Curt Welch                                            http://CurtWelch.Com/
curt@kcwc.com                                        http://NewsReader.Com/
0
curt
5/23/2009 9:16:28 PM
Ian Parker wrote:
> On 23 May, 13:54, Wolf K <weki...@sympatico.ca> wrote:
> 
>>> Doesn't this imply that people of different native languages "think"
>>> differently?
>>> M. K. Shen
>> That's the Whorf hypothesis, and it's been taken up mostly by literary
>> critics, who've found it congenial to their observations about style.
>> Psychologists however have not been able to find a definitive answer to
>> this question. It may be an unanswerable.
>>
>> Literary critics also point out that writers differ in their "sensorium"
>> -- a writer's style (choice of words, imagery) suggests which senses
>> dominate in his or her experience. Read Helen Keller, for example. She
>> was deaf and mute. You'll find that apart from conventional cliches she
>> does not use visual imagery.
>>
>> Bi-/multi-lingual people (I'm one) claim that speaking/thinking in
>> different languages "feels different." That's my experience, too, but
>> it's very difficult to explain -- because presumably a thought
>> expressible in A can't, by definition, be expressed in B.... ;-)
>>
> There are in fact 2 questions here. Do people speaking different
> languages brought up in the same (in broad terms) culture think
> differently? I think the answer seems to be "no". There is another
> question "Do languages reflect culture?" The answer to that question
> must be "yes". The Bushmen of the Kalahari have a language based on
> clicks. This suits their environment, they are hunter/gatherers and
> their language reflects the "fittest" hunting language. There is a
> question of "survival of the fittest" in terms of the evolution of a
> language.
> 
> BTW - If you look in Kaye and Laby you will find that high pitched
> sounds are attenuated more rapidly. For a person living in a
> "civilised" environment, this does not matter, although for a hunter
> it is of the utmost importance.
> 
> 
>   - Ian Parker
0
Wolf
5/24/2009 12:19:38 AM
Ian Parker wrote:
[...]
> There are in fact 2 questions here. Do people speaking different
> languages brought up in the same (in broad terms) culture think
> differently? I think the answer seems to be "no".

English was our family language, and German our public language. So you 
might say that we were raised bi-lingual in "the same culture". But 
English and German still "feel different."

> There is another
> question "Do languages reflect culture?" The answer to that question
> must be "yes". The Bushmen of the Kalahari have a language based on
> clicks. This suits their environment, they are hunter/gatherers and
> their language reflects the "fittest" hunting language. There is a
> question of "survival of the fittest in terms of the evolution of a
> language.
> 
> BTW - If you look in Kaye and Laby 

Thanks

> you will find that high pitched
> sounds are attenuated more rapidly. For a person living in a
> "civilised" environment, this does not matter, although for a hunter
> it is of the utmost importance.

Clicks are very high frequency sounds, with very sharp attack and decay. 
So I don't think the click languages are good support for your 
hypothesis (which seems to be that features of a given language survive 
because they suit the culture of the speakers).

cheers,

wolf k.
0
Wolf
5/24/2009 12:24:41 AM
Curt Welch wrote:
> Mok-Kong Shen <mok-kong.shen@t-online.de> wrote:
>> Wolf K wrote:
>>
>> [snip]
>>> ..... IOW, the
>>> difference between languages is that of optional and compulsory
>>> expression of concepts. And that in turn is a matter of culture.
>> [snip]
>>
>> Doesn't this imply that people of different native languages "think"
>> differently?
> 
> I don't know what Wolf was implying, but it seems quite obvious to me that
> people of different native languages do think differently.
> 
> In order to correctly communicate with a language, our brain must be shaped
> to fit it.  That is, how we perceive our world, is heavily influenced by
> our language.  We train our brains to see the world, using the concepts,
> and structures, of our native language.
> 
> To learn a second language we must build a more complex understanding of
> the environment so that our understanding can fit both languages.  I
> suspect very few people really manage to do that well.

In fact millions of people do. The trick is to grow up bi-/multi-lingual. 
OTOH, people who have learned their second language relatively late in 
life apparently use a different part of the brain when speaking the 
second language than when speaking the first one. (Sorry, reference lost).

> I suspect most
> people that learn a second languages tend to think in terms of their main
> language, and translate those thoughts to the second, instead of ever
> learning to think in multiple languages.

Yes, this is true of people who learn the second language as young 
adults or later. Those who learn multiple languages in childhood can and 
do think in different languages. (I do, so much so that I find it 
difficult to translate my Austrian relatives' letters into English for 
the benefit of my wife, who is monolingual.)

> I suspect in order to think in multiple languages, one would have to use
> them all with nearly equal frequency to keep the mind shaped to deal with
> all of the concepts equally.  Maybe kids that grow up in a mixed language
> family and a mixed language society learn to do that?
> 
>> M. K. Shen

Yes, but there's an odd effect: I find that for some kinds of discourse 
I prefer English, for other kinds, either language will do. Considering 
your insights, you should be able to guess which. ;-)

cheers,

wolf k.
0
Wolf
5/24/2009 12:39:33 AM
On 24 May, 01:24, Wolf K <weki...@sympatico.ca> wrote:
> Ian Parker wrote:
>
> [...]
>
> > There are in fact 2 questions here. Do people speaking different
> > languages brought up in the same (in broad terms) culture think
> > differently? I think the answer seems to be "no".
>
> English was out family language, and German out public language. So you
> might say that we were raised bi-lingual in "the same culture". But
> English and German still "feel different."
>
There is nothing in English that cannot be expressed in German and
vice versa. Looking at this from an experimental viewpoint, the way in
which we think is determined by our environment and upbringing, not
primarily by language. In fact our language fits what we want to
express, not the other way round.

There are differences in the way that an Englishman and a German look
at the world, but these are differences of culture and geography.
History too plays its part. Germany is in Mitteleuropa and has a
continental outlook. German industry rests on exporting to the EU.
Britain on the other hand has a more insular outlook. This is
EVERYTHING to do with geography, with a certain amount of history, and
has NOTHING to do with language.

There is one point to bear in mind. People who are bilingual tend to
be more intelligent in lots of ways than those who are not. It would
seem that the effort of learning another language increases our
intellectual potential. This would fit in with your feelings. As I and
other people have said, psychometric measures show no differences
between one language and another, although there are real differences
between polyglots and monoglots.

> > There is another
> > question "Do languages reflect culture?" The answer to that question
> > must be "yes". The Bushmen of the Kalahari have a language based on
> > clicks. This suits their environment, they are hunter/gatherers and
> > their language reflects the "fittest" hunting language. There is a
> > question of "survival of the fittest in terms of the evolution of a
> > language.
>
> > BTW - If you look in Kaye and Laby
>
> Thanks
>
> > you will find that high pitched
> > sounds are attenuated more rapidly. For a person living in a
> > "civilised" environment, this does not matter, although for a hunter
> > it is of the utmost importance.
>
> Clicks are very high frequency sounds, with very sharp attack and decay.
> So I don't think the click languages are good support for your
> hypothesis (which seems to be that features of a given language survive
> because the suit the culture of the speakers.)
>
That is the whole point. You use clicks when you are a few metres away
from the person you are talking to but a kilometre or more away from your
prey.


  - Ian Parker
0
Ian
5/24/2009 10:20:34 AM
On 24 May, 01:39, Wolf K <weki...@sympatico.ca> wrote:
>
> In fact millions of people do. The trick is grow up bi-/multi-lingual.
> OTOH, people who have learned their second language relatively late in
> life apparently use a different part of the brain when speaking the
> second language than when speaking the first one. (Sorry, reference lost).
>
This depends on when you learn your second language. If you learn it
later on in life it is an intellectual exercise. If you are not a
native Arabic speaker you are more likely to notice that two women
have more than 2 husbands. (The Qur'an). You see you are concerned
with the structure of duals and plurals.

Intellectual exercises go into the other hemisphere.


  - Ian Parker
0
Ian
5/24/2009 10:27:22 AM
Ian Parker wrote:
> On 24 May, 01:24, Wolf K <weki...@sympatico.ca> wrote:
>> Ian Parker wrote:
>>
>> [...]
>>
>>> There are in fact 2 questions here. Do people speaking different
>>> languages brought up in the same (in broad terms) culture think
>>> differently? I think the answer seems to be "no".
>> English was out family language, and German out public language. So you
>> might say that we were raised bi-lingual in "the same culture". But
>> English and German still "feel different."
>>
> There is nothing in English that cannot be expressed in German and
> vice versa.

Read George Steiner's After Babel, and also Quine on translation.

[...]
0
Wolf
5/24/2009 12:21:33 PM
Ian Parker wrote:
> On 24 May, 01:39, Wolf K <weki...@sympatico.ca> wrote:
>> In fact millions of people do. The trick is grow up bi-/multi-lingual.
>> OTOH, people who have learned their second language relatively late in
>> life apparently use a different part of the brain when speaking the
>> second language than when speaking the first one. (Sorry, reference lost).
>>
> This depends on when you learn your second language.{...]


As I said, and more than once, too.

cheers,

wolf k.



0
Wolf
5/24/2009 12:22:10 PM
If you are going to have a definition of "Intelligence" it must be a
definition that has RIGOUR. Now: "What is the difference between
pouring concrete at 50C and fighting Israel?" This arose from a Google
translation of "wsT jw AlmErkp" as "central air battle". My logic is
that anyone who fails to ask that question in a Turing Test is a "true
believer". We therefore see that subjectivity is at the heart of
Turing. A definition based on compressibility is something unambiguous
that can be calculated.

The essence of what you are saying is that Intelligence is defined as
the capacity to reach a goal. This can be reconciled with compression
in the following way. Let us define a goal to be an energy well. Now
compression is the metric in a flat space which lacks energy wells. In
a space with energy wells we define the probability of a particular
configuration of phase space as :-

exp(-E/kT) or exp(E/kT), depending on sign convention. Notice I am
using "kT". This links the whole idea up with physical thermodynamics.
We have to minimize the energy, which maximizes this probability, and
the energy is minimized when you are in a well. You may say that we
forget, and concentrate on what is important. True, but this implies
that the thermodynamics is such that only the well matters.
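The exp(-E/kT) weighting can be checked numerically. A toy sketch with
made-up energies: one deep "goal" well and two flat states (all numbers
here are illustrative):

```python
import math

# Boltzmann weighting exp(-E/kT) over a toy phase space: one deep
# "goal" well (E = -2) and two flat states (E = 0).
energies = {"goal well": -2.0, "flat A": 0.0, "flat B": 0.0}

def boltzmann(energies, kT):
    """Normalized exp(-E/kT) probabilities over the states."""
    weights = {s: math.exp(-e / kT) for s, e in energies.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}

for kT in (2.0, 0.5):
    probs = boltzmann(energies, kT)
    print(kT, round(probs["goal well"], 3))
# As kT falls, the well soaks up nearly all the probability -
# "only the well matters" in the cold limit.
```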

Is this useful? Before you wade in I think there are some points to
consider. First of all contests are not transitive. If A beats B and B
beats C will A beat C? Not necessarily. C may have a style that loses
to B but beats A. Mathematicians are fond of pathological cases.
Unless a relationship is demonstrably transitive you cannot use
victory/defeat as a sole criterion.
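The non-transitivity point is easy to verify with the classic
intransitive dice (the face values are the standard textbook example,
not anything from this thread):

```python
from itertools import product

# The non-transitivity warned about above, made concrete: die A beats
# B, B beats C, and yet C beats A, each with probability 5/9.
A, B, C = [2, 4, 9], [1, 6, 8], [3, 5, 7]

def beats(x, y):
    """Fraction of face pairings where die x shows the higher number."""
    wins = sum(a > b for a, b in product(x, y))
    return wins / (len(x) * len(y))

print(beats(A, B), beats(B, C), beats(C, A))   # each is 5/9
```

So "A beats B and B beats C" tells you nothing about A versus C, which is
exactly why victory/defeat cannot be the sole criterion.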

A robot has in essence to look at phase space and try to occupy
victorious phase space. What phase space is victorious? Deciding is a
matter of Intelligence. If we classify phase space in this way we are
back to our old choice and compression problem.

One interesting sideline on phase space and victory/defeat is the "Red
Baron". He hung over an airfield at a reasonable altitude and
challenged pilots to take him on. Of course an aircraft taking off has
little energy. The Red Baron was occupying victorious phase space. He
just had to dive, he would have speed and hence manoeuvrability. The
opposing aircraft was doomed whatever it did.

http://www.cowboylyrics.com/tabs/misc/snoopy-versus-the-red-baron-3311.html
AI would not get you "Buried together in the countryside".

This is an interesting anecdote, but surprisingly important. NIM was
solved using precisely this kind of backwards logic. One can say
immediately that a robot should manoeuvre so that its chance of
victory is maximized. If we define "Intelligence" in terms of phase
space (may have a large number of dimensions) we avoid the transitive
paradox. How useful is this in designing a robot? As I said NIM was
solved using phase space. In other cases it is less useful.
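That backwards logic for NIM can be written out directly: a position is a
win for the player to move iff some legal move leads to a losing
position. A sketch over small positions, checked against the well-known
XOR (nim-sum) rule:

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def wins(heaps):
    """Backward induction: True iff the player to move can force a win."""
    heaps = tuple(sorted(h for h in heaps if h > 0))
    if not heaps:
        return False          # no move left: the previous player took the last object
    for i, h in enumerate(heaps):
        for take in range(1, h + 1):
            after = heaps[:i] + (h - take,) + heaps[i+1:]
            if not wins(after):
                return True   # a move into a losing position wins
    return False

# Check the classic result on all 3-heap positions with heaps up to 4:
# the position is losing exactly when the XOR of the heap sizes is 0.
for heaps in product(range(5), repeat=3):
    assert wins(heaps) == (heaps[0] ^ heaps[1] ^ heaps[2] != 0)
print("backward induction matches the XOR rule")
```

This is the same shape of reasoning as the Red Baron anecdote: classify
the phase space by working back from the terminal positions.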

Of course Evolution advances with a minimum change of genes. My
presentation of the resolution of life using maximum compression is
looking at life from the standpoint of the present day. Any sort of
analysis or scholarship involves getting the most probable solution.


  - Ian Parker
0
Ian
5/24/2009 1:32:24 PM
"Curt Welch" <curt@kcwc.com> wrote in message 
news:20090523171531.615$FV@newsreader.com...
> Mok-Kong Shen <mok-kong.shen@t-online.de> wrote:
>> Wolf K wrote:
>>
>> [snip]
>> > ..... IOW, the
>> > difference between languages is that of optional and compulsory
>> > expression of concepts. And that in turn is a matter of culture.
>> [snip]
>>
>> Doesn't this imply that people of different native languages "think"
>> differently?
>
> I don't know what Wolf was implying, but it seems quite obvious to me that
> people of different native languages do think differently.

Do people who speak different languages behave differently? If there is a 
sense in which this is true - and I think there is - then it is likely that 
they think differently. No? In your view, is "thinking" something other than 
behavior?

>
> In order to correctly communicate with a language, our brain must be 
> shaped
> to fit it.

What does this mean? What do you mean by "language" here? Is it something 
other than a person's verbal repertoire?

>That is, how we perceive our world, is heavily influenced by
> our language.  We train our brains to see the world, using the concepts,
> and structures, of our native language.

We "train our brains"? What could this mean?

>
> To learn a second language we must build a more complex understanding of
> the environment so that our understanding can fit both languages.

I don't understand how this can make sense.

>I
> suspect very few people really manage to do that well.  I suspect most
> people that learn a second languages tend to think in terms of their main
> language, and translate those thoughts to the second, instead of ever
> learning to think in multiple languages.

Does this mean that you think that utterances are translations of 
"thoughts"? If so, is this the case when one speaks only one language?

>
> I suspect in order to think in multiple languages, one would have to use
> them all with nearly equal frequency to keep the mind shaped to deal with
> all of the concepts equally.  Maybe kids that grow up in a mixed language
> family and a mixed language society learn to do that?

By "think[ing] in multiple languages" do you mean "talking to oneself"?

0
Glen
5/24/2009 4:36:29 PM
"Wolf K" <wekirch@sympatico.ca> wrote in message 
news:4a189647$0$18005$9a6e19ea@news.newshosting.com...
> Curt Welch wrote:
>> Mok-Kong Shen <mok-kong.shen@t-online.de> wrote:
>>> Wolf K wrote:
>>>
>>> [snip]
>>>> ..... IOW, the
>>>> difference between languages is that of optional and compulsory
>>>> expression of concepts. And that in turn is a matter of culture.
>>> [snip]
>>>
>>> Doesn't this imply that people of different native languages "think"
>>> differently?
>>
>> I don't know what Wolf was implying, but it seems quite obvious to me 
>> that
>> people of different native languages do think differently.
>>
>> In order to correctly communicate with a language, our brain must be 
>> shaped
>> to fit it.  That is, how we perceive our world, is heavily influenced by
>> our language.  We train our brains to see the world, using the concepts,
>> and structures, of our native language.
>>
>> To learn a second language we must build a more complex understanding of
>> the environment so that our understanding can fit both languages.  I
>> suspect very few people really manage to do that well.
>
> In fact millions of people do. The trick is grow up bi-/multi-lingual. 
> OTOH, people who have learned their second language relatively late in 
> life apparently use a different part of the brain when speaking the second 
> language than when speaking the first one. (Sorry, reference lost).

I wonder if it is not ultimately misleading to say that someone "uses" their 
brain to speak. True, we do sometimes say that a person used a part of their 
body as in "He used his hand to break the window." But this is simply the 
same as saying "He broke the window with his hand." But would one say "He 
used his [insert brain part] to speak"? Would the analogous substitution be 
"He spoke with his [insert brain part]." This seems somewhat awkward, which 
could mean that it is an ineffective way to talk. Even in the case of "He 
used his hand to break the window" no one would say that "He used only his 
hand to break the window" except, perhaps,  in the unusual circumstance that 
one was in possession of one's disembodied hand and, perhaps, threw it 
through the window. Even if one wants to keep the term "cognition," as in 
"embodied cognition," it probably behooves one to remember that the root of 
"embodied" is "body," and it would be bizarre to say that "one used their 
body to behave." One simply behaves. But having said that, it is almost 
certain that the behavior of a person speaking a "second language," acquired 
late in life is quite different from that of a child that grows up in a 
bilingual household. Indeed, it is somewhat misleading - at least in some 
important sense - to say that the child speaks "two languages," at least 
from the standpoint of the individual child.  The truly bilingual child is 
simply speaking and, no doubt, depending on the audience and circumstance, 
says "I'm hungry" vs. J'ai faim." No substantially-different principles must 
be invoked to explain the two utterances. But this is not the case - at 
least quite often - for a person who has "learned a second-language" late in 
life. In the latter condition, the person almost always - except with 
respect to a few dozen (or even a few hundred) phrases - privately says the 
first language locution, and then constructs the "second language" using 
rules and substitution. How could the physiology that mediates such behavior 
NOT be different from that of the child that has simply "learned to speak 
two languages." The utterances of these two very different people might have 
some structural characteristics in common, but the responses are, 
functionally, quite different.

>
>> I suspect most
>> people that learn a second language tend to think in terms of their main
>> language, and translate those thoughts to the second, instead of ever
>> learning to think in multiple languages.
>
> Yes, this is true of people who learn the second language as young adults 
> or later. Those who learn multiple languages in childhood can and do think 
> in different languages. (I do, so much so that I find it difficult to 
> translate my Austrian relatives' letters into English for the benefit of 
> my wife, who is monolingual.)

This makes a great deal of sense from the standpoint of what I said above, 
except this aspect deals with utterances as a stimulus rather than as a 
response (though the two are intimately entwined), and is a revealing 
personal account. [Thank you, BTW.] The letters from your relatives, in 
German, control your behavior as a listener but, even though you are "truly 
bilingual," you must engage in the same sorts of mediating behavior  that a 
"second-language" person would have to emit. A "skilled translator" would, 
no doubt, acquire a large repertoire of responses where whole phrases in one 
language, serve as discriminative stimuli for responses "in the other 
language." Such a person would have trouble frequently - I am guessing - 
with extended utterances that contain few standard phrases. The skilled 
translator would do well  under many circumstances, but would have a 
difficult time in a courtroom, for example, when an expert witness is 
speaking.

Cordially,
Glen
>
>> I suspect in order to think in multiple languages, one would have to use
>> them all with nearly equal frequency to keep the mind shaped to deal with
>> all of the concepts equally.  Maybe kids that grow up in a mixed language
>> family and a mixed language society learn to do that?
>>
>>> M. K. Shen
>
> Yes, but there's an odd effect: I find that for some kinds of discourse I 
> prefer English, for other kinds, either language will do. Considering your 
> insights, you should be able to guess which. ;-)
>
> cheers,
>
> wolf k. 

0
Glen
5/24/2009 8:40:28 PM
Wolf K wrote:
[snip]

> Bi-/multi-lingual people (I'm one) claim that speaking/thinking in 
> different languages "feels different." That's my experience, too, but 
> it's very difficult to explain -- because presumably a thought 
> expressible in A can't, by definition, be expressed in B.... ;-)

The tense system in English is quite elaborate in comparison with
some other languages. If I don't err, both present and present
continuous in English have to be translated to present in Russian.
A Russian certainly could in some sort of roundabout way express the
exact equivalent of present continuous in English. But, since he
in daily life doesn't do that, he thinks normally differently than
an Englishman. Could one argue like that to support the Whorf
hypothesis?

Thanks.

M. K. Shen
0
Mok
5/24/2009 9:50:17 PM
On May 24, 5:50 pm, Mok-Kong Shen <mok-kong.s...@t-online.de> wrote:
> Wolf K wrote:
>
> [snip]
>
> > Bi-/multi-lingual people (I'm one) claim that speaking/thinking in
> > different languages "feels different." That's my experience, too, but
> > it's very difficult to explain -- because presumably a thought
> > expressible in A can't, by definition, be expressed in B.... ;-)
>
> The tense system in English is quite elaborate in comparison with
> some other languages. If I don't err, both present and present
> continuous in English have to be translated to present in Russian.
> A Russian certainly could in some sort of roundabout way express the
> exact equivalent of present continuous in English. But, since he
> in daily life doesn't do that, he thinks normally differently than
> an Englishman. Could one argue like that to support the Whorf
> hypothesis?

   I don't think so, since what Englishmen call present continuous has
   more to do with their government than their language.
   Which is also where acres come from.

>
> Thanks.
>
> M. K. Shen

0
zzbunker
5/24/2009 10:26:05 PM
Ian Parker <ianparker2@gmail.com> wrote:
> If you are going to have a definition of "Intelligence" it must be a
> definition that has RIGOUR.

Why?  Language has millions of words, most of which are defined with no
rigor.

I think you just happen to be a typical hyper-rational that is attracted to
such rigor.

I don't believe it's possible to capture the essence of the concept of
intelligence with rigor.  I tend to relate it most closely with the concept
of an agent that is trained by reinforcement, which gives it a far more
rigorous and firm foundation than most common usage, but still, that
doesn't give it the rigor of a mathematical definition.  That's because
there is no way to look at some hunk of matter and conclude conclusively
whether or not it's a reinforcement learning agent.

> Now "What is the difference between
> pouring concrete at 50C and fighting Israel? This arose from a Google
> Translation "wsT jw AlmErkp=A0" as "central air battle". My logic is
> that anyone who fails to ask that question in a Turing Test is a "true
> believer". We therefore see that subjectivity is at the heart of
> Turing. A definition based on compressibility is something unambiguous
> that can be calculated.
>
> The essence of what you are saying is that Intelligence is defined as
> the capacity to reach a goal.

Well, actually no.  Reinforcement learning has no end goal, so it's not
about _reaching_ the goal. It's only about seeking to _maximize_ the reward
signal - aka, do as best as the agent can do.  There's not even a minimal
performance requirement really - simply that it be so structured that the
expected probability of it improving its reward over time is higher than
the probability of not doing so.
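Curt's picture here - an agent shaped so that it tends to improve its reward over time, with no end goal, only ongoing maximization - can be sketched as a minimal epsilon-greedy bandit learner. This is a generic reinforcement-learning illustration, not anything proposed in the thread; the arm means and parameters are invented:

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Minimal reinforcement learner: pick actions, observe reward,
    shift estimates toward whatever pays off. There is no end goal -
    it just keeps trying to improve its average reward."""
    rng = random.Random(seed)
    n = len(true_means)
    estimates = [0.0] * n
    counts = [0] * n
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:           # explore occasionally
            a = rng.randrange(n)
        else:                                # otherwise exploit best estimate
            a = max(range(n), key=lambda i: estimates[i])
        reward = rng.gauss(true_means[a], 1.0)
        counts[a] += 1
        estimates[a] += (reward - estimates[a]) / counts[a]  # running mean
        total += reward
    return estimates, total / steps

estimates, avg = run_bandit([0.2, 1.0, 0.5])
# The learner's average reward approaches the best arm's mean without
# ever being told what "the goal" is - it only sees the reward signal.
```

Note there is no success criterion anywhere in the loop, matching the point above: only the structure that makes improvement more probable than not.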

> This can be reconciled with compression
> in the following way. Let us define a goal to be an energy well. Now
> compression is the metric in a flat space which lacks energy wells. In
> a space with energy wells we define the probability of a particular
> configuration of phase space as :-
>
> exp(-E/kT) or exp(E/kT) depending on sign. Notice I am using "kT".
> This links the whole idea up with physical thermodynamics. We have to
> minimize this quantity. This quantity is minimized when you are in a
> well. If you say that we forget and concentrate on what is important.
> True but this implies that the thermodynamics is such that only the
> well matters.

I don't understand the connection between compression and minimizing
exp(E/kT) you seem to see.

> Is this useful? Before you wade in I think there are some points to
> consider. First of all contests are not transitive. If A beats B and B
> beats C will A beat C? Not necessarily. C may have a style that loses
> to B but beats A. Mathematicians are fond of pathological cases.
> Unless a relationship is demonstrably transitive you cannot use
> victory/defeat as a sole criterion.

That would only apply if you were trying to quantify intelligence.  I don't
suspect that even applies.

> A robot has in essence to look at phase space and try to occupy
> victorious phase space. What phase space is victorious? Deciding is a
> matter of Intelligence. If we classify phase space in this way we are
> back to our old choice and compression problem.

Yeah, well, the general idea of reinforcement learning is that _some_
(unspecified) parameter is attempting to be maximized by physically
transforming the agent.  It's not relevant what the parameter is.

> One interesting sideline on phase space and victory/defeat is the "Red
> Baron". He hung over an airfield at a reasonable altitude and
> challenged pilots to take him on. Of course an aircraft taking off has
> little energy. The Red Baron was occupying victorious phase space. He
> just had to dive, he would have speed and hence manoeuvrability. The
> opposing aircraft was doomed whatever it did.
>
> http://www.cowboylyrics.com/tabs/misc/snoopy-versus-the-red-baron-3311.html
> AI would not get you "Buried together in the countryside".
>
> This is an interesting anecdote, but surprisingly important. NIM was
> solved using precisely this kind of backwards logic.

What is NIM?

What "backwards logic" are you talking about?

> One can say
> immediately that a robot should manoeuvre so that its chance of
> victory is maximized. If we define "Intelligence" in terms of phase
> space (may have a large number of dimensions) we avoid the transitive
> paradox. How useful is this in designing a robot? As I said NIM was
> solved using phase space. In other cases it is less useful.

I don't really understand your use of the phrase "phase space" here.  I
assume it's something common in physics?

All interesting reward maximizing problems are multidimensional because the
learning agent will have multiple parameters it will attempt to adjust for
the purpose of maximizing the reward.

> Of course Evolution advances with a minimum change of genes. My
> presentation of the resolution of life using maximum compression is
> looking at life from the stand-point of the present day. Any sort of
> analysis or scholarship involves getting the most probable solution.
>
>   - Ian Parker

-- 
Curt Welch                                            http://CurtWelch.Com/
curt@kcwc.com                                        http://NewsReader.Com/
0
curt
5/24/2009 10:48:38 PM
Glen M. Sizemore wrote:
> 
> "Wolf K" <wekirch@sympatico.ca> wrote in message 
> news:4a189647$0$18005$9a6e19ea@news.newshosting.com...
[...]
>> Yes, this is true of people who learn the second language as young 
>> adults or later. Those who learn multiple languages in childhood can 
>> and do think in different languages. (I do, so much so that I find it 
>> difficult to translate my Austrian relatives' letters into English for 
>> the benefit of my wife, who is monolingual.)
> 
> This makes a great deal of sense from the standpoint of what I said 
> above, except this aspect deals with utterances as a stimulus rather 
> than as a response (though the two are intimately entwined), and is a 
> revealing personal account. [Thank you, BTW.] The letters from your 
> relatives, in German, control your behavior as a listener but, even 
> though you are "truly bilingual," you must engage in the same sorts of 
> mediating behavior  that a "second-language" person would have to emit.

Agreed. I have the same problem translating English into German, BTW, 
made worse by the fact that my German lexicon is much smaller than my 
English one (I left Austria when I was 14, so all my secondary and 
post-secondary schooling was in English.)

> A "skilled translator" would, no doubt, acquire a large repertoire of 
> responses where whole phrases in one language, serve as discriminative 
> stimuli for responses "in the other language." Such a person would have 
> trouble frequently - I am guessing - with extended utterances that 
> contain few standard phrases. The skilled translator would do well  
> under many circumstances, but would have a difficult time in a 
> courtroom, for example, when an expert witness is speaking.
>[...]

Your guess is IMO correct. I occasionally watch our (Canadian) 
Parliament on TV. It's bilingual by law. When the French 
Parliamentarians speak, the simultaneous translator occasionally delays 
or stumbles. They work relatively short stints, BTW -- about 1/2 an hour 
at a time. Translation is never easy, simultaneous translation requires 
arduous training, and only a minority of candidates succeed.

By "thinking [in a language]". I of course mean "talking to myself." 
Here's a quirk for you: when I use counting for timing, I usually count 
in German, starting with "ein und zwanzig" (21). Many years ago, I 
trained myself to  to count in German at a rate of 1 number per second. 
The error is about 5 seconds +/- per minute, which is close enough for 
short timings.

I agree that "use one's brain to speak" is metaphorical, loosey-goosey 
talk. What's known is that in many cases of progressive dementia, 
bi-/multi-lingual people lose their most recently acquired language(s) 
first. Also, some fMRIs have shown that different parts of the brain are 
active when a person speaks different languages. FWIW, a singer's brain 
shows similar differences in speaking vs singing. Also, singers can sing 
a second language accent-free even when they cannot speak that language 
accent-free. In fact, they don't even have to be able to speak the 2nd 
language in order to sing it. And most tourists learn a handful of handy 
phrases, which they use in much the same way as hand-signals to indicate 
their desires. If uttering a language required "understanding", this 
would of course be impossible.

cheers,

wolf k.
0
Wolf
5/24/2009 11:39:49 PM
Mok-Kong Shen wrote:
> Wolf K wrote:
> [snip]
> 
>> Bi-/multi-lingual people (I'm one) claim that speaking/thinking in 
>> different languages "feels different." That's my experience, too, but 
>> it's very difficult to explain -- because presumably a thought 
>> expressible in A can't, by definition, be expressed in B.... ;-)
> 
> The tense system in English is quite elaborate in comparison with
> some other languages. If I don't err, both present and present
> continuous in English have to be translated to present in Russian.
> A Russian certainly could in some sort of roundabout way express the
> exact equivalent of present continuous in English. But, since he
> in daily life doesn't do that, he thinks normally differently than
> an Englishman. Could one argue like that to support the Whorf
> hypothesis?
> 
> Thanks.
> 
> M. K. Shen

As to whether and how a language controls one's thinking: that depends 
on what you mean by thinking. If you mean "talking to oneself", then 
your perception will be controlled the same way as when speaking/ 
listening. If you mean something else, well, define it, and some sort of 
answer will emerge more or less murkily from your definition. The fact 
of "selective inattention" is IMO relevant here. Just was we tend not 
notice the new stop sign on a daily route because we're looking for 
something else (such as cross traffic), we tend not to notice what our 
speech doesn't refer to.

Re tense:

By tense I mean an inflection of a verb, not the more general notion of 
"way of denoting time past, present and future." Defined thus, English 
has only two tenses: "unmarked" and "marked." Unmarked is used for the 
"indefinite present", and "marked" is used for the immediate past, and 
the indefinite past (context decides which.) All other "tenses" in 
English are strictly speaking modals. What you call the "continuing 
present" is the _actual_ present. The name "continuous (progressive) 
present" was invented by a C17 school teacher whose knew only Latin 
grammar, which uses adverbials, not tense or modals, to express this 
concept. And the poor guy got it wrong when he applied it to English..

EG:
"I am painting" == my action of painting is occurring as I speak.

"I paint" == my action of painting has occurred and will occur again -- 
IOW, painting is one of my habitual actions, and may be one of my 
skills, or my occupation, or my avocation (as determined by context.)

Generally speaking, IE languages have a mix of tenses and modals to 
denote the time-relationship between an action/event/process and the 
time of speaking about it. No IE language uses only tenses for these 
matters. What little I know of non-IE languages suggests that all 
languages use two or more means for expressing time-relationships.

In German, one uses the present tense for both the actual present 
(common), and the indefinite present (uncommon). One can also use it to 
denote the immediate future (quite common.) Thus, according to context, 
"Ich gehe in die Stadt" can mean "I go to town", "I will go to town very 
soon", and "I am going to town now." If context isn't clear, one can say 
"Ich gehe jetzt in die Stadt"; etc.

BTW, the fact that context may prompt the addition or omission of 
adverbials is one reason why a purely statistical analysis of an unknown 
language's written record will not get us very far.

Those who want to continue believing that if we have written texts, then 
we can at least determine the parts of speech, should consider Etruscan.

cheers,

wolf k.
0
Wolf
5/25/2009 12:06:02 AM
Curt Welch wrote:
> Ian Parker <ianparker2@gmail.com> wrote:
>> If you are going to have a definition of "Intelligence" it must be a
>> definition that has RIGOUR.
> 
> Why?  Language has millions of words, most of which are defined with no
> rigor.
>[...]

True of all human languages taken together, false when referring to any 
one language. Besides, the vocabulary of any speaker of a language is 
much smaller than the lexicon (== all the words ever used by 
speakers/writers of that language.)

cheers,

wolf k.
0
Wolf
5/25/2009 12:10:30 AM
"Glen M. Sizemore" <gmsizemore2@yahoo.com> wrote:

Ah, haven't seen you speak up in some time.  Been busy?

> "Curt Welch" <curt@kcwc.com> wrote in message
> news:20090523171531.615$FV@newsreader.com...
> > Mok-Kong Shen <mok-kong.shen@t-online.de> wrote:
> >> Wolf K wrote:
> >>
> >> [snip]
> >> > ..... IOW, the
> >> > difference between languages is that of optional and compulsory
> >> > expression of concepts. And that in turn is a matter of culture.
> >> [snip]
> >>
> >> Doesn't this imply that people of different native languages "think"
> >> differently?
> >
> > I don't know what Wolf was implying, but it seems quite obvious to me
> > that people of different native languages do think differently.
>
> Do people who speak different languages behave differently?

Sure.  They speak French instead of English.  Isn't that an obvious
difference in behavior?

If we could produce a simple one to one correlation between French-speaking
behavior and English-speaking behavior (i.e. a simple mechanical
translation from one language to the other), we could choose to ignore the
minor syntactical differences in the two behaviors.  But since no such
simple mapping (aka translation) exists, I think we can very simply argue
the behavior is different.

> If there is a
> sense in which this is true - and I think there is - then it is likely
> that they think differently. No?

Yes, I would say so.

> In your view, is "thinking" something
> other than behavior?

Well, thinking to me is mostly a reference to private brain behaviors vs
externally created behaviors, but either way, it's behavior to me.

> > In order to correctly communicate with a language, our brain must be
> > shaped
> > to fit it.
>
> What does this mean? What do you mean by "language" here? Is it something
> other than a person's verbal repertoire?

Yes, that's all I mean by language.  In order to effectively manipulate our
environment by creating the sounds of language, our brain must be "wired"
to correctly produce, and perceive, these unique sound sequences.

> >That is, how we perceive our world, is heavily influenced by
> > our language.  We train our brains to see the world, using the
> > concepts, and structures, of our native language.
>
> We "train our brains"? What could this mean?

Yeah, that's a very sloppy way to talk, isn't it! It's hard to find a way to
talk about these things that everyone can follow.

Becoming conditioned to use and respond to language causes our brain's
classification (perception and action generation) system to be shaped to
fit the requirements of the language behavior we are being trained to
produce.

> > To learn a second language we must build a more complex understanding
> > of the environment so that our understanding can fit both languages.
>
> I don't understand how this can make sense.

The brain has to learn to map sensory perceptions into actions in order to
correctly produce language behaviors.

For example when we see (are exposed to) a fixed frequency of light, we
will be conditioned to map that frequency into a word like "blue" or "red".
In learning the words blue and red, our brain has had its classification
system adjusted to fit the socially accepted definitions of blue and red.

Light of a frequency from 600-650 might be mapped to the behavior of
producing the word "blue", and 650-750 might be mapped to red.

But by social convention, rouge (in French) might correspond to the
frequency range of 700-800 (which overlaps, but does not really fit, the
English word red).

In order for the brain to correctly use both the words red and rouge, it
will need a more complex classification system conditioned into it.

It will need to map 650-700 as red, 700-750 as red or rouge, and 750-800 as
rouge.

An English speaker that wasn't well conditioned to correctly use the word
rouge, would likely first just learn the "short cut" of using their same
"red" classification hardware, and mapping that to "rouge".

It's even possible (just guessing) that parts of the brain get "locked in"
early in life and that an older person couldn't learn to retrain that lower
level classification hardware to correctly do the mapping from light
frequency to these two different, but similar words.
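Curt's invented red/rouge ranges can be written out directly. The numeric ranges below are his made-up illustration (with blue reconstructed the same way), not real color physics:

```python
def classify(freq, lexicon):
    """Map a stimulus 'frequency' to every word whose learned range
    covers it. Overlapping ranges model overlapping color categories
    across the two languages."""
    return [word for word, (lo, hi) in lexicon.items() if lo <= freq < hi]

# The invented ranges from the post: English red vs. French rouge overlap.
bilingual = {"blue": (600, 650), "red": (650, 750), "rouge": (700, 800)}

print(classify(620, bilingual))   # → ['blue']
print(classify(720, bilingual))   # → ['red', 'rouge'] (the overlap region)
print(classify(780, bilingual))   # → ['rouge']
```

The 700-750 overlap region is where the monolingually-trained "short cut" classifier described above would give the wrong answer.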

> > I
> > suspect very few people really manage to do that well.  I suspect most
> > people that learn a second language tend to think in terms of their
> > main language, and translate those thoughts to the second, instead of
> > ever learning to think in multiple languages.
>
> Does this mean that you think that utterances are translations of
> "thoughts"? If so, is this the case when one speaks only one language?

It means that, like with my made up red/rouge example, they have what is
mostly English trained classification hardware conditioned in their brain
which does a poor job of translating their internal classifications to
external verbal behaviors in the second language.  They say rouge, but the
"thought" was really the product of their red classier circuitry.

> > I suspect in order to think in multiple languages, one would have to
> > use them all with nearly equal frequency to keep the mind shaped to
> > deal with all of the concepts equally.  Maybe kids that grow up in a
> > mixed language family and a mixed language society learn to do that?
>
> By "think[ing] in multiple languages" do you mean "talking to oneself"?

Yes. Mostly.  Talking to oneself using concepts/words from multiple
languages.

Learning a second language is not much different than extending a language
you already know.  Either way, you are simply extending your language
behavior.

The difference however is that the words of a single language, for the most
part, don't include a lot of duplication.  When you add the vocabulary of
a second language, most of it duplicates concepts (classifications)
already labeled in the first language.

However, I suspect there are testable examples of how people tend to
classify similar events that are shaped by the language they have learned.
For example, when we see multiple color patches, and are trying to pick out
a paint color, we might logically try to lump the colors into different sets,
such as "the blues", and "the reds".  If we were physically arranging color
patches on a table top so we could look at them, we might sort them into
one row for the red colors, and another row for the blue colors.  How we
might choose to do such a sorting is likely biased by how the classifiers in
our brain have been conditioned by learning a language.

In addition to arranging the color patches on the table, we can perform a
similar sorting of the colors in our head by mentally grouping them into
different sets.  If those sets end up being biased by our language
classification, then that would be an example of what I mean by how
language affects the way we think - even though it includes no direct
"talking to oneself" in that example.

I suspect much of what we do and think about is heavily biased by how
our brain's classification systems have been conditioned because of the
language we have been conditioned to speak.

-- 
Curt Welch                                            http://CurtWelch.Com/
curt@kcwc.com                                        http://NewsReader.Com/
0
curt
5/25/2009 1:09:52 AM
On 24 May, 21:40, "Glen M. Sizemore" <gmsizemo...@yahoo.com> wrote:
> A "skilled translator" would,
> no doubt, acquire a large repertoire of responses where whole phrases in =
one
> language, serve as discrimitive stimuli for responses "in the other
> language." Such a person would have trouble frequently - I am guessing -
> with extended utterances that contain few standard phrases. The skilled
> translator would do well =A0under many circumstances, but would have a
> difficult time in a courtroom, for example, when an expert witness is
> speaking.
>
Well, well, well - this is essentially the n-gram technique as
represented by Google Translate. We no longer have aquatic sheep. A
hydraulic ram is no longer "Wasserwidder"; it is recognized as an
expression, Oeldruckramme being (probably) the best rendering. However
you need other techniques as well to produce a good translation. GT
takes absolutely no specific cognisance of tense or any other
grammatical construct.

A human translator will in fact take an n-gram and inflect it
according to grammatical rules. Google might be improved if it took
and stored templates as n-grams. Every verb should be stored as an
infinitive and inflected according to grammatical rules.
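Ian's suggestion - store phrase templates as n-grams with the verb as a lemma, then inflect by rule - might look like this toy sketch. The phrase table, the German forms, and the rule table are all invented for illustration:

```python
# Toy phrase-template store: whole n-grams are looked up directly, but a
# template may carry a verb lemma that gets inflected by rule, so every
# inflected variant need not be stored separately.
TEMPLATES = {
    "hydraulic ram": ("Oeldruckramme", None),       # fixed expression
    "pour concrete": ("Beton {verb}", "giessen"),   # template + verb lemma
}
INFLECT = {                                         # invented rule table
    ("giessen", "infinitive"): "giessen",
    ("giessen", "3sg"): "giesst",
}

def translate(phrase, form="infinitive"):
    """Look up a stored n-gram; if it carries a lemma, inflect it by
    rule rather than storing each inflected phrase variant."""
    entry = TEMPLATES.get(phrase)
    if entry is None:
        return None
    text, lemma = entry
    if lemma is None:
        return text
    return text.format(verb=INFLECT[(lemma, form)])

print(translate("hydraulic ram"))        # → Oeldruckramme
print(translate("pour concrete", "3sg")) # → Beton giesst
```

The point of the design is the split: the phrase table grows with the corpus, while the inflection table stays the size of the grammar.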


  - Ian Parker
0
Ian
5/25/2009 10:01:40 AM
I think you have to try to reach a rigorous definition. When I talk
about "fighting Israel" I am, of course, making a subjective
assessment as well. I seem to recall a conversation on concrete.

http://www35.wolframalpha.com/input/?i=concrete

I must say that I had a drainpipe that came loose and I needed to mix
up a small quantity. The packet told me 5-35C. In England temperatures
are rarely >35C, although in Aswan they are frequently in the 50s. Also
a small quantity can set hotter than a large quantity that generates
internal heat.

The point is both Google and Wolfram Alpha have all this in their
information store. Wolfram is in some respects getting close to
understanding, and to what we would regard as AI.

Google translates without knowledge. Google and W in fact HAVE the
knowledge stored away. Wolfram BTW can give you the weather in Aswan
on any day in history!

We need rigour for a number of reasons. First of all we need to know
how well we are doing. How good is our translation? BLEU gives you an
assessment of this. NIST holds an annual translation competition:
Arabic, Chinese, Urdu -> English. This needs to be judged fairly.
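BLEU, mentioned above, scores a candidate translation by clipped n-gram overlap with reference translations. A stripped-down sketch of its core "modified precision" (the real metric also combines several n-gram orders geometrically and applies a brevity penalty):

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision, the core of BLEU: each candidate
    n-gram counts only up to its count in the reference."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    clipped = sum(min(c, ref[g]) for g, c in cand.items())
    return clipped / max(1, sum(cand.values()))

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(modified_precision(cand, ref, 1))   # 5 of 6 unigrams match → 0.8333...
```

This is why BLEU gives a fair, mechanical judging criterion for the competition: it needs no bilingual judge, only reference translations.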

In Mathematics the search for rigour often gives us insights. You are
talking about learning and reinforcement. If we can quantify this we
will be able to develop algorithms. Certainly any idea of
reinforcement leads to a different distribution of "E" as in exp(-E/
kT) throughout phase space.
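The exp(-E/kT) weighting above is a Boltzmann distribution, and in reinforcement learning the same form shows up as softmax action selection, with T acting as the exploration knob. A generic sketch (the energy values are arbitrary examples, not from the discussion):

```python
import math

def boltzmann_probs(energies, T=1.0):
    """Boltzmann weights exp(-E/T), normalized to a distribution.
    As T -> 0 the low-energy ('good') states take all the probability;
    high T flattens the distribution toward uniform (exploration)."""
    weights = [math.exp(-e / T) for e in energies]
    z = sum(weights)                       # partition function
    return [w / z for w in weights]

probs_cold = boltzmann_probs([0.0, 1.0, 2.0], T=0.1)
probs_hot = boltzmann_probs([0.0, 1.0, 2.0], T=100.0)
# At low T nearly all probability sits in the lowest-energy well;
# at high T the distribution is almost uniform over phase space.
```

This makes the earlier claim concrete: reinforcement changes the distribution of E over states, and T controls how sharply the agent concentrates on the wells.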

In reinforcement we have a number of convergence criteria. We can
examine how the phase space has changed after a number of attempts
resulting in convergence. We need to know what the important
parameters of phase space are.

Let us take an example. A great many robots use stepping motors. These
in effect advance one step at a time. The robot is constrained to move
slowly as it has to stop in one click. Enter a robot with phase space.
This robot can move rapidly. It does however have a starting point and
a finishing point. The algorithm is to get to the finishing point with
minimum cost in terms of a balance of energy and time. There will be
regions of phase space where the robot is going to overshoot (whatever
inputs are put in). We define E to be a region we do not want to be
in.

In a similar way bipedal balance may be defined in terms of phase
space. An action like running or walking is effectively cyclic motion
in phase space. We can define this and iterate to it.

We are discussing primarily language here rather than robotics.
Language fits in well with Kolmogorov. How does robotics fit in? It
does, but not quite so obviously. It seems amazing to me that we have
not made better advances in robotics than we have. I think we need to
look at phase space.


  - Ian Parker
0
Ian
5/25/2009 10:43:57 AM
Wolf K wrote:
[snip]
> As to whether and how a language controls one's thinking: that depends 
> on what you mean by thinking. If you mean "talking to oneself", then 
> your perception will be controlled the same way as when speaking/ 
> listening. If you mean something else, well, define it, and some sort of 
> answer will emerge more or less murkily from your definition. The fact 
> of "selective inattention" is IMO relevant here. Just was we tend not 
> notice the new stop sign on a daily route because we're looking for 
> something else (such as cross traffic), we tend not to notice what our 
> speech doesn't refer to.

O.k. If a Russian in everyday life uses the present tense and thus
makes no distinction between present and present continuous, it seems
that he is also not consciously aware of the existence of that
distinction. In this sense he could be considered to be thinking
differently than an Englishman, I suppose.

> By tense I mean an inflection of a verb, not the more general notion of 
> "way of denoting time past, present and future." Defined thus, English 
> has only two tenses: "unmarked" and "marked." 

Your definition differs from what I learned in school (if I remember
right). According to Collins: tense is "a category of the verb or
verbal inflections, such as present, past, and future ....".

M. K. Shen
0
Mok
5/26/2009 3:49:45 PM
Mok-Kong Shen wrote:
> Wolf K wrote:
> [snip]
>> As to whether and how a language controls one's thinking: that depends 
>> on what you mean by thinking. If you mean "talking to oneself", then 
>> your perception will be controlled the same way as when speaking/ 
>> listening. If you mean something else, well, define it, and some sort 
>> of answer will emerge more or less murkily from your definition. The 
>> fact of "selective inattention" is IMO relevant here. Just as we tend 
>> not to notice the new stop sign on a daily route because we're looking 
>> for something else (such as cross traffic), we tend not to notice what 
>> our speech doesn't refer to.
> 
> O.k. If a Russian in everyday life uses the present tense and thus
> makes no distinction between present and present continuous, it seems
> that he is also not consciously aware of the existence of that
> distinction. In this sense, I suppose, he could be considered to be
> thinking differently from an Englishman.

I suspect that a Russian is very aware of the actual present vs the 
indefinite present. I don't speak Russian, but I do speak German, which 
also lacks the "continuous present", yet I am always aware of whether 
"Er arbeitet in der Fabrik" means "He is working at the factory" or "He 
works at the factory." Context makes it clear.

>> By tense I mean an inflection of a verb, not the more general notion 
>> of "way of denoting time past, present and future." Defined thus, 
>> English has only two tenses: "unmarked" and "marked." 
> 
> Your definition differs from what I learned in school (if I remember
> rightly). According to Collins, tense is "a category of the verb or
> verbal inflections, such as present, past, and future ....".

Yes, that's the definition I learned, too, and it worked well until I 
had to teach it. Then I realised that it mixes a semantic notion (time 
past, present, and future) with morphology and syntax (verb forms and 
verb phrases that are used to express time relationships.) It's also 
unclear: in English (and AFAIK in all languages), what's expressed is 
time relationships, not time as such. That's why we can refer to an 
event A as being in the past in relation to some future event B, yet A 
is in the future in relation to the time of speaking ("now"). IOW, it's 
really much more subtle (and elegant) than traditional grammar describes.

BTW, tense == verb inflection is one of several standard linguistic 
definitions of the term. (Linguists argue about such concepts even more 
than grammarians do. ;-) ) It's sad but true that grammar as taught in 
English schools (including ESL schools) is a mish-mash of semantic, 
morphological, and syntactic definitions, with no clear distinctions 
among them, often smushed together, overlaid and confused by usage 
rules, and uninformed by linguistic insights. Not that it's all that 
much better for other languages, if I recall correctly the German 
grammar taught in lower and middle school in Austria.

cheers,

wolf k.

0
Wolf
5/26/2009 7:12:01 PM
Mok-Kong Shen wrote:

> BTW, I find it difficult to imagine that a "pure" statistical
> method could determine the parts of speech of an "entirely"
> unknown language. [snip]

In fact Rao et al. have, using statistical techniques, only shown
that it is very plausible that the Indus script is a natural language.
The following is quoted from their article Entropic Evidence for
Linguistic Structure in the Indus Script, Science, Vol.324,
29 May 2009. p.1165:

    We find that the conditional entropy of Indus inscriptions
    closely matches those of linguistic systems and remains far
    from nonlinguistic systems throughout the entire range of
    token set sizes.

    These observations are consistent with previous suggestions,
    made on the basis of the total number of Indus signs, that
    the Indus script may be logo-syllabic.

    Given the prior evidence for syntactic structure in the
    Indus script, our results increase the probability that the
    script represents language, complementing other arguments
    that have been made explicitly or implicitly in favor of
    the linguistic hypothesis.
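The measure Rao et al. use can be illustrated with a small sketch (the
function and toy sequences below are my own illustration, not from the
paper): bigram conditional entropy measures, in bits, how unpredictable
each sign is given the sign before it. Linguistic sign sequences tend to
fall in a middle range between rigidly ordered and near-random systems.

```python
from collections import Counter
from math import log2

def conditional_entropy(tokens):
    """Bigram conditional entropy H(next | current) in bits."""
    pairs = list(zip(tokens, tokens[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (a, b), c in pair_counts.items():
        p_pair = c / n                   # joint probability P(a, b)
        p_b_given_a = c / first_counts[a]  # conditional P(b | a)
        h -= p_pair * log2(p_b_given_a)
    return h

# A rigidly ordered sequence is fully predictable (entropy 0);
# a less regular one has higher conditional entropy.
print(conditional_entropy(list("abababab")))        # -> 0.0
print(conditional_entropy(list("abrakadabraxyz")))  # positive
```

On a real corpus one would compare this curve across token-set sizes
against known scripts and nonlinguistic sign systems, as the article
describes; the toy strings here only show the direction of the effect.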

In an interview
(http://www.hindu.com/2009/04/24/stories/2009042455732000.htm)
Rao said: "For now we want to analyse the structure and syntax
of the script and infer its grammatical rules. Someday we could
leverage this information to get to a decipherment." It would
indeed be interesting to see how this task could be achieved
(unless something equivalent to the Rosetta stone turns up).

M. K. Shen
0
Mok
6/27/2009 7:31:56 PM
On Jun 27, 12:31 pm, Mok-Kong Shen <mok-kong.s...@t-online.de> wrote:
> Mok-Kong Shen wrote:
> > BTW, I find it difficult to imagine that a "pure" statistical
> > method could determine the parts of speech of an "entirely"
> > unknown language. [snip]
>
> In fact Rao et al. have, using statistical techniques, only shown
> that it is very plausible that the Indus script is a natural language.

In fact Rao et al. screwed up pretty seriously and didn't show
anything at all.

See here for a summary of the debunking by Mark Liberman, Richard
Sproat, Fernando Pereira and others.



0
Ted
6/28/2009 7:09:46 PM
Reply: