interpreters & semantics

Q1.
"Programs called interpreters provide the most direct, executable expression
of program semantics." EoPL Preface xi.

Does this mean, if you *read the source code* for the interpreter, then you
can know the program semantics? So if I wanted to know the semantics for
Scheme, I could read the R5RS, but even better, I could read the source code
for MzScheme?

Q2.
"Most of these essentials relate to the semantics, or meaning, of program
elements. Such meanings reflect how program elements are interpreted as the
program executes." EoPL Preface xi.

The second sentence seems backwards to me. Does semantics reflect how
elements are interpreted, or does how the elements are interpreted reflect
the semantics?


5/27/2004 6:27:43 AM

In article <zTftc.75275$hH.1367445@bgtnsc04-news.ops.worldnet.att.net>,
Marlene Miller <marlenemiller@worldnet.att.net> wrote:
> Q1.
> "Programs called interpreters provide the most direct, executable expression
> of program semantics." EoPL Preface xi.
> 
> Does this mean, if you *read the source code* for the interpreter, then you
> can know the program semantics?

Yes, _if_ you already know the semantics of the language that the
interpreter has been written in. If it is a metacircular interpreter
(i.e. one written in the language it is interpreting), then you have
to resort to other means for learning the language's semantics, e.g.
reading an informal English description, or some formal operational or
denotational rules, or maybe just playing around with an
implementation and learning, by experimenting, how the language works.
This latter kind of inductive learning, of course, is always prone to
mistakes.
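
For instance, here is a minimal sketch of one clause of such an
interpreter (my-eval and the list representation of programs are
arbitrary choices of mine, just to show the shape of the problem):

  ;; A metacircular fragment: the interpreted `if' inherits its
  ;; meaning wholesale from the host Scheme's `if', so reading this
  ;; clause tells you nothing about `if' that you did not already
  ;; know about the host's `if'.
  (define (my-eval exp env)
    (cond ((number? exp) exp)
          ((and (pair? exp) (eq? (car exp) 'if))
           (if (my-eval (cadr exp) env)        ; predicate
               (my-eval (caddr exp) env)       ; consequent
               (my-eval (cadddr exp) env)))    ; alternative
          (else (error "unhandled expression" exp))))

  (my-eval '(if 1 2 3) '())  ; => 2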

> Q2.
> "Most of these essentials relate to the semantics, or meaning, of program
> elements. Such meanings reflect how program elements are interpreted as the
> program executes." EoPL Preface xi.
> 
> The second sentence seems backwards to me. Does semantics reflect how
> elements are interpreted, or does how the elements are interpreted reflect
> the semantics?

That depends. If the language was properly designed, then it probably
has an a priori semantics, and an implementation's job is simply to
execute programs according to the semantics. On the other hand,
sometimes a language has been implemented without exact
specifications, and then a semantics can be designed afterwards to
formally express what the interpreter does. This is often a pretty
thankless task, though, since undesigned languages tend to be full of
horribly complicated kludges.


Lauri Alanko
la@iki.fi
la
5/27/2004 7:53:11 AM
Marlene Miller wrote:

> Q2.
> "Most of these essentials relate to the semantics, or meaning, of program
> elements. Such meanings reflect how program elements are interpreted as the
> program executes." EoPL Preface xi.
> 
> The second sentence seems backwards to me. Does semantics reflect how
> elements are interpreted, or does how the elements are interpreted reflect
> the semantics?

One can, in principle, define a mathematical function that assigns a 
mathematical object to each phrase of a language. That's the meaning of the phrase.

Say your language looks like this:

  exp = 0 | 1 | -1 | ... | (+ exp exp)

Then you could decide to say that the meaning of an exp is an Integer (Z), that 
the phrase "0" denotes the number "0", etc, and that the phrase "(+ exp1 exp2)" 
denotes the addition of the meaning of exp1 and the meaning of exp2.

Now write this as an interpreter and use the semantics to guide you.
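
For instance, a minimal sketch in Scheme, taking the liberty of
representing exps as Scheme numbers and lists:

  ;; meaning : exp -> Integer
  ;; The semantics above, transcribed clause by clause.
  (define (meaning exp)
    (cond ((number? exp) exp)               ; "0" denotes 0, etc.
          ((and (pair? exp) (eq? (car exp) '+))
           (+ (meaning (cadr exp))          ; the meaning of exp1, added
              (meaning (caddr exp))))       ; to the meaning of exp2
          (else (error "not an exp" exp))))

  (meaning '(+ 1 (+ -1 0)))  ; => 0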

-- Matthias

5/27/2004 12:29:10 PM
In article <zTftc.75275$hH.1367445@bgtnsc04-news.ops.worldnet.att.net>,
 "Marlene Miller" <marlenemiller@worldnet.att.net> wrote:

> Q1.
> "Programs called interpreters provide the most direct, executable expression
> of program semantics." EoPL Preface xi.
> 
> Does this mean, if you *read the source code* for the interpreter, then you
> can know the program semantics?

It depends on what you mean by "know".

If you read a piece of sheet music, do you know what the music sounds 
like?  The situation is exactly analogous.  If you have the right 
training and the music is simple enough then you can "hear" the music by 
reading the notes.  If you know the semantics of the language the 
interpreter is written in and the interpreter is simple enough then you 
can "know" the semantics of the language by reading the interpreter.  
But if you don't or it isn't then you can't.

E.
gNOSPAMat1
5/27/2004 3:52:13 PM
Matthias Felleisen responded to Marlene Miller:

> > "Most of these essentials relate to the semantics, or meaning, of
> > program elements. Such meanings reflect how program elements are
> > interpreted as the program executes." EoPL Preface xi.

> One can, in principle, define a mathematical function that assigns a
> mathematical object to each phrase of a language. That's the meaning
> of the phrase.

Matthias, that's Schmidt's definition of Denotational Semantics (DS):

David Schmidt's book "DS: a methodology for language development"
states on p 3:

   The DS method maps a program directly to its meaning, called its
   denotation.  The denotation is usually a mathematical value, such
   as a number or a function.  No interpreters are used, a valuation
   function maps programs directly to its meaning.

I think that's a great definition of DS, and it includes your LC_v
Standard Reduction function eval_s, defined on p. 51 in
<http://www.ccs.neu.edu/course/com3357/mono.ps> 
your paper with Matthew Flatt 
Programming Languages and Lambda Calculi.

Unfortunately it seems to me that DS has been redefined to mean
specific ways of constructing the valuation function via domains
(i.e. Scott models of LC) and structural induction, the techniques
Schmidt describes in his book.
richter1974
6/5/2004 4:31:27 AM
Thank you Bill for clarifying this issue.

If that's what the *preface* is about, clearly the rest of EOPL is way out
of scope for me.

"Bill Richter" <richter@math.northwestern.edu> wrote in message
news:57189ce0.0406042031.258bebea@posting.google.com...
> Matthias Felleisen responded to Marlene Miller:
>
> > > "Most of these essentials relate to the semantics, or meaning, of
> > > program elements. Such meanings reflect how program elements are
> > > interpreted as the program executes." EoPL Preface xi.
>
> > One can, in principle, define a mathematical function that assigns a
> > mathematical object to each phrase of a language. That's the meaning
> > of the phrase.
>
> Matthias, that's Schmidt's definition of Denotational Semantics (DS):
>
> David Schmidt's book "DS: a methodology for language development"
> states on p 3:
>
>    The DS method maps a program directly to its meaning, called its
>    denotation.  The denotation is usually a mathematical value, such
>    as a number or a function.  No interpreters are used, a valuation
>    function maps programs directly to its meaning.
>
> I think that's a great definition of DS, and it includes your LC_v
> Standard Reduction function eval_s, defined on p. 51 in
> <http://www.ccs.neu.edu/course/com3357/mono.ps>
> your paper with Matthew Flatt
> Programming Languages and Lambda Calculi.
>
> Unfortunately it seems to me that DS has been redefined to mean
> specific ways of constructing the valuation function via domains
> (i.e. Scott models of LC) and structural induction, the techniques
> Schmidt describes in his book.


6/5/2004 4:39:32 PM
"Marlene Miller" <marlenemiller@worldnet.att.net> responded to me:

> Thank you Bill for clarifying this issue.
> 
> If that's what the *preface* is about, clearly the rest of EOPL is
> way out of scope for me.

 Marlene, I didn't give you good advice.  I took something Matthias
Felleisen (MFe) wrote & ran off in a different direction. I apologize.

Really, I'm just excited that MFe is now posting regularly to c.l.s.
I think he has a lot of good leadership to offer.  We had a disastrous
thread last year about the R5RS DS, and there were 2 problems:

1) I made a bunch of dumb errors initially, and by the time I'd been
   straightened out (mostly by MB and WC), folks were fed up, and 

2) there was nobody on c.l.s. with MFe's expertise.

And now I'll try to answer your original question:

 > Does semantics reflect how elements are interpreted, or does how
 > the elements are interpreted reflect the semantics?

I say the first, for EoPL.  The interpreter defines the semantics.
The meaning of a program (or a phrase) is what the interpreter does
to it.  Sometimes this is expressed mathematically.  Let me explain.

Here's the text from the EoPL introduction past your quote:
"Programs called interpreters provide the most direct executable
 expression of the program semantics.  They process a program by
 directly analyzing an abstract representation of the program text.
 We therefore choose interpreters as our primary vehicle for
 expressing the semantics of programming language elements."

So it looks to me like EoPL is a book about interpreters, such as
DrScheme or Gambit. So much of the book should be accessible.

I think this is what's called Operational Semantics (OpS), and it's
got a different flavor than the DS I scared you off with.  In fact,
the very previous paragraph in Schmidt's DS book is:

   The OpS method uses an interpreter to define a language.  The
   meaning of a program in the language is the evaluation history that
   the interpreter produces when it interprets the program.  The
   evaluation history is a sequence of internal interpreter
   configurations.

So semantics = meaning which is expressed mathematically, but in OpS,
it's expressed by the interpreter.  The "meaning" of your program is
what the interpreter is gonna do to it.  That's OK, right?

Or as MFe said, the interpreter "assigns a mathematical object to each
phrase of a language".  In the OpS/interpreter world, that means take
a phrase in your language, and evaluate in a given environment, with
various things stored in memory (and a continuation if you like), and
ask what the interpreter is going to print, or how the memory
locations will change, etc.  That's not too mathematical, right?  

BTW I enjoyed MFe's ambiguous, "That's the meaning of the phrase."

However, on the next page, EoPL says:
"Frequently our interpreters a very high level view that expresses
 language semantics in a very concise way, not far from that of formal
 mathematical semantics."  
Maybe you'd have trouble there, and that might even involve some DS.
 (I haven't read EoPL myself, but the Preface is on their web page.)

But EoPL is hard, and old, and maybe there are better books for you.
What do you want to learn?  (I won't be able to help, but others can.)
richter1974
6/6/2004 3:41:49 AM
Thank you Bill. I hope MFe sees your question to him.

(I don't mind math. That was my subject in graduate school.)

I am beginning to suspect Essentials of Programming Languages is Essential
for people who research and design languages, not for people who use
languages to build things.


6/6/2004 6:14:48 AM
"Marlene Miller" <marlenemiller@worldnet.att.net> responded to me:

> (I don't mind math. That was my subject in graduate school.)

Then let me be more specific, Marlene, and correct an error of my last
post.  Schmidt defines DS to be the study of semantic functions

curly-E : Expression-of-our-Language  ---->  Some set

where the 2 sets and the function are mathematically defined.  You
know what this means!  And you know that curly-E can't be a computable
function by the Halting problem (i.e. Goedel Incompleteness).
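
The standard example of an expression to which no ordinary value can
be assigned is Omega:

  ;; Evaluating this application demands the value of the very same
  ;; application, so evaluation never halts, and curly-E must assign
  ;; the expression bottom rather than an ordinary value.
  ((lambda (x) (x x)) (lambda (x) (x x)))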

So any kind of semantics becomes DS once you fully mathematize it.  In
that sense, the OpS (as studied I think in EoPL) is also DS.  

So OpS must mean the subset of DS where you concentrate on
interpreters, and not their full mathematization.  A good example of
this is the R5RS DS, which as MB pointed out is usually understood by
schemers as functional programming.  It's a good exercise (worked out
on Anton vS's web page) to code up the R5RS DS as a functional Scheme
program.  That would be called OpS.  Now to mathematize even
functional programming requires hard Lambda calculus.  But it's still
the "shallow end" of the DS pool, as the real mathematization of the
R5RS DS uses domains: Scott models of the Lambda calculus, which
involves non-Hausdorff Cantor sets.  That's much much harder Math!
And it's this "deep end" of the DS pool which is normally called DS.

> I am beginning to suspect Essentials of Programming Languages is
> Essential for people who research and design languages, not for
> people who use languages to build things.

Yeah, maybe!  What do you want to build?  I suspect EoPL is a book
about how to build interpreters.  From the end of Abelson's preface:

"You'll come to see yourself as a designer of languages, rather than
 only a user of languages, as a person who chooses the rules by which
 languages are put together, rather than only a follower of rules that
 other people have chosen."
richter1974
6/7/2004 2:57:20 AM
Thank you for the intro to DS and OpS. Thank you for your time - the time it
takes to write such an explanation.

I hope you get an answer soon to your original question.


6/7/2004 5:29:05 AM
"Marlene Miller" <marlenemiller@worldnet.att.net> writes:

> I am beginning to suspect Essentials of Programming Languages is Essential
> for people who research and design languages, not for people who use
> languages to build things.

In my experience, people who research and design languages are *much*
better at using them to build things.  A huge part of solving a
complex problem is coming up with the language necessary to describe
the problem.  If you know a bit about designing languages, you'll be
much further ahead than someone that just knows `how to program'.

As a math major you surely are aware of the power of a good formal
notation.  (And as a computer scientist I am *very* frustrated by the
amazingly poor formal notations invented by mathematicians who don't
program!)  You don't need to steep yourself in ``non-Hausdorff Cantor
sets'' (what?) and Scott's domains to understand programming
semantics.  Programming usually involves creating a `mini-language' or
extending an existing language with problem-specific features.
Knowing a bit about semantics will keep you from doing amazingly dumb
things like separating program clauses into `statements' and
`expressions' that cannot be interchanged, making the syntax depend on
runtime values, forgetting about tail recursion, etc.

~jrm

p.s.  Scott was trying to adapt set theory to programming, but since
programs can operate on programs, you need the set P : P->P which is,
unfortunately, empty.  However, if you add some conditions to P (like
restricting it to continuous partial functions) you can make it
satisfy the conditions necessary to apply Tarski's fixed-point theorem
and construct a non-empty set suitable for modeling programming.

Some people are very uncomfortable with mathematics that has not been
proven correct.  Some people are uncomfortable unless they work
through the proof themselves.  I'm pretty sure that Scott and Tarski
got it right, so I haven't bothered to memorize the details.
jrm
6/7/2004 4:25:09 PM
Joe Marshall wrote:

> Knowing a bit about semantics will keep you from doing amazingly dumb
> things like separating program clauses into `statements' and
> `expressions' ...

Gold. Can I use it in a signature?


And Marlene: Just reading your (very good) questions in
this group makes me absolutely sure that you will like
"Essentials of Programming Languages". Get it at the library
if you want to skim it first.

(the reason you get such good answers is that you ask
  the right questions)

-- 
Jens Axel Søgaard
usenet8944
6/7/2004 5:01:25 PM
richter@math.northwestern.edu (Bill Richter) writes:

> I say the first, for EoPL.  The interpreter defines the semantics.
> The meaning of a program (or a phrase) is what the interpreter does
> to it.  Sometimes this is expressed mathematically.  Let me explain.
> 
> [...]
> 
> So it looks to me like EoPL is a book about interpreters [...]

Bill, have you considered actually reading the books you're talking
about before you hold forth about them?  Or do you only read the
prefaces and forewords of books?

(I understand that reading books has the deleterious effect of
actually slowing down your prodigious newsgroup and mail output.)

Shriram
sk1
6/8/2004 1:47:32 AM
Joe Marshall <jrm@ccs.neu.edu> responded to Marlene Miller:

> Programming usually involves creating a `mini-language' or extending
> an existing language with problem-specific features.  Knowing a bit
> about semantics will keep you from doing amazingly dumb things like
> separating program clauses into `statements' and `expressions' that
> cannot be interchanged, making the syntax depend on runtime values,
> forgetting about tail recursion, etc.

Cool, Joe.  Sounds heavy on interpreters.  Where's a place to read?

Perhaps a better book for Marlene would be Shriram Krishnamurthi's
Programming Languages: Application and Interpretation
listed right after EoPL:
<http://www.schemers.org/Documents/#all-texts> 

> p.s.  Scott was trying to adapt set theory to programming, but since
> programs can operate on programs, you need the set P : P->P which
> is, unfortunately, empty.  However, if you add some conditions to P
> (like restricting it to continuous partial functions)

You mean P = (P->P), right?  The function space of continuous maps
from P to itself is bijective with P.  And so you need a topology on P
to talk about continuous maps from P to P.

> You don't need to steep yourself in ``non-Hausdorff Cantor sets''
> (what?) and Scott's domains to understand programming semantics.

I think so.  As you say, you just need to know that P = (P->P), you
don't have to know why.  But (responding to your (what?)) here's why
Scott's domain P = P(omega) is a ``non-Hausdorff Cantor set'':

P(omega) is the power set of the natural numbers N, i.e. the set 
{ S subset N } = (N -> boolean).
See, a function f : N ---> boolean defines a subset 
S = f^{-1}(true) subset N

The usual topology on (N -> boolean) has "basic" open sets the
collection of subsets of (N -> boolean) of the form

O(A,B) = { S subset N : A subset S, B disjoint from S }  

where A and B are finite subsets of N.

With this topology, (N -> boolean) is the well known Cantor set, which
you should know something about from fractals: the "dust" Julia sets.

Scott's P(omega) is (N -> boolean) with a different and non-Hausdorff
topology. The basic open sets of P(omega) are

O(A) = { S subset N : A subset S }

for finite subsets A subset N.  We did this a year ago on c.l.s.  If
you don't know what Hausdorff means, let's just say this: the real
line R is Hausdorff, and non-Hausdorff is really strange.  I think
Scott was a genius to come up with his P(omega).
richter1974
6/8/2004 3:35:20 AM
Thank you Joe for your (always) helpful insights and explanations.

> A huge part of solving a
> complex problem is coming up with the language necessary to describe
> the problem.

> Programming usually involves creating a `mini-language' or
> extending an existing language with problem-specific features.

I've never thought about programming in this way. Abelson talks about
metalinguistic abstraction, which puzzled me. Is this the same as or related
to what you are saying?:

----
"We control complexity by establishing new languages for describing design,
each of which emphasizes particular aspects of the design and deemphasizes
others." SICP

"a cluster of languages, where the pieces could be flexibly combined"
preface to EOPL

"To appreciate this point [the evaluator is ust another program] is to
change our images of ourselves as programmers. We come to see ourselves as
designers of languages, rather than only users of languages designed by
others." SICP

"Perhaps the whole distinction between programming and programming language
is a misleading idea, and future programmers will see themselves not as
writing programs in particular, but as creating new languages for each new
application." preface to EOPL
----

I thought of a way to explain my concern. A plumber doesn't need to read
Plato to be good at plumbing. He might enjoy reading Plato. So he reads
Plato in his spare time on Sundays. If Plato is good for the plumber, why
don't we see all plumbers reading Plato?


6/8/2004 6:54:08 AM
"Jens Axel S�gaard" <usenet@soegaard.net> wrote>
> And Marlene: Just by reading your (very good) questions in
> this group makes me absolutely sure, that you will like
> "Essentials of Programming Languages". Get it at the library
> first, if you want to skim it first.
>
> (the reason you get so good answers, is because you ask
>   the right ones)
>
> -- 
> Jens Axel Søgaard

Thank you very much, Jens Axel, for your encouragement and advice.

I own the book. It looks fun to read. I like the idea of learning by
implementing ideas in code. It's so tedious having to read English prose and
map ambiguous words and metaphors to technical ideas. (I like the R5RS.) I
am trying to decide whether I am "allowed" to move this book from the Fun
queue to the queue with Tanenbaum's Computer Networks, Lea's Concurrent
Programming in Java, etc.


6/8/2004 7:12:22 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to Marlene Miller:
>
>> Programming usually involves creating a `mini-language' or extending
>> an existing language with problem-specific features.  Knowing a bit
>> about semantics will keep you from doing amazingly dumb things like
>> separating program clauses into `statements' and `expressions' that
>> cannot be interchanged, making the syntax depend on runtime values,
>> forgetting about tail recursion, etc.
>
> Cool, Joe.  Sounds heavy on interpreters.  Where's a place to read?

I haven't seen a `how to design a language' book.  The Art of the
Interpreter <http://c2.com/cgi/wiki?TheArtOfTheInterpreter> is a good
paper to look at.  There is a lot of good stuff at
<http://library.readscheme.org/> As for my `dumb things' list, those
are things I've encountered in various languages.

There are many languages that draw a distinction between `statements'
and `expressions'.  C is one.  There are statements like while, do,
if, return, for, break, etc. and there are expressions like 
(a + b)*c.  Semantically, there are two kinds of continuation, one for
statements and one for expressions.  A statement continuation discards
its argument, but an expression continuation does not.  So you cannot
use a statement where you have an expression because a statement will
not supply a return value to the continuation.
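
A sketch of that distinction in Scheme, with the two continuations
reified as ordinary procedures:

  ;; An expression continuation consumes the value just computed:
  ((lambda (v) (* v 2)) (+ 1 2))               ; => 6

  ;; A statement continuation discards it:
  ((lambda (ignored) 'on-to-the-next-statement)
   (display "side effect"))                    ; => on-to-the-next-statement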

Unfortunately, most of the control-flow constructs in C are
statements.  There are constructs in C for expression sequences and
conditional expressions, but they have a completely different syntax
from statement sequences and conditional statements.  The C statement

  if (x > 3) {
    x += 2;
    y -= 3;
    return TRUE;
    }
  else {
    x -= 2;
    y += 7;
    return FALSE;
    }

is essentially equivalent to the C expression
  (x > 3) 
    ? (x += 2, 
       y -= 3,
       TRUE)
    : (x -= 2,
       y += 7,
       FALSE)

but you clearly cannot just substitute one for the other!  Yet you may
wish to do exactly that sort of thing if you are refactoring code.

If your syntax depends on runtime values, then you cannot effectively
compile your program.  In REBOL, for example, the expression 
[foo bar baz] could mean (begin (foo) (bar) (baz)) or it could mean
(foo (bar (baz))) or it could mean (begin (foo bar) (baz)) or any of
about 20 other parses.  What it means depends on the current values of
foo, bar, and baz, and that can change over time.

Without tail recursion, you must pepper your language with looping
constructs.  Users cannot create their own, so you must supply a wide
variety.  But loops can only express primitive recursion, so there
will be some things that are extraordinarily painful to compute this
way.  Users will also not be able to resort to
continuation-passing-style if they need complex control flow.
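
For instance, with proper tail calls a Scheme user rolls a loop out
of nothing but procedure calls; a sketch:

  ;; `loop' calls itself in tail position, so this iterates in
  ;; constant space for any n -- no built-in looping construct needed.
  (define (count-down n)
    (let loop ((i n))
      (if (zero? i)
          'done
          (loop (- i 1)))))

  (count-down 1000000)  ; => done, without growing the stack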

> Perhaps a better book for Marlene would be Shriram Krishnamurthi's
> Programming Languages: Application and Interpretation
> listed right after EoPL:
> <http://www.schemers.org/Documents/#all-texts> 
>
>> p.s.  Scott was trying to adapt set theory to programming, but since
>> programs can operate on programs, you need the set P : P->P which
>> is, unfortunately, empty.  However, if you add some conditions to P
>> (like restricting it to continuous partial functions)
>
> You mean P = (P->P), right?  The function space of continuous maps
> from P to itself is bijective with P.  And so you need a topology on P
> to talk about continuous maps from P to P.

Yes.

>> You don't need to steep yourself in ``non-Hausdorff Cantor sets''
>> (what?) and Scott's domains to understand programming semantics.
>
> I think so.  As you say, you just need to know that P = (P->P), you
> don't have to know why.  

Right.  Scott proved that the domain is well-founded and his word is
good enough for me.

jrm
6/8/2004 5:51:57 PM
"Marlene Miller" <marlenemiller@worldnet.att.net> writes:

> Thank you Joe for your (always) helpful insights and explanations.
>
>> A huge part of solving a
>> complex problem is coming up with the language necessary to describe
>> the problem.
>
>> Programming usually involves creating a `mini-language' or
>> extending an existing language with problem-specific features.
>
> I've never thought about programming in this way. Abelson talks about
> metalinguistic abstraction, which puzzled me. Is this the same as or related
> to what you are saying?:

It's related.  You can modify an existing language to accept a few new
constructs or you can go whole hog and write a brand new language
tailor-made for your problem.  One advantage to Lisp and Scheme is
that you can do something inbetween these two extremes.  Any new
language is going to need variables, definitions, primitive data,
etc.  You'll probably want strings and numbers for interacting with
the rest of the world.  You'll need to manage memory.  Start with
Scheme or Lisp and you get all that for free.
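
As a tiny sketch of the `few new constructs' end of that spectrum,
here is one way to graft a while loop onto Scheme with a macro:

  ;; A new control construct in a handful of lines; users of the
  ;; extended language write (while test body ...) as if built in.
  (define-syntax while
    (syntax-rules ()
      ((while test body ...)
       (let loop ()
         (if test
             (begin body ... (loop)))))))

  (define i 0)
  (while (< i 3)
    (display i)
    (set! i (+ i 1)))  ; prints 012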

> I thought of a way to explain my concern. A plumber doesn't need to read
> Plato to be good at plumbing. He might enjoy reading Plato. So he reads
> Plato in his spare time on Sundays. If Plato is good for the plumber, why
> don't we see all plumbers reading Plato?

If your only aspiration were to be a plumber, then Plato may not have
a direct impact on your life.  But I'm not sure this analogy is quite
the right one.

You don't *have* to understand programming semantics to be a good
programmer, but people that do understand semantics tend to be far
better programmers than people who do not.  Furthermore, given the
absolutely horrible state of software in the world, it seems that the
bulk of people writing software are not good programmers.

So instead of Plato, what about fluid dynamics?  A plumber doesn't
need to understand fluid dynamics to solder pipes together, but
if plumbing were like software, we'd be ankle deep in water.  A
plumber with an understanding of fluid dynamics would get more work
and be able to relax on Sundays in a dry house.
jrm
6/8/2004 6:11:47 PM
Shriram Krishnamurthi <sk@cs.brown.edu> responds to me:

> > So it looks to me like EoPL is a book about interpreters [...]
> 
> Bill, have you considered actually reading the books you're talking
> about before you hold forth about them?  Or do you only read the
> prefaces and forewords of books?
>
> (I understand that reading books has the deleterious effect of
> actually slowing down your prodigious newsgroup and mail output.)

:D Shriram, I don't think EoPL is on-line, unlike your book, and your
expertise here greatly exceeds mine.  So please bail me out:

Would it seem that Marlene ought to read your book on schemers.org
instead of EoPL?  Could you compare the goals of the 2 books?

My expertise here just has to do with MFe's response on 2004-05-27:

   > Does semantics reflect how elements are interpreted, or does how
   > the elements are interpreted reflect the semantics?

   One can, in principle, define a mathematical function that assigns
   a mathematical object to each phrase of a language. That's the
   meaning of the phrase.

That's DS.  The point is that any semantics becomes DS if you fully
mathematize it.  I've had a lot of fun thinking about LC and R5RS DS.

But as to how to write interpreters, why learning about interpreters
or semantics makes one a better programmer: I'm a rank beginner.
richter1974
6/9/2004 2:17:50 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Would it seem that Marlene ought to read your book on schemers.org
> instead of EoPL?  Could you compare the goals of the 2 books?

I like EoPL; I loved the first edition (which I bet you could get real
cheap used) even more.  I cut my teeth on it, and it's what convinced
me to take up programming languages.  So I couldn't possibly tell
someone to read my book instead of EoPL.  Nor do I want to flog my
book here.

Given what Marlene has told us of her background, EoPL may indeed be a
better book than mine for her.

My book grew out of a great frustration from teaching from EoPL.
There are several things I don't like about it as a professor.  I was
once asked by a correspondent (Ehud Lamm, I think) about the pedagogic
philosophy behind the text and for whom it was written.  I wrote:

My long-term vision for teaching programming languages is to integrate
the "two cultures" that have evolved in its pedagogy.  I was raised in
the interpreter (EoPL) culture, which meant I looked with some disdain
at the "survey of languages" courses.  After a while I realized that
otherwise intelligent people used the survey approach, so I spent some
time trying to understand what they got out of it.

I still think that not doing interpreters (broadly construed) is a
mistake, and students who go through the experience of doing it come
out with a much richer perspective.  But what I have realized is that
students who don't do the survey also lose something valuable.

Since a course needs one dominant philosophy, I decided to make the
interpreters dominant but use the survey to inform the interpreters.
So students program with a new set of features first (survey), then
try to distill those principles into an actual interpreter.  This has
the following benefits:

- by seeing the feature in the context of a real language, they can
  build something interesting with it first, so they understand that
  it isn't an entirely theoretical construct, and will actually *care*
  to build an interpreter for it (in my experience, a few students who
  are interested in knowledge for its own sake will get excited about
  the interpreter in either case, but I want to also capture the
  attention of the other 90%)

- they get at least fleeting exposure to multiple languages, which is
  an important educational attribute that is fast being crushed in
  this era of Java's dominance (and in the process, they come to
  understand why Java will not be the last word in languages)

- because they have already coded with the feature, the explanations
  and discussions are much more interesting than when all they have
  seen is an abstract model

- by first building a mental model for the feature through experience,
  they have a much better chance of actually figuring out how the
  interpreter is supposed to work

In short, many more humans work by induction than by deduction, so a
pedagogy that supports it is much more likely to succeed than one that
suppresses it.  The book currently reflects this design, though the
survey parts are done better (!) in lecture than in the book; that
will change in future versions.

Separate from this vision is a goal.  My goal is to not only teach
students new material, but to also change the way they solve problems;
as Marx wrote, "The philosophers have only interpreted the world in
various ways; the point, however, is to change it."  I want to show
students where languages come from (the language "nebula"), why we
should regard languages as the ultimate form of abstraction, how to
recognize such an evolving abstraction, and how to turn what they
recognize into a language.  The last section of the book, on
domain-specific languages, is a very, very weak step in this
direction.  The homeworks I've done in the class have conveyed this
point much better.  Over time, I will update the text to reflect what
the homeworks have taught.

The book is currently the sole textbook for the programming languages
course at Brown, where it is taken primarily by juniors (3rd year),
seniors (4th year) and beginning graduate (both MS and PhD) students.
It seems very accessible to smart sophomores (2nd year) too, and
indeed those are some of my most successful students.  The book has
been used at some other universities as a primary or secondary text.
The book's material is worth one undergraduate course worth of credit;
for students who want graduate credit, I supplement the material in
the book with some research paper readings.

The book is still very much under development.  

One common criticism I have heard of the text is that the writing is
too colloquial; it sounds too much like me standing in front of a room
and lecturing.  However, I don't intend to change the voice of the
book (tighten it, of course; change it, no).  I've been told this may
make it much harder to publish the book formally.  Either way, I
intend to continue offering a full, free copy on the Web.

Comments welcome.

Shriram
sk1
6/9/2004 1:05:21 PM
Shriram Krishnamurthi wrote:

[...]

> Separate from this vision is a goal.  My goal is to not only teach
> students new material, but to also change the way they solve problems;
> as Marx wrote, "The philosophers have only interpreted the world in
> various ways; the point, however, is to change it."  [...]

The change Marx had in mind was metaphysical (i.e. delusional). See 
"Science, Politics, and Gnosticism," by Eric Voegelin, and "Karl Marx: 
Communist as Religious Eschatologist," in the second volume of "Logic of 
Action" by Murray Rothbard.

Sorry for the off-topic distraction, but I'm not one to pass up a chance 
to dis Marx.

-thant
thant
6/9/2004 3:58:15 PM
There was more I should have written about the comparison between my
book and EoPL.  Warning: here I *will* be dissing EoPL a bit.

I think EoPL does a poor job on some crucial topics:

- type systems
- garbage collection
- domain-specific languages

The material on types is so caught up in mechanics (especially of type
inference) that I think it fails to provide very much insight into
types.  There is little or no discussion of soundness, safety, etc.  This
is a pity coming from authors who are masters of the topic, but it is
in general consistent with EoPL's "look ma"-ness.

Garbage collection isn't discussed at all in any meaningful way.  I
find that students are generally woefully uninformed about garbage
collection, having lots of wrong ideas in their heads.  The
programming languages course at most universities is the only chance
to rectify some of these misconceptions (especially since many
colleagues on faculty actively cause the misconceptions).  This is one
place where I have found having them implement collectors, very much
in the spirit of EoPL, is actually really helpful; it takes a lot of
the mystery out of GC.  I think we also have a responsibility to
discuss some of the systems aspects of GC, especially provide a
meaningful comparison to manual memory management.

While the entire EoPL philosophy is built around "build your own
language" (as Abelson's preface also points out), EoPL doesn't reflect
on this practice.  As such, many students can leave an EoPL course
unsure of what to do with the interpreters, eg, not knowing that
Scheme macros offer a great way to transplant what they've learned to
building their own languages.  [In prehistoric times, EoPL was
actually written entirely through macros.  I'm glad this practice did
not survive, but it's a pity that the book has swung entirely in the
opposite direction.]

All these points can also be read as positive statements about EoPL,
especially if you're a purist.  So these should help you determine
which book is more appropriate for your studies, EoPL or PLAI.
Hopefully you will read both.  But if you're going to insist on
reading only one, follow the advice in my Acknowledgments section --
read EoPL, a true classic.

Shriram
sk1
6/10/2004 2:44:23 AM
Shriram Krishnamurthi <sk@cs.brown.edu> wrote an excellent post about 
his book (listed on schemers.org) and how it compares to EoPL!

   we should regard languages as the ultimate form of abstraction

That's exciting, Shriram!  You must mean some higher-level version of
what you write on p 235 of your book: "Scheme's map and filter are
also abstractions".  So I read HtDP to try to learn about abstraction.
And I like HtDP's sec 21.5 "Designing Abstractions from Templates":
abstracting your design template for lists leads us to the abstract
function `fold' (actually in sec 22.2).  I really see how HtDP
abstraction makes for better programmers (Plato for plumbers), and I
can maybe dimly glimpse how your "ultimate abstraction" helps.

I very much liked your (short) Ch 19 "Semantics".  You write:

   It would be convenient to have some common language for explaining
   interpreters.  We already have one: math!

   We call this *big step operational semantics*.  It's semantics
   because it ascribes meaning to programs. [...] It's operational
   because we aren't compiling the entire program into a mathematical
   object and using fancy math to reduce it to an answer.

Good: semantics is Math.  By Schmidt's definition, you're doing DS as
well. You're defining a semantic (or valuation) function

V: Expressions -> (Environments -> Values)

It's something I tried unsuccessfully to post 2 years ago.  The fact
that you're not using fancy Math (CPOs & Scott models of LC) doesn't
mean it's not DS, by Schmidt's definition.  I have some comments:

1) There's some unstated mathematical induction in your definition of
V.  E.g. your rule on p 173 includes

b, E'[i <- a_v] ==> b_v 

but of course you'll have to apply the rule recursively to do so, and
maybe it will not halt, and in that case (V exp E) = bottom, and that
shows V is not a computable function...  This kind of induction isn't
hard, but it's all you need to define V.  You don't need CPOs...

2) I'm not quite sure you define your mathematical set Values.  But it
sure looks to me like you don't need Scott models of LC.  The subset
Procedure-Values of Values just consists (p 172) of triples
<function-name, function-body text, evaluating environment>

So you don't run into the problem MB explained to me 2 years ago:

> E [the R5RS domain of Values] is "defined" as
> 
>     E = .... some expressions ultimately involving E ...

you only need Scott models (non-Hausdorff Cantor sets) if you define
Procedure-Values to be a set of functions on Values, while Values
contains Procedure-Values as a subset.  You avoid this problem by
giving your (I say SICP-ish) triples definition of Procedure-Values.

3) You only fail to define an R5RS-like semantic function

Expressions -> (Env Store Cont -> Answers)

by not doing continuations (which you discussed at length 2 sections
earlier, so you'll probably do Cont when you expand Ch 19),
and by "conflating Store and Env", to use MB's great phrase.  I
haven't read enough of your book to comment on your decision.

4) So I claim that your "big step operational semantics" is also DS,
by Schmidt's definition.  It seems that folks only say DS if you're
doing CPOs & Scott models, but that's a cultural distinction.
richter1974
6/10/2004 5:49:57 AM
>>>>> "Shriram" == Shriram Krishnamurthi <sk@cs.brown.edu> writes:

Shriram> I think EoPL does a poor job on some crucial topics:

Shriram> - type systems
Shriram> - garbage collection
Shriram> - domain-specific languages

Concurrency?

-- 
Cheers =8-} Mike
Peace, international understanding, and all that blabla
sperber
6/10/2004 7:56:38 AM
Michael Sperber <sperber@informatik.uni-tuebingen.de> writes:

> Shriram> I think EoPL does a poor job on some crucial topics:
> 
> [...]
> 
> Concurrency?

Well, if we start going down this path, there's a lot more that one
could add.  I was trying to limit myself to things that I do discuss
in some detail in my course/book.

I don't cover concurrency due to a peculiarity of Brown's curriculum,
which covers it very well in several other courses (or at least well
enough that I don't feel like expending precious time on the subject).

As an example of something that I *do* cover, I think it's important
to show that types are only the tip of the proof iceberg, and there
are interesting techniques for proving properties of programs; eg, I
discuss model checking.

Shriram
sk1
6/10/2004 11:55:38 AM
richter@math.northwestern.edu (Bill Richter) writes:

> 2) I'm not quite sure you define your mathematical set Values.  But it
> sure looks to me like you don't need Scott models of LC.  The subset
> Procedure-Values of Values just consists (p 172) of triples
> <function-name, function-body text, evaluating environment>
>
> So you don't run into the problem MB explained to me 2 years ago:
>
>> E [the R5RS domain of Values] is "defined" as
>> 
>>     E = .... some expressions ultimately involving E ...
>
> you only need Scott models (non-Hausdorff Cantor sets) if you define
> Procedure-Values to be a set of functions on Values, while Values
> contains Procedure-Values as a subset.  You avoid this problem by
> giving your (I say SICP-ish) triples definition of Procedure-Values.

Right, but this removes first-class functions from your language.
Rather boring.

> 3) You only fail to define an R5RS-like semantic function
>
> Expressions -> (Env Store Cont -> Answers)
>
> by not doing continuations (which you discussed at length 2 sections
> earlier, so you'll probably do Cont when you expand Ch 19),
> and by "conflating Store and Env", to use MB's great phrase.  I
> haven't read enough of your book to comment on your decision.

Remove continuations and you remove control flow.  That's even less
interesting than a language with no functions.
jrm
6/10/2004 3:52:11 PM
Joe Marshall <jrm@ccs.neu.edu> responded to me:
> 
> > it sure looks to me like you don't need Scott models of LC.  The
> > subset Procedure-Values of Values just consists (p 172) of triples
> > <function-name, function-body text, evaluating environment>

Joe, that's identifier, not function-name, so (lambda (x) body)
evaluates in environment E to the Procedure-Value triple <x, body, E>.
That's basically SICP, which you agree has first-class functions.
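
In code, the triple view looks like this (a self-contained sketch;
environments are association lists, and the names are mine):

  (define (my-eval exp env)
    (cond ((number? exp) exp)
          ((symbol? exp) (cdr (assq exp env)))
          ((eq? (car exp) 'lambda)                ; (lambda (x) body)
           (list (caadr exp) (caddr exp) env))    ; => <x, body, E>
          (else                                   ; application
           (apply-closure (my-eval (car exp) env)
                          (my-eval (cadr exp) env)))))

  ;; Apply <id, body, E> by evaluating body in E extended with id.
  (define (apply-closure clo arg)
    (my-eval (cadr clo)
             (cons (cons (car clo) arg) (caddr clo))))

  (my-eval '((lambda (f) (f 42)) (lambda (x) x)) '())  ; => 42

Procedures here are plain data, yet still first-class: no sets of
functions anywhere, hence no Scott models needed.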

> > So you don't run into the problem MB explained to me 2 years ago:
> >
> >> E [the R5RS domain of Values] is "defined" as
> >> 
> >>     E = .... some expressions ultimately involving E ...
> >
> > So you only need Scott models if you define Procedure-Values (a
> > subset of Values) to be a set of functions on Values.
 
> Right, but this removes first-class functions from your language.
> Rather boring.

What? Shriram didn't change his language (approx. Scheme) at all, but
only the definition of the set Values in his semantic function

Expressions -> (Env -> Values)

These aren't my ideas. It's Shriram's book PLAI
<http://www.cs.brown.edu/~sk/Publications/Books/ProgLangs/PDF/all.pdf>
listed on schemers.org right after EoPL.


> > 3) You only fail to define an R5RS-like semantic function
> >
> > Expressions -> (Env Store Cont -> Answers)
> >
> > by not doing continuations, and by "conflating Store and Env", to
> > use MB's great phrase.
 
> Remove continuations and you remove control flow.  That's even less
> interesting than a language with no functions.

Sure, but Part IX Semantics is only 3 pages long, so maybe it's under
construction. Part VI Continuations is 50+ pages long. So I think we
can conclude that Shriram knows how to add continuations to his
semantics without changing everything (adding in Scott models e.g.).
richter1974
6/11/2004 5:07:21 AM
Thank you very much Shriram. Thank you for taking the time to explain. Thank
you for sharing your perspective and insights and lots of interesting ideas.
Thank you Bill for presenting my question to Shriram.

I think I would learn much from reading both books.

Marlene, the plumber


6/11/2004 6:52:53 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>> 
>> > it sure looks to me like you don't need Scott models of LC.  The
>> > subset Procedure-Values of Values just consists (p 172) of triples
>> > <function-name, function-body text, evaluating environment>
>
> Joe, that's identifier, not function-name, so (lambda (x) body)
> evaluates in environment E to the Procedure-Value triple <x, body, E>.
> That's basically SICP, which you agree has first-class functions.

Die, horse, die!

Shriram is using *operational* semantics rather than *denotational*
semantics.  The key difference is this:  Denotational semantics
defines a function that maps programs to what they mean; operational
semantics defines a set of rules that *maintain* the meaning.

As an example, look at function application:

    f, E => <i, b, E0>   a, E => av    b, E0[i <- av] => bv
   -----------------------------------------------------------
                     {f a}, E => bv


This says that *provided that* f, E reduces to the triple <i, b, E0>
*and* a, E reduces to av, *and* b, E0[i <- av] reduces to bv, *then*
{f a}, E reduces to bv.

There are a couple of unanswered questions:

   1.  Do there exist f, E, i, b, E0, a, av, and bv that can
       satisfy the antecedents?  (In particular, it would be nice to
       know if there exist non-empty sets of values for av and bv.)

   2.  Do the sets of values av and bv correspond to the kinds of
       values we want to manipulate?

   3.  Does this reduction rule correspond to our intuitive notion of
       function application?  In particular, if I want to model the
       `add three' function, does there exist an f (a program) that
       will do that?

There are three approaches to answering these questions:

   1.  Assume a priori that Shriram's semantics for reducing an
       application correspond to the common notion of function
       application.

   2.  *Prove* (or disprove) that Shriram's semantics do indeed mean
       function application.

   3.  Treat the operational semantics as the rules for a meaningless
       game.


If you don't wish to simply assume that the semantics work, then you
either need to prove they do or take the position that `function
application' as defined in the language is simply a complex syntactic
operation that may or may not have anything to do with mathematical
functions.
       
>> > So you only need Scott models if you define Procedure-Values (a
>> > subset of Values) to be a set of functions on Values.
>  
>> Right, but this removes first-class functions from your language.
>> Rather boring.

Let me amend this:  Without Scott domains I can only consider
Procedure-Values to be curiously formed tuples that have a complex
reduction rule.  This isn't interesting.

> What? Shriram didn't change his language (approx. Scheme) at all, but
> only the definition of the set Values in his semantic function
>
> Expressions -> (Env -> Values)
>
> These aren't my ideas. It's Shriram's book PLAI
> <http://www.cs.brown.edu/~sk/Publications/Books/ProgLangs/PDF/all.pdf>
> listed on schemers.org right after EoPL.

Yes, but Shriram is discussing ``big-step operational semantics'' not
denotational semantics.  Shriram's semantics will tell me that 
((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
relationship between the Scheme expression `(lambda (x) (+ x 3))' and
the mathematical function `add three'.

jrm
6/11/2004 3:09:52 PM
Shriram, I live in Seattle, so I was curious to see what the University of
Washington uses for their programming languages course. They are using your
book. http://www.cs.washington.edu/education/courses/341/04sp/


6/11/2004 7:58:24 PM
Joe Marshall <jrm@ccs.neu.edu> writes:

> Yes, but Shriram is discussing ``big-step operational semantics'' not
> denotational semantics.  Shriram's semantics will tell me that 
> ((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
> relationship between the Scheme expression `(lambda (x) (+ x 3))' and
> the mathematical function `add three'.

Quite right, though note that it will also tell you

  ((lambda (x) (+ x 3)) 7) => ^10^  [circumflex-10]

You could even possibly prove that, for all numbers ^N^,

  ((lambda (x) (+ x 3)) N) => ^N+3^

where N => ^N^.

Joe enumerates three approaches for Making Sense of my semantics:

>    1.  Assume a priori that Shriram's semantics for reducing an
>        application correspond to the common notion of function
>        application.
> 
>    2.  *Prove* (or disprove) that Shriram's semantics do indeed mean
>        function application.
> 
>    3.  Treat the operational semantics as the rules for a meaningless
>        game.

Since I do periodically allude in that section to the relationship
between the semantics and an interpreter, we can rule out both #3 (the
semantics doesn't live cut off from the universe) and #1 (though I'm
taking great liberties, I'm at least trying to offer a
justification).  We are therefore left with two refinements of #2:

1. Prove that the semantics captures function application.

2. Prove that the semantics faithfully reflects the interpreter, and
   prove that the interpreter captures function application.

Shriram
sk1
6/12/2004 1:56:35 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

> Shriram is using *operational* semantics rather than *denotational*
> semantics.  

Let's be precise, Joe.  Shriram calls it big-step OpS (p 173, PLAI).
But it's also DS by Schmidt's definition (p 3 of his DS book):

   The DS method maps a program directly to its meaning, called its
   denotation.  The denotation is usually a mathematical value, such
   as a number or a function.  No interpreters are used, a valuation
   function maps programs directly to its meaning.

So any mathematically defined function 

V: Programs [or Expressions] ---> Some-Set

is a DS valuation function, by Schmidt's definition.  So practically
any semantics is DS, once you mathematize it.  (More DS talk below.)

> The key difference is this: Denotational semantics defines a
> function that maps programs to what they mean; 

But Shriram did just that.  He mapped programs (expressions even)
mathematically to a set, by a function (I'll give names here)

V: Expressions -> (Env -> Values)

In Schmidt's definition of DS, "meaning" just means a mathematical
value.  If you don't think his subset Procedure-Values is interesting,
that's fine, but there's no mathematical point to argue about.

Let's keep going.  You seemed to agree that Shriram is a good enough
semanticist that he could've expanded his semantics to a math function

W:  Expressions -> (Env Store Cont -> Answers)

Now Shriram's W won't be the same as the R5RS DS semantic function
curly-E, because the target sets are different.  Since Shriram has a
different definition of Values (using Procedure-Value triples to
bypass Scott models), his Store & Cont will differ from R5RS DS.

But Shriram's set Answers will be the same as R5RS DS, so we can ask a
more meaningful question. R5RS DS uses an initial

<rho_0, sigma_0, kappa_0> in Env x Store x Cont 

Shriram's semantic function W will also use such initial values, and
let's call them by the same names, even though they live in different
sets.  Then for any program P, I claim that

		  W[[ P ]] <rho_0, sigma_0, kappa_0>
				  =
	curly-E[[ P ]] <rho_0, sigma_0, kappa_0>   in  Answers

By P here I mean the expression you get by wrapping P in a let form
with "undefined"s as R5RS DS does in sec 7.2.  I think this would be
easy to prove.  And I'd say that was a satisfactory "proof" of your

>    2.  *Prove* (or disprove) that Shriram's semantics do indeed mean
>        function application.

which is to say, that's what I think your "mean" should mean.

> [...] Shriram's semantics will tell me that 
> ((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
> relationship between the Scheme expression `(lambda (x) (+ x 3))' and
> the mathematical function `add three'.

Sure, but that's a matter for proving theorems about observational
equivalence (which MFe has posted about).  It's not a deficiency of
Shriram's semantic functions V & W.  I don't see that Shriram's
semantics is at any disadvantage here with R5RS DS.

I think folks have posted that R5RS DS isn't particularly good for
observational equivalence, and I remember WC posting that R5RS DS is
actually too strict: there are expressions that we want to say are the
same, but curly-E distinguishes them because of some inconsequential
differences in what gets stored in what locations.

Maybe we should say that DS is a subject that humans work in, so DS
means whatever the DS humans are doing.  I work in the subject of
"Homotopy Theory", but the meaning of "Homotopy Theory" has changed
quite a bit since I started 25 years ago.  However, let's note that
Barendregt is much in agreement with Schmidt's definition above.
Barendregt's book is called 
The Lambda Calculus: its Syntax and Semantics,
and he says LC Semantics include the term model (more or less standard
reduction) as well as the much harder Scott models.   

Shriram's book comes close to a precise definition.  In his short
Semantics section, he writes 

   It would be convenient to have some common language for explaining
   interpreters.  We already have one: math!
   [...] It's semantics because it ascribes meanings to programs.

To me, that sounds like what Schmidt calls DS, Shriram calls S!  How
about this: any mathematically defined function

V: Programs [or Expressions] ---> Some-Set

is an S valuation function.  [Now don't tell me the previous sentence
here was "We call this a big-step OpS.  It makes no difference.]

I'll abide by Shriram's definition of S, if others will accept it
and also reject Schmidt's definition.  What about a definition of DS?

In an interesting private discussion, I think Shriram said he didn't
want to call something DS unless it made really serious and integrated
use of Scott models.  Maybe that means if you only use Scott models to
solve the P = (P -> P) problem, it's not really DS.  That's fine, as
long as we admit this is culture, and not Math.  We can argue about
whether somebody's "really" using Scott models, just like we can argue
about whether some proof is "deep", or "trivial".  The truth of our
theorems must be above such political discussions, or it's not Math.
Math isn't a `common language' if we won't use it precisely.
richter1974
6/13/2004 5:14:57 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>
>> Shriram is using *operational* semantics rather than *denotational*
>> semantics.  
>
> Let's be precise, Joe.  Shriram calls it big-step OpS (p 173, PLAI).
> But it's also DS by Schmidt's definition (p 3 of his DS book):
>
>    The DS method maps a program directly to its meaning, called its
>    denotation.  The denotation is usually a mathematical value, such
>    as a number or a function.  No interpreters are used, a valuation
>    function maps programs directly to its meaning.

I don't see how this applies.  Shriram's equations do *not* map
programs directly to their meaning.  Shriram is assuming that a
mapping exists and shows how to reduce Scheme expressions while
preserving the mapping.  In fact, if we restrict our Scheme to
lambda-expressions only, we can dispense with the right hand side.
We'd still have an operational semantics that reduced expressions, but
no denotation attached to it.

> So any mathematically defined function 
>
> V: Programs [or Expressions] ---> Some-Set
>
> is a DS valuation function, by Schmidt's definition.  

The set is supposed to be the meaning of the program.  The unix `wc'
command (word count) maps program text to non-negative integers, but
it is not generally considered a DS valuation function.

> So practically any semantics is DS, once you mathematize it.  (More
> DS talk below.) 

Nonsense.  Denotational semantics involves finding a function that
maps programs directly to their meaning.  There can be other
relationships between programs and meanings.  Consider a `boolean
semantics' which maps (program * set) -> boolean (true if the
program's meaning is contained within the set, false otherwise).  It's
a weird sort of semantics (and not entirely useless), but not
denotational.

>> The key difference is this: Denotational semantics defines a
>> function that maps programs to what they mean; 
>
> But Shriram did just that.  

Take a look at pg. 171.  There are constructs like this:

     l => ^lv^      r => ^rv^
   -----------------------------
      {+ l r} =>  ^lv + rv^

These are called `judgements'.  They are not terms in curly-E and
Shriram makes no claim that they are.

> He mapped programs (expressions even) mathematically to a set, by a
> function (I'll give names here)
>
> V: Expressions -> (Env -> Values)

Some of the rules in the operational semantics have no antecedents.
For example,
                n,E => ^n^

These are taken as execution axioms.
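
(For concreteness, here is how judgements like these transcribe into a
recursive evaluator; a sketch of mine, not code from PLAI.  The axiom
is the base case of the recursion, and the {+ l r} rule is an
inductive case.)

    ;; Judgements read as a recursive big-step evaluator.
    ;; Environments are assoc lists of (name value) pairs.
    (define (big-step exp env)
      (cond ((number? exp) exp)                    ; axiom: n,E => ^n^
            ((symbol? exp) (cadr (assq exp env)))  ; axiom: variable lookup
            ((eq? (car exp) '+)                    ; rule for {+ l r}:
             (+ (big-step (cadr exp) env)          ;   antecedent l => ^lv^
                (big-step (caddr exp) env)))       ;   antecedent r => ^rv^
            (else (error "unknown form" exp))))

    ;; (big-step '(+ 1 (+ 2 3)) '()) => 6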

> In Schmidt's definition of DS, "meaning" just means a mathematical
> value.  If you don't think his subset Procedure-Values is interesting,
> that's fine, but there's no mathematical point to argue about.
>
> Let's keep going.  You seemed to agree that Shriram is a good enough
> semanticist that he could've expanded his semantics to a math function
>
> W:  Expressions -> (Env Store Cont -> Answers)

The reason Shriram didn't expand that is because he isn't trying to do
denotational semantics.  On page 173 Shriram says explicitly:

``It's *operational* because ... we aren't compiling the entire
  program into a mathematical object and using fancy math to reduce it
  to an answer.''

> Now Shriram's W won't be the same as the R5RS DS semantic function
> curly-E, because the target sets are different.  

Yes.  This is why you don't try to expand the axioms.

> Since Shriram has a different definition of Values (using
> Procedure-Value triples to bypass Scott models), his Store & Cont
> will differ from R5RS DS.
>
> But Shriram's set Answers will be the same as R5RS DS, 

That's a bold assertion.

> so we can ask a more meaningful question. 
>
> R5RS DS uses an initial
>
> <rho_0, sigma_0, kappa_0> in Env x Store x Cont 
>
> Shriram's semantic function W will also uses such initial values, and
> let's call them by the same names, even though they live in different
> sets.  Then for any program P, I claim that
>
> 		  W[[ P ]] <rho_0, sigma_0, kappa_0>
> 				  =
> 	curly-E[[ P ]] <rho_0, sigma_0, kappa_0>   in  Answers
>
> By P here I mean the expression you get by wrapping P in a let form
> with "undefined"s as R5RS DS does in sec 7.2.  I think this would be
> easy to prove.  

Feel free to provide the proof.  Consider in particular the issue of
self-application.  Hint:  

    http://www1.elsevier.com/homepage/sac/opit/24/article.pdf

Meyer and de Vink make heavy use of domains, though, so you'll have to
remove them from the proof.

> And I'd say that was a satisfactory "proof" of your
>
>>    2.  *Prove* (or disprove) that Shriram's semantics do indeed mean
>>        function application.
>
> which is to say, that's what I think your "mean" should mean.

I see no proof.

>> [...] Shriram's semantics will tell me that 
>> ((lambda (x) (+ x 3)) 7) => 10, but it has nothing to say about the
>> relationship between the Scheme expression `(lambda (x) (+ x 3))' and
>> the mathematical function `add three'.
>
> Sure, but that's a matter for proving theorems about observational
> equivalence (which MFe has posted about).  

Given that the denotational semantics *do* say that
 `(lambda (x) (+ x 3))' means the mathematical function `add three',
this is a central point.

> It's not a deficiency of Shriram's semantic functions V & W.  I
> don't see that Shriram's semantics is at any disadvantage here with
> R5RS DS.

I never said that Shriram's semantics were deficient, I said they were
operational rather than denotational.  They approach the problem of
semantics in a different manner.

> Shriram's book comes close to a precise definition.  In his short
> Semantics section, he writes 
>
>    It would be convenient to have some common language for explaining
>    interpreters.  We already have one: math!
>    [...] It's semantics because it ascribes meanings to programs.
>
> To me, that sounds like what Schmidt calls DS, Shriram calls S!  

Not to me.  Semantics ascribes meanings to programs, but there are
many techniques for this.  Denotational semantics attempts to define a
function over programs that maps them to meanings.  Operational
semantics identifies the meaning of a program with the steps taken to
evaluate it.  Both are mathematical.

-- 
~jrm
0
6/13/2004 4:45:19 PM
In article <57189ce0.0406122114.37b742f8@posting.google.com>,
Bill Richter <richter@math.northwestern.edu> wrote:
> Let's be precise, Joe.  Shriram calls it big-step OpS (p 173, PLAI).
> But it's also DS by Schmidt's definition (p 3 of his DS book):
> 
>    The DS method maps a program directly to its meaning, called its
>    denotation.  The denotation is usually a mathematical value, such
>    as a number or a function.  No interpreters are used, a valuation
>    function maps programs directly to its meaning.
> 
> So any mathematically defined function 
> 
> V: Programs [or Expressions] ---> Some-Set
> 
> is a DS valuation function, by Schmidt's definition.  So practically
> any semantics is DS, once you mathematize it.  (More DS talk below.)

Since you feel so keen to interpret Schmidt, you may want to have a
look at <http://citeseer.ist.psu.edu/schmidt95programming.html>.
There Schmidt says, among other things:

	Unlike denotational semantics, natural semantics does not
	claim that the meaning of a program is necessarily
	"mathematical."

Here "natural semantics" means big-step operational semantics, as you
can readily verify by reading the paper.


Lauri Alanko
la@iki.fi
0
la (473)
6/13/2004 7:22:49 PM
Big-step OpS only defines meanings for terminating programs, i.e. the
evaluation function is a partial function. DS requires the meaning function
to be total for all programs.

A small-step operational semantics does not define any meaning for programs,
but it allows you to relate a program to its valid reductions.

Big-step and DS are different, end of story.


Bill Richter wrote:
> Joe Marshall <jrm@ccs.neu.edu> responded to me:
> 
> 
>>Shriram is using *operational* semantics rather than *denotational*
>>semantics.  
> 
> 
> Let's be precise, Joe.  Shriram calls it big-step OpS (p 173, PLAI).
> But it's also DS by Schmidt's definition (p 3 of his DS book):
> 
>    The DS method maps a program directly to its meaning, called its
>    denotation.  The denotation is usually a mathematical value, such
>    as a number or a function.  No interpreters are used, a valuation
>    function maps programs directly to its meaning.
> 
> So any mathematically defined function 
> 
> V: Programs [or Expressions] ---> Some-Set
> 
> is a DS valuation function, by Schmidt's definition.  So practically
> any semantics is DS, once you mathematize it.  (More DS talk below.)
> 
> 
>>The key difference is this: Denotational semantics defines a
>>function that maps programs to what they mean; 
> 
> 
> But Shriram did just that.  He mapped programs (expressions even)
> mathematically to a set, by a function (I'll give names here)
> 
0
danwang742 (171)
6/13/2004 7:25:17 PM
Daniel C. Wang wrote:

> Big-step OpS only defines meanings for terminating programs, i.e. the
> evaluation function is a partial function. DS requires the meaning
> function to be total for all programs.
> 
> A small-step operational semantics does not define any meaning for
> programs, but it allows you to relate a program to its valid reductions.
> 
> Big-step and DS are different, end of story.

Let me clarify one subtle point. For programs or terms for which the 
big-step operational semantics is total, one may in a rather perverse way 
consider it a form of DS.

I.e., if we consider an OpS for simple arithmetic expressions, with no
non-terminating expressions, the DS for such a language would basically be
the same thing.

However, for Scheme and any non-trivial programming language they are not 
the same.
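
To make the arithmetic case concrete (a sketch of mine, not from any
book): for  exp ::= <number> | (+ exp exp)  the evaluator below is
total, and it *is* the obvious valuation function into the integers,
so the two readings coincide.

    ;; Big-step evaluator = valuation function, for a language where
    ;; every expression terminates.
    (define (meaning e)
      (if (number? e)
          e                               ; a numeral denotes itself
          (+ (meaning (cadr e))           ; (+ e1 e2) denotes the sum of
             (meaning (caddr e)))))       ; the denotations of e1 and e2

    ;; (meaning '(+ 1 (+ 2 3))) => 6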
0
danwang742 (171)
6/13/2004 7:44:04 PM
Joe, there's a lot in your post, but IMO we need to straighten out
some basic stuff first.  So please comment on the next 3 paragraphs:

Now I wrote that Shriram had mathematically defined a function 
		  V: Expressions -> (Env -> Values)
Now maybe he didn't actually do so, and maybe that's the problem.
Shriram's short Semantics section in PLAI certainly doesn't use the
names V, Env, Values, or Procedure-Values.  Is that an issue?

But I claim that Shriram could easily have done just that.  That is,
whatever he wrote about judgments or antecedents, Shriram could
easily have defined such a mathematical function V. 

And then I claim that such a function V is what Schmidt calls a DS
valuation function, if we wish to say that V captures the semantics.

Now while I'm waiting for your response (or Lauri's), let me make
some quick comments (uh, 150 lines I mean :D) on your post:

Joe Marshall (prunesquallor@comcast.net) responded to me:

> > So any mathematically defined function 
> >
> > V: Programs [or Expressions] ---> Some-Set
> >
> > is a DS valuation function, by Schmidt's definition.  
> 
> The set is supposed to be the meaning of the program.  The unix `wc'
> command (word count) maps program text to non-negative integers, but
> it is not generally considered a DS valuation function.

Right, good point, & I tried to correct this above.  We have to assert
that V captures the semantics of our language.  `wc' certainly
doesn't!  So how do we decide?  We're trying to keep it to Math and
away from culture.  I know what my criterion is: stick to programs,
which is all that Schmidt's quote refers to.  For any program P,

V[[ P ]]  in  Some-Set

must be a mathematization of the actual interpreter output.  Maybe
that even works for expressions.  Shriram's (or my) value

V[[ (lambda (x) body) ]](E) in Values

strikes you as "meaningless", but Scheme prints something meaningless,
as you know.  I just pasted a lambda expr (sol to Ex 22.2.3 of HtDP,
and I was quite proud of my solution) into the Interactions window of
DrScheme, hit RET, and here's my output:


> ;; fold : Y (X Y  ->  Y)  -> ((listof X) -> Y)
(define  (fold base combine)
  (local ((define (abs-fun aloX)
            (if (empty? aloX)
                base
                (combine (first aloX) (abs-fun (rest aloX))))))
    abs-fun))
> 

Absolutely nothing!  

> > R5RS DS uses an initial
> >
> > <rho_0, sigma_0, kappa_0> in Env x Store x Cont 
> >
> > Shriram's semantic function W will also uses such initial values, and
> > let's call them by the same names, even though they live in different
> > sets.  Then for any program P, I claim that
> >
> > 		  W[[ P ]] <rho_0, sigma_0, kappa_0>
> > 				  =
> > 	curly-E[[ P ]] <rho_0, sigma_0, kappa_0>   in  Answers
> >
> > By P here I mean the expression you get by wrapping P in a let form
> > with "undefined"s as R5RS DS does in sec 7.2.  I think this would be
> > easy to prove.  
> 
> Feel free to provide the proof.  

I'd like to, Joe, but there's a serious communication problem.  I
don't even know what part of this you think is hard.  I'd say this is
obvious because R5RS DS is basically playing the SICP game, just with
domains.  That is, curly-E[[ (lambda (x) body) ]] is the function you
get by using the SICP rules for evaluation.  I thought everyone agreed
that it made pretty good sense to think of R5RS DS as FP.  Anton vS
even worked out WC's exercise, writing a DS->Scheme meta-interpreter.

> > Sure, but that's a matter for proving theorems about observational
> > equivalence (which MFe has posted about).  
> 
> Given that the denotational semantics *do* say that
>  `(lambda (x) (+ x 3))' means the mathematical function `add three',
> this is a central point.

Maybe you're right, Joe, it sounds reasonable. My R5RS DS is pretty
rusty.  But I want to stick to programs anyway if we're gonna decide
if some function is a DS valuation.   And you didn't seem to
understand my point, so let me try again, with your example in mind:

We can say that Shriram's W is too strict on lambda expressions.  Any
two lambda's will be distinguished by W unless they're identical.  

We don't want that.  We'd like our DS valuation function to identify
lambda's that are observationally equivalent.

But R5RS DS is too strict as well.  I think WC posted lambda's that
are observationally equivalent, but separated by curly-E.

To me, that just shows that "the proof of the DS valuation pudding" is
in the programs, not the expressions.

> > Shriram's book comes close to a precise definition.  In his short
> > Semantics section, he writes 
> >
> >    It would be convenient to have some common language for explaining
> >    interpreters.  We already have one: math!
> >    [...] It's semantics because it ascribes meanings to programs.
> >
> > To me, that sounds like what Schmidt calls DS, Shriram calls S!  
> 
> Not to me.  Semantics ascribes meanings to programs, but there are
> many techniques for this.  Denotational semantics attempts to define a
> function over programs that maps them to meanings.  Operational
> semantics identifies the meaning of a program with the steps taken to
> evaluate it.  Both are mathematical.

Then I want an example of OpS that can't easily be turned into DS.
Let's suppose we have a mathematically defined function

omega: Programs ---> Machine-History

which takes a program to its entire evaluation history, a sequence
[state_1, state_2,... ].  Is that OpS?

Now I'll define a DS valuation function

V: Programs ---> Answers

which sends the program P to either

state_n, if  omega[[ P ]] = [state_1, state_2,..., state_n ]

bottom, if omega[[ P ]] is an infinite sequence.

Now the only thing I read from your paper is that they defined all DS
valuation functions to be compositional, and this function V is not
compositional.  But that's a matter of definition.  I claim V is a
non-compositional DS valuation function, and I think I handled
Daniel's objection.  Now I'm assuming above that state_n will be the
actual program output (if it exists).

There's certainly a gain in mathematical complexity.  I'd bet that
there's a computable function Next-State that computes state_{i+1}
from state_i.  But V is definitely not a computable function, by the
Halting problem.  We need induction to define V from Next-State.

Shriram's W will be compositional, because that's the SICP way.
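
Here's a toy sketch of what I mean (the state representation is
entirely made up, just to fix ideas).  next-state is the computable
one-step function, history unfolds omega[[ P ]], and run only
semidecides V: it diverges on infinite histories, which is exactly
why V itself is not computable.

    ;; A state is (halted v) or (running n); a toy one-step function:
    (define (next-state s)
      (if (= (cadr s) 0)
          (list 'halted 'done)
          (list 'running (- (cadr s) 1))))

    ;; omega[[ P ]]: the full evaluation history (finite case only).
    (define (history s)
      (if (eq? (car s) 'halted)
          (list s)
          (cons s (history (next-state s)))))

    ;; As close to V as a program gets: loop to the final state.
    (define (run s)
      (if (eq? (car s) 'halted) s (run (next-state s))))

    ;; (history '(running 2))
    ;;   => ((running 2) (running 1) (running 0) (halted done))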
0
richter1974 (312)
6/14/2004 2:27:37 AM
In article <57189ce0.0406131827.72a053a4@posting.google.com>,
Bill Richter <richter@math.northwestern.edu> wrote:
> Now while I'm waiting for your response (or Lauri's)

I think that was pretty illustrative. You are unable to take at
face value what people actually write, and make unwarranted
extrapolations, even though the things you assume have been explicitly
denied, as you might find out if you bothered to actually follow this
newsgroup and read what people say.


Lauri
0
la (473)
6/14/2004 11:19:22 AM
In article <57189ce0.0406131827.72a053a4@posting.google.com>,
Bill Richter <richter@math.northwestern.edu> wrote:
> Let's suppose we have a mathematically defined function
> 
> omega: Programs ---> Machine-History
> 
> which takes a program to its entire evaluation history, a sequence
> [state_1, state_2,... ].  Is that OpS?
> 
> Now I'll define a DS valuation function
> 
> V: Programs ---> Answers
> 
> which sends the program P to either
> 
> state_n, if  omega[[ P ]] = [state_1, state_2,..., state_n ]
> 
> bottom, if omega[[ P ]] is an infinite sequence.

Right. So the meaning of "(lambda (x) (+ x 3))" is 
"(lambda (x) (+ x 3))". Mighty useful piece of information, that one.

(It _is_ useful for most of the practical purposes that CS folks use
calculi for. That's why op sem is so prevalent nowadays. But it does
not give any enlightenment about whether the term represents the
mathematical function that we intuitively associate it with.)


Lauri
0
la (473)
6/14/2004 11:40:55 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Now I wrote that Shriram had mathematically defined a function 
> 		  V: Expressions -> (Env -> Values)
> Now maybe he didn't actually do so, and maybe that's the problem.
> Shriram's short Semantics section in PLAI certainly doesn't use the
> names V, Env, Values, or Procedure-Values.  Is that an issue?

Yes.

> But I claim that Shriram could easily have done just that.  That is,
> whatever he wrote about judgments or antecedents, Shriram could
> easily have defined such a mathematical function V. 

He *could* have, but he didn't.

> And then I claim that such a function V is what Schmidt calls a DS
> valuation function, if we wish to say that V captures the semantics.

Yes.

It is non-trivial to go from a set of judgements to a valuation
function.  In order to determine the meaning of a program from the
judgements, you must construct a chain of judgements that
incrementally reduce the program to axioms.  Yes, you do this with
induction, but in order for the induction to work you must have a base
case.  If you apply a function to itself, however, you construct an
infinite chain of judgements.  You can still attribute meaning to the
infinite chain if it approaches a limit, but it isn't easy.

But Shriram isn't doing this.
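
For reference, the classic culprit (my example, not Shriram's):

    ;; Self-application: the antecedent judgement for this
    ;; application is an application of the same shape, so the
    ;; chain of judgements never bottoms out in an axiom.  (It
    ;; also loops forever if you evaluate it, so don't.)
    (define omega-term
      '((lambda (x) (x x)) (lambda (x) (x x))))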

> Right, good point, & I tried to correct this above.  We have to assert
> that V captures the semantics of our language.  `wc' certainly
> doesn't!  So how do we decide?  We're trying to keep it to Math and
> away from culture.  I know what my criterion is: stick to programs,
> which is all that Schmidt's quote refers to.  For any program P,
>
> V[[ P ]]  in  Some-Set
>
> must be a mathematization of the actual interpreter output.  Maybe
> that even works for expressions.  Shriram's (or my) value
>
> V[[ (lambda (x) body) ]](E) in Values
>
> strikes you as "meaningless", but Scheme prints something meaningless,
> as you know.  

Lambda expressions are meaningful, but they mean different things in
denotational semantics and operational semantics.  In denotational
semantics, lambda expressions mean mathematical functions.  In
operational semantics, lambda expressions mean to construct a
closure.

>> > R5RS DS uses an initial
>> >
>> > <rho_0, sigma_0, kappa_0> in Env x Store x Cont 
>> >
>> > Shriram's semantic function W will also uses such initial values, and
>> > let's call them by the same names, even though they live in different
>> > sets.  Then for any program P, I claim that
>> >
>> > 		  W[[ P ]] <rho_0, sigma_0, kappa_0>
>> > 				  =
>> > 	curly-E[[ P ]] <rho_0, sigma_0, kappa_0>   in  Answers
>> >
>> > By P here I mean the expression you get by wrapping P in a let form
>> > with "undefined"s as R5RS DS does in sec 7.2.  I think this would be
>> > easy to prove.  
>> 
>> Feel free to provide the proof.  
>
> I'd like to, Joe, but there's a serious communication problem.  I
> don't even know what part of this you think is hard.

The hard part is establishing a one-to-one mapping between the tuples
of Operational Semantics and the domain of function values in
denotational semantics.

> Then I want an example of OpS that can't easily be turned into DS.

On the one hand, since it can be proven that operational semantics is
equivalent to denotational semantics, there can be no example.  On the
other hand, you need to invoke domain theory to do it, so it's never
easy.

> Let's suppose we have a mathematically defined function
>
> omega: Programs ---> Machine-History
>
> which takes a program to its entire evaluation history, a sequence
> [state_1, state_2,... ].  Is that OpS?

Not quite.  omega is defined by induction over the operational
semantics.  OpS takes you from state_n to state_n+1, but no further.

> There's certainly a gain in mathematical complexity.  I'd bet that
> there's a computable function Next-State that computes state_{i+1}
> from state_i.  But V is definitely not a computable function, by the
> Halting problem.  We need induction to define V from Next-State.

Exactly.  By restricting ourselves to the state-transition function,
we avoid the computability problems associated with the denotational
approach.  But this comes at a cost:  we can no longer say that
the state transitions involved with function application and lambda
expressions `sum up' to `real' functions.
0
jrm (1310)
6/14/2004 4:20:13 PM
Joe Marshall <jrm@ccs.neu.edu> responded to me:
> 
> > Now I wrote that Shriram had mathematically defined a function 
> > 		  V: Expressions -> (Env -> Values)
> > Now maybe he didn't actually do so, and maybe that's the problem.
> > Shriram's short Semantics section in PLAI certainly doesn't use
> > the names V, Env, Values, or Procedure-Values.  Is that an issue?
> 
> Yes.

Good!  Joe, I think we're making progress.

> > But I claim that Shriram could easily have done just that.  That
> > is, whatever he wrote about judgments or antecedents, Shriram
> > could easily have defined such a mathematical function V.
> 
> He *could* have, but he didn't.

Great.  Now the question we're divided on is how hard it would be.
That's not gonna be real easy to settle, but let's keep working:
 
> > And then I claim that such a function V is what Schmidt calls a DS
> > valuation function, if we wish to say that V captures the semantics.
> 
> Yes.

Great.  

> It is non-trivial to go from a set of judgments to a valuation
> function.  

That's what I dispute, and I think we're heading toward a resolution.

> [...]  Yes, you do this with induction, [...] 

Great! 

> But Shriram isn't doing this.

Yeah, maybe.  But if it's easy enough to pass to V, then I'm not way
off base to have misinterpreted Shriram this way.  Now if it's hard,
then I wildly misinterpreted Shriram, and I owe him an apology.

> > Right, good point, & I tried to correct this above.  We have to
> > assert that V captures the semantics of our language.  `wc'
> > certainly doesn't!  So how do we decide?  We're trying to keep it
> > to Math and away from culture.  I know what my criterion is: stick
> > to programs, which is all that Schmidt's quote refers to.  For any
> > program P,
> >
> > V[[ P ]]  in  Some-Set
> >
> > must be a mathematization of the actual interpreter output.  

Joe, can I get you to vote on this?  You went on to my next point.

> > Maybe that even works for expressions.  Shriram's (or my) value
> >
> > V[[ (lambda (x) body) ]](E) in Values
> >
> > strikes you as "meaningless", but Scheme prints something
> > meaningless, as you know.
> 
> Lambda expressions are meaningful, but they mean different things in
> denotational semantics and operational semantics.  In denotational
> semantics, lambda expressions mean mathematical functions.  

Now here I claim you're bringing culture into Math.  In the DS we've
seen, yes, you're right.  There's no requirement though, in Schmidt's
definition, and you seemed to agree with me on this above.

> >> > R5RS DS uses an initial
> >> >
> >> > <rho_0, sigma_0, kappa_0> in Env x Store x Cont 
> >> >
> >> > Shriram's semantic function W will also uses such initial
> >> > values, and let's call them by the same names, even though they
> >> > live in different sets.  Then for any program P, I claim that
> >> >
> >> > 		  W[[ P ]] <rho_0, sigma_0, kappa_0>
> >> > 				  =
> >> > 	curly-E[[ P ]] <rho_0, sigma_0, kappa_0>   in  Answers
> >> >
> >> > By P here I mean the expression you get by wrapping P in a let form
> >> > with "undefined"s as R5RS DS does in sec 7.2.  I think this would be
> >> > easy to prove.  
> >> 
> >> Feel free to provide the proof.  
> >
> > I'd like to, Joe, but there's a serious communication problem.  I
> > don't even know what part of this you think is hard.
> 
> The hard part is establishing a one-to-one mapping between the tuples
> of Operational Semantics and the domain of function values in
> denotational semantics.

Yeah, great.  I think I can do that.  I read R5RS DS quite carefully
and it really looked to me like curly-E[[ lambda expressions ]] was
just the obvious function you'd want to define.  It took me a while to
decode R5RS DS, and now it looks impenetrable again.

Our eval order plt-scheme thread will help.  From now on, I'm junking
the permute/unpermute part of R5RS DS, and going with left->right eval
order.  I mean, I was doing that before anyway, but I felt guilty :D

> > Then I want an example of OpS that can't easily be turned into DS.
> 
> On the one hand, since it can be proven that operational semantics
> is equivalent to denotational semantics, there can be no example.
> On the other hand, you need to invoke domain theory to do it, so
> it's never easy.

I think you contradicted yourself below, Joe.  That is, if domain
theory means Scott models of LC.  Let's go read it:

> > Let's suppose we have a mathematically defined function
> >
> > omega: Programs ---> Machine-History
> >
> > which takes a program to its entire evaluation history, a sequence
> > [state_1, state_2,... ].  Is that OpS?
> 
> Not quite.  omega is defined by induction over the operational
> semantics.  OpS takes you from state_n to state_n+1, but no further.

Ah, thanks.  So OpS is just my Next-State function below.  And sure,
you'd need induction to even define omega.  Great.

> > There's certainly a gain in mathematical complexity.  I'd bet that
> > there's a computable function Next-State that computes state_{i+1}
> > from state_i.  But V is definitely not a computable function, by
> > the Halting problem.  We need induction to define V from
> > Next-State.
> 
> Exactly.  By restricting ourselves to the state-transition function,
> we avoid the computability problems associated with the denotational
> approach.  

But this doesn't bother me a bit!  Pure mathematicians rarely have
computable functions.  Any time you bring in the real line you're not
computable, because (as Tom Bushnell posted), the real line isn't a
computable set!

This is where I thought you were contradicting yourself.  Because this
V doesn't seem to use Scott models, and I said it was a
non-compositional DS valuation function.  Hmm, you snipped that part :)

> But this comes at a cost: we can no longer say that the state
> transitions involved with function application and lambda
> expressions `sum up' to `real' functions.

Don't quite grok.  Is this what you were going after Shriram for, that
his big-step OpS might be meaningless reduction rules?  If so, that's
a real good point, and the reason I always want to produce the V.
0
richter1974 (312)
6/15/2004 2:40:59 AM
Lauri Alanko <la@iki.fi> responded to me:

> > Let's suppose we have a mathematically defined function
> > 
> > omega: Programs ---> Machine-History
> > 
> > which takes a program to its entire evaluation history, a sequence
> > [state_1, state_2,... ].  Is that OpS?
> > 
> > Now I'll define a DS valuation function
> > 
> > V: Programs ---> Answers
> > 
> > which sends the program P to either
> > 
> > state_n, if  omega[[ P ]] = [state_1, state_2,..., state_n ]
> > 
> > bottom, if omega[[ P ]] is an infinite sequence.
> 
> Right. So the meaning of "(lambda (x) (+ x 3))" is 
> "(lambda (x) (+ x 3))". Mighty useful piece of information, that
> one.

But that's the answer you get, or even less, Lauri!  V maps programs
to answers, mostly meaning the printed output of the interpreter.
That's how R5RS DS uses the name Answers.  If your program was a
lambda expression, DrScheme gives no output at all.
 
> (It _is_ useful for most of the practical purposes that CS folks use
> calculi for. That's why op sem is so prevalent nowadays. But it does
> not give any enlightenment about whether the term represents the
> mathematical function that we intuitively associate it with.)

Yeah, and that must be why folks use Scott models in DS.  But there's
no requirement (in Schmidt's definition at least) that the valuation
of lambda-expr is your mathematical function.  It could be your
"mighty useful" (lambda (x) (+ x 3)).  Don't agret that the proof
of the DS pudding is for programs?  If I hand you a math function

V: Expressions -> (Env Store Cont -> Answers)

and if you're deciding if my V is really a DS valuation function, 
then you're going to make your decision based on the restriction 

V^p : Programs -> Answers

V^p( P ) = V(P)(rho_0, sigma_0, kappa_0)

All bets are off on the original V, unless we demand compositionality.

Lauri, you seem antagonistic, and maybe you remember how this went up
in smoke 2 years ago.  But MFe is here now, and Joe & I have built up
some goodwill on plt-scheme... I'm willing to give it another try.
0
richter1974 (312)
6/15/2004 2:56:56 AM
richter@math.northwestern.edu (Bill Richter) writes:

>> > Right, good point, & I tried to correct this above.  We have to
>> > assert that V captures the semantics of our language.  `wc'
>> > certainly doesn't!  So how do we decide?  We're trying to keep it
>> > to Math and away from culture.  I know what my criterion is: stick
>> > to programs, which is all that Schmidt's quote refers to.  For any
>> > program P,
>> >
>> > V[[ P ]]  in  Some-Set
>> >
>> > must be a mathematization of the actual interpreter output.  
>
> Joe, can I get you to vote on this?  You went on to my next point.

I'm not quite sure what you are asserting here, but I'd say this:

If your interpreter is faithful to your denotational semantics, then

   V [[ P ]] = V [[ interpreter output ]]

Note that this is different from operational semantics:

   Op (P) =>  interpreter output

i.e., operational semantics applied to a program reduces to the
interpreter output.
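
A tiny illustration of the difference (mine, arithmetic only): a
faithful semantics is one that each operational step preserves,
i.e. V [[ P ]] = V [[ step P ]].

    ;; Denotation of an arithmetic expression.
    (define (V e)
      (if (number? e)
          e
          (+ (V (cadr e)) (V (caddr e)))))

    ;; One leftmost reduction step (assumes e still contains a redex).
    (define (step e)
      (cond ((not (number? (cadr e)))
             (list '+ (step (cadr e)) (caddr e)))
            ((not (number? (caddr e)))
             (list '+ (cadr e) (step (caddr e))))
            (else (+ (cadr e) (caddr e)))))      ; contract the redex

    ;; (step '(+ 1 (+ 2 3))) => (+ 1 5), and V of both is 6.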

>> > Maybe that even works for expressions.  Shriram's (or my) value
>> >
>> > V[[ (lambda (x) body) ]](E) in Values
>> >
>> > strikes you as "meaningless", but Scheme prints something
>> > meaningless, as you know.
>> 
>> Lambda expressions are meaningful, but they mean different things in
>> denotational semantics and operational semantics.  In denotational
>> semantics, lambda expressions mean mathematical functions.  
>
> Now here I claim you're bringing culture into Math.  In the DS we've
> seen, yes, you're right.  There's no requirement though, in Schmidt's
> definition, and you seemed to agree with me on this above.

Yes, we could be using `wc'.

>> But this comes at a cost: we can no longer say that the state
>> transitions involved with function application and lambda
>> expressions `sum up' to `real' functions.
>
> Don't quite grok.  Is this what you were going after Shriram for, that
> his big-step OpS might be meaningless reduction rules?  If so, that's
> a real good point, and the reason I always want to produce the V.

I wasn't really `going after' Shriram; I'm sure his operational
semantics are well-founded, and I believe that he chose operational
semantics over denotational semantics in order to avoid the problems
associated with the latter and to illustrate more closely the actions
of the interpreter.

But while Shriram's semantics may faithfully model what the
interpreter does, they do not provide (nor are they intended to
provide) a reason to believe that the program *as a whole* means what
we intend (i.e., the valuation function is implied through induction,
but there is no proof that the induction is valid).

For simple expressions that makes no difference, but suppose we
consider this one:

(((lambda (f)
    ((lambda (D) (D D)) 
     (lambda (x) (f (lambda () (x x))))))
  (lambda (f)
      (lambda (n)
        (if (zero? n)
            1
          (* n ((f) (- n 1))))))) 10)

Operational semantics can easily show that this reduces to 3628800,
but it could not show that this fragment:

  (lambda (f)
    ((lambda (D) (D D)) 
     (lambda (x) (f (lambda () (x x))))))

when applied to this fragment:

    (lambda (f)
      (lambda (n)
        (if (zero? n)
            1
            (* n ((f) (- n 1))))))

yields a (partial) function.


-- 
~jrm
0
6/15/2004 8:41:13 AM
Joe Marshall <prunesquallor@comcast.net> responds to me:

Joe, I didn't really follow your post, except for your (almost!) Y_v
combinator (see below), but I think I maybe see our problem:

I think there's 2 separate issues here, and I'd like your vote on both
the mathematical issue and the "real-world modeling" issue:

****** Math ******

I claimed I could easily turn Shriram's big-step OpS into a
mathematical function

V : Expressions -> (Env -> Values)

where V[[ (lambda (x) body) ]](E) = <x, body, E>  

I said I didn't need Scott models or CPO's to define V, just some
induction.  Now relating V to the curly-E of R5RS DS would be real
work, because curly-E is real work, as it involves Scott models.  

I think maybe you're agreeing with me on this point, but you say I can't
call it a DS valuation function, on account of:

****** Real-World Modeling ******

When we make a mathematical model of a real-world phenomenon, we have to
ask if it's a "good" model.  And that's partly a math question, but
it's partly a real-world question.  What do we mean by good?

So a DS valuation/semantic function for our language

V : Expressions -> Meaning-Set

must satisfy us mathematically, but it must also satisfy us
intuitively, in a way that I think can't itself be mathematized.

So V[[ (lambda (x) body) ]] is supposed to mathematically codify the
"meaning" of (lambda (x) body).  But what is the "meaning"?

I think we can legitimately give different answers to this question,
and we must define different semantic functions accordingly. 

I say there's nothing wrong with the SICP solution, as Sussman &
Steele invented Scheme.  SICP "conflates" the Store with the
Environment, as Shriram does, and SICP also declares the value of 
(lambda (x) body) in an environment E to be just the lambda tagged
with E, i.e. Shriram's triple <x, body, E>.   

You can say SICP is old-hat, but I say we can, and ought to, define
a SICP DS semantic function, and then we'll have to say that the
"meaning" of (lambda (x) body), i.e. 
V[[ (lambda (x) body) ]]
reflects the SICP biz, 
and then we'll have to say something like Shriram's

V[[ (lambda (x) body) ]](E) = <x, body, E>  

Now you can say, "No!"  The "real" meaning of a lambda is some actual
function on Values, and that forces Scott models etc.

And I'd say, that's fine, but that's a different "semantic"
understanding of Scheme, so you need R5RS's curly-E, a different DS
semantic function.  There's no right answer here.  Heck, there are
folks who think that everything is a pointer in Scheme, and they also
need a DS semantic function that's different from curly-E.

****** other issues: R5RS DS has real-world-fit problems ******

I think I remember what WC posted was wrong with the R5RS DS def of
procedure values.  Let's forget Cont for now, and dumb down the R5RS
semantic function to the sort of call/cc-free DS semantic function

curly-E: Expressions -> (Store Env -> Store Value)

which Cartwright & Felleisen talk about on the top of page 2 of their
"Extensible Denotational" paper on PLT.

Then curly-E[[ (lambda (x) body) ]](sigma E) = (sigma f)

where f is an honest function.  But I think what WC pointed out is
that meaningless changes to (lambda (x) body) will result in different
functions f.  f is a function that changes the contents of locations,
and we can modify this location-behavior in meaningless ways by making
meaningless changes to (lambda (x) body).  By meaningless, I'm making
a "real-world" distinction: they don't mean anything to us Scheme
programmers.  But in the DS meaning of meaning, it's not meaningless!
What that means is that the (very nice IMO) curly-E doesn't really
reflect our real-world understanding of Scheme semantics.  But it's
pretty good, and it's perfect on whole programs.
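
Here's a hypothetical pair in the spirit of WC's point (not his
actual example, which I don't remember exactly):

    ;; Both procedures are the identity to any observer, but in a
    ;; store-passing semantics the second allocates a fresh location
    ;; for y on every call, so the two denotations act differently
    ;; on Store, a difference no Scheme program can detect.
    (define f1 (lambda (x) x))
    (define f2 (lambda (x) (let ((y x)) y)))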

Why don't I also point out that my Shriram V semantic function is 
compositional by the definition in "Extensible Denotational", p 6:

[The map from syntactic domains to semantic domains] satisfies the law
of compositionality: the interpretation of a phrase is a function of
the interpretation of the sub-phrases.

Since that's what SICP says, it's obviously true for the Shriram

V : Expressions -> (Env -> Values)


****** your interesting Y_v combinator ******

Your code looks pretty much like the Y_v combinator version of 

(fact 10) => 3628800

but it looked odd enough I checked.   Here's yours

(((lambda (f)
    ((lambda (D) (D D)) 
     (lambda (x) 
       (f (lambda () (x x))))))
  (lambda (f)
      (lambda (n)
        (if (zero? n)
            1
          (* n ((f) (- n 1))))))) 10)

and here's the usual Y_v fact biz (Y_v courtesy of TLS):

(((lambda (f)
    ((lambda (D) (D D))
     (lambda (x) 
       (f (lambda (p) ((x x) p))))))
  (lambda (f)
    (lambda (n)
      (if (zero? n)
          1
          (* n (f (- n 1))))))) 10)

I never saw your version.  Finally I realized that your f is really
(f), so you don't quite have a version of Y_v.  Pretty clever anyway!
0
richter1974 (312)
6/16/2004 3:52:52 AM
richter@math.northwestern.edu (Bill Richter) writes:

> You can say SICP is old-hat, but I say we can, and ought to, define
> a SICP DS semantic function, and then we'll have to say that the
> "meaning" of (lambda (x) body), i.e. 
> V[[ (lambda (x) body) ]]
> reflects the SICP biz, 
> and then we'll have to say something like Shriram's
>
> V[[ (lambda (x) body) ]](E) = <x, body, E>  
>
> Now you can say, "No!"  The "real" meaning of a lambda is some actual
> function on Values, and that forces Scott models etc.

More or less.  I'm not insisting that you must model a function, but
if you choose not to model a function, then you cannot make sense of this:

   ((lambda (x) x) 42)

Because the lambda expression means a 3-tuple and you have not
provided a semantics for applying 3-tuples to arguments.

You have 3 options at this point:

  1. Haul out the Scott domains

  2. Outlaw applying closures to arguments

  3. Punt and use Operational semantics to rewrite 
     ((lambda (x) x) 42) => 42  and declare victory.
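
To make option 3 concrete, here is roughly what the operational punt
looks like (a sketch only, one-argument procedures, not Shriram's
actual code):

    ;; Closures are triples; `applying' one is a rewrite performed
    ;; by the evaluator, not a property of the triple itself.
    (define (evaluate exp env)
      (cond ((number? exp) exp)
            ((symbol? exp) (cadr (assq exp env)))
            ((eq? (car exp) 'lambda)
             (list 'closure (caadr exp) (caddr exp) env)) ; <x, body, E>
            (else
             (apply-closure (evaluate (car exp) env)
                            (evaluate (cadr exp) env)))))

    (define (apply-closure clo arg)
      (evaluate (caddr clo)                   ; the body ...
                (cons (list (cadr clo) arg)   ; ... in E extended with x
                      (cadddr clo))))

    ;; (evaluate '((lambda (x) x) 42) '()) => 42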

> Why don't I also point out that my Shriram V semantic function is 
> compositional by the definition in "Extensible Denotational", p 6:
>
> [The map from syntactic domains to semantic domains] satisfies the law
> of compositionality: the interpretation of a phrase is a function of
> the interpretation of the sub-phrases.

But it isn't compositional (yet) because you haven't defined what the
application of 3-tuples means.

> Your code looks pretty much like the Y_v combinator version of 
>
> (fact 10) => 3628800
>
> but it looked odd enough I checked.   Here's yours
>
> (((lambda (f)
>     ((lambda (D) (D D)) 
>      (lambda (x) 
>        (f (lambda () (x x))))))
>   (lambda (f)
>       (lambda (n)
>         (if (zero? n)
>             1
>           (* n ((f) (- n 1))))))) 10)
>
> and here's the usual Y_v fact biz (Y_v courtesy of TLS):
>
> (((lambda (f)
>     ((lambda (D) (D D))
>      (lambda (x) 
>        (f (lambda (p) ((x x) p))))))
>   (lambda (f)
>     (lambda (n)
>       (if (zero? n)
>           1
>           (* n (f (- n 1))))))) 10)
>
> I never saw your version.  Finally I realized that your f is really
> (f), so you don't quite have a version of Y_v.  Pretty clever anyway!

My Y is curried.

-- 
~jrm
0
6/16/2004 8:25:44 AM
richter@math.northwestern.edu (Bill Richter) wrote in message news:<57189ce0.0406151952.6ff0d78b@posting.google.com>...
> You can say SICP is old-hat, but I say we can, and ought to, define
> a SICP DS semantic function, and then we'll have to say that the
> "meaning" of (lambda (x) body), i.e. 
> V[[ (lambda (x) body) ]]
> reflects the SICP biz, 
> and then we'll have to say something like Shriram's
> 
> V[[ (lambda (x) body) ]](E) = <x, body, E>  
> 
> Now you can say, "No!"  The "real" meaning of a lambda is some actual
> function on Values, and that forces Scott models etc.
> 
> And I'd say, that's fine, but that's a different "semantic"
> understanding of Scheme, so you need R5RS's curly-E, a different DS
> semantic function.  There's no right answer here.  Heck, there are
> folks who think that everything is a pointer in Scheme, and they also
> need a DS semantic function that's different from curly-E.

It's not just a "different semantic understanding".  It's an
understanding that has no more layers of evaluation on top of it.

Your V yields a <x, body, E> tuple.  What *is* this, I ask you?  You
might say, "hmmm... well, to understand what this tuple means, I need
to show you it running on some data in my interpreter."  (And in fact,
different interpreters may yield different answers to this question.)

But in Denotational Semantics, once I get the element of the
"Meaning-Set" as you call it, I'm done.  That's it.  There's no more
running it in an interpreter.  The thing I get back will be a
function; a real mathematical function.  I can't change what the
factorial or fibonacci functions *are*.

(I can't change your <x, body, E> tuple either, but I can change what
I think the 'body' inside of it means.)

Also, I suggest you go back and look at Joe's example at the end of
his last post.  It sounds like you were trying to look at it and twist
it into something like the Y-combinator, and in the process you missed
the whole point that he was making: that Denotational Semantics gave
us a *meaning* for the sub-expressions that weren't just abstract
tuples with some code in it.

Instead, the meanings were "just" abstract functions that are tricky
for us humans to make sense out of.  Are these functions "better" than
your tuples for the purposes of reasoning about programs?   I'm a
biased (and ignorant!) party, so I'm going to be political here and
just say: It depends on the structure and goals of your analysis.

Bill, if you want to try to understand why one approach might be
better or worse than another, you should stop posting on the
newsgroup, and go write some static analysis code.  REAL CODE. 
Something like a CFA would be a good exercise for you; go google for
"control-flow analysis scheme", do some reading, and spend three weeks
hacking something up.
0
pnkfelix (27)
6/16/2004 1:43:44 PM
pnkfelix@gmail.com (Felix Klock) writes:
> It's not just a "different semantic understanding".  It's an
> understanding that has no more layers of evaluation on top of it.
> 
> Your V yields a <x, body, E> tuple.  What *is* this, I ask you?  You
> might say, "hmmm... well, to understand what this tuple means, I need
> to show you it running on some data in my interpreter."  (And in fact,
> different interpreters may yield different answers to this question.)
> 
> But in Denotational Semantics, once I get the element of the
> "Meaning-Set" as you call it, I'm done.  That's it.  There's no more
> running it in an interpreter.  The thing I get back will be a
> function; a real mathematical function.  I can't change what the
> factorial or fibonacci functions *are*.

This is my problem with DS: who decides the semantics of the "Meaning-Set"?
Just math, we say. But that is just the original problem all over again. 

Math is a system of symbols and axioms and rules for evaluating them, quite
like programs, really. It is just that people have internalized math quite
well over the years, and so there is immediate recognition and agreement as to
the meaning of things.

But ultimately the problem is the same: semantics comes about from how things
are used, how they interact; in programming terms: how things are interpreted.

In my view DS is unnecessarily complicated.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYblaaK@STRIPCAPStelus.net                    The Rhythm has my soul.
0
rAYblaaK (362)
6/16/2004 4:57:30 PM
pnkfelix@gmail.com (Felix Klock) writes:

> Also, I suggest you go back and look at Joe's example at the end of
> his last post.  It sounds like you were trying to look at it and twist
> it into something like the Y-combinator, and in the process you missed
> the whole point that he was making: that Denotational Semantics gave
> us a *meaning* for the sub-expressions that weren't just abstract
> tuples with some code in it.

The other point is that if you attempt to perform induction over the
operational semantics with this subform you will find yourself with no
base case.  It isn't the Y combinator that's the problem, it's the
self-application of F that's going to do you in.

0
jrm (1310)
6/16/2004 5:25:12 PM
Joe Marshall <prunesquallor@comcast.net> responds to me:
> 
> > You can say SICP is old-hat, but I say we can, and ought to, define
> > a SICP DS semantic function, and then we'll have to say that the
> > "meaning" of (lambda (x) body), i.e. 
> > V[[ (lambda (x) body) ]]
> > reflects the SICP biz, 
> > and then we'll have to say something like Shriram's
> >
> > V[[ (lambda (x) body) ]](E) = <x, body, E>  
> >
> > Now you can say, "No!"  The "real" meaning of a lambda is some actual
> > function on Values, and that forces Scott models etc.
> 
> More or less.  I'm not insisting that you must model a function, but
> if you choose not to model a function, then you cannot make sense of
> this:
> 
>    ((lambda (x) x) 42)

Thanks, Joe, that clarifies your previous Y biz.  I disagree:

> Because the lambda expression means a 3-tuple and you have not
> provided a semantics for applying 3-tuples to arguments.

Shriram wrote down the `semantics for applying 3-tuples to arguments'
in PLAI.  It's the SICP rule, essentially.  That's the reduction rule
you posted about.  But yeah, we have to provide such semantics!

> > Why don't I also point out that my Shriram V semantic function is 
> > compositional by the definition in "Extensible Denotational", p 6:
> >
> > [The map from syntactic domains to semantic domains] satisfies the law
> > of compositionality: the interpretation of a phrase is a function of
> > the interpretation of the sub-phrases.
> 
> But it isn't compositional (yet) because you haven't defined what
> the application of 3-tuples means.

OK, sorry for not clarifying.  Would you agree it's compositional now?

Maybe I should clarify about SICP.  They stress: to evaluate a
combination, first evaluate the arguments, and the 1st must return a
procedure value (one of Shriram's triples), and then make a new
environment...  Sounds like compositionality to me.

> My Y is curried.

Thanks!  I'll hafta think about that. 

Now Felix: I'm not claiming that my (or Shriram's) semantic function
is better in some calculational way.  I'm in fact clueless about the
value of DS, except for what Shriram wrote in PLAI:

   It would be convenient to have some common language for explaining
   interpreters.  We already have one: math!

I'm just thinking of DS/math as a way to clearly talk about
interpreters.  I'm not a great programmer.  I got into this because I
couldn't understand the text of R5RS, which they describe as the
"informal semantics", so I read R5RS DS to clarify, and it worked, but
I had huge misunderstandings about the Math...
0
richter1974 (312)
6/16/2004 9:04:45 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responds to me:
>
>> Because the lambda expression means a 3-tuple and you have not
>> provided a semantics for applying 3-tuples to arguments.
>
> Shriram wrote down the `semantics for applying 3-tuples to arguments'
> in PLAI.  It's the SICP rule, essentially.  That's the reduction rule
> you posted about.  But yeah, we have to provide such semantics!

Unfortunately, Shriram's rule for reducing lambda expressions does not
give the semantics for applying 3-tuples.  Instead, it gives us a
rewrite rule that allows us to change the application of a 3-tuple
into something else.  But I can't use a rewrite rule as a definition
of a valuation function without identifying the rewrite operation as
being semantically meaningful in the valuation space. 

>> > Why don't I also point out that my Shriram V semantic function is 
>> > compositional by the definition in "Extensible Denotational", p 6:
>> >
>> > [The map from syntactic domains to semantic domains] satisfies the law
>> > of compositionality: the interpretation of a phrase is a function of
>> > the interpretation of the sub-phrases.
>> 
>> But it isn't compositional (yet) because you haven't defined what
>> the application of 3-tuples means.
>
> OK, sorry for not clarifying.  Would you agree it's compositional now?

No. 

This would be compositional:
   V [[ ((lambda (x) x) 3) ]] = V[[ (lambda (x) x) ]] (V[[3]])

But you aren't defining V[[ (lambda (x) x) ]], you are applying the
operational steps on the lambda expression:

    OpS (OpS (lambda (x) x), OpS(3))

with the hope that
   
    V[[  OpS (OpS (lambda (x) x), OpS(3)) ]] =  V[[ (lambda (x) x) ]] (V[[3]])

But you have given no justification for asserting this.

> Maybe I should clarify about SICP.  They stress: to evaluate a
> combination, first evaluate the arguments, and the 1st must return a
> procedure value (one of Shriram's triple), and then make a new
> environment...  Sounds like compositionality to me.

Yes, a finite string of Operational semantics steps compose.  That
isn't a valuation function.
0
jrm (1310)
6/16/2004 9:33:30 PM
In article <uekoftp5b.fsf@STRIPCAPStelus.net>,
Ray Blaak  <rAYblaaK@STRIPCAPStelus.net> wrote:
> Math is a system of symbols and axioms and rules for evaluating them, quite
> like programs, really.

This is a formalist view, and probably doesn't reflect the attitude of
the majority of mathematicians, although it may come naturally to a
computer scientist (I know it comes naturally to me).

I think the prevailing view is that mathematical objects exist in some
platonic realm quite independently of any formal systems, and
mathematical truths are quite independent of the derivability of
anything (excepting, of course, those mathematical propositions that
explicitly concern derivability). We are supposed to have an intuitive
understanding of these mathematical objects and a formal system can
then be judged by how well it captures this intuition of ours.

In this context, the "meaning" that DS assigns to the Scheme program
(+ 2 2) is not simply a numeral "4", but actually "four", or just
_four_, fourness itself. What this _really_ means is then a question
for philosophers. Certainly, as Benacerraf has argued, it cannot mean
simply {{},{{}},{{},{{}}},{{},{{}},{{},{{}}}}} (here meaning the set
which that expression denotes, not the expression itself as a
syntactic object).

In any case, even though there are philosophical problems about the
meaning of math, at least DS pushes the problem of interpretation to
the philosophers, and off the computer scientists' shoulders. :)


Lauri Alanko
la@iki.fi
0
la (473)
6/17/2004 1:56:57 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

> >> > Why don't I also point out that my Shriram V semantic function is 
> >> > compositional by the definition in "Extensible Denotational", p 6:
> >> 
> >> But it isn't compositional (yet) because you haven't defined what
> >> the application of 3-tuples means.
> >
> > OK, sorry for not clarifying.  Would you agree it's compositional now?
> 
> No.
> 
> This would be compositional:
>    V [[ ((lambda (x) x) 3) ]] = V[[ (lambda (x) x) ]] (V[[3]])

Ah, that looks like exactly our problem, Joe!  I think I owe you an
apology for suggesting you were dragging culture into this.  I see now
why you insisted that the meaning of a lambda-exp is a function.

You don't quite have the definition of compositionality.  What you
wrote is an example of compositionality, but more generally, it's that
there's a function Phi s.t. 

V [[ ((lambda (x) x) 3) ]] = Phi(V[[ (lambda (x) x) ]], V[[3]])

Your Phi is Phi(f, x) = f(x), which is fine, but that's not the only
one.  As I posted here on 26 Jun 2002 (some editing):

   As Cartwright and Felleisen's "Extensible Denotational" paper says:

    [The map from syntactic domains to semantic domains] satisfies the
    law of compositionality: the interpretation of a phrase is a
    function of the interpretation of the sub-phrases.

   This means that for a DS valuation function 

   curly-E:  Expressions ---> M

   compositionality means e.g. there must be a function 

   Phi: M x M ---> M

   such that for a pair of expressions (X, Y), 

   curly-E[[ (X Y) ]] = Phi( curly-E[[ X ]], curly-E[[ Y ]] )

I don't see any trouble defining Phi, and I think I did so 2 years
ago.  You just read it off SICP.  And if we unconflate Store & Env, 
it's not SICP, it's R5RS, 4.1.4 Procedures:

   Semantics: A lambda expression evaluates to a procedure. The
   environment in effect when the lambda expression was evaluated is
   remembered as part of the procedure. When the procedure is later
   called with some actual arguments, the environment in which the
   lambda expression was evaluated will be extended by binding the
   variables in the formal argument list to fresh locations, the
   corresponding actual argument values will be stored in those
   locations, and the expressions in the body of the lambda expression
   will be evaluated sequentially in the extended environment. 

It's easy to translate that to a Phi function for a DS function

curly-E: Expressions ---> (Env x Store -> Values x Store) 

just like the (very similar) SICP prose yields a Phi for Shriram's

V: Expressions -> (Env -> Values)

Thus Shriram's V (obtained from his actual big-step OpS with a little
induction) is a compositional DS semantic function. 
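
Spelled out in store-passing Scheme (my own sketch: one-argument
procedures, no Cont, environments mapping names to locations and
stores mapping locations to values, both as assoc lists), the R5RS
prose above becomes a Phi like this:

    ;; A meaning is a function: env, store -> (value . store).
    ;; A procedure value remembers its defining environment.
    (define (lambda-meaning x body-meaning)
      (lambda (env store)
        (cons (list x body-meaning env) store)))

    (define fresh-loc
      (let ((n 0)) (lambda () (set! n (+ n 1)) n)))

    ;; Phi for application, per 4.1.4: evaluate the operator, then the
    ;; operand, bind the variable to a fresh location holding the
    ;; argument, and evaluate the body in the extended environment.
    (define (phi rator-meaning rand-meaning)
      (lambda (env store)
        (let* ((r1 (rator-meaning env store))
               (proc (car r1))                        ; (x body-meaning env')
               (r2 (rand-meaning env (cdr r1)))
               (loc (fresh-loc)))
          ((cadr proc)
           (cons (cons (car proc) loc) (caddr proc))  ; env'[x -> loc]
           (cons (cons loc (car r2)) (cdr r2))))))    ; store[loc -> arg]

    ;; Variable references go through the location:
    (define (var-meaning x)
      (lambda (env store)
        (cons (cdr (assv (cdr (assq x env)) store)) store)))

    ;; ((phi (lambda-meaning 'x (var-meaning 'x))
    ;;       (lambda (env store) (cons 42 store)))   ; meaning of `42'
    ;;  '() '())
    ;; => (42 . store) with store = ((1 . 42))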

It's exciting to think we might settle this old argument!  And getting
back to Felix, my interest here is just that overly arcane math makes
for a bad `common language', per Shriram's great PLAI slogan:

   It would be convenient to have some common language for explaining
   interpreters.  We already have one: math!
0
richter1974 (312)
6/17/2004 2:06:49 AM
Ray Blaak  responds to pnkfelix@gmail.com (Felix Klock):

> > But in Denotational Semantics, once I get the element of the
> > "Meaning-Set" as you call it, I'm done.  That's it.  There's no
> > more running it in an interpreter.  The thing I get back will be a
> > function; a real mathematical function.  I can't change what the
> > factorial or fibonacci functions *are*.

Felix is accurately expressing the usual practice in DS, I think, but
there's no actual requirement to get `a real mathmatical function'.

> Math is a system of symbols and axioms and rules for evaluating
> them, quite like programs, really.  It is just that people have
> internalized math quite well over the years, and so there is
> immediate recognition and agreement as to the meaning of things.

There's an important difference, Ray!  Math is much more powerful!
That is, the `math' that `people have internalized' includes powerful
axioms that allow you to, say, construct functions that you can't write
programs for.  I like your `immediate recognition and agreement' :D
0
richter1974 (312)
6/17/2004 2:17:54 AM
richter@math.northwestern.edu (Bill Richter) writes:

> It's exciting to think we might settle this old argument!  And getting
> back to Felix, my interest here is just that overly arcane math makes
> for a bad `common language', per Shriram's great PLAI slogan:
> 
>    It would be convenient to have some common language for explaining
>    interpreters.  We already have one: math!

If only I'd known my throw-away comment would be quoted this many
times and in this context, I'd never have written it.  (As I'm sure
Dave Schmidt regrets not paying more attention to his
introduction.)

Bill, I'm going to rewrite my text to say that a semantics is not
denotational unless it maps lambda to an actual mathematical function.
A mapping of lambda to tuples, or to any other kind of structure, is
not a denotational semantics.  Then I'm going to quote the book ad
infinitum in response to your posts.

Shriram
0
sk1 (223)
6/17/2004 2:29:41 AM
In article <57189ce0.0406161806.431c6a4@posting.google.com>,
Bill Richter <richter@math.northwestern.edu> wrote:
> just like the (very similar) SICP prose yields a Phi for Shriram's
> 
> V: Expressions -> (Env -> Values)
> 
> Thus Shriram's V (obtained from his actual big-step OpS with a little
> induction) is a compositional DS semantic function. 

It is easy to say "it is easy to see that...".  Since we're such
skeptics here, it would be easier for you to simply provide the
desired function immediately, since someone is going to ask for it anyway.

Now, as to why I for one am so skeptical, lessee... In Shriram's
manuscript on p. 173 we see a rule:

f,e => <i,b,e'>    a,e => a_v    b,e'[i<-a_v] => b_v
---------------------------------------------------
                   {f a},e => b_v

Now, if I try to turn this straightforwardly into a meaning function
like you suggest, I end up with something like this:

E[{f a}] e = let <i,b,e'> = E[f] e
              in E[b] e'[i<-(E[a] e)]
                 ^^^^
See the problem?

If you have a _compositional_ meaning definition derived from the
above big-step rule, I'm sure everyone would be interested in seeing
it. Especially if you won't use any recursive domains since those
would bring along the host of problems that require all those hairy
CPOs to solve, and you certainly don't need _those_, as you have
reiterated...
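
To make the snag concrete, here is a throwaway Scheme sketch of an
interpreter for that rule (my own invented representation: {fun i => b}
is (fun i b), and closures are lists (closure i b e)):

  (define (lookup e x) (cdr (assq x e)))
  (define (extend e x v) (cons (cons x v) e))

  (define (E exp e)
    (cond ((symbol? exp) (lookup e exp))          ; variable
          ((eq? (car exp) 'fun)                   ; {fun i => b}
           (list 'closure (cadr exp) (caddr exp) e))
          (else                                   ; {f a}
           (let* ((clo (E (car exp) e))           ; E[f] e: a subphrase, fine
                  (av  (E (cadr exp) e))          ; E[a] e: a subphrase, fine
                  (i   (cadr clo))
                  (b   (caddr clo))
                  (e2  (cadddr clo)))
             (E b (extend e2 i av))))))           ; b came out of a *value*,
                                                  ; not out of {f a}

The last call is the underlined E[b] above: recursion on (a piece of)
a value, not on the structure of the phrase being defined.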


Lauri Alanko
la@iki.fi
0
la (473)
6/17/2004 2:37:08 AM
Shriram Krishnamurthi <sk@cs.brown.edu> writes:

> richter@math.northwestern.edu (Bill Richter) writes:
> 
> > It's exciting to think we might settle this old argument!  And getting
> > back to Felix, my interest here is just that overly arcane math makes
> > for a bad `common language', per Shriram's great PLAI slogan:
> > 
> >    It would be convenient to have some common language for explaining
> >    interpreters.  We already have one: math!
> 
> If only I'd known my throw-away comment would be quoted this many
> times and in this context, I'd never have written it.  (As I'm sure
> Dave Schmidt might be regretting not paying more attention to his
> introduction, either.)
> 
> Bill, I'm going to rewrite my text to say that a semantics is not
> denotational unless it maps lambda to an actual mathematical function.

Uh, oh, Shriram.  Be veeeeery careful!

The first problem with your throw-away sentence (no, not the first
one, this one!) is that almost anything can formally be turned into a
function.  What you probably mean is that the denotation of a lambda
should be a function that, when applied to the denotation of an
argument, returns the denotation of the result.

But this stronger requirement would actually rule out some semantics
which are commonly considered denotational.

Matthias
0
find19 (1244)
6/17/2004 3:07:05 AM
Ray Blaak <rAYblaaK@STRIPCAPStelus.net> wrote in message news:<uekoftp5b.fsf@STRIPCAPStelus.net>...
> pnkfelix@gmail.com (Felix Klock) writes:
> > But in Denotational Semantics, once I get the element of the
> > "Meaning-Set" as you call it, I'm done.  That's it.  There's no more
> > running it in an interpreter.  The thing I get back will be a
> > function; a real mathematical function.  I can't change what the
> > factorial or fibonacci functions *are*.
> 
> This is my problem with DS: who decides the semantics of the "Meaning-Set"?
> Just math, we say. But that is just the original problem all over again. 
> 
> Math is a system of symbols and axioms and rules for evaluating them, quite
> like programs, really. It is just that people have internalized math quite
> well over the years, and so there is immediate recognition and agreement as to
> the meaning of things.
> 
> But ultimately the problem is the same: semantics comes about from how things
> are used, how they interact; in programming terms: how things are interpreted.

I agree with this.  I said I was biased.  I didn't say in which
direction (though I suspect my choice of challenge to Bill Richter
(developing a CFA) revealed which way I'm tilted).

But then again, I also said I was ignorant.  I've taken only one class
that tried to cover Den.Sem., so I don't consider myself an expert in
its utility.  Hopefully by next summer I'll have a greater
appreciation for the utility of Scott domains.  Or even Category
Theory!
0
pnkfelix (27)
6/17/2004 3:24:01 AM
Felix Klock wrote:
{stuff deleted}
> But then again, I also said I was ignorant.  I've taken only one class
> that tried to cover Den.Sem., so I don't consider myself an expert in
> its utility.  Hopefully by next summer I'll have a greater
> appreciation for the utility of Scott domains.  Or even Category
> Theory!

IMNHO the killer app for denotational techniques is when you want to prove 
rich semantic equivalences between terms in your language. Especially when 
the operational techniques are just too clunky. Unfortunately, for general 
programming languages there are few if any general semantic equivalences 
that your user may want to prove.

But imagine giving a denotational semantics to some interesting thing like a 
declarative language for representing resolution independent pictures... 
(i.e. SVG, Postscript, FLASH...) The Haskell School of Expression by Hudak 
has lots of good examples of denotational reasoning of this flavor.

You would rather describe a circle as the set of points equidistant from a 
point than with a particular algorithm used to rasterize it.
Especially if you want to prove you can collapse several coordinate 
transformations into one transformation matrix.  In this area the 
mathematical denotation of the object is the most natural way of thinking 
about it.
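
For instance (a rough sketch with invented names, using filled disks
for simplicity), denote a picture by its characteristic function from
points to booleans:

  ;; A picture denotes a set of points: (picture x y) => #t or #f.
  (define (disk cx cy r)
    (lambda (x y)
      (<= (+ (* (- x cx) (- x cx)) (* (- y cy) (- y cy)))
          (* r r))))

  (define (union p1 p2) (lambda (x y) (or (p1 x y) (p2 x y))))

  (define (translate p dx dy)
    (lambda (x y) (p (- x dx) (- y dy))))

Then an equivalence like

  (translate (translate p 1 0) 2 0)  =  (translate p 3 0)

is an ordinary fact about functions, provable without mentioning any
rasterizer.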

For programs, most of the time, I feel reasoning about their operational 
behavior is unfortunately more useful/natural/intuitive. So I think 
the denotational approach and Scott domains are a bit of overkill for 
most things.

For example it is nice to know that the iterative version of fib is in some 
way semantically the same function as the naive exponential one. However, 
most people also care that they are different in the sense that the 
iterative version is more efficient. Programming requires reasoning about 
correctness and operational behavior. I do not think you can realistically 
ignore one or the other!

To be fair, I think the operational techniques have become more common 
because people have given up on reasoning about correctness, so the 
operational techniques are just useful enough to prove the weaker theorems 
people are interested in these days. One day I hope, maybe people will start 
worrying about correctness and Scott domains will be in vogue again.

0
danwang742 (171)
6/17/2004 5:15:47 AM
richter@math.northwestern.edu (Bill Richter) wrote in message news:<57189ce0.0406161817.5689451f@posting.google.com>...
> Ray Blaak  responds to pnkfelix@gmail.com (Felix Klock):
> 
> > > But in Denotational Semantics, once I get the element of the
> > > "Meaning-Set" as you call it, I'm done.  That's it.  There's no
> > > more running it in an interpreter.  The thing I get back will be a
> > > function; a real mathematical function.  I can't change what the
> > > factorial or fibonacci functions *are*.
> 
> Felix is accurately expressing the usual practice in DS, I think, but
> there's no actual requirement to get `a real mathmatical function'.

Okay, I think I understand now.

Bill Richter wants to give us a different kind of semantics... let's
not call it "Denotational Semantics", since that is easily confused
with the common (but clearly pointless) practice of making the range
of the valuation function be "real math stuff"...

Let's call it "A New Kind of Semantics", or NKS for short.

And this revolutionary NKS will have the ease of use of Denotational
Semantics, while providing the awesome power of reasoning in an
Operational Semantics.

After all, the V for an ideal NKS would clearly have properties like:

V[[ (lambda (x) ((lambda (y) (- x y)) 3)) ]]  !=  V[[ (lambda (x) (- x 3)) ]]

and in NKS, we get exactly that, since these two applications of V
yield totally different tuples!  Wonderful!

</sarcasm>

Bill, please please please go try to develop a static analysis of some
sort.  Yes, I read that your background is in math and not in computer
science, but that's simply an unacceptable excuse.  We need people
with a math background in this area; there's too many compiler hackers
out there WITHOUT a reasonable background in math.  Just write a
CFA-0, it shouldn't take more than a month.  And I bet you'll find
many knowledgeable folk here to help you (that is, I bet there's lots
more CFA experts reading comp.lang.scheme than NKS experts).

Until you try to apply some of the things you've learned to a concrete
problem (some problem where you have the potential for automated
testing; and posting formulae to comp.lang.scheme does not count as
"automated testing"), I fear you're never going to get an insight for
why Computer Scientists have chosen particular frameworks for
particular tasks, and why theorems aren't freely interchangeable
between frameworks.

-Felix

p.s. if I misinterpreted what V[[ - ]] would do on the expressions
above, I apologize.  However, please do not interpret this apology as
a request for an explanation for what your V[[ - ]] would yield nor
how it would do so.
0
pnkfelix (27)
6/17/2004 6:12:29 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>> 
>> This would be compositional:
>>    V [[ ((lambda (x) x) 3) ]] = V[[ (lambda (x) x) ]] (V[[3]])
>
> Ah, that looks like exactly our problem, Joe!  I think I owe you an
> apology for suggesting you were dragging culture into this.  I see now
> why you insisted that the meaning of a lambda-exp is a function.
>
> You don't quite have the definition of compositionality.  What you
> wrote is an example of compositionality, but more generally, it's that
> there's a function Phi s.t. 
>
> V [[ ((lambda (x) x) 3) ]] = Phi(V[[ (lambda (x) x) ]], V[[3]])
>
> Your Phi is Phi(f, x) = f(x), which is fine, but that's not the only
> one.  As I posted here on 26 Jun 2002 (some editing):
>
>    As Cartwright and Felleisen's "Extensible Denotational" paper says:
>
>     [The map from syntactic domains to semantic domains] satisfies the
>     law of compositionality: the interpretation of a phrase is a
>     function of the interpretation of the sub-phrases.
>
>    This means that for a DS valuation function 
>
>    curly-E:  Expressions ---> M
>
>    compositionality means e.g. there must be a function 
>
>    Phi: M x M ---> M
>
>    such that for a pair of expressions (X, Y), 
>
>    curly-E[[ (X Y) ]] = Phi( curly-E[[ X ]], curly-E[[ Y ]] )
>
> I don't see any trouble defining Phi, and I think I did so 2 years
> ago.  

The appropriate Phi for curly-E is not going to work for OpS, so I'll
call your putative Phi `Phi1':

So you are at this stage:
    V[[ ((lambda (x) x) 3) ]] = Phi1 (V[[ (lambda (x) x) ]], V[[3]])

Now I assume you wish your semantics to agree with the R5RS
semantics, so you expect this to hold:

Phi1 (<a 3-tuple>, <the number 3>) = Phi (<a function>, <the number 3>)

Or, since Phi (x, y) is defined as x(y), 

  Phi1 (<a 3-tuple>, <the number 3>) = <a function> (<the number 3>)

So Phi1 must map 3-tuples to functions.  That is, 

  Phi1 (x, y) = Phi (3-tuple->function (x), y)

But you have not demonstrated that
    3-tuple->function is definable over all (or almost all) 3-tuples,
    or that 3-tuple->function exists for all (or almost all) appropriate 3-tuples, 
    or that 3-tuple->function is unique for those.
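
(The obvious candidate, sketched in Scheme with invented names, only
underlines the problem: it cannot be written down without mentioning
the valuation function.  Here V and extend are assumed from the
discussion, and a tuple is the list (i b e'):

  (define (make-tuple->function V extend)
    (lambda (tuple)
      (let ((i  (car tuple))
            (b  (cadr tuple))
            (e2 (caddr tuple)))
        (lambda (arg-value)
          ((V b) (extend e2 i arg-value))))))

So 3-tuple->function is parasitic on the very V whose existence is at
issue.)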

> It's easy to translate that to a Phi function for a DS function
>
> curly-E: Expressions ---> (Env x Store -> Values x Store) 
>
> just like the (very similar) SICP prose yields a Phi for Shriram's
>
> V: Expressions -> (Env -> Values)
>
> Thus Shriram's V (obtained from his actual big-step OpS with a little
> induction) is a compositional DS semantic function. 

Woah, there's a lot of induction.  If you expand out the Y operator
example you'll find more induction than you can shake a stick at!  You
haven't shown that this induction will work.

-- 
~jrm
0
6/17/2004 11:52:50 AM
Matthias Blume <find@my.address.elsewhere> writes:

> But this stronger requirement would actually rule out some semantics
> which are commonly considered denotational.

It would rule out decision-tree representations, some game-theoretic
models, and so on.  But then Richter would be writing the authors of
those papers demanding to know why their purported "denotational"
semantics don't match the throw-away wording in my book, no?

Shriram
0
sk1 (223)
6/17/2004 12:30:07 PM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> To be fair I think, the operational techniques have become more common
> because people have given up on reasoning about correctness, so the
> operational techniques are just useful enough to prove the weaker
> theorems people are interested in these days. One day I hope, maybe
> people will start worrying about corecntess and Scott domains will be
> in vogue again.

You can reason about correctness using other models, eg, Kripke
structures and temporal logic.  Some of us do this all the time.
Correctness is not out of vogue, only this technique is.

Yes, you're missing a step of proof that the Kripke structure you
extract from a program corresponds to the program's "real" (ie,
denotational) meaning.  But the extraction process is usually quite
straight-forward (not much different from, and often reusing, what a
compiler does), and lots of weaker theorems have been proven about it
to give the user additional confidence.

Shriram
0
sk1 (223)
6/17/2004 12:44:20 PM
Lauri Alanko <la@iki.fi> writes:

[a nice response]

> I think the prevailing view is that mathematical objects exist in some
> platonic realm quite independently of any formal systems, and
> mathematical truths are quite independent of the derivability of
> anything (excepting, of course, those mathematical propositions that
> explicitly concern derivability). We are supposed to have an intuitive
> understanding of these mathematical objects and a formal system can
> then be judged by how well it captures this intuition of ours.

I don't strictly believe this myself, but have no real problem with this
view. Certainly I use my intuitive understanding of things to guide me.

My take on it is that ultimately it doesn't matter.  Our attempts to refer to
such platonic objects run into the limitations of our notations and reasoning
abilities, meaning that even if there is a "true" semantics to be found, we
are not sure if we are achieving it.

Our practical semantics, then, is the result of how things interact, which
hopefully for our intuition would tend to correspond to what we believe the
"true" semantics are like.

> In this context, the "meaning" that DS assigns to the Scheme program
> (+ 2 2) is not simply a numeral "4", but actually "four", or just
> _four_, fourness itself. What this _really_ means is then a question
> for philosophers. Certainly, as Benacerraf has argued, it cannot mean
> simply {{},{{}},{{},{{}}},{{},{{}},{{},{{}}}}} (here meaning the set
> which that expression denotes, not the expression itself as a
> syntactic object).

I see at least 6 ways to refer to the notion of "four" here. Which is right?
Are they all different, equivalent, or just sometimes?

Is the platonic set you are attempting to refer to the same as the platonic 4?

I think it all depends on context and how things are used. E.g. how things are
interpreted.

> In any case, even though there are philosophical problems about the
> meaning of math, at least DS pushes the problem of interpretation to
> the philosophers, and out from the computer scientists' shoulders. :)

I don't know if pushing the problem to the philosophers really solves
anything; they tend to argue a lot :-).

I do agree that it is useful to be able to push a problem to another standard
one, since that allows us to understand how problems can relate to each other,
and to determine if problems are equivalent. So, yes, I can see the usefulness
of DS in that respect.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYblaaK@STRIPCAPStelus.net                    The Rhythm has my soul.
0
rAYblaaK (362)
6/17/2004 6:20:54 PM
Shriram Krishnamurthi <sk@cs.brown.edu> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> > But this stronger requirement would actually rule out some semantics
> > which are commonly considered denotational.
> 
> It would rule out decision-tree representations, some game-theoretic
> models, and so on.

Exactly.

>  But then Richter would be writing the authors of
> those papers demanding to know why their purported "denotational"
> semantics don't match the throw-away wording in my book, no?

I must have missed a smiley (actually present or implied), because I
don't know what you are trying to say.  (Your sentence supports the
idea of requiring denotations of lambdas to be functions HOW?)

Matthias
0
find19 (1244)
6/17/2004 7:43:07 PM
Shriram Krishnamurthi wrote:
{stuff deleted}
> You can reason about correctness using other models, eg, Kripke
> structures and temporal logic.  Some of us do this all the time.
> Correctness is not out of vogue, only this technique is.
> 
> Yes, you're missing a step of proof that the Kripke structure you
> extract from a program corresponds to the program's "real" (ie,
> denotational) meaning.  But the extraction process is usually quite
> straight-forward (not much different from, and often reusing, what a
> compiler does), and lots of weaker theorems have been proven about it
> to give the user additional confidence.

Having spent the last several months worrying about proofs at an 
unbelievable level of pedantry and annoying detail, I find it absolutely 
disturbing to "leave a step out" and claim to have a proof. Perhaps, you can 
build Kripke structures directly from an operational semantics of a language 
and show they are some how sound, but I suspect a DS version would be a bit 
more pleasant.

I would also note that if you wanted to actually mathematically prove that 
what goes on in a compiler is sound, a DS with Scott models might be the most 
obvious way to go about things.  Of course not many people actually have the 
time to mathematically prove that what their compiler does is sound.

On that related note, does anyone happen to have a pointer to a proof that 
shows the CPS transformation is semantics preserving? Was such a proof done 
via a DS or operational approach? I imagine such a proof could be carried 
out using either technique; I'm just curious how it actually was carried out.
0
danwang742 (171)
6/17/2004 7:47:25 PM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> Having spent the last several months worrying about proofs at an
> unbelievable level of pedantry and annoying detail, I find it
> absolutely disturbing to "leave a step out" and claim to have a
> proof. 

I'm amused that you would be "absolutely disturbed" about something
that I didn't say.  I said that the proofs are about the Kripke
structures, and that we can have "confidence" in the relationship
between the structure and the program.  I chose my words carefully.

Given that "verification" is mostly about debugging, not about proving
correctness, this is not only useful, but often about as much as you
can achieve.  (Did the pedantry and annoying detail you dealt with
take into account quantum effects, hardware errors, etc?  If not,
should we be absolutely disturbed that you assumed these away?)

Shriram
0
sk1 (223)
6/18/2004 12:12:46 AM
Shriram Krishnamurthi wrote:
{stuff deleted}
> I'm amused that you would be "absolutely disturbed" about something
> that I didn't say.  I said that the proofs are about the Kripke
> structures, and that we can have "confidence" in the relationship
> between the structure and the program.  I chose my words carefully.

I don't want to nitpick, but reading your original text carefully still 
makes it seem ambiguous as to what you actually claimed.

> Given that "verification" is mostly about debugging, not about proving
> correctness, this is not only useful, but often about as much as you
> can achieve.  (Did the pedantry and annoying detail you dealt with
> take into account quantum effects, hardware errors, etc?  If not,
> should we be absolutely disturbed that you assumed these away?)

The issue is making a precise statement about what assumptions are made and 
understanding how interesting the conclusion is with respect to the 
assumptions.  The right assumptions will allow you to prove anything.

If you are abstracting the program semantics to prove a property about the 
program, one definitely should have a proof that the abstraction is sound 
with respect to the program semantics. If however, you are merely saying 
that you have a proof about your putative abstraction and make no formal 
claims about program correctness, then fine. But in that case I do not know 
what you mean by "you can reason about correctness ...." if you don't 
establish the link between your abstraction and the real semantics.

Perhaps I'm misreading "reason" as "prove" rather than "have confidence 
about". In any case, I have confidence about many things that are totally 
wrong.
0
danwang742 (171)
6/18/2004 2:37:29 AM
Lauri Alanko <la@iki.fi> responded to me:

> E[{f a}] e = let <i,b,e'> = E[f] e
>               in E[b] e'[i<-(E[a] e)]
>                  ^^^^
> See the problem?

Lauri, I don't quite follow you, but I do see a problem!  Thanks!
It's maybe impossible to show compositionality for my SICP-like

E: Expressions -> (Env -> Values)

because evaluating the 1st argument in {f a} changes the env/store.
So let's unconflate Env & Store, to get a DS semantic function

E: Expressions ---> (Env x Store -> Value x Store) 

of the sort discussed on page 2 of Cartwright & Felleisen "Extensible
Denotational" paper.  My E uses left->right eval order, like mzscheme.

Let's translate Shriram's OpS example as this.  We'll define

E[{f a}](e, s) = (b_v, s3), if

                  E[f](e, s) = (<i,b,e'>, s1)  

                  E[a](e, s1) = (a_v, s2)

                  l = (new s2)      
  
                  E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and

E[{f a}](e, s) = bottom,  otherwise.

My R5RS-like `new' produces a fresh location, i.e. one sent to bottom
by s2 in Store = (Location -> Value).

Was that your point, that I needed to unconflate to define E?  Can you
check my stores?  I stared at & fiddled with them for a long time, as
my State semantics is rusty.  I've been coding just Lambda alg FP.

Now I'll answer Joe's question, and define (this different!) Phi,

Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)  

                                   -> (Env x Store -> Value x Store) 
I claim that 

E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store) 

That's kinda complicated, but it's easy for schemers, who are excellent
at functions that take functions as arguments etc.

Given alpha, beta in (Env x Store -> Value x Store), we define

Phi(alpha, beta)(e, s) = (b_v, s3), if 

			 alpha(e, s) = (<i,b,e'>, s1)  

			 beta(e, s1) = (a_v, s2)

			 l = (new s2)      
  
			 E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and

Phi(alpha, beta)(e, s) = bottom, otherwise.

As you see, it's a straightforward translation of the eval rule.
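
In Scheme clothing the same Phi reads as follows (just a sketch: E,
new, and the env/store extenders are assumed from above, and a
Value x Store result is represented as a pair):

  (define (Phi alpha beta)
    (lambda (e s)
      (let* ((r1  (alpha e s))            ; (<i,b,e'> . s1)
             (clo (car r1))
             (s1  (cdr r1))
             (i   (car clo))
             (b   (cadr clo))
             (e2  (caddr clo))
             (r2  (beta e s1))            ; (a_v . s2)
             (av  (car r2))
             (s2  (cdr r2))
             (l   (new s2)))
        ((E b) (extend-env e2 i l)
               (extend-store s2 l av))))) ; note: E appears free here

The "otherwise bottom" clause is modeled by the sketch simply failing
(or not terminating) when alpha doesn't produce a closure.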

> If you have a _compositional_ meaning definition derived from the
> above big-step rule, I'm sure everyone would be interested in seeing
> it. 

I think I posted my Phi 2+ years ago, and nobody said it was false, or
interesting, but only that I had violated a rule, as my Phi used E.

I said no DS author insists that Phi be defined independently of E,
and we can't mathematize such a notion; and given such a Phi depending
on E, we can then re-define E by structural induction, so this rule is
even satisfied!  I didn't get a response, but folks were worn out.

As to the rest of the traffic: I think of Shriram as sortuva friend,
he tried to find me a job once, & I don't get riled at his (quite
funny!)  sarcasm.  PLAI (like HtDP) looks like a great book.  Nice to
see some mild support from MB, who I learned quite a lot from 2+ yrs
ago.  Felix: I'd be happy to know something useful about DS, perhaps
you could post something about CFA, or give a link, but what I'm
saying is really simple & really worth understanding, I think.
0
richter1974 (312)
6/18/2004 3:14:42 AM
In article <57189ce0.0406171914.4d5b1932@posting.google.com>,
Bill Richter <richter@math.northwestern.edu> wrote:
> I claim that 
> 
> E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store) 

> Given alpha, beta in (Env x Store -> Value x Store), we define
> 
> Phi(alpha, beta)(e, s) = (b_v, s3), if 
> 			 alpha(e, s) = (<i,b,e'>, s1)  
> 			 beta(e, s1) = (a_v, s2)
> 			 l = (new s2)      
> 			 E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
> 
> Phi(alpha, beta)(e, s) = bottom, otherwise.
> 
> As you see, it's a straightforward translation of the eval rule.

This is not a definition. This is a constraint, an equation that may
or may not hold for some functions E. You yet need to construct some
function that satisfies this equation. Well, all right, _that_
is easy: just set Phi(alpha,beta)(e, s) = bottom. Or you can choose
any of an infinite number of other functions. But only a couple of
them are ones that we are _interested_ in (in the sense that bottom
accurately corresponds with nontermination). And you haven't shown
which ones are those. Scott has.

This is a recursive equation, and the usual way of dealing with those
is to get the least fixed point, but for that you need to have an
_ordering_ and that gets you again into the world of CPOs.

Incidentally, I don't think it's necessary to add stores (or
continuations) to this discussion. They just add complexity without
bringing any new insight.

> I think I posted my Phi 2+ years ago, and nobody said it was false, or
> interesting, but only that I had violated a rule, as my Phi used E.

Indeed. That's what makes it recursive, and that's why it's not a
definition.

> I said no DS author insists that Phi be defined independently of E,
> and we can't mathematize such a notion,

We don't _need_ to mathematize such a notion, because in math things
are by default defined independently of themselves. It's _recursive_
definitions that need to be mathematized.

I think usually the authors take for granted that the reader knows
that when making a definition you can't refer, even indirectly, to the
thing that you are defining.

In Schmidt, p. 51, the existence of the meaning functions is proven by
structural induction. That's what "compositional" means: defined with
structural induction. Your "definition" of E is recursive, so you have
to find out some other way of specifying exactly which function you
are talking about. You haven't done this.

> and given such a Phi depending on E, and we then re-define E by
> structural induction, so this rule is even satisfied!

By structural induction on what? As you have been told, you no longer
have a base case.

You have been given the omega example a gazillion times. Consider. Let's
find, using your technique, the value of:

E[{{fun x => {x x}} {fun x => {x x}}}](e0)

Now, the start is easy. I'll leave out the store. e0 stands for the
empty environment.

E[{fun x => {x x}}](e0) = <x,{x x},e0>

So, by your definition (simplified without the stores), the result is now
E[{x x}](e1)   (where e1 = e0[x-><x,{x x},e0>])

So, we get

E[{x}](e1) = <x,{x x},e0>

and therefore 

E[{x x}](e1) = E[{x x}](e1)

You call that a definition?

Now, as far as constraints go, that one is trivial. What all this
means is that your "definition" is satisfied by e.g. the meaning
function that assigns the value 42 to this example. But we don't want
that. We want _bottom_. And not just because we are being difficult,
but because when I say ((lambda (x) (x x)) (lambda (x) (x x))) to an
interpreter, it _doesn't terminate_!


Lauri Alanko
la@iki.fi
0
la (473)
6/18/2004 9:37:08 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Now I'll answer Joe's question, and define (this different!) Phi,
>
> Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)  
>
>                                    -> (Env x Store -> Value x Store) 

Modeling the store is unnecessary to my argument (as you will run into
trouble long before you want to introduce side effects).  Let's keep
it simple:

   Phi: (Env -> Value) x (Env -> Value) -> (Env -> Value)

> I claim that 
>
> E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store) 
>
> That's kinda complicated, but its easy for schemers, who are excellent
> at functions that take functions as arguments etc.

So if you have a Scheme form `(f a)', you claim that the denotation
may be derived as follows:

   V [[ f ]] = <a 3-tuple of args, body, and environment>

   V[[ (f a) ]] = Phi( V[[ f ]], V [[ a ]])


> Given alpha, beta in (Env x Store -> Value x Store), we define
>
> Phi(alpha, beta)(e, s) = (b_v, s3), if 
>
> 			 alpha(e, s) = (<i,b,e'>, s1)  
>
> 			 beta(e, s1) = (a_v, s2)
>
> 			 l = (new s2)      
>   
> 			 E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
>
> Phi(alpha, beta)(e, s) = bottom, otherwise.

Leaving out the store,

  Phi (alpha, beta) (e) =  b_v if

                           alpha (e) = <i, b, e'>

                           beta (e) = a_v

                           E[b](e'[i->a_v]) = b_v
   otherwise bottom.

Let's plug in our original equation:

  V [[ (f a) ]] = V[[ b ]](e'[i -> V[[ a ]]]) where

                  <i, b, e'> = V [[ f ]]

> As you see, it's a straightforward translation of the eval rule.

Right, but for one thing.

If you look at the semantic functions in section 7.2.3 in R5RS, you
will see that curly-E is defined over the recursive decomposition of
an expression.  Because expressions are finite, the denotation of an
expression can be given non-recursively.

Now look at your valuation function.  It too is defined recursively,
but the recursion is over the *body* of the function, not simply the
subform `f'.  Now suppose our function f is recursive factorial.
 
  V [[ (f a) ]] = 

      V[[ (if (zero? x) x (* x (fact (- x 1)))) ]](e'[i -> V[[ a ]]]) where

      <i, (if (zero? x) x (* x (fact (- x 1)))), e'> = V [[ f ]]

Since V is compositional, this term:

    V[[ (if (zero? x) x (* x (fact (- x 1)))) ]]

must be equal to some function (I'll call it phi2) of the valuation of
subterms:

   = phi2 (V [[ (zero? x) ]], V [[ x ]], V [[ (* x (fact (- x 1))) ]])

Since V is compositional, the third subterm must be equal to some
function (I'll call it phi3) of the valuation of *its* subterms:

  V [[ (* x (fact (- x 1))) ]]

    = phi3 (V [[ * ]], V [[ x ]], V [[ (fact (- x 1)) ]])

Since V is compositional, the last subterm must be equal to some
function (I'll call it phi4) of the valuation of *its* subterms:

  V [[ (fact (- x 1)) ]]

    = phi4 (V [[ fact ]], V [[ (- x 1) ]])

But wait a second.  Phi4 defines the composition necessary for
function application, so phi4 must be the same as the phi way up near
the top.  In fact, phi3 is the same kind of composition.  So by back
substitution, we find that V [[ (fact a) ]] is some composition (I'll
call this one phi6) of these terms:

   V [[ (fact a) ]] =
   
      phi6 (V [[ zero? ]],
            V [[ a ]],
            V [[ 1 ]],
            V [[ * ]],
            V [[ (fact a') ]] where a' = V [[ (- a 1) ]])

That last term, V [[ (fact a') ]], can be expanded with the above
equation: 
     
   V [[ (fact a) ]] =
   
      phi6 (V [[ zero? ]],
            V [[ a ]],
            V [[ 1 ]],
            V [[ * ]],
            phi6 (V [[ zero? ]],
                  V [[ a' ]],
                  V [[ 1 ]],
                  V [[ * ]],
                  V [[ (fact a'') ]] 
                  where a'' = V [[ (- a' 1) ]])
            where a' = V [[ (- a 1) ]])

That last term V [[ (fact a'') ]], can be expanded:

   V [[ (fact a) ]] =
   
      phi6 (V [[ zero? ]],
            V [[ a ]],
            V [[ 1 ]],
            V [[ * ]],
            phi6 (V [[ zero? ]],
                  V [[ a' ]],
                  V [[ 1 ]],
                  V [[ * ]],
                  phi6 (V [[ zero? ]],
                        V [[ a''' ]],
                        V [[ 1 ]],
                        V [[ * ]],
                        V [[ (fact a''') ]] 
                        where a''' = V [[ (- a'' 1) ]])
                  where a'' = V [[ (- a' 1) ]]) 
            where a' = V [[ (- a 1) ]])

As you can see, this diverges.  Your valuation function is not well
defined. 

> I think I posted my Phi 2+ years ago, and nobody said it was false, or
> interesting, but only that I had violated a rule, as my Phi used E.
>
> I said no DS author insists that Phi be defined independently of E,
> and we can't mathematize such a notion, and given such a Phi depending
> on E, and we then re-define E by structural induction, so this rule is
> even satisfied!  I didn't got a response, but folks were worn out.

It probably should have been explained more explicitly, but the
problem isn't that Phi can't use E, but that Phi's use of E must
converge.  This is easily satisfiable by compositional induction over
a finite expression, but not by compositional induction over the
*value* of the expression.
0
jrm (1310)
6/18/2004 3:20:51 PM
Lauri Alanko <la@iki.fi> responded to me:

> E[{x x}](e1) = E[{x x}](e1)
> 
> You call that a definition?

You caught me again!  Thanks.  And I'm very pleased to see that the
traffic today was just Math, from you & Joe.

I should've said I have some reduction rule, and if it doesn't return
a value in finite time, then we say

E[expr](e, s) = bottom

I'm drawing a blank on my reduction rule, so I'll think about it and
get back to you, after I unstick myself.  My State Semantics is rusty.

But Lauri, I take strong exception to other things you said, and I can
deal with these now.  The model for my (poorly defined) function

E: Expressions ---> (Env x Store -> Value x Store) 

is Felleisen & Flatt's Standard Reduction function eval_s, defined on
p. 51 in <http://www.ccs.neu.edu/course/com3357/mono.ps>.  

It seems to me that all the complaints you made about my E (other than
my goof!) apply to F&F's eval_s.  I'd say that their function 

eval_s : LC_v Expressions  ---> LC_v Values 

is well-defined and satisfies compositionality, even though they
don't do the various things you mention below.  Or Barendregt's
Standard Reduction function in LC, which is not I think
compositional. Can you think about that, while I fix my E?

> This is a recursive equation, and the usual way of dealing with those
> is to get the least fixed point, but for that you need to have an
> _ordering_ and that gets you again into the world of CPOs.

I say no.  Are you saying F-F & Barendregt needed CPOs?  It's just
induction.  Keep flailing away at the reduction rule until you terminate,
and if you don't, send it to bottom.  Just because this function is
(as you say) the solution of a fixed point equation doesn't mean you
have to consider all possible solutions of the fixed point equation.
You just inductively define the one solution you want, and you don't
need the CPOs, which tell you the desired solution is the minimal one!

> Incidentally, I don't think it's necessary to add stores (or
> continuations) to this discussion. They just add complexity without
> bringing any new insight.

You don't need stores to understand my goof!  Plus, I goofed in
another way.  The SICP conflation works fine if we tack on Env:

E: Expressions ---> (Env  -> Value x Env) 

> > I think I posted my Phi 2+ years ago, and nobody said it was
> > false, or interesting, but only that I had violated a rule, as my
> > Phi used E.
> 
> Indeed. That's what makes it recursive, and that's why it's not a
> definition.

I say no, but that's a good question.  I define E first by induction,
and then afterward, I defined Phi in terms of E.  I don't define E &
Phi by simultaneous recursion.

> In Schmidt, p. 51, the existence of the meaning functions is proven
> by structural induction. That's what "compositional" means: defined
> with structural induction. 

I say that's false, Lauri.  Can you give me an exact quote?  Here's my
quote again from Cartwright and Felleisen's "Extensible Denotational":

    [The map from syntactic domains to semantic domains] satisfies the
    law of compositionality: the interpretation of a phrase is a
    function of the interpretation of the sub-phrases.

It's just what I said: there must exist such functions like my (poorly
defined) Phi.  It doesn't matter how you construct the Phi functions.

So I assert that C-F & Schmidt do not make your definition, and
furthermore, it's impossible to make a mathematical formulation of
your definition of compositional.  Can you check?

> Your "definition" of E is recursive, so you have to find out some
> other way of specifying exactly which function you are talking
> about. You haven't done this.

Yeah, I sure haven't!  Many apologies.
 
> You have been given the omega example gazillion times. 

Thanks for giving it to me again.
0
richter1974 (312)
6/19/2004 3:08:41 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

> > I said no DS author insists that Phi be defined independently of
> > E, and we can't mathematize such a notion [...] 
> 
> It probably should have been explained more explicitly, but the
> problem isn't that Phi can't use E, but that Phi's use of E must
> converge.  This is easily satisfiable by compositional induction
> over a finite expression, but not by compositional induction over
> the *value* of the expression.

Let's go with that, Joe.  Thanks.  That's a nice "mathematical" thing
to say.  Now the burden is on me to define a converging E & Phi.

> > Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)  
> >
> >                                    -> (Env x Store -> Value x Store) 
> 
> Modeling the store is unnecessary to my argument (as you will run
> into trouble long before you want to introduce side effects).  Let's
> keep it simple:
> 
>    Phi: (Env -> Value) x (Env -> Value) -> (Env -> Value)

I failed to define such a Phi last night, and therefore switched back
to (Env x Store -> Value x Store).  The obvious problem is that
evaluating expressions can change the store/env with side effects.

But even with functional programs, you need side effects for the
recursion to make sense, at least in the R5RS way that I'm thinking.
We need define (which in R5RS DS is really set!), which I didn't
mention yet, but it's going to set up a binding

Ident -e'--> Locations -s--> Values 

fact ->  (e' fact) ->  <i, (if (zero? x) x (* x (fact (- x 1)))), e'>

and later we'll add the bindings (let's say a_v = 6)

i    ->    l       ->   6
i    ->    l'      ->   5
i    ->    l''     ->   4

Your diverging factorial computation didn't seem to use that binding
for fact.

> So if you have a Scheme form `(f a)', you claim that the denotation
> may be derived as follows:
> 
>    V [[ f ]] = <a 3-tuple of args, body, and environment>

That's not quite right, and your post yesterday had this trouble too.
You're reverting my E to

V: Expressions -> (Env -> Values)

so you needed to say 

V [[ f ]](this env) = <a 3-tuple of args, body, and environment>

Maybe that's what you mean, and that's probably fine for this
factorial computation.  But in general, I can't define Phi without V
being a function on all env's.  Let's do it my way:

E: Expressions -> (Env x Store -> Value x Store)

I can't define my Phi(alpha, beta)(e, s) with alpha & beta only
calling the initial argument (e, s).
0
richter1974 (312)
6/19/2004 4:36:08 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>
>> > I said no DS author insists that Phi be defined independently of
>> > E, and we can't mathematize such a notion [...] 
>> 
>> It probably should have been explained more explicitly, but the
>> problem isn't that Phi can't use E, but that Phi's use of E must
>> converge.  This is easily satisfiable by compositional induction
>> over a finite expression, but not by compositional induction over
>> the *value* of the expression.
>
> Let's go with that, Joe.  Thanks.  Thats' a nice "mathematical" thing
> to say.  Now the burden is on me to define a converging E & Phi.
>
>> > Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)  
>> >
>> >                                    -> (Env x Store -> Value x Store) 
>> 
>> Modeling the store is unnecessary to my argument (as you will run
>> into trouble long before you want to introduce side effects).  Let's
>> keep it simple:
>> 
>>    Phi: (Env -> Value) x (Env -> Value) -> (Env -> Value)
>
> I failed to define such a Phi last night, and therefore switched back
> to (Env x Store -> Value x Store).  The obvious problem is that
> evaluating expressions can change the store/env with side effects.
>
> But even with functional programs, you need side effects for the
> recursion to make sense, at least in the R5RS way that I'm thinking.

No, you don't.  That's why I used the Y operator.  You get recursion
out of self-application.
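
(For concreteness, one standard applicative-order Y in Scheme:

  (define Y
    (lambda (f)
      ((lambda (x) (f (lambda (v) ((x x) v))))
       (lambda (x) (f (lambda (v) ((x x) v)))))))

  (define fact
    (Y (lambda (fact)
         (lambda (n)
           (if (zero? n) 1 (* n (fact (- n 1))))))))

  ;; (fact 6) => 720

No define-as-set!, no store: the recursion comes entirely from x
applied to itself.)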

> We need define (which in R5RS DS is really set!), which I didn't
> mention yet, but it's going to set up a binding
>
> Ident -e'--> Locations -s--> Values 
>
> fact ->  (e' fact) ->  <i, (if (zero? x) x (* x (fact (- x 1)))), e'>
>
> and later we'll add the bindings (let's say a_v = 6)
>
> i    ->    l       ->   6
> i    ->    l'      ->   5
> i    ->    l''     ->   4
>
> Your diverging factorial computation didn't seem to use that binding
> for fact.

I hope you want your factorial to work for other values than 6.  But
let me point out that I'm *not* performing a factorial computation.
I'm performing an analysis of the factorial program and it is the
analysis that is diverging, not the program.  The fault lies in the
analysis, which is not sufficiently powerful to model recursive
programs.

Look what happens even if you do substitute in a value:

   V [[ (fact 2) ]] =
      phi6 (V [[ zero? ]],
            V [[ 2 ]],
            V [[ 1 ]],
            V [[ * ]],
            phi6 (V [[ zero? ]],
                  V [[ 1 ]],
                  V [[ 1 ]],
                  V [[ * ]],
                  phi6 (V [[ zero? ]],
                        V [[ 0 ]],
                        V [[ 1 ]],
                        V [[ * ]],
                        phi6 (V [[ zero? ]],
                              V [[ -1 ]],
                              V [[ 1 ]],
                              V [[ * ]],
                              phi6 (V [[ zero? ]],
                                    V [[ -2 ]],
                                    V [[ 1 ]],
                                    V [[ * ]],
                                    phi6 (V [[ zero? ]],
                                          V [[ -3 ]],
                                          V [[ 1 ]],
                                          V [[ * ]],
                                          ....

It still diverges.

Remember that phi6 is our compositional valuation function.  It does
not `know' that the meaning of the IF expression in factorial is
conditional on the value of the predicate.

>> So if you have a Scheme form `(f a)', you claim that the denotation
>> may be derived as follows:
>> 
>>    V [[ f ]] = <a 3-tuple of args, body, and environment>
>
> That's not quite right, and your post yesterday had this trouble too.
> You're reverting my E to
>
> V: Expressions -> (Env -> Values)
>
> so you needed to say 
>
> V [[ f ]](this env) = <a 3-tuple of args, body, and environment>
>
> Maybe that's what you mean, and that's probably fine for this
> factorial computation.  But in general, I can't define Phi without V
> being a function on all env's.  Let's do it my way:
>
> E: Expressions -> (Env x Store -> Value x Store)
>
> I can't define my Phi(alpha, beta)(e, s) with alpha & beta only
> calling the initial argument (e, s).

That's the problem.

-- 
~jrm
0
6/19/2004 5:13:27 PM
richter@math.northwestern.edu (Bill Richter) writes:

>> This is a recursive equation, and the usual way of dealing with those
>> is to get the least fixed point, but for that you need to have an
>> _ordering_ and that gets you again into the world of CPOs.
>
> I say no.  Are you saying F-F & Barendregt needed CPOs?  It's just
> induction.  

`Just' induction?

> Keep flailing away at the reduction rule until you terminate,
> and if you don't, send it to bottom.  Just because this function is
> (as you say) the solution of a fixed point equation doesn't mean you
> have to consider all possible solutions of the fixed point equation.

Actually, you do.  First of all, if the fixed point does not exist,
then the expression has no meaning.  Not `bottom', but no meaning
whatsoever.  Every time you flail away on the induction rule, you get
a different answer and it does not approach anything.

So you need to first determine that a fixed point exists.

But you may have more than one fixed point, too.  The identity
function has an infinite number of fixed points, and a lot of them are
wildly different.  You wouldn't want a semantics that said that
(factorial 6) is either 120 or 21302938 or "blue" or <HTML>.

If you have more than one fixed point, you need a mechanism by which
you can discriminate between them.

> You just inductively define the one solution you want, and you don't
> need the CPOs, which tell you the desired solution is the minimal one!

You cannot just say `I want the fixed point that gives the right
answer'!  Each fixed point gives a potentially *different* answer, so
you need a mechanism to choose the one you want.  And that mechanism
had better be realizable on a physical computer.

> I define E first by induction, and then afterward, I defined Phi in
> terms of E.  I don't define E & Phi by simultaneous recursion.

Your definition of E is incomplete without Phi, so there is
simultaneous recursion.  (Besides, you have defined E by induction,
but you haven't shown that the induction terminates).

>> In Schmidt, p. 51, the existence of the meaning functions is proven
>> by structural induction. That's what "compositional" means: defined
>> with structural induction. 
>
> I say that's false, Lauri.  Can you give me an exact quote?  Here's my
> quote again from Cartwright and Felleisen's "Extensible Denotational":
>
>     [The map from syntactic domains to semantic domains] satisfies the
>     law of compositionality: the interpretation of a phrase is a
>     function of the interpretation of the sub-phrases.
>
> It's just what I said: there must exist such functions like my (poorly
> defined) Phi.  It doesn't matter how you construct the Phi functions.

Your phi is not a function of the interpretation of the sub-phrases
but a function of the interpretation of the *values* of the
sub-phrases.  That's the problem.

-- 
~jrm
0
6/19/2004 5:34:09 PM
The problem Lauri caught me on is one I posted a solution for on 26
Jun 2002.  Looks pretty clumsy, so I'll think more before re-posting.

Joe Marshall <prunesquallor@comcast.net> wrote me 2 responses:

Joe, I'm glad to see you were using Y to avoid define.  Let's put off
your diverging calculation until we settle some theoretical points.

> >> This is a recursive equation, and the usual way of dealing with
> >> those is to get the least fixed point, but for that you need to
> >> have an _ordering_ and that gets you again into the world of
> >> CPOs.
> >
> > I say no.  Are you saying F-F & Barendregt needed CPOs?  It's just
> > induction.
> 
> `Just' induction?

Yes, and see below for why.  But this is something I want you & Lauri
to think about.  Are you saying F-F & B needed CPOs?  How come F-F & B
don't run into the problems you raise, such as:

> But you may have more than one fixed point, too.  The identity
> function has an infinite number of fixed points, and a lot of them
> are wildly different.  You wouldn't want a semantics that said that
> (factorial 6) is either 120 or 21302938 or "blue" or <HTML>.

:D  Why doesn't this problem snag F-F's eval_s?  Look on p. 51 in
<http://www.ccs.neu.edu/course/com3357/mono.ps>.

> So you need to first determine that a fixed point exists.

Yes, and I'm in arrears here.

> You cannot just say `I want the fixed point that gives the right
> answer'!  

:D 

> Each fixed point gives a potentially *different* answer, so you need
> a mechanism to choose the one you want.  And that mechanism had
> better be realizable on a physical computer.

I think I disagree.  We're going to define these fixed points by math
axioms, like induction, that aren't realizable on a physical computer.

That's the value of using math as a `common language' of semantics: we
can define functions that aren't realizable on a physical computer.

> > I define E first by induction, and then afterward, I defined Phi in
> > terms of E.  I don't define E & Phi by simultaneous recursion.
> 
> Your definition of E is incomplete without Phi, so there is
> simultaneous recursion.

No, that's going to be false, after I successfully define E.

> (Besides, you have defined E by induction, but you haven't shown
> that the induction terminates).

Yes! 

> > I can't define my Phi(alpha, beta)(e, s) with alpha & beta only
> > calling the initial argument (e, s).
> 
> That's the problem.

No, that's fine.  My Phi is supposed to be a math function
 
Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)  

                                   -> (Env x Store -> Value x Store) 

So Phi(alpha, beta)(e, s) is supposed to be in Value x Store.  Now one
way to define Phi would be to have another math function

Phi1:  (Value x Store) x (Value x Store)  ---> (Value x Store) 

such that 

Phi(alpha, beta)(e, s) = Phi1(alpha(e, s), beta(e, s))

But I don't have to define Phi this way, and what I did was fine: I
used alpha(e, s) to produce a new store s1 and plugged (e, s1) into beta.

If this were a Scheme program, you wouldn't have any trouble: you
deal with functions that take functions as args all the time.

> > there must exist such functions like my (poorly defined) Phi.  It
> > doesn't matter how you construct the Phi functions.
> 
> Your Phi is not a function of the interpretation of the sub-phrases
> but a function of the interpretation of the *values* of the
> sub-phrases.  That's the problem.

I think you're claiming my Phi is defined in terms of a Phi1.  Why?

Here's how induction constructs functions like F-F's eval_s.
Just like the math factorial function
f: N --> N, where N = {0, 1, 2, ... }
Suppose we've constructed a function
f_n: {0,...,n} ---> N
that satisfies the factorial definition this far.  Then we define 
f_{n+1}: {0,...,n+1} ---> N
by restricting to f_n for the first bunch, and defining 
f_{n+1}(n+1) = (* n+1 f_n(n)).
Plus, we can construct the first one f_0 by f_0(0) = 1.

By mathematical induction, there exists a function f_n for all n in N.
Then we define f(n) = f_n(n).  That's the way the math folks do it.
The CPO fixed point way is fine, but you don't need it.
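
Spelled out as a Scheme sketch (the recursion below is on the natural
number n, so it terminates: it is just the induction, with "bottom"
played by error):

  ;; f-approx builds the approximant f_n, defined on {0,...,n}.
  (define (f-approx n)
    (lambda (k)
      (cond ((> k n) (error "bottom: f_n is undefined here"))
            ((zero? k) 1)
            (else (* k ((f-approx (- n 1)) (- k 1)))))))

  (define (f n) ((f-approx n) n))   ; f(n) = f_n(n)

  ;; (f 5) => 120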
0
richter1974 (312)
6/20/2004 1:34:35 AM
Bill Richter wrote:
> The problem Lauri caught me on is one I posted a solution for on 26
> Jun 2002.  Looks pretty clumsy, so I'll think more before re-posting.
> 
> Joe Marshall <prunesquallor@comcast.net> wrote me 2 responses:
> 
> Joe, I'm glad to see you were using Y to avoid define.  Let's put off
> your diverging calculation until we settle some theoretical points.

Scott domains exist only because of diverging calculations! If we only 
worried about non-diverging calculations there would be no need for
Scott domains. You cannot understand the need for CPOs if you ignore 
diverging calculations.
0
danwang742 (171)
6/20/2004 3:28:10 AM
Lauri Alanko <la@iki.fi> responded to me:

> This is not a definition. This is a constraint, an equation that may
> or may not hold for some functions E.

Right, Lauri. Now I'll say something about how to actually define

E: Expressions -> (Env x Store -> Value x Store) 

and not just give equations for E to satisfy.  I want to follow
Flatt & Felleisen's standard reduction function eval_s: see p. 51 in
<http://www.ccs.neu.edu/course/com3357/mono.ps>.  

I posted a solution of this sort here on 26 Jun 2002.  It looks pretty
clumsy to me now, but I think it's OK.  But instead of discussing it,
I'd prefer to talk about a version for HtDP, which I think should be
both simpler and better designed. I wasn't aware of HtDP back then.

HtDP's Semantics for Advanced Scheme is mostly explained in sec 38,
but sec 40.3 explains mutable structures, and sec 24 says to rewrite
lambda expressions as local exps.  I think it's very elegant.

HtDP uses a global environment (+ store), with automatic re-naming of
variables to avoid name collisions.  I think that's a good idea, which
cuts down on the wild proliferation of environments.  The global env
is seen in sec 38 as a list of define's above the computation.

So HtDP's semantics can I think be written as DS semantic function

E: Expressions -> (Env -> Value x Env)

Define E standard-reduction fashion as follows.  First define the set
Enhanced of "enhanced" expressions, where some of the subexpressions
are values.  Values is a subset of Enhanced, and there's an inclusion

Expressions -> Enhanced

Now we'll define from sec 38 a 1-step reduction function

R: Enhanced x Env -> Enhanced x Env

R looks for the evaluation context and performs some action.  If
there's no eval context, then we should be done: we should have a
value.  If e.g. the eval context is a variable, then the value of the
variable is found in the global env, and substituted in.  If the eval
context is (set! x V), the subexp is replaced by void, and a new env
is returned, with x now bound to V.  If the eval context is a local
expression, then some modification is needed, as HtDP renames the
local variables so as not to collide with variables already in the
env, substitutes the new names into the local exp, and brings the
local defines out to top-level.  Now instead of one expression we
have define's followed by an expression.  So either change this
technique to handle a list of expressions, or else keep the
local-defines inside the local expression until they are reduced
to (define x# V), and then bring the define out to the global env.
I guess I don't have it all figured out.

But we should be able to define a 1-step reduction function

R: Enhanced x Env -> Enhanced x Env 

and now we define 

F: Enhanced x Env -> Value x Env 

by applying R until we either hit bottom or the 1st coordinate is a
value.  If no iterate of R ever gets a value, then we also send it to
bottom.  That is, letting R^n mean the n-th iterate of R, 

F(exp, e) = R^n(exp, e),  if n in {0, 1, 2,...} is smallest s.t. 
			     R^n(exp, e) in Value x Env and 
			     R^i(exp, e) != bottom for i <= n

F(exp, e) = bottom, otherwise.

The omega problem yields F(omega, e) = bottom, because there is no n:
For all n in {0, 1, 2,...}, R^n(omega, e) is not in Value x Env.
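
As a Scheme sketch, F is just the familiar driver loop (value? and R
are assumed from above; a pair (exp . env) stands for Enhanced x Env,
and R is taken to map such pairs to such pairs):

  (define (F exp e)
    (let loop ((state (cons exp e)))
      (if (value? (car state))
          state
          (loop (R state)))))

When no iterate of R ever produces a value, the loop itself never
returns: that is the bottom case.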

Now we define the math function

E: Expressions -> (Env -> Value x Env)

by including Expressions -> Enhanced and "un-currying".

Discussion: Perhaps my R is a simple version of the small-step OpS
that Robby Findler and Jacob ? are writing for Scheme.  So one might
say, "This isn't DS, it's OpS", but I say it's both, as we can easily
pass from R to E in the way that F-F define eval_s.  I say it's a
little induction, but F-F doesn't say anything at all.  I contend you
don't need CPOs. 

One might say, "The R OpS is much more useful than the E DS", and I'd
say, "You're the experts!"  But mathematicians want functions like E.
The standard miscommunication between mathematicians and physicists
goes like this: The physicists writes down a complicated Lagrangian to
integrate, and the mathematician says, "What's the range and domain of
your Lagrangian as a function?"  And that ends the discussion, but
this problem is apparently being solved in String Theory these days.

I'd prefer we didn't worry about the exact definition of R or Value
for Advanced Scheme until we settled whether it yields a DS semantic
function E.  Clearly I have things to learn about R and Value.  But I
think this clearly shows I'm not just writing down "a constraint, an
equation that may or may not hold for some functions E."

I'm not sure if E is compositional, because of this automatic
renaming.  Even if E is compositional, it might be awkward to discuss.
So I recommend that we don't insist that compositionality be part of
the definition of a DS semantic function, as some authors do.
0
richter1974 (312)
6/20/2004 6:26:56 AM
richter@math.northwestern.edu (Bill Richter) writes:

> That's the value of using math as a `common language' of semantics: we
> can define functions that aren't realizable on a physical computer.

Then you have not given a semantics of a *programming* language.

Shriram
0
sk1 (223)
6/20/2004 1:12:04 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Yes, and see below for why.  But this is something I want you & Lauri
> to think about.  Are you saying F-F & B needed CPOs?  How come F-F & B
> don't run into the problems you raise, such as:
>
>> But you may have more than one fixed point, too.  The identity
>> function has an infinite number of fixed points, and a lot of them
>> are wildly different.  You wouldn't want a semantics that said that
>> (factorial 6) is either 120 or 21302938 or "blue" or <HTML>.
>
> :D  Why doesn't this problem snag F-F's eval_s?  Look on p. 51 in
> <http://www.ccs.neu.edu/course/com3357/mono.ps>.

It probably should.  On page 33 (section 4.6.3) they assert that the Y
operator `can find the fixed point' of any function.  But a function
can have many fixed points (like the identity function) or no fixed
points (like (lambda (x) (+ x 1)) ).  

But since Felleisen and Flatt are developing an operational semantics,
perhaps they didn't feel the need to specify which (if any) fixed
point is found.

>> Each fixed point gives a potentially *different* answer, so you need
>> a mechanism to choose the one you want.  And that mechanism had
>> better be realizable on a physical computer.
>
> I think I disagree.  We're going to define these fixed points by math
> axioms, like induction, that's not realizable on a physical computer.
>
> That's the value of using math as a `common language' of semantics: we
> can define functions that aren't realizable on a physical computer.

Mathematically, any value whatsoever is a fixed point of the identity
function.  Yet when you use a computer to invoke the Y operator on the
identity function, you get an infinite loop, i.e. `bottom'.

If you don't care if your semantics apply to realizable machines,
that's fine, but it isn't of much interest to computer scientists.

>> > I define E first by induction, and then afterward, I defined Phi in
>> > terms of E.  I don't define E & Phi by simultaneous recursion.
>> 
>> Your definition of E is incomplete without Phi, so there is
>> simultaneous recursion.
>
> No, that's going to be false, after I successfully define E.

I don't see how.  Phi is just a name that we gave to the composition
step in E.  There's always going to be a composition step, and we can
always call it Phi if we choose to do so.

The only way you can avoid mutual recursion between phi and E is to
define E non-recursively.  Then Phi will exist, but it will be
independent of E and there will be no problem.

-- 
~jrm
0
6/20/2004 2:18:53 PM
Shriram Krishnamurthi wrote:
> richter@math.northwestern.edu (Bill Richter) writes:
> 
> 
>>That's the value of using math as a `common language' of semantics: we
>>can define functions that aren't realizable on a physical computer.
> 
> 
> Then you have not given a semantics of a *programming* language.
> 
> Shriram

Let's see... do you consider a Turing machine physically realizable? I sure 
have not seen a physical Turing machine with one of those fancy infinite tapes.

Many programming language semantics strictly speaking are not physically 
realizable, because they assume infinite storage. Programming languages are 
nice abstractions of the physical systems we actually build.

0
danwang742 (171)
6/20/2004 3:35:47 PM
I see now I've been wrong about the big-step OpS in PLAI!  Many
apologies to Shriram, and thanks to everyone for helping me out.

That is, one can't go from Shriram's big-step OpS to a function 

E: Expressions -> (Env -> Value x Env)

without some real work, as my last post showed clearly.  So it was not
reasonable for me to think that PLAI's semantics was DS.

So there's sense to saying I've got to use CPOs or Scott models.  It's,
"You've gotta do something!  You can't just say you're done!"

We've gotten into other topics, which I'd be happy to discuss
further, but the original impetus is now dead: I was wrong about PLAI.

For instance, I'm now maybe starting to see the advantage of CPOs: to
make sense of the big-step rule, do structural induction over the
phrases of the language.  Start small with numbers & identifiers &
induct up.  I'll have to think about it.  But my solution was fine
too: refine the big-step rule (which is really just an equation for E
to satisfy, as Lauri & Joe said) into a 1-step function.

2 other clarifications:

I wrote last post 

> sec 24 says to rewrite lambda expressions as local exps."

That was just brainlock on my part.  I forgot the rule (which MFl had
to remind me of once) of Advanced Scheme

   3. A set!-expression must occur in the lexical scope of a define
      that introduces the set!-expression's left-hand side.

That allows the simplified semantics of Advanced Scheme

     ((lambda (x-1 ... x-n) exp)
      v-1 ... v-n)
   = exp  with all  x-1 ... x-n  replaced by  v-1 ... v-n

   The law serves as a replacement of the law of application from algebra
   in the study of the foundations of computing. By convention, this law
   is called the βv (pronounced ``beta value'') axiom.
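
As a tiny illustration of the law (my own sketch, not HtDP's code),
the substitution itself can be written directly, assuming the
renaming described above has already made all bound variables
distinct, so capture can't happen:

   ;; Replace every occurrence of the symbol x in exp by the value v.
   (define (subst exp x v)
     (cond ((eq? exp x) v)
           ((pair? exp) (map (lambda (e) (subst e x v)) exp))
           (else exp)))

For example, (subst '(+ x (* x y)) 'x 3) returns (+ 3 (* 3 y)), the
right-hand side of the law for a one-variable lambda.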

Joe wrote:

> > I say no.  Are you saying F-F & Barendregt needed CPOs?  It's just
> > induction.
> 
> `Just' induction?

Maybe we should say more.  I think the math folks are in the habit of
saying "just use induction."  They don't say, "use CPOs or Scott
models", because they don't know what they are!  Look at it yourself.
0
richter1974 (312)
6/20/2004 6:18:45 PM
Joe Marshall <prunesquallor@comcast.net> responded to me:
> 
> > Are you saying F-F & B needed CPOs?  How come F-F & B don't run
> > into the problems you raise, such as:
> >
> >> But you may have more than one fixed point, too.  The identity
> >> function has an infinite number of fixed points, and a lot of them
> >> are wildly different.  You wouldn't want a semantics that said that
> >> (factorial 6) is either 120 or 21302938 or "blue" or <HTML>.
>  
> > :D  Why doesn't this problem snag F-F's eval_s?  Look on p. 51 in
> > <http://www.ccs.neu.edu/course/com3357/mono.ps>.
> 
> It probably should.  

I really disagree, Joe, and I'm sure the authors would also.

> On page 33 (section 4.6.3) they assert that the Y operator `can find
> the fixed point' of any function.  But a function can have many
> fixed points (like the identity function) or no fixed points (like
> (lambda (x) (+ x 1)) ).

You're right, but that's different, that's not in the Standard
Reduction section.  You're definitely making a good point about Y.

> But since Felleisen and Flatt are developing an operational semantics,
> perhaps they didn't feel the need to specify which (if any) fixed
> point is found.

Please take another look at F-F's eval_s, on p. 51.

> If you don't care if your semantics apply to realizable machines,
> that's fine, but it isn't of much interest to computer scientists.

I think we're miscommunicating.  Simplest example of what I'm saying
is that the DS function E sends all infinite loops to bottom.  We
can't realize that on a machine, by the Halting Problem, as you know.

> >> > I define E first by induction, and then afterward, I defined Phi in
> >> > terms of E.  I don't define E & Phi by simultaneous recursion.
> >> 
> >> Your definition of E is incomplete without Phi, so there is
> >> simultaneous recursion.
> >
> > No, that's going to be false, after I successfully define E.
> 
> I don't see how.  Phi is just a name that we gave to the composition
> step in E.  There's always going to be a composition step, and we can
> always call it Phi if we choose to do so.
> 
> The only way you can avoid mutual recursion between phi and E is to
> define E non-recursively.  Then Phi will exist, but it will be
> independent of E and there will be no problem.

OK, Joe, but now we have a test case.  I sketched the construction of
an E for Advanced Scheme, but didn't give a Phi at all.  Do you find
an error there?  My point is that my E for Advanced Scheme, using a
1-step R, is vastly smarter than the E we've been arguing about :D
0
richter1974 (312)
6/20/2004 6:31:07 PM
Shriram Krishnamurthi <sk@cs.brown.edu> responds to me:

> > That's the value of using math as a `common language' of
> > semantics: we can define functions that aren't realizable on a
> > physical computer.
> 
> Then you have not given a semantics of a *programming* language.

Shriram, I think I see your point, & I partly agree with you.  First
let me apologize "in person" for my wild-goose chase.

I think your PLAI big-step OpS is realizable on a physical computer,
because it amounts to things like this.  Your OpS says that if 3
reductions exist, then a 4th reduction for {f a} exists.  Your function

4th-reduction(1st red, 2nd red, 3rd red) 

looks like a computable math function. And furthermore it's good
semantics: it expresses mathematically what we know & believe about
our PL.  To rephrase this, you don't define the noncomputable function

V: Expressions -> (Env -> Value)

but your OpS translates to interesting equations for V to solve.
You'll say that 

V[f](e) = <x, body, e'>, V[a](e) = a_v, & V[body](e'[x<-a_v]) = b_v

implies that 

V[{f a}](e) = b_v
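
A minimal self-contained sketch of just this clause (the closure
representation and helper names are my assumptions, not Shriram's):

   (define (lookup x env) (cdr (assq x env)))
   (define (extend env x v) (cons (cons x v) env))

   (define (V exp env)
     (cond ((number? exp) exp)               ; constants
           ((symbol? exp) (lookup exp env))  ; identifiers
           ((eq? (car exp) 'lambda)          ; (lambda (x) body)
            (list 'closure (caadr exp) (caddr exp) env))
           (else                             ; application {f a}
            (let ((clo (V (car exp) env))    ; V[f](e) = <x, body, e'>
                  (a-v (V (cadr exp) env)))  ; V[a](e) = a_v
              (V (caddr clo)                 ; V[body](e'[x<-a_v]) = b_v
                 (extend (cadddr clo) (cadr clo) a-v))))))

E.g. (V '((lambda (x) x) 42) '()) returns 42, and on omega it just
loops, exactly where the big-step rules give no derivation.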

Your "failure" to define V doesn't hurt your semantics at all: Your
(computable) implications fully express the semantics!  You've
mathematized what SICP & the text of R5RS say, and furthermore you've
mathematized what the interpreter will tell us.  Your math/semantics
fails to tell us what the interpreter won't tell us either: when a
program is in an infinite loop.  But we can only "answer" this
question by invoking strong math axioms anyway.

And so you've refuted a claim I've made in this thread a few times:
"Any semantics, when fully mathematized, becomes DS."

But we like to write down functions like V mathematically, and it may
even be useful (I like standard reduction), so let me weaken my claim:

Math, the `common language' of semantics, lets us define noncomputable
functions like V, i.e. V is not realizable on a physical computer.

In response to this weaker claim, I think this claim is too strong:

> Then you have not given a semantics of a *programming* language.

You can define the semantics of a PL with a noncomputable function
like V: that's done in DS.  It's just that you don't have to go the
extra noncomputable mile in order to specify the PL's semantics.
That's what I (finally!!!) learned from your PLAI semantics.
0
richter1974 (312)
6/20/2004 10:27:13 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responded to me:
>> 
>> > Are you saying F-F & B needed CPOs?  How come F-F & B don't run
>> > into the problems you raise, such as:
>> >
>> >> But you may have more than one fixed point, too.  The identity
>> >> function has an infinite number of fixed points, and a lot of them
>> >> are wildly different.  You wouldn't want a semantics that said that
>> >> (factorial 6) is either 120 or 21302938 or "blue" or <HTML>.
>>  
>> > :D  Why doesn't this problem snag F-F's eval_s?  Look on p. 51 in
>> > <http://www.ccs.neu.edu/course/com3357/mono.ps>.
>> 
>> It probably should.  
>
> I really disagree, Joe, and I'm sure the authors would also.

Matthias is a few doors down, I'll ask.

>> On page 33 (section 4.6.3) they assert that the Y operator `can find
>> the fixed point' of any function.  But a function can have many
>> fixed points (like the identity function) or no fixed points (like
>> (lambda (x) (+ x 1)) ).
>
> You're right, but that's different, that's not in the Standard
> Reduction section.  You're definitely making a good point about Y.
>
>> But since Felleisen and Flatt are developing an operational semantics,
>> perhaps they didn't feel the need to specify which (if any) fixed
>> point is found.
>
> Please take another look at F-F's eval_s, on p. 51.

The standard reduction section is showing that an algorithm exists by
which you can reduce an expression, and that this algorithm preserves
the algebraic properties of the reduction equations.

In other words, it proves that `standard reduction is as good as
algebraic reduction' with regard to getting the same answers.

> OK, Joe, but now we have a test case.  I sketched the construction of
> an E for Advanced Scheme, but didn't give a Phi at all.  Do you find
> an error there?  

I haven't even seen a well-formed `E', let alone one I can examine for
an error.  Every E you've presented has E on both sides of the
equation.

-- 
~jrm
0
6/21/2004 3:55:49 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Maybe we should say more.  I think the math folks are in the habit of
> saying "just use induction."  They don't say, "use CPOs or Scott
> models", because they don't know what they are!  Look at it yourself.

Dare I throw kerosene on the fire?

It is possible to show that:

  The function GOODSTEIN terminates for all positive integer values
  for seed.

  That the proof for the above statement is independent of the Peano
  axioms.


;;; Note, this depends on a version of FLOOR that returns both a
;;; quotient and remainder.  For the nonnegative integers used here,
;;; this roll-your-own version will do:

(define (floor n d)
  (values (quotient n d) (remainder n d)))

(define (base-bump base n)
  (if (zero? n)
      0
      (do ((exponent  0 (+ exponent 1))
           (divisor   1  next)
           (next   base (* next base)))
          ((> next n)
           (call-with-values (lambda () (floor n divisor))
             (lambda (quotient remainder)
               (+ (* quotient (expt (+ base 1) (base-bump base exponent)))
                  (base-bump base remainder))))))))

(define (goodstein seed)
  (do ((base 2   (+ base 1))
       (n   seed (- (base-bump base n) 1)))
      ((zero? n))))
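
(A quick sanity check: (goodstein 3) returns after only a handful of
iterations, while (goodstein 4), though it provably terminates, needs
astronomically many, so don't hold your breath.)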



-- 
~jrm
0
6/21/2004 4:24:02 AM
Joe Marshall <prunesquallor@comcast.net> responds to me:

> >> > :D  Why doesn't this problem snag F-F's eval_s?  Look on p. 51 in
> >> > <http://www.ccs.neu.edu/course/com3357/mono.ps>.
> >> 
> >> It probably should.  
> >
> > I really disagree, Joe, and I'm sure the authors would also.
> 
> Matthias is a few doors down, I'll ask.

Excellent, Joe! Do you teach there too?

> > Please take another look at F-F's eval_s, on p. 51.
> 
> The standard reduction section is showing that an algorithm exists
> by which you can reduce an expression, and that this algorithm
> preserves the algebraic properties of the reduction equations.
>
> In other words, it proves that `standard reduction is as good as
> algebraic reduction' with regard to getting the same answers.

Yeah, but I'm just concentrating on the fact that `this algorithm',
called eval_s, is well-defined, and F-F never mentioned CPOs, Scott
models, structural induction.  I think that's perfectly fine, and I
think any undergraduate Math major who'd taken a course in Point Set
Topology or Real Analysis would see why that's fine.

> > OK, Joe, but now we have a test case.  I sketched the construction of
> > an E for Advanced Scheme, but didn't give a Phi at all.  Do you find
> > an error there?  
> 
> I haven't even seen a well-formed `E', let alone one I can examine for
> an error.  Every E you've presented has E on both sides of the
> equation.

Perhaps you didn't see my E for Advanced Scheme, in a response to
Lauri.  E occurs for the first time here:

   F(exp, e) = R^n(exp, e),  if n in {0, 1, 2,...} is smallest s.t. 
				R^n(exp, e) in Value x Env and 
				R^i(exp, e) != bottom for i <= n

   F(exp, e) = bottom, otherwise.

   The omega problem yields F(omega, e) = bottom, because there is no
   n: For all n in {0, 1, 2,...}, R^n(omega, e) is not in Value x Env.

   Now we define the math function

   E: Expressions -> (Env -> Value x Env)

   by including Expressions -> Enhanced and "un-currying" F.

R is not in fact an inductively defined function (i.e. no R on both
sides of the equation).  As I said in my post, there are holes in my
treatment of R, but you should be able to see that 

1) E is defined in much the same way that F-F's eval_s is, so it
   doesn't need CPOs

2) Some Advanced Scheme hotshots could easily fill in the holes.
0
richter1974 (312)
6/21/2004 10:14:40 AM
"Bill Richter" <richter@math.northwestern.edu> wrote

>>I think that's perfectly fine, and I think any undergraduate Math major
who'd taken a course in Point Set Topology or Real Analysis would see why
that's fine.

Uh well... yes I have. But I've been lost for quite some time now. :-)


0
6/21/2004 9:01:28 PM
I haven't been following this, because it looks like a repetition
of an apparently useless discussion from a couple of years ago,
so I will take things completely out of context and comment upon
them whenever it tickles my funny bone.

Bill Richter:
> I think we're miscommunicating.  Simplest example of what I'm saying
> is that the DS function E sends all infinite loops to bottom.  We
> can't realize that on a machine, by the Halting Problem, as you know.

That's like saying we can't compute pi using a computer, because of
the Halting Problem.

What we *can* compute is an infinite sequence of approximations that
converge to pi.  Similarly, we can compute an infinite sequence of
approximations that converge to bottom.

This assumes infinite computing resources, of course.  If we didn't
have infinite computing resources, then the Halting Problem would be
decidable.

Will
0
cesuraSPAM (401)
6/21/2004 10:21:31 PM
Joe Marshall <prunesquallor@comcast.net> responds to me:
> 
> >  I think the math folks are in the habit of saying "just use
> > induction."  They don't say, "use CPOs or Scott models", because
> > they don't know what they are!  
> 
> Dare I throw kerosene on the fire?
> 
> It is possible to show that:
> 
>   The function GOODSTEIN terminates for all positive integer values
>   for seed.
> 
>   That the proof for the above statement is independent of the Peano
>   axioms.

That's hot, Joe!  My favorite set theory book, by Just & Weese, says
that prior to Goedel incompleteness, no one had ever proved a theorem
in Number Theory that is independent of the Peano axioms.  And of course,
the Goedel sentence is pretty weird, not clearly related to any Number
Theory we know or care about.  This sounds a lot more interesting.
Can you say more?  I couldn't grok your code.

Where's the kerosene anyway?  My point is that you can do great things
with Math, like construct DS function with or without Scott models,
but only because Math has such strong (set theory) axioms.  But we
know from Goedel incompleteness that our axioms aren't complete.
Your turn!
0
richter1974 (312)
6/21/2004 11:37:57 PM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

> I haven't been following this, because it looks like a repetition of
> an apparently useless discussion from a couple of years ago, 

I don't think it's useless, Will.  I learned from Shriram that not all
semantics is DS.  Lauri's off reading Schmidt to see if
compositionality really requires structural induction with CPOs.
Joe's gonna ask MFe if the standard reduction function requires CPOs.
My best post from a couple of years ago was something nobody responded
to, and I'll attribute that to exhaustion, as you guys had been
correcting my mistakes for months by then :D I posted 26 Jun 2002 how
to actually define a compositional semantic function for call/cc-free
Scheme without CPOs or Scott models.

I just did it again for Advanced Scheme, with some holes, but it's an
advance, as Advanced Scheme cleans up my clumsy approach, and now (I
think) I understand from Shriram that the key is using some small-step
OpS, as Robby Findler does.  That is, in order to not just write down
a buncha equations for the semantic function, I needed a non-recursive
1-step evaluation function.  And that strikes me as similar to the
standard Scott/CPO/structural approach: we can induct over the
complexity of expressions, or we can take teeny tiny evaluation steps.

> so I will take things completely out of context and comment upon
> them whenever it tickles my funny bone.

That's the spirit here!  We're playing the telephone game: Joe said
something I didn't understand, so I semi-responded...

> Bill Richter:
> > I think we're miscommunicating.  Simplest example of what I'm
> > saying is that the DS function E sends all infinite loops to
> > bottom.  We can't realize that on a machine, by the Halting
> > Problem, as you know.
> 
> That's like saying we can't compute pi using a computer, because of
> the Halting Problem.

Well, they're both uncomputable, and that's all I meant, in my
response to Joe:

      Each fixed point gives a potentially *different* answer, so you
      need a mechanism to choose the one you want.  And that mechanism
      had better be realizable on a physical computer.

I probably had no idea what Joe meant.   I don't know what Shriram
thought I meant when he responded to me:

     > That's the value of using math as a `common language' of
     > semantics: we can define functions that aren't realizable on a
     > physical computer.
     
     Then you have not given a semantics of a *programming* language.

Back to you, Will:

> What we *can* compute is an infinite sequence of approximations that
> converge to pi.  Similarly, we can compute an infinite sequence of
> approximations that converge to bottom.

Aha, interesting point.  Does that answer Joe's original point:

      And that (fixed-point-choosing) mechanism had better be
      realizable on a physical computer.

That the semantic function must be a limit?  But what does that rule
out?  Are there any semi-pseudo-semantic functions that aren't limits?

> This assumes infinite computing resources, of course.  If we didn't
> have infinite computing resources, then the Halting Problem would be
> decidable.

Cool!  Yeah, all functions are computable on a finite set!

But we must assume infinite computing resources.  It doesn't matter if
Daniel's fancy infinite Turing tape physically exists.  We're talking
about computable functions on the integers and other infinite sets.
0
richter1974 (312)
6/22/2004 2:03:28 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responds to me:
>> 
>> >  I think the math folks are in the habit of saying "just use
>> > induction."  They don't say, "use CPOs or Scott models", because
>> > they don't know what they are!  
>> 
>> Dare I throw kerosene on the fire?
>> 
>> It is possible to show that:
>> 
>>   The function GOODSTEIN terminates for all positive integer values
>>   for seed.
>> 
>>   That the proof for the above statement is independent of the Peano
>>   axioms.
>
> That's hot, Joe!  My favorite set theory book, by Just & Weese, says
> that prior to Goedel incompleteness, no one had ever proved a theorem
> in Number Theory that is independent of the Peano axioms.  And of course,
> the Goedel sentence is pretty weird, not clearly related to any Number
> Theory we know or care about.  This sounds a lot more interesting.
> Can you say more?  I couldn't grok your code.

Google for Goodstein sequences.

> Where's the kerosene anyway?  

This was in reference to `just' induction.  One of the Peano axioms is
(informally) 

  ``If a property is possessed by 0 and also by the successor of every
    natural number which possesses it, then it is possessed by all
    natural numbers.''
    http://www.brainyencyclopedia.com/encyclopedia/p/pe/peano_axioms.html

and this, of course, is the foundation for mathematical induction.

But mathematical induction isn't strong enough to show that Goodstein
sequences converge to zero.  The program I wrote halts if Goodstein
sequences converge to zero, therefore you need something stronger than
mathematical induction if you want a comprehensive semantics.

-- 
~jrm
0
6/22/2004 5:19:05 AM
In article <fb74251e.0406211421.690de060@posting.google.com>,
William D Clinger <cesuraSPAM@verizon.net> wrote:
> I haven't been following this, because it looks like a repetition
> of an apparently useless discussion from a couple of years ago,

I should just like to note that the discussion has not been useless. I
for one learned a lot from the previous round, since rarely if ever
have the rationale and technical details of such an esoteric subject
as DS been spelt out in such excruciating detail. Even if this is not
useful to the participants, it may well be to the lurkers.


Lauri
0
la (473)
6/22/2004 6:31:20 AM
Lauri Alanko <la@iki.fi> writes:

> I should just like to note that the discussion has not been useless. I
> for one learned a lot from the previous round, since rarely if ever
> have the rationale and technical details of such an esoteric subject
> as DS been spelt out in such excruciating detail. Even if this is not
> useful to the participants, it may well be to the lurkers.

This is really a weakness of textbooks.  I've yet to find one that
provides even a rudimentary (technical) introduction to Why before it
dives into What -- a book that says what has been said on this thread,
for instance.  

If anyone knows of a counter-example, please point me to it, because I
sure as heck would love to not have to go over this in person with
each of my students.

Shriram
0
sk1 (223)
6/22/2004 3:33:39 PM
richter@math.northwestern.edu (Bill Richter) writes:

> in my response to Joe:
>
>       Each fixed point gives a potentially *different* answer, so you
>       need a mechanism to choose the one you want.  And that mechanism
>       had better be realizable on a physical computer.
>
> I probably had no idea what Joe meant.   

Consider this function (call it FOO)
  (lambda (f)
    (lambda (n)
      (if (< n 1)
          1
          (* n (f (- n 1))))))

FOO takes a function as an argument and returns a function as a
result.  The argument function must be of type (number -> number) and
the result type will be of type (number -> number).  We could, if we
wanted, call FOO on the cosine function:

> ((foo cos) 3)
-1.2484405096414273
> ((foo cos) .4)
1
> ((foo cos) 1.7)
1.3002317183836305

or perhaps the square-root function:

> ((foo sqrt) 7)
17.146428199482244

Since the result of FOO is the same type as the argument to FOO, we
can call FOO on it, too.

> ((foo (foo cos)) 3)
3.2418138352088386

As many times as you want:

> ((foo (foo (foo (foo (foo (foo sqrt)))))) 6.01)
73.78031367520926

At this point, it would be useful to write a helper:

(define (nest-wrapper wrapper depth)
  (lambda (k)
    (if (zero? depth)
        k
        ((nest-wrapper wrapper (- depth 1)) (wrapper k)))))

So now we can write

> (((nest-wrapper foo 6) sqrt) 6.01)
73.78031367520926

The interesting thing is that the deeper we nest the calls, the
less important the initial function is:

> (((nest-wrapper foo 20) sqrt) 2)
2

> (((nest-wrapper foo 20) cos) 2)
2

and a whole new behavior emerges:

> (((nest-wrapper foo 20) cos) 3)
6

> (((nest-wrapper foo 20) cos) 4)
24

> (((nest-wrapper foo 20) cos) 5)
120

> (((nest-wrapper foo 20) cos) 6)
720

But we only get this behavior if we nest deeply enough:

> (((nest-wrapper foo 3) cos) 6)
-118.79909959205344

Providing that the nesting is deep enough, the initial function
doesn't matter.  The resulting function no longer depends on the
initial function, but derives from FOO itself.  So the question
arises:  what function is generated by FOO in the limit as the nesting
depth goes to infinity?  As you must have noticed by now, 

for any function G, 

   lim   ((nest-wrapper foo n) g)  =  (lambda (x) (factorial x))
  n->inf

Now if n is really, really large, then 

            n times
   (foo (foo (foo (foo ... ))))

is almost the same function as

           n+1 times
   (foo (foo (foo (foo (foo ... )))))

For a function H and a value x, if  H(x) = x, then x is a fixed-point
of H.  So if there is an x so that (foo x) = x, then x is a
fixed-point of foo.

   (foo (lambda (x) (factorial x))) = (lambda (x) (factorial x))

So the factorial function (a function from int->int) is a fixed point
of the FOO function (a function from (int->int)->(int->int) ).  We
note that:

   lim   ((nest-wrapper foo n) g)  =  a fixed point of foo
  n->inf



So what about functions other than FOO?  This function:

    (define bar (lambda (f)
                  (lambda (x)
                    (f x))))

has an infinite number of fixed-points.  

   lim   ((nest-wrapper bar n) g)  =  g
  n->inf

Any function whatsoever (one-arg, one-value function that is) can be
wrapped by BAR and that doesn't change it:

    (bar x) = x 

for all x.

If we wish to use fixed-points to model recursion, we need to specify
which fixed-point to use if there is more than one.  Although the
factorial function is a fixed-point of BAR (as well as FOO), we really
can't expect the computer to compute that particular fixed-point from
BAR.  

>       And that (fixed-point-choosing) mechanism had better be
>       realizable on a physical computer.
>
> That the semantic function must be a limit?  But what does that rule
> out?  Are there any semi-pseudo-semantic functions that aren't limits?

Sure.  Make one up.  You could say that if there are multiple fixed
points that the one chosen is the lexicographically least when
expressed in English.  So the fixed point of BAR would be `abs'.  But
you can't build a computer that behaves this way.
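
A sketch of the mechanism that *is* realizable (my own code, reusing
FOO and BAR from above): the applicative-order Y combinator, which
computes exactly the fixed point that the nesting limit converges to.

   (define (Y f)   ; applicative-order Y, for one-argument functions
     ((lambda (x) (f (lambda (n) ((x x) n))))
      (lambda (x) (f (lambda (n) ((x x) n))))))

   > ((Y foo) 6)
   720

((Y bar) 6), on the other hand, loops forever: the machine has no way
to single out factorial (or `abs') from among BAR's many fixed
points, so it delivers the least one, bottom.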
0
jrm (1310)
6/22/2004 3:59:56 PM
Bill Richter:
> I don't think it's useless, Will.

Lauri Alanko:
> I should just like to note that the discussion has not been useless.

I stand corrected.  I'm glad to hear the discussion has been useful.

Bill Richter:
> > That's like saying we can't compute pi using a computer, because of
> > the Halting Problem.
> 
> Well, they're both uncomputable, and that's all I meant, in my
> response to Joe:

Pi is one of the computable real numbers.

> > What we *can* compute is an infinite sequence of approximations that
> > converge to pi.  Similarly, we can compute an infinite sequence of
> > approximations that converge to bottom.
> 
> Aha, interesting point.  Does that answer Joe's original point:
> 
>       And that (fixed-point-choosing) mechanism had better be
>       realizable on a physical computer.

Yes.

> That the semantic function must be a limit?  But what does that rule
> out?  Are there any semi-pseudo-semantic functions that aren't limits?

That aren't limits of computable approximations?  Yes.  Here's one you
might recognize.

Suppose E [[ * ]] gives the denotational semantics of expressions in
a Scheme-like language, as a continuous partial function from environments
to continuous partial function from continuations to continuous partial
functions from stores to integer answers.  Suppose further that r0, k0,
and s0 are some particular environment, continuation, and store.  Then
the partial function from expressions to integers that is defined by

    GOOD [[ e ]] = E [[ e ]] r0 k0 s0

is computable, and is the limit of a sequence of finitely computable
approximations, but

    BAD [[ e ]]  =  632499472      if E [[ e ]] r0 k0 s0 = bottom
                 =  532028816      otherwise

is not computable, and is not the limit of any sequence of computable
approximations.

Will
0
cesuraSPAM (401)
6/22/2004 7:21:26 PM
Joe Marshall <prunesquallor@comcast.net> responds to me:
 
> This was in reference to `just' induction.  One of the Peano axioms
> is (informally)
> 
>   ``If a property is possessed by 0 and also by the successor of
>   every natural number which possesses it, then it is possessed by
>   all natural numbers.''
> 
> and this, of course, is the foundation for mathematical induction.

Joe, that's an important mistake.  The Peano induction axiom is only
for statements expressed in the language of Peano arithmetic.  And the
Peano language is so limited that you can't get a lot of induction
that way.  So the Peano induction axiom is NOT the foundation for
mathematical induction.  The foundation is the much more powerful ZFC
axioms, and it yields the statement I think was in my Algebra II text
in High School: 

Given mathematical statements S(n), n = 0, 1, 2,...  if S(0) is true,
and S(n) => S(n+1), then S(n) is true for all n.

You probably know all this, because CS folks generally know about
nonstandard models of (Peano) arithmetic.  That is, not every element
of the model is a successor of 0.  But why not?  We can write an easy
inductive proof of this:

S(n):  n is a successor of 0

S(0) is true!

S(n) => S(n+1) is true, because by definition, n+1 is the
		        successor of n.

But that isn't a proof because we can't state "n is a successor of 0"
in Peano arithmetic.  The problem is that we can't form the iterated
composites 

Successor^n: N ---> N 

And that's exactly the sort of induction that I want to use to show
the standard reduction function is well-defined.  That is, given the
1-step reduction function R, we want to form the iterated composites

R^n: LC_v Expressions ---> LC_v Expressions

It's obvious how to form these (partial) function R^n with induction,
because by definition, 

R^{n+1} = R . R^n

But you can't say that in Peano arithmetic.  
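
(In Scheme clothing this induction is just ordinary recursion on n; a
sketch of mine, for any 1-step function R:

   (define (iterate R n)      ; R^n as a function; R^0 = identity
     (if (zero? n)
         (lambda (x) x)
         (lambda (x) (R ((iterate R (- n 1)) x)))))

which is exactly R^{n+1} = R . R^n together with the base case.)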

> But mathematical induction isn't strong enough to show that
> Goodstein sequences converge to zero.  The program I wrote halts if
> Goodstein sequences converge to zero, therefore you need something
> stronger than mathematical induction if you want a comprehensive
> semantics.

OK, you haven't established the last claim, as I showed above.  The
rest is extremely interesting.  Are you sure it's not just Peano
induction that's not strong enough?  Why don't you start up a separate
thread about Goodstein sequences?

Anyway, I want to congratulate you on coming up with a meaningful
mathematical objection to my claim (standard reduction follows
from "just induction").  I want to buy back something I wrote earlier:


> >> >  I think the math folks are in the habit of saying "just use
> >> > induction."  They don't say, "use CPOs or Scott models",
> >> > because they don't know what they are!
> >> 
> >> Dare I throw kerosene on the fire?
> >> 
> >> It is possible to show that:
> >> 
> >>   The function GOODSTEIN terminates for all positive integer values
> >>   for seed.
> >> 
> >>   That the proof for the above statement is independent of the Peano
> >>   axioms.
> >
> > That's hot, Joe!  My favorite set theory book, by Just & Weese,
> > says that prior to Goedel incompleteness, no one had ever proved a
> > theorem in Number Theory that independent of the Peano axioms.

I think that's a goof on my part.  What Just & Weese must have meant,
and maybe they wrote it, is that Goedel was the first person to prove
that he had something independent of the Peano axioms.  But number
theorists had been using complex analysis for a hundred years!  It's
pretty easy to believe that they proved with complex analysis some
theorem in number theory that was independent of the Peano axioms.
They just hadn't proved independence.
0
richter1974 (312)
6/23/2004 3:14:01 AM
cesuraSPAM@verizon.net (William D Clinger) responds to me:

> Pi is one of the computable real numbers.

Oops, I think I see my goof now, Will.  Does computable mean we can
write an integer lambda expression f s.t. (f n) is the n-th digit of
the decimal expansion of pi?  And isn't there a continued fraction
expansion of pi?  The point I would know is that since there's a
countable number of integer lambda expressions, almost all real
numbers must not be computable real numbers.

> > > What we *can* compute is an infinite sequence of approximations
> > > that converge to pi.  Similarly, we can compute an infinite
> > > sequence of approximations that converge to bottom.

I think I misunderstood you the 1st time around.  Obviously any real
number is the limit of an infinite sequence of computable numbers
(the finite decimal approximations).  But I think now you mean that the entire
sequence is computable, so you have a computable function g s.t.  
g(n) converges to pi.  

And I guess you similarly mean you have a computable function F
s.t. F(n) converges to a good DS semantic function?  I didn't know
that either.  That sounds like `realizable on a physical computer.'

> > Aha, interesting point.  Does that answer Joe's original point:
> > 
> >       And that (fixed-point-choosing) mechanism had better be
> >       realizable on a physical computer.
> 
> Yes.

Cool!  I don't know why it's true, but I see it's meaningful.

> > That the semantic function must be a limit?  But what does that rule
> > out?  Are there any semi-pseudo-semantic functions that aren't limits?
> 
> That aren't limits of computable approximations?  Yes.  Here's one you
> might recognize.
> 
> Suppose E [[ * ]] gives the denotational semantics of expressions in
> a Scheme-like language, as a continuous partial function from environments
> to continuous partial function from continuations to continuous partial
> functions from stores to integer answers.  Suppose further that r0, k0,
> and s0 are some particular environment, continuation, and store.  Then
> the partial function from expressions to integers that is defined by
> 
>     GOOD [[ e ]] = E [[ e ]] r0 k0 s0
> 
> is computable, 

Will, either I disagree strongly, or I don't understand you.  If you
mean GOOD is computable in the above sense (there's a computable
function G s.t. G(n) converges to GOOD) then I didn't know that, but
I'll take your word for it.  Sounds interesting.  But I say the
Halting problem says GOOD is not a computable function.

Let's just check. E is a function

E: Expressions ---> (U x K x S -> Integers)

and Good is the "program" version of this function 

Good: Expressions ---> Integers_bottom

obtained in R5RS DS fashion as you say.  So send the Integers to 1 and
bottom to 0, and we have a (total) function 

BooleanGood: Expressions ---> {0, 1}

which sends a program to 1 if and only if the program halts.
The Halting problem says there is no such computable function
BooleanGood.

> and is the limit of a sequence of finitely computable
> approximations, but
> 
>     BAD [[ e ]]  =  632499472      if E [[ e ]] r0 k0 s0 = bottom
>                  =  532028816      otherwise
> 
> is not computable, and is not the limit of any sequence of computable
> approximations.

That's pretty interesting!  I don't know why that's true.

Will, there's no question that you know a ton of stuff that I don't.
Really useful CS stuff.  And the stuff I know is of no apparent use.
I'd be grateful if you'd run circles around me.  You can certainly
teach me a lot.  But why don't you vote on what Lauri wrote:

> In Schmidt, p. 51, the existence of the meaning functions is proven
> by structural induction. That's what "compositional" means: defined
> with structural induction.

I didn't give Lauri enough credit in my last post for exposing my
error: I just wrote down some equations for E to satisfy, without
giving an inductive argument relating everything back to the
base case.  In fact, I could not have, because of infinite loops!  I
think that's a long standing error of mine, which Lauri corrected.

But there's NO way I'll believe that Schmidt defines compositional to
mean proven by structural induction.  That's not the way mathematical
definitions are phrased.  And it's not what Cartwright and Felleisen
say in their "Extensible Denotational" paper.  They say:

    [The map from syntactic domains to semantic domains] satisfies the
    law of compositionality: the interpretation of a phrase is a
    function of the interpretation of the sub-phrases.

So for a DS valuation function

E:  Expressions ---> M

compositionality means e.g. there must be a function 

Phi: M x M ---> M

such that for a pair of expressions (X, Y), 

E[[ (X Y) ]] = Phi( E[[ X ]], E[[ Y ]] )
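
For a toy instance of such a Phi (my own sketch, not from any of the
texts under discussion), take M = (Env -> Value) and define E with
the application case factored through Phi:

   (define (phi m-x m-y)                 ; Phi : M x M ---> M
     (lambda (env) ((m-x env) (m-y env))))

   (define (E exp)                       ; E : Expressions ---> M
     (cond ((number? exp) (lambda (env) exp))
           ((symbol? exp) (lambda (env) (cdr (assq exp env))))
           (else (phi (E (car exp)) (E (cadr exp))))))

Here E[[ (X Y) ]] = Phi( E[[ X ]], E[[ Y ]] ) holds by construction;
e.g. ((E '(f 2)) (list (cons 'f (lambda (n) (* n n))))) returns 4.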

Now if you want to say that you don't believe that one could define
such a compositional E without Scott models, CPOs & structural
induction, fine, that's a reasonable mathematical opinion.  I think
the standard reduction function gives a counterexample, and hope we
get to the "bottom" of that.  But you can't just insert the phrase
"you have to define E by structural induction" into the definition.
0
richter1974 (312)
6/23/2004 3:49:00 AM
Bill Richter wrote:

> cesuraSPAM@verizon.net (William D Clinger) responds to me:

>>Pi is one of the computable real numbers.

> Oops, I think I see my goof now, Will.  Does computable mean we can
> write an integer lambda expression f s.t. (f n) is the n-th digit of
> the decimal expansion of pi?  

Almost. It means that you can make a Turing machine that writes
the digits of pi on a tape. The machine will never terminate,
but for any N you can be sure that it will at some point have written
more than N digits on the tape, if you wait long enough.

As it happens, it is possible to compute the n'th (for a fixed n)
decimal digit of pi (without calculating the other digits). This
surprising result is due to developments by the Borwein brothers
and Fabrice Bellard (author of tcc and more). Even for large n
it is possible to do the calculation with floating point.

> The point I would know is that since there's a
> countable number of integer lambda expressions, almost all real
> numbers must not be computable real numbers.

Exactly.


;;;
;;; Computation of the n'th digit of pi in base 10
;;;
;;; Jens Axel Søgaard, 27 May 2001, a dull Sunday
;;;
;;; Reference: "Computation of the n'th digit of pi in any base in O(n^2)"
;;;            by Fabrice Bellard, 1997
;;;            <http://www.stud.enst.fr/~bellard/pi/pi_n2/pi_n2.html>

;;
;;  MAIN
;;

; returns the n'th digit of pi in base 10

(define (pi-digit n)
   (let* ((N (inexact->exact (truncate (* (+ n 20) (/ (log 10) (log 2)))))))
     (letrec ((for-p
               (lambda (p sum)
                 ;; termination condition
                 (if (> p (* 2 N))
                     sum
                     (let* ((vmax    (inexact->exact (truncate (/ (log (* 2 N)) (log p)))))
                            (modulus (expt p vmax)))
                       (letrec ((for-k
                                 (lambda (k b A v alpha)
                                   (if (> k N)
                                       alpha
                                       (let* ((v (+ (- v (multiplicity p k))
                                                    (multiplicity p (- (* 2 k) 1))))
                                              (A  (modulo (/ (* (- (* 2 k) 1)
                                                                A)
                                                             (expt p (multiplicity p (- (* 2 k) 1))))
                                                          modulus))
                                              (b (modulo (* b (/ k (expt p (multiplicity p k))))
                                                         modulus)))
                                         (for-k (+ k 1) b A v
                                                (if (> v 0)
                                                    (+ alpha (* k b (inv-mod A modulus)
                                                               (expt p (- vmax v))))
                                                    alpha)))))))
                         (for-p (next-prime p)
                                (mod1 (+ sum
                                         (/ (modulo (* (expt-mod 10 (- n 1) modulus)
                                                       (for-k 1 1 1 0 0))
                                                    modulus)
                                            modulus))))))))))
       (first-digit (for-p 3 0)))))

;;
;; Arithmetical utilities
;;


; Extended version of Euclid's algorithm for finding the gcd
; returns
;        (g,a,b)  s.t.  g=gcd(m,n)  and  am+bn=g

(define (gcd-ext m n)
   (letrec ((helper (lambda (m n am bm an bn)
                      (if (> m n)
                          (helper (- m n) n (- am an) (- bm bn) an bn)
                          (if (> n m)
                              (helper m (- n m) am bm (- an am) (- bn bm))
                              (cons m (cons am (cons bm '()))))))))
     (helper m n 1 0 0 1)))


;;  b  s.t.  ab = 1 mod n    ( note: requires gcd(a,n)=1 )
(define inv-mod
   (lambda (a n)
     (cadr (gcd-ext a n))))


;;  base ^ exp  mod m    (Abelson et al. p.51)
(define (expt-mod base exp m)
   (cond ((= exp 0) 1)
         ((even? exp)
          (remainder (square (expt-mod base (/ exp 2) m))
                     m))
         (else
          (remainder (* base (expt-mod base (- exp 1) m))
                     m))))

(define (square x) (* x x))


;;;
;;; PRIME RELATED  (Abelson et al. p.50)
;;;

;;; TODO: Replace prime? from O(sqrt(n)) to O(log n) algorithm

(define (smallest-divisor n)
   (find-divisor n 2))

(define (find-divisor n test-divisor)
   (cond ((> (square test-divisor) n) n)
         ((divides? test-divisor n) test-divisor)
         (else (find-divisor n (+ test-divisor 1)))))

(define (divides? a b)
   (= (remainder b a) 0))

(define (prime? n)
   (= n (smallest-divisor n)))

(define (next-prime n)
   (if (= n 2)
       3
       (if (prime? (+ n 2))
           (+ n 2)
           (next-prime (+ n 2)))))

;;
;; FACTORISE
;;

; Example:  (factor 45) -> ((3 2) (5 1))

(define (factor n)
   (factor-from n 2))

(define (factor-from  n f)
   (if (= n 1)
       '()
       (if (divides? f n)
           (cons (list f (multiplicity f n))
                 (factor-from (/ n (expt f (multiplicity f n))) (next-prime f)))
           (factor-from n (next-prime f)))))

(define (multiplicity p n)
   (if (divides? p n)
       (+ 1 (multiplicity p (/ n p)))
       0))


; Utilities

(define (mod1 x)
   (- x (floor x)))

(define (first-digit x)
   (inexact->exact (floor (* 10 (mod1 x)))))


;;
;; TEST
;;

(define (interval from to)
   (if (> from to)
       '()
       (cons from (interval (+ 1 from) to))))


; guile> (map pi-digit (interval 1 100))

; (1 4 1 5 9 2 6 5 3 5
;  8 9 7 9 3 2 3 8 4 6
;  2 6 4 3 3 8 3 2 7 9
;  5 0 2 8 8 4 1 9 7 1
;  6 9 3 9 9 3 7 5 1 0
;  5 8 2 0 9 7 4 9 4 4
;  5 9 2 3 0 7 8 1 6 4
;  0 6 2 8 6 2 0 8 9 9
;  8 6 2 8 0 3 4 8 2 5
;  3 4 2 1 1 7 0 6 7 9)

-- 
Jens Axel Søgaard
0
usenet8944 (1130)
6/23/2004 6:53:56 AM
In article <57189ce0.0406221949.62937709@posting.google.com>,
Bill Richter <richter@math.northwestern.edu> wrote:
> cesuraSPAM@verizon.net (William D Clinger) responds to me:
> > the partial function from expressions to integers that is defined by
> > 
> >     GOOD [[ e ]] = E [[ e ]] r0 k0 s0
> > 
> > is computable, 
> 
> Will, either I disagree strongly, or I don't understand you.

Note the "partial". Here the function, instead of returning bottom,
simply is not defined on nonterminating expressions.

> > In Schmidt, p. 51, the existence of the meaning functions is proven
> > by structural induction. That's what "compositional" means: defined
> > with structural induction.

> But there's NO way I'll believe that Schmidt defines compositional to
> mean proven by structural induction.

True, Schmidt didn't say that. That was an interpretation of mine. And
when I said "means", I didn't mean "is defined to be", but rather
"amounts to in practice". Also, I should perhaps have said "recursion"
instead of "induction". Not that there is very much difference,
though.

>     [The map from syntactic domains to semantic domains] satisfies the
>     law of compositionality: the interpretation of a phrase is a
>     function of the interpretation of the sub-phrases.

To my mind, this is exactly what structural recursion is about. I
don't quite see what the big difference is supposed to be.

The point of my comment was, in any case, simply that compositionality
is not an arbitrary restriction. In fact, it is a _license_: you are
_allowed_ to use the meanings of subphrases when defining the meaning
of a phrase, because syntax trees are finite and induction on the
height of a tree is well-founded. If it weren't for this fact, you
couldn't use E _at all_ in the definition of E.


Lauri Alanko
la@iki.fi
0
la (473)
6/23/2004 7:55:07 AM
Lauri Alanko <la@iki.fi> writes:

> The point of my comment was, in any case, simply that compositionality
> is not an arbitrary restriction. In fact, it is a _license_: you are
> _allowed_ to use the meanings of subphrases when defining the meaning
> of a phrase, because syntax trees are finite and induction on the
> height of a tree is well-founded. If it weren't for this fact, you
> couldn't use E _at all_ in the definition of E.

And you aren't allowed to use E on other things *unless* you can prove
the induction is well-founded.  So things like  
   E [[ E [[ (lambda (x) x) ]] ]] are not valid.


-- 
~jrm
0
6/23/2004 10:12:48 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responds to me:
>  
>> This was in reference to `just' induction.  One of the Peano axioms
>> is (informally)
>> 
>>   ``If a property is possessed by 0 and also by the successor of
>>   every natural number which possesses it, then it is possessed by
>>   all natural numbers.''
>> 
>> and this, of course, is the foundation for mathematical induction.
>
> Joe, that's an important mistake.  The Peano induction axiom is only
> for statements expressed in the language of Peano arithmetic.  And the
> Peano language is so limited that you can't get a lot of induction
> that way.  So the Peano induction axiom is NOT the foundation for
> mathematical induction.  

I should have been more precise.  The Peano induction axiom allows you
to use mathematical induction in Peano arithmetic.   

> The foundation is the much more powerful ZFC axioms ...

Right, but these give you other kinds of induction that aren't
applicable.  So we take the induction that is applicable and declare
it to be an axiom in Peano arithmetic.

>> But mathematical induction isn't strong enough to show that
>> Goodstein sequences converge to zero.  The program I wrote halts if
>> Goodstein sequences converge to zero, therefore you need something
>> stronger than mathematical induction if you want a comprehensive
>> semantics.
>
> OK, you haven't established the last claim, as I showed above.  The
> rest is extremely interesting.  Are you sure it's not just Peano
> induction that's not strong enough?  

It's Peano induction that is not strong enough.  You need transfinite
induction.

-- 
~jrm
0
6/23/2004 10:25:46 AM
In article <hdt2vazl.fsf@comcast.net>,
Joe Marshall  <prunesquallor@comcast.net> wrote:
> And you aren't allowed to use E on other things *unless* you can prove
> the induction is well-founded.  So things like  
>    E [[ E [[ (lambda (x) x) ]] ]] are not valid.

This, incidentally, is one point where the usefulness of denotational
semantics seems to end: there is no sensible denotational definition
for an eval function.


Lauri
0
la (473)
6/23/2004 10:41:36 AM
richter@math.northwestern.edu (Bill Richter) wrote:
> > Suppose E [[ * ]] gives the denotational semantics of expressions in
> > a Scheme-like language, as a continuous partial function from environments
> > to continuous partial function from continuations to continuous partial
> > functions from stores to integer answers.  Suppose further that r0, k0,
> > and s0 are some particular environment, continuation, and store.  Then
> > the partial function from expressions to integers that is defined by
> > 
> >     GOOD [[ e ]] = E [[ e ]] r0 k0 s0
> > 
> > is computable, 
> 
> Will, either I disagree strongly, or I don't understand you.  If you
> mean GOOD is computable in the above sense (there's a computable
> function G s.t. G(n) converges to GOOD) then I didn't know that, but
> I'll take your word for it.  Sounds interesting.  But I say the
> Halting problem says GOOD is not a computable function.

GOOD is a computable partial function.  It is not a computable total
function because it isn't a total function.  You are consistently
failing to distinguish total from partial functions, and this is a
consistent flaw in your arguments.

> >     BAD [[ e ]]  =  632499472      if E [[ e ]] r0 k0 s0 = bottom
> >                  =  532028816      otherwise
> > 
> > is not computable, and is not the limit of any sequence of computable
> > approximations.
> 
> That's pretty interesting!  I don't know what that's true.

BAD solves the halting problem.  If e halts, then BAD [[ e ]] is
532028816.  If e does not halt, then BAD [[ e ]] is 632499472.
BAD is also a total function.  No computable total function solves
the halting problem.  Therefore BAD is not computable.

In the Scott domains that you have mentioned many times, the limit
of a sequence of computable approximations is itself computable.
Therefore BAD is not the limit of a sequence of computable
approximations.

> > In Schmidt, p. 51, the existence of the meaning functions is proven
> > by structural induction. That's what "compositional" means: defined
> > with structural induction.

Lauri's statement is plain enough.

> But there's NO way I'll believe that Schmidt defines compositional to
> mean proven by structural induction.

He defines compositional to mean *defined* by structural induction.
This implies that we can do proofs by structural induction, but that's
a consequence of Schmidt's definition, not his definition itself.

> That's not the way mathematical definitions are phrased.

You talk like you've seen a ghost.

HORATIO:  O day and night, but this is wondrous strange!
HAMLET:   And therefore as a stranger give it welcome.
          There are more things in heaven and earth, Horatio,
          Than are dreamt of in your philosophy....

> And it's not what Cartwright and Felleisen
> say in their "Extensible Denotational" paper.  They say:
> 
>     [The map from syntactic domains to semantic domains] satisfies the
>     law of compositionality: the interpretation of a phrase is a
>     function of the interpretation of the sub-phrases.

That's almost the same as Schmidt's definition, and the difference
is a very minor mistake by Cartwright and Felleisen:  They should
have said the interpretation of a phrase is defined by a function
of the interpretation of the sub-phrases.

> Now if you want to say that you don't believe that one could define
> such a compositional E without Scott models, CPOs & structural
> induction, fine, that's a reasonable mathematical opinion.

There are zillions of compositional definitions of one E or another
that don't rely on Scott models or CPOs.  There are no compositional
definitions that don't use structural induction, because structural
induction is implied by the meaning of "compositional".

> But you can't just insert the phrase
> "you have to define E by structural induction" into the definition.

So you say.  But Schmidt did it.  Stoy did it.  Strachey did it.
Lots of other people have done it.  All of these people provided
counterexamples by doing exactly what you say they can't do.

On the most basic level, the word "compositional" refers to the
definition of the function.  If a function's definition is
compositional, then the function will have certain properties,
and it is reasonable to allude to those properties by saying the
function is compositional, but that's a minor abuse of language.
It's really the function's definition that is compositional.

Will
0
cesuraSPAM (401)
6/23/2004 2:18:56 PM
Lauri Alanko <la@iki.fi> responds to me:

> > > In Schmidt, p. 51, the existence of the meaning functions is
> > > proven by structural induction. That's what "compositional"
> > > means: defined with structural induction.
>  
> > But there's NO way I'll believe that Schmidt defines compositional
> > to mean proven by structural induction.
> 
> True, Schmidt didn't say that. That was an interpretation of mine. And
> when I said "means", I didn't mean "is defined to be", but rather
> "amounts to in practice". 

Excellent, Lauri!!!  So do we agree that a DS semantic function

E: Expressions ---> M

is compositional iff various functions exist and equations hold?
E.g. there must be a function

Phi: M x M ---> M

s.t. for expressions X & Y, E[[ (X Y) ]] = Phi( E[[ X ]], E[[ Y ]] )?

> >     [The map from syntactic domains to semantic domains] satisfies
> >     the law of compositionality: the interpretation of a phrase is
> >     a function of the interpretation of the sub-phrases.
> 
> To my mind, this is exactly what structural recursion is about. I
> don't quite see what the big difference is supposed to be.

Big difference between Schmidt & Cartwright-Felleisen?  I don't think
there's any difference.  I just don't have Schmidt's book anymore.

> The point of my comment was, in any case, simply that
> compositionality is not an arbitrary restriction. In fact, it is a
> _license_: you are _allowed_ to use the meanings of subphrases when
> defining the meaning of a phrase, because syntax trees are finite
> and induction on the height of a tree is well-founded. If it weren't
> for this fact, you couldn't use E _at all_ in the definition of E.

I think I understand you: we can define E and prove compositionality
at the same time: by your structured induction.  That's fine. 

I'm just asserting that one could define DS semantic functions E
without this structured induction, and then prove compositionality
later, so we'd construct the functions like Phi later.

> > cesuraSPAM@verizon.net (William D Clinger) responds to me:

> > > the partial function from expressions to integers that is defined by
> > > 
> > >     GOOD [[ e ]] = E [[ e ]] r0 k0 s0
> > > 
> > > is computable, 
> > 
> > Will, either I disagree strongly, or I don't understand you.
> 
> Note the "partial". Here the function, instead of returning bottom,
> simply is not defined on nonterminating expressions.

Ah, excellent.  I just got so much in the habit of thinking that DS
semantic functions were total functions.  But Will didn't assert that
GOOD was a DS semantic function, so it doesn't have to be total.  

One of the first things that Schmidt says is that a partial function

f: X ---> Y

is the same thing as a total function 

g: X ---> Y_bottom

But of course we can consider partial functions

h: X ---> Y_bottom

And Will gave a good reason to do so.  GOOD is a computable partial
function, because it's a mathematization of a Scheme-like interpreter.
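
To make the lift concrete, here's a toy Scheme sketch (the names
just/bottom/lift are all mine, and this only works when the
partiality is detectable -- genuine nontermination can't be
converted this way by any computable test):

    (define bottom 'bottom)
    (define (just y) (list 'just y))

    ;; Lift a function that signals "undefined" by returning #f
    ;; into a total function X ---> Y_bottom.
    (define (lift f)
      (lambda (x)
        (let ((y (f x)))
          (if y (just y) bottom))))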
0
richter1974 (312)
6/24/2004 12:01:43 AM
cesuraSPAM@verizon.net (William D Clinger) responds to me:

> You are consistently failing to distinguish total from partial
> functions, and this is a consistent flaw in your arguments.

I certainly failed here, Will, but I don't see yet that it's a
serious flaw.  My interest in your example is that it points out the
importance of the Scott topology.  Your convergence to E is only
meaningful because of a topology.  That's not something I've done.

> In the Scott domains that you have mentioned many times, the limit
> of a sequence of computable approximations is itself computable.

I didn't know that either, and I think it sounds great!  I'll go read
some Barendregt tonight.

> On the most basic level, the word "compositional" refers to the
> definition of the function.  If a function's definition is
> compositional, then the function will have certain properties, and
> it is reasonable to allude to those properties by saying the
> function is compositional, but that's a minor abuse of language.
> It's really the function's definition that is compositional.

I'm sure this is false.  Can you quote me from these sources:

>So you say.  But Schmidt did it.  Stoy did it.  Strachey did it.
>Lots of other people have done it.

And can you explain how you'd rule anything out by your definition?
That is, suppose I can define a function E without structured
induction, and then later construct the various Phi functions that
satisfy what I think the compositionality equations are. So e.g.  
Phi: M x M ---> M & forall X & Y, E[ (X Y) ] = Phi( E[ X ], E[ Y ] )

Now forget E altogether, and let's use my Phi functions to build a new
function E' by structured induction using my (already constructed) Phi
functions.  So isn't E' function-definition-compositional, and don't
we see by structured induction that E' = E?

> HORATIO:  O day and night, but this is wondrous strange!
> HAMLET:   And therefore as a stranger give it welcome.
>           There are more things in heaven and earth, Horatio,
>           Than are dreamt of in your philosophy....

Good quote, and partly it means that when you make a mathematical
definition, folks use it in ways you didn't consider.

> > And it's not what Cartwright and Felleisen
> > say in their "Extensible Denotational" paper.  They say:
> > 
> >     [The map from syntactic domains to semantic domains] satisfies the
> >     law of compositionality: the interpretation of a phrase is a
> >     function of the interpretation of the sub-phrases.
> 
> That's almost the same as Schmidt's definition, and the difference
> is a very minor mistake by Cartwright and Felleisen:  They should
> have said the interpretation of a phrase is defined by a function
> of the interpretation of the sub-phrases.

I think you're wrong.  I think C-F said what they meant.  But I like
your precision.  If we can stick in the 2 extra words "defined by",
then you're right.
0
richter1974 (312)
6/24/2004 12:36:40 AM
richter@math.northwestern.edu (Bill Richter) wrote:
> > You are consistently failing to distinguish total from partial
> > functions, and this is a consistent flaw in your arguments.
> 
> I certainly failed here, Will, but I don't see yet that it's a
> serious flaw.

Of course you don't.

> > On the most basic level, the word "compositional" refers to the
> > definition of the function.  If a function's definition is
> > compositional, then the function will have certain properties, and
> > it is reasonable to allude to those properties by saying the
> > function is compositional, but that's a minor abuse of language.
> > It's really the function's definition that is compositional.
> 
> I'm sure this is false.

Of course you are.

> Can you quote me from these sources:
> 
> >So you say.  But Schmidt did it.  Stoy did it.  Strachey did it.
> >Lots of other people have done it.

Of course I can.

> And can you explain how you'd rule anything out by your definition?

Of course.  For examples of semantics or pseudo-semantics that aren't
compositional, search this newsgroup's archives for posts that contain
both "Richter" and "compositional".

> > That's almost the same as Schmidt's definition, and the difference
> > is a very minor mistake by Cartwright and Felleisen:  They should
> > have said the interpretation of a phrase is defined by a function
> > of the interpretation of the sub-phrases.
> 
> I think you're wrong.

Of course you think I'm wrong.  The essence of your argument is that
the form of a definition cannot be a mathematically relevant concept.

That's plumb crazy.  Go look up the definition of a finitely presented
group.  Go look up the definition of primitive recursion.

Will
0
cesuraSPAM (401)
6/24/2004 7:44:19 PM
cesuraSPAM@verizon.net (William D Clinger) heatedly responded to me:

> The essence of your argument is that the form of a definition cannot
> be a mathematically relevant concept.

That's probably a fair charge, Will, so let me recant.  Maybe
compositionality can be defined your way, and I try to do so below,
and that would refute numerous claims of mine.

But what's the big fuss?  In my last post, I semi-proved that

   a function's DEFINITION is compositional (in your sense)

   if and only if 

   the FUNCTION is compositional (in my sense)

You didn't respond to my proof, so I'll do it again, and first I'll
actually try to mathematize your definition:

   Definition: A function 

   E: Expressions ---> M 

   is called compositional iff there exist functions such as

   Phi: M x M ---> M   ;; for combinations

   so that E is defined by structural induction by the various Phi
   functions.

And then we have the immediate result: 

   Lemma: If E is compositional, and compositionally defined by
   functions such as Phi, then for all expressions X & Y,

   E[ (X Y) ] = Phi( E[ X ], E[ Y ] )

I'd call that a meaningful mathematical definition.  Is that
acceptable to you?  I assert this is equivalent to my definition:

   Definition: A function 

   E: Expressions ---> M 

   is called compositional iff there exist functions such as

   Phi: M x M ---> M 

   satisfying equations like

   E[ (X Y) ] = Phi( E[X], E[Y] )  for all expressions X & Y.


As I proved in my last post, my definition is equivalent to yours.
Given my E & Phi functions, use the Phi functions to define by
structural induction a new function E', which is compositional by your
definition.   Then by structural induction, we see E' = E.
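
In Scheme the second half of that argument is mechanical (a toy
sketch: expressions here are symbols or two-element lists, e0 is
E restricted to variables, and phi is the extracted Phi):

    (define (rebuild-E e0 phi)
      (define (E* e)                            ; E* plays the role of E'
        (if (symbol? e)                         ; base case: variables
            (e0 e)
            (phi (E* (car e)) (E* (cadr e)))))  ; combinations (X Y)
      E*)

By construction E* satisfies the same equations as E and agrees
with it on variables, so tree-induction gives E* = E.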

 
Let me make a heat comment now: we didn't get into this argument
because I said you guys didn't know the def of compositionality.
Instead, you guys ruled out my standard-reduction-type semantic
functions because you said they didn't satisfy compositionality.  So
if we agree that my def and yours are equivalent, we can use your def.


> That's plumb crazy.  Go look up the definition of a finitely presented
> group.  Go look up the definition of primitive recursion.

Now I would say neither definition is in terms of "the form of a
definition."  Maybe that's a judgment call.  Here are my defs:

  Definition: G is a finitely GENERATED group if there is a surjective
  map F -->> G from a free group on a finite number of generators.

  Definition: G is a finitely PRESENTED group if there is a surjective
  map f: F -->> G from a free group on a finite number of generators
  such that the kernel of f is a finitely generated group.

  Definition: B-J's class of PARTIAL recursive functions N^k ---> N is
  the class inductively generated from the k projection maps

  pi_i: N^k ---> N

  via the operations of composition, primitive recursion

  f(x, 0) = alpha(x)
  f(x, n) = f(beta(x), n-1) for n > 0

  and minimizing recursion

  f(x) = least n such that g(x,n) = 0, with g(x,i) defined for i < n.
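
The minimizing recursion, by the way, transcribes directly into
Scheme (my sketch, with g assumed given); it diverges exactly
where the partial function is undefined:

    ;; (minimize g x) = least n such that (g x n) = 0.
    (define (minimize g x)
      (let loop ((n 0))
        (if (zero? (g x n))
            n
            (loop (+ n 1)))))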


Furthermore, there's 2 good pieces of evidence that my definition of
compositionality is the one that's used in the literature:

1) my quote from C-F (into which you wanted to insert 2 words)

2) Lauri agreed with my recollection that Schmidt used my definition.

Now I don't have Schmidt's book anymore, but here's a quote I dug up
from the last go-round, from Stephen Bevan:

> "the meaning of a syntax phrase may only be defined in terms of the
> meanings of its proper subparts"

That's evidence for your position, but I wouldn't say it's compelling
evidence.  It could be evidence for my position as well. But Stephen's
quote is consistent with the way you wanted to modify C-F:

     A function satisfies the law of compositionality if the
     interpretation of a phrase is DEFINED BY a function of the
     interpretation of the sub-phrases.

And to you that means:

     A function satisfies the law of compositionality if it's defined
     in a certain way by structural induction. 

To me, that's an odd way to talk.  If a function satisfies a law, why
not express the law in terms the function, and not the function's
definition?  I'd call that an improvement, if it was possible.
Math & Scheme move in the direction of greater abstraction.


> > And can you explain how you'd rule anything out by your definition?
> 
> Of course.  For examples of semantics or pseudo-semantics that aren't
> compositional, search this newsgroup's archives for posts that contain
> both "Richter" and "compositional".

:D That might be technically correct, as I've often failed to
correctly define my semantic functions.  I've only succeeded twice: I
correctly defined a standard-reduction-type semantic function on 26
Jun 2002, and then recently in regards to Advanced Scheme.  But in
neither post did I explain why my function was compositional, and with
Advanced Scheme, I'm not even sure it's true, because of global
variable renaming.  Probably that's a solvable problem...


> > > You are consistently failing to distinguish total from partial
> > > functions, and this is a consistent flaw in your arguments.

Will, you've made 2 general charges about my posts, and this is one of
them.  The other is my talk about Scott models.  Let me address both:

******** Scott models of LC ********

I'm definitely not an expert in how Scott models are useful in DS.  If
I ever gave that impression, I apologize.  The one thing that I've
posted (often) is that Scott models are hard!  I actually know what
the topology is!  P(omega) is a non-Hausdorff version of a Cantor set.
And I've been through a Barendregt's proof that P = (P -> P).  Now
Will, I imagine that you've replicated these 2 feats of mine, but I
bet most folks here haven't.  Stoy proves these results, and I bet you
slogged through the proofs.  Furthermore, I think you can say a lot
about how Scott models.  Other than to merely get P = (P -> P) so we
can define procedures as functions on values.  I only post my topology
knowledge to give folks pause: this isn't chopped liver!  We're using
serious math here, and we hope there are serious rewards.

******** Total-vs-Partial ******** 

I thought that Schmidt requires all DS semantic functions to be total
functions, where the target has a bottom.  I thought everyone here
insists on that convention as well.   

I thought I had a big breakthrough last go-round, when I finally
realized that the R5RS DS semantic function curly-E can not possible
be "an algorithm" (as I'd earlier many times posted), because the
Halting problem says no such total function is computable.

Furthermore, I'm only interested in total functions myself in trying
to explain the semantics of a language.  I want a total function 

E: Expressions ---> M 

where M has a bottom element.  I want to describe mathematically the
inverse image of bottom.  So I like Schmidt's total convention.

Having said that, I've been sloppy about partial vs total functions in
my recent posts, but only to be more clear.   Now your post took me by
surprise, when you gave a computable partial function

E: Expressions ---> Z

I hadn't ever considered anything like that in my posts, and I didn't
think my audience was either.
0
richter1974 (312)
6/25/2004 4:26:56 AM
richter@math.northwestern.edu (Bill Richter) writes:

> But what's the big fuss?  In my last post, I semi-proved that
> [...]
> You didn't respond to my proof [...]

In a paragraph it went from a "semi-proof" to a "proof"?

Geez, Bill, I happen to know that you're a pretty smart mathematician.
But the standards you've adopted on this newsgroup are so sloppy I
wouldn't accept them coming even from my undergrad students.  Do you
really think you'd get away with this sort of thing in the math
community?  This suggests that you either

- think the people on this ng are so mathematically illiterate that
you can throw around a few symbols and they'll be impressed (and
recant their former ways)

- think ng posts don't need to meet the levels of rigor of a
manuscript

- think "math math" has different, more rigorous, standards than "CS
xmath"

- other options that involve typing in lieu of thought

The right way to address a question like this is not to post ad
infinitum to a newsgroup until your audience has been ground down to
exhaustion (which, I note, you've now accomplished twice in this
millennium, an impressive accomplishment).  Let me make a suggestion.

The right way is to prepare a manuscript.  Clearly you know how to
write one.  I also have a feeling that when you're writing in TeX
content that will appear printed on paper and be read in serious
referee channels, you automatically police your sloppiness.

When your manuscript is done, and is of a level of rigor that you
would not be ashamed to submit it to a rigorous math venue, get in
touch.  I'd be happy to suggest a few journals to which you can submit
it for formal, refereed reviews.  So would others on this ng.

Shriram
0
sk1 (223)
6/25/2004 12:49:39 PM
richter@math.northwestern.edu (Bill Richter) writes:

> cesuraSPAM@verizon.net (William D Clinger) heatedly responded to me:
>
>> The essence of your argument is that the form of a definition cannot
>> be a mathematically relevant concept.
>

In particular you need to distinguish between `definition' and
`specification'.  I can say the following:

                   /  42,   if  x = 0
            F(x) = |
                   \ F(x+1),  otherwise

This is not a definition of F, but rather a condition that a putative
F must satisfy.  If I need a definition, I must find a function that
satisfies the specification.  For some purposes it may be sufficient
to simply prove that a function exists, for other purposes it may be
necessary to prove that a unique function meets the specification.

If one can prove that F exists and is unique, then the specification
is about as good as a definition, but in some cases (like when
defining an interpreter) you need to be able to specify F in closed
form as a combination of primitive operators so you can create an
algorithm that computes F.

If the argument to F is created through the application of a finite
number of composition steps applied to a finite number of primitive
elements, then a specification of the form

   F ( composition(a, b, c, d) ) = 
          different-composition(F(a), F(b), F(c), F(d))

  where a, b, c, and d are primitive elements (or compositions
  thereof) and the composition steps are unrelated to F,

can suffice as a definition because you can obviously create a closed
form by substitution.  This is often done with *expressions* because
expressions in a computer language are usually composed of primitive
expression elements and computer programs are generally finite.  The
technique does not generalize to composition steps dependent upon F.
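
For concreteness, that specification transcribes directly into
Scheme, and the program behaves exactly like the least fixed point:
it returns 42 at 0 and loops forever on positive arguments.

    (define (F x)
      (if (= x 0)
          42
          (F (+ x 1))))     ; (F 0) => 42; (F 1) never returns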

> You didn't respond to my proof, so I'll do it again, and first I'll
> actually try to mathematize your definition:
>
>    Definition: A function 
>
>    E: Expressions ---> M 
>
>    is called compositional iff there exist functions such as
>
>    Phi: M x M ---> M   ;; for combinations
>
>    so that E is defined by structural induction by the various Phi
>    functions.

This isn't acceptable to me unless Phi satisfies the requirement that
it is *independent* of E.  Furthermore, you are doing the induction on
the wrong term!  

    Definition: A function 

    E: Expr ---> M 

    is called compositional iff there exist function pairs
    *independent of E* such as

    Psi: Expr x Expr ---> Expr
    Phi: M x M ---> M   ;; for combinations

    so that if   Psi (e1, e2) = e3, E(e1) = m1, E(e2) = m2, then 
    E(e3) = E (Psi (e1, e2)) = Phi (E(e1), E(e2)) = Phi (m1, m2)

    Furthermore, it must be possible to express any Expr as one of a
    set of primitive Exprs or the result of a finite number of Psis
    applied to primitive Exprs.

Now this may not be quite the standard definition of the term
`compositional', but it is what we have in mind.  The Psi function
performs syntactic composition while the Phi performs the appropriate
semantic composition.
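
A toy instance may help fix ideas (entirely made up, not from any
of the books): numbers are the primitive Exprs, Psi pairs two
Exprs into a list, and Phi multiplies two meanings.

    (define (psi e1 e2) (list e1 e2))   ; syntactic composition
    (define (phi m1 m2) (* m1 m2))      ; semantic composition

    (define (E e)                       ; the induced meaning function
      (if (number? e)
          e
          (phi (E (car e)) (E (cadr e)))))

    ;; (E (psi 2 (psi 3 4))) => 24, i.e. E(Psi(e1,e2)) = Phi(E(e1),E(e2))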

> ******** Scott models of LC ********
>
> I'm definitely not an expert in how Scott models are useful in DS.  

Since we want to talk about higher-order functions, we need an
appropriate set D = [D->D] that has something useful in it.  The Scott
model demonstrates that such a set exists and that we can use it to
model our computable functions.

> The one thing that I've posted (often) is that Scott models are
> hard!  

No, they're not!  They are reasonably simple.  The big issue is this:
a computer can only represent objects by strings of bits, so our
domain of values D must be countable.  But the number of mappings from
D -> D isn't countable, so we have to restrict ourselves to a
countable subset of them.  The subset had better be rich enough to be
useful and we need to ensure that all elements in D correspond to
elements in D->D so that our subset is closed under iterative mapping.

> I actually know what the topology is!  P(omega) is a non-Hausdorff
> version of a Cantor set.  And I've been through Barendregt's proof
> that P = (P -> P).  Now Will, I imagine that you've replicated these
> 2 feats of mine, but I bet most folks here haven't.

I bet most of the folks arguing with you have.  I slogged through Stoy
several years back.
0
jrm (1310)
6/25/2004 4:51:57 PM
Shriram Krishnamurthi <sk@cs.brown.edu> responds to me:

> > But what's the big fuss?  In my last post, I semi-proved that
> > [...]
> > You didn't respond to my proof [...]
> 
> In a paragraph it went from a "semi-proof" to a "proof"?

:D Shriram, I meant that in my first post, I hadn't defined Will's
function-definition-compositionality, so I only semi-proved that it's
the same as function-compositionality.  Last post, when I gave a
rigorous definition of function-definition-compositionality, the same
argument becomes a real proof.


> Geez, Bill, I happen to know that you're a pretty smart
> mathematician.  But the standards you've adopted on this newsgroup
> are so sloppy I wouldn't accept them coming even from my undergrad
> students.  
 
Thanks!  I'm definitely embarrassed to have made so many dumb mistakes.
But it's easy to make dumb mistakes when you're arguing with folks who
are also making dumb mistakes.  The matter is of course symmetrical...

> The right way is to prepare a manuscript.  Clearly you know how to
> write one.  I also have a feeling that when you're writing in TeX
> content that will appear printed on paper and be read in serious
> referee channels, you automatically police your sloppiness.

Thanks, but it's not clear I have a result that requires a proof.  I'm
saying that we can define a DS semantic function (probably
compositional) for State Scheme (probably Control as well) by
standard-reduction techniques.  More or less what's published in HtDP
sec 38.  I think folks saying that's trivial, but they object that my
function isn't a DS semantic function, on the grounds that I didn't
define it the proper way.  I can't write a paper saying,

"Some folks think that compositionality means there's an irreversible
 flow of information: first we learn how to build the helper Phi
 functions, and that's the only way we can know how to build the
 semantic function E, related to the Phi's as e.g.:

 E[[ (X Y) ]] = Phi( E[[ X ]], E[[ Y ]] ), for expressions X & Y. 

 But it's impossible to formulate this notion mathematically."
0
richter1974 (312)
6/25/2004 7:19:18 PM
I am *so* glad this thread has not been useless.

richter@math.northwestern.edu (Bill Richter) wrote:
> > The essence of your argument is that the form of a definition cannot
> > be a mathematically relevant concept.
> 
> That's probably a fair charge, Will, so let me recant.  Maybe
> compositionality can be defined your way, and I try to do so below,
> and that would refute numerous claims of mine.
> 
> But what's the big fuss?  In my last post, I semi-proved that
> 
>    a function's DEFINITION is compositional (in your sense)
> 
>    if and only if 
> 
>    the FUNCTION is compositional (in my sense)
> 
> You didn't respond to my proof, so I'll do it again, and first I'll
> actually try to mathematize your definition:
> 
>    Definition: A function 
> 
>    E: Expressions ---> M 
> 
>    is called compositional iff there exist functions such as
> 
>    Phi: M x M ---> M   ;; for combinations
> 
>    so that E is defined by structural induction by the various Phi
>    functions.
> 
> And then we have the immediate result: 
> 
>    Lemma: If E is compositional, and compositionally defined by
>    functions such as Phi, then for all expressions X & Y,
> 
>    E[ (X Y) ] = Phi( E[ X ], E[ Y ] )
> 
> I'd call that a meaningful mathematical definition.  Is that
> acceptable to you?

Almost.  I have one minor quibble and one major reservation.

Minor quibble:  Instead of saying "functions such as Phi", you
should say "a family of functions Phi_P indexed by the set of
syntactic productions P".

> I assert this is equivalent to my definition:
> 
>    Definition: A function 
> 
>    E: Expressions ---> M 
> 
>    is called compositional iff there exist functions such as
> 
>    Phi: M x M ---> M 
> 
>    satisfying equations like
> 
>    E[ (X Y) ] = Phi( E[X], E[Y] )  for all expressions X & Y.
> 
> 
> As I proved in my last post, my definition is equivalent to yours.
> Given my E & Phi functions, use the Phi functions to define by
> structural induction a new function E', which is compositional by your
> definition.   Then by structural induction, we see E' = E.

Major reservation:  When you prove something about E by structural
induction, you *must* use a compositional definition of E, not some
extensionally equivalent definition that isn't compositional.  The
reason for this is that a structural induction proceeds by cases
over a well-founded set.  If your alternative definition proceeds
by a case analysis that isn't well-founded, as in many (though
possibly not all) of your alternative definitions, then its case
analysis does not support the principle of structural induction.

That's not to say you can't prove anything at all using a definition
that isn't compositional.  It just says you can't use the principle
of structural induction unless your definition is compositional.

> Let me make a heat comment now: we didn't get into this argument
> because I said you guys didn't know the def of compositionality.
> Instead, you guys ruled out my standard-reduction-type semantic
> functions because you said they didn't satisfy compositionality.  So
> if we agree that my def and yours are equivalent, we can use your def.

It's this kind of statement that makes me suspect you don't understand
what I've said above.

I suggested you look up the definition of primitive recursion.  You came
back with this definition of partial recursive functions:

>   Definition: B-J's class of PARTIAL recursive functions N^k ---> N is
>   the class inductively generated from the k projection maps
> 
>   pi_i: N^k ---> N
> 
>   via the operations of composition, primitive recursion
> 
>   f(x, 0) = alpha(x)
>   f(x, n) = f(beta(x), n-1) for n > 0
> 
>   and minimizing recursion
> 
>   f(x) = least n such that g(x,n) = 0, with g(x,i) defined for i < n.

If you extract the definition of primitive recursion from this context,
you will see that it has everything to do with the form of a function
definition.

> > > And can you explain how you'd rule anything out by your definition?
> > 
> > Of course.  For examples of semantics or pseudo-semantics that aren't
> > compositional, search this newsgroup's archives for posts that contain
> > both "Richter" and "compositional".
>  
> :D That might be technically correct, as I've often failed to
> correctly define my semantic functions.  I've only succeeded twice: I
> correctly defined a standard-reduction-type semantic function on 26
> Jun 2002, and then recently in regards to Advanced Scheme.  But in
> neither post did I explain why my function was compositional, and with
> Advanced Scheme, I'm not even sure it's true, because of global
> variable renaming.  Probably that's a solvable problem...

Probably?  For two years now, you've been claiming your definitions are
compositional.  You still haven't posted one that is.


> We're using serious math here, and we hope there are serious rewards.

The serious reward (of Scott models) is that we know that certain forms
of recursive definitions are guaranteed to have solutions, and that those
solutions are computable.

This guarantee does not apply to recursive definitions that are not in
the approved forms.  You have persisted in claiming that your definitions
have an approved form when they do not, you have persisted in claiming
that your definitions have computable solutions when that is not clear
to the most expert readers of this newsgroup, and you have persisted in
denying the computability of functions whose computability is immediately
obvious by direct application of the major theorems in this area.

> ******** Total-vs-Partial ******** 
> 
> I thought that Schmidt requires all DS semantic functions to be total
> functions, where the target has a bottom.  I thought everyone here
> insists on that convention as well.

There are two things to say here.

One is that a semantic function may be a total function whose range is
a domain of partial functions.  You have often failed to distinguish
the totality of semantic function from the non-totality of functions
in its range.

Secondly, the Scott topology (and related models) allow us to regard a
partial function as a total function whose range is extended (lifted)
by adding a bottom element.  The totality of this extended (lifted)
function does not imply totality of the original unextended function.
You have often failed to distinguish the two.

> I thought I had a big breakthrough last go-round, when I finally
> realized that the R5RS DS semantic function curly-E cannot possibly
> be "an algorithm" (as I'd earlier many times posted), because the
> Halting problem says no such total function is computable.

That wasn't a breakthrough, that was tremendous confusion on your
part.  Scheme's curly-E is total, by a trivial structural induction.
It is also computable (assuming the missing parameters are computable),
by a trivial examination of the form of the definition of curly-E.
Furthermore

    curly-E [[ * ]] r,
    curly-E [[ * ]] r k,
and curly-E [[ * ]] r k s

are computable for any fixed r, k, and s.  The last is not total,
however.  If it were total, it would solve the halting problem.
But it is not total.  It is computable, and it is the limit of a
sequence of computable finite approximations.
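
Here is one way to see those approximations concretely (a toy
sketch, not curly-E itself): bound an evaluator by a step count n.
Each approximation is total and computable, answering 'bottom when
the fuel runs out, and the limit as n grows is the computable
partial function.

    ;; Toy language: numbers, (+ e1 e2), and a diverging form (loop).
    (define (E-n e n)
      (cond ((zero? n) 'bottom)
            ((number? e) e)
            ((eq? (car e) 'loop) (E-n e (- n 1)))
            (else                                ; (+ e1 e2)
             (let ((a (E-n (cadr e) (- n 1)))
                   (b (E-n (caddr e) (- n 1))))
               (if (or (eq? a 'bottom) (eq? b 'bottom))
                   'bottom
                   (+ a b))))))

    ;; (E-n '(+ 1 2) 10) => 3;  (E-n '(loop) n) => bottom for every n.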

> Furthermore, I'm only interested in total functions myself in trying
> to explain the semantics of a language.

One of the most basic facts about recursion theory is that you can't
restrict yourself to total functions without either (1) restricting
yourself to some proper subset of the total functions or (2) having
no effective way to distinguish the total functions you care about
from the non-total functions you don't care about.

>  I want a total function 
> 
> E: Expressions ---> M 
> 
> where M has a bottom element.  I want to describe mathematically the
> inverse image of bottom.  So I like Schmidt's total convention.

You can describe that inverse image mathematically, but that inverse
image is generally not computable.

> Having said that, I've been sloppy about partial vs total functions in
> my recent posts, but only to be more clear.   Now your post took me by
> surprise, when you gave a computable partial function
> 
> E: Expressions ---> Z
> 
> I hadn't ever considered anything like that in my posts, and I didn't
> think my audience was either.

The thing that surprised you is that E is computable, even though
E^{-1} (\bottom) is not computable.

I think that surprised you a lot more than it surprised your audience,
but I have no idea what audience you think you have at this point.

Will
0
cesuraSPAM (401)
6/25/2004 8:33:14 PM
Shriram Krishnamurthi <sk@cs.brown.edu> responded to me:

> [maybe you] think "math math" has different, more rigorous,
> standards than "CS math"

Shriram, whatever you think, please please do not think I think that!

I've been very favorably impressed with the math rigor of the CS stuff
I've read, especially Felleisen-Flatt's notes on LC_v, but I haven't
seen any papers or books (e.g. Schmidt's book) that fell below my
rigor standards.  Maybe I didn't understand what I was reading :^0

It my "math math" guys who fall below my standards.  I've been trying
to rigorize a part of homotopy theory for almost 4 years now, and the
bugs they ignored in constructing the cutting edge field, they now
pretend don't exist, the better to destroy the now-passe field.

> not to post ad infinitum to a newsgroup until your audience has been
> ground down to exhaustion (which, I note, you've now accomplished
> twice in this millennium, an impressive accomplishment).  

:D I don't think we're near exhaustion.  I thought I was having a
pleasant chat with Joe & Lauri, and then Will jumped in to give me
some valuable lessons.  Will sounds angry, but that's all from the
last go-round.  I'm not angry, & Will made at least 3 contributions.

> I also have a feeling that when you're writing in TeX content that
> will appear printed on paper and be read in serious referee
> channels, you automatically police your sloppiness.

Unfortunately, I have proof that I'm not much better in TeX than text.
I posted twice on a homotopy theory mailing list:

http://www.lehigh.edu/~dmd1/postings.html

Near the bottom you'll see 2 links:

# Richter's EHP proof of lambda admissible basis and more
	    ------------------------------------     ----

The first is 400+ lines of text, and the 2nd is a link to a 7-page
dvi/pdf file a week later, and there was no increase in rigor.

It's a paper I've been trying to write for 2+ years.  Lots of folks
published that some basic result is trivial, but I actually have a
proof, and IMO it's not trivial.  In January, a retired hotshot told
me "No, that's not trivial, that's my 40-yr old published proof!"  I
thought that'd fix the political mess, but another hotshot said he
had a better proof the 1st hotshot never saw, and I said tell me the
proof, I'll refute it on the spot, and he said no, he didn't want to
discuss it, but a 3rd hotshot taught a version of this `better proof'
in his class, and it was 5 lines long!  My 7 page paper (which cites
results from my preprint that's being held up) is supposed to create
"reasonable doubt" in my soon-to-be referee's mind...
0
richter1974 (312)
6/26/2004 1:20:17 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

> > P(omega) is a non-Hausdorff version of a Cantor set.  And I've
> > been through Barendregt's proof that P = (P -> P).  Now Will, I
> > imagine that you've replicated these 2 feats of mine, but I bet
> > most folks here haven't.
> 
> I bet most of the folks arguing with you have.  I slogged through Stoy
> several years back.

Then I stand corrected, Joe.  Perhaps then there was a point in me
posting Barendregt's version, which is pretty different from Stoy's.

BTW I don't know what Stoy wrote about compositionality, which I
didn't get clued into until later when I was reading Schmidt.

> > You didn't respond to my proof, so I'll do it again, and first I'll
> > actually try to mathematize your definition:
> >
> >    Definition: A function 
> >
> >    E: Expressions ---> M 
> >
> >    is called compositional iff there exist functions such as
> >
> >    Phi: M x M ---> M   ;; for combinations
> >
> >    so that E is defined by structural induction by the various Phi
> >    functions.
> 
> This isn't acceptable to me unless Phi satisfies the requirement that
> it is *independent* of E.  

Yeah, that's what I meant, Joe.  That's structural induction, right?

> Furthermore, you are doing the induction on the wrong term!

I didn't mean to: we start with the Phi/Psi pairs, and induct up the
expression trees to build E.  Just like you say:

>     Definition: A function 
> 
>     E: Expr ---> M 
> 
>     is called compositional iff there exist function pairs
>     *independent of E* such as
> 
>     Psi: Expr x Expr ---> Expr
>     Phi: M x M ---> M   ;; for combinations
> 
>     so that if   Psi (e1, e2) = e3, E(e1) = m1, E(e2) = m2, then 
>     E(e3) = E (Psi (e1, e2)) = Phi (E(e1), E(e2)) = Phi (m1, m2)

Yeah, you want to consider more complicated expression-builders than
(X Y). For my Psi, which is Psi(e1, e2) = (e1 e2), you get

	         E( (e1 e2) ) = Phi( E(e1), E(e2) )

and that's my    E[[ (X Y) ]] = Phi( E[[ X ]], E[[ Y ]] ).

>     Furthermore, it must be possible to express any Expr as one of a
>     set of primitive Exprs or the result of a finite number of Psis
>     applied to primitive Exprs.

Yeah, that's how structural induction works. 

> Now this may not be quite the standard definition of the term
> `compositional', but it is what we have in mind.  

Great!  Then I (or maybe you) succeeded in mathematically formulating
Will's definition of compositionality.  Now back to (my) 1st line:

> > You didn't respond to my proof, so I'll do it again, and first

What do you think of my proof that Will's definition of
compositionality is equivalent to mine?  I'll rephrase:

Suppose I can define a function E without structured induction, and
then later construct the Phi functions giving E/Phi/Psi equations

E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M,   for all e, f in Expr

Now forget E altogether, and let's use my Phi functions to build a new
function E' by structured induction using my (already constructed) Phi
functions.  So isn't E' compositional in Will's sense, and don't we
see by structured induction that E' = E?


> >> (William D Clinger) The essence of your argument is that the form
> >> of a definition cannot be a mathematically relevant concept.
> >
> 
> In particular you need to distinguish between `definition' and
> `specification'.  I can say the following:
> 
>                    /  42,   if  x = 0
>             F(x) = |
>                    \ F(x+1),  otherwise
> 
> This is not a definition of F, but rather a condition that a putative
> F must satisfy.  If I need a definition, I must find a function that
> satisfies the specification.  

Yeah, but in this case, Boolos & Jeffrey would say there's an obvious
preferred solution, and it's your minimal fixed point:

F(0) = 42, F(n) = bottom for n > 0.

The point (I just recently understood from Lauri) is that F is like
the 1-step reduction map R I defined for Advanced Scheme.  We can
(mathematically) "time" how many iterations of F we go through, and if
it's "infinite time" then we can (mathematically) say: bottom!

BTW I think I understood your transfinite induction post: the Peano
axioms can't settle a certain question the Scheme interpreter can
answer.  But the Peano axioms + transfinite induction can settle it?
And that raises an interesting question about whether "real"
mathematical induction is enough to construct a semantic function?

If so, that's an interesting question.  It's definitely not enough to
take the Peano axioms plus "real" mathematical induction.  I know I
need the Subset or Comprehension axiom from ZFC.  From an old post:

      Given a set X with a subset A and a set Y and a function

      f: A ---> Y

      and a partial function R: X ---> X, we define the partial
      functions R^n (the iterative composites of R with itself) by
      induction, and then define a partial function

      F: X ---> Y   given by 

      F(x) = f(R^n(x)), where n is the least nonnegative integer such
			that R^n(x) belongs to the subset A and all
			earlier R^i(x) are defined.

   Note that the construction of F (similar to the `minimizing recursion'
   of Boolos & Jeffrey) doesn't use induction, but instead constructs F
   as a graph (as all sets are), i.e. the following subset of X x Y

   { (x, y) : there exists n >= 0 such that R^n(x) in A, and 

	      for all i >= 0 and i <= n, R^i(x) is defined, and 

	      for all i >= 0 and i < n, R^i(x) not in A, and 
      
              y = f(R^n(x))                                       }

   The construction of this subset 

   Graph(F) subset X x Y 

   only requires the comprehension axiom of ZFC, i.e. we can define
   subsets by formulas which can contain quantifiers (like above).
                                                             ^
If the formulas are too crazy, just read the last line above.|
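
Operationally, the same F falls out of a simple loop (my sketch;
in-A?, R and f are assumed given), and it diverges exactly where
the graph above has no witness n:

    (define (F x)
      (if (in-A? x)      ; the n = 0 case
          (f x)
          (F (R x))))    ; otherwise iterate the 1-step reduction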
0
richter1974 (312)
6/26/2004 2:09:18 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

> I am *so* glad this thread has not been useless.

I think you've earned some sarcasm, Will, especially after such a
thoughtful post as this.  Thanks for spending so much time on it.

> >    Definition: A function E: Expressions ---> M 
> > 
> >    is called compositional iff there exist functions such as
> >    Phi: M x M ---> M   ;; for combinations
> > 
> >    so that E is defined by structural induction by the various Phi
> >    functions.  
> > 
> > And then we have the immediate result: 
> >    E[ (X Y) ] = Phi( E[ X ], E[ Y ] )
> > 
> > I'd call that a meaningful mathematical definition.  Is that
> > acceptable to you?
> 
> Almost.  I have one minor quibble and one major reservation.
> 
> Minor quibble:  Instead of saying "functions such as Phi", you
> should say "a family of functions Phi_P indexed by the set of
> syntactic productions P".

Absolutely.  I didn't know the lingo. So we've concentrated on Phi_P
for the syntactic production P(X, Y) = (X Y).  And I left out the very
important base case of variables: We must be given a function

E_0: Variables ---> M

such that E[x] = E_0(x).  I don't want to call E_0 a Phi_P, because
if P is an n-ary syntactic production, then Phi_P is a function

Phi_P: M^n ---> M 

Now, the structural induction produces a function E satisfying

E[ P(X_1,..,X_n) ] = Phi_P( E[X_1] ,.., E[X_n] )

and E[x] = E_0(x).
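
In Scheme the indexed family is just an association list (a
sketch; syntax trees here are (P X_1 ... X_n) with the production
P a symbol, variables are any other symbols, and e0 and phis are
assumed given):

    (define (make-E e0 phis)        ; phis: alist mapping P to Phi_P
      (define (E e)
        (if (symbol? e)
            (e0 e)                             ; E[x] = E_0(x)
            (apply (cdr (assq (car e) phis))   ; Phi_P
                   (map E (cdr e)))))          ; on E[X_1],..,E[X_n]
      E)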


> > I assert this is equivalent to my definition:
> > 
> >    Definition: A function E: Expressions ---> M 
> > 
> >    is called compositional iff there exist functions such as
> >    Phi: M x M ---> M 
> > 
> >    satisfying equations like
> >    E[ (X Y) ] = Phi( E[X], E[Y] )  for all expressions X & Y.
> > 
> > As I proved in my last post, my definition is equivalent to yours.
> > Given my E & Phi functions, use the Phi functions to define by
> > structural induction a new function E', which is compositional by
> > your definition.  Then by structural induction, we see E' = E.
> 
> Major reservation:  When you prove something about E by structural
> induction, you *must* use a compositional definition of E, not some
> extensionally equivalent definition that isn't compositional.  The
> reason for this is that a structural induction proceeds by cases
> over a well-founded set.  If your alternative definition proceeds
> by a case analysis that isn't well-founded, as in many (though
> possibly not all) of your alternative definitions, then its case
> analysis does not support the principle of structural induction.
> 
> That's not to say you can't prove anything at all using a definition
> that isn't compositional.  It just says you can't use the principle
> of structural induction unless your definition is compositional.

I'm having trouble here, but largely, I think, because I misstated my
claim.  I meant to say E was compositional in my sense.  I don't know
what "extensionally equivalent definition" or "well-founded" means,
and I left out the base case of variables!  Geez. So let me rephrase:


Suppose we have a function E: Expressions ---> M, and a family of
functions Phi_P indexed by the set of syntactic productions P.
Suppose the following equations are satisfied:

E[ P(X_1,..,X_n) ] = Phi_P( E[X_1] ,.., E[X_n] )

for all n-ary syntactic productions P.  

That's my definition of E being compositional.  I'm not assuming
anything about how E is defined (by structural induction e.g.).

Now restrict E to variables by a map E_0, so E_0(x) = E[x].  

Now use E_0 and the family of functions Phi_P to create a
compositional (in your sense) function E' by structural induction.  So

                  E'[x] = E_0(x)   

and E'[ P(X_1,..,X_n) ] = Phi_P( E'[X_1] ,.., E'[X_n] )

for all X_i in Expressions, and x in Variables subset Expressions.

Then I claim that E = E'.  I claim a proof follows from induction over
the tree-structure of the expressions.  (I thought that was called
structural induction, but I'll defer to your terminology).  I'll
sketch the proof, so you can guess if I know what I'm talking about!

For a variable x, E'[x] = E[x] by definition.  

All more complicated expressions are of the form  P(X_1,...,X_n), for
some n-ary syntactic production P and expressions X_i.  Then

E'[ P(X_1,...,X_n) ] = Phi_P( E'[X_1] ,.., E'[X_n] )

By induction on the tree-structure of expressions, we can assume that 

E'[X_i] = E[X_i], for i = 1,...,n.  But then we're done by

E[ P(X_1,..,X_n) ] = Phi_P( E[X_1] ,.., E[X_n] )
\qed

And then since E' = E, and E' is compositional in your sense, my E is
also compositional in your sense.

Does this work, or would you repeat your Major Reservation above?

> It's this kind of [heat comment] that makes me suspect you don't
> understand what I've said above.

Well, I definitely didn't understand what you wrote above.  But now
that I've been more clear, maybe we're in agreement anyway.


> If you extract the definition of primitive recursion from this
> context, you will see that it has everything to do with the form of
> a function definition.

I shouldn't have carped at all.  Sure, you're right: finitely
presented groups and primitive recursive functions are defined in
terms of their definitions.

> Probably?  For two years now, you've been claiming your definitions are
> compositional.  You still haven't posted one that is.

:) But we haven't made it past the definition of compositionality.
And if we take my definition of compositionality:

   given E, we can find the Phi_P giving equations 
   E[ P(X_1,..,X_n) ] = Phi_P( E[X_1] ,.., E[X_n] )

wouldn't you say it's obvious that 

1) we can well-define an E in standard-reduction fashion, by
distilling a 1-step reduction function?

2) E will be compositional in my sense?

It's not obvious to me: I've had lots of trouble over 2 years!  But
you're a lot better at Scheme (even Advanced Scheme) than I am.

Will, I responded to the rest of your comments, but please just give a
pass/fail score below.  You've clearly shown that I'm lost about
computability, and I need to book up, and it can't be worth your time
to respond, and the above compositionality issue is more pressing.


> > We're using serious math here, and we want serious rewards.
> 
> The serious reward (of Scott models) is that we know that certain
> forms of recursive definitions are guaranteed to have solutions, and
> that those solutions are computable.

I haven't read Barendregt yet, but I realized computable = continuous,
and the uniform limit of continuous maps is continuous, and we should
get uniformity with compactness (but non-Hausdorff??), so I believe
the limit of computable maps is computable.

> This guarantee does not apply to recursive definitions that are not
> in the approved forms.  You have persisted in claiming that your
> definitions have an approved form when they do not, you have
> persisted in claiming that your definitions have computable
> solutions when that is not clear to the most expert readers of this
> newsgroup, and you have persisted in denying the computability of
> functions whose computability is immediately obvious by direct
> application of the major theorems in this area.

OK, good points.  One of the things I'm planning to do should show I
have computable solutions: R5RS DS defines a semantic function

curly-E: Expressions ---> (U x K x S -> A)

I'm going to give different definitions of values & therefore K & S
(but not U & A).  So I want to define a function

D: Expressions ---> (U x K' x S' -> A)

I want to show that my D is compositional, at least in my sense.  R5RS
DS uses an initial env, cont & store (r0, k0, s0).  I'll do the
same thing, and call mine (r0', k0', s0').  Then I claim we'll get the
same "program" functions: for all expressions X, 

curly-E[ X ] (r0, k0, s0) = D[ X ] (r0', k0', s0')

If I can make good on my above claim, then I'll have your
computability, whether I know what it means or not.


> > ******** Total-vs-Partial ******** 
> > 
> > I thought that Schmidt requires all DS semantic functions to be
> > total functions, where the target has a bottom.  I thought
> > everyone here insists on that convention as well.
> 
> There are two things to say here.
> 
> One is that a semantic function may be a total function whose range
> is a domain of partial functions.  You have often failed to
> distinguish the totality of semantic function from the non-totality
> of functions in its range.

Yeah. I thought the Schmidt solution was that all the domains (e.g. U,
K, S above) have bottoms, so I've got a total function whose range is
a domain of TOTAL functions.  But let's do it your way:

curly-E: Expressions ---> (U x K x S -> A)

is a total function to the space of partial functions.  That sounds
computable, it sounds like Scheme: 

for EVERY expression X, the interpreter will return an answer in some
env/cont/stores, but not all.

Use the initial (r0, k0, s0), we get a partial "program" function

Prog_E: Expressions --->  A

X |--> curly-E[ X ] (r0, k0, s0) 

But you can think of Prog_E as a total function, if you stick a
bottom into A (or an extra bottom), and I say that this total function

Prog_E: Expressions --->  A_bottom

is not computable, by the Halting problem.  If you say that even this
Prog_E is still computable, then I just don't understand the meaning
of the word computable.  I'll book up.  As before, let me just take
the composite to {0,1}, send A to 1, bottom to 0, and I have

Expressions --->  {0,1}

Surely we agree that this total map is not computable!
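
And if someone insists that this map -- call it terminates? -- is
computable, the usual diagonal sketch refutes them (the decider is
hypothetical, and I'm glossing over how a program quotes itself):

    (define (diagonal)
      (if (terminates? '(diagonal))   ; hypothetical total decider
          (diagonal)                  ; it says "halts", so loop
          'done))                     ; it says "loops", so halt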

> That wasn't a breakthrough, that was tremendous confusion on your
> part.  

Hmm, I'm starting to think you're right...

> Scheme's curly-E is total, by a trivial structural induction.  It is
> also computable (assuming the missing parameters are computable), by
> a trivial examination of the form of the definition of curly-E.
> Furthermore
> 
>     curly-E [[ * ]] r,
>     curly-E [[ * ]] r k,
> and curly-E [[ * ]] r k s
> 
> are computable for any fixed r, k, and s.  The last is not total,
> however.  If it were total, it would solve the halting problem.
> But it is not total.  It is computable, and it is the limit of a
> sequence of computable finite approximations.

But we can replace it by my total function

Prog_E: Expressions --->  A_bottom

> > Furthermore, I'm only interested in total functions myself in trying
> > to explain the semantics of a language.
> 
> One of the most basic facts about recursion theory is that you can't
> restrict yourself to total functions without either (1) restricting
> yourself to some proper subset of the total functions or (2) having
> no effective way to distinguish the total functions you care about
> from the non-total functions you don't care about.

Boolos & Jeffrey are definitely strong on partial functions.

> >  I want a total function 
> > 
> > E: Expressions ---> M 
> > 
> > where M has a bottom element.  I want to describe mathematically the
> > inverse image of bottom.  So I like Schmidt's total convention.
> 
> You can describe that inverse image mathematically, but that inverse
> image is generally not computable.

Yeah, and I wouldn't say I understood a function if I didn't
understand the inverse images.  And it takes strong math axioms to
describe the inverse images, even if the function E is computable.
0
richter1974 (312)
6/26/2004 5:13:45 AM
Will, I found strong evidence for your position in a 1992 book by
Nielson & Nielson, "Semantics with Applications".  Anyone like this
book?  Very early in the book he defines compositionality exactly as
you did, and on p. 85 he says,

"S_{ns} and S_{sos} are *not* denotational definitions because they
 are *not* defined compositionally [i.e. by structural induction]."

I still contend my interp of C-F is correct, that the structural
induction definition is equivalent to equations for the function E to
satisfy, involving extra functions Phi_P, however E is constructed.
But I see I've got real resistance to overcome.  And I need more
precision: my base case E_0 for variables isn't nearly enough.   It's
not going to be true that

E[(set! x Y)] = Phi_1(E[x], E[Y])
E[(lambda x Y)] = Phi_2(E[x], E[Y])

R5RS DS of course does no such thing.

Nielson gives a much simpler example: a language While, which has
semantic functions curly-N for numerals, curly-A for arithmetic
expressions, curly-B for booleans, and then finally

curly-S: Statements -> (State -> State)

is defined in terms of curly-N, curly-A & curly-B,
where State = (Variable -> Integers).  So e.g.

curly-S[x := a] s = s[x -> curly-A[a]s]

which is not in my form.  But Nielson's curly-S is compositional in
his sense, "Semantic clauses for the compositional elements are
defined in terms of the semantic clauses of the basis elements." Or

curly-S[(if b S_1 S_2)] s = curly-S[S_1] s, if curly-B[b]s = true 

			    curly-S[S_2] s, if curly-B[b]s = false

I'm just saying we get an equivalent definition of compositionality by
thinking of these as equations to hold, rather than as structurally
inductive definitions.  I think that's obvious by tree-induction, but 
I see that some precision is needed to state the result.
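
For concreteness, here's my Scheme transcription of the two
clauses above (states are procedures from variables to integers;
A and B, the arithmetic and boolean valuations, are assumed given):

    (define (update s x v)                 ; s[x -> v]
      (lambda (y) (if (eq? y x) v (s y))))

    (define (S stmt)                       ; curly-S, clause by clause
      (case (car stmt)
        ((:=) (lambda (s)
                (update s (cadr stmt) ((A (caddr stmt)) s))))
        ((if) (lambda (s)
                (if ((B (cadr stmt)) s)
                    ((S (caddr stmt)) s)
                    ((S (cadddr stmt)) s))))
        (else (error "unknown statement" stmt))))

Note that S recurs only on sub-statements; that's the
compositionality.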
0
richter1974 (312)
6/26/2004 8:21:39 PM
richter@math.northwestern.edu (Bill Richter) writes:

>> > You didn't respond to my proof, so I'll do it again, and first I'll
>> > actually try to mathematize your definition:
>> >
>> >    Definition: A function 
>> >
>> >    E: Expressions ---> M 
>> >
>> >    is called compositional iff there exist functions such as
>> >
>> >    Phi: M x M ---> M   ;; for combinations
>> >
>> >    so that E is defined by structural induction by the various Phi
>> >    functions.
>> 
>> This isn't acceptable to me unless Phi satisfies the requirement that
>> it is *independent* of E.  
>
> Yeah, that's what I meant, Joe.  That's structural induction, right?

There are other constraints, too.  The reason that Will brought up the
subject of groups with finite presentation and primitive recursive
functions is because that spells out in excruciating detail what sort
of additional constraints are necessary.

> Great!  Then I (or maybe you) succeeded in mathematically formulating
> Will's definition of compositionality.  

I think Will formulated it.  The Phi functions must be primitive
recursive and must form a group with a finite presentation on M.

> Suppose I can define a function E without structured induction, 

Since we are talking about computer languages, such a function E over
expressions would only be able to operate on a subset of the language.

> and then later construct the Phi functions giving E/Phi/Psi equations
>
> E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M,   for all e, f in Expr
>
> Now forget E altogether, and let's use my Phi functions to build a new
> function E' by structured induction using my (already constructed) Phi
> functions.  So isn't E' compositional in Will's sense, and don't we
> see by structured induction that E' = E?

I don't know what you mean by build a new function E' by structured
induction using Phi functions.

>> >> (William D Clinger) The essence of your argument is that the form
>> >> of a definition cannot be a mathematically relevant concept.
>> 
>> In particular you need to distinguish between `definition' and
>> `specification'.  I can say the following:
>> 
>>                    /  42,   if  x = 0
>>             F(x) = |
>>                    \ F(x+1),  otherwise
>> 
>> This is not a definition of F, but rather a condition that a putative
>> F must satisfy.  If I need a definition, I must find a function that
>> satisfies the specification.  
>
> Yeah, but in this case, Boolos & Jeffrey would say there's an obvious
> preferred solution, and it's your minimal fixed point:
>
> F(0) = 42, F(n) = bottom for n > 0.

So the semantics of the program

   (define f (lambda (x) (if (zero? x) 42 (f (+ x 1)))))

ought to involve taking a minimal fixed point.  Thus it would be nice
if we could prove that our domains contain the ones of interest.

> BTW I think I understood your transfinite induction post:  the Peano
> axioms can't settle a certain question the Scheme interpreter can
> answer.  But the Peano axioms + transfinite induction can settle it?

Yes.

> And that raises an interesting question about whether "real"
> mathematical induction is enough to construct a semantic function?

Yes.  In a prior discussion you attempted to prove termination by
strong induction over the integers.  The program I posted demonstrates
that this isn't sufficient.  If you wish to stitch together a
denotational semantics from the judgements of operational semantics,
you'll need to show that induction over the operational semantics is
valid.


-- 
~jrm
0
6/26/2004 9:58:45 PM
Joe Marshall <prunesquallor@comcast.net> responds to me:
 
> > BTW I think I understood your transfinite induction post:  the Peano
> > axioms can't settle a certain question the Scheme interpreter can
> > answer.  But the Peano axioms + transfinite induction can settle it?
> 
> Yes.
> 
> > And that raises an interesting question about whether "real"
> > mathematical induction is enough to construct a semantic function?
> 
> Yes.  In a prior discussion you attempted to prove termination by
> strong induction over the integers.  The program I posted
> demonstrates that this isn't sufficient.  

Good try, Joe, but you didn't actually do so.  I didn't use
transfinite induction, but I made heavy use of the ZFC subset axiom.
Strong induction over the integers and transfinite induction aren't
separate ZFC axioms that are independent from the others.  Both
induction results are consequences of the basic ZFC axioms.

> > Suppose I can define a function E without structured induction,
> > and then later construct the Phi functions giving E/Phi/Psi
> > equations
> >
> > E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M,   for all e, f in Expr
> >
> > Now forget E altogether, and let's use my Phi functions to build a
> > new function E' by structured induction using my (already
> > constructed) Phi functions.  So isn't E' compositional in Will's
> > sense, and don't we see by structured induction that E' = E?
> 
> I don't know what you mean by build a new function E' by structured
> induction using Phi functions.

I'll try again, because there are things I've missed so far.  So using
terminology from Nielson's "Semantics with Applications", first
suppose you have a total function
 
E: Expressions ---> M 

that has what Nielson calls a denotational definition.  So we're given

1) functions E_b1, E_b2... on the basic syntactic elements, and

2) functions like Phi above for the compositional syntactic elements

and we define E by structured induction, so that e.g.

E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M

So as Nielson says, "Semantic clauses for the compositional elements
are defined in terms of the semantic clauses of the basis elements."

Now forget structured induction, & suppose we have a total function
 
E: Expressions ---> M 

whose restriction to the basic syntactic elements is given by some
collection of functions E_b1, E_b2..., and suppose we have functions
like Phi which satisfy exactly those equations:

E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M

etc.  Whatever equations we would have written down for a denotational
definition of E, write down the same equations.  But now we're not
defining E by structured induction.  We're just supposing these
equations are true.  However the blazes we produced E or the Phi's.
That's my definition (I think C-F's) of compositionality.

But now build a new function E' by structured induction using these
equations.  So the base case of the induction is that E' restricted to
the basic syntactic elements is given by this collection of functions
E_b1, E_b2...  And we perform the structured induction so that e.g.

E'[ Psi(e, f) ] = Phi( E'[e], E'[f] ) in M

So E' has a denotational definition, even though E did not.  Now I
assert that E' = E, by a tree-induction over the syntax.  
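
Spelled out, the inductive step of that tree-induction is just a chain
of the assumed equations:

E'[ Psi(e, f) ] = Phi( E'[e], E'[f] )    by the definition of E'
                = Phi( E[e], E[f] )      by the inductive hypothesis
                = E[ Psi(e, f) ]         by the assumed equation for E

and the base case holds because E' and E restrict to the same
functions E_b1, E_b2... on the basic syntactic elements.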

> So the semantics of the program
> 
>    (define f (lambda (x) (if (zero? x) 42 (f (+ x 1)))))
> 
> ought to involve taking a minimal fixed point.  Thus it would be
> nice if we could prove that our domains contain the ones of
> interest.

Well, I certainly don't want to proceed this way.  Now let me get back
to something of yours I snipped above:

> > Suppose I can define a function E without structured induction, 
> 
> Since we are talking about computer languages, such a function E
> over expressions would only be able to operate on a subset of the
> language.

I disagree, but you're correct in a sense.  I tried to prove above
that my E which is compositional in my sense actually has a
denotational definition.  And you might say, "then you're just doing
what everyone else is doing.  E can't be a total function unless it
has a denotational definition."  OK, in a sense you're right...

But there are different ways of defining the same set (or function).
Suppose I say I'm going to construct a square matrix.  We can e.g.
construct the column vectors first, or we can construct the row
vectors first.  If we construct columns first, we get the rows later,
but that doesn't mean we had to construct the rows first.
0
richter1974 (312)
6/27/2004 2:56:37 AM
richter@math.northwestern.edu (Bill Richter) writes:

>> > Suppose I can define a function E without structured induction, 
>> 
>> Since we are talking about computer languages, such a function E
>> over expressions would only be able to operate on a subset of the
>> language.
>
> I disagree, but you're correct in a sense.  

The function E operates on expressions.  *Most* computer languages
have an infinite set of expressions that are generated by a set of
induction rules applied to some primitive elements.  If E is defined
without induction, then every expression in the language will need its
own term in the definition of E.

So you are doing one of two things:

    1.  Defining E only on a subset of the language, and you can do
        this in two ways:

        a.  Define E only on the primitive elements.

        b.  Define E on a huge, but finite subset of programs.

    2.  Restricting your language to a finite set of programs and
        defining an E for each one.

>> > Suppose I can define a function E without structured induction,
>> > and then later construct the Phi functions giving E/Phi/Psi
>> > equations
>> >
>> > E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M,   for all e, f in Expr

Based on the above, you are doing one or both of these:

   1.  If E is defined only on the primitive elements of the language,
       you are using structural induction to extend the definition of
       E to non-primitive expressions.

or

   2.  If E is defined only on a huge, but finite subset of programs,
       you are using structural induction to *remove* terms from E (by
       showing that some terms in E are derivable by induction from
       others.)

>> > Now forget E altogether, and let's use my Phi functions to build a
>> > new function E' by structured induction using my (already
>> > constructed) Phi functions.  So isn't E' compositional in Will's
>> > sense, and don't we see by structured induction that E' = E?
>> 
>> I don't know what you mean by build a new function E' by structured
>> induction using Phi functions.
>
> I'll try again, because there are things I've missed so far.  So using
> terminology from Nielson's "Semantics with Applications", first
> suppose you have a total function
>  
> E: Expressions ---> M 
>
> that has what Nielson calls a denotational definition.  So we're given
>
> 1) functions E_b1, E_b2... on the basic syntactic elements, and
>
> 2) functions like Phi above for the compositional syntactic elements
>
> and we define E by structured induction, so that e.g.
>
> E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M
>
> So as Nielson says, "Semantic clauses for the compositional elements
> are defined in terms of the semantic clauses of the basis elements."

Ok, this corresponds to 

   1.  Defining E only on a subset of the language, and you can do
       this in two ways:

       a.  Define E only on the primitive elements.
and 

   1.  If E is defined only on the primitive elements of the language,
       you are using structural induction to extend the definition of
       E to non-primitive expressions.

> Now forget structured induction, & suppose we have a total function
>  
> E: Expressions ---> M 
>
> whose restriction to the basic syntactic elements is given by some
> collection of functions E_b1, E_b2..., and suppose we have functions
> like Phi which satisfy exactly those equations:
>
> E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M
>
> etc.  Whatever equations we would have written down for a denotational
> definition of E, write down the same equations.  But now we're not
> defining E by structured induction.  We're just supposing these
> equations are true.  However the blazes we produced E or the Phi's.
> That's my definition (I think C-F's) of compositionality.

Ok, but I have a problem here.  Above, we created the function Psi so
that we could generate syntactic constructs for our language from
primitive elements.  We then constructed Phi so that we could generate
an appropriate meaning for each new construct.  Structural induction
was the guiding principle:  new language constructs were *designed* to
be exactly (and only) those constructs that could be created by
structural induction.

If we pretend that these equations are simply axioms with no
particular connection to each other, then we need to consider the
following: 

  - Are the semantics complete?  Can every valid syntactic language
    construct be generated from a finite application of Psi to base
    elements?  Does every valid syntactic language construct have a
    corresponding semantic value generated from a finite application
    of Phi to the base semantics?  What are the semantic constructs
    that can be generated?

  - Are the semantics consistent?  If there are two ways to generate a
    syntactic construct, are they equivalent?  Is there more than one
    meaning that can be attributed to a compound expression by choice
    of Phi? 

> But now build a new function E' by structured induction using these
> equations.  So the base case of the induction is that E' restricted to
> the basic syntactic elements is given by this collection of functions
> E_b1, E_b2...  And we perform the structured induction so that e.g.
>
> E'[ Psi(e, f) ] = Phi( E'[e], E'[f] ) in M
>
> So E' has a denotational definition, even though E did not.  Now I
> assert that E' = E, by a tree-induction over the syntax.  

I certainly agree that E' = E if every component of E' = its
corresponding component in E, but how can E' have a denotational
definition yet E not have one?  I don't understand the point of this. 

>> So the semantics of the program
>> 
>>    (define f (lambda (x) (if (zero? x) 42 (f (+ x 1)))))
>> 
>> ought to involve taking a minimal fixed point.  Thus it would be
>> nice if we could prove that our domains contain the ones of
>> interest.
>
> Well, I certainly don't want to proceed this way.  

I can't figure out *how* you want to proceed (or *where*).  I was
under the impression that you want to develop a set-theoretic
denotational semantics that does not involve Scott's domain theory.
(And I was under the impression that the reason for this is because
Scott's domain theory is complicated.)

Now you *can* develop a non-standard semantics, but if you think Scott
domains are complicated, you haven't seen *anything* yet.  The first two
problems you need to solve are these:

  - Model recursion -- This is usually done by ensuring that all
    functions on domains have fixed points.

  - Model untyped lambda calculus -- This is usually done by
    developing an approximate one-to-one mapping between a domain and
    its power set.
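
A reminder of why self-application is the crux here: in the untyped
world, recursion needs no recursive definition at all; a fixed-point
combinator conjures it out of self-application.  A standard sketch in
Scheme (nothing specific to this thread), using the applicative-order
Y combinator:

    ;; Applicative-order Y: recursion via self-application, with
    ;; no recursive define in sight.
    (define (Y f)
      ((lambda (x) (f (lambda (v) ((x x) v))))
       (lambda (x) (f (lambda (v) ((x x) v))))))

    (define fact
      (Y (lambda (self)
           (lambda (n)
             (if (zero? n) 1 (* n (self (- n 1))))))))

    ;; (fact 5) => 120

Any model of the untyped LC has to give terms like (x x) a meaning,
which is what forces the domain to swallow its own function space.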





-- 
~jrm
0
6/27/2004 4:56:10 PM
Joe Marshall <prunesquallor@comcast.net> responded to me:

Joe, I googled Goodstein sequences, and it looks like real fun.  I
didn't try to understand the transfinite induction proof.

But this has nothing to do with semantic functions.  The code you
wrote must only check one Goodstein sequence at a time.  I'm surprised
that any single Goodstein sequence converges to 0 (not infinity), and
your Scheme code will allow us to check these (individual) surprising
facts.  Certainly each individual fact can be proved in Peano
arithmetic (PA).  It's just addition, subtraction & multiplication you
could perform on N pages (N may be really large!).  So Goodstein's
theorem is that *all* Goodstein sequences converge to 0.  That's what
they say needs transfinite induction and is independent from PA.  The
existence of the semantic function doesn't prove Goodstein's theorem.
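
For concreteness, one Goodstein step is just: write n in hereditary
base b, replace every b by b+1, then subtract 1.  Here's a sketch in
Scheme (names of my own choosing) for checking those individual
surprising facts:

    ;; Hereditary base change: rewrite n in base b, replacing every
    ;; occurrence of b (including inside the exponents) by b+1.
    (define (bump n b)
      (if (zero? n)
          0
          (let loop ((n n) (pow 0) (acc 0))
            (if (zero? n)
                acc
                (loop (quotient n b)
                      (+ pow 1)
                      (+ acc (* (remainder n b)
                                (expt (+ b 1) (bump pow b)))))))))

    ;; The Goodstein sequence of n starting at base b, cut off after
    ;; k steps (it stops early if it reaches 0).
    (define (goodstein n b k)
      (if (or (zero? n) (zero? k))
          (list n)
          (cons n (goodstein (- (bump n b) 1) (+ b 1) (- k 1)))))

    ;; (goodstein 3 2 10) => (3 3 3 2 1 0)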

In a sense, you're saying something interesting.  The semantic
function "knows" that all Goodstein sequences converge to 0.  But
that's only because the domain Z of integers is built into the
semantic function.  It's really Z that "knows" Goodstein's theorem.

I have Schmidt's & Stoy's books, so I can try to find what they say
about compositionality.  Nothing in the indexes.  Stoy says on p 13:

"These valuation functions are usually recursively defined: the value
 denoted by a construct is specified in terms of values denoted by its
 syntactic components."

I say that's ambiguous, re our argument about compositionality, and he
doesn't define structural induction until p 159.  I'll keep reading.

> I can't figure out *how* you want to proceed (or *where*).  

I want a compositional DS semantic function for Scheme that's defined
similarly to the standard reduction function, and much along the lines
of HtDP's sec 38 Semantics for Advanced Scheme.  

> I was under the impression that you want to develop a set-theoretic
> denotational semantics that does not involve Scott's domain theory.
> (And I was under the impression that the reason for this is because
> Scott's domain theory is complicated.)

Yeah, I just want simplicity.  I like Scott models, and I'm happy with
folks using them to prove cool theorems (such as computability, which
Will pointed out I have trouble with), but Scheme is a language we're
trying to program in!  A really simple semantics might help.  But
mathematicians insist on a total function Expr ---> MeaningSet.

> Now you *can* develop a non-standard semantics, 

I don't think I've said anything nonstandard.

> but if you think Scott domains are complicated, you haven't seen
> *anything* yet.  The first two problems you need to solve are these:
> 
>   - Model recursion -- This is usually done by ensuring that all
>   functions on domains have fixed points.

Wouldn't that have to come up with standard reduction?

>   - Model untyped lambda calculus -- This is usually done by
>     developing an approximate one-to-one mapping between a domain and
>     its power set.

Barendregt gives (at least) 2 models of LC: the term model, and that's
basically the standard reduction function, and the Scott models.  I'm
happy to stick with the term model for now.


> >> > Suppose I can define a function E without structured induction,
> >> 
> >> Since we are talking about computer languages, such a function E
> >> over expressions would only be able to operate on a subset of the
> >> language.
> >
> > I disagree, but you're correct in a sense.  
> 
> The function E operates on expressions.  *Most* computer languages
> have an infinite set of expressions that are generated by a set of
> induction rules applied to some primitive elements.  If E is defined
> without induction, then every expression in the language will need
> its own term in the definition of E.

I'm using induction!  Just like Felleisen-Flatt do in constructing
their standard reduction function eval_s, about which BTW I'm still
waiting to hear from you.  If you understand transfinite induction,
Scott models & CPOs, you shouldn't have any trouble deciding if F-F
rigorously defined their function.  You don't need to ask MFe.

> So you are doing one of two things:
> 
>     1.  Defining E only on a subset of the language, and you can do
>         this in two ways:
> 
>         a.  Define E only on the primitive elements.
> 
>         b.  Define E on a huge, but finite subset of programs.
> 
>     2.  Restricting your language to a finite set of programs and
>         defining an E for each one.

None of the above, and I think I have to skip a lot now:

> > Now forget structured induction, & suppose we have a total function
> >  
> > E: Expressions ---> M 
> >
> > whose restriction to the basic syntactic elements is given by some
> > collection of functions E_b1, E_b2..., and suppose we have
> > functions like Phi which satisfy exactly those equations:
> >
> > E[ Psi(e, f) ] = Phi( E[e], E[f] ) in M
> >
> > etc.  Whatever equations we would have written down for a
> > denotational definition of E, write down the same equations.  But
> > now we're not defining E by structured induction.  We're just
> > supposing these equations are true.  However the blazes we
> > produced E or the Phi's.  That's my definition (I think C-F's) of
> > compositionality.
> 
> OK, but I have a problem here.  Above, we created the function Psi
> so that we could generate syntactic constructs for our language from
> primitive elements.  We then constructed Phi so that we could
> generate an appropriate meaning for each new construct.  Structural
> induction was the guiding principle: new language constructs were
> *designed* to be exactly (and only) those constructs that could be
> created by structural induction.

Right, that's one of Nielson's denotational definitions, what you guys
want to call compositionality.

> If we pretend that these equations are simply axioms with no
> particular connection to each other, then we need to consider the
> following:

Dunno about axioms.  Just take a whole collection of functions and
say we're presented with them, without being told that some of the
functions are defined by structural induction using the others.

>   - Are the semantics complete?  Can every valid syntactic language
>   construct be generated from a finite application of Psi to base
>   elements?  Does every valid syntactic language construct have a
>   corresponding semantic value generated from a finite application
>   of Phi to the base semantics?  What are the semantic constructs
>   that can be generated?
> 
>   - Are the semantics consistent?  If there are two ways to generate
>   a syntactic construct, are they equivalent?  Is there more than one
>   meaning that can be attributed to a compound expression by choice
>   of Phi?

There's nothing special about these questions doing things my way,
since I'm showing that my way coincides with your way.

> > But now build a new function E' by structured induction using these
> > equations.  So the base case of the induction is that E' restricted to
> > the basic syntactic elements is given by this collection of functions
> > E_b1, E_b2...  And we perform the structured induction so that e.g.
> >
> > E'[ Psi(e, f) ] = Phi( E'[e], E'[f] ) in M
> >
> > So E' has a denotational definition, even though E did not.  Now I
> > assert that E' = E, by a tree-induction over the syntax.  
> 
> I certainly agree that E' = E if every component of E' = its
> corresponding component in E, 

Doesn't sound like you're agreeing with me that E = E'.

> but how can E' have a denotational definition yet E not have one?  I
> don't understand the point of this.

The point is the 2 definitions of compositionality coincide! We don't
have to insist that E be defined by structural induction etc etc.  We
can just say that E & the Phi's satisfy equations.

Joe, I haven't done a good job explaining this.  I can hardly claim to
have constructed a compositional (with my meaning) semantic function
if I don't even produce names of all the various Phi functions, and
say what equations they are supposed to satisfy!  I'm giving a
meta-proof, that anyone who knows a particular language and a
compositional (with your meaning) semantic function can turn into a
proof.  Maybe that's not satisfactory.  Maybe I need to state an
actual result, maybe in just one specific case.
0
richter1974 (312)
6/28/2004 1:51:23 AM
Joe Marshall <prunesquallor@comcast.net> writes:

> The reason that Will brought up the subject of groups with finite
> presentation and primitive recursive functions is because that
> spells out in excruciating detail what sort of additional
> constraints are necessary.

I shouldn't put words in Will's mouth, so pretend the word `Perhaps'
is in front of this paragraph. 
0
jrm (1310)
6/28/2004 1:42:59 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responded to me:

>> I can't figure out *how* you want to proceed (or *where*).  
>
> I want a compositional DS semantic function for Scheme that's defined
> similarly to the standard reduction function, and much along the lines
> of HtDP's sec 38 Semantics for Advanced Scheme.  
>
>> I was under the impression that you want to develop a set-theoretic
>> denotational semantics that does not involve Scott's domain theory.
>> (And I was under the impression that the reason for this is because
>> Scott's domain theory is complicated.)
>
> Yeah, I just want simplicity.  I like Scott models, and I'm happy with
> folks using them to prove cool theorems (such as computability, which
> Will pointed out I have trouble with), but Scheme is a language we're
> trying to program in!  A really simple semantics might help.  But
> mathematicians insist on a total function Expr ---> MeaningSet.

There's nothing wrong with partial semantics if you are able to
specify the limits of your model.

You don't need a `total function', but certainly you want your
semantics to cover a rich set of legal expressions.  There are some
expressions in Scheme that are ambiguous and certainly many that are
not defined, so those don't need to be covered by the semantics.

>> Now you *can* develop a non-standard semantics, 
>
> I don't think I've said anything nonstandard.

Yes, but you've been outvoted.

>> >> > Suppose I can define a function E without structured induction,
>> >> 
>> >> Since we are talking about computer languages, such a function E
>> >> over expressions would only be able to operate on a subset of the
>> >> language.
>> >
>> > I disagree, but you're correct in a sense.  
>> 
>> The function E operates on expressions.  *Most* computer languages
>> have an infinite set of expressions that are generated by a set of
>> induction rules applied to some primitive elements.  If E is defined
>> without induction, then every expression in the language will need
>> its own term in the definition of E.
>
> I'm using induction!  

But you just said `Suppose I can define a function E *without*
structured induction'  Are you using it or not?

>> So you are doing one of two things:
>> 
>>     1.  Defining E only on a subset of the language, and you can do
>>         this in two ways:
>> 
>>         a.  Define E only on the primitive elements.
>> 
>>         b.  Define E on a huge, but finite subset of programs.
>> 
>>     2.  Restricting your language to a finite set of programs and
>>         defining an E for each one.
>
> None of the above, 

E is either defined on a finite or infinite set of expressions.  Your
language has either a finite or infinite set of expressions.  If your
language is finite, then E can be finite and attribute a meaning to
each expression (case 2).  If your language has an infinite number of
expressions, however, E cannot be finite and still attribute a meaning
to each expression (case 1).

[snip]
> The point is the 2 definitions of compositionality coincide! We don't
> have to insist that E be defined by structural induction etc etc.  We
> can just say that E & the Phi's satisfy equations.
>
> Joe, I haven't done a good job explaining this.  I can hardly claim to
> have constructed a compositional (with my meaning) semantic function
> if I don't even produce names of all the various Phi functions, and
> say what equations they are supposed to satisfy!  I'm giving a
> meta-proof, that anyone who knows a particular language and a
> compositional (with your meaning) semantic function can turn into a
> proof.  Maybe that's not satisfactory.  Maybe I need to state an
> actual result, maybe in just one specific case.

The reason I'm being nit-picky about this is because I suspect that
you want to `hide' the recursion inside the Phi functions.  It is easy
to define a compositional semantics for Scheme in this manner:

E [[ (a b) ]] = Phi (E [[ a ]], E [[ b ]])

where Phi(x, y) = R5RS-curly-E [[ x ]] (R5RS-curly-E [[ y ]])

No domains in sight and neither E nor Phi is recursive.  
Problem solved. 

But the problem isn't really solved because we've done nothing more
than rename everything.



0
jrm (1310)
6/28/2004 2:16:20 PM
richter@math.northwestern.edu (Bill Richter) concluded:
> Maybe I need to state an
> actual result, maybe in just one specific case.

That would be different.

Will
0
cesuraSPAM (401)
6/28/2004 5:46:28 PM
richter@math.northwestern.edu (Bill Richter) wrote:
> Will, I found strong evidence for your position in a 1992 book by
> Nielson & Nielson, "Semantics with Applications".  Anyone like this
> book?  Very early in the book he defines compositionality exactly as
> you did, and on p 85 he says, 
> 
> "S_{ns} and S_{sos} are *not* denotational definitions because they
>  are *not* defined compositionally [i.e. by structural induction]."

See also:

Milne and Strachey, "A Theory of Programming Language Semantics",
    1976, section 3.1.3, pages 376-377.
Stoy, "Denotational Semantics", 1977, section on computability,
    pages 130-131.
Stoy, "Denotational Semantics", 1977, end of section on operational
    semantics, the paragraph beginning with "At first sight",
    pages 338-339.
Scott, "Lectures on a Mathematical Theory of Computation", Oxford
    Technical Monograph PRG-19, May 1981, Lecture 7: Computability
    in effectively given domains, pages 113-131.
Schmidt, "Denotational Semantics", 1986, section 3.3, especially
    Theorem 3.13, its proof, and the discussion leading up to it,
    pages 50-51.
Nielson and Nielson, "Semantics with Applications", 1992, the boxes
    that summarize "Compositional Definitions" and "Structural
    Induction", page 11.

Just in case someone is still reading this thread and doesn't have
their own copy of Nielson and Nielson, I'll quote the definition
of "compositional definitions" from that last reference:

Compositional Definitions
-------------------------

1.  The syntactic category is specified by an abstract syntax
    giving the _basis elements_ and the _composite elements_.
    The composite elements have a unique decomposition into
    their immediate constituents.

2.  The semantics is defined by _compositional_ definitions of
    a function: There is a _semantic clause_ for each of the
    basis elements of the syntactic category and one for each
    of the methods for constructing composite elements.  The
    clauses for composite elements are defined in terms of the
    semantics of the immediate constituents of the elements.

Will
0
cesuraSPAM (401)
6/28/2004 6:50:19 PM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

> >> Now you *can* develop a non-standard semantics, 
> >
> > I don't think I've said anything nonstandard.
> 
> Yes, but you've been outvoted.

Joe, I'm not using the "Scott-Strachey method", as Stoy calls it, of
Scott models & CPOs.  I'm happy with the Strachey part of functions 

Expr ---> (U x S x K -> A)

But I claim it's compositional DS by C-F's definition.  But that's a
matter of proof, which we may get to.  I'm extremely happy to realize
that this is an honest mathematical dispute.  This isn't culture
masquerading as math.  It's no rejection of math rigor to fail to
understand a theorem.  Especially if the theorem hasn't been stated
properly or proved... and may even be false :)

> > I'm using induction!  
> 
> But you just said `Suppose I can define a function E *without*
> structured induction'  Are you using it or not?

I am *not* using structured induction.  I'm using ordinary induction,
as in standard reduction, which you didn't respond to.

> The reason I'm being nit-picky about this is because I suspect that
> you want to `hide' the recursion inside the Phi functions.  It is
> easy to define a compositional semantics for Scheme in this manner:
> 
> E [[ (a b) ]] = Phi (E [[ a ]], E [[ b ]])
> 
> where Phi(x, y) = R5RS-curly-E [[ x ]] (R5RS-curly-E [[ y ]])
> 
> No domains in sight and neither E nor Phi is recursive.  
> Problem solved. 

OK, fair charge, because Lauri caught me on something similar.  But
I'm being rigorous now, via standard reduction. 

> But the problem isn't really solved because we've done nothing more
> than rename everything.

No, if we really define E, then this definition of Phi is fine.  Or
maybe it isn't; that definition doesn't look right to me.  And of
course the reason to believe in compositionality is that it sounds
like Scheme: to evaluate a combination, first evaluate the arguments,
etc.
0
richter1974 (312)
6/29/2004 12:16:56 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

> Schmidt, "Denotational Semantics", 1986, section 3.3, especially
>     Theorem 3.13, its proof, and the discussion leading up to it,
>     pages 50-51.

Thanks for the references, Will.  The existence part of Theorem 3.13
yields the compositional semantic function via structural induction.
But the uniqueness part of Theorem 3.13 gives a precise statement &
proof of my claim: our 2 definitions of compositionality coincide!

That is, 3.13 builds, by structural induction, functions

BB_i: B_i ---> D_i

But 3.13 also proves the BB_i are uniquely determined by a family of
compositionality equations (listed below).  So we're given from the
start a set of functions f_ij defined on the D's.  

So if I start with a set of functions EE_i, then build some f_ij
functions, and show the compositionality equations are satisfied, then
EE_i = BB_i, for the official structural induction solution, by
uniqueness of 3.13.  That means that my EE_i are also built by
structural induction, after the fact.

Now I'll discuss Theorem 3.13, and later answer a point you raised
earlier about `extensionally equivalent definitions'. 


**** discussion of 3.13 + compositionality equations ****

Notes: 3.13 is pretty terse, and to my mind there's a typo and some
important points left unstated, so I'll go through it.  We're given
syntactic domains B_i and semantic domains D_i, for i = 1,...,n.
We're trying to construct (and study the uniqueness of) functions
BB_i: B_i ---> D_i.
Let's use these "function names" BB_i even though the functions
haven't been constructed yet.

I understood from p 12 and def 1.2 that the Strachey version of
Abstract set-theoretic syntax is

d_i in B_i

d_i ::= Option_i1 | ... | Option_im

So d_i is a *nonterminal* symbol.  Each Option_ij is a list of
symbols, either terminal or nonterminal.  Any terminal symbols stands
for an actual element of the set B_i.  It's a pardonable typo to have
the same m for all i=1,...,n.  I object however to his typo of
telescoping my Strachey-set-BNF as

B_i ::= Option_i1 | ... | Option_im

I'm new enough at this that I want to keep the nonterminal symbol d_i
separate from the set B_i that it belongs to.  So continuing my way,
let S_ij1,..., S_ijk be the nonterminal symbols used in Option_ij.
Again, it's a pardonable typo to have the same k for all 
i=1,...,n, j =1,...,m.  
Then for l = 1,...,k, we have 
S_ijl = d_p, 
for some p, and using this p, define
B_ijl = B_p,
D_ijl = D_p,
BB_ijl = BB_p.

We're given functions (like our Phi_P) between products of the
semantic domains

f_ij: D_ij1 x ... x D_ijk ---> D_i

An important point not mentioned is that if Option_ij is a list of
terminal symbols, then this product is just a one-point set, so f_ij
becomes an element of D_i.  That's how the base case is built into
the structural induction!  I was very confused until I realized this.

Now we can state the compositionality equations:

BB_i( Option_ij ) = f_ij ( BB_ij1(S_ij1),..., BB_ijk(S_ijk) )

To stress the above point, if Option_ij contains no nonterminal
symbols, then the equation is 

BB_i( Option_ij ) = f_ij in D_i

There's still some trouble parsing the equations, so let's just give
one example of this, from p 10, of arithmetic expressions.  Here's the
syntactic and the semantic domains plus the valuation functions:

e in Expressions  -EE--> N

o in Operators -OO--> (N x N -> N)

n in Numerals -NN--> N

e ::= n | e o e | ( e ) 

o ::= + | - | * | /

n ::= 0 | 1 | ... | 99 | 100 | ...

I'll give the standard f_ij functions:

f_11: N ---> N, the identity function,

f_12: N x (N x N -> N) x N ---> N, f_12(x, alpha, y) = alpha(x, y)

f_13: N ---> N, again the identity 

f_21, f_22, f_23, f_24 are the operators +, -, *, / in  (N x N -> N)

f_3i = i in N

And our compositionality equations are 

EE(n) = n
 
EE(e1 o e2) = OO(o)( EE(e1), EE(e2) )

EE( (e) ) = EE(e)

OO(+) = +, OO(-) = -, OO(*) = *, OO(/) = /

NN(n) = n

Well, to my mind, the 2nd & 3rd equations are not exactly of the form
listed, but there's some obvious translation that takes place.
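
As a sanity check, that whole example runs in Scheme.  The concrete
representation below (a compound expression as a three-element list)
is my own choice, not Schmidt's:

    ;; Compositional valuation for  e ::= n | e o e
    ;; A numeral is a Scheme number; a compound is a list (e1 o e2).
    (define (OO o)                ; OO: Operators ---> (N x N -> N)
      (case o ((+) +) ((-) -) ((*) *) ((/) /)))

    (define (EE e)                ; EE: Expressions ---> N
      (if (number? e)
          e                       ; EE(n) = n
          (let ((e1 (car e)) (o (cadr e)) (e2 (caddr e)))
            ((OO o) (EE e1) (EE e2)))))  ; EE(e1 o e2) = OO(o)(EE e1, EE e2)

    ;; (EE '(1 + (2 * 3))) => 7

The parenthesized production ( e ) disappears in this representation,
since Scheme lists are already trees.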

Now let's suppose I produce functions 
EE_i: B_i ---> D_i 
and then find  functions f_ij s.t.

EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )

These are my compositionality equations.  And that's what I think
compositionality should mean, based on what Cartwright and Felleisen
wrote in their "Extensible Denotational" paper:

    [The map from syntactic domains to semantic domains] satisfies the
    law of compositionality: the interpretation of a phrase is a
    function of the interpretation of the sub-phrases.

That is, the valuation functions EE_i satisfy the law of
compositionality if there exist functions f_ij satisfying the above
compositionality equations.  Then the uniqueness of Theorem 3.13 says
that my EE_i are equal to the official BB_i constructed by structural
induction.  So my definition of compositionality coincides with yours!


**** Back to your earlier response to my proof ****

> Major reservation:  When you prove something about E by structural
> induction, you *must* use a compositional definition of E, not some
> extensionally equivalent definition that isn't compositional.  

Ah, Schmidt on p 45 explains "the principle of extensionality: for 
f, g: A ---> B, if for all a in A, f(a) = g(a), then f = g."

I think you're close to agreeing I can do what I claimed: given an E
which is compositional in my sense, I can produce an E' which is
compositional in your (i.e. Nielson's) sense, and E' = E.  I want to
then claim that E is compositional in your sense.  But you respond, I
think: E is only extensionally equivalent to E'.  That's fine, because
all I want is E' = E.  And then you're quite correct:

> The reason for this is that a structural induction proceeds by cases
> over a well-founded set.  If your alternative definition proceeds by
> a case analysis that isn't well-founded, as in many (though possibly
> not all) of your alternative definitions, then its case analysis
> does not support the principle of structural induction.

Yes, but I don't plan to use structural induction to prove results
about E!  Other than of course that E' = E.  If it turns out that I
need to, then my approach looks like a bust.  If I need a lot of
structural induction, why don't I just use the Scott-Strachey method?

> That's not to say you can't prove anything at all using a definition
> that isn't compositional.  It just says you can't use the principle
> of structural induction unless your definition is compositional.

I agree there are definite limitations.  With structural induction, E
is defined recursively via the complexity of the syntax.  With my
standard reduction approach, E is defined recursively via how many
steps it takes to reduce the expression to a value.  Simplicity of
syntax and number of reduction steps are very different, and I can't
conclude that a bound on one gives a bound on the other.  Just take
Lauri's example of omega, a syntactically simple infinite loop!
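
Schematically, that recursion-on-steps has the following shape; the
toy value? and step below are my own stand-ins for F-F's machinery,
not code from their book:

    ;; Toy instance: expressions are numbers or (+ e1 e2); a number
    ;; is a value, and step performs one leftmost reduction.
    (define (value? e) (number? e))
    (define (step e)               ; one R-step
      (let ((e1 (cadr e)) (e2 (caddr e)))
        (cond ((not (value? e1)) (list '+ (step e1) e2))
              ((not (value? e2)) (list '+ e1 (step e2)))
              (else (+ e1 e2)))))

    ;; eval_s-style definition: iterate R until a value appears.
    ;; If no iterate R^i(e) is a value, this never returns (bottom).
    (define (eval-s e)
      (if (value? e)
          e
          (eval-s (step e))))

    ;; (eval-s '(+ 1 (+ 2 3))) => 6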
0
richter1974 (312)
6/29/2004 7:20:06 AM
richter@math.northwestern.edu (Bill Richter) writes:

>> > I'm using induction!  
>> 
>> But you just said `Suppose I can define a function E *without*
>> structured induction'  Are you using it or not?
>
> I am *not* using structured induction.  I'm using ordinary induction,
> as in standard reduction, which you didn't respond to.

Frankly, I don't know how to respond.  

Standard induction is usually defined as a method to prove a property
about integers.  I don't see any useful way to apply this to a
definition of valuation function.  Furthermore, I didn't recognize the
usual elements of a standard inductive proof:  the base case
(for n = 0) and the inductive step (showing that p(n) implies p(n+1)
for all n).

I'm getting frustrated.
0
jrm (1310)
6/29/2004 3:21:51 PM
richter@math.northwestern.edu (Bill Richter) writes:

> cesuraSPAM@verizon.net (William D Clinger) responded to me:
>>
>> The reason for this is that a structural induction proceeds by cases
>> over a well-founded set.  If your alternative definition proceeds by
>> a case analysis that isn't well-founded, as in many (though possibly
>> not all) of your alternative definitions, then its case analysis
>> does not support the principle of structural induction.
>
> Yes, but I don't plan to use structural induction to prove results
> about E!  Other than of course that E' = E.  If it turns out that I
> need to, then my approach looks like a bust.  If I need a lot of
> structural induction, why don't I just use the Scott-Strachey method?

Why, indeed.

>> That's not to say you can't prove anything at all using a definition
>> that isn't compositional.  It just says you can't use the principle
>> of structural induction unless your definition is compositional.
>
> I agree there are definite limitations.  With structural induction, E
> is defined recursively via the complexity of the syntax.  

It would be better said that E is defined recursively by a set of
rules for primitive syntax and rules for structural induction over
composite syntax.

Since program text is finite, it follows that a denotation exists for
any program.

> With my standard reduction approach, E is defined recursively via
> how many steps it takes to reduce the expression to a value.

And this is nonsense for many reasons.

   1.  A reduction step naturally depends on E, so E is not
       well-defined.

   2.  Many programs never reduce to a value.

   3.  The number of steps it takes for those that do reduce is not
       computable in general.

   4.  Self-application is meaningless.  This is crucial because
       without self-application we have no model for iteration or
       recursion.
0
jrm (1310)
6/29/2004 4:07:02 PM
richter@math.northwestern.edu (Bill Richter) wrote an awful lot,
and then he wrote:
> Now let's suppose I produce functions 
> EE_i: B_i ---> D_i 
> and then find  functions f_ij s.t.
> 
> EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )
> 
> These are my compositionality equations.  And that's what I think
> compositionality should mean, based on what Cartwright and Felleisen
> wrote in their "Extensible Denotational" paper:
> 
>     [The map from syntactic domains to semantic domains] satisfies the
>     law of compositionality: the interpretation of a phrase is a
>     function of the interpretation of the sub-phrases.
> 
> That is, the valuation functions EE_i satisfy the law of
> compositionality if there exist functions f_ij satisfying the above
> compositionality equations.  Then the uniqueness of Theorem 3.13 says
> that my EE_i are equal to the official BB_i constructed by structural
> induction.  So my definition of compositionality coincides with yours!

You seem to be misreading Theorem 3.13.  The theorem says that the f_ij
uniquely determine the B_i (which you are calling the EE_i since you
don't like the overloading that is resolved in Schmidt's book by the
use of bold face for the functions I'm talking about here).  Theorem
3.13 does not say that the B_i uniquely determine the f_ij.

In point of fact, there are usually many different sets of f_ij that
would determine exactly the same B_i.  Thus the f_ij that you choose
after I have defined the B_i are likely to be different from the f_ij
that I used to define the B_i.  You cannot use Theorem 3.13 as stated
to prove that the f_ij you choose will define B_i functions that are
extensionally equal to those I defined.

Now I know how to get around this, via a structural induction that
uses the original f_ij that I used to define B_i.  But your refusal
to recognize those f_ij as canonical puts you in a difficult place.
You're going to have a hard time proving that your versions of the
f_ij define the same B_i that my versions defined.  I think you're
going to have to break down and use my versions of the f_ij in your
proof.

The reason this is important is that I think you're still trying to
sneak in some mutual recursion between your f_ij functions and the
B_i functions that are defined by those f_ij functions.  If so, then
you're stuck, and you can't use Theorem 3.13 to get unstuck.

> I think you're close to agreeing I can do what I claimed: given an E
> which is compositional in my sense, I can produce an E' which is
> compositional in your (i.e. Nielson's) sense, and E' = E.

The main thing I understand about your notion of "compositional" is
that it isn't what everybody else means by that word.  If your
definition is equivalent to mine, then it will not allow you to
sneak in any mutual recursion between your f_ij functions and the
semantic functions B_i.  But I suspect that to be your goal still,
so I doubt whether your notion of "compositional" is equivalent to
mine.

Your attempted sleight of hand with respect to Theorem 3.13 has only
heightened my suspicion of your motives here.

> Yes, but I don't plan to use structural induction to prove results
> about E!

In that case, your definition of E doesn't need to be compositional,
and I have no idea why you have been arguing that it is.  For more
than two years now.

> I agree there are definite limitations.  With structural induction, E
> is defined recursively via the complexity of the syntax.  With my
> standard reduction approach, E is defined recursively via how many
> steps it takes to reduce the expression to a value.

That's because your approach is not a denotational semantics at all,
but a bog-standard small-step operational semantics.

There is nothing wrong with small-step operational semantics.  But
it is wrong for you to maintain that your small-step operational
semantics is compositional or denotational.

Will
0
cesuraSPAM (401)
6/29/2004 6:28:11 PM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

> > I am *not* using structured induction.  I'm using ordinary
> > induction, as in standard reduction, which you didn't respond to.
> 
> Standard induction is usually defined as a method to prove a property
> about integers.  I

Joe, perhaps you have misread me.  I didn't say `standard induction'.  I
said `standard reduction', as in F-F's eval_s.   

> I'm getting frustrated.

I'm saying 2 things that you've got the math skills to understand:

1) I'm saying we can define DS semantic functions along the lines of
F-F's standard reduction function eval_s.  

2) I'm saying that Nielson's definition of compositionality is
equivalent to a definition that I thought was in C-F.  

I can't say whether it's worth your time to try.  I do think there's a
limited payoff in trying to refute my 1 or 2 without actually reading
the math, either my posts or F-F's treatment of eval_s.
0
richter1974 (312)
6/30/2004 1:38:39 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

> richter@math.northwestern.edu (Bill Richter) wrote an awful lot,
> and then he wrote:

That's funny, Will.  Thanks for the long response.  I have things to
say first, which should help with your response.  You earlier posted:

   Stoy, "Denotational Semantics", 1977, section on computability,
       pages 130-131.

Thanks.  That's about the computability of the semantic function which
(as you pointed out) is a problem for me.  I'll book up & report.

   Stoy, "Denotational Semantics", 1977, end of section on operational
       semantics, the paragraph beginning with "At first sight",
       pages 338-339.

I'm using Stoy's unevaluated text, but he's talking about dynamic
scope, and he writes down an obviously non-compositional equation:

curly-E[[ I ]] rho = curly-E[[ rho[[ I ]] ]] rho	 (13.39)

With either my definition of compositionality or yours, we write down
the same equations!  It's just a question of whether those equations
are a recipe for structural induction or not.  I'm in complete
agreement with what Stoy writes:

"13.39 breaks the rule about the value associated with a construct
 being defined in terms of the values associated with the
 subcomponents: rho[[ I ]] is not a subcomponent of I." 
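
The flavor of 13.39 in code (a sketch of mine, not anything in Stoy):
an environment that maps identifiers to *expressions*, with lookup
re-evaluating them, so the clause for I recurs on rho[[ I ]], which is
not a subphrase of I:

    ;; Non-compositional lookup in the style of 13.39: rho maps
    ;; identifiers to unevaluated expressions.
    (define (eval-expr e rho)
      (cond ((number? e) e)
            ((symbol? e)
             ;; curly-E[[ I ]] rho = curly-E[[ rho[[ I ]] ]] rho
             (eval-expr (cdr (assq e rho)) rho))
            ((eq? (car e) '+)
             (+ (eval-expr (cadr e) rho)
                (eval-expr (caddr e) rho)))))

    ;; (eval-expr 'x '((x . (+ y 1)) (y . 2))) => 3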

Now on to Nielson, who proves a result of my sort!

   Theorem 4.55: For every statement S of While we have 

   curly-S_{sos}[[ S ]] = curly-S_{ds}[[ S ]] 

But he wrote (as I quoted earlier) on p 85:

   The functions curly-S_{ns} and curly-S_{sos} [...] are not
    denotational definitions because they are not defined
    compositionally.

There's a real communication problem here!  I won't say error, because
it's an "Applied" book, and the authors are Danish.

But we agree that compositionality is a property of a function.  Just
like "finitely presented" is a property of a group, or "primitively
recursive" is a property of a function.  These definitions are defined
in terms of the definition of the group/function.  So it's the function

curly-S_* : Stm ---> (State |-->  State) 

which is either compositional or not.  Not its definition.

Theorem 4.55 says that curly-S_{sos} = curly-S_{ds}.  Two functions
are equal if and only if they're equal on all elements of the source,
as Schmidt says in regards to extensionality (a ZFC axiom BTW).

Now if curly-S_{sos} = curly-S_{ds}, and a mathematical property holds
for curly-S_{ds}, then the property holds for curly-S_{sos}!

So curly-S_{sos} is a compositional function, by our definitions. 

So when Nielson says curly-S_{sos} is not a denotational definition, I
think he's miscommunicating.  I think he means that the definition of
curly-S_{sos} is not compositional.  Which is true, and that's fine.

Thus, the group G = (x : x^2, x^3, x^4, ... ) is finitely presented,
even though I just gave an infinite number of relations.  It doesn't
matter.  G is isomorphic (I'd say equal) to G' = (x : x^2) which is
the cyclic group of order 2.  We can define the factorial function
using Bessel functions, but it's still primitive recursive!


> > Now let's suppose I produce functions 
> > EE_i: B_i ---> D_i 
> > and then find  functions f_ij s.t.
> > 
> > EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )
> > 
> > These are my compositionality equations.  And that's what I think
> > compositionality should mean, [...]

> You seem to be misreading Theorem 3.13.  The theorem says that the
> f_ij uniquely determine the B_i (which you are calling the EE_i
> since you don't like the overloading that is resolved in Schmidt's
> book by the use of bold face for the functions I'm talking about
> here).  

No, Will, I first went through Theorem 3.13 and I resolved the
overloading by writing Schmidt's functions as
BB_i: B_i ---> D_i
To me XX means boldface X.  I'll assume you understand/agree and I'll
change your B_i's to BB_i's accordingly.  B_i is a domain.

And then: Yes, the f_ij uniquely determine the BB_i!

Then I assumed I had a different group of functions EE_i, which were
compositional in my sense, and I used the uniqueness of 3.13 to assert
that EE_i = BB_i.  Let's put this off for a while, though:

> Theorem 3.13 does not say that the BB_i uniquely determine the f_ij.

Sure, absolutely!  

> In point of fact, there are usually many different sets of f_ij that
> would determine exactly the same BB_i.

Interesting, I hadn't thought of that.

> Thus the f_ij that you choose after I have defined the BB_i are
> likely to be different from the f_ij that I used to define the BB_i.
> You cannot use Theorem 3.13 as stated to prove that the f_ij you
> choose will define BB_i functions that are extensionally equal to
> those I defined.

Very clever!!!  But that's not what I meant.  Sorry if I wasn't clear.

I said I'll choose both the functions EE_i and f_ij, satisfying the
compositionality equations

EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )

Then you have to define the BB_i (by structural induction) using 
*my* f_ij, and your BB_i satisfy the equations of 3.13:

BB_i( Option_ij ) = f_ij ( BB_ij1(S_ij1),..., BB_ijk(S_ijk) )

And then by the uniqueness of 3.13, EE_i = BB_i.  

BTW let's note that Schmidt's proof of 3.13 is 7 lines long.  The
statement is arduous, but the proof is trivial.  This supports my
repeated claims that it's "obvious" that the 2 definitions of
compositionality are the same.

> Now I know how to get around this, via a structural induction that
> uses the original f_ij that I used to define BB_i.  But your refusal
> to recognize those f_ij as canonical puts you in a difficult place.

I don't think so.  I'm merely asserting that my EE_i are compositional
in your sense.  I'm not making any other claims (such as
computability) about my EE_i.

> You're going to have a hard time proving that your versions of the
> f_ij define the same BB_i that my versions defined.  

I flat out won't be able to do it!!!  I won't try!!!

> I think you're going to have to break down and use my versions of
> the f_ij in your proof.

I think I'm fine. Please take another look now that I've clarified.

> The reason this is important is that I think you're still trying to
> sneak in some mutual recursion between your f_ij functions and the
> BB_i functions that are defined by those f_ij functions.  If so,
> then you're stuck, and you can't use Theorem 3.13 to get unstuck.

Yes, but I've repented that error, which Lauri caught me on.  I'm now

1) defining my EE_i first, by standard reduction, then

2) defining my f_ij, in terms of EE_i, then 

3) showing the compositionality equations displayed in 3.13 are
satisfied for my EE_i & f_ij, then

4) using 3.13 to deduce that my EE_i are equal to the BB_i produced
by structural induction, and from this deduce only one thing:

5) my EE_i are compositional in your sense.


> > I think you're close to agreeing I can do what I claimed: given an
> > E which is compositional in my sense, I can produce an E' which is
> > compositional in your (i.e. Nielson's) sense, and E' = E.
> 
> The main thing I understand about your notion of "compositional" is
> that it isn't what everybody else means by that word.  

Allow me to restate it: The EE_i: B_i ---> D_i are compositional if
there exist f_ij such that

EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )

> If your definition is equivalent to mine, then it will not allow you
> to sneak in any mutual recursion between your f_ij functions and the
> semantic functions BB_i.  

Absolutely!!!

> But I suspect that to be your goal still, so I doubt whether your
> notion of "compositional" is equivalent to mine.
>
> Your attempted sleight of hand with respect to Theorem 3.13 has only
> heightened my suspicion of your motives here.

I wish you had said instead: 

"I agree your notion of compositional is equivalent to mine, but I
 expect you to fail to produce any compositional functions: I think
 you'll sneak in mutual recursion between f_ij & BB_i."

> > Yes, but I don't plan to use structural induction to prove results
> > about E!
> 
> In that case, your definition of E doesn't need to be compositional,
> and I have no idea why you have been arguing that it is.  For more
> than two years now.

For 2 reasons:

1) Many DS authors (such as Nielson) insist on compositionality as
   part of the definition of a DS valuation function.

2) To me, it's like conservation laws in Hamiltonian systems.  If
   someone tells you about conservation of angular momentum, you try
   to prove it in your setup.

> There is nothing wrong with small-step operational semantics.  But
> it is wrong for you to maintain that your small-step operational
> semantics is compositional or denotational.

If you fight through what I'm saying, I think you'll agree I succeeded
now with re-defining compositionality, and then I can attempt to
define a compositional semantic function. 

But I'm close to saying that any small-step OpS (if I understand the
phrase) is easily recast as DS, if we can only prove "conservation
laws", i.e find f_ij and prove the compositionality equations.
0
richter1974 (312)
6/30/2004 3:22:34 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>
>> > I am *not* using structured induction.  I'm using ordinary
>> > induction, as in standard reduction, which you didn't respond to.
>> 
>> Standard induction is usually defined as a method to prove a property
>> about integers.  I
>
> Joe, perhaps you have misread me.  I didn't say `standard induction'.  I
> said `standard reduction', as in F-F's eval_s.   

Yes, my apologies.

>> I'm getting frustrated.
>
> I'm saying 2 things that you've got the math skills to understand:
>
> 1) I'm saying we can define DS semantic functions along the lines of
> F-F's standard reduction function eval_s.  
>
> 2) I'm saying that Nielson's definition of compositionality is
> equivalent to a definition that I thought was in C-F.  
>
> I can't say whether it's worth your time to try.  I do think there's a
> limited payoff in trying to refute my 1 or 2 without actually reading
> the math, either my posts or F-F's treatment of eval_s.

I do indeed try to read them, but you have to take it in smaller
steps.

> 1) I'm saying we can define DS semantic functions along the lines of
>     F-F's standard reduction function eval_s.

Part of the issue seems to revolve around the definition of
`denotational semantics'.  Let me try to describe what we're
attempting to do with denotational semantics.

The purpose of semantics is to ascribe `meaning' to programs.  In the
denotational semantics approach, we describe the meaning of a program
in terms of a mathematical system.  Typically, the system is some
version of Zermelo-Fraenkel set theory, but there are others.  The
basic idea is to take a program (a syntactic construct) and construct
a mathematical `statement' or `formula' that we say is the `meaning'.
The point being that we can now do interesting things with the
mathematical statement, for example, prove it true, assume it to be
true as part of a proof, disprove it, show that it is equivalent to a
different mathematical statement, simplify it, etc.

One thing we *might* do is attempt to simplify the statement by
reducing it to a `normal form'.  There may not be a `normal form'.

In denotational semantics, we describe the operation of a computer as
a mechanical means of generating iterative approximations to the
normal form (if it exists).


Operational semantics has a very different focus.  Operational
semantics views a program not as a point in a value space, but rather
as a description of a process.  The process evolves over time and may
(or may not) reach a `final state'.  The judgements in operational
semantics mathematically describe the allowed evolution of the process.

In operational semantics, we describe the operation of a computer as a
mechanical embodiment of the judgements.  The computer iteratively
applies the judgements and stops only if it reaches the `final state'
(halts) or if no judgements apply (it is `stuck').


Yes, these views are related, but not in an obvious manner.  The
primary issue is this:  if a normal form exists, we want the
operational semantics to terminate with the final state being that
normal form.  This will constrain the rules of our operational
semantics, but when someone puts forth an operational semantics, it is
incumbent upon them to show that it reaches a normal form if one
exists (assuming, of course, that the *intent* of the operational
semantics is to do so!)  This is usually done by appeal to standard
reduction and the Church-Rosser theorem.  On the other hand, if the
operational semantics terminates, we want the final state to be the
normal form of the original expression.  `Termination' is a property
of operational semantics.

`Termination' is *not* a property of denotational semantics.  There is
nothing to terminate.  You might ask if the square root of two is
rational, but you wouldn't ask if it `terminates'.  In denotational
semantics we introduce a `bottom' element into the domain equations to
mean `insufficient information'.  The `bottom' element comes in handy
in several ways, for example, you may wish to prove that all programs
with certain properties map to bottom.  You might want to show that no
programs of a certain type map to bottom.  Bottom is *frequently*
found as a subexpression in a denotational expression.  Some languages
have rules that most (but not all) expressions *containing* bottom are
equivalent to bottom.

Since `bottom' is a limit, a syntactic construct that maps to bottom
cannot be completely reduced by a computer.  Yet a computer can
continue to make iterative approximations to it.  Thus the bottom
element is indicative of a non-terminating program.  `Bottom' is *not*
a property of operational semantics.  You will not find judgments
predicated on expressions that denote bottom, nor will you find bottom
in a `final state'.


Now to return to your statement:

> 1) I'm saying we can define DS semantic functions along the lines of
>     F-F's standard reduction function eval_s.

Do you see why this is nonsensical?  `Standard reduction' is an
algorithm for reducing something to normal form.  This is something
you would do with operational semantics, but it has no analog in
denotational semantics.

It is not completely unrelated:  iterative applications of standard
reduction may result in a normal form, and that normal form ought to
be the same one you would get by writing the denotation in normal form.
But you cannot *derive* the denotational semantics by repeated
application of standard reduction.  The reason is that you need a
closed-form solution of the recursive reduction algorithm and such a
form DOESN'T EXIST.  (Not only does it not exist, it's rather easy to
prove that it cannot exist.)

> 2) I'm saying that Nielson's definition of compositionality is
>    equivalent to a definition that I thought was in C-F.

It wasn't the compositionality that we were hung up on; it was the
circular definition of the composition function.

0
jrm (1310)
6/30/2004 8:32:40 PM
Joe Marshall <jrm@ccs.neu.edu> responds to me:

> > With my standard reduction approach, E is defined recursively via
> > how many steps it takes to reduce the expression to a value.
> 
> And this is nonsense for many reasons.

Joe, it really sounds like you're saying that F-F's treatment of
standard reduction is nonsense.  You really ought to read it.

>    1.  A reduction step naturally depends on E, so E is not
>        well-defined.

F-F and I use a small-step function, which I called R.  R is not
recursively defined.  R finds the evaluation context and performs a
single action, such as the beta_v rule
(\x . b) V |---> b[x <- V]

One of the attractions of HtDP's Advanced Scheme is that the beta_v
rule holds!  Which is to say, they've pushed the mutation problems
into their 'local' construction.  But even so, I still like it.
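
For concreteness, the action b[x <- V] can be sketched in Scheme for a
pure subset, ignoring shadowing and variable capture (an illustration,
not a full substitution algorithm):

(define (subst b x V)
  (cond ((eq? b x) V)                                  ; the variable itself
        ((pair? b) (map (lambda (t) (subst t x V)) b)) ; recur into subterms
        (else b)))                                     ; constants, other vars

;; (subst '(+ x 1) 'x 5)  =>  (+ 5 1)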

>    2.  Many programs never reduce to a value.

Yes, and so I (presumably F-F mean this as well) say the program gets
mapped to bottom if no iterate R^i(program) is a value.  

It's like Boolos & Jeffrey's treatment of partial recursive functions.
Just think of a partial function E: N ---> N defined by
E(x) = f(x), for P(x) = true
       E(R(x)), otherwise
Folks here tend to use CPOs & fixed-point operators to define E.  But
B-J don't do so!  They produce instead what the CPO approach calls the
minimal fixed point.  Will you speculate that B-J are flubbing it?
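
To see how such an E can be partial, instantiate f, P and R with some
concrete (hypothetical) choices:

(define (P x) (even? x))                  ; the stopping test
(define (f x) (quotient x 2))             ; the answer when P holds
(define (R x) (+ x 2))                    ; preserves parity, so odds loop
(define (E x) (if (P x) (f x) (E (R x))))

;; (E 10) => 5, but (E 3) never returns: E is defined only on the evens.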

>    3.  The number of steps it takes for those that do reduce is not
>        computable in general.

Interesting!  Makes sense.  But it doesn't matter, because we're using
strong math axioms to define the function E, and we're not printing
the number of steps, and the same issue arises for B-J.

>    4.  Self-application is meaningless.  This is crucial because
>        without self-application we have no model for iteration or
>        recursion.

But you get recursion in LC and also LC_v, right?  So it sounds like
you're saying that F-F and Barendregt flubbed it.  I'm just doing what
they do, and I think it's fine.  If you're going to attack folks, even
indirectly, it makes sense to read what they wrote!
0
richter1974 (312)
7/1/2004 12:57:22 AM
richter@math.northwestern.edu (Bill Richter) writes:

> 1) I'm saying we can define DS semantic functions along the lines of
> F-F's standard reduction function eval_s.  

Then it's not a denotational semantics.

If you use a step-wise sequence of reductions to construct the
"meaning" of a program, you're effectively evaluating it.  This is an
operational semantics.  You can make statements about the length of
this sequence, whether or not it terminates, and so on.  These are all
properties of an operational semantics.  Not only are these not
properties of a denotational semantics, if something claims to be a DS
that exhibits these properties, the claim is nonsensical.

Shriram
0
sk1 (223)
7/1/2004 1:25:15 AM
richter@math.northwestern.edu (Bill Richter) wrote a lot more
stuff, ending with:
> > There is nothing wrong with small-step operational semantics.  But
> > it is wrong for you to maintain that your small-step operational
> > semantics is compositional or denotational.
> 
> If you fight through what I'm saying, I think you'll agree I succeeded
> now with re-defining compositionality, and then I can attempt to
> define a compositional semantic function. 

Abraham Lincoln posed this question:  If you call a tail a leg, then
how many legs does a dog have?

Four.  Calling a tail a leg doesn't make it a leg.

Re-defining the word "compositional" doesn't make your semantics
compositional.  In particular, your bog-standard small-step
operational semantics doesn't support proofs by structural
induction, as a compositional semantics would.

> But I'm close to saying that any small-step OpS (if I understand the
> phrase) is easily recast as DS, if we can only prove "conservation
> laws", i.e find f_ij and prove the compositionality equations.

You can say anything you want, but saying doesn't make it so.

Will
0
cesuraSPAM (401)
7/1/2004 2:28:39 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responds to me:
>
>> > With my standard reduction approach, E is defined recursively via
>> > how many steps it takes to reduce the expression to a value.
>> 
>> And this is nonsense for many reasons.
>
> Joe, it really sounds like you're saying that F-F's treatment of
> standard reduction is nonsense.  You really ought to read it.

It may sound that way, but I assure you I am not.  I have read Matthew
Flatt and Matthias Felleisen's paper at
http://www.ccs.neu.edu/course/com3357/mono.ps 

>>    1.  A reduction step naturally depends on E, so E is not
>>        well-defined.
>
> F-F and I use a small-step function, which I called R.  R is not
> recursively defined.  R finds the evaluation context and performs a
> single action, such as the beta_v rule
> (\x . b) V |---> b[x <- V]

And exactly what is the beta_v rule?  We have to go all the way back
to Church's lambda calculus to answer this.  The beta rule is one of
the three rules for reducing lambda expressions.  Implicit in this
definition is that beta reduction does not change the *meaning* of the
expression.  That is to say, for *any* lambda expression that is
capable of being beta-reduced, the beta-reduced expression is
equivalent.

Now consider the `42' rule.  It works like this:
(\x . b) V |---> b[x <- 42] 

Why is it that beta_v is used frequently but 42_v is not?  Because the
42 rule does *not* preserve the meaning of (all) expressions.

But how we think about the rule --- how a person *understands* the
rule --- is relative to the context in which we discuss the rule.  We
reject the 42 rule because it makes no sense in the context of lambda
calculus.  If we were discussing not lambda calculus but strings of 27
characters, then the 42 rule is a valid example.

So how are we to understand beta_v?  In the context of lambda
calculus, beta_v is one of the many rules that *preserve* the meaning
of E.  Starting from E, you *define* rules that preserve E.

But you want to go in the other direction.  Starting with beta_v, you
wish to *define* E.  But beta_v by itself, without reference to E, is
a meaningless manipulation of meaningless symbols.  Beta_v in the
context of E cannot be used to define E, but beta_v without the
context of E has no meaning.

>>    2.  Many programs never reduce to a value.
>
> Yes, and so I (presumably F-F mean this as well) say the program gets
> mapped to bottom if no iterate R^i(program) is a value.  

You really ought to read the paper.  
Page 43: 
   If eval_v(M) does not exist, we say that M diverges.

Page 89:
   This situation corresponds to a genuine infinite loop in the
   program, which, in general, cannot be detected.

They do *not* at any point identify this as `bottom'.

> It's like Boolos & Jeffrey's treatment of partial recursive functions.
> Just think of a partial function E: N ---> N defined by
> E(x) = f(x), for P(x) = true
>        E(R(x)), otherwise
> Folks here tend to use CPOs & fixed-point operators to define E.  But
> B-J don't do so!  They produce instead what the CPO approach calls the
> minimal fixed point.  Will you speculate that B-J are flubbing it?

I won't speculate on what they do.  I will, however, make the bold
assertion that their functions P and R are *not* defined in terms of
E.

>>    3.  The number of steps it takes for those that do reduce is not
>>        computable in general.
>
> Interesting!  Makes sense.  But it doesn't matter, because we're using
> strong math axioms to define the function E, and we're not printing
> the number of steps, and the same issue arises for B-J.

It matters because you cannot decide in general whether E(<form>)
is bottom or not in any bounded number of steps.  If you purport to
have a `definition' of E, and I have a function that I claim satisfies
your definition, then you should be able to either show that my claim
is true or false.  It can hardly be a `definition' if you can only say
`I don't know if object x satisfies my definition' in the general case.

>>    4.  Self-application is meaningless.  This is crucial because
>>        without self-application we have no model for iteration or
>>        recursion.
>
> But you get recursion in LC and also LC_v, right?  So it sounds like
> you're saying that F-F and Barendregt flubbed it.  I'm just doing what
> they do, and I think it's fine.  If you're going to attack folks, even
> indirectly, it makes sense to read what they wrote!

If you're going to put words in my mouth it makes sense to read what I
wrote.

You certainly aren't doing what Flatt and Felleisen did.  They develop
`standard reduction' and show the following:

    - If *any* reduction produces a value, then standard reduction
      produces that value.

    - Since standard reduction works by repeated application of
      value-preserving transformations, standard reduction is
      value-preserving.


These are important because they lead to this:

    - No computer will compute more answers than standard reduction.

    - If the computer yields an answer, it is correct.

Flatt and Felleisen show that standard reduction converges to the
meaning (the E function) in the limit.  They *never* claim that E can
be *defined* as the limit that standard reduction approaches.
0
jrm (1310)
7/1/2004 4:07:22 PM
Shriram Krishnamurthi <sk@cs.brown.edu> responds to me:
> 
> > 1) I'm saying we can define DS semantic functions along the lines
> > of F-F's standard reduction function eval_s.
> 
> Then it's not a denotational semantics.

Thanks, Shriram.  The question is: what's the definition of DS, or a
DS semantic function?  We can give a cultural definition, and I think
that's what Stoy does in his intro (could be wrong).  But we can say,
DS (like Homotopy Theory) is a subject defined by its practitioners.
Good rules of thumb are: you should use Scott models of LC, CPOs, the
value of a lambda expr should be a certain kind of a function...

But instead one could give a mathematical definition of a DS semantic
function.  I thought that's what Will wanted to do.  And I thought
Will's definition for a compositional DS semantic function was that it
had a certain specific type of definition via structural induction.

And then I proved (using Schmidt's Theorem 3.13, which Will cited) 2
days ago that this was equivalent to a different definition for a
compositional DS semantic function.  I don't think anyone's responded
to my proof yet. And I'm ready to prove that functions similar to
F-F's standard reduction function eval_s satisfy my definition.

> If you use a step-wise sequence of reductions to construct the
> "meaning" of a program, you're effectively evaluating it.  This is
> an operational semantics.  You can make statements about the length
> of this sequence, whether or not it terminates, and so on.  These
> are all properties of an operational semantics.  

So far so good.

> Not only are these not properties of a denotational semantics, if
> something claims to be a DS that exhibits these properties, the
> claim is nonsensical.

This isn't good.  A function doesn't "know" what its definition is.
0
richter1974 (312)
7/2/2004 1:14:05 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

> (Bill Richter) wrote a lot more stuff, ending with:

Will, we're having a communication problem. I think I'm writing too
much, and you're not reading it, but just skipping to the end.  

> Re-defining the word "compositional" doesn't make your semantics
> compositional.  

Sure, but I proved, using Schmidt's Theorem 3.13, that my definition
of compositional was equivalent to yours.  

> In particular, your bog-standard small-step operational semantics
> doesn't support proofs by structural induction, as a compositional
> semantics would.

Well, that's perhaps a weakness of my semantic function.  However, I
of course can give proofs by structural induction, since I proved that
my definition of compositionality is the same as yours, i.e. that my
function was defined by structural induction.
0
richter1974 (312)
7/2/2004 1:22:30 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

   > Joe, it really sounds like you're saying that F-F's treatment of
   > standard reduction is nonsense.  You really ought to read it.
   
   It may sound that way, but I assure you I am not.  I have read
   Matthew Flatt and Matthias Felleisen's paper at
   http://www.ccs.neu.edu/course/com3357/mono.ps

Great, Joe!  You say some interesting things below, but you don't vote
on whether their standard reduction function eval_s is well-defined.

   >>    1.  A reduction step naturally depends on E, so E is not
   >>    well-defined.
   >
   > F-F and I use a small-step function, which I called R.  R is not
   > recursively defined.  R finds the evaluation context and performs
   > a single action, such as the beta_v rule
   > (\x . b) V |---> b[x <- V]
   
   And exactly what is the beta_v rule?  We have to go all the way
   back to Church's lambda calculus to answer this.

No, we can read it in F-F, or HtDP, sec 38.3 :)

   >>    2.  Many programs never reduce to a value.
   >
   > Yes, and so I (presumably F-F mean this as well) say the program
   > gets mapped to bottom if no iterate R^i(program) is a value.
   
   You really ought to read the paper.  

You ought to be more polite :) I've even simplified the proof of their
Standard Reduction Theorem 6.2, and extended my simple proof to the
much harder LC result (on my web page).

   Page 43: 
      If eval_v(M) does not exist, we say that M diverges.
   
   Page 89:
      This situation corresponds to a genuine infinite loop in the
      program, which, in general, cannot be detected.
   
   They do *not* at any point identify this as `bottom'.

Correct.  As you know, there's a standard translation between partial
functions
f: X ---> Y
and total functions
g: X ---> Y_bottom.

g^{-1}(bottom) is the subset of X on which f is undefined, and f & g
agree otherwise.  Folks on c.l.s., and DS books in general, prefer
total/bottom functions to partial functions, so I "translated" F-F.
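
In Scheme the translation is a one-liner, with one catch: it
presupposes that we are *given* the domain of definition (a `defined?'
predicate here), and for eval_s no computable such predicate exists,
which is exactly the objection raised below:

(define bottom 'bottom)                        ; sentinel for the new element
(define (totalize f defined?)                  ; g from f and dom(f)
  (lambda (x) (if (defined? x) (f x) bottom)))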
   
   > It's like Boolos & Jeffrey's treatment of partial recursive
   > functions.  Just think of a partial function E: N ---> N defined
   > by
   > E(x) = f(x), for P(x) = true
   >        E(R(x)), otherwise
   > Folks here tend to use CPOs & fixed-point operators to define E.
   > But B-J don't do so!  They produce instead what the CPO approach
   > calls the minimal fixed point.  Will you speculate that B-J are
   > flubbing it?
   
   I won't speculate on what they do.  I will, however, make the bold
   assertion that their functions P and R are *not* defined in terms
   of E.

Yes, absolutely!  I said that myself in my case.  It's part of their
(recursive!) definition of partial recursive functions N^k ---> N.
We're given that f, P & R are already known to be partial recursive
functions, and then E is one too.  B-J's definition is more
general, in terms of minimization.
   
   >>    3.  The number of steps it takes for those that do reduce is
   >>    not computable in general.
   >
   > Interesting!  Makes sense.  But it doesn't matter, because we're
   > using strong math axioms to define the function E, and we're not
   > printing the number of steps, and the same issue arises for B-J.
   
   It matters because you cannot decide in general whether E(<form>)
   is bottom or not in any bounded number of steps.  

Correct.  Similarly, F-F cannot decide if E(<Lambda_v expression>) is
undefined in any bounded number of steps.   BTW I now tend to think 
this number-of-steps function is computable, not that it matters.

   If you purport to have a `definition' of E, and I have a function
   that I claim satisfies your definition, then you should be able to
   either show that my claim is true or false.  It can hardly be a
   `definition' if you can only say `I don't know if object x
   satisfies my definition' in the general case.

I don't understand this, but it seems to apply equally well to F-F.
   
   You certainly aren't doing what Flatt and Felleisen did.  

Right, I'm doing much less.  But my construction of my function E is
similar to their construction of eval_s.

   They develop `standard reduction' and show the following:
   
       - If *any* reduction produces a value, then standard reduction
         produces that value.

Very good!  I can't do that in Scheme though!!!  We don't have any
notion of reduction other than standard reduction.  So I can't
prove or state any analogue of their Standard Reduction Theorem.
   
       - Since standard reduction works by repeated application of
         value-preserving transformations, standard reduction is
         value-preserving.

I'm not sure what that means.  If E is already a value, then 
eval_s(E) = E.  
You don't do anything in that case.

      These are important because they lead to this:
   
       - No computer will compute more answers than standard
         reduction.
   
       - If the computer yields an answer, it is correct.
   
Uh, doesn't sound quite right.  See, they already defined general
reductions in LC_v, the combination of alpha and beta_v reductions.
So let's say we've got some reduction of an expr E to a value V.  F-F
prove the deep theorem that V is "essentially" eval_s(E).
Furthermore, if eval_s(E) is undefined, then there is no other
possible reduction of E to a value.  That's a much deeper result in
LC, though, because if E = (E_1 E_2), the only possible reductions on
E consist of reducing E_1 and E_2 until E_1 & E_2 become values.

   Flatt and Felleisen show that standard reduction converges to the
   meaning (the E function) in the limit.  They *never* claim that E
   can be *defined* as the limit that standard reduction approaches.

I think your 2nd statement is true; I don't understand your 1st.  One of
the great advantages of the Scott/CPO approach is that you can talk
about limits.  So you can't talk about eval_s being a limit, can you?

That's not the issue about well-defined-ness.  The issue is whether
you can define eval_s by saying 

eval_s(E) = V, if the algorithm terminates at V
            undefined, if the algorithm never terminates.

F-F don't say a great deal.  They write on p 53, 
"By Theorem 6.1, eval_s is a well-defined partial function."
That's enough for me.
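
The intended reading, as a sketch (value? and the small-step function
R as above):

(define (eval_s e)
  (if (value? e)
      e                   ; the algorithm terminated at a value
      (eval_s (R e))))    ; otherwise take one more standard-reduction step

;; never returns when no iterate R^i(e) is a value -- the partiality.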
0
richter1974 (312)
7/2/2004 2:19:09 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:
   
   > I'm saying 2 things that you've got the math skills to
   > understand:
   >
   > 1) I'm saying we can define DS semantic functions along the lines
   > of F-F's standard reduction function eval_s.
   >
   > 2) I'm saying that Nielson's definition of compositionality is
   > equivalent to a definition that I thought was in C-F.
   >
   > I can't say whether it's worth your time to try.  I do think
   > there's a limited payoff in trying to refute my 1 or 2 without
   > actually reading the math, either my posts or F-F's treatment of
   > eval_s.
   
   I do indeed try to read them, but you have to take it in smaller
   steps.

Sure, let's go slow.  What I'm saying is simple, but I think it's
worth understanding. 
   
   > 1) I'm saying we can define DS semantic functions along the lines
   > of F-F's standard reduction function eval_s.
   
   Part of the issue seems to revolve around the definition of
   `denotational semantics'.  Let me try to describe what we're
   attempting to do with denotational semantics.
   
I'm going to skip your discussion, Joe, because I'm only interested in
a mathematical definition of a DS valuation function.  I thought Will
and I had one, and I want to stick to that.

   Now [after my semantics discussion] to return to your statement:
   
   > 1) I'm saying we can define DS semantic functions along the lines of
   >     F-F's standard reduction function eval_s.
   
   Do you see why this is nonsensical?  `Standard reduction' is an
   algorithm for reducing something to normal form.  This is something
   you would do with operational semantics, but it has no analog in
   denotational semantics.
   
Perhaps you're not appreciating the abstract nature of Math.  These
are all vague cultural remarks.  The mathematical definition of a DS
valuation function is a mathematically defined function

E: Expressions ---> Some-Set-of-Meanings

and the only way culture gets into this is that we have to decide (with our
real-world experience) whether E is actually a good mathematical
formulation of the output of a program.  OK, and then Will & I hatched
on a mathematical definition of E being compositional.  Culture plays
absolutely no role in asking whether functions E that I define are
compositional.  It's all math!  And I got the ball rolling by proving
that Will's definition of compositionality (E must be definable by
structure induction in a certain way) was equivalent to my definition
of compositionality.
   
   > 2) I'm saying that Nielson's definition of compositionality is
   > equivalent to a definition that I thought was in C-F.
   
   It wasn't the compositionality that we were hung up on, it was
   circular definition of the composition function.

I don't think you speak for Will here.   But this is why you & I are
discussing whether F-F's eval_s is circular, or well-defined.
0
richter1974 (312)
7/2/2004 3:03:13 AM
richter@math.northwestern.edu (Bill Richter) writes:

> But instead one could give a mathematical definition of a DS semantic
> function.  I thought that's what Will wanted to do.  And I thought
> Will's definition for a compositional DS semantic function was that it
> had a certain specific type of definition via structural induction.

Correct.  I would say the term "compositional DS semantic function" is
redundant: a defining characteristic is that it be compositional, ie,
defined by structural induction.  Furthermore, it cannot diverge
(because functions in mathematics don't "diverge").

> And then I proved (using Schmidt's Theorem 3.13, which Will cited) 2
> days ago that this was equivalent to a different definition for a
> compositional DS semantic function.  I don't think anyone's responded
> to my proof yet. 

I wonder why.

>                   And I'm ready to prove that functions similar to
> F-F's standard reduction function eval_s satisfy my definition.

Does your function-similar-to-eval_s diverge on some programs?

> > Not only are these not properties of a denotational semantics, if
> > something claims to be a DS that exhibits these properties, the
> > claim is nonsensical.
> 
> This isn't good.  A function doesn't "know" what its definition is.

Geez, everyone's a pedant.  "Not only are these not properties of a
denotational semantics, if someone claims that a semantics S is a DS
and S exhibits these properties, then that person's claim is
nonsensical."  (I thought this was pretty obvious, given that a
function doesn't `know' what its definition is.)

Shriram
0
sk1 (223)
7/2/2004 3:20:37 AM
Bill Richter wrote:
{stuff deleted}
>    Page 43: 
>       If eval_v(M) does not exist, we say that M diverges.
>    
>    Page 89:
>       This situation corresponds to a genuine infinite loop in the
>       program, which, in general, cannot be detected.
>    
>    They do *not* at any point identify this as `bottom'.
> 
> Correct.  As you know, there's a standard translation between partial
> functions
> f: X ---> Y
> and total functions
> g: X ---> Y_bottom.
> 
> g^{-1}(bottom) is the subset of X on which f is undefined, and f & g
> agree otherwise.  Folks on c.l.s., and DS books in general, prefer
> total/bottom functions to partial functions, so I "translated" F-F.

No such "g" can exist. If it did you have solved the halting problem.
You cannot turn the eval_v function into a total function by wishful thinking.
0
danwang742 (171)
7/2/2004 4:23:12 AM
"Daniel C. Wang" <danwang74@hotmail.com> wrote in message news:<2kk69iF39ri4U1@uni-berlin.de>...
> Bill Richter wrote:
> {stuff deleted}
> >    Page 43: 
> >       If eval_v(M) does not exist, we say that M diverges.
> >    
> >    Page 89:
> >       This situation corresponds to a genuine infinite loop in the
> >       program, which, in general, cannot be detected.
> >    
> >    They do *not* at any point identify this as `bottom'.
> > 
> > Correct.  As you know, there's a standard translation between partial
> > functions
> > f: X ---> Y
> > and total functions
> > g: X ---> Y_bottom.
> > 
> > g^{-1}(bottom) is the subset of X on which f is undefined, and f & g
> > agree otherwise.  Folks on c.l.s., and DS books in general, prefer
> > total/bottom functions to partial functions, so I "translated" F-F.
> 
> No such "g" can exist. If it did you have solved the halting problem.
> You cannot turn the eval_v function into a total function by wishful thinking.

function != algorithm.

Just because a function is not computable does not mean it does not
exist.

Every program does indeed either halt or not halt.  So a function that
is the solution to the halting problem certainly exists; we just can't
write down a way to calculate such a function totally.
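
Concretely: define H(p) = 1 if program p halts, and H(p) = 0 otherwise.
Classically H is a perfectly good total function on programs; Turing's
theorem says only that no algorithm computes H.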
0
pnkfelix (27)
7/2/2004 2:32:59 PM
Felix Klock wrote:
{stuff deleted}
> 
> function != algorithm.
> 
> Just because a function is not computable does not mean it does not
> exist.
> 
> Every program does indeed either halt or not halt.  So a function that
> is the solution to the halting problem certainly exists; we just can't
> write down a way to calculate such a function totally.

I guess I'm too much of a constructivist: if I can't write it down, it
doesn't exist. This is a point of philosophy as much as of the
peculiarities of your formal logic.

In any case, if you are willing to admit this kind of trickery, then I
suppose I'll have to admit that what Bill is doing is "sensible".

He's just defining the meaning of a term as its canonical form if it exists.
You can of course easily do this compositionally. It is a compositional 
semantics of some sort, but the notion of equality or meaning under this 
system is not very interesting.

For example the following two terms do not have the same meaning under 
Bill's semantics

   lam x . 1 + x

and

   lam x . x + 1

What most people recognize as a proper denotational semantics would say the 
two terms map to the same semantic object. However, I don't think in any 
strict technical sense a denotational semantics is required to do so.

In a strict technical sense a denotational semantics seems to require
only an inductive definition of some total function from program syntax
to some set of semantic objects. Under this interpretation it seems just
about everything can be considered a "denotational semantics", even a
type-checker. This
understanding completely ignores the "spirit" of denotational semantics and 
the historical motivations for it.
0
danwang742 (171)
7/2/2004 4:24:03 PM
"Daniel C. Wang" <danwang74@hotmail.com> wrote:
> > g^{-1}(bottom) is the subset of X on which f is undefined, and f & g
> > agree otherwise.  Folks on c.l.s., and DS books in general, prefer
> > total/bottom functions to partial functions, so I "translated" F-F.
> 
> No such "g" can exist. If it did you have solved the halting problem.
> You cannot turn the eval_v function into a total function by wishful thinking.

Friendly amendment:  Wishful thinking can turn eval_v into a total
function, but it can't turn eval_v into a computable total function.

Will
0
cesuraSPAM (401)
7/2/2004 5:32:24 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>    
>    Part of the issue seems to revolve around the definition of
>    `denotational semantics'.  Let me try to describe what we're
>    attempting to do with denotational semantics.
>    
> I'm going to skip your discussion, Joe, because I'm only interested in
> a mathematical definition of a DS valuation function.  

Then perhaps you'd be so kind as to tell us what the terms
`mathematical', `definition', and `denotational semantic valuation
function' mean, because it seems that no one here knows.

> The mathematical definition of a DS valuation function is a
> mathematically defined function

Hmmm.  This definition seems a bit circular. 

>
> E: Expressions ---> Some-Set-of-Meanings
>
> and the only way culture gets into this is we have decide (with our
> real-world experience) whether E is actually a good mathematical
> formulation of the output of a program.  

I was laboring under the illusion that we were attempting to model the
program as a mathematical construct.  It seems that you wish to model
the *output* of a program.  These are very different goals.

0
jrm (1310)
7/2/2004 8:24:31 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>
>    > Joe, it really sounds like you're saying that F-F's treatment of
>    > standard reduction is nonsense.  You really ought to read it.
>    
>    It may sound that way, but I assure you I am not.  I have read
>    Matthew Flatt and Matthias Felleisen's paper at
>    http://www.ccs.neu.edu/course/com3357/mono.ps
>
> Great, Joe!  You say some interesting things below, but you don't vote
> on whether their standard reduction function eval_s is well-defined.

Of course eval_s is well-defined.  What is not well defined is the
limit as n goes to infinity of the nth composition of eval_s.

>    >>    1.  A reduction step naturally depends on E, so E is not
>    >>    well-defined.
>    >
>    > F-F and I use a small-step function, which I called R.  R is not
>    > recursively defined.  R finds the evaluation context and performs
>    > a single action, such as the beta_v rule
>    > (\x . b) V |---> b[x <- V]
>    
>    And exactly what is the beta_v rule?  We have to go all the way
>    back to Church's lambda calculus to answer this.
>
> No, we can read it in F-F, or HtDP, sec 38.3 :)
>
>    >>    2.  Many programs never reduce to a value.
>    >
>    > Yes, and so I (presumably F-F mean this as well) say the program
>    > gets mapped to bottom if no iterate R^i(program) is a value.
>    
>    You really ought to read the paper.  
>
> You ought to be more polite :) 

See line 6.

>    >>    3.  The number of steps it takes for those that do reduce is
>    >>    not computable in general.
>    >
>    > Interesting!  Makes sense.  But it doesn't matter, because we're
>    > using strong math axioms to define the function E, and we're not
>    > printing the number of steps, and the same issue arises for B-J.
>    
>    It matters because you cannot decide in general whether E(<form>)
>    is bottom or not in any bounded number of steps.  
>
> Correct.  Similarly, F-F cannot decide if E(<Lambda_v expression>) is
> undefined in any bounded number of steps.   BTW I now tend to think 
> this number-of-steps function is computable, not that it matters.

It matters a great deal.  I'd be interested in seeing a derivation
that the number of steps is a computable function.

>    If you purport to have a `definition' of E, and I have a function
>    that I claim satisfies your definition, then you should be able to
>    either show that my claim is true or false.  It can hardly be a
>    `definition' if you can only say `I don't know if object x
>    satisfies my definition' in the general case.
>
> I don't understand this, but it seems to apply equally well to F-F.

It does.

>    You certainly aren't doing what Flatt and Felleisen did.  
>
> Right, I'm doing much less.  But my construction of my function E is
> similar to their construction of eval_s.
>
>    They develop `standard reduction' and show the following:
>    
>        - If *any* reduction produces a value, then standard reduction
>          produces that value.
>
> Very good!  I can't do that in Scheme though!!!  We don't have any
> notion of reduction other than standard reduction.  So I can't
> prove or state any analogue of their Standard Reduction Theorem.
>    
>        - Since standard reduction works by repeated application of
>          value-preserving transformations, standard reduction is
>          value-preserving.
>
> I'm not sure what that means.  If E is already a value, then 
> eval_s(E) = E.  
> You don't do anything in that case.
>
>       These are important because they lead to this:
>    
>        - No computer will compute more answers than standard
>          reduction.
>    
>        - If the computer yields an answer, it is correct.
>    
> Uh, doesn't sound quite right.  See, they already defined general
> reductions in LC_v, the combination of alpha and beta_v reductions.
> So let's say we've got some reduction of an expr E to a value V.  F-F
> prove the deep theorem that V is "essentially" eval_s(E).
> Furthermore, if eval_s(E) is undefined, then there is no other
> possible reduction of E to a value.  That's a much deeper result in
> LC, though, because if E = (E_1 E_2), the only possible reductions on
> E consist of reducing E_1 and E_2 until E_1 & E_2 become values.
>
>    Flatt and Felleisen show that standard reduction converges to the
>    meaning (the E function) in the limit.  They *never* claim that E
>    can be *defined* as the limit that standard reduction approaches.
>
> I think your 2nd statement is true; I don't understand your 1st.  One of
> the great advantages of the Scott/CPO approach is that you can talk
> about limits.  So you can't talk about eval_s being a limit, can you?

Just so.

> That's not the issue about well-defined-ness.  The issue is whether
> you can define eval_s by saying 
>
> eval_s(E) = V, if the algorithm terminates at V
>             undefined, if the algorithm never terminates.
>
> F-F don't say a great deal.  They write on p 53, 
> "By Theorem 6.1, eval_s is a well-defined partial function."
> That's enough for me.

Unfortunately, it is not enough to make it denotational.
0
jrm (1310)
7/2/2004 8:41:20 PM
pnkfelix@gmail.com (Felix Klock) writes:

> Every program does indeed either halt or not halt.  

Oh?

> So a function that is the solution to the halting problem certainly
> exists;  we just can't write down a way to calculate such a function
> totally.

You're playing fast and loose with infinities here.
0
jrm (1310)
7/2/2004 8:43:01 PM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> He's just defining the meaning of a term as its canonical form if it exists.
> You can of course easily do this compositionally. It is a
> compositional semantics of some sort, but the notion of equality or
> meaning under this system is not very interesting.
>
> For example the following two terms do not have the same meaning under
> Bill's semantics
>
>    lam x . 1 + x
>
> and
>
>    lam x . x + 1
>

What is more interesting is this:

(define (goldbach-search n)                ; n even, >= 4
  (let loop ((i 2))
    (cond ((and (prime? i) (prime? (- n i)))
           (goldbach-search (+ n 2)))      ; n is a sum of two primes: next n
          ((> i n) n)                      ; no decomposition: counterexample
          ((= i 2) (loop (+ i 1)))
          (else (loop (+ i 2))))))
(goldbach-search 4)                        ; start at the first even case

and this:

(define (loop) (loop))
(loop)

only have different meanings under Bill's `semantics' if Goldbach's
conjecture is false.  Bill cannot assign a meaning to the first one
until he proves or disproves Goldbach's conjecture.
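
For the record, goldbach-search above assumes a prime? predicate; any
throwaway trial-division sketch will do:

(define (prime? k)
  (and (> k 1)
       (let test ((d 2))
         (cond ((> (* d d) k) #t)
               ((zero? (remainder k d)) #f)
               (else (test (+ d 1)))))))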

0
jrm (1310)
7/2/2004 8:56:55 PM
richter@math.northwestern.edu (Bill Richter) wrote:
> Will, we're having a communication problem. I think I'm writing too
> much, and you're not reading it, but just skipping to the end.

Yes, you are writing too much and saying too little.  I am indeed
reading it, only to conclude that reading it was a waste of my time,
and that to write a detailed response would be a waste of others'
time as well as my own.

> > Re-defining the word "compositional" doesn't make your semantics
> > compositional.  
> 
> Sure, but I proved, using Schmidt's Theorem 3.13, that my definition
> of compositional was equivalent to yours.

No, you didn't.  According to my definition, it is the definition of
a function that is compositional.  So long as you disagree with this,
your definition of compositional can't possibly be equivalent to mine.

> > In particular, your bog-standard small-step operational semantics
> > doesn't support proofs by structural induction, as a compositional
> > semantics would.
> 
> Well, that's perhaps a weakness of my semantic function.  However, I
> of course can give proofs by structural induction, since I proved that
> my definition of compositionality is the same as yours, i.e. that my
> function was defined by structural induction.

Perhaps, my foot.  You claim to have proved that you are getting close
to the point at which you could claim that some as-yet-unspecified
correspondence holds between your semantics and a compositional
semantics.  When you actually specify that correspondence (e.g. the
f_ij functions that you claim you might be able to find), I will
comment upon it.

You know, Bill, I've been doing this sort of thing for more than 25
years.  I have proved theorems of the sort you are claiming to have
some idea of how to state and perhaps even to prove, so I'm not going
to argue that it is impossible to do something along the lines of what
you claim you will soon be able to do.  Having done it myself, however,
I know what's going to happen.  I very much look forward to commenting
upon your claims if and when you figure out how to make them concrete.

Will
0
cesuraSPAM (401)
7/2/2004 9:52:01 PM
Joe Marshall <jrm@ccs.neu.edu> writes:

> pnkfelix@gmail.com (Felix Klock) writes:
> 
> > Every program does indeed either halt or not halt.  
> 
> Oh?

What's the surprise?

> > So a function that is the solution to the halting problem certainly
> > exists;  we just can't write down a way to calculate such a function
> > totally.
> 
> You're playing fast and loose with infinities here.

No.  He is absolutely right.
0
find19 (1244)
7/2/2004 10:00:59 PM
Matthias Blume <find@my.address.elsewhere> writes:

> Joe Marshall <jrm@ccs.neu.edu> writes:
>
>> pnkfelix@gmail.com (Felix Klock) writes:
>> 
>> > Every program does indeed either halt or not halt.  
>> 
>> Oh?
>
> What's the surprise?

This is analogous to
 `Every logical assertion is either true or false.'

I'm comfortable with `Every program does either halt or not halt
within 5 minutes.'  (choose any time span you wish here)


>
>> > So a function that is the solution to the halting problem certainly
>> > exists;  we just can't write down a way to calculate such a function
>> > totally.
>> 
>> You're playing fast and loose with infinities here.
>
> No.  He is absolutely right.

Rather than argue what `exists' and `is' means, I'll just ask for a
proof.


0
jrm (1310)
7/2/2004 10:38:40 PM
Shriram Krishnamurthi <sk@cs.brown.edu> responded to me:

First let me apologize/clarify, Shriram.  All human fields, from
Physics to PostMod lit-crit, are defined culturally by their
practitioners: if you want to publish in an area, you must do work the
field considers interesting.  So you're right: my "bog-simple" stuff
belongs to the human field of OpS, not DS.  Sometimes it's hard to
locate the field, as there are interesting boundaries of CS/Biology,
or Cryptography/Number-Theory, or String-Theory/all-other-Math.

Nonetheless, fields often make definite definitions, such as
"topological space", or "homotopy groups of a (pointed) top space".
So the interesting question (to me) is whether I can meet the
definition of a compositional DS semantic function

E: Expressions ----> M ;; M is some set of "meanings"

Below, you seem to accept the idea that this is indeed a mathematical
question, and that's enough for me.  I'm not arguing that my work or
F-F's eval_s ought to be reclassified in (the human field of) DS.

   > But instead one could give a mathematical definition of a DS
   > semantic function.  I thought that's what Will wanted to do.  And
   > I thought Will's definition for a compositional DS semantic
   > function was that it had a certain specific type of definition
   > via structural induction.
   
   Correct.  I would say the term "compositional DS semantic function"
   is redundant: a defining characteristic is that it be
   compositional, ie, defined by structural induction.  

That's fine.  Cartwright-Felleisen insist on compositionality.

   Furthermore, it cannot diverge (because functions in mathematics
   don't "diverge").

I think I grok, test me: We're only interested in total functions E,
and a total mathematical function doesn't "diverge".  

   >  And I'm ready to prove that functions similar to F-F's standard
   > reduction function eval_s satisfy my definition.
   
   Does your function-similar-to-eval_s diverge on some programs?
   
No, and neither does my prototype, F-F's eval_s, once we make the
obvious translation of sending the "undefiners" to bottom.

That is, if E is an LC_v expr that doesn't standard-reduce to a value
in finite time, we send it to bottom, rather than just saying that
eval_s is undefined on this element.  Schmidt explains this
translation: a partial function f: X ---> Y is the same as a total
function g: X ---> Y_bottom, where g^{-1}(bottom) is the subset of X
on which f is undefined.  
[BTW I appreciated Felix's support here.  His chiding hadn't really
bothered me, especially after I realized that NKS is Wolfram's
bombastification: "all ideas on this newsgroup are my IP"]

   > And then I proved (using Schmidt's Theorem 3.13, which Will
   > cited) 2 days ago that this was equivalent to a different
   > definition for a compositional DS semantic function.  I don't
   > think anyone's responded to my proof yet.
   
   I wonder why.

Maybe the group is waiting for you to read my 2 long posts to Will!
You're a young hotshot CS professor; it wouldn't hurt you to learn
some Schmidt or Nielson^2, nor would it take up much of your time.  Will has no
obligation to read my posts: Will (also the DS books) hasn't published
false claims opposing me (unlike my homotopy theorists :D).
   
   > > Not only are these not properties of a denotational semantics,
   > > if something claims to be a DS that exhibits these properties,
   > > the claim is nonsensical.
   > 
   > This isn't good.  A function doesn't "know" what its definition
   > is.
   
   Geez, everyone's a pedant.  

IMO this is a serious point that Nielson^2 don't do a good job on.

   "Not only are these not properties of a denotational semantics, if
   someone claims that a semantics S is a DS and S exhibits these
   properties, then that person's claim is nonsensical."  (I thought
   this was pretty obvious, given that a function doesn't `know' what
   its definition is.)

Well, if you're making a cultural remark about which field my
function-similar-to-eval_s belongs to, you're right & I apologize.
But I just want to say that any compositional function
   
E: Expressions ----> M ;; M is some set of "meanings"

is a DS semantic function. It makes no difference how E is defined.
We can use Scott models, or we can use standard reduction.  We've just
got a mathematical test (of compositionality) to satisfy.

For that matter, the R5RS DS has an OpS-ish look to it as well.  It's
a great thing that curly-E[[lambda-expr]] is a function (let's ignore
call/cc), but it's the function you'd get by your big-step OpS or my
(Robby/Jacob/HtDP) small-step OpS.  It's just that with Scott models
you can solve the recursive set problem, and have Procedure-Values
both a subset of Values and acting on Values.
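
(For reference, the recursive set problem here is, roughly, solving

    Values  =~  Constants + (Values ---> Values)

up to isomorphism.  No ordinary set can satisfy this -- Cantor's
theorem rules it out -- but Scott domains can, once ---> is restricted
to continuous functions.)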
0
richter1974 (312)
7/3/2004 1:24:56 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

   > > Re-defining the word "compositional" doesn't make your
   > > semantics compositional.
   > 
   > Sure, but I proved, using Schmidt's Theorem 3.13, that my
   > definition of compositional was equivalent to yours.
   
   No, you didn't.  According to my definition, it is the definition
   of a function that is compositional.

Thanks, Will.  I didn't realize that.  But this change only causes me
inconvenience: I have to tell you that the "real" definition of my
function isn't my 1st definition, but my 2nd compositional definition.

But I think this puts you in disagreement with C-F & Nielson^2, and
contradicts what you wrote earlier.  First you & C-F:

      > And it's not what Cartwright and Felleisen say in their
      > "Extensible Denotational" paper.  They say:
      > 
      >     [The map from syntactic domains to semantic domains]
      >     satisfies the law of compositionality: the interpretation of
      >     a phrase is a function of the interpretation of the
      >     sub-phrases.

      That's almost the same as Schmidt's definition, and the
      difference is a very minor mistake by Cartwright and Felleisen:
      They should have said the interpretation of a phrase is defined
      by a function of the interpretation of the sub-phrases.

It's still the *functions* that satisfy the law of compositionality,
not their definitions.   That is, let's doctor C-F to read:

[The map from syntactic domains to semantic domains] satisfies the law
of compositionality: the interpretation of a phrase is DEFINED BY a
function of the interpretation of the sub-phrases.

Thus: the maps are compositional if the functions can be defined in
this structural induction fashion.  Nielson^2, p 85 says the same:

The hallmark of DS is that the semantic functions are defined
compositionally, that is [structural induction etc.]


   So long as you disagree with this, your definition of compositional
   can't possibly be equivalent to mine.

OK, but now I can say:  Suppose my functions 
EE_i: B_i ---> D_i
are compositional in my sense.  That is, there exist functions f_ij s.t. 

EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )

Then I can give a `definition of a function that is compositional':  

I have a new compositional definition of my functions EE_i.  Just run
Schmidt's Theorem 3.13 on my f_ij, to produce some BB_i.  By the
uniqueness of 3.13, EE_i = BB_i.  So EE_i has a compositional
definition.  Let's just "forget" any earlier definition of EE_i.
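
To make the f_ij concrete, take a toy grammar exp = num | (* exp exp).
A sketch of a compositional EE, with the combining function for the `*'
option written out separately:

(define (f-times m1 m2) (* m1 m2))   ; the f_ij for the `*' production

(define (EE e)                       ; defined by structural induction
  (if (number? e)
      e
      (f-times (EE (cadr e)) (EE (caddr e)))))

;; (EE '(* 2 (* 3 7))) => 42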

   
   > > In particular, your bog-standard small-step operational semantics
   > > doesn't support proofs by structural induction, as a compositional
   > > semantics would.
   > 
   > Well, that's perhaps a weakness of my semantic function.  However, I
   > of course can give proofs by structural induction, since I proved that
   > my definition of compositionality is the same as yours, i.e. that my
   > function was defined by structural induction.
   
   Perhaps, my foot.  

I don't contest this!  Surely I lose power by not using Scott models!

   You claim to have proved that you are getting close to the point at
   which you could claim that some as-yet-unspecified correspondence
   holds between your semantics and a compositional semantics.  When
   you actually specify that correspondence (e.g. the f_ij functions
   that you claim you might be able to find), I will comment upon it.

I did that 2 long posts ago!  And above.  That's the post to which you
gave a long thoughtful reply, though you seem to have only started
reading toward the end.  I gave another long reply (with Stoy &
Nielson^2 at the top) that you responded to with your funny
Lincoln/dog/tail joke.
   
   You know, Bill, I've been doing this sort of thing for more than 25
   years.  

I have a lot of respect for your experience & wisdom.  I got a lot out
of your Scheme standard, esp the R5RS DS, and I bet this is one of
your lesser accomplishments.

   I have proved theorems of the sort you are claiming to have some
   idea of how to state and perhaps even to prove, so I'm not going to
   argue that it is impossible to do something along the lines of what
   you claim you will soon be able to do.  Having done it myself,
   however, I know what's going to happen.  I very much look forward
   to commenting upon your claims if and when you figure out how to
   make them concrete.

I like the sound of this, but I need feedback on what I've already
done.  I certainly can't believe I'll prove any new "real" theorems
that you & others didn't know.  But I think I'm a really really good
mathematician, and I think it's quite possible I've noticed something
very simple (as opposed to useful!) that you didn't notice.
0
richter1974 (312)
7/3/2004 2:22:17 AM
Joe Marshall <jrm@ccs.neu.edu> responded to Matthias Blume:
 
> >> > So a function that is the solution to the halting problem
> >> > certainly exists; we just can't write down a way to calculate
> >> > such a function totally.
> >> 
> >> You're playing fast and loose with infinities here.
> >
> > No.  He is absolutely right.
> 
> Rather than argue what `exists' and `is' means, I'll just ask for a
> proof.

Joe, can we stick to a proof in LC_v with F-F's standard reduction
function eval_s?  This is the point I've been griping at you about.

On p 53, F-F define a partial function 

eval_s: LC_v Expr ---> Values

Schmidt immediately turns this into a total function 

teval_s: LC_v Expr ---> Values_bottom

where teval_s^{-1}(bottom) = Undef, the subset of LC_v Expr where the
standard reduction algorithm does not terminate at a value.

So a simpler version of your question is why teval_s is a well-defined
function.  Or why Undef is actually a subset of LC_v Expr.  

If you say you don't understand this, we can try to give you a proof,
and discuss whether F-F should've said more.  If you say you do
understand this, then we can ask what's the extra complication of real
programming languages that you think is the problem.  And then we can
say that R5RS DS defines a total function

curly-E: Scheme Expr ---> (U x S x K -> A)

to (as Will posted) the set of partial functions, and therefore a
partial program function 

program-E: Scheme Expr ---> A

given by program-E[[ X ]] = curly-E[[ X ]](rho_0 sigma_0 kappa_0)

and that immediately turns into a total function 

total-program-E: Scheme Expr ---> A_bottom

where total-program-E^{-1}(bottom) is the subset of Scheme expressions
which don't evaluate to an answer.  That's going to be pretty much the
non-halting programs, because R5RS DS is pretty good about error
messages for other errors.  Could you be saying that you don't know
why total-program-E is well-defined as a mathematical function?  I
wouldn't think less of you if you did!  I don't know where this is
explained in the CS curriculum.  I could give proofs (for teval_s
anyway) with induction & the ZFC comprehension axiom.  I'm hoping
Marlene could give a proof, using her point set topology course.
0
richter1974 (312)
7/3/2004 2:57:10 AM
Joe Marshall <jrm@ccs.neu.edu> wrote:
> This is analogous to
>  `Every logical assertion is either true or false.'

That statement is false.  What is true is that every closed
logical assertion (i.e. sentence) defines a function from
interpretations to truth values.

Come to think of it, this has a lot to do with this thread.
People should study Tarski's denotational semantics of first
order logic before moving on to the denotational semantics of
programming languages, which are rather more complex than
first order logic.

Will
0
cesuraSPAM (401)
7/3/2004 4:02:14 AM
Matthias Blume <find@my.address.elsewhere> writes:

> Joe Marshall <jrm@ccs.neu.edu> writes:
>
>> pnkfelix@gmail.com (Felix Klock) writes:
>> 
>> > Every program does indeed either halt or not halt.  
>> 
>> Oh?
>
> What's the surprise?
>
>> > So a function that is the solution to the halting problem certainly
>> > exists;  we just can't write down a way to calculate such a function
>> > totally.
>> 
>> You're playing fast and loose with infinities here.
>
> No.  He is absolutely right.

I should clarify.  The statement

   `Every program does indeed either halt or not halt.'

implies that the solution to the halting problem is a total function.


-- 
~jrm
0
7/3/2004 4:47:53 AM
richter@math.northwestern.edu (Bill Richter) writes:

> On p 53, F-F define a partial function 
>
> eval_s: LC_v Expr ---> Values
>
> Schmidt immediately turns this into a total function 
>
> teval_s: LC_v Expr ---> Values_bottom
>
> where teval_s^{-1}(bottom) = Undef, the subset of LC_v Expr where the
> standard reduction algorithm does not terminate at a value.
>
> So a simpler version of your question is why teval_s is a well-defined
> function.  Or why Undef is actually a subset of LC_v Expr.  
>
> If you say you don't understand this, we can try to give you a proof,
> and discuss whether F-F should've said more.  If you say you do
> understand this, then we can ask what's the extra complication of real
> programming languages that you think is the problem.  

It isn't that I don't believe that one can turn an operational
semantics based on standard reduction into a denotational semantics; it
is that I don't believe that it can be done without a mechanism at
least as complex as domain theory.


-- 
~jrm
0
7/3/2004 4:57:36 AM
Joe Marshall wrote:
{stuff deleted}
> It isn't that I don't believe that one can turn an operational
> semantics based on standard reduction into a denotational semantics; it
> is that I don't believe that it can be done without a mechanism at
> least as complex as domain theory.

What's your notion of denotational semantics?

Consider the denotation function that recursively assigns the meaning of 
every scheme sub-expression to the number 0. No domain theory required!
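
Spelled out as a sketch -- trivially compositional, since every
combining function is the constant-0 function:

(define (f-any . sub-meanings) 0)   ; one combining function fits all options

(define (E expr)
  (if (pair? expr)
      (apply f-any (map E expr))    ; meaning built from sub-phrase meanings
      0))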

Well, okay so it's not very interesting.... but I think this is where the 
problems lies... There are many things that technically are denotational 
semantics but few of them are "interesting".

I don't think there is a formal definition of "interesting" in general. 
However, you can easily ask that a set of equalities be provably true in 
your semantics for a specific context and then you have a well defined 
notion of interesting/useful denotational semantics.

If you want a semantic notion of equality of functions that may not
terminate, then you'll need domain theory.

However, if you are only interested in a semantics for total functions,
you can indeed get away without any domain theory. The models for truth
of first-order logic, as Will has mentioned, do not require domain
theory.

Anyway, I'm completely lost as to what Bill Richter's claim is.  Does he
have a denotational semantics that avoids CPOs?  I don't think there's
any reason to believe that he doesn't.  Is what he has any more
interesting than an OpSem in terms of what equalities can be proved?  I
think not...

You can say that a tiger is just some sort of big cat... a true statement, 
which ignores some rather important differences.
0
danwang742 (171)
7/3/2004 5:37:10 AM
Joe Marshall <jrm@ccs.neu.edu> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> > Joe Marshall <jrm@ccs.neu.edu> writes:
> >
> >> pnkfelix@gmail.com (Felix Klock) writes:
> >> 
> >> > Every program does indeed either halt or not halt.  
> >> 
> >> Oh?
> >
> > What's the surprise?
> 
> This is analogous to
>  `Every logical assertion is either true or false.'

Wrong.

0
find19 (1244)
7/3/2004 5:48:00 AM
Matthias Blume <find@my.address.elsewhere> writes:

> Joe Marshall <jrm@ccs.neu.edu> writes:
> 
> > Matthias Blume <find@my.address.elsewhere> writes:
> > 
> > > Joe Marshall <jrm@ccs.neu.edu> writes:
> > >
> > >> pnkfelix@gmail.com (Felix Klock) writes:
> > >> 
> > >> > Every program does indeed either halt or not halt.  
> > >> 
> > >> Oh?
> > >
> > > What's the surprise?
> > 
> > This is analogous to
> >  `Every logical assertion is either true or false.'
> 
> Wrong.

Ok, to be a bit less terse:  It would be analogous to, e.g.,

   "Every logical assertion is either true or it is not true."

meaning (to spell it out for those who don't get where the difference
is):

   For every logical assertion one of the following two cases must
   hold:
     1. it has a well-defined truth value that is "true"
     2. it does not have a well-defined truth value that is "true",
        i.e., it either does not have a well-defined truth value at
        all, or if it does have one, the value is not "true" (implying
        that it must then be "false")

In any case, any given (deterministic) program on any given input
either halts or it does not.  There is no third possibility such as
"whether or not the program halts is not well-defined".

Matthias
0
find19 (1244)
7/3/2004 5:56:55 AM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> Joe Marshall wrote:
> {stuff deleted}
>> It isn't that I don't believe that one can turn an operational
>> semantics based on standard reduction into a denotational semantics; it
>> is that I don't believe that it can be done without a mechanism at
>> least as complex as domain theory.
>
> What's your notion of denotational semantics?

[elided]
> If you want a semantic notion of equality of functions that may not
> terminate then you'll need domain theory.

This is exactly what I meant.  I should have been more specific.  

> However,  if you are only interested  in a semantics for total
> functions, you can indeed get away without any domain theory. The
> models for truth of a first-order logic as Will has mentioned does not
> require domain theory.

I'm interested in a semantics that can model recursive functions
through self-application.

> Anyway, I'm completely lost as to what Bill Richter's claim is.. does
> he have a denotational semantics that avoids CPOs.  I don't think
> there's any reason to believe that he doesn't.  

I believe his claim is that he can construct a denotational semantics
by repeated application of standard reduction and an appeal to
`bottom' when that doesn't work.  I believe that he further claims
that this is as powerful (or nearly so) as the semantics based on
domain theory.

My argument with him is that his semantic equations are circular and
he has given no reason to believe that they have solutions.  I believe
that in order to show that the semantics are reasonable, he will have
to invoke domain theory.

-- 
~jrm
0
7/3/2004 1:46:09 PM
richter@math.northwestern.edu (Bill Richter) writes:

> But I just want to say that any compositional function
>    
> E: Expressions ----> M ;; M is some set of "meanings"
>
> is a DS semantic function.  It makes no difference how E is defined.

It certainly does make a difference how E is defined.

> We can use Scott models, or we can use standard reduction.  We've just
> got a mathematical test (of compositionality) to satisfy.

There are more tests to satisfy.

  1.  Does E exist?  Depending on how E is specified, it is incumbent
      upon you to show that there is an E that meets the necessary
      criteria.

  2.  If E exists, is it unique?  If your specification for E admits
      two different semantics, then you need to differentiate between
      the two.

  3.  Is E a function?

  4.  Is E a total function?

0
jrm (1310)
7/3/2004 5:52:13 PM
richter@math.northwestern.edu (Bill Richter) writes:

> I'm going to skip your discussion, Joe, because I'm only interested in
> a mathematical definition of a DS valuation function.  

Very well.  If the math is well-defined, then we ought to be able to
write a formal specification.  If we use Scheme to write the
specification, then we can test the spec using a program.

To make things concrete, here is a tiny scheme-like language:

    an expression is one of:

          numeral, such as 10 33 (integer numerals)

          variable

          (lambda <var> <expr>)  denotes a function of one argument

          (call <expr> <expr>) denotes applying a function to an argument

          (if-equal? <expr> <expr> <expr> <expr>)
              If the first two exprs are numeric and have equal
              values, the result is the third expr, otherwise the
              fourth. 

          (succ <expr>)
              If the expr evaluates to a number, (succ <expr>)
              evaluates to the successor of that number.

Because there is no assignment in this language, we can omit modeling
the store.  We don't provide an EQ? test for functions, so locations
are unnecessary.  There is no unusual control flow, so we can omit
modeling continuations, too.  Nonetheless, this little language is
rich enough to illustrate the important issues.

We define Curly-E as the semantic function that maps expressions in
our tiny language to meanings.  Since we are writing this in Scheme,
the meaning will be (the text of) a Scheme expression.  Curly-E for
numerals is trivial.

(define (curly-e-numeral n)
  `(lambda (env)
     ,(numeral->number n)))

Since environments map identifiers to values, the curly-e of a
variable applies the environment to the variable name.

(define (curly-e-variable var)
  `(lambda (env)
     (env ',var)))

The succ form needs the value of its subexpression:

(define (curly-e-succ expr)
  `(lambda (env)
     (let ((result (,expr env)))
       (if (number? result)
           (+ result 1)
           (error "Wrong (succ)")))))

If-equal is a tad more complicated:

(define (curly-e-ifequal left right cons alt)
  `(lambda (env)
     (let ((lresult (,left env))
           (rresult (,right env)))
       (if (and (number? lresult)
                (number? rresult)
                (= lresult rresult))
           (,cons env)
           (,alt env)))))

Now here's where I use domains.  I allow functions to operate on
functions, so the domain of expressible values includes functions over
that domain.  This allows me to write the curly-e for lambda
expressions like this:

(define (curly-e-lambda var body)
  `(lambda (env)
     (lambda (arg)  ;; need domains for this
       (,body (lambda (v)
                (if (eq? v ',var)
                    arg
                    (env v)))))))

A semantics that does not use Scott domains cannot return a function
at this point and must do something else.
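
For contrast, here is a hedged sketch (not part of the semantics being
presented here) of one such "something else": denote a lambda by a
first-order closure record rather than by a Scheme procedure.

(define (curly-e-lambda/first-order var body)
  `(lambda (env)
     ;; no Scheme function is returned here, only a record
     (list 'closure ',var ,body env)))

;; Application must then dispatch on the record explicitly;
;; curly-e-call below would need a matching change to use this.
(define (apply-closure clo arg)
  (let ((var  (cadr clo))
        (body (caddr clo))    ; the body's meaning
        (env  (cadddr clo)))
    (body (lambda (v) (if (eq? v var) arg (env v))))))

Note that the record still carries the body's meaning, which is itself
a function over environments, so the move merely relocates the
higher-order-ness rather than eliminating it.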

Finally, call expressions can simply apply the value of the operator
(a function from above) to the operand.  If domains are not being
used, this cannot be written this way because the operator will not be
a function.

(define (curly-e-call operator argument)
  `(lambda (env)
     ((,operator env) (,argument env))))

Now we need to stitch these together into a full semantic function.

(define (curly-e expression)
  (cond ((numeral? expression)     (curly-e-numeral expression))
        ((variable? expression)    (curly-e-variable expression))
        ((succ? expression)        (curly-e-succ
                                     (curly-e (succ-arg expression))))

        ((ifequal? expression) (curly-e-ifequal
                                     (curly-e (ifequal-left   expression))
                                     (curly-e (ifequal-right   expression))
                                     (curly-e (ifequal-consequent  expression))
                                     (curly-e (ifequal-alternative expression))))

        ((lambda? expression) (curly-e-lambda
                               (lambda-variable expression)
                               (curly-e (lambda-body  expression))))

        ((call? expression) (curly-e-call
                             (curly-e (call-operator expression))
                             (curly-e (call-argument expression))))

        (else (error "Illegal expression." expression))))


CURLY-E is compositional.  Each term is either a primitive term or the
composition of calls to CURLY-E on the subterms that denote expressions.

CURLY-E is a total function and well-defined for all syntactically
correct programs in our tiny language.  (Proof by structural induction
over the argument `expression'.)

Curly-E is formal enough to provide a semantic value for this
expression in our language:
  (call
    (call
      (call
        (call
          (lambda f (call (lambda d (call d d))
                          (lambda x (call f (lambda i (call x x))))))
          (lambda dec
            (lambda x
              (lambda y
                (lambda z
                  (if-equal? x y
                             z
                             (call
                               (call
                                 (call
                                   (call dec 0)
                                   (succ x))
                                  y)
                                (succ z)))))))) 1) 40) 0)

(lambda (env)
  (((lambda (env)
      (((lambda (env)
          (((lambda (env)
              (((lambda (env)
                  (lambda (arg)
                    ((lambda (env)
                       (((lambda (env)
                           (lambda (arg)
                             ((lambda (env)
                                (((lambda (env) (env 'd)) env)
                                 ((lambda (env) (env 'd)) env)))
                              (lambda (v) (if (eq? v 'd) arg (env v))))))
                         env)
                        ((lambda (env)
                           (lambda (arg)
                             ((lambda (env)
                                (((lambda (env) (env 'f)) env)
                                 ((lambda (env)
                                    (lambda (arg)
                                      ((lambda (env)
                                         (((lambda (env) (env 'x)) env)
                                          ((lambda (env) (env 'x)) env)))
                                       (lambda (v)
                                         (if (eq? v 'i) arg (env v))))))
                                  env)))
                              (lambda (v) (if (eq? v 'x) arg (env v))))))
                         env)))
                     (lambda (v) (if (eq? v 'f) arg (env v))))))
                env)
               ((lambda (env)
                  (lambda (arg)
                    ((lambda (env)
                       (lambda (arg)
                         ((lambda (env)
                            (lambda (arg)
                              ((lambda (env)
                                 (lambda (arg)
                                   ((lambda (env)
                                      (let ((lresult
                                              ((lambda (env) (env 'x)) env))
                                            (rresult
                                              ((lambda (env) (env 'y)) env)))
                                        (if (and (number? lresult)
                                                 (number? rresult)
                                                 (= lresult rresult))
                                          ((lambda (env) (env 'z)) env)
                                          ((lambda (env)
                                             (((lambda (env)
                                                 (((lambda (env)
                                                     (((lambda (env)
                                                         (((lambda (env)
                                                             (env 'dec))
                                                           env)
                                                          ((lambda (env) 0)
                                                           env)))
                                                       env)
                                                      ((lambda (env)
                                                         (let ((result
                                                                 ((lambda (env)
                                                                    (env 'x))
                                                                  env)))
                                                           (if (number? result)
                                                             (+ result 1)
                                                             (error
                                                              "Wrong arg to succ"))))
                                                       env)))
                                                   env)
                                                  ((lambda (env) (env 'y))
                                                   env)))
                                               env)
                                              ((lambda (env)
                                                 (let ((result
                                                         ((lambda (env)
                                                            (env 'z))
                                                          env)))
                                                   (if (number? result)
                                                     (+ result 1)
                                                     (error
                                                      "Wrong arg to succ"))))
                                               env)))
                                           env))))
                                    (lambda (v) (if (eq? v 'z) arg (env v))))))
                               (lambda (v) (if (eq? v 'y) arg (env v))))))
                          (lambda (v) (if (eq? v 'x) arg (env v))))))
                     (lambda (v) (if (eq? v 'dec) arg (env v))))))
                env)))
            env)
           ((lambda (env) 1) env)))
        env)
       ((lambda (env) 40) env)))
    env)
   ((lambda (env) 0) env)))

This is the `meaning' of our expression.  We could manually simplify it,
but since we are defining our meanings in terms of Scheme expressions,
we can simplify it by calling Scheme's eval function:

((eval (curly-e test-expression)) #f)
39

I can define reduction steps for this language as well.

(define (reduce-succ sub-expr env)
  (if (number? sub-expr)
      (+ sub-expr 1)
      ;; otherwise reduce the subexpression by one step and keep succ
      `(succ ,(reduce-step sub-expr env))))

(define (reduce-ifequal left right cons alt env)
  (cond ((not (number? left))
         `(if-equal? ,(reduce-step left env) ,right ,cons ,alt))
        ((not (number? right))
         `(if-equal? ,left ,(reduce-step right env) ,cons ,alt))
        ((= left right) cons)
        (else alt)))

(define (subst var val expr)
  (cond ((numeral? expr) expr)
        ((variable? expr) (if (eq? var expr) val expr))
        ((succ? expr) `(succ ,(subst var val (succ-arg expr))))
        ((ifequal? expr) `(if-equal? ,(subst var val (ifequal-left expr))
                                     ,(subst var val (ifequal-right expr))
                                     ,(subst var val (ifequal-consequent expr))
                                     ,(subst var val (ifequal-alternative expr))))
        ((lambda? expr) (if (eq? (lambda-variable expr) var)
                            expr
                            (let ((alpha (gensym (string-append (symbol->string (lambda-variable expr)) "-"))))
                              `(lambda ,alpha
                                 ,(subst var val (subst (lambda-variable expr) alpha (lambda-body expr)))))))
        ((call? expr) `(call ,(subst var val (call-operator expr))
                             ,(subst var val (call-argument expr))))
        (else (error "subst unrecognized" expr))))

(define (reduce-call oper arg env)
  (if (or (numeral? arg)
          (and (pair? arg)
               (eq? (car arg) 'lambda)))
      (if (and (pair? oper)
               (eq? (first oper) 'lambda))
          (subst (second oper) arg (third oper))
          `(call ,(reduce-step oper env) ,arg))
      `(call ,oper ,(reduce-step arg env))))

(define (reduce-step expression env)
  (cond ((number? expression) expression)
        ((lambda? expression) expression)
        ((succ? expression) (reduce-succ (succ-arg expression) env))
        ((ifequal? expression) (reduce-ifequal
                                (ifequal-left   expression)
                                (ifequal-right   expression)
                                (ifequal-consequent  expression)
                                (ifequal-alternative expression)
                                env))

        ((call? expression) (reduce-call
                              (call-operator expression)
                              (call-argument expression)
                              env))
        (else (error "bad expression"))))

(define (evalv expr env)
  (if (number? expr)
      expr
      (evalv (reduce-step expr env) env)))

And you will see that if I repeatedly apply the reduction step, I
will produce the result 39.
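
For example (a usage sketch: test-expression is assumed to be bound to
the example term above, and since the term is closed the env argument
is threaded but never consulted, so a dummy value suffices, just as in
the curly-e run above):

(evalv test-expression #f)
;; => 39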

>    > 1) I'm saying we can define DS semantic functions along the lines of
>    >     F-F's standard reduction function eval_s.

Here's your opportunity.  Please provide the definition.
0
jrm (1310)
7/3/2004 8:56:50 PM
richter@math.northwestern.edu (Bill Richter) wrote:
> Thanks, Will.  I didn't realize that.  But this change only causes me
> inconvenience: I have to tell you that the "real" definition of my
> function isn't my 1st definition, but my 2nd compositional definition.

And if your second definition coincides with what everyone else
would do, then there is no sense in which you are able to give
a compositional definition that avoids the complexity you claim
to be avoiding.

> It's still the *functions* that satisfy the law of compositionality,
> not their definitions.   That is, let's doctor C-F to read:
> 
> [The map from syntactic domains to semantic domains] satisfies the law
> of compositionality: the interpretation of a phrase is DEFINED BY a
> function of the interpretation of the sub-phrases.
> 
> Thus: the maps are compositional if the functions can be defined in
> this structural induction fashion.  Nielson^2, p 85 says the same:
> 
> The hallmark of DS is that the semantic functions are defined
> compositionally, that is [structural induction etc.]

The phrases "is DEFINED BY" and "are defined" are not equivalent to
the phrase "can be defined by".

>    You claim to have proved that you are getting close to the point at
>    which you could claim that some as-yet-unspecified correspondence
>    holds between your semantics and a compositional semantics.  When
>    you actually specify that correspondence (e.g. the f_ij functions
>    that you claim you might be able to find), I will comment upon it.
> 
> I did that 2 long posts ago!  And above.

No you didn't.  On 29 June you gave a compositional denotational
semantics for a four-function calculator language of arithmetic
expressions over numerals.  There was no recursion in that language,
and your semantics was exactly the same as everyone else's.

A few minutes later you posted another message that concluded:
> But I'm close to saying that any small-step OpS (if I understand the
> phrase) is easily recast as DS, if we can only prove "conservation
> laws", i.e find f_ij and prove the compositionality equations.

You have never actually stated any f_ij that give an alternative
compositional definition of a semantic function originally defined
by your style of standard reduction.  Never.  Not once.

You claim to have done so, but that claim is false.  A Google Groups
search on messages by Bill Richter that contain "f_ij" will confirm
that your claim is false.  There are three such messages, and the
only concrete f_ij you have ever stated were the ones for that
four-function calculator.

> ....But I think I'm a really really good
> mathematician, and I think it's quite possible I've noticed something
> very simple (as opposed to useful!) that you didn't notice.

It's possible, despite the copious evidence of your writings thus
far.  If you've got something to say, please get on with saying it.

Will
0
cesuraSPAM (401)
7/3/2004 10:07:42 PM
Matthias Blume <find@my.address.elsewhere> writes:

> Ok, to be a bit less terse:  It would be analogous to, e.g.,
>
>    "Every logical assertion is either true or it is not true."
>
> meaning (to spell it out for those who don't get where the difference
> is):
>
>    For every logical assertion one of the following two cases must
>    hold:
>      1. it has a well-defined truth value that is "true"
>      2. it does not have a well-defined truth value that is "true",
>         i.e., it either does not have a well-defined truth value at
>         all, or if it does have one, the value is not "true" (implying
>         that it must then be "false")
>
> In any case, any given (deterministic) program on any given input
> either halts or it does not.  

Well, that's the question, no?

> There is no third possibility such as  "whether or not the program
> halts is not well-defined".

Why not?  It seems to me that if you write a program that halts or not
depending upon a logical assertion that does not have a well-defined
truth value, then whether it halts or not would also not be well
defined.
0
jrm (1310)
7/3/2004 10:15:02 PM
Joe Marshall <jrm@ccs.neu.edu> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> > Ok, to be a bit less terse:  It would be analogous to, e.g.,
> >
> >    "Every logical assertion is either true or it is not true."
> >
> > meaning (to spell it out for those who don't get where the difference
> > is):
> >
> >    For every logical assertion one of the following two cases must
> >    hold:
> >      1. it has a well-defined truth value that is "true"
> >      2. it does not have a well-defined truth value that is "true",
> >         i.e., it either does not have a well-defined truth value at
> >         all, or if it does have one, the value is not "true" (implying
> >         that it must then be "false")
> >
> > In any case, any given (deterministic) program on any given input
> > either halts or it does not.  
> 
> Well, that's the question, no?

No, that is not a question at all.  In fact, it is beyond any doubt.

> > There is no third possibility such as  "whether or not the program
> > halts is not well-defined".
> 
> Why not?  It seems to me that if you write a program that halts or not
> depending upon a logical assertion that does not have a well-defined
> truth value, then whether it halts or not would also not be well
> defined.

One cannot write a program that depends on the "outcome" of such an
assertion.
0
find19 (1244)
7/3/2004 10:19:57 PM
Matthias Blume <find@my.address.elsewhere> writes:

>> > In any case, any given (deterministic) program on any given input
>> > either halts or it does not.  
>> 
>> Well, that's the question, no?
>
> No, that is not a question at all.  In fact, it is beyond any doubt.

I disagree.  It is an axiom that one can choose to use or not.
0
jrm (1310)
7/3/2004 11:03:37 PM
Joe Marshall <jrm@ccs.neu.edu> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> >> > In any case, any given (deterministic) program on any given input
> >> > either halts or it does not.  
> >> 
> >> Well, that's the question, no?
> >
> > No, that is not a question at all.  In fact, it is beyond any doubt.
> 
> I disagree.  It is an axiom that one can choose to use or not.

False.  It is a *fact*.  I don't know how else to put it anymore, but
let me try one more time anyway:

   Scenario 1: The program reaches a stop state in finite time (after
               a finite number of "steps").  We call this "the
               program halts".
   Scenario 2: Scenario 1 does not apply, i.e., it is not true that
               the program reaches a stop state in finite time.  In
               other words, for every finite time that you pick, the
               program will not have reached a stop state at that
               time.  We call this "the program does not halt".
   Scenario 3: There is no scenario 3.

This has nothing to do with "axioms" and whether or not you choose to
use them.

[A logic that does not let you prove "A or not A" is incomplete if you
want "A" to be modeled by "program P halts".  Axioms are part of a
logic, but here we are -- at best -- talking about the /model/ of such
a logic.]
0
find19 (1244)
7/4/2004 12:30:42 AM
Matthias Blume wrote:
{stuff deleted}
> 
> [A logic that does not let you prove "A or not A" is incomplete if you
> want "A" to be modeled by "program P halts".  Axioms are part of a
> logic, but here we are -- at best -- talking about the /model/ of such
> a logic.]

There are many interesting logics that fall into this category: 
constructive logics, which (if I'm not mistaken) include all the logics 
based on pure type systems. The models of constructive logics are based on 
computable functions.
0
danwang742 (171)
7/4/2004 1:04:51 AM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> Matthias Blume wrote:
> {stuff deleted}
> > [A logic that does not let you prove "A or not A" is incomplete if
> > you
> > want "A" to be modeled by "program P halts".  Axioms are part of a
> > logic, but here we are -- at best -- talking about the /model/ of such
> > a logic.]
> 
> There are many interesting logics that fall in to this
> category. Constructive logics which include all the logics based on
> pure type systems if I'm not mistaken. The models of constructive
> logics are based on computable functions.

I know.  But that is not the point.  All deterministic programs still
either halt or they don't.  Equivalently, all pure lambda terms either
have a normal form or they don't.  Two grammars either generate the
same language or they don't.  And so on, ad nauseam.
0
find19 (1244)
7/4/2004 1:12:55 AM
Matthias Blume wrote:
{stuff deleted}
> I know.  But that is not the point.  All deterministic programs still
> either halt or they don't.  Equivalently, all pure lambda terms either
> have a normal form or they don't.  Two grammars either generate the
> same language or they don't.  And so on, ad nauseum.

This is a philosophical issue. It depends on what your notion of Truth is. 
You have a particular model of "Truth" in your head. Other people may have 
different notions. If you nail down a specific formal system of logic, we can 
agree about those statements within that framework, but there is no 
reason why classical frameworks are "better" than constructive frameworks.
0
danwang742 (171)
7/4/2004 1:54:58 AM
Matthias Blume <find@my.address.elsewhere> writes:

> Joe Marshall <jrm@ccs.neu.edu> writes:
>
>> Matthias Blume <find@my.address.elsewhere> writes:
>> 
>> >> > In any case, any given (deterministic) program on any given input
>> >> > either halts or it does not.  
>> >> 
>> >> Well, that's the question, no?
>> >
>> > No, that is not a question at all.  In fact, it is beyond any doubt.
>> 
>> I disagree.  It is an axiom that one can choose to use or not.
>
> False.  It is a *fact*.  

You certainly haven't come to know this by personal experience, so I
have to wonder if you are accepting this as a matter of faith (an
axiom) or whether there is some logical rationale (which would imply a
logical model).

-- 
~jrm
0
7/4/2004 2:24:57 AM
Matthias Blume <find@my.address.elsewhere> writes:

> Joe Marshall <jrm@ccs.neu.edu> writes:
>> 
>> I disagree.  It is an axiom that one can choose to use or not.
>
> False.  It is a *fact*.  I don't know how else to put it anymore, but
> let me try one more time anyway:
>
>    Scenario 1: The program reaches a stop state in finite time (after
>                a finite number of "steps").  We call this "the
>                program halts".
>    Scenario 2: Scenario 1 does not apply, i.e., it is not true that
>                the program reaches a stop state in finite time.  In
>                other words, for every finite time that you pick, the
>                program will not have reached a stop state at that
>                time.  We call this "the program does not halt".

First, we all know that *every* program eventually halts.  In the
limiting case, the universe is either closed and the machine will
disappear in the `big crunch' or the universe is open and the machine
will eventually run out of energy.

So I imagine that we are talking about programs in the abstract,
rather than physical machines.

Consider an accelerating Turing Machine in which each step executes in
1/2 the time of the previous step.

-- 
~jrm
0
7/4/2004 2:43:05 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

   > Thanks, Will.  I didn't realize that.  But this change only
   > causes me inconvenience: I have to tell you that the "real"
   > definition of my function isn't my 1st definition, but my 2nd
   > compositional definition.
   
   And if your second definition coincides with what everyone else
   would do, 

That's a good question, Will.  I'm going to define some function 
EE_i: B_i ---> D_i
from syntactic to semantic domains 
(as Schmidt says, domain just means set).  Then I'm going to use my
EE_i's to define some f_ij.  Then I'm going to say that my f_ij's
give a compositional definition
BB_i: B_i ---> D_i
that `coincides with what everyone else would do', and then by
Schmidt's 3.13, BB_i = EE_i.

If you assure me that you won't say, "That's cheating, you're not
allowed to know what EE_i is before you define f_ij," then yes: indeed
my `second definition coincides with what everyone else would do.'

   then there is no sense in which you are able to give a
   compositional definition that avoids the complexity you claim to be
   avoiding.

Nah, the only thing you could possibly prove would be an independence
result, such as Joe's Goodstein sequences: the theorem is independent
of the Peano axioms.  But I'm using much the same ZFC axioms that you
need for CPO/Scott approach.  Perhaps I'm not using the power set
axiom!  Scott's P(omega) is the power set of the natural numbers.  I
doubt you're ready to prove that the compositionality is dependent on
the ZFC power set axiom!  

So I'd say you're making a prediction.  Joe also predicted I couldn't
do it.  So let's settle the definition of compositionality, so I can
define my function & try to prove compositionality, and you can see if
your prediction holds: can I actually define EE_i? Then find f_ij?

BTW that's a good tip to look at Tarski's DS.  Where can we read this?

But there is a point which I should've explained earlier: my semantic
domains D_i will be different from yours.  Earlier you posted that I'd
have to replicate your work, or use it, and I said "I flat out can't
do it!!!"  I should have instead said I had different domains.
   
   > It's still the *functions* that satisfy the law of
   > compositionality, not their definitions.  That is, let's doctor
   > C-F to read:
   > 
   > [The map from syntactic domains to semantic domains] satisfies
   > the law of compositionality: the interpretation of a phrase is
   > DEFINED BY a function of the interpretation of the sub-phrases.
   > 
   > Thus: the maps are compositional if the functions can be defined
   > in this structural induction fashion.  Nielson^2, p 85 says the
   > same:
   > 
   > The hallmark of DS is that the semantic functions are defined
   > compositionally, that is [structural induction etc.]
   
   The phrases "is DEFINED BY" and "are defined" are not equivalent to
   the phrase "can be defined by".

Maybe not in English, Will, but in Math they are!  A function doesn't
have a unique definition.   It makes no sense to ask, 
"What's THE definition of your function?"  
You can only ask,
 "What's A definition of your function?"  
Functions are just sets, as you know, and the same set can be defined
in numerous ways.  A 10x10 matrix can be constructed by first building
the 10 column vectors and lining them up in columns.  Or we can build
rows first, or spiral in clockwise, or counter-clockwise.

But I was really making an English point myself, which is clearer in
C-F than Nielson^2.  Even with your extra 2 words DEFINED BY:

 [The map from syntactic domains to semantic domains] satisfies
 the law of compositionality: the interpretation of a phrase is
 DEFINED BY a function of the interpretation of the sub-phrases.

it's clearly the maps which satisfy the law of compositionality.  One
could argue English-ologically that Nielson^2's

   the semantic functions are defined compositionally

means that the definitions of the semantic functions are
compositional.  It just doesn't make good Math sense.

   >    You claim to have proved that you are getting close to the
   >    point at which you could claim that some as-yet-unspecified
   >    correspondence holds between your semantics and a
   >    compositional semantics.  When you actually specify that
   >    correspondence (e.g. the f_ij functions that you claim you
   >    might be able to find), I will comment upon it.
   > 
   > I did that 2 long posts ago!  And above.
   
   No you didn't.  

Right, I goofed.  But I've done so before.  You've seen this before,
and you & Joe predicted I couldn't define E independently of Phi:

E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store) 

where 

Phi(alpha, beta)(e, s) = (b_v, s3), if 

			 alpha(e, s) = (<i,b,e'>, s1)  

			 beta(e, s1) = (a_v, s2)

			 l = (new s2)      
  
			 E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and

Phi(alpha, beta)(e, s) = bottom, otherwise.


   On 29 June you gave a compositional denotational semantics for a
   four-function calculator language of arithmetic expressions over
   numerals.  There was no recursion in that language, and your
   semantics was exactly the same as everyone else's.

Right.  Sorry again. 
   
   A few minutes later you posted another message that concluded:
   > But I'm close to saying that any small-step OpS (if I understand
   > the phrase) is easily recast as DS, if we can only prove
   > "conservation laws", i.e find f_ij and prove the compositionality
   > equations.
   
   You have never actually stated any f_ij that give an alternative
   compositional definition of a semantic function originally defined
   by your style of standard reduction.  Never.  Not once.
   
That's technically true, as I haven't completely defined an E yet, nor
proved the equation
 E[{f a}] = Phi(E[f], E[a]) 
for my Phi function above.  So let's finish this general discussion
about compositionality so "push can come to shove."  

   > ....But I think I'm a really really good mathematician, and I
   > think it's quite possible I've noticed something very simple (as
   > opposed to useful!) that you didn't notice.
   
   It's possible, despite the copious evidence of your writings thus
   far.  

:D It's rather a pleasure to be insulted by you guys.  You're very
witty, and Mark Twain thought humor was the highest form of
intelligence. 

   If you've got something to say, please get on with saying it.

I claim to have 'said' one such thing: the definition of
compositionality.  Since we're still arguing about it, we can't
calculate my "math grade" yet.  But I think we're close to being done.
0
richter1974 (312)
7/4/2004 2:49:57 AM
Joe Marshall <prunesquallor@comcast.net> responded to me:
 
  > On p 53, F-F define a partial function
  >
  > eval_s: LC_v Expr ---> Values
  >
  > Schmidt immediately turns this into a total function
  >
  > teval_s: LC_v Expr ---> Values_bottom
  >
  > where teval_s^{-1}(bottom) = Undef, the subset of LC_v Expr where
  > the standard reduction algorithm does not terminate at a value.
  >
  > So a simpler version of your question is why teval_s is a
  > well-defined function.  Or why Undef is actually a subset of LC_v
  > Expr.
  >
  > If you say you don't understand this, we can try to give you a
  > proof, and discuss whether F-F should've said more.  If you say
  > you do understand this, then we can ask what's the extra
  > complication of real programming languages that you think is the
  > problem.
  
  It isn't that I don't believe that one can turn an operational
  semantics based on standard reduction into a denotational semantic,
  it is that I don't believe that it can be done without a mechanism
  at least as complex as domain theory.

Thanks, Joe.  Now it really sounds like you're saying that F-F's proof
is inadequate.  That for F-F to assert that teval_s is a well-defined
mathematical function, they needed to invoke some domain theory, or
some equally powerful theory.  Since F-F invoked nothing, I conclude
that you think F-F's proof is inadequate.  

If you'll only admit this, then I think MB can give you a proof you'll
accept of the well-definedness of teval_s.  I've given proofs in the
past, but maybe I'm not speaking the right language.

BTW it's a triviality to pass from the partial function eval_s to the
total function teval_s.  Schmidt does it very quickly, and it's just
because to say we have a partial function 
f: X ---> Y
means exactly that we have a subset D of X and a total function 
f_real: D ---> Y.
We define the total function 
g: X ---> Y_bottom
by merely sending the complement (X - D) to bottom.
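
In Scheme notation, the construction is just this (a sketch: in-domain?
is a hypothetical characteristic function of D, and for eval_s it would
be the undecidable halting predicate, so this is a set-theoretic
definition, not an algorithm):

(define bottom 'bottom)

;; extend a function f defined on D to all of X, sending X - D to bottom
(define (totalize f in-domain?)
  (lambda (x)
    (if (in-domain? x)
        (f x)
        bottom)))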

And BTW yes!  The halting problem function is indeed a total function!
To be precise, we have a total function 

halt-t-eval_s: LC_v Expr ---> {0, 1}

where the inverse image of 1 is exactly the expressions which
standard-reduce in finite time to a value.

Now perhaps I can answer your question:

   I don't believe that it can be done without a mechanism at least as
  complex as domain theory.

F-F and I are using a very complex mechanism: the ZFC axioms!  It's
the same ZFC axioms you need (except perhaps the power set axiom) for
the CPO/Scott approach.

My Math biz bias tells me it's a deficiency to learn the CPO/Scott
approach without going through the ZFC axiom biz which makes it run.

I could be wrong, but I thought Daniel said he was a constructivist
who (earlier) liked the CPO/Scott approach.  I apologize if I got
Daniel wrong, but that's Will's "tremendous confusion".  If you're
using Scott models, you're no constructivist, you're a ZFC-er.
0
richter1974 (312)
7/4/2004 3:09:18 AM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> Matthias Blume wrote:
> {stuff deleted}
> > I know.  But that is not the point.  All deterministic programs still
> > either halt or they don't.  Equivalently, all pure lambda terms either
> > have a normal form or they don't.  Two grammars either generate the
> > same language or they don't.  And so on, ad nauseum.
> 
> This is a philosophical issue. It depends on what your notion of Truth
> is. You have a particular model of "Truth" in your head. Other, people
> may have different notions. If you nail down a specific formal system
> of logic we can agree about those statements within that framework,

The point is that I am not doing it.  Yes, specific /formal/ systems
may be unable to prove that "A or not A" while others can.  But
semantically a program still either halts or does not halt, even if
the formal system that you use to reason about this fact might not be
able to prove it.  Everything else is pure nonsense.
0
find19 (1244)
7/4/2004 3:44:55 AM
Joe Marshall <prunesquallor@comcast.net> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> > Joe Marshall <jrm@ccs.neu.edu> writes:
> >
> >> Matthias Blume <find@my.address.elsewhere> writes:
> >> 
> >> >> > In any case, any given (deterministic) program on any given input
> >> >> > either halts or it does not.  
> >> >> 
> >> >> Well, that's the question, no?
> >> >
> >> > No, that is not a question at all.  In fact, it is beyond any doubt.
> >> 
> >> I disagree.  It is an axiom that one can choose to use or not.
> >
> > False.  It is a *fact*.  
> 
> You certainly haven't come to know this by personal experience, so I
> have to wonder if you are accepting this as a matter of faith (an
> axiom) or whether there is some logical rationale (which would imply a
> logical model).

It is actually very simple.  By /definition/ of "does not halt" there
is no third possibility between "halts" and "does not halt".  "Does
not halt" refers to all situations where "halts" is not true.

I am surprised that we are even arguing about this.
0
find19 (1244)
7/4/2004 3:49:40 AM
Joe Marshall <prunesquallor@comcast.net> writes:

> So I imagine that we are talking about programs in the abstract,
> rather than physical machines.

Indeed.

> Consider an accelerating Turing Machine in which each step executes in
> 1/2 the time of the previous step.

TM time is, of course, number of steps.  Don't be silly.
0
find19 (1244)
7/4/2004 3:50:49 AM
Matthias Blume <find@my.address.elsewhere> writes:

> Joe Marshall <prunesquallor@comcast.net> writes:
>
>> So I imagine that we are talking about programs in the abstract,
>> rather than physical machines.
>
> Indeed.
>
>> Consider an accelerating Turing Machine in which each step executes in
>> 1/2 the time of the previous step.
>
> TM time is, of course, number of steps.  Don't be silly.

I'm not being silly.  I was offering this as a counterexample.

-- 
~jrm
0
7/4/2004 4:27:28 AM
Matthias Blume wrote:
{stuff deleted}
> The point is that I am not doing it.  Yes, specific /formal/ systems
> may be unable to prove that "A or not A", others don't.  But
> semantically a program still either halts or does not halt, even if
> the formal system that you use to reason about this fact might not be
> able to prove it. Everything else is pure nonsense.


You're living in a world where Truth is different from the proof of the 
truth. The constructivist notion of Truth is all about provability. You are 
assuming there is some model behind the logic, and that truth in the model is 
independent of the issue of provability within the logic. I have no problem 
with that world view.

I'm merely making the point that there are some people who believe there are 
just the rules, and the only true things are the provably true things. If 
you adopt that world view, what you are saying makes no sense. This is an 
issue for philosophers/logicians to argue endlessly about. I think most 
anyone with a standard mathematical background finds these viewpoints 
very odd.

I'm sure, however, they are no odder than the notion of irrational numbers 
seemed to some of the ancient Greeks. I personally find the notion of 
non-computable real numbers amazingly odd.
0
danwang742 (171)
7/4/2004 4:28:39 AM
Matthias Blume wrote:
{stuff deleted}
> 
> It is actually very simple.  By /definition/ of "does not halt" there
> is no third possibility between "halts" and "does not halt".  "Does
> not halt" refers to all situations where "halts" is not true.

There is a world view where there are three alternatives.

provably halts, provably does not halt, or simply not provable either way.

> I am surprised that we are even arguing about this.

Your implicit assumption about what truth is all about is getting in the 
way. Granted, your view is probably accepted by the majority of everyday 
mathematicians, but last I checked they haven't been given a monopoly on the 
definition of truth.
0
danwang742 (171)
7/4/2004 4:37:24 AM
Joe Marshall <prunesquallor@comcast.net> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> > Joe Marshall <prunesquallor@comcast.net> writes:
> >
> >> So I imagine that we are talking about programs in the abstract,
> >> rather than physical machines.
> >
> > Indeed.
> >
> >> Consider an accelerating Turing Machine in which each step executes in
> >> 1/2 the time of the previous step.
> >
> > TM time is, of course, number of steps.  Don't be silly.
> 
> I'm not being silly.  I was offering this as a counterexample.

I know.  But it wasn't one.  When the machine is abstract (as you
pointed out yourself), time is obviously abstract, too.  TM time is
the same as the number of state transitions it makes.  By definition,
it is not possible to execute one step twice as fast as the previous
one.  It is always one step per step.
0
find19 (1244)
7/4/2004 4:41:02 AM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> Matthias Blume wrote:
> {stuff deleted}
> > The point is that I am not doing it.  Yes, specific /formal/ systems
> > may be unable to prove that "A or not A", others don't.  But
> > semantically a program still either halts or does not halt, even if
> > the formal system that you use to reason about this fact might not be
> > able to prove it. Everything else is pure nonsense.
> 
> 
> You're living in a world where Truth is different from the proof of
> the truth. The constructivist notion of Truth is all about
> provability. You are assuming there is some model behind logic and the
> truth in the model is independent of the issue or provability within
> the logic. I have no problem with that world view.
>
> I'm merely making the point that there are some people who believe
> there are just the rules, and the only true things are the provably
> true things.

This is obviously nonsense.  Whether or not something is provable
depends on the logic that you choose.  So truth in the model would
depend on my choice of logic then?  In classical logic it is trivially
true that programs either halt or they don't.  So there I *can* prove
it. Am I done now?

On the other hand, I can make an inconsistent logic.  Does this make
everything true?  I can make a logic where "Matthias Blume is the
richest man in the world" is an axiom.  Does that make it true?  I
didn't think so.

> If you adopt that world view what you are saying makes no
> sense. This is an issue for philosophers/logicians to argue endlessly
> about. I think, most anyone with a standard mathematical background
> can find these view points very odd.

As they should.  I actually think that you are subtly misrepresenting
constructivism here.  As far as I remember, constructivism essentially
abolishes axioms within the logic that effectively state or imply "A
or not A".  As a result, if you want to prove "A or not A" you have to
either prove A or you have to prove "not A".  To generalize this, if
you want to prove "exists x such that A(x)" you have to construct a
concrete X for which you can prove A(X).  And so on.
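
(To see the flavor of this concretely: in a proof assistant based on
constructive type theory, excluded middle is not derivable for an
arbitrary proposition, yet its double negation is.  A small sketch in
Lean notation:

-- assuming (p ∨ ¬p) fails immediately yields a contradiction
theorem dn_em (p : Prop) : ¬¬(p ∨ ¬p) :=
  fun h => h (Or.inr (fun hp => h (Or.inl hp)))

So the constructivist withholds judgment on the bare disjunction, not
on its double negation.)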

But this is all at the level of the formal calculus, not the level of
the model, and it says nothing about the true nature of truth.

Matthias
0
find19 (1244)
7/4/2004 5:00:41 AM
"Daniel C. Wang" <danwang74@hotmail.com> writes:

> Matthias Blue wrote:
> {stuff deleted}
> > It is actually very simple.  By /definition/ of "does not halt"
> > there
> > is no third possibility between "halts" and "does not halt".  "Does
> > not halt" refers to all situations where "halts" is not true.
> 
> There is a world view where there are three alternatives.
> 
> provably halts, provably does not halt, or simply not provable either way.

Actually, if it does not provably halt, then it does not halt.  Now
group your latter two cases into one larger case, and voila: a program
either halts (and provably so), or it does not halt (although in some
cases we will not be able to prove that).

Matthias
0
find19 (1244)
7/4/2004 5:03:27 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Thanks, Joe.  Now it really sounds like you're saying that F-F's proof
> is inadequate.  

No.  I'm saying that F-F *never* claim to provide a denotational
semantics.

> That for F-F to assert that teval_s is a well-defined mathematical
> function, they needed to invoke some domain theory, or some equally
> powerful theory.  Since F-F invoked nothing, I conclude that you
> think F-F's proof is inadequate.

No, they claim that teval_s is a well-defined *partial* function and
make no claim whatsoever about how to determine its range.  They claim
that *when* it produces a value that it produces the correct value.  I
have no problem with this.


I'm going to answer the rest of this tomorrow.


> If you'll only admit this, then I think MB can give you a proof you'll
> accept of the well-definedness of teval_s.  I've given proofs in the
> past, but maybe I'm not speaking the right language.
>
> BTW it's a triviality to pass from the partial function eval_s to the
> total function teval_s.  Schmidt does it very quickly, and it's just
> because to say we have a partial function 
> f: X ---> Y
> means exactly that we have a subset D of X and a total function 
> f_real: D ---> Y.
> We define the total function 
> g: X ---> Y_bottom
> by merely sending the complement (X - D) to bottom.
>
> And BTW yes!  The halting problem function is indeed a total function!
> To be precise, we have a total function 
>
> halt-t-eval_s: LC_v Expr ---> {0, 1}
>
> where the inverse image of 1 is exactly the expression which
> standard-reduce in finite time to a value.
>
> Now perhaps I can answer your question now:
>
>    I don't believe that it can be done without a mechanism at least as
>   complex as domain theory.
>
> F-F and I are using a very complex mechanism: the ZFC axioms!  It's
> the same ZFC axioms you need (except perhaps the power set axiom) for
> the CPO/Scott approach.
>
> My Math biz bias tells me it's a deficiency to learn the CPO/Scott
> approach without going through the ZFC axiom biz which makes it run.
>
> I could be wrong, but I thought Daniel said he was a constructionist
> who (earlier) liked the CPO/Scott approach.  I apologize if I got
> Daniel wrong, but that's Will's "tremendous confusion".  If you're
> using Scott models, you're no constructionist, you're a ZFC-er.

-- 
~jrm
0
7/4/2004 5:24:39 AM
Matthias Blume wrote:
{stuff deleted}
> On the other hand, I can make an inconsistent logic.  Does this make
> everything true?  I can make a logic where "Matthias Blume is the
> richest man in the world" is an axiom.  Does that make it true?  I
> didn't think so.

There are some people (quite a few of them worldwide, actually) who 
devote their lives to inconsistent logical systems where certain claims 
become true by fiat.  (Pick your favorite religion...)

>>If you adopt that world view what you are saying makes no
>>sense. This is an issue for philosophers/logicians to argue endlessly
>>about. I think, most anyone with a standard mathematical background
>>can find these view points very odd.
> 
> 
> As they should.  I actually think that you are subtly misrepresenting
> constructivism here.  As far as I remember, constructivism essentially
> abolishes axioms within the logic that effectively state or imply "A
> or not A".  As a result, if you want to prove "A or not A" you have to
> either prove A or you have to prove "not A".  To generalize this, if
> you want to prove "exists x such that A(x)" you have to construct a
> concrete X for which you can prove A(X).  And so on.
> 
> But this is all at the level of the formal calculus, not the level of
> the model, and it says nothing about the true nature of truth.

The constructivists have designed their logics to reflect their notion of the 
nature of truth. A constructivist expects a finite witness to demonstrate a 
fact. A witness may exist that proves A or the negation of A.  It may also 
be the case that no such finite witness exists, at which point they just 
throw up their hands and wonder if the question is malformed or somehow 
nonsensical.

Alternatively you can adopt classical logic and do things that make a true 
constructivist roll over in his/her grave.

Reasonable people can believe that what seems like nonsense to a 
mathematician is not. Reasonable people may believe that all this talk about 
infinities of infinities and non-computable real numbers is just nonsense too.

Classical mathematics is not the only model of truth, and if you really think 
about classical mathematics, it has lots of odd things going on inside it, 
especially if you accept the axiom of choice.

Of course, on the surface these things seem odd; then you get into the formal 
details of classical mathematics and start to understand what they mean 
formally within that framework. At some point your intuitions 
change, and what used to be odd just seems "natural". Of course it seems 
natural, because you've started to internalize some of the formal details of 
classical mathematics into your perspective on the world and your personal 
notion of Truth.

Seriously, do you think the axiom of choice is *obviously* true or not true?

This is the same question as asking whether the law of the excluded middle is 
true or not true, which entails this whole discussion about whether a TM 
obviously either halts or doesn't.
0
danwang742 (171)
7/4/2004 6:27:53 AM
Bill Richter wrote:

{stuff deleted}
> I could be wrong, but I thought Daniel said he was a constructionist
> who (earlier) liked the CPO/Scott approach.  I apologize if I got
> Daniel wrong, but that's Will's "tremendous confusion".  If you're
> using Scott models, you're no constructionist, you're a ZFC-er.

Constructive logic is quite closely related to domain theory.

http://en.wikipedia.org/wiki/Category_theory

Categorical logic is now a well-defined field based on type theory for 
intuitionistic logics, with application to the theory of functional 
programming and domain theory, all in a setting of a cartesian closed 
category as non-syntactic description of a lambda calculus. At the very 
least, the use of category theory language allows one to clarify what 
exactly these related areas have in common (in an abstract sense).

P.S. At some point we've all become category theorists without knowing it.
0
danwang742 (171)
7/4/2004 6:37:59 AM
Matthias Blume wrote:
{stuff deleted}
> Actually, if it does not provably halt, then it does not halt.  Now
> group your latter two cases into one larger case, and voila: a program
> either halts (and provably so), or it does not halt (although in some
> cases we will not be able to prove that).

Here are some insightful links from Wikipedia, just for those who really 
care... :)

http://en.wikipedia.org/wiki/Mathematical_intuitionism
http://en.wikipedia.org/wiki/Intuitionistic_logic
http://en.wikipedia.org/wiki/Anti-realism
http://en.wikipedia.org/wiki/Topos
http://en.wikipedia.org/wiki/Axiom_of_choice
0
danwang742 (171)
7/4/2004 7:02:05 AM
richter@math.northwestern.edu (Bill Richter) wrote
> I'm going to define...
> Then I'm going to...
> Then I'm going to...

When you have actually done it, I will point out the obvious.

>    >    You claim to have proved that you are getting close to the
>    >    point at which you could claim that some as-yet-unspecified
>    >    correspondence holds between your semantics and a
>    >    compositional semantics.  When you actually specify that
>    >    correspondence (e.g. the f_ij functions that you claim you
>    >    might be able to find), I will comment upon it.
>    > 
>    > I did that 2 long posts ago!  And above.
>    
>    No you didn't.  
> 
> Right, I goofed.

Consistency is a virtue, I suppose.  And then you goofed again:

> But I've done so before.  You've seen this before,
> and you & Joe predicted I couldn't define E independently of Phi:

No, we predicted you wouldn't be able to define Phi without using E.
You then confirmed our prediction by writing:

> E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store) 
> 
> where 
> 
> Phi(alpha, beta)(e, s) = (b_v, s3), if 
> 
> 			 alpha(e, s) = (<i,b,e'>, s1)  
> 
> 			 beta(e, s1) = (a_v, s2)
> 
> 			 l = (new s2)      
>   
> 			 E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
> 
> Phi(alpha, beta)(e, s) = bottom, otherwise.

Please notice your use of E in the next-to-last line.

Since E is defined in terms of Phi, and Phi is defined in terms of E,
you have a circular pair of definitions.  To prove that these circular
definitions actually define anything, you're going to have to appeal
to domain theory.

Whereas a compositional definition of E is well-defined by structural
induction, and we do *not* have to use domain theory to show this.

You've got this completely backwards.
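
To see the contrast in miniature (a toy sketch, not Bill's E or Phi):

;; Well-defined by structural induction: every recursive call is on a
;; proper sub-phrase, so the recursion bottoms out for each finite
;; term.  (Toy grammar: numbers and (succ <expr>).)
(define (E expr)
  (cond ((number? expr) expr)
        ((and (pair? expr) (eq? (car expr) 'succ))
         (+ 1 (E (cadr expr))))
        (else (error "bad expression" expr))))

;; By contrast, the circular specification
;;   E[{f a}] = Phi(E[f], E[a]),  where Phi itself applies E to the
;;   body of the closure produced by E[f],
;; recurs on a phrase that is *not* a sub-phrase of {f a}, so
;; structural induction does not apply, and the existence of a
;; solution must be proved separately -- e.g. by domain theory.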

>    On 29 June you gave a compositional denotational semantics for a
>    four-function calculator language of arithmetic expressions over
>    numerals.  There was no recursion in that language, and your
>    semantics was exactly the same as everyone else's.
> 
> Right.  Sorry again. 

>    You have never actually stated any f_ij that give an alternative
>    compositional definition of a semantic function originally defined
>    by your style of standard reduction.  Never.  Not once.
>    
> That's technically true...

> So let's finish this general discussion
> about compositionality so "push can come to shove."

No.  Let's skip your obfuscating generalities.  You've been spewing
this stuff for two years now, and you still can't see the obvious
circularity between your definitions of E and Phi.  It's time for
pushing and shoving.

>    If you've got something to say, please get on with saying it.
> 
> I claim to have 'said' one such thing: the definition of
> compositionality.  Since we're still arguing about it, we can't
> calculate my "math grade" yet.  But I think we're close to being done.

If you don't stop saying you're "close" to saying something, I'm
going to scream.

Will
0
cesuraSPAM (401)
7/4/2004 1:23:35 PM
In article <m2oemwwi4o.fsf@hanabi-air.shimizu.blume>,
Matthias Blume  <find@my.address.elsewhere> wrote:
> The point is that I am not doing it.  Yes, specific /formal/ systems
> may be unable to prove that "A or not A", others don't.  But
> semantically a program still either halts or does not halt, even if
> the formal system that you use to reason about this fact might not be
> able to prove it.  Everything else is pure nonsense.

To an intuitionist "A or not A, but I have no way of knowing which" is
just as much pure nonsense. But then again, intuitionism is a much
stronger position than constructivism is.


Lauri
0
la (473)
7/4/2004 5:46:51 PM
I said I'd answer the rest of this, but I don't even know where to
begin, so I'm going to go back a ways.  This should be easy because
you are much closer to having a complete definition of `Bill
Semantics'.

>> Bill Richter wrote a while back:
>>
>> I claimed I could easily turn Shriram's big-step OpS into a
>> mathematical function
>> 
>> V : Expressions -> (Env -> Values)
>>
>> where V[[ (lambda (x) body) ]](E) = <x, body, E>  
>>
>> I said I didn't need Scott models or CPO's to define V, just some
>> induction.

You also claim that V can be compositionally defined:

>> [compositionality is] that there's a function Phi s.t. 
>>
>> V [[ ((lambda (x) x) 3) ]] = Phi(V[[ (lambda (x) x) ]], V[[3]])

Well, obviously we need a case by case for each production in our
language.  These are two base cases:

V [[ numeral ]](env) = number

V [[ id ]](env) = lookup (id, env)

using your definition of compositionality we can add function calls
and conditionals:

V [[ (<exp0> <exp1>) ]](env) = Phi0 (V [[<exp0>]](env), V [[<exp1>]](env))

V [[ (if <exp0> <exp1> <exp2>) ]](env) 
    = Phi1 (V [[<exp0>]](env), V [[<exp1>]](env), V [[<exp2>]](env))

If LAMBDA is to be compositional, it ought to be in this form:

V [[ (lambda <var> <exp0>) ]](env) = Phi2 (V [[<exp0>]])(env)

  yet you seem to want it to be a primitive like this:

V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)

Is it compositional or not?

This more or less completes the case analysis; we now need definitions
of Phi0, Phi1, and Phi2.  This is where I am having the difficulty.  I
recall that Phi0 is supposed to depend upon some number of
applications of eval_s in some way.  I suppose we could say this:

                  /  if a is tuple <v, exp, env>,
   Phi0 (a, b) = |      and there exists some n such that
                 |      eval_s^n (exp, extend_env (v, b, env)) = number or tuple
                 |      then that number or tuple
                 |
                 |   otherwise `bottom'
                  \

(this assumes that the definition for lambda is *not* compositional)
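
In Scheme, that Phi0 might look like this (a sketch only: tuple?,
tuple-var, tuple-exp, tuple-env, extend-env, eval-step, and done? are
hypothetical helpers, and the unbounded search for n is approximated
by a fuel count, so this is a computable stand-in rather than Phi0
itself):

(define (phi0-approx a b fuel)
  ;; a, b are denotations; a should be a closure tuple <v, exp, env>.
  (if (tuple? a)
      (let loop ((state (cons (tuple-exp a)
                              (extend-env (tuple-var a) b (tuple-env a))))
                 (n fuel))
        (cond ((done? state) state)    ; reached a number or a tuple
              ((zero? n) 'bottom)      ; fuel exhausted: give up
              (else (loop (eval-step state) (- n 1)))))
      'bottom))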

I don't know what Phi1 is supposed to be.

If we can get the definitions of Phi0 and Phi1 nailed down, we can go
on to the more interesting question of whether `Bill Semantics'
correctly transforms Shriram's Big-Step Ops to a DS form.


-- 
~jrm
0
7/4/2004 11:27:41 PM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

  If you don't stop saying you're "close" to saying something, I'm
  going to scream.

Will, I'm 100% done with a proof that I can produce compositionally
defined functions by: 

1) first defining some maps EE_i: B_i ---> D_i

2) then defining the f_ij in terms of the EE_i

3) then using the uniqueness of Schmidt's 3.13 to show that the EE_i are
   defined compositionally by the f_ij.

Since you don't seem to believe it, I've done something.  Anyway, what
I posted was that we were "close" to being in agreement.  I guess I was wrong.

  You've been spewing this stuff for two years now, and you still
  can't see the obvious circularity between your definitions of E and
  Phi.  It's time for pushing and shoving.

Push+Shove: you're all wrong.  I've got a lot of respect for you, and
it's no fun to keep telling you this.  The problem IMO is that you haven't
re-read Schmidt's 3.13.  I can't blame you: it took me hours to
understand.  It's not IMO well-stated, and it's an overly general
result for our purposes.  But I told you something about it
(uniqueness) that you didn't know (you know existence), and you can't
really respond to me until you re-read Schmidt's 3.13.  So maybe we
need a new topic.  If you can tell me the smallest subset of Scheme
that you'd find meaningful, we can do Schmidt's 3.13 in that context.
Perhaps you'd like me to restate (same 7-line proof) Schmidt's 3.13.

   we predicted you wouldn't be able to define Phi without using E.

That's an easy prediction, I've said that all along!  I said it in the
post you're responding to, and I shortened it above.  Now to be fair,
until Lauri caught me, my E tended to not be well-defined.  But not
because of circular reasoning with Phi, but because I hadn't handled
the infinite-loopers, which are easily handled with standard reduction.

  You then confirmed our prediction by writing:
  
 > 			 E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and
  > 
  > Phi(alpha, beta)(e, s) = bottom, otherwise.
  
  Please notice your use of E in the next-to-last line.
  
Right, it's deliberate. 

  Since E is defined in terms of Phi, 

No No No No No No No No!  I've said over & over I'll define E by
standard reduction, using a 1-step non-recursive function R.  I've
twice (recently about Advanced Scheme, and 2 years ago) posted
reasonable sketches of my E & R functions.  Since nobody responded to
those 2 posts, it's unfair to say that I can't do it.  Plus, you know
it can be done, I think you said you did it 25 years ago!  The only
question is whether you can show this standard reduction E is
compositional.  Which is why we're discussing compositionality...
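
For concreteness, the shape of my E is just iteration of R.  Here's a
Scheme sketch (value? and R are assumed; the fuel argument is only
there to make the sketch runnable -- the real E has no fuel, and the
infinite-loopers go to bottom):

(define (E-approx expr fuel)
  (cond ((value? expr) expr)                  ; reduction reached a value
        ((zero? fuel) 'bottom)                ; undecided within fuel steps
        (else (E-approx (R expr) (- fuel 1)))))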
0
richter1974 (312)
7/5/2004 12:51:06 AM
Joe Marshall <prunesquallor@comcast.net> responds to me:
  
  > sounds like you're saying that F-F's proof is inadequate.
  
  No.  I'm saying that F-F *never* claim to provide a denotational
  semantics.

Sure, Joe.  But did they need to invoke it, or something similar, to
define (my) total function teval_s, which is completely equivalent,
as Schmidt explains early, to their partial function eval_s?
  
  > That for F-F to assert that teval_s is a well-defined mathematical
  > function, they needed to invoke some domain theory, or some
  > equally powerful theory.  Since F-F invoked nothing, I conclude
  > that you think F-F's proof is inadequate.
  
  No, they claim that teval_s is a well-defined *partial* function and
  make no claim whatsoever about how to determine its range.  

No, they claim that eval_s is a well-defined partial function.  Not
teval_s.  I'm the one who defined teval_s, as a total function.

And you mean domain, not range.  F-F don't describe the elements in
LC_v Exp on which eval_s does not return a value.  But my point is
that F-F believes the collection of such elements is a subset

Undef   subset   LC_v Exp

That's what partial function means, and then we can redirect this
subset Undef to bottom to get a total function teval_s.  Undef is not
a computable subset, but it doesn't have to be.
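
Schematically, in Scheme (illustrative only, since teval_s itself is
not computable -- that's the whole point; eval-s-within is a
hypothetical driver that runs eval_s for at most n steps and returns
#f if no value has appeared yet):

(define (teval-approx M n)
  (let ((v (eval-s-within M n)))   ; #f means: no value after n steps
    (if v v 'bottom)))             ; as n grows this approximates teval_s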

   They claim that *when* it produces a value that it produces the
  correct value.  I have no problem with this.
  
Right, but partial function means more than that.
  
  I'm going to answer the rest of this tomorrow.
  
Take your time!  BTW I think you're the engine driving the thread.
I'm interested in settling old scores, although my blood pressure is
safely down now, as I realize that my Math guys are much worse :^0 
But it seems to me that several times folks have tried to stop the
thread, and you've always come through with a thoughtful technical
post for me to respond to, and that's what's kept the thread alive.
0
richter1974 (312)
7/5/2004 1:08:33 AM
"Daniel C. Wang" <danwang74@hotmail.com> responded to me:

  > If you're using Scott models, you're no constructivist, you're a
  > ZFC-er.
  
  P.S. At some point we've all become category theorists without
  knowing it.

Daniel, I'm a ZFC-category theorist, like all algebraic topologists.
You don't have to do constructivist category theory.

Using weaker axioms than ZFC is great, but you didn't address my
point, which is quite relevant IMO to your argument with MB.  So:

Do you want to use the DS approach in Schmidt's book, of CPO minimal
fixed points and Scott models of LC?  If so, you must work in ZFC!  MB
takes it for granted that you're working with ZFC-caliber axioms, so he
can't understand what you're havering about.  I think.
0
richter1974 (312)
7/5/2004 1:22:13 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:

  To make things concrete, here is a tiny scheme-like language:
  
      an expression is one of:
  
            numeral, such as 10 33 (integer numerals)
  
            variable
  
            (lambda <var> <expr>)  denotes a function of one argument
  
            (call <expr> <expr>) denotes applying a function to an argument
  
            (if-equal? <expr> <expr> <expr> <expr>)
                If the first two exprs are numeric and have equal
                values, the result is the third expr, otherwise the
                fourth. 
  
            (succ <expr>)
                If the expr evaluates to a number, (succ <expr>)
                evaluates to the successor of that number.
  
Joe, this looks close to a special case of F-F's language ISWIM.  Look
on p 39: your numbers and numerical functions are ISWIM basic
constants and primitive operations.  Note that they even define an if0
macro in terms of the unary primitive operation iszero?  I'm ignoring
your call function, as procedure application is already part of ISWIM.

  >    > 1) I'm saying we can define DS semantic functions along the lines of
  >    >     F-F's standard reduction function eval_s.
  
  Here's your opportunity.  Please provide the definition.

So let's turn your language entirely into F-F's language ISWIM, and
then we can actually use F-F's standard reduction function eval_s.
Are you questioning whether eval_s is compositional?  We have to first
straighten out the definition of compositionality, and settle whether
my teval_s is a well-defined total function, and also I guess we need
to change this definition on p 53:

eval_s(M) = function, if M |-->>_v  (\X. N)

That's obviously going to destroy compositionality.  But this is
definitely a good project, after we clear up the preliminaries.

  Now here's where I use domains.  I allow functions to operate on
  functions, so the domain of expressible values includes functions over
  that domain.  

Sure, but F-F doesn't, so I don't need domains.  Functions for me will
only be lambda expressions (\X. N), and that's a subset of Values, but
the set of functions (Values -> Values) has larger cardinality than
Lambda-EXPR, since we don't have a Scott topology.  So that's a
deficiency of mine, but I claim it does not prevent compositionality.

That's a lot of nice Scheme code you wrote!  I wasn't clear what the
purpose was.  The issues we've been on (compositionality, Halting
Problem total functions) aren't settled by actual computations :^0
0
richter1974 (312)
7/5/2004 2:07:16 AM
AAAAIIIIIIIIIIIGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHH!

richter@math.northwestern.edu (Bill Richter) wrote:
>   If you don't stop saying you're "close" to saying something, I'm
>   going to scream.
> 
> Will, I'm 100% done with a proof that I can produce compositionally
> defined functions by: 
> 
> 1) first defining some maps EE_i: B_i ---> D_i
> 
> 2) then defining the f_ij in terms of the EE_i
> 
> 3) then using the uniqueness of Schmidt's 3.13 to show that the EE_i are
>    defined compositionally by the f_ij.

No, Bill, you've been leaving out the important part:  Showing that
the unique functions determined by your f_ij are equal to your EE_i.

>   You've been spewing this stuff for two years now, and you still
>   can't see the obvious circularity between your definitions of E and
>   Phi.  It's time for pushing and shoving.
> 
> Push+Shove: you're all wrong.  I've got a lot of respect for you, and
> it's no fun to keep telling this.  The problem IMO is that you haven't
> re-read Schmidt's 3.13.

No, the problem is that you aren't showing enough respect for
mathematical rigor.  I don't care whether you respect me, but
I wish you'd show a little respect for mathematics.

> I can't blame you: it took me hours to
> understand.  It's not IMO well-stated, and it's an overly general
> result for our purposes.  But I told you something about it
> (uniqueness) that you didn't know (you know existence), and you can't
> really respond to me until you re-read Schmidt's 3.13.

Nonsense, Bill.  Complete nonsense.  You yourself said "the proof
[of Theorem 3.13] is trivial."  You said "the statement is arduous"
because it took you hours to understand.  That doesn't imply that
it took me hours to understand it.

>    we predicted you wouldn't be able to define Phi without using E.
> 
> That's an easy prediction, I've said that all along!

By E, I meant the semantic function that is defined compositionally
in terms of Phi.  If you can't define Phi without using that function,
then E is defined in terms of Phi, Phi is defined in terms of E, and
you've got a circularity in your definitions.

In a previous message, you wrote

> Right, I goofed.  But I've done so before.  You've seen this before,
> and you & Joe predicted I couldn't define E independently of Phi:
> 
> E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store) 

This equality is either a definition or a claimed theorem.  Since you
have given no proof of it, I assumed it was a definition, and pointed
out that it is circular because Phi is defined in terms of E.

Now you're saying the above equality wasn't a definition:

>   Since E is defined in terms of Phi, 
> 
> No No No No No No No No!  I've said over & over I'll define E by
> standard reduction, using a 1-step non-recursive function R.

Since your equality wasn't a definition, you must have been claiming
it as a theorem.  That is an empty claim: you haven't proved any such
theorem.

> I've
> twice (recently about Advanced Scheme, and 2 years ago) posted
> reasonable sketches of my E & R functions.  Since nobody responded to
> those 2 posts, it's unfair to say that I can't do it.  Plus, you know
> it can be done, I think you said you did it 25 years ago!

No one is saying it can't be done.  I'm saying you haven't done it.

> The only
> question is whether you can show this standard reduction E is
> compositional.  Which is why we're discussing compositionality...

No, that's not the question at all.

The real question is whether you can prove that the E you define via
reduction is equal to the E' that, by Schmidt's Theorem 3.13, is
uniquely determined by the f_ij that you define using E.

You've been using the same names for E and E', thereby assuming at
the outset that they are equal.  The mathematical term for that is
"cheating".  You aren't allowed to use notation that assumes the
theorem you haven't yet proved.

In fact, you have not yet defined a complete set of f_ij for any
language that can express iteration or recursion.  You asked me
what would be a reasonable subset of Scheme.  I suggest you start
with the lambda calculus subset plus numeric constants.

Once you have used reduction to define your E for that language,
you should then give a clear statement of the f_ij that you want
to use in Theorem 3.13.

Then, and only then, can you prove that the unique functions
whose existence is asserted by Theorem 3.13 coincide with the E
that you defined by reduction.

That will not be a trivial proof.  I predict that you will be
unable to complete that proof without formulating and using a
compositional semantic function that is defined in the usual
denotational way, without using your reduction semantics.  If
my prediction is correct, then it will be obvious that your
claim to have avoided the complexity of the usual denotational
definitions was bogus.

I look forward to pointing that out.  In fact, I have lost
patience with your excruciatingly slow progress toward the
definitions and theorems you claim you will soon be able to
give or to prove, or in some cases (as in the post to which
I am responding) that you falsely claim to have given or proved
already.

Put up or shut up.  Please.

Will
0
cesuraSPAM (401)
7/5/2004 7:20:14 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>
>   To make things concrete, here is a tiny scheme-like language:
>   
>       an expression is one of:
>   
>             numeral, such as 10 33 (integer numerals)
>   
>             variable
>   
>             (lambda <var> <expr>)  denotes a function of one argument
>   
>             (call <expr> <expr>) denotes applying a function to an argument
>   
>             (if-equal? <expr> <expr> <expr> <expr>)
>                 If the first two exprs are numeric and have equal
>                 values, the result is the third expr, otherwise the
>                 fourth. 
>   
>             (succ <expr>)
>                 If the expr evaluates to a number, (succ <expr>)
>                 evaluates to the successor of that number.
>   
> Joe, this looks close to a special case of F-F's language ISWIM.  

ISWIM was invented in 1966 by Landin.

> Look on p 39: your numbers and numerical functions are ISWIM basic
> constants and primitive operation.  Note that they even define an if0
> macro in terms of the unary primitive operation iszero?  I'm ignoring
> your call function, as procedure application is already part of ISWIM.

It isn't surprising that it looks similar.

>
>   >    > 1) I'm saying we can define DS semantic functions along the lines of
>   >    >     F-F's standard reduction function eval_s.
>   
>   Here's your opportunity.  Please provide the definition.
>
> So let's turn your language entirely into F-F's language ISWIM, and
> then we can actually use F-F's standard reduction function eval_s.

I don't wish to add any more complexity.

> Are you questioning whether eval_s is compositional?  

No.

>   Now here's where I use domains.  I allow functions to operate on
>   functions, so the domain of expressible values includes functions over
>   that domain.  
>
> Sure, but F-F doesn't, ...

Where did you get that impression?  Flatt and Felleisen first
introduce lambda arguments on page 25, develop recursion as
self-application on page 31, and use lambda expressions as arguments
on the very first page of their introduction to ISWIM.

> That's a lot of nice Scheme code you wrote!  I wasn't clear what the
> purpose was.  The issues we've been on (compositionality, Halting
> Problem total functions) aren't settled by actual computations :^0

Many of the issues we've been on *are* settled by actual computations.
If the computer operates on an expression and produces a result, then
clearly that particular instance of the problem has a well-defined
solution.  The rigor required to make a program run is very high and
does not admit sloppy formulations.  The program can easily be used as a
starting point for proofs.

For example, my definition of curly-e is itself virtually a proof that
curly-e is compositional (the remainder of the proof is in the fact
that no function on which curly-e relies itself calls curly-e).

(define (curly-e expression)
  (cond ((numeral? expression)     (curly-e-numeral expression))
        ((variable? expression)    (curly-e-variable expression))
        ((succ? expression)        (curly-e-succ
                                     (curly-e (succ-arg expression))))

        ((ifequal? expression) (curly-e-ifequal
                                     (curly-e (ifequal-left   expression))
                                     (curly-e (ifequal-right   expression))
                                     (curly-e (ifequal-consequent  expression))
                                     (curly-e (ifequal-alternative expression))))

        ((lambda? expression) (curly-e-lambda
                               (lambda-variable expression)
                               (curly-e (lambda-body  expression))))

        ((call? expression) (curly-e-call
                             (curly-e (call-operator expression))
                             (curly-e (call-argument expression))))

        (else (error "Illegal expression." expression))))

If you can write the equations for `Bill Semantics' in the form above,
then no one will argue whether your definition is compositional.  If
you cannot, then there is more than sufficient reason to doubt it.
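
For what it's worth, here is one way to write the two interesting
helpers (my choice of representation -- the value of a lambda is a
Scheme procedure, and extend-env is the usual association-list helper;
the important point is that neither helper calls curly-e):

(define (curly-e-lambda variable body-denotation)
  (lambda (env)                     ; the meaning is a function of env...
    (lambda (argument)              ; ...whose result is a procedure value
      (body-denotation (extend-env variable argument env)))))

(define (curly-e-call operator-denotation argument-denotation)
  (lambda (env)
    ((operator-denotation env)      ; the value of the operator,
     (argument-denotation env))))   ; applied to the value of the argument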





0
7/5/2004 8:18:58 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responds to me:
>   
>   > sounds like you're saying that F-F's proof is inadequate.
>   
>   No.  I'm saying that F-F *never* claim to provide a denotational
>   semantics.
>
> Sure, Joe.  But did they need to invoke it, or something similar, to
> define (my) total function teval_s, which is completely equivalent,
> as Schmidt explains early, to their partial function eval_s?

Flatt and Felleisen never claim to define a total eval_s function,
either.

>   > That for F-F to assert that teval_s is a well-defined mathematical
>   > function, they needed to invoke some domain theory, or some
>   > equally powerful theory.  Since F-F invoked nothing, I conclude
>   > that you think F-F's proof is inadequate.
>   
>   No, they claim that teval_s is a well-defined *partial* function and
>   make no claim whatsoever about how to determine its range.  
>
> No, they claim that eval_s is a well-defined partial function.  Not
> teval_s.  I'm the one who defined teval_s, as a total function.

Yes.

> F-F don't describe the elements in LC_v Exp on which eval_s does not
> return a value.

There are many expressions for which eval_s does not return.

> But my point is that F-F believes the collection of such elements is
> a subset
>
> Undef   subset   LC_v Exp

I see no indication of this.  Flatt and Felleisen restrict their
concern to expressions that produce values.

> That's what partial function means, and then we can redirect this
> subset Undef to bottom to get a total function teval_s.  Undef is not
> a computable subset, but it doesn't have to be.

Undef certainly does have to be a noncomputable subset.

>    They claim that *when* it produces a value that it produces the
>   correct value.  I have no problem with this.
>   
> Right, but partial function means more than that.
>   
>   I'm going to answer the rest of this tomorrow.
>   
> Take your time!  BTW I think you're the engine driving the thread.

I simply cannot take the credit for this.

-- 
~jrm
0
7/5/2004 8:26:02 PM
Joe Marshall <prunesquallor@comcast.net> responded to me:

  I said I'd answer the rest of this, but I don't even know where to
  begin, so I'm going to go back a ways.  This should be easy because
  you are much closer to having a complete definition of `Bill
  Semantics'.

Good idea, Joe.  But let's point out the c.l.s. sticking point: It
seems to be generally believed that my (1) is false.  Nielson^2 makes
no false statements AFAIK, but I'd guess Nielson^2 is unaware of (1).
[I think Schmidt & C-F understand it quite well.]
But if we assume that my (1) is true, then I think most folks here
will say that (3) is obvious, even if (2) isn't quite well-understood.

1) Compositionality is equivalent to the following definition:

The functions from syntactic to semantic domains
BB_i: B_i ---> D_i
must be accompanied by functions 
f_ij: prod_{l=1}^k D_ijl ---> D_i
satisfying the equations 
BB_i( Option_ij ) = f_ij ( BB_ij1(S_ij1),..., BB_ijk(S_ijk) ),
in terms of Schmidt's BNF setup.

Compositionality doesn't depend on whether the BB_i are defined before
the f_ij are defined, or afterward, by structural induction.  The
proof of this is immediate from the uniqueness of Schmidt's Thm 3.13.

2) F-F's standard reduction partial function eval_s gives a
well-defined total function
teval_s: LC_v Expr ---> Value_bottom.

No DS is needed, but perhaps more should be said about induction &
defining sets by formulas involving quantifiers, although this is
completely standard in Math (unlike Scott models & CPOs!).

3) We'll define Scheme-like semantic functions
   E: Expr ---> (U x S -> Value x S)
and eventually 
   E: Expr ---> (U x S x K -> A)
following F-F's eval_s, or the semantics of Advanced Scheme in HtDP.
E will be compositional because Scheme is!  The value of a combination
is obtained by first evaluating the arguments...


  >> I claimed I could easily turn Shriram's big-step OpS into a
  >> mathematical function
  >> 
  >> V : Expressions -> (Env -> Values)
  >>
  >> where V[[ (lambda (x) body) ]](E) = <x, body, E>  
  >>
  >> I said I didn't need Scott models or CPO's to define V, just some
  >> induction.

That was an error that Lauri caught me on, which I corrected, by
instead using small-step OpS: I now define V in a similar way to F-F's
standard reduction function eval_s.  So we need to understand eval_s.

Interestingly enough, Nielson^2 makes the same "error" I did, so I
suppose my V could be rigorized in a different way, although I won't
attempt this.  As Lauri pointed out, I think, big-step OpS is the same
thing as natural semantics.  On p 31, Nielson^2 defines a function

curly-S_{ns}: Stm ---> (State -> State)

curly-S_{ns}[[ S ]]s = s',		if <S, s> -> s'

		       undefined,	otherwise.

The attentive readers of c.l.s. will immediately point out this
curly-S_{ns} is not well-defined!  How do we know whether or not
there's a big-step reduction of <S, s> to a state?

But it quickly occurred to me that Nielson^2 can rigorize this by
generating recursively the set of all big-step reductions <S, s> -> s'.
And then curly-S_{ns} is well-defined: we just check if there's a pair
in the set of big-step reductions with first = <S, s>.
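
Here's a toy version of that fix in Scheme (mine, not Nielson^2's, with
the state component dropped to keep it tiny).  The language is just
numerals and succ, the big-step rules are n => n and (succ e) => v+1
when e => v, and we generate the derivable pairs by stages:

(define (subexprs e)               ; the only expressions that can matter
  (if (pair? e) (cons e (subexprs (cadr e))) (list e)))

(define (derivable-up-to k exprs)  ; pairs derivable with depth <= k
  (if (zero? k)
      '()
      (let ((prev (derivable-up-to (- k 1) exprs)))
        (apply append
               (map (lambda (e)
                      (cond ((number? e) (list (cons e e)))  ; axiom n => n
                            ((assoc (cadr e) prev)           ; rule for succ
                             => (lambda (hit)
                                  (list (cons e (+ (cdr hit) 1)))))
                            (else '())))
                    exprs)))))

(define (curly-S e k)              ; "check if there's a pair"
  (let ((hit (assoc e (derivable-up-to k (subexprs e)))))
    (if hit (cdr hit) 'undefined)))

;; (curly-S '(succ (succ 0)) 3)  =>  2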

I point this out not to excuse my error, because I wasn't thinking
about this fix when I erred, but instead, to point out that my error
had absolutely nothing to do with circular reasoning: I wasn't
defining V & Phi by some bogus mutual recursion.
  
  You also claim that V can be compositionally defined:
  
  >> [compositionality is] that there's a function Phi s.t. 
  >>
  >> V [[ ((lambda (x) x) 3) ]] = Phi(V[[ (lambda (x) x) ]], V[[3]])
  
Yes, and the claim is true, once we rigorously define V, by standard
reduction.  We'll just prove that we get these big-step reductions.

  Well, obviously we need a case by case for each production in our
  language.  These are two base cases:
  
  V [[ numeral ]](env) = number
  
  V [[ id ]](env) = lookup (id, env)

This brings up another problem!  I can't work with Shriram's 

V : Expressions ---> (Env -> Value)

I need to unconflate Env & Store.  Interestingly enough, Dana Scott
attributes this idea to himself in his intro to Stoy's book!  I want

E: Expressions ---> (Env x Store -> Value x Store)

But IMO it's premature to get into this, as we're still stuck on the
definition of compositionality, and perhaps we should consider Pure
Scheme first, i.e. F-F's LC_v, and then it's even simpler:

E: Expressions --->  Value

  using your definition of compositionality we can add function calls
  and conditionals:
  
  V [[ (<exp0> <exp1>) ]](env) 
  = 
  Phi0 (V [[<exp0>]](env), V [[<exp1>]](env))

As I posted earlier, my Phi doesn't reduce to your

Phi0: Value x Value ---> Value

But I guess you could get away with that if there are no side-effects
in the language.  But I won't do it.  As LC_v shows, you don't need to
talk about environments if you don't mutate.  We can use Y_v to define
recursion. 

  V [[ (if <exp0> <exp1> <exp2>) ]](env) 
      = Phi1 (V [[<exp0>]](env), V [[<exp1>]](env), V [[<exp2>]](env)) 

Same problem.  Joe, maybe I ought to be more flexible, but I can't
seem to unbend.  I need to talk about Phi as a map

Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)

     ---> (Env x Store -> Value x Store)
  
  If LAMBDA is to be compositional, it ought to be in this form:
  
  V [[ (lambda <var> <exp0>) ]](env) = Phi2 (V [[<exp0>]])(env)
  
you know that won't work!  There's no <var> on the RHS!  You of course
know how to handle this in Scheme: (lambda <var> <exp0>) is a special
form, as they say in Lisp.  Or syntax, as Schemers say.  We can in
fact handle this in Schmidt's Theorem 3.13 setup.  We'll say that
<var> belongs to a different syntactic domain

<var> in Lambda-Var

and we'll say that Lambda-Var is also a semantic domain, and we'll
insist on the identity map

LLVV: Lambda-Var  ---> Lambda-Var

Now in Schmidt's setup, LLVV is indicated by some f_ij's, as I posted
earlier:  If Lambda-Var = B_i, then Schmidt will say 
f_{i, x} = x
I wouldn't state it this way.  But I'm impressed that Schmidt could
state such a general result at all!  I tried & failed myself. 

Similarly for the body of a lambda expression.


  yet you seem to want it to be a primitive like this:
  
  V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)
  
Yes. 

  Is it compositional or not?

Yes, for the same reason that Scheme is "compositional"!  But it will
take me a while to prove it.  We have to first settle on a definition
of compositionality, and you have to understand that Phi need not
reduce to a Phi0.

I think you're asking a specific question here.  If we insisted that 

V [[ (lambda <var> <exp0>) ]] = Phi3( V [[ <var> ]],  V [[ <exp0> ]] )

we'd be hosed, right?  But Schmidt's statement is flexible enough to
avoid this, as I said above.
  
  This more or less completes the case analysis; we now need
  definitions of Phi0, Phi1, and Phi2.

I don't want to try to define them.  Another problem with Shriram's
setup is that evaluating expressions can change the environment, even
with just define.  So you'd really want 

V: Expressions -> (Env -> Value x Env)


  This is where I am having the difficulty.  I recall that Phi0 is
  supposed to depend upon some number of applications of eval_s in
  some way.

I posted Phi (not Phi0) again in the post that Will just responded
to.  And yes, Phi depends on V.  V will be some version of eval_s, so
there's no need to say eval_s.

   I suppose we could say this:
  
                    /  if a is tuple <v, exp, env>,
     Phi0 (a, b) = |      and there exists some n such that
                   |      eval_s^n (exp, extend_env (v, b, env))
			  = number or tuple
                   |      then that number or tuple
                   |
                   |   otherwise `bottom'
                    \
  
  (this assumes that the definition for lambda is *not* compositional)

I suppose you mean lambda is syntax/special-form, re above.  OK, I'll
do it again, you're close enough:

Phi(alpha, beta)(e, s) = (b_v, s3), if 

			 alpha(e, s) = (<i,b,e'>, s1)  

			 beta(e, s1) = (a_v, s2)

			 l = (new s2)      
  
			 E[b](e'[i->l], s2[l->a_v]) = (b_v, s3), and

Phi(alpha, beta)(e, s) = bottom, otherwise.
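
Transcribed into Scheme, so there's no doubt about where E appears (a
sketch: closure?, closure-var, closure-body, closure-env, new,
extend-env, and update-store are assumed helpers; denotations take an
env and a store and return a (value . store) pair; divergence of
alpha, beta, or E shows up as non-termination, not as 'bottom):

(define (Phi alpha beta)
  (lambda (e s)
    (let ((r1 (alpha e s)))                 ; alpha(e, s) = (<i,b,e'> . s1)
      (if (and (pair? r1) (closure? (car r1)))
          (let* ((f   (car r1)) (s1 (cdr r1))
                 (r2  (beta e s1))          ; beta(e, s1) = (a_v . s2)
                 (a-v (car r2)) (s2 (cdr r2))
                 (l   (new s2)))
            ((E (closure-body f))           ; <-- the deliberate use of E
             (extend-env (closure-var f) l (closure-env f))
             (update-store l a-v s2)))      ; = (b_v . s3)
          'bottom))))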

  
  I don't know what Phi1 is supposed to be.
  
  If we can get the definitions of Phi0 and Phi1 nailed down, we can
  go on to the more interesting question of whether `Bill Semantics'
  correctly transforms Shriram's Big-Step Ops to a DS form.

OK, I'll do it:

  E [ (if X Y Z) ] =  Phiif( E[X], E[Y], E[Z] )

Phiif: (Env x Store -> Value x Store)^3 

     ---> (Env x Store -> Value x Store)


Phiif(a, b, c)(r, s) = b(r, s1), if a(r, s) = (v1, s1) & v1 != false  

		       c(r, s1), if a(r, s) = (v1, s1) & v1  = false

		       bottom,	 otherwise 

		       
Here are my notes, which I needed to work this out:
 
We first evaluate X in env/store (r, s), so that means 

a(r, s) = (v1, s1)

if v1 != false, then we evaluate Y in (r, s1), so that means

b(r, s1) = (v2, s2)  

if v1 = false, then we evaluate Z in (r, s1), so that means

c(r, s1) = (v3, s3)
  

Now I think it's interesting that `if' requires no futzing like with
`lambda', even though in Scheme `if' is a special-form, syntax.  That
is, we don't evaluate both Y & Z in Scheme.  But in DS, it doesn't
matter!  We don't change anything by noticing that c(r, s1) = (v3, s3)
even though we're only supposed to evaluate Y.
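
As Scheme code (same conventions as my Phi above: denotations take (r, s)
and return a (value . store) pair, and I use Scheme's #f for the
language's false):

(define (Phiif a b c)
  (lambda (r s)
    (let ((r1 (a r s)))            ; evaluate the test X in (r, s)
      (if (pair? r1)               ; got (v1 . s1)?
          (if (car r1)             ; v1 != false:
              (b r (cdr r1))       ;   evaluate Y in (r, s1)
              (c r (cdr r1)))      ; else evaluate Z in (r, s1)
          'bottom))))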

Might as well do lambda as well:

E_lv: Lambda-Vars ---> Lambda-Vars	;; identity map 

E_lb: Lambda-Body ---> Lambda-Body	;; identity map


  E [ (lambda x Y) ] =  Philam( E_lv[ x ], E_lb[ Y ] )

Philam: Lambda-Vars x Lambda-Body  ---> (Env x Store -> Value x Store) 

Philam(x, Y)(r, s) = (<x, Y, r>, s)
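
Or in Scheme (same conventions; note that Philam never evaluates Y, it
just packages the closure):

(define (Philam x Y)
  (lambda (r s)
    (cons (list x Y r) s)))        ; (<x, Y, r> . s): the body Y stays syntax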
0
richter1974 (312)
7/6/2004 1:09:02 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Compositionality doesn't depend on whether the BB_i are defined before
> the f_ij are defined, or afterward, by structural induction.

Yes it does.  It is the /definition/ that is compositional, not the
function.  Every function can be made compositional in the above
sense, so this wouldn't be interesting in the least.

The point is that you need a *concrete* compositional /definition/ in
order to be able to perform proofs by structural induction.  It is not
enough to just show the existence of one.
0
find19 (1244)
7/6/2004 3:59:08 AM
Matthias Blume <me> wrote:

> [...] Every function can be made compositional in the above
> sense, so this wouldn't be interesting in the least.

I guess I should be a bit more careful here.  It is, of course, not
true that every function can be defined compositionally.  For a
variety of very good reasons we want semantic functions to have this
property:

The interpretation [[X]] of a piece of program text X should capture
/everything/ there is to know about X as far as its semantics is
concerned.  What this implies is that if you embed X into a larger
piece of program text Y[X], one should be able to express the
meaning of this Y[X] as y([[X]]) where y is a function determined by Y
alone.  More generally, if you have X1..Xn with meanings
[[X1]]..[[Xn]] embedded in some piece of glue code Y with n "holes",
then Y determines some n-ary function y such that [[Y[X1,...,Xn]]] =
y([[X1]],...,[[Xn]]).  This implies the possibility of giving a
compositional definition of [[.]]  as one merely needs to identify a
finite number of "glue" contexts (Y) and their corresponding functions (y)
which correspond to the right-hand sides of production rules that
generate the whole language.

In short, all "interesting" semantic functions can be defined
compositionally.  But just knowing that it can be done is not enough:
when we do proofs by structural induction we need those concrete Y and
their corresponding y!  And, of course, not all "interesting" [[.]] in
the above sense are created equal: some are far more interesting
than others.  For example, the [[.]] that maps X to X can obviously be
defined compositionally, but it fails as an interesting semantic
function in pretty much every other respect.
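
A concrete instance, as a Scheme sketch: take meanings in Env -> Value
and borrow the (succ ...) construct from Joe's tiny language.  The
one-hole glue Y[X] = (succ X) determines the function below, which
consumes only the /meaning/ of X, never its text:

(define (y-succ meaning-of-x)      ; y for the glue (succ [.])
  (lambda (env)
    (+ 1 (meaning-of-x env))))     ; [[ (succ X) ]] = (y-succ [[X]])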
0
find19 (1244)
7/6/2004 6:28:46 AM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responded to me:
>
>   I said I'd answer the rest of this, but I don't even know where to
>   begin, so I'm going to go back a ways.  This should be easy because
>   you are much closer to having a complete definition of `Bill
>   Semantics'.
>
> Good idea, Joe.  But let's point out the c.l.s. sticking point:  It
> seems to be generally believed that my (1) is false.  Nielson^2 makes
> no false statements AFAIK, but I'd guess Nielson^2 is unaware of (1).
> [I think Schmidt & C-F understand it quite well.]

Here is what Schmidt says:

  `On the positive side, it is possible to show that functions defined
   recursively over abstract syntax arguments *do* denote unique
   functions.  Structural induction comes to the rescue.  ...

  `Let a language L be defined by BNF equations:

     B_1 ::= Option_11 | Option_12 | ... | Option_1m
     B_2 ::= Option_21 | Option_22 | ... | Option_2m
     ...
     B_n ::= Option_n1 | Option_n2 | ... | Option_nm

   and let BB_i be a function symbol of type B_i->D_i for all 1<=i<=n.
   For an Option_ij, let S_ij1, S_ij2, ... S_ijk be the nonterminal
   symbols used in Option_ij, and let BB_ijl represent the B_l
   appropriate for each S_ijl (for example, if S_ijl=B_p, then
   BB_ijl=BB_p).

  `3.13 Theorem:
      If, for each B_i in L's definition and each Option_ij of B_i's
      rule, there exists an equation of the form:

        BB_i(Option_ij)
            = f_ij (BB_ij1 (S_ij1), BB_ij2 (S_ij2), ... BB_ijk(S_ijk))

      where f_ij is a function of functionality D_ij1 x D_ij2 x ... D_ijk -> D_i,
      then the set of equations uniquely defines a family of functions
      BB_i: B_i -> D_i for 1<=i<=n.

  `Proof:  The proof is by a simultaneous structural induction on the
   rules of L.  ...

If anyone else is still reading this, what this amounts to is the
assertion that if your language is built by composing syntactic
constructs, then it is valid to specify your semantics recursively if
they meet the following conditions:

    Each primitive syntactic construct has a denotation.
  
    Each non-primitive syntactic construct can be written as a
    non-recursive function of the denotations of its constituent
    parts.

The proof of this rests on the fact that any program in a language is
a finite syntactic combination of a finite number of syntactic
primitive forms.

This is the basis for which I use structural recursion to define the
semantics for my tiny language:

(define (curly-e expression)
  (cond ((numeral? expression)     (curly-e-numeral expression))
        ((variable? expression)    (curly-e-variable expression))
        ((succ? expression)        (curly-e-succ
                                     (curly-e (succ-arg expression))))

        ((ifequal? expression) (curly-e-ifequal
                                     (curly-e (ifequal-left   expression))
                                     (curly-e (ifequal-right   expression))
                                     (curly-e (ifequal-consequent  expression))
                                     (curly-e (ifequal-alternative expression))))

        ((lambda? expression) (curly-e-lambda
                               (lambda-variable expression)
                               (curly-e (lambda-body  expression))))

        ((call? expression) (curly-e-call
                             (curly-e (call-operator expression))
                             (curly-e (call-argument expression))))

        (else (error "Illegal expression." expression))))



> But if we assume that my (1) is true, then I think most folks here
> will say that (3) is obvious, even if (2) isn't quite well-understood.
>
> 1) Compositionality is equivalent to the following definition:
>
> The functions from syntactic to semantic domains
> BB_i: B_i ---> D_i
> must be accompanied by functions 
> f_ij: prod_{l=1}^k D_ijl ---> D_i
> satisfying the equations 
> BB_i( Option_ij ) = f_ij ( BB_ij1(S_ij1),..., BB_ijk(S_ijk) ),
> in terms of Schmidt's BNF setup.
>
> Compositionality doesn't depend on whether the BB_i are defined before
> the f_ij are defined, or afterward, by structural induction.  The
> proof of this is immediate from the uniqueness of Schmidt's Thm 3.13.

It depends on the BB_ijl(S_ijl) being independent of f_ij, and f_ij
being a function.  This is where the problem lies.

[snipping a lot because it does not answer any question I asked]

>   You also claim that V can be compositionally defined:
>   
>   >> [compositionality is] that there's a function Phi s.t. 
>   >>
>   >> V [[ ((lambda (x) x) 3) ]] = Phi(V[[ (lambda (x) x) ]], V[[3]])
>   
> Yes, and the claim is true, once we rigorously define V, by standard
> reduction.  We'll just prove that we get these big-step reductions.
>
>   Well, obviously we need a case by case for each production in our
>   language.  These are two base cases:
>   
>   V [[ numeral ]](env) = number
>   
>   V [[ id ]](env) = lookup (id, env)
>
> This brings up another problem!  I can't work with Shriram's 
>
> V : Expressions ---> (Env -> Value)
>
> I need to unconflate Env & Store.  Interestingly enough, Dana Scott
> attributes this idea to himself in his intro to Stoy's book!  I want
>
> E: Expressions ---> (Env x Store -> Value x Store)

In the tiny language I defined, there are no operations whatsoever on
the store.  Without side effects, there is no need to model the store.

> But IMO it's premature to get into this, as we're still stuck on the
> definition of compositionality, and perhaps we should consider Pure
> Scheme first, i.e. F-F's LC_v, and then it's even simpler:
>
> E: Expressions --->  Value
>
>   using your definition of compositionality we can add function calls
>   and conditionals:
>   
>   V [[ (<exp0> <exp1>) ]](env) 
>   = 
>   Phi0 (V [[<exp0>]](env), V [[<exp1>]](env))
>
> As I posted earlier, my Phi doesn't reduce to your
>
> Phi0: Value x Value ---> Value

Phi0 is simply a name for one of the f_ij.  If your Phi is not such a
function, then you have not fulfilled the requirements for applying
Schmidt's Theorem 3.13 and your equations are not compositional.

> But I guess you could get away with that if there are no side-effects
> in the language.  But I won't do it.  As LC_v shows, you don't need to
> talk about environments if you don't mutate.  We can use Y_v to define
> recursion. 
>
>   V [[ (if <exp0> <exp1> <exp2>) ]](env) 
>       = Phi1 (V [[<exp0>]](env), V [[<exp1>]](env), V [[<exp2>]](env)) 
>
> Same problem.  Joe, maybe I ought to be more flexible, but I can't
> seem to unbend.  I need to talk about Phi as a map
>
> Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)
>
>      ---> (Env x Store -> Value x Store)

The equations do not change if you add the store because no operation
uses or affects the store.  Feel free to read `env' as `env x store'
and `value' as `value x store' if you wish.  It makes no difference.  

>   If LAMBDA is to be compositional, it ought to be in this form:
>   
>   V [[ (lambda <var> <exp0>) ]](env) = Phi2 (V [[<exp0>]])(env)
>   
> you know that won't work!  There's no <var> on the RHS!  

I was focusing on the requirements for composition, so I neglected to
put in the var.  The equation should read this:

   V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)

>   yet you seem to want it to be a primitive like this:
>   
>   V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)
>   
> Yes. 
>
>   Is it compositional or not?
>
> Yes, for the same reason that Scheme is "compositional"!  

This has nothing to do with Scheme.  I'm only interested in
composition as in Schmidt's Theorem 3.13.  This definition is
compositional:
      V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)

this one is not:
      V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)

> But it will take me a while to prove it.  We have to first settle on
> a definition of compositionality, and you have to understand that
> Phi need not reduce to a Phi0.

I believe that you wish to use Schmidt's Theorem 3.13 to define
compositionality.  As I said before, if your Phi does not reduce to
Phi0, then you do not satisfy the requirements for applying Schmidt's
Theorem 3.13 and your equations are not compositional.

> I think you're asking a specific question here.  If we insisted that 
>
> V [[ (lambda <var> <exp0>) ]] = Phi3( V [[ <var> ]],  V [[ <exp0> ]] )
>
> we'd be hosed, right?  

Variable names are not part of the value space, so the correct form is
this:

   V [[ (lambda <var> <exp0>) ]] = Phi3( <var>,  V [[ <exp0> ]] )

but your conclusion is still correct:  you are hosed.

> But Schmidt's statement is flexible enough to
> avoid this, as I said above.
>   
>   This more or less completes the case analysis; we now need
>   definitions of Phi0, Phi1, and Phi2.
>
> I don't want to try to define them.  

Then you have not proven that Schmidt's Theorem 3.13 applies and there
is no reason to believe that your equations are compositional.

> Another problem with Shriram's setup is that evaluating expressions
> can change the environment, even with just define.  So you'd really
> want
>
> V: Expressions -> (Env -> Value x Env)

This is why my tiny language has no DEFINE.

>   This is where I am having the difficulty.  I recall that Phi0 is
>   supposed to depend upon some number of applications of eval_s in
>   some way.
>
> I posted Phi (not Phi0) again in the post that Will just responded
> to.  And yes, Phi depends on V.  

You haven't shown that V is well-defined.  There is no reason to
believe, therefore, that Phi is well-defined, nor that anything
derived from Phi is well-defined.
0
jrm (1310)
7/6/2004 2:59:53 PM
Matthias Blume <find@my.address.elsewhere> writes:

> Joe Marshall <prunesquallor@comcast.net> writes:
>
>> Matthias Blume <find@my.address.elsewhere> writes:
>> 
>> > Joe Marshall <prunesquallor@comcast.net> writes:
>> >
>> >> So I imagine that we are talking about programs in the abstract,
>> >> rather than physical machines.
>> >
>> > Indeed.
>> >
>> >> Consider an accelerating Turing Machine in which each step executes in
>> >> 1/2 the time of the previous step.
>> >
>> > TM time is, of course, number of steps.  Don't be silly.
>> 
>> I'm not being silly.  I was offering this as a counterexample.
>
> I know.  But it wasn't one.  When the machine is abstract (as you
> pointed out yourself), time is obviously abstract, too.  TM time is
> the same as the number of state transitions it makes.  By definition,
> it is not possible to execute one step twice as fast as the previous
> one.  It is always one step per step.

Fine, but what is wrong with a transfinite number of steps?
0
jrm (1310)
7/6/2004 3:05:46 PM
cesuraSPAM@verizon.net (William D Clinger) responded to me:
  
  You yourself said "the proof [of Theorem 3.13] is trivial."  You
  said "the statement is arduous" because it took you hours to
  understand.  That doesn't imply that it took me hours to understand
  it.

Great, Will.  I could believe that!  Your BNF is much better than
mine, and maybe it didn't throw you off when there were no terminals
mentioned, but only the nonterminals S_ijl.  So let's stick with 3.13.

  > Will, I'm 100% done with a proof that I can produce
  > compositionally defined functions by:
  > 
  > 1) first defining some maps EE_i: B_i ---> D_i
  > 
  > 2) then defining the f_ij in terms of the EE_i
  > 
  > 3) then using the uniqueness of Schmidt's 3.13 to show that the EE_i
  > are defined compositionally by the f_ij.
  
  No, Bill, you've been leaving out the important part: Showing that
  the unique functions determined by your f_ij are equal to your EE_i.
  
As I keep saying, this follows from the uniqueness of Schmidt's
3.13.  But thanks for a substantive objection.  Let's go through this:

I'm asserting that my EE_i's & f_ij's satisfy these equations:

EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )

I mean this is the assumption. In specific cases, we'll have to show
that the EE_i actually satisfy these equations.  The mere fact that
f_ij are defined in terms of the EE_i does not prove these equations.

But given my f_ij, the existence part of 3.13 produces (by structural
induction) functions BB_i satisfying the equations

BB_i( Option_ij ) = f_ij ( BB_ij1(S_ij1),..., BB_ijk(S_ijk) )

So you believe the uniqueness part of 3.13, right?  But uniqueness
implies that BB_i = EE_i, since we have 2 sets of solutions to the
same equations, and there is a unique solution to these equations.

Are you saying that you don't believe that's what uniqueness means?
Do you think that Schmidt's uniqueness means only that the structural
induction machine will produce a unique answer?  If that's what you
think, then I'd say

1) That's not what Schmidt states.  Joe just posted it, and I quote:

     If, for each B_i in L's definition and each Option_ij of B_i's
      rule, there exists an equation of the form:

        BB_i(Option_ij)
            = f_ij (BB_ij1 (S_ij1), BB_ij2 (S_ij2), ... BB_ijk(S_ijk))

      where f_ij is a function of functionality 
      D_ij1 x D_ij2 x ... D_ijk -> D_i,
      then the set of equations uniquely defines a family of functions
      BB_i: B_i -> D_i for 1<=i<=n.

[Thanks Joe!]  Let me rewrite this for clarity:

   Given functions f_ij: D_ij1 x D_ij2 x ... D_ijk -> D_i, 

   for i = 1,..,n, j = 1,...,m, there is a unique family of functions
   BB_i: B_i -> D_i for 1<=i<=n satisfying the equations 

        BB_i(Option_ij)
            = f_ij (BB_ij1 (S_ij1), BB_ij2 (S_ij2), ... BB_ijk(S_ijk))

There's no mention of structural induction in the statement of 3.13. 

2) Even if Schmidt wrote what I just speculated you thought, we easily
prove that BB_i = EE_i by structural induction.  
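
Here is the argument in miniature, as runnable Scheme (my toy, of
course: just numerals and succ, so there's no recursion and it dodges
the part you say is hard; the point is only the uniqueness mechanics).
EE is defined first, by iterating a one-step reducer R; then f-succ is
read off; BB is the structural-induction solution; and BB = EE because
both satisfy the same equations:

(define (R e)                      ; one standard-reduction step
  (if (number? (cadr e))
      (+ 1 (cadr e))               ; (succ n) -> n+1
      (list 'succ (R (cadr e)))))  ; otherwise reduce inside the succ

(define (EE e)                     ; my reduction-defined map
  (if (number? e) e (EE (R e))))

(define (f-succ d) (+ 1 d))        ; the f_ij, defined after EE

(define (BB e)                     ; the Theorem 3.13 solution
  (if (number? e) e (f-succ (BB (cadr e)))))

;; (EE '(succ (succ 3)))  =>  5
;; (BB '(succ (succ 3)))  =>  5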

  >   You've been spewing this stuff for two years now, and you still
  >   can't see the obvious circularity between your definitions of E
  >   and Phi.  It's time for pushing and shoving.
  > 
  > Push+Shove: you're all wrong.  I've got a lot of respect for you,
  > and it's no fun to keep telling this.  The problem IMO is that you
  > haven't re-read Schmidt's 3.13.
  
  No, the problem is that you aren't showing enough respect for
  mathematical rigor.  I don't care whether you respect me, but I wish
  you'd show a little respect for mathematics.
  
I think you got it backward, Will.  I've shown the math here a ton of
respect.  But I think you fell into a trap I've fallen in repeatedly:
Somebody I don't think is much of a mathematician says something I
didn't know, and I jump on it, "That's false, that's nonsense!"  Often
it happens that they were right.  But it's really hard to judge
mathematical strength, especially with folks you don't know.
  
  >    we predicted you wouldn't be able to define Phi without using
  >    E.
  > 
  > That's an easy prediction, I've said that all along!
  
  By E, I meant the semantic function that is defined compositionally
  in terms of Phi.  

OK, but that's not what I meant!  And I defined E!  I should win this
argument :)

   If you can't define Phi without using that function, then E is
  defined in terms of Phi, Phi is defined in terms of E, and you've
  got a circularity in your definitions.

Right, good.  But I did something different.
  
  In a previous message, you wrote
  
  > Right, I goofed.  But I've done so before.  You've seen this
  > before, and you & Joe predicted I couldn't define E independently
  > of Phi:
  > 
  > E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store)
  
  This equality is either a definition or a claimed theorem.  

It's neither; it's a claim, but it will become a theorem shortly after
I completely define E.  I think you know how to prove it yourself
already. I've already explained how Phi is defined in terms of E.

   Since you have given no proof of it, I assumed it was a definition,
  and pointed out that it is circular because Phi is defined in terms
  of E.

OK, fine, sounds like an honest mistake.
  
  Now you're saying the above equality wasn't a definition:

Correct!
  
  >   Since E is defined in terms of Phi,
  > 
  > No No No No No No No No!  I've said over & over I'll define E by
  > standard reduction, using a 1-step non-recursive function R.
  
  Since your equality wasn't a definition, you must have been claiming
  it as a theorem.  That is an empty claim: you haven't proved any
  such theorem.

Right, but it's not all my fault.  The action in this discussion, now
and 2 years ago, has entirely been on the definition of
compositionality.
  
  > I've twice (recently about Advanced Scheme, and 2 years ago)
  > posted reasonable sketches of my E & R functions.  Since nobody
  > responded to those 2 posts, it's unfair to say that I can't do it.
  > Plus, you know it can be done, I think you said you did it 25
  > years ago!
  
  No one is saying it can't be done.  I'm saying you haven't done it.

Sure, absolutely!  I mean, let's not forget that your Scheme is a
zillion times better than mine!   As I said above, I not only think
you can define E & R, I think you can prove the theorem

E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store)

for my already-defined-in-terms-of-E Phi.
  
  > The only question is whether you can show this standard reduction
  > E is compositional.  Which is why we're discussing
  > compositionality...
  
  No, that's not the question at all.
  
  The real question is whether you can prove that the E you define via
  reduction is equal to the E' that, by Schmidt's Theorem 3.13, is
  uniquely determined by the f_ij that you define using E.

OK, but that breaks down into 2 parts:

1) my proof that Bill-compositionality = Will-compositionality 

2) a proof that my E defined via reduction satisfies the equations
   defined by the f_ij, that is, Schmidt's equations above:

EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )

We can't do it with just one E function; we'll need a few EE_i.

Now I haven't gone ahead with (2) since we're still stuck on (1).
Once we agree on (1), then you can prove (2) yourself, faster than I
can, and of course the first thing is to really define E & R.

However, if we don't believe (1), then I can't possibly succeed!
  
  You've been using the same names for E and E', thereby assuming at
  the outset that they are equal.  

I thought I was quite clear above, distinguishing between EE_i and
BB_i.  I've always been that clear in Schmidt 3.13 discussions.

   The mathematical term for that is "cheating".  You aren't allowed
  to use notation that assumes the theorem you haven't yet proved.

I don't think I did that, but you remind me of Bertrand Russell's
line, that the axiomatic method has advantages similar to those of
stealing, compared to honest living.  A lot of the problem in these DS
discussions is that folks just can't believe how powerful the ZFC
axioms are.  I wasn't thinking of you at all here, Will.
  
  In fact, you have not yet defined a complete set of f_ij for any
  language that can express iteration or recursion.  

Absolutely!  

   You asked me what would be a reasonable subset of Scheme.  I
  suggest you start with the lambda calculus subset plus numeric
  constants.

Great!  I'll get working on this as we continue to work on 3.13.  I
need clarification: I think this means a version of F-F's ISWIM on p
39 <http://www.ccs.neu.edu/course/com3357/mono.ps>.  So it's LC_v with
some "basic constants", i.e. numeric constants, and should we throw in
some of F-F's "primitive operations"?  How about +, -, *, / and F-F's
primitive op `iszero', which defines if0?
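
Concretely, I'm reading your suggestion as the following grammar,
written as a Scheme predicate (a sketch; the exact list of primitive
operations is my guess at what you want):

(define (expr? e)
  (or (number? e)                                  ; numeric constant
      (symbol? e)                                  ; variable
      (and (pair? e) (eq? (car e) 'lambda)         ; (lambda (x) body)
           (pair? (cadr e)) (symbol? (car (cadr e)))
           (expr? (caddr e)))
      (and (pair? e) (memq (car e) '(+ - * /))     ; binary primitive ops
           (expr? (cadr e)) (expr? (caddr e)))
      (and (pair? e) (eq? (car e) 'iszero?)        ; unary primitive op
           (expr? (cadr e)))
      (and (pair? e) (null? (cddr e))              ; application (M N)
           (expr? (car e)) (expr? (cadr e)))))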

  Once you have used reduction to define your E for that language, you
  should then give a clear statement of the f_ij that you want to use
  in Theorem 3.13.

I'll do so!
  
  Then, and only then, can you prove that the unique functions whose
  existence is asserted by Theorem 3.13 coincide with the E that you
  defined by reduction.

Right.
  
  That will not be a trivial proof.  I predict that you will be unable
  to complete that proof without formulating and using a compositional
  semantic function that is defined in the usual denotational way,
  without using your reduction semantics.  If my prediction is
  correct, then it will be obvious that your claim to have avoided the
  complexity of the usual denotational definitions was bogus.

This is why we're still stuck on the definition of compositionality.
If we believe that compositionality means exactly what you think it
means, then it's indeed impossible for me to complete my task.  So we
need to first straighten out the only hard part of the proof, which is

Bill-compositionality = Will-compositionality 

proved by the uniqueness of Schmidt's 3.13.  I'll go ahead on the
small subset, but I cannot complete the proof until we settle this.
0
richter1974 (312)
7/7/2004 12:53:19 AM
Joe Marshall <prunesquallor@comcast.net> responded to me:
  
Joe, you said some things I liked:

  > Are you questioning whether eval_s is compositional?  
  
  No.

Great!  And you're indeed working with F-F's ISWIM, a simple numerical
version of it.  But we're having a communication failure:
  
  >   Now here's where I use domains.  I allow functions to operate on
  >   functions, so the domain of expressible values includes
  >   functions over that domain.
  >
  > Sure, but F-F doesn't, ...
  
  Where did you get that impression?  Flatt and Felleisen first
  introduce lambda arguments on page 25, develop recursion as
  self-application on page 31, and use lambda expressions as arguments
  on the very first page of their introduction to ISWIM.

By domains I thought you meant Scott models of LC, so that you could
do P = (P -> P), and in particular have Procedures act on Values, even
though Procedures is a subset of Values.  Did I misunderstand you?
  
  > That's a lot of nice Scheme code you wrote!  I wasn't clear what the
  > purpose was.  The issues we've been on (compositionality, Halting
  > Problem total functions) aren't settled by actual computations :^0
  
  Many of the issues we've been on *are* settled by actual computations.
  If the computer operates on an expression and produces a result, then
  clearly that particular instance of the problem has a well-defined
  solution.  The rigor required to make a program run is very high and
  does not admit sloppy formulations.  The program can easily be used as a
  starting point for proofs.

Yeah, sure.  It won't settle non-halting issues, though.  But it's
good to have code.
  
  For example, my definition of curly-e is itself virtually a proof that
  curly-e is compositional (the remainder of the proof is in the fact
  that no function on which curly-e relies itself calls curly-e).
  
  (define (curly-e expression)
    (cond ((numeral? expression)     (curly-e-numeral expression))
          ((variable? expression)    (curly-e-variable expression))
          ((succ? expression)        (curly-e-succ
                                       (curly-e (succ-arg expression))))
  
          ((ifequal? expression) (curly-e-ifequal
                                       (curly-e (ifequal-left   expression))
                                       (curly-e (ifequal-right   expression))
                                       (curly-e (ifequal-consequent  expression))
                                       (curly-e (ifequal-alternative expression))))
  
          ((lambda? expression) (curly-e-lambda
                                 (lambda-variable expression)
                                 (curly-e (lambda-body  expression))))
  
          ((call? expression) (curly-e-call
                               (curly-e (call-operator expression))
                               (curly-e (call-argument expression))))
  
          (else (error "Illegal expression." expression))))
  
  If you can write the equations for `Bill Semantics' in the form
  above, then no one will argue whether your definition is
  compositional.  If you cannot, then there is more than sufficient
  reason to doubt it.

I'm going to write down equations like that.  I like that, "BS" :) But
I won't define curly-e that way.  Just like F-F's eval_s is not
defined in the form of your curly-e.  It's a theorem that F-F's
eval_s satisfies such equations.  But people may indeed argue.
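For concreteness, though, here's a plausible shape for two of the
helpers your curly-e leaves undefined (an illustrative sketch, names
mine, taking meanings to be functions from environments to values):

(define (curly-e-numeral expression)
  ;; a numeral means the constant function returning its number
  ;; (numeral->number is an assumed parsing helper)
  (let ((n (numeral->number expression)))
    (lambda (env) n)))

(define (curly-e-succ arg-meaning)
  ;; succ means "add one to whatever the argument means"
  (lambda (env) (+ 1 (arg-meaning env))))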
0
richter1974 (312)
7/7/2004 1:06:09 AM
Joe Marshall <prunesquallor@comcast.net> responds to me:
  
  There are many expressions for which eval_s does not return.

Sure, Joe, and that's OK, since it's a partial function.
  
  > But my point is that F-F believes the collection of such elements
  > is a subset
  >
  > Undef subset LC_v Exp
  
  I see no indication of this.  Flatt and Felleisen restrict their
  concern to expressions that produce values.

But that's what "partial function" means.  Since F-F calls eval_s a
partial function, they must believe there's a subset Undef, and then
eval_s is really a total function defined on the complement 

eval_s: (LC_v Exp) - Undef ---> Value

Thanks for citing Schmidt's 3.13.  Read Schmidt's discussion of
partial functions vs. total/bottom functions.  Real early in the book.
  
  > That's what partial function means, and then we can redirect this
  > subset Undef to bottom to get a total function teval_s.  Undef is
  > not a computable subset, but it doesn't have to be.
  
  Undef certainly does have to be a noncomputable subset.

Absolutely!  That's the power of the ZFC axioms: we can talk
mathematically about uncomputable sets and functions.
0
richter1974 (312)
7/7/2004 1:15:48 AM
Matthias Blume responded to me:
  
  > Compositionality doesn't depend on whether the BB_i are defined
  > before the f_ij are defined, or afterward, by structural
  > induction.
  
  Yes it does.  It is the /definition/ that is compositional, not the
  function.  

Matthias, I think this is wrong, but even if it's correct, it imposes
only an inconvenience on folks like me.  I've been arguing about this
with Will, but we didn't finish it.  Will wanted to insert the phrase
DEFINED BY into Cartwright-Felleisen's Ext Den paper:

 [The map from syntactic domains to semantic domains] satisfies
 the law of compositionality: the interpretation of a phrase is
 DEFINED BY a function of the interpretation of the sub-phrases.

But even so, the *maps* satisfy the law of compositionality.  Not the
definitions of the maps.  Since maps don't have unique definitions, it
can only mean that the maps have *some* definition by structural
induction etc.  What I've been proving is that you can take those 2
words DEFINED BY out and it's the same definition: That is, the maps
must satisfy some equations, which Schmidt writes as

EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )

It's a result (possibly (but I doubt it) due to me) that if functions
EE_i & f_ij exist s.t. these nxm equations are all true, then the
functions EE_i are indeed defined by structural induction by the f_ij.
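A toy example of the point, in Scheme: take the grammar
N ::= zero | (succ N).  Both functions below satisfy the equations
EE( zero ) = 0 and EE( (succ N) ) = 1 + EE( N ), the first by
structural induction, the second not; by the uniqueness half of a
3.13-style theorem, they are the same function.

(define (ee-by-induction n)
  ;; structural induction: f_zero = 0, f_succ(d) = d + 1
  (if (eq? n 'zero)
      0
      (+ 1 (ee-by-induction (cadr n)))))

(define (ee-by-iteration n)
  ;; no structural induction, just a loop counting succs
  (let loop ((n n) (count 0))
    (if (eq? n 'zero)
        count
        (loop (cadr n) (+ count 1)))))

;; (ee-by-induction '(succ (succ zero)))  =>  2
;; (ee-by-iteration '(succ (succ zero)))  =>  2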

   Every function can be made compositional in the above sense, so
  this wouldn't be interesting in the least.

No, but you fixed this fine in your 2nd very informative post.
  
  The point is that you need a *concrete* compositional /definition/ in
  order to be able to perform proofs by structural induction.  It is not
  enough to just show the existence of one.

Yeah, good point!
0
richter1974 (312)
7/7/2004 1:31:41 AM
Joe Marshall <jrm@ccs.neu.edu> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> > Joe Marshall <prunesquallor@comcast.net> writes:
> >
> >> Matthias Blume <find@my.address.elsewhere> writes:
> >> 
> >> > Joe Marshall <prunesquallor@comcast.net> writes:
> >> >
> >> >> So I imagine that we are talking about programs in the abstract,
> >> >> rather than physical machines.
> >> >
> >> > Indeed.
> >> >
> >> >> Consider an accelerating Turing Machine in which each step executes in
> >> >> 1/2 the time of the previous step.
> >> >
> >> > TM time is, of course, number of steps.  Don't be silly.
> >> 
> >> I'm not being silly.  I was offering this as a counterexample.
> >
> > I know.  But it wasn't one.  When the machine is abstract (as you
> > pointed out yourself), time is obviously abstract, too.  TM time is
> > the same as the number of state transitions it makes.  By definition,
> > it is not possible to execute one step twice as fast as the previous
> > one.  It is always one step per step.
> 
> Fine, but what is wrong with a transfinite number of steps?

If it takes more than a finite number of steps, then it falls under
the "does not halt" category.

    "halts"         = reaches stop state after finite number of steps
    "does not halt" =  everything else

There is no third possibility.
0
find19 (1244)
7/7/2004 1:57:50 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:
  
  > Compositionality doesn't depend on whether the BB_i are defined
  > before the f_ij are defined, or afterward, by structural
  > induction.  The proof of this is immediate from the uniqueness of
  > Schmidt's Thm 3.13.
  
  It depends on the BB_ijl(S_ijl) being independent of f_ij, and f_ij
  being a function.  This is where the problem lies.

Only if you're defining the BB_i by structural induction, Joe.   But
let's move on to your Scheme subset:

  > I need to unconflate Env & Store.  Interestingly enough, Dana Scott
  > attributes this idea to himself in his intro to Stoy's book!  I want
  >
  > E: Expressions ---> (Env x Store -> Value x Store)
  
  In the tiny language I defined, there are no operations whatsoever
  on the store.  Without side effects, there is no need to model the
  store.

But let's toss Env as well!  It's more in the spirit of LC_v to define 

E: Expressions ---> Value
  
  > But IMO it's premature to get into this, as we're still stuck on the
  > definition of compositionality, and perhaps we should consider Pure
  > Scheme first, i.e. F-F's LC_v, and then it's even simpler:
  >
  > E: Expressions --->  Value
  >
  >   using your definition of compositionality we can add function calls
  >   and conditionals:
  >   
  >   V [[ (<exp0> <exp1>) ]](env) 
  >   = 
  >   Phi0 (V [[<exp0>]](env), V [[<exp1>]](env))
  >
  > As I posted earlier, my Phi doesn't reduce to your
  >
  > Phi0: Value x Value ---> Value
  
  Phi0 is simply a name for one of the f_ij.  

That's a serious misunderstanding, and we won't be able to talk about
State until you fix this.  You're asserting a theorem which I'm not
going to be able to prove.  But it's not required.  Let's go back:

Suppose we define V in Shriram's form, although I can't work this way:

V: Expressions ---> (Env  -> Value)
  
Then we need a function Phi satisfying 

V [[ (<exp0> <exp1>) ]] = Phi (V [[<exp0>]], V [[<exp1>]]) 

That of course means that

V [[ (<exp0> <exp1>) ]](env) = Phi (V [[<exp0>]], V [[<exp1>]])(env)

for all env in Env.  It does not mean there's a Phi0 s.t. 

Phi (V [[<exp0>]], V [[<exp1>]])(env) 
= 
Phi0 (V [[<exp0>]](env), V [[<exp1>]](env)) 

or more simply, for alpha, beta: Env -> Value

Phi (alpha, beta)(env) = Phi0 (alpha(env), beta(env))

for a function Phi0: Value x Value ---> Value.

You know you can't do this, right?  We have to evaluate <exp1> in the
environment grabbed by V [[<exp0>]](env).  We don't evaluate <exp1> in
the current environment.
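In Scheme, the factoring in question is just pointwise lifting -- a
sketch, where phi0 stands for the hypothetical value-level function:

(define (lift-to-meanings phi0)
  ;; phi0   : Value x Value -> Value
  ;; result : (Env -> Value) x (Env -> Value) -> (Env -> Value)
  (lambda (alpha beta)
    (lambda (env)
      (phi0 (alpha env) (beta env)))))

My point is that the Phi we need is not of this lifted form.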


  If your Phi is not such a function, then you have not fulfilled the
  requirements for applying Schmidt's Theorem 3.13 and your equations
  are not compositional.

Right, but I can define Phi, which would in Shriram's setup be 

Phi: (Env  -> Value) x (Env  -> Value) ---> (Env  -> Value)
  
  > But I guess you could get away with that if there are no
  > side-effects in the language.  But I won't do it.  As LC_v shows,
  > you don't need to talk about environments if you don't mutate.  We
  > can use Y_v to define recursion.
  >
  >   V [[ (if <exp0> <exp1> <exp2>) ]](env) 
  >       = Phi1 (V [[<exp0>]](env), V [[<exp1>]](env), V [[<exp2>]](env)) 
  >
  > Same problem.  Joe, maybe I ought to be more flexible, but I can't
  > seem to unbend.  I need to talk about Phi as a map
  >
  > Phi: (Env x Store -> Value x Store) x (Env x Store -> Value x Store)
  >
  >      ---> (Env x Store -> Value x Store)
  
  The equations do not change if you add the store because no
  operation uses or affects the store.  Feel free to read `env' as
  `env x store' and `value' as `value x store' if you wish.  It makes
  no difference.

Surely everyone sees I'm not a great Schemer, right?  I learned the
Env x Store biz, in spite of my uh, less than stellar grasp of Scheme.
I just don't feel comfortable with this (Env -> Value) biz.  To me,
define is a set!.  That's how R5RS DS defines define.  If you have
some set!'s, you might as well throw in real mutation.  But if you
only want define, let's use Y_v in LC_v.
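Here, as a sketch, is the Env x Store version of Phi for application
that I keep writing down (apply-value is an assumed helper that
applies a procedure value, possibly touching the store; results are
(Value . Store) pairs).  Because the store produced by evaluating the
operator feeds the evaluation of the operand, this Phi does not
factor pointwise through any Phi0 on plain values:

(define (phi alpha beta)
  ;; alpha, beta : Env x Store -> Value x Store
  (lambda (env store)
    (let* ((r0     (alpha env store))
           (proc   (car r0))
           (store1 (cdr r0))
           (r1     (beta env store1))   ; note: store1, not store
           (arg    (car r1))
           (store2 (cdr r1)))
      (apply-value proc arg store2))))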
  
  >   If LAMBDA is to be compositional, it ought to be in this form:
  >   
  >   V [[ (lambda <var> <exp0>) ]](env) = Phi2 (V [[<exp0>]])(env)
  >   
  > you know that won't work!  There's no <var> on the RHS!  
  
  I was focusing on the requirements for composition, so I neglected
  to put in the var.  The equation should read this:
  
     V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)
  
But that's not true either, right?  I still contend that you're
pointing out that syntax/special-form require special treatment from
compositionality.   What you really want is 

V [[ (lambda <var> <exp0>) ]] = Phi2 (<var>, <exp0>)

and you can fit that under Schmidt's 3.13 umbrella.

  >   yet you seem to want it to be a primitive like this:
  >   
  >   V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)
  >   
  > Yes. 
  >
  >   Is it compositional or not?
  >
  > Yes, for the same reason that Scheme is "compositional"!  
  
  This has nothing to do with Scheme.  

I'm sorry, Joe, I misread your equation above.  Yes, that's the
primitive I want, Shriram's triples.  And to answer your question, I
think I can prove compositionality, but it will involve a number of
things.  That definition by itself doesn't get us compositionality.

  I'm only interested in composition as in Schmidt's Theorem 3.13
  This definition is compositional: V [[ (lambda <var> <exp0>) ]](env)
  = Phi2 (<var>, V [[<exp0>]])(env)

sort of, see above, but you're right in spirit.
  
  this one is not:
     V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)   

No, this is exactly the argument I've been having with you guys for 2+
years. It now looks like we're getting to a resolution.  You're right
in that the RHS is not part of a structural induction definition.  But
I'm not defining V by structural induction.  It turns out that
Schmidt's 3.13 says that's OK.
  
  > But it will take me a while to prove it.  We have to first settle
  > on a definition of compositionality, and you have to understand
  > that Phi need not reduce to a Phi0.
  
  I believe that you wish to use Schmidt's Theorem 3.13 to define
  compositionality.  

Yes.

  As I said before, if your Phi does not reduce to Phi0, then you do
  not satisfy the requirements for applying Schmidt's Theorem 3.13 and
  your equations are not compositional.

Nah, that's just a dumb error.  I say dumb because I know you're
really good at functions that take functions as arguments.
  
  > But Schmidt's statement is flexible enough to
  > avoid this, as I said above.
  >   
  >   This more or less completes the case analysis, we now need
  >   definitions of Phi1, Phi2, and Phi3.  
  >
  > I don't want to try to define them.  
  
  Then you have not proven that Schmidt's Theorem 3.13 applies and
  there is no reason to believe that your equations are compositional.

No, you're only reading the proof of the existence part of 3.13.
There's no structural induction in the statement of Schmidt's 3.13.
  
  > Another problem with Shriram's setup is that evaluating
  > expressions can change the environment, even with just define.  So
  > you'd really want
  >
  > V: Expressions -> (Env -> Value x Env)
  
  This is why my tiny language has no DEFINE.

Oh, then we must use the LC_v setup of 

V: Expressions -> Value

At least at 1st.  Later to hook up with your code, maybe we should
ramp up to (Env -> Value).
  
  >   This is where I am having the difficulty.  I recall that Phi0 is
  >   supposed to depend upon some number of applications of eval_s in
  >   some way.
  >
  > I posted Phi (not Phi0) again in the post that Will just responded
  > to.  And yes, Phi depends on V.  
  
  You haven't shown that V is well-defined.  

That's mostly false.  I've twice given good sketches of functions like
V using standard reduction.  As Will pointed out, I didn't completely
define V, nor my nonrecursive R.

  There is no reason to believe, therefore, that Phi is well-defined,
  nor that anything derived from Phi is well-defined.

That's not cutting me much slack, Joe :)  But the heart of the matter
isn't in defining V, but in the meaning of 3.13.
0
richter1974 (312)
7/7/2004 2:07:08 AM
richter@math.northwestern.edu (Bill Richter) spews on:

>   No, Bill, you've been leaving out the important part: Showing that
>   the unique functions determined by your f_ij are equal to your EE_i.
>   
> As I keep saying, this follows from the uniqueness of Schmidt's
> 3.13.

Yes, you keep saying that, but no, it doesn't follow.  The f_ij
uniquely determine a function, all right, but that function doesn't
have to coincide with your EE_i.

> But thanks for a substantive objection.  Let's go through this:
> 
> I'm asserting that....
> I mean this is the assumption. In specific cases, we'll have to show...

AAAAAAAAAAAAAAIIIIIIIIIIIIIIIIIIIIGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHH!!!!

You're always claiming or asserting.  Claims and assertions are
empty talk, Bill.  If you want to be taken seriously here, you'll
have to prove something.

> The mere fact that
> f_ij are defined in terms of the EE_i does not prove these equations.

Finally, Bill admits the obvious!

Or does he?  He goes on to say:

> But given my f_ij, the existence part of 3.13 produces (by structural
> induction) functions BB_i satisfying the equations
> 
> BB_i( Option_ij ) = f_ij ( BB_ij1(S_ij1),..., BB_ijk(S_ijk) )
> 
> So you believe the uniqueness part of 3.13, right?  But uniqueness
> implies that BB_i = EE_i, since we have 2 sets of solutions to the
> same equations, and there is a unique solution to these equations.

No, Bill, you haven't shown that your EE_i are a solution to those
equations.  By your logic, all functions would be equal.  Fortunately,
most mathematicians reject your logic.
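A concrete illustration: take the toy grammar N ::= zero | (succ N)
with the equations

EE( zero ) = 0
EE( (succ N) ) = 1 + EE( N )

The everywhere-zero function satisfies the first equation but not the
second, so the uniqueness of the solution tells you nothing about it.
Before you may invoke uniqueness for your EE_i, you must prove that
your EE_i satisfy the equations.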

> Are you saying that you don't believe that's what uniqueness means?

That's absolutely right, Bill.  The uniqueness of the functions whose
existence is asserted by Schmidt's Theorem 3.13 does not imply that
all functions are equal to them.  In particular, it does not imply that
your EE_i coincide with them.

> 2) Even if Schmidt wrote what I just speculated you thought, we easily
> prove that BB_i = EE_i by structural induction.

No, Bill, it isn't easy.  If it were easy, you would do it, instead
of blathering about how easy it would be for you to do it.

> I think you got it backward, Will.  I've shown the math here a ton of
> respect.  But I think you fell into a trap I've fallen in repeatedly:
> Somebody I don't think is much of a mathematician says something I
> didn't know, and I jump on it, "That's false, that's nonsense!"  Often
> it happens that they were right.  But it's really hard to judge
> mathematical strength, especially with folks you don't know.

You've been demonstrating that behavior for more than two years now.

No, you aren't showing any respect for mathematics when you repeatedly
make empty claims and ignore another mathematician's demand for a real
proof.

I'm on your case because I am disgusted by your evident belief that
you can con this newsgroup by flashing your credentials, giving proofs
by intimidation or incessant blather, and repeatedly claiming theorems
you don't know how to prove.

>   > Right, I goofed.  But I've done so before.  You've seen this
>   > before, and you & Joe predicted I couldn't define E independently
>   > of Phi:
>   > 
>   > E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store)
>   
>   This equality is either a definition or a claimed theorem.  
> 
> It's neither, it's a claim,

So it's a claimed non-theorem?

> but it will become a theorem shortly after
> I completely define E.

AAAAAAAAAAAAAAIIIIIIIIIIIIIIIIIGGGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHHH!!!!!

> As I said above, I not only think
> you can define E & R, I think you can prove the theorem
> 
> E[{f a}] = Phi(E[f], E[a]) in (Env x Store -> Value x Store)
> 
> for my already-defined-in-terms-of-E Phi.

No, I haven't proved that theorem, but I have proved theorems similar
to it.  That's how I know what it takes to prove a theorem like that.
That's how I know that your claim to avoid the complexity of the usual
denotational definitions was completely bogus.

>   The real question is whether you can prove that the E you define via
>   reduction is equal to the E' that, by Schmidt's Theorem 3.13, is
>   uniquely determined by the f_ij that you define using E.
> 
> OK, but that breaks down into 2 parts:
> 
> 1) my proof that Bill-compositionality = Will-compositionality

Why don't you just ignore this part, so you can get on with the
important part, which is:

> 2) a proof that my E define via reduction satisfies the equations
>    defined by the f_ij, that is Schmidt's equations above
> 
> EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )
> 
> We can't do it with just one E function, we'll need a few EE_i.

> Now I haven't gone ahead with (2) since we're still stuck on (1).

That makes a nice excuse, Bill, but it's nothing more than an excuse.
Forget (1).  Get on with (2).

> Once we agree on (1), then you can prove (2) yourself, faster than I
> can, and of course the first thing is to really define E & R.

Of course I could prove (2) faster than you.  My proof would involve
only a finite number of posts to comp.lang.scheme.

> However, if we don't believe (1), then I can't possibly succeed!

Why say "if A, then B" when you already know B?

>   You've been using the same names for E and E', thereby assuming at
>   the outset that they are equal.  
> 
> I thought I was quite clear above, distinguishing between EE_i and
> BB_i.  I've always been that clear in Schmidt 3.13 discussions.

You weren't as unclear in your most recent post, but you clearly used
E to refer to both functions in your previous post, and you have done
that sort of thing quite often throughout this entire thread.

>    The mathematical term for that is "cheating".  You aren't allowed
>   to use notation that assumes the theorem you haven't yet proved.
> 
> I don't think I did that,

Of course you don't.  You have a PhD in mathematics, and have published
several papers, and some of your colleagues actually agree with your
opinion that you're a pretty good mathematician, but you don't seem to
know the difference between cheating and real mathematics.  What gives?

> A lot of the problems in these DS
> discussions come from folks just not believing how powerful the ZFC
> axioms are.  I wasn't thinking of you at all here, Will.

You're real good at talking about everything but the subject at hand,
aren't you?  According to Google Groups, you have referred to ZFC in
46 of your posts to this newsgroup.

You appear completely ignorant of other axiomatizations of set theory.
You don't understand Tarski's denotational semantics of first order
logic, and had probably never heard of it.  Maybe it isn't such a good
idea for you to try to intimidate us with your knowledge of logic and
set theory.

>   In fact, you have not yet defined a complete set of f_ij for any
>   language that can express iteration or recursion.  
> 
> Absolutely!  
> 
>    You asked me what would be a reasonable subset of Scheme.  I
>   suggest you start with the lambda calculus subset plus numeric
>   constants.
> 
> Great!  I'll get working on this as we continue to work on 3.13.

AAAAAAAAAAAAAAAIIIIIIIIIIIIIIIIIIIIGGGGGGGGGGGGGGGGGGGHHHHHHHHHHHHH!

> I
> need clarification: I think this means a version of F-F's ISWIM on p
> 39 <http://www.ccs.neu.edu/course/com3357/mono.ps>.

That would be fine.  Joe Marshall's language would be fine.  Just
pick one and get on with it.

> So it's LC_v with
> some "basic constants", i.e. numeric constants, and should we throw in
> some of F-F's "primitive operation"?  How about +, -,*,/ and F-F's
> primitive op `iszero', which defines if0?

Those functions can be the values of variables in your initial
environment, so you don't need them as syntax.  If "Bill-semantics"
requires an empty initial environment, however, then by all means
put them in as syntax.
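For instance (a sketch, with a hypothetical association-list
representation of environments):

(define initial-environment
  (list (cons '+ +)
        (cons '- -)
        (cons '* *)
        (cons '/ /)
        (cons 'iszero (lambda (n) (= n 0)))))

;; the lookup used by V [[ id ]](env); no error handling
(define (lookup id env)
  (cdr (assq id env)))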

>   Once you have used reduction to define your E for that language, you
>   should then give a clear statement of the f_ij that you want to use
>   in Theorem 3.13.
> 
> I'll do so!

AAAAAAAAAAAAAAAIIIIIIIIIIIIIIIIIIGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHHHHH!!!!!

>   
>   Then, and only then, can you prove that the unique functions whose
>   existence is asserted by Theorem 3.13 coincide with the E that you
>   defined by reduction.
> 
> Right.

We have agreement on Bill's goal!

>   That will not be a trivial proof.  I predict that you will be unable
>   to complete that proof without formulating and using a compositional
>   semantic function that is defined in the usual denotational way,
>   without using your reduction semantics.  If my prediction is
>   correct, then it will be obvious that your claim to have avoided the
>   complexity of the usual denotational definitions was bogus.
> 
> This is why we're still stuck on the definition of compositionality.

You would like to be stuck on that definition, because it's the excuse
you're using to avoid having to prove something.

> If we believe that compositionality means exactly what you think it
> means, then it's indeed impossible for me to complete my task.  So we
> need to first straighten out the only hard part of the proof, which is
> 
> Bill-compositionality = Will-compositionality 
> 
> proved by the uniqueness of Schmidt's 3.13.  I'll go ahead on the
> small subset, but I cannot complete the proof until we settle this.

For the sake of our argument, I am perfectly willing to pretend that
Bill-compositionality = Will-compositionality.  Please get on with
the proof.

That takes care of what you said was the only hard part of the proof,
so your next post should contain your Bill-semantics for some toy
programming language, and a complete proof that this Bill-semantics
is Bill-compositional.

Will
0
cesuraSPAM (401)
7/7/2004 1:37:06 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <prunesquallor@comcast.net> responded to me:
>   
>   >   Now here's where I use domains.  I allow functions to operate on
>   >   functions, so the domain of expressible values includes
>   >   functions over that domain.
>   >
>   > Sure, but F-F doesn't, ...
>   
>   Where did you get that impression?  Flatt and Felleisen first
>   introduce lambda arguments on page 25, develop recursion as
>   self-application on page 31, and use lambda expressions as arguments
>   on the very first page of their introduction to ISWIM.
>
> By domains I thought you meant Scott models of LC, so that you could
> do P = (P -> P), and in particular have Procedures act on Values, even
> though Procedures is a subset of Values.  Did I misunderstand you?

No.  You understand correctly.  A lambda form in my tiny language
denotes a function in Scheme; this is part of the value space, and
since functions may operate on functions, I need to rely on domains.

>   > That's a lot of nice Scheme code you wrote!  I wasn't clear what the
>   > purpose was.  The issues we've been on (compositionality, Halting
>   > Problem total functions) aren't settled by actual computations :^0
>   
>   Many of the issues we've been on *are* settled by actual computations.
>   If the computer operates on an expression and produces a result, then
>   clearly that particular instance of the problem has a well-defined
>   solution.  The rigor required to make a program run is very high and
>   does not admit sloppy formulations.  The program can easily be used as
>   a starting point for proofs.
>
> Yeah, sure.  It won't settle non-halting issues, though.  But it's
> good to have code.
>   
>   For example, my definition of curly-e is itself virtually a proof that
>   curly-e is compositional (the remainder of the proof is in the fact
>   that no function on which curly-e relies itself calls curly-e).
>   
>   (define (curly-e expression)
>     (cond ((numeral? expression)     (curly-e-numeral expression))
>           ((variable? expression)    (curly-e-variable expression))
>           ((succ? expression)        (curly-e-succ
>                                        (curly-e (succ-arg expression))))
>   
>           ((ifequal? expression) (curly-e-ifequal
>                                        (curly-e (ifequal-left   expression))
>                                        (curly-e (ifequal-right   expression))
>                                        (curly-e (ifequal-consequent  expression))
>                                        (curly-e (ifequal-alternative expression))))
>   
>           ((lambda? expression) (curly-e-lambda
>                                  (lambda-variable expression)
>                                  (curly-e (lambda-body  expression))))
>   
>           ((call? expression) (curly-e-call
>                                (curly-e (call-operator expression))
>                                (curly-e (call-argument expression))))
>   
>           (else (error "Illegal expression." expression))))
>   
>   If you can write the equations for `Bill Semantics' in the form
>   above, then no one will argue whether your definition is
>   compositional.  If you cannot, then there is more than sufficient
>   reason to doubt it.
>
> I'm going to write down equations like that.  I like that, "BS" :) But
> I won't define curly-e that way.  

Then you won't have a denotational semantics.

> Just like F-F's eval_s is not defined in the form of your
> curly-e.

They do not have a denotational semantics, either.  Theirs is
operational.

> It's a theorem that F-F's eval_s satisfies such equations.  But
> people may indeed argue.

That is not one of their theorems.  Their eval_s is defined in terms
of standard reduction.  Standard reduction is not compositional.
0
jrm (1310)
7/7/2004 1:58:50 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>   
>   > Compositionality doesn't depend on whether the BB_i are defined
>   > before the f_ij are defined, or afterward, by structural
>   > induction.  The proof of this is immediate from the uniqueness of
>   > Schmidt's Thm 3.13.
>   
>   It depends on the BB_ijl(S_ijl) being independent of f_ij, and f_ij
>   being a function.  This is where the problem lies.
>
> Only if you're defining the BB_i by structural induction, Joe.   

That is the point of Theorem 3.13

> But let's move on to your Scheme subset:
>
>   > I need to unconflate Env & Store.  Interestingly enough, Dana Scott
>   > attributes this idea to himself in his intro to Stoy's book!  I want
>   >
>   > E: Expressions ---> (Env x Store -> Value x Store)
>   
>   In the tiny language I defined, there are no operations whatsoever
>   on the store.  Without side effects, there is no need to model the
>   store.
>
> But let's toss Env as well!  

Tossing the environment makes the expressions more complicated because
you have to perform substitution rather than lookup.  Keeping the
environment is a pragmatic choice.
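The two styles side by side, as a sketch (subst is an assumed
capture-avoiding substitution; the closure accessors, eval-in-env,
and eval-closed are hypothetical helpers):

;; environment style: extend the environment, look variables up later
(define (apply-closure clo arg)
  (eval-in-env (closure-body clo)
               (extend-env (closure-var clo) arg (closure-env clo))))

;; substitution style: rewrite the body, then evaluate it closed
(define (apply-lambda lam arg)
  (eval-closed (subst (lambda-body lam) (lambda-var lam) arg)))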

>   > But IMO it's premature to get into this, as we're still stuck on the
>   > definition of compositionality, and perhaps we should consider Pure
>   > Scheme first, i.e. F-F's LC_v, and then it's even simpler:
>   >
>   > E: Expressions --->  Value
>   >
>   >   using your definition of compositionality we can add function calls
>   >   and conditionals:
>   >   
>   >   V [[ (<exp0> <exp1>) ]](env) 
>   >   = 
>   >   Phi0 (V [[<exp0>]](env), V [[<exp1>]](env))
>   >
>   > As I posted earlier, my Phi doesn't reduce to your
>   >
>   > Phi0: Value x Value ---> Value
>   
>   Phi0 is simply a name for one of the f_ij.  
>
> That's a serious misunderstanding, and we won't be able to talk about
> State until you fix this.  

That is a typo.  The correct equations are these:

V [[ numeral ]](env) = number

V [[ id ]](env) = lookup (id, env)

V [[ (<exp0> <exp1>) ]](env) = Phi0 (V [[<exp0>]], V [[<exp1>]])(env)

V [[ (if <exp0> <exp1> <exp2>) ]](env) 
    = Phi1 (V [[<exp0>]], V [[<exp1>]], V [[<exp2>]])(env)

and if LAMBDA is to be compositional, it ought to be in this form:

V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)

and if it is to be a primitive, then it is this:

V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)
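As Scheme sketches (apply-value and true-value? are assumed helpers),
Phi0 and Phi1 consume whole meanings and hand them the environment
themselves:

(define (Phi0 m0 m1)                  ; m0, m1 : Env -> Value
  (lambda (env)
    (apply-value (m0 env) (m1 env))))

(define (Phi1 m0 m1 m2)               ; test, consequent, alternative
  (lambda (env)
    (if (true-value? (m0 env))
        (m1 env)
        (m2 env))))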

> Suppose we define V in Shriram's form, although I can't work this way:
>
> V: Expressions ---> (Env  -> Value)
>   
> Then we need a function Phi satisfying 
>
> V [[ (<exp0> <exp1>) ]] = Phi (V [[<exp0>]], V [[<exp1>]]) 
>
> That of course means that
>
> V [[ (<exp0> <exp1>) ]](env) = Phi (V [[<exp0>]], V [[<exp1>]])(env)
>
> for all env in Env.  It does not mean there's a Phi0 s.t. 
>
> Phi (V [[<exp0>]], V [[<exp1>]])(env) 
> = 
> Phi0 (V [[<exp0>]](env), V [[<exp1>]](env)) 
>
> or more simply, for alpha, beta: Env -> Value
>
> Phi (alpha, beta)(env) = Phi0 (alpha(env), beta(env))
>
> for a function Phi0: Value x Value ---> Value.
>
> You know you can't do this, right?  We have to evaluate <exp1> in the
> environment grabbed by V [[<exp0>]](env).  We don't evaluate <exp1> in
> the current environment.

Yes.  Corrected above.

>   >   If LAMBDA is to be compositional, it ought to be in this form:
>   >   
>   >   V [[ (lambda <var> <exp0>) ]](env) = Phi2 (V [[<exp0>]])(env)
>   >   
>   > you know that won't work!  There's no <var> on the RHS!  
>   
>   I was focusing on the requirements for composition, so I neglected
>   to put in the var.  The equation should read this:
>   
>      V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)
>   
> But that's not true either, right?  I still contend that you're
> pointing out that syntax/special-form require special treatment from
> compositionality.   What you really want is 
>
> V [[ (lambda <var> <exp0>) ]] = Phi2 (<var>, <exp0>)

No, and this is important.  For it to be compositional, the equation
must be of this form:

  V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)

(double checking carefully for typos.... don't see any)
And this can be seen in the Scheme code:

  ((lambda? expression) (curly-e-lambda
                          (lambda-variable expression)
                          (curly-e (lambda-body  expression))))

> and you can fit that under Schmidt's 3.13 umbrella.

Yes.

>   I'm only interested in composition as in Schmidt's Theorem 3.13
>   This definition is compositional: V [[ (lambda <var> <exp0>) ]](env)
>   = Phi2 (<var>, V [[<exp0>]])(env)
>
> sort of, see above, but you're right in spirit.
>   
>   this one is not:
>      V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)   
>
> No, this is exactly the argument I've been having with you guys for 2+
> years. It now looks like we're getting to a resolution.  You're right
> in that the RHS is not part of a structural induction definition.  But
> I'm not defining V by structural induction.  It turns out that
> Schmidt's 3.13 says that's OK.

No, it gives a justification for defining a recursive function on a
recursive structure.  If you aren't going to do that, there is no need
to invoke Schmidt's theorem.

But if you are not going to do that, you are going to have a problem.
The point of V is to assign meaning to expressions.  Your tuple has an
expression within it, but no meaning is attached to that expression.
If you assign a meaning, you have to prove that the assignment is
well-defined.

>   > But it will take me a while to prove it.  We have to first settle
>   > on a definition of compositionality, and you have to understand
>   > that Phi need not reduce to a Phi0.
>   
>   I believe that you wish to use Schmidt's Theorem 3.13 to define
>   compositionality.  
>
> Yes.
>
>   As I said before, if your Phi does not reduce to Phi0, then you do
>   not satisfy the requirements for applying Schmidt's Theorem 3.13 and
>   your equations are not compositional.
>
> Nah, that's just a dumb error.  I say dumb because I know you're
> really good at functions that take functions as arguments.

I corrected my typo above.

>   > But Schmidt's statement is flexible enough to
>   > avoid this, as I said above.
>   >   
>   >   This more or less completes the case analysis, we now need
>   >   definitions of Phi1, Phi2, and Phi3.  
>   >
>   > I don't want to try to define them.  
>   
>   Then you have not proven that Schmidt's Theorem 3.13 applies and
>   there is no reason to believe that your equations are compositional.
>
> No, you're only reading the proof of the existence part of 3.13.

No, I'm reading the statement of the theorem, in particular this
phrase:
     ... f_ij(BB_ij1(S_ij1), BB_ij2(S_ij2), ..., BB_ijk(S_ijk))

   where f_ij is a function of functionality 
     D_ij1 x D_ij2 x ... x D_ijk -> D_i

As I mentioned before (despite the typos), Phi0, Phi1, Phi2, and Phi3
are simply names for f_ij.
   
> There's no structural induction in the statement of Schmidt's 3.13.

No.  Schmidt calls upon structural induction for the proof that the
function so defined is well-defined.

>   > Another problem with Shriram's setup is that evaluating
>   > expressions can change the environment, even with just define.  So
>   > you'd really want
>   >
>   > V: Expressions -> (Env -> Value x Env)
>   
>   This is why my tiny language has no DEFINE.
>
> Oh, then we must use the LC_v setup of 
>
> V: Expressions -> Value
>
> At least at 1st.  Later to hook up with your code, maybe we should
> ramp up to (Env -> Value).
>   
>   >   This is where I am having the difficulty.  I recall that Phi0 is
>   >   supposed to depend upon some number of applications of eval_s in
>   >   some way.
>   >
>   > I posted Phi (not Phi0) again in the post that Will just responded
>   > to.  And yes, Phi depends on V.  
>   
>   You haven't shown that V is well-defined.  
>
> That's mostly false.

`Mostly' false?

> I've twice given good sketches of functions like V using standard
> reduction.  As Will pointed out, I didn't completely define V, nor
> my nonrecursive R.

Yes.  That is a problem.

>   There is no reason to believe, therefore, that Phi is well-defined,
>   nor that anything derived from Phi is well-defined.
>
> That's not cutting me much slack, Joe :)  But the heart of the matter
> isn't in defining V, but the meaning of 3.13.

No, it is in defining V.  Theorem 3.13 is fairly transparent in its
meaning.
0
jrm (1310)
7/7/2004 2:48:31 PM
I just now got around to reading this.  It is the closest Bill has
yet come to giving a concrete definition that would allow us to
show why his Bill-semantics is not Bill-compositional.

richter@math.northwestern.edu (Bill Richter) quoting Joe Marshall:

>      V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)
>   
> But that's not true either, right?  I still contend that you're
> pointing out that syntax/special-form require special treatment from
> compositionality.   What you really want is 
> 
> V [[ (lambda <var> <exp0>) ]] = Phi2 (<var>, <exp0>)
> 
> and you can fit that under Schmidt's 3.13 umbrella.

No, you can't fit that into Schmidt's Theorem 3.13.  That theorem
says that, if you have a set of functions f_ij, where i ranges over
the nonterminals of the language's grammar and j ranges over the
productions for nonterminal i, and f_ij takes as its arguments the
denotations of the nonterminals on the right hand side of production
j for nonterminal i, and for each i all of the f_ij have the same
range (the denotations appropriate for nonterminal i), then the f_ij
define a unique family of semantic functions that map phrases of
syntactic type i to their denotations.

Bill has the right to choose his domain of denotations for the
<exp> nonterminal, but he only gets to choose one such domain.
If he tries to use a different domain of denotation for expressions
that occur as the body of a lambda expression than for expressions
that occur as the rator or rand of a call, then he won't be able to
use Theorem 3.13.  He'll be stuck.

In the equation above, Bill is using <exp0> itself as the denotation
of <exp0>.  That's fine, but if he wants to use Theorem 3.13, then
it implies that <exp0> must be its own denotation in things like
(<exp0> <exp1>) as well.  Indeed, it means that <exp0> is its own
denotation when it occurs all by itself.  In other words, no two
expressions will have the same denotation unless they are identical
as syntax.

In Bill's operational semantics, however, expressions like 1+2 and
2+1 probably have the same semantics.  Therefore Bill's operational
semantics will not coincide with the denotational semantics whose
existence is asserted by Schmidt's Theorem 3.13.

Let's review this, for the benefit of the logic-challenged:

1.  If Bill uses different denotations for expressions depending
on the context in which they occur, then he's hosed because he can't
use Theorem 3.13.

2.  If Bill does not use different denotations for expressions
depending on the context in which they occur, then he's hosed because
his operational semantics will not coincide with the semantics whose
existence is asserted by Theorem 3.13.

Therefore Bill is hosed.

Will
0
cesuraSPAM (401)
7/7/2004 4:56:19 PM
"Daniel C. Wang" <danwang74@hotmail.com> wrote in message news:<2klgh2F3l9i9U1@uni-berlin.de>...
> Felix Klock wrote:
> {stuff deleted}
> > 
> > function != algorithm.
> > 
> > Just because a function is not computable does not mean it does not
> > exist.
> > 
> > Every program does indeed either halt or not halt.  So a function that
> > is the solution to the halting problem certainly exists; we just can't
> > write down a way to calculate such a function totally.
> 
> I guess, I'm too much of a constructivist. If I can't write it down it 
> doesn't exist. This is a point of philosophy as well as the peculiarities of 
> your formal logic.

Fair enough.  I just got through with a personal conversation with Joe
about this.  I think it boils down to: when one is discussing truth in
the "real world", either one believes in the law of the excluded
middle (e.g. me, Matthias), or one does not (e.g. you, Joe).  If you
don't accept the law of the excluded middle, then there's not much
point in discussing this point further.

> In any case if you are willing to admit this kind of trickery then I'll be 
> surprised to admit that what Bill is doing is "sensible".
> 
> He's just defining the meaning of a term as its canonical form if it exists.
> You can of course easily do this compositionally. It is a compositional 
> semantics of some sort, but the notion of equality or meaning under this 
> system is not very interesting.
> 
> For example the following two terms do not have the same meaning under 
> Bill's semantics
> [[snip]]

I absolutely agree with these statements.  I did not mean for my
response to you to be taken as support for Bill's propositions in
general.  I just thought it was important to establish the distinction
between an algorithm (computable) and an arbitrary function
(potentially non-computable).

I also agree that most of the real numbers are pretty weird (if I
can't describe a way to calculate it to arbitrary precision, then it's
gotta be weird...)
0
pnkfelix (27)
7/7/2004 9:50:41 PM
pnkfelix@gmail.com (Felix Klock) writes:

> "Daniel C. Wang" <danwang74@hotmail.com> wrote in message news:<2klgh2F3l9i9U1@uni-berlin.de>...
> > Felix Klock wrote:
> > {stuff deleted}
> > > 
> > > function != algorithm.
> > > 
> > > Just because a function is not computable does not mean it does not
> > > exist.
> > > 
> > > Every program does indeed either halt or not halt.  So a function that
> > > is the solution to the halting problem certainly exists; we just can't
> > > write down a way to calculate such a function totally.
> > 
> > I guess, I'm too much of a constructivist. If I can't write it down it 
> > doesn't exist. This is a point of philosophy as well as the peculiarities of 
> > your formal logic.
> 
> Fair enough.  I just got through with a personal conversation with Joe
> about this.  I think it boils down to: when one is discussing truth in
> the "real world", either one believes in the law of the excluded
> middle (e.g. me, Matthias), or one does not (e.g. you, Joe).  If you
> don't accept the law of the excluded middle, then there's not much
> point in discussing this point further.

I think philosophically it is rather interesting.  My belief is that
both Dan and Joe -- deep at their hearts -- do believe in the excluded
middle just like you and I (or Hilbert, to give a more authoritative
example).  For example, notice that Dan wrote "if I can't write it
down ...".  Now, how does he know that he cannot write "it" down if he
can't write it down?  How can he assert any properties such as "does
not exist" to it -- if he can't write it down?  In real life, the
constructivist's pov (or any other intuitionistic one, for that
matter) leads to all kinds of awkwardness which, when pursued with
conviction, would make normal discourse almost impossible.  (As the
example demonstrates, a hard-core constructivist would have a hard
time talking about constructivism without violating his own
principles.)

The formal aspects of intuitionistic logic are, indeed, more
interesting -- and practical -- because of how they relate to
computation.  Obviously, I would not recommend rejecting their study.
0
find19 (1244)
7/7/2004 10:15:21 PM
Matthias Blume <find@my.address.elsewhere> writes:
> If it takes more than a finite number of steps, then it falls under
> the "does not halt" category.
> 
>     "halts"         = reaches stop state after finite number of steps
>     "does not halt" =  everything else
> 
> There is no third possibility.

"do not know if it halts or not" is a quite practical possibility, since the
difference between "finite" and "infinite" cannot actually be measured in
general.

In general, we can only ultimately know if an algorithm is in the "halts"
category if we actually execute it and witness it halting.

If it hasn't stopped yet by step N, we still don't know if it halts or not.
It might at N+1, in which case it is definitely in the "halts" category.

But if we weren't willing to wait till N+1, then all we can say is "don't know".

On the other hand, if your categories are "halts by step N" and "does not halt
by step N", then those are quite clearly the only two possibilities.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYblaaK@STRIPCAPStelus.net                    The Rhythm has my soul.
0
rAYblaaK (362)
7/7/2004 10:33:01 PM
Matthias Blume wrote:
{stuff deleted}
> In real life, the
> constructivist's pov (or any other intuitionistic one, for that
> matter) leads to all kinds of awkwardness which, when pursued with
> conviction, would make normal discourse almost impossible.  (As the
> example demonstrates, a hard-core constructivist would have a heard
> time talking about constructivism without violating his own
> principles.)

In mathematics, the non-constructive POV can lead to awkwardness such as the
Banach-Tarski paradox
   http://en.wikipedia.org/wiki/Banach-Tarski_paradox

I think these awkwardnesses reflect imperfections in our understanding of
things, so neither viewpoint is *right*. Both viewpoints entail different
sorts of baggage.

Seriously, there really are lots of awkward things going on in classical
mathematics... but they are familiar to those in the know, so they don't seem
so awkward.

I personally am pragmatic about these issues. My day job as a FPCC hacker 
requires me to use a constructive meta-logic (LF) to define a
classical logic (HOL) and use that to define yet another constructive object
logic (our type system). One can understand why I'm sometimes confused
about what Truth is all about :)


0
danwang742 (171)
7/7/2004 10:46:57 PM
Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> > If it takes more than a finite number of steps, then it falls under
> > the "does not halt" category.
> > 
> >     "halts"         = reaches stop state after finite number of steps
> >     "does not halt" =  everything else
> > 
> > There is no third possibility.
> 
> "do not know if it halts or not" is a quite practical possibility, since the
> difference between "finite" and "infinite" cannot actually be measured in
> general.

But it is not a possibility that exists in addition to the other two.

> In general, we can only ultimately know if an algorithm is in the "halts"
> category if we actually execute it and witness it halting.

I am not talking about whether or not we /know/ if the program halts.
I am talking about whether it /does/.
0
find19 (1244)
7/7/2004 10:59:39 PM
Matthias Blume <find@my.address.elsewhere> writes:
> Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:
> > Matthias Blume <find@my.address.elsewhere> writes:
> > >     "halts"         = reaches stop state after finite number of steps
> > >     "does not halt" =  everything else
> > > 
> > > There is no third possibility.
> > 
> > "do not know if it halts or not" is a quite practical possibility, since the
> > difference between "finite" and "infinite" cannot actually be measured in
> > general.
> 
> But it is not a possibility that exists in addition to the other two.
> 
> > In general, we can only ultimately know if an algorithm is in the "halts"
> > category if we actually execute it and witness it halting.
> 
> I am not talking about whether or not we /know/ if the program halts.
> I am talking about whether it /does/.

Even if a program is /really/ always in one of the "halts" or "not halts"
category and no other, that fact does not really help us.

Until we have the knowledge, we cannot make practical use of the fact.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYblaaK@STRIPCAPStelus.net                    The Rhythm has my soul.
0
rAYblaaK (362)
7/8/2004 12:11:30 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

   Bill Richter responded to Joe Marshall:
   
   > I still contend you're pointing out that syntax/special-form
   > require special treatment from compositionality.  What we want is
   > 
   > V [[ (lambda <var> <exp0>) ]] = Phi2 (<var>, <exp0>)
   > 
   > and you can fit that under Schmidt's 3.13 umbrella.
   
   No, you can't fit that into Schmidt's Theorem 3.13.  
   [...] 
   In the equation above, Bill is using <exp0> itself as the
   denotation of <exp0>.  That's fine, but if he wants to use Theorem
   3.13, then it implies that <exp0> must be its own denotation in
   things like (<exp0> <exp1>) as well.  Indeed, it means that <exp0>
   is its own denotation when it occurs all by itself.  In other
   words, no two expressions will have the same denotation unless they
   are identical as syntax.

No, Will, but thanks for the coherent mathematical objection!  I'll be
able to slip out of this, as I explained earlier to Joe:

First, lambda is a definite problem for compositionality, because it's
syntax, as Scheme says, or a Lisp special form.  A quick reading of
Schmidt's 3.13 would say anyone was hosed, because we need

V [ (lambda <var> <exp0>) ] = Phi2 (V [ <var> ], V [ <exp0> ])

I don't think there's any Phi2!  I don't think Scheme works that way.
We're not supposed to evaluate either <var> or <exp0>.  

No Will, if you know how to solve this problem in your Scott/CPO setup,
that's fine.  But we don't have to solve it, by Schmidt's 3.13:

We've got 2 syntactic & semantic domains (at least) 

M in Expr = B_1 -E--> D_1 = (U x S -> Value x S)

x in Var = B_2 -V--> D_2

To use 3.13, we have to write this in BNF form 

M ::= Option_11 | Option_12 | ....

x ::= v_1 | v_2 | v_3 | ...

where the v_i's run through our entire list of variables.  Each
variable gets its own f_2j: That is, by 3.13 we must have

f_2j = V(v_j) in D_2 

That's how Schmidt avoids mentioning terminal symbols in 3.13.

So I would be hosed if I took one of the options for M to be

Option_1j = (lambda x M)

;; just for simplicity let's assume that the body of a lambda is just
;; one expression, and not a sequence of expressions.

Then I would need a function f_1j: D_2 x D_1 ---> D_1 s.t. 

E[ (lambda (x) M) ] = f_1j( V[x], E[M] )

And I don't know how to do that.  (I don't think R5RS DS does it
either.)  So I'll define 2 new syntactic & semantic domains:

P in Syntax-Expr  = B_6 -identity--> Syntax-Expr

y in Syntax-Var  = B_7 -identity--> Syntax-Var

And I won't give any options for either P or y.  I'll give a list of
terminal symbols.  

P  ::= exp_1 | exp_2 | ... 

y ::= v_1 | v_2 | v_3 | ...

For y, that's easy, because it's the same list of terminals we used
for Var.   Now 3.13 says I can do this: 

M ::= Option_11 | ... | (lambda (y)  P) | ...

and now I need a function f_67: D_6 x D_7 ---> D_1, and that's my Phi2
above, and it's really simple:

E[ (lambda (y) P) ] = f_67( y, P ) in (U x S -> Value x S)

which is obviously defined as 

f_67( y, P )(rho, sigma) = (<y, P, rho>, sigma) 

So I indeed fit the syntax lambda under the 3.13 umbrella.
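As a Scheme sketch, with the triple <y, P, rho> represented as a
tagged list and the (Value x S) result as a pair:

(define (f67 y p)
  ;; y and p are syntax, serving as their own denotations
  (lambda (rho sigma)
    (cons (list 'closure y p rho) sigma)))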
   
   In Bill's operational semantics, however, expressions like 1+2 and
   2+1 probably have the same semantics.  

I guess that's right:

E[(+ 1 2)](rho, sigma) = (3, sigma) in Value x S

E[(+ 2 1)](rho, sigma) = (3, sigma) in Value x S
   
   Let's review this, for the benefit of the logic-challenged:
   
   1.  If Bill uses different denotations for expressions depending
   on the context in which they occur, then he's hosed because he can't
   use Theorem 3.13.

No, but I would be hosed if I didn't define extra copies of the
variable & expression syntactic domains (with identical semantic
domains).  It's a good objection.
   
   2.  If Bill does not use different denotations for expressions
   depending on the context in which they occur, then he's hosed because
   his operational semantics will not coincide with the semantics whose
   existence is asserted by Theorem 3.13.

I see, yeah, by the objection you made at the top, I'd be stuck with
an equation I can't make sense of:
   
V [ (lambda (x) M) ] = Phi2( V[ x ], V[ M ] )

   Therefore Bill is hosed.

No, therefore Bill read 3.13 more carefully than Will did :)

This a reasonably good time for a slogan I've been waiting to deliver:

Being good at Math is much the same as being good at programming.
It's much more about debugging code than writing flawless code!

So you and others have been wrong for a long time about fundamental
induction issues.  I don't draw any negative conclusions from that.
It's how you deal with these issues that makes an impression on me.
There's certainly plenty of Math I don't know myself.  I think you
(Will) know a lot of Math that I don't know!  As long as you & the
others fight through the Math, you make a good impression on me.  I
salute you, Will, for fighting well in this post.
0
richter1974 (312)
7/8/2004 12:20:06 AM
William D Clinger responded to me with a lot of disrespect, but also
real mathematical progress:
   
   > I'm asserting that....  I mean this is the assumption. In
   > specific cases, we'll have to show...
   
   AAAAAAAAAAAAAAIIIIIIIIIIIIIIIIIIIIGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHH!!!!
   
   You're always claiming or asserting. 

I think you misunderstood, Will.  The assumption you choked on is the
hypothesis of Bill-compositionality.  I know I haven't proven the
hypothesis holds yet.  Will-compositionality has a much simpler
hypothesis: we assume we have domains B_i, D_i & function f_ij.

   >   The real question is whether you can prove that the E you
   >   define via reduction is equal to the E' that, by Schmidt's
   >   Theorem 3.13, is uniquely determined by the f_ij that you
   >   define using E.
   > 
   > OK, but that breaks down into 2 parts:
   > 
   > 1) my proof that Bill-compositionality = Will-compositionality
   
   Why don't you just ignore this part, 

OK, sure, and we'll come back to it later.  That's great!

   so you can get on with the important part, which is:
   
   > 2) a proof that my E define via reduction satisfies the equations
   >    defined by the f_ij, that is Schmidt's equations above
   > 
   > EE_i( Option_ij ) = f_ij ( EE_ij1(S_ij1),..., EE_ijk(S_ijk) )
   > 
   > We can't do it with just one E function, we'll need a few EE_i.
    
   > Now I haven't gone ahead with (2) since we're still stuck on (1).
   
   [...] Forget (1).  Get on with (2).

Excellent.  I'll post tonight for your small Scheme subset!
   
   > Once we agree on (1), then you can prove (2) yourself, faster than I
   > can, and of course the first thing is to really define E & R.
   
   You appear completely ignorant of other axiomatizations of set
   theory.  

Sure!  I'm proud to understand ZFC as well as I do.  I'm not a set
theorist, I'm an algebraic topologist.  So I'm like a plumber who
understands something about the Navier-Stokes equations.

   You don't understand Tarski's denotational semantics of first order
   logic, and had probably never heard of it.  

Yes & almost yes.  I requested a reference earlier.  That was a great
suggestion you made, that we study Tarski's DS before Programming DS.
   
   > I need clarification: I think this means a version of F-F's ISWIM
   > on p 39 <http://www.ccs.neu.edu/course/com3357/mono.ps>.
   
   That would be fine.  Joe Marshall's language would be fine.  Just
   pick one and get on with it.

I'll post tonight.  Great.   I figured it out last night.
   
   > So it's LC_v with some "basic constants", i.e. numeric constants,
   > and should we throw in some of F-F's "primitive operations"?  How
   > about +, -, *, / and F-F's primitive op `iszero', which defines
   > if0?
   
   Those functions can be the values of variables in your initial
   environment, so you don't need them as syntax.  

That would be fine too, but that's not F-F's way.  I guess that's
right, F-F have an empty initial env.  In fact, they never have any
environments at all!  Huh, thanks.  I never understood their basic
constant & primitive op biz in that light before...
   
   >   Then, and only then, can you prove that the unique functions
   >   whose existence is asserted by Theorem 3.13 coincide with the E
   >   that you defined by reduction.
   > 
   > Right.
   
   We have agreement on Bill's goal!

That's half the goal, and we'll put off the other half until later.
   
   > If we believe that compositionality means exactly what you think
   > it means, then it's indeed impossible for me to complete my task.
   > So we need to first straighten out the only hard part of the proof,
   > which is
   > 
   > Bill-compositionality = Will-compositionality 
   > 
   > proved by the uniqueness of Schmidt's 3.13.  I'll go ahead on the
   > small subset, but I can not complete the proof until we settle
   > this.
   
   For the sake of our argument, I am perfectly willing to pretend that
   Bill-compositionality = Will-compositionality.  Please get on with
   the proof.
   
   That takes care of what you said was the only hard part of the
   proof, so your next post should contain your Bill-semantics for
   some toy programming language, and a complete proof that this
   Bill-semantics is Bill-compositional.
   
That's right!  I feel great.  2 years ago, we did not get this far.
0
richter1974 (312)
7/8/2004 12:55:39 AM
Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> > Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:
> > > Matthias Blume <find@my.address.elsewhere> writes:
> > > >     "halts"         = reaches stop state after finite number of steps
> > > >     "does not halt" =  everything else
> > > > 
> > > > There is no third possibility.
> > > 
> > > "do not know if it halts or not" is a quite practical possibility, since the
> > > difference between "finite" and "infinite" cannot actually be measured in
> > > general.
> > 
> > But it is not a possibility that exists in addition to the other two.
> > 
> > > In general, we can only ultimately know if an algorithm is in the "halts"
> > > category if we actually execute it and witness it halting.
> > 
> > I am not talking about whether or not we /know/ if the program halts.
> > I am talking about whether it /does/.
> 
> Even if a program is /really/ always in one of the "halts" or "not halts"
> category and no other, that fact does not really help us.

No conditional necessary.  It *really* always is, simply because
"does" and "does not" are defined to refer to complementary events.
(Notice that I am arguing semantically here.  Your favorite logic
might not be able to establish the truth of either.  Lacking the
excluded middle it might also not be able to establish the disjunction
of the two.  But that's a problem with -- or a property of -- the
logic, not of its interpretation.)

> Until we have the knowledge, we cannot make practical use of the fact.

People make use of such facts all the time.  Ever heard of indirect
proofs?

Matthias
0
find19 (1244)
7/8/2004 2:01:30 AM
Joe Marshall <jrm@ccs.neu.edu> responded to me:
   
   > That's a serious misunderstanding, and we won't be able to talk
   > about State until you fix this.
   
   That is a typo.  The correct equations are these:
   
   V [[ numeral ]](env) = number
   
   V [[ id ]](env) = lookup (id, env)
   
   V [[ (<exp0> <exp1>) ]](env) = Phi0 (V [[<exp0>]], V [[<exp1>]])(env)
   
Great, Joe!  Sorry to be abusive, but I carped earlier, and when you
didn't fix the typo, I thought we had serious problems.  We're writing
so much, it's easy for stuff to get lost. :^0 That's the same as

   V [[ (<exp0> <exp1>) ]]= Phi0 (V [[<exp0>]], V [[<exp1>]])

and let's recall that 

V: Expr ---> (Env -> Value)

Phi0: (Env -> Value) x (Env -> Value) ---> (Env -> Value)
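
In Scheme, with environments as functions from identifiers to values,
this Phi0 for the application case is the familiar one.  (A sketch
only; the names phi0, denot0 & denot1 are mine, not Joe's.)

(define (phi0 denot0 denot1)
  ;; Phi0 takes two denotations of type (Env -> Value) and returns a
  ;; third: evaluate both subexpressions' denotations in the same
  ;; environment, then apply the first result to the second.
  (lambda (env)
    ((denot0 env) (denot1 env))))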

   V [[ (if <exp0> <exp1> <exp2>) ]](env) 
       = Phi1 (V [[<exp0>]], V [[<exp1>]], V [[<exp2>]])(env)
   
   and if LAMBDA is to be compositional, it ought to be in this form:
   
   V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)
   
Why do you say that?  Why don't you say, as I thought Will did, that
compositionality demands

V[ (lambda (x) M) ] = Phi2( V[x], V[M] )

I explained to Will, and you earlier, that Schmidt's 3.13 umbrella is
wide enough for your formulation & mine too:

   V [[ (lambda <var> <exp0>) ]]= Phi2 (<var>, <exp0>)

     if it is to be a primitive, then it is this:
   
   V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)
   
where's the lingo coming from, primitive vs compositional?  Schmidt
doesn't talk this way, does he?

   >   > Compositionality doesn't depend on whether the BB_i are
   >   > defined before the f_ij are defined, or afterward, by
   >   > structural induction.  The proof of this is immediate from
   >   > the uniqueness of Schmidt's Thm 3.13.
   >   
   >   It depends on the BB_ijl(S_ijl) being independent of f_ij, and
   >   f_ij being a function.  This is where the problem lies.
   >
   > Only if you're defining the BB_i by structural induction, Joe.   
   
   That is the point of Theorem 3.13

You're welcome to believe that, Joe.  You posted from Schmidt that
this is indeed Schmidt's purpose.  But it's not what 3.13 says.

   The point of V is to assign meaning to expressions.  Your tuple has
   an expression within it, but no meaning is attached to that
   expression.  If you assign a meaning, you have to prove that the
   assignment is well-defined.

As I keep saying, this ain't math.  Meaning is just some mathematical
object.  We get to argue culturally whether my DS (or BS!) valuation
function correctly captures the Scheme-like language.  But the math
can't tell!  We've got a mathematical definition of a compositional DS
valuation function.  And I think the triples adequately capture the
meaning!  When we get to the culture discussion, I'll hold up my end.
      
   > There's no structural induction in the statement of Schmidt's 3.13.
   
   No.  Schmidt calls upon structural induction for the proof that the
   function so defined is well-defined.

But not in the statement, right?  Why not respond to the restatement
of 3.13 which I posted to Will, after your timely Schmidt quote:

   Given functions f_ij: D_ij1 x D_ij2 x ... D_ijk -> D_i, 

   for i = 1,..,n, j = 1,...,m, there is a unique family of functions
   BB_i: B_i -> D_i for 1<=i<=n satisfying the equations 

        BB_i(Option_ij)
            = f_ij (BB_ij1 (S_ij1), BB_ij2 (S_ij2), ... BB_ijk(S_ijk))


It doesn't say that the BB_i are defined by structural induction!  It
just says there's a unique BB_i solution!  So we have definitions:

The BB_i are Bill-compositional if there exist f_ij s.t. the BB_i &
f_ij satisfy Schmidt's equations above.

The BB_i are Will-compositional if there exist f_ij s.t. the BB_i are
defined by structural induction by Schmidt's equations above.

Since Schmidt says there's a unique family of functions BB_i,
Bill-compositionality = Will-compositionality.
0
richter1974 (312)
7/8/2004 2:10:36 AM
Matthias Blume <find@my.address.elsewhere> writes:

> I think philosophically it is rather interesting.  My belief is that
> both Dan and Joe -- deep in their hearts -- do believe in the excluded
> middle just like you and I (or Hilbert, to give a more authoritative
> example).  

In some ways, you are correct and I'm being more than a bit tongue in
cheek, but belief in the law of the excluded middle is *demonstrably*
false in the realm of modern physics.  In the `double slit'
experiment, the electron does *not* go through one slit or the other.
Prior to looking in the box, Schroedinger's cat is *not* either alive
or dead.

These facts are enough to shake my faith in the law of the excluded
middle.

Now I'm not asserting that the laws of quantum physics apply to
classical Turing machines, but I think it is not absurd to allow a
little doubt here.

> For example, notice that Dan wrote "if I can't write it
> down ...".  Now, how does he know that he cannot write "it" down if he
> can't write it down?  How can he assert any properties such as "does
> not exist" to it -- if he can't write it down?  In real life, the
> constructivist's pov (or any other intuitionistic one, for that
> matter) leads to all kinds of awkwardness which, when pursued with
> conviction, would make normal discourse almost impossible.  (As the
> example demonstrates, a hard-core constructivist would have a hard
> time talking about constructivism without violating his own
> principles.)

It's hard to order a pizza with confidence if you are a pure
constructivist.

I'm more than willing to agree that given any finite time span, all
Turing machines either halt or do not halt during that span.  And that
some Turing machines can be proven to never halt.
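
(The classic witness for the latter, in lambda terms: Omega =
((lambda (x) (x x)) (lambda (x) (x x))) beta_v reduces in one step
back to itself, so induction on the number of steps shows it never
halts.)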

-- 
~jrm
0
7/8/2004 6:23:37 AM
cesuraSPAM@verizon.net (William D Clinger) responded to me:

   > I need clarification: I think this means a version of F-F's ISWIM
   > on p 39 <http://www.ccs.neu.edu/course/com3357/mono.ps>.
   
   That would be fine.  Joe Marshall's language would be fine.  

Will, here's a 350-line proof for an integer version of F-F's ISWIM.
First I set up the BNF, then discuss the compositionality equations,
then define E in terms of F-F standard reduction, then finally prove
the compositionality equation.

******* ISWIM in the BNF of Schmidt's Theorem 3.13 *******

M in Expr = B_1 -E--> D_1 = Value_bottom

V in Value

x in Var = B_2 -identity--> D_2 = Var

n in Z = B_3 -identity--> D_3 = Z

alpha in Op = {+, -, *} = B_4  -O--> D_4 = (Z x Z -> Z_bottom)

M ::= x | n | (lambda (x) M) | (M M) | (alpha M M) 

V ::= x | n | (lambda (x) M)

We need Value_bottom for E to be a total function, as Schmidt insists,
but I'll often skip the "_bottom" for clarity.

Let's note that in F-F's LC_v, alpha-equivalent expressions are
identified (i.e. renaming of bound variables), and we're forced to do
that, because evaluation proceeds mainly through beta_v reduction:

((lambda (x) M) V) |--->_v M[x <- V]

(alpha m n) |--->_v alpha(m, n), for m, n in Z

That is, we're forced to rename bound variables in M in order to avoid
conflicts with free variables in V.  So let's resolve this problem by
not modding out by alpha-equivalence in Exp, but only in Value.
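
To make the substitution M[x <- V] concrete, here's a sketch in
Scheme for this grammar.  (The helpers free-vars, fresh-var, lambda?,
lambda-var, lambda-body & make-lambda are assumed rather than defined,
and this is the naive capture-avoiding version, not F-F's official
definition.)

(define (subst exp x v)
  ;; Compute exp[x <- v], renaming bound variables that would capture
  ;; free variables of v.
  (cond ((symbol? exp) (if (eq? exp x) v exp)) ; Var; ops are never = x
        ((number? exp) exp)                    ; n in Z
        ((lambda? exp)
         (let ((y (lambda-var exp)) (body (lambda-body exp)))
           (cond ((eq? y x) exp)               ; x is shadowed: stop
                 ((memq y (free-vars v))       ; capture: rename y first
                  (let ((z (fresh-var)))
                    (subst (make-lambda z (subst body y z)) x v)))
                 (else (make-lambda y (subst body x v))))))
        (else (map (lambda (m) (subst m x v)) exp)))) ; (M N), (alpha M N)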

Schmidt's 3.13 then demands some f_ij functions.  The f_2j & f_3j
functions are subsumed by the identity maps indicated above.  The f_4j
functions are indicated by the O function above, which sends each +, -
& * to their obvious image as functions Z x Z -> Z_bottom.  Thus, the
f_2j, f_3j & f_4j functions are denotations of the terminal symbols,
and given here more straightforwardly as functions.

Now we have the f_1j functions that come from the 5 options of Expr.
First we have 2 inclusion maps:

f_11 = iota: Var >---> Value

f_12 = iota: Z >---> Value

f_13:  Var x Value ---> Value   

is not given, as lambda is syntax.  As this technically violates
Schmidt's setup, I'll address this below.

f_14 = Phi: Value x Value ---> Value 

is the hard one, coming from (M N), and not defined yet, but I want to
give it a name.

f_15:  (Z x Z -> Z) x Value x Value ---> Value_bottom  

is the partial function given by

(alpha, V, W) |---> alpha(m, n),                if V = m, W = n
                    bottom,                     otherwise

I want to handle the syntax lambda by insisting instead that 

E[ (lambda (x) M) ] = (lambda (x) M)

As I explained in an earlier post, we can fit the syntax lambda under
Schmidt's umbrella, by bringing in 2 syntactic=semantic domains 

y in Special-Var = B_5 -identity--> D_5 = Special-Var

P in Special-Exp = B_6 -identity--> D_6 = Special-Exp

where Special-Var = Var, Special-Exp = Exp.  Then we change the BNF
for Exp to 

M ::= x | n | (lambda (y) P) | (M M) | (alpha M M) 

and we now need a function (which doesn't do much)

f_13: D_5 x D_6 ---> D_1

f_13: Special-Var x Special-Exp ---> Value

(y, P) |---> (lambda (y) P).

We've now defined all the syntactic & semantic domains, and defined
all of the f_ij that Schmidt demands, except for the interesting one
f_14 = Phi for combinations (M N).


******* the compositionality equations *******

Apart from the terminals symbols handled above, the compositionality
equations are only for M in Exp:

E: Expr ---> Value_bottom

E[x] = x

E[n] = n

E[(lambda (x) M)] = (lambda (x) M)

E[(M N)] = Phi( E[M], E[N]) 

E[ (alpha M N) ] = alpha(m, n),         if E[M] = m, E[N] = n
                        bottom,         otherwise


Any solution of these equations in E and Phi will show that E is
Bill-compositional, which we have agreed to assume is equivalent to
Will-compositionality.  I'll first define Phi in terms of E, and then
define E, and then show these equations are actually satisfied.

So Phi: Value x  Value --->  Value_bottom 

will be defined (after defining E) by 

Phi(U, V) = E[ P[y <- V] ], if U = (lambda (y) P), 
            bottom,         otherwise

Phi is perfectly well-defined now in terms of E.  The 3rd
compositionality equation now becomes 

E[(M N)] =  E[ P[y <- E[N]] ], if E[M] = (lambda (y) P), 
            bottom,            otherwise

and Phi has now been expunged from the discussion.  I think LC_v
experts would say this and the other 4 equations are obvious, but I'll
give a proof, to repeat some induction/ZFC points, and because I need
practice at evaluation contexts, for Advanced Scheme.
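
In Scheme terms, here is a minimal sketch of this Phi.  It assumes
the subst sketched above and the E defined by standard reduction in
the next section, with the symbol bottom standing in for _|_:

(define (phi u v)
  ;; Phi(U, V) = E[ P[y <- V] ] when U = (lambda (y) P), bottom otherwise.
  (if (lambda? u)
      (E (subst (lambda-body u) (lambda-var u) v))
      'bottom))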


******* E and F-F's eval_v^s *******

Now we'll define a map 

E: Expr ---> Value_bottom

by the totalization of F-F's |-->>_v, not quite eval_v^s.  F-F explain
on p 53 that |-->>_v is the transitive closure of the standard
reduction arrow, which reduces the leftmost beta_v redex. We define

  E[M] = V,         if M |-->>_v  V
         bottom,    otherwise.

The difference between eval_v^s and |-->>_v is that eval_v^s sends all
lambda expressions to the symbol "function".  So eval_v^s is obviously
not compositional, in spite of my repeated claims (which I regard as
useful typos, not substantive errors).  Here's another typo I just
noticed: it's not eval_s, it's eval_v^s.  

As F-F say on p 53, the relation |-->>_v defines a well-defined
partial function.  F-F do not AFAIK claim that my totalized E is
compositional, i.e. satisfying the above 5 equations, but surely they
knew this, because obviously the standard reduction algorithm applied
to (M N) first standard reduces M to a value U, and then standard
reduces N to a value V, and then standard reduces (U V).

The first three compositionality equations are obvious, because x, n,
and (lambda (x) M) are values.  So we're left with the interesting 2
for E[(M N)] & E[ (alpha M N) ].

E is defined by repeated application of the standard reduction arrow
|-->_v, which I want to write as a 1-step non-recursive function R,

R: LC_v Exp ---> LC_v Exp

Let's use the notation that LC_v Exp is our Exp above modded out by
alpha-equivalence.   So the first step of E is to project 

Exp --->> LC_v Exp

Then we'll perform R iteratively until we reach a value, or bottom
out.  Now as I've many times posted, E is defined by the following.

We have a set X = LC_v Exp_bottom with a subset A = Value_bottom and a
set Y = Value_bottom and a function

f=identity: A ---> Y

and a function R: X ---> X given by the standard reduction arrow
(which bottoms out if we can't perform a standard reduction).  Note
that R(A) = bottom.  We define the iterative composites R^n by
induction, and then define a partial function

E: X ---> Y   given by 

E(x) = f(R^n(x)), where n >= 0 is the unique integer such that
                        R^i(x) != bottom for all i = 0,...,n, and 
                        R^n(x) in A.

       bottom, otherwise.
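
As an executable sketch of this iteration (assuming a one-step
reducer standard-step that returns #f when no standard reduction
applies, and a value? predicate; see the sketch after the eval
context grammar below):

(define (E m)
  ;; Iterate the one-step standard reducer until a value or a stuck
  ;; term.  On terms whose reduction sequence is infinite, where the
  ;; mathematical E is bottom, this procedure simply diverges.
  (cond ((value? m) m)
        ((standard-step m) => E)
        (else 'bottom)))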

E is a well-defined function (clearly total) by the ZFC subset or
comprehension axiom: we can define sets (and therefore functions) by
formulas that can involve quantifiers.  Perhaps there's a better way
of phrasing it.  In the Math literature, this is standard, and no one
would invoke any ZFC axioms to justify it.  F-F follow the standard
Math practice by not justifying this step in showing their equivalent
result that eval_v^s is a partial function.  Now let's prove:

Lemma: For all M, N in Exp, 

E[(M N)] = E[ P[y <- E[N]] ],   if E[M] = (lambda (y) P),
           bottom,              otherwise

E[ (alpha M N) ] = alpha(m, n),         if E[M] = m, E[N] = n
                        bottom,         otherwise


******* Proof of compositionality *******


Proof: 

I'll just prove the 1st equation, as the 2nd is similar & easier.  I'm
going to show that the standard reduction sequence (henceforth SR
sequence) for a value-terminating (M N) is of the form

(M N) = (M_0 N) ... (M_m N) = (U N_0) ... (U N_n) = (U V) ... W

for values U, V & W.  This proves the result, because W = E[ (U V) ]

As F-F prove in Thm 6.1, there's a unique standard reduction
evaluation context.  Any M in LC_v Exp is either a value or uniquely
of the form 
C[ (V_1 V_2) ] or C[ (alpha V_1 V_2) ], 

where C is a context defined inductively by

C = [] | (V C) | (C P) | (alpha V C) | (alpha C P)  

for V in Value, P in LC_v Exp.  Then F-F's standard reduction arrow
|-->_v, is defined by beta_v reducing (V_1 V_2) or (alpha V_1 V_2).
For us, our R function is defined by

R(X) = C[Q],    if X = C[P] and P  beta_v reduces to Q 
       bottom,  otherwise
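
The standard-step assumed earlier can be sketched directly off this
decomposition, threading the context C along as a plug procedure.
(value?, lambda?, op?, subst & apply-op are assumed helpers; apply-op
applies +, - or * to two integers.)

(define (standard-step x)
  ;; Find the leftmost redex P with x = C[P]; return C[Q], where P
  ;; beta_v (or delta) reduces to Q, or #f if x is a value or stuck.
  (define (decompose x plug)                 ; plug rebuilds C[hole]
    (cond ((value? x) #f)
          ((op? (car x))                     ; x = (alpha M N)
           (let ((a (car x)) (m (cadr x)) (n (caddr x)))
             (cond ((not (value? m))
                    (decompose m (lambda (h) (plug (list a h n)))))
                   ((not (value? n))
                    (decompose n (lambda (h) (plug (list a m h)))))
                   ((and (number? m) (number? n))
                    (plug (apply-op a m n)))
                   (else #f))))              ; stuck: non-numeric operands
          (else                              ; x = (M N)
           (let ((m (car x)) (n (cadr x)))
             (cond ((not (value? m))
                    (decompose m (lambda (h) (plug (list h n)))))
                   ((not (value? n))
                    (decompose n (lambda (h) (plug (list m h)))))
                   ((lambda? m)
                    (plug (subst (lambda-body m) (lambda-var m) n)))
                   (else #f))))))            ; stuck: operator not a lambda
  (decompose x (lambda (h) h)))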

Now a simple point about eval contexts (probably noted by F-F) is that
if C[X] is a value, for an expression X and an eval context C, then
C = [] and X is a value.  This follows by immediate inspection of the
inductive definition of the eval contexts C.

Given X, let X_0 = X, X_1,... be the SR sequence for X.  That is, X_i
= R^i(X) as long as they're non-bottom.  This sequence either
terminates in a value X_r = E[X], or it's infinite, or it
terminates on a non-value X_r.  By the above, X_i = C_i[ P_i ], for
i=0,...,r, and for i < r, P_i beta_v reduces to say P_i', and then
X_{i+1} = C_i[ P_i' ].  Call r the "length+1" of the sequence even if
the SR sequence does not terminate in a value.  So r = 0, if X is a
value, and r = infty, if the sequence is infinite.

Now suppose X = (M N), and M_0 = M, M_1,...  is the SR sequence of M.
Let m be the "length+1" of the sequence.  We will prove that m <= r,
and that if m is finite, then X_m = (M_m N).

If m = 0, then M = U is a value, and we're done. 

If m > 0, then for 0 <= i < m, there exist Q_i of the form either
 (V_1 V_2) or (alpha V_1 V_2), and again for 0 <= i < m, we have

M_i = C_i[ Q_i ], and Q_i beta_v reduces to Q_i', and 

M_{i+1} = C_i[ Q_i' ]

If m is finite, then either M_m = Q_{m-1}' is a value, or else
Q_{m-1}' does not beta_v reduce.

But C'_i = (C_i N) for 0 <= i < m is an eval context by the inductive
definition above.  Then for i < m,

C'_i[ Q_i ] = (C_i[ Q_i ] N) 

is a sequence of standard reductions. Thus by the Thm 6.1 uniqueness
of eval contexts, X_i = (M_i N) for 0 <= i <= m.  Thus m <= r.

If m = infty, then r = infty, and E[(M N)] = bottom.  If m is finite
but terminates on a nonvalue, i.e. Q_{m-1}' does not beta_v reduce,
then X also does not standard reduce to a value.  Thus we have shown

If (M N) standard reduces to a value, then so does M.

And if E[M] = U, so M_m = U, we've shown that X_m = (U N).

Now consider the SR sequence N_0,... for N of "length+1" n.  We will
show that m+n <= r, and that if n is finite, then X_{m+n} = (U N_n).

If n = 0, then N = V, and we are done.

If n > 0, we're done by the argument above, making the modification
that given an eval context D, D' = (U D) is also an eval context.

Thus: If (M N) standard reduces to a value, then so do M and N.

Furthermore, we showed that the initial subsequence of the SR sequence for
X is the sequence

X = (M N) = (M_0 N) ... (M_m N) = (U N_0) ... (U N_n) = (U V) 

ending with the eval context [].  Now (U V) beta_v reduces iff U is a
lambda expression U = (lambda (y) P), and then the next element of
the SR sequence for X is

P[y <- V]

Thus, the remainder of the SR sequence for X (which we are assuming
terminates in a value) is the SR sequence for P[y <- V].

Thus we have shown:  if X = (M N) standard reduces to a value, then 

E[(M N)] = E[ P[y <- E[N]] ],   for E[M] = (lambda (y) P),

and so we've shown part of our equation 

E[(M N)] = E[ P[y <- E[N]] ],   if E[M] = (lambda (y) P),
           bottom,              otherwise

But if (M N) does not standard reduce to a value, then E[ (M N)] =
bottom, and so we're done.  
\qed


Will, I was very precise and rigorous since you've been taking me to
task about theorems/proofs.  So let me make a comment about our discussion:

1) To me, the issue isn't whether I can do proofs like this.  If you
   want to evaluate whether I can do real proofs, you should look at
   my proof of the LC standard reduction theorem, on my web page.

2) To me, the issue is whether my claim is true: that we can define
   compositional valuation functions without Scott models.  We'll work
   faster if you use your considerable skills here!  So I think I gave
   too many details to settle this simple case.

3) I've wondered if your interest here is not so much investigating
   this possibility of simpler compositional valuation functions, but
   protecting the newsgroup against my "false teachings".  If that's
   so, why bother?  I'm clearly not being very influential, and it
   can't be worth much of your time to scare off crackpots.
0
richter1974 (312)
7/8/2004 7:41:20 AM
Bill Richter still hasn't posted a full Bill-semantics, let alone
a proof that his Bill-semantics is Bill-compositional, but he still
claims he'll be able to do it Real Soon Now:

> No, Will, but thanks for the coherent mathematical objection!  I'll be
> able to slip out of this, as I explained earlier to Joe:

No, you won't, but watching you wriggle is cheap entertainment.

> First, lambda is a definite problem for compositionality, because it's
> syntax, as Scheme says, or a Lisp special form.  A quick reading of
> Schmidt's 3.13 would say anyone was hosed, because we need
> 
> V [ (lambda <var> <exp0>) ] = Phi2 (V [ <var> ], V [ <exp0> ])

No, Bill, you're forgetting that <var> is not in the same syntactic
category as <exp0>.  What we need is

V_{exp} [[ (lambda <var> <exp0>) ]]

    = Phi2 (V_{var} [ <var> ], V_{exp} [ <exp0> ])

V_{var} can be the identity function, so a <var> is its own denotation.
If V_{exp} were the identity function, we would end up with a trivial
semantics, as I explained in the post to which you were responding.
V_{exp} should be something like the usual denotational semantics of an
expression.

That's how the R5RS semantics works.  The identity function V_{var}
isn't written explicitly.  I'm sorry that confuses you.
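
In Scheme the shape is this.  (A skeleton only: extend-env is an
assumed helper, and every clause except lambda is elided.)

(define (v-var var) var)      ; V_{var}: a variable denotes itself

(define (v-exp exp)           ; V_{exp}: Expr -> (Env -> Value)
  (cond ((lambda? exp)
         (let ((var  (v-var (lambda-variable exp)))  ; Phi2's 1st argument
               (body (v-exp (lambda-body exp))))     ; a denotation, not text
           (lambda (env)
             (lambda (arg)
               (body (extend-env env var arg))))))
        ;; ... clauses for variables, numerals & calls elided ...
        (else (error "skeleton covers only lambda" exp))))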

Bill then wriggles for quite a while, transporting himself to an even
less comfortable spot:

> And I won't give any options for either P or y.  I'll give a list of
> terminal symbols.  
> 
> P  ::= exp_1 | exp_2 | ... 
> 
> y ::= v_1 | v_2 | v_3 | ...

Notice, folks, that Bill is changing the grammar of his language, and
that his modified grammar has an infinite number of productions.

> For y, that's easy, because it's the same list of terminals we used
> for Var.   Now 3.13 says I can do this: 
> 
> M ::= Option_11 | ... | (lambda (y)  P) | ...
> 
> and now I need a function f_67: D_6 x D_7 ---> D_1, and that's my Phi2
> above, and it's really simple:
> 
> E[ (lambda (y) P) ] = f_67( y, P ) in (U x S -> Value x S)
> 
> which is obviously defined as 
> 
> f_67( y, P )(rho, sigma) = (<y, P, rho>, sigma) 
> 
> So I indeed fit the syntax lambda under the 3.13 umbrella.

Indeed you do, Bill.  Here's something else you have done:  You have
given distinct denotations to

    (lambda (x) (+ 1 2))
and (lambda (x) (+ 2 1))

Does that coincide with your operational semantics?  It's hard for
us to tell, since you haven't yet defined your operational semantics.

>    In Bill's operational semantics, however, expressions like 1+2 and
>    2+1 probably have the same semantics.  
> 
> I guess that's right:
> 
> E[(+ 1 2)](rho, sigma) = (3, sigma) in Value x S
> 
> E[(+ 2 1)](rho, sigma) = (3, sigma) in Value x S

Ah, but does this imply that

    E[(lambda (x) (+ 1 2))] = E[(lambda (x) (+ 2 1))]   ?

Interesting question, that.  Bill hasn't answered it yet, but he'll
have to answer it Real Soon Now.

> No, therefore Bill read 3.13 more carefully than Will did :)
> 
> This a reasonably good time for a slogan I've been waiting to deliver:
> 
> Being good at Math is much the same as being good at programming.
> It's much more about debugging code than writing flawless code!
> 
> So you and others have been wrong for a long time about fundamental
> induction issues.  I don't draw any negative conclusions from that.
> It's how you deal with these issues that makes an impression on me.
> There's certainly plenty of Math I don't know myself.  I think you
> (Will) know a lot of Math that I don't know!  As long as you & the
> others fight through the Math, you make a good impression on me.  I
> salute you, Will, for fighting well in this post.

Ah, yes, Bill is the fellow who came in here two years ago and began
to lecture us on induction.  He didn't know what structural induction
was.  He didn't even know about the connection between induction and
well-foundedness.  But Bill had a PhD in mathematics, and he knew
that induction had something to do with mathematics, so he thought
he knew more about induction than everybody on this newsgroup.

Apparently he still does.

And Bill knows that semantics has something to do with mathematics,
so...

Will
0
cesuraSPAM (401)
7/8/2004 1:51:38 PM
Joe Marshall <prunesquallor@comcast.net> writes:

> Matthias Blume <find@my.address.elsewhere> writes:
> 
> > I think philosophically it is rather interesting.  My belief is that
> > both Dan and Joe -- deep in their hearts -- do believe in the excluded
> > middle just like you and I (or Hilbert, to give a more authoritative
> > example).  
> 
> In some ways, you are correct and I'm being more than a bit tongue in
> cheek, but belief in the law of the excluded middle is *demonstrably*
> false in the realm of modern physics.  In the `double slit'
> experiment, the electron does *not* go through one slit or the other.

That's because the two are not complementary events, as it turns out.
"Going through slit 2" is not the logical opposite of "going through
slit 1".

> Prior to looking in the box, Schroedinger's cat is *not* either alive
> or dead.

That is highly debatable (and being debated).

> These facts are enough to shake my faith in the law of the excluded
> middle.

"Facts"?
0
find19 (1244)
7/8/2004 3:00:21 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Joe Marshall <jrm@ccs.neu.edu> responded to me:
>    
>    > That's a serious misunderstanding, and we won't be able to talk
>    > about State until you fix this.
>    
>    That is a typo.  The correct equations are these:
>    
>    V [[ numeral ]](env) = number
>    
>    V [[ id ]](env) = lookup (id, env)
>    
>    V [[ (<exp0> <exp1>) ]](env) = Phi0 (V [[<exp0>]], V [[<exp1>]])(env)
>    
> Great, Joe!  Sorry to be abusive, but I carped earlier, and when you
> didn't fix the typo, I thought we had serious problems.  We're writing
> so much, it's easy for stuff to get lost. :^0 

That's why I prefer to formalize in an actual programming language.
If you make a typo there, you'll find it as soon as you try to run the
program.

> That's the same as
>
>    V [[ (<exp0> <exp1>) ]]= Phi0 (V [[<exp0>]], V [[<exp1>]])

Yes.

>
> and let's recall that 
>
> V: Expr ---> (Env -> Value)
>
> Phi0: (Env -> Value) x (Env -> Value) ---> (Env -> Value)
>
>    V [[ (if <exp0> <exp1> <exp2>) ]](env) 
>        = Phi1 (V [[<exp0>]], V [[<exp1>]], V [[<exp2>]])(env)
>    
>    and if LAMBDA is to be compositional, it ought to be in this form:
>    
>    V [[ (lambda <var> <exp0>) ]](env) = Phi2 (<var>, V [[<exp0>]])(env)
>    
> Why do you say that?  Why don't you say, as I thought Will did, that
> compositionality demands
>
> V[ (lambda (x) M) ] = Phi2( V[x], V[M] )

The bound variable X is not an expression and it does not denote a
value.  Therefore you want 

  V [[ (lambda <var> <exp0>) ]] = Phi2 (<var>, V [[<exp0>]])

for the same reason, you wouldn't write 

  V [[ (lambda <var> <exp0>) ]] = Phi2 (V [[lambda]], <var>, V [[<exp0>]])

because LAMBDA is a syntactic tag, not an expression.

> I explained to Will, and you earlier, that Schmidt's 3.13 umbrella is
> wide enough for your formulation & mine too:
>
>    V [[ (lambda <var> <exp0>) ]]= Phi2 (<var>, <exp0>)
>
>      if it is to be a primitive, then it is this:
>    
>    V [[ (lambda <var> <exp0>) ]](env) = tuple (<var>, <exp0>, <env>)
>    
> where's the lingo coming from, primitive vs compositional?  Schmidt
> doesn't talk this way, does he?

I don't know, but `primitive' and `compositional' are common terms in
comp. sci. and I expect that won't cause confusion.  A `primitive'
is not composed of anything, so there is primitive syntax (like
integers) and compositional syntax (like conditionals).

>    >   > Compositionality doesn't depend on whether the BB_i are
>    >   > defined before the f_ij are defined, or afterward, by
>    >   > structural induction.  The proof of this is immediate from
>    >   > the uniqueness of Schmidt's Thm 3.13.
>    >   
>    >   It depends on the BB_ijl(S_ijl) being independent of f_ij, and
>    >   f_ij being a function.  This is where the problem lies.
>    >
>    > Only if you're defining the BB_i by structural induction, Joe.   
>    
>    That is the point of Theorem 3.13
>
> You're welcome to believe that, Joe.  You posted from Schmidt that
> this is indeed Schmidt's purpose.  But it's not what 3.13 says.
>
>    The point of V is to assign meaning to expressions.  Your tuple has
>    an expression within it, but no meaning is attached to that
>    expression.  If you assign a meaning, you have to prove that the
>    assignment is well-defined.
>
> As I keep saying, this ain't math.  Meaning is just some mathematical
> object.  We get to argue culturally whether my DS (or BS!) valuation
> function correctly capture the Scheme-like language.  But the math
> can't tell!  We've got a mathematical definition of a compositional DS
> valuation function.  And I think the triples adequately capture the
> meaning!  When we get to the culture discussion, I'll hold up my end.

Even without defining what meaning is, I can say that although your
semantics assigns a meaning to LAMBDA expressions (they mean
triples), it does not assign a meaning to the *subexpression*
within the lambda expression.  That subexpression is simply a bunch of
text stuffed into the triple.

> Since Schmidt says there's a unique family of functions BB_i,
> Bill-compositionality = Will-compositionality.

We can look at your equations in that way, but there is a cost.  Let
me formally write down where we are as a Scheme procedure.

Our first change is to remove the recursive call to curly-e in the
lambda case. 

(define (curly-e expression)
  (cond ((numeral? expression)     (curly-e-numeral expression))
        ((variable? expression)    (curly-e-variable expression))
        ((succ? expression)        (curly-e-succ
                                     (curly-e (succ-arg expression))))

        ((ifequal? expression) (curly-e-ifequal
                                     (curly-e (ifequal-left   expression))
                                     (curly-e (ifequal-right   expression))
                                     (curly-e (ifequal-consequent  expression))
                                     (curly-e (ifequal-alternative expression))))

;; We do not consider the lambda-body to be an expression
;; at this point.
;;
;        ((lambda? expression) (curly-e-lambda
;                               (lambda-variable expression)
;                               (curly-e (lambda-body  expression))))

        ((lambda? expression) (curly-e-lambda
                                (lambda-variable expression)
                                (lambda-body  expression)))

        ((call? expression) (curly-e-call
                             (curly-e (call-operator expression))
                             (curly-e (call-argument expression))))

        (else (error "Illegal expression." expression))))

This necessitates a change in the semantics of lambda expressions:

;; (define (curly-e-lambda var body)
;;   `(lambda (env)
;;      (lambda (arg)
;;        (,body (lambda (v)
;;                 (if (eq? v ',var)
;;                     arg
;;                     (env v)))))))

;; Note how body is now quoted text:

(define (curly-e-lambda var body)
  `(lambda (env)
     (make-triple ',var ',body env)))

We can now generate denotations in terms of Bill Semantics.  Recalling
the test expression in my tiny language:

  (call
    (call
     (call
      (call
       (lambda f (call (lambda d (call d d))
                       (lambda x (call f (lambda i (call x x))))))
       (lambda dec
         (lambda x
           (lambda y
             (lambda z
               (if-equal? x y
                          z
                          (call
                           (call
                            (call
                             (call dec 0)
                             (succ x))
                            y)
                           (succ z)))))))) 1) 40) 0)

Under Bill Semantics becomes this:
(lambda (env)
  (((lambda (env)
      (((lambda (env)
          (((lambda (env)
              (((lambda (env)
                  (make-triple
                    'f
                    '(call
                      (lambda d (call d d))
                      (lambda x (call f (lambda i (call x x)))))
                    env))
                env)
               ((lambda (env)
                  (make-triple
                    'dec
                    '(lambda x
                       (lambda y
                         (lambda z
                           (if-equal?
                             x
                             y
                             z
                             (call
                              (call (call (call dec 0) (succ x)) y)
                              (succ z))))))
                    env))
                env)))
            env)
           ((lambda (env) 1) env)))
        env)
       ((lambda (env) 40) env)))
    env)
   ((lambda (env) 0) env)))

But we have a problem.  Let's simplify the above expression by
reducing the pointless redexes that bind env:
   ((lambda (env) 0) env) => 0

(lambda env
  (((((make-triple
        'f
        '(call (lambda d (call d d)) (lambda x (call f (lambda i (call x x)))))
        env)
      (make-triple
        'dec
        '(lambda x
           (lambda y
             (lambda z
               (if-equal?
                 x
                 y
                 z
                 (call (call (call (call dec 0) (succ x)) y) (succ z))))))
        env))
     1)
    40)
   0))

The problem is that we are attempting to apply a triple to an argument
(which is not a legal operation in our target semantic model).  If the
result of LAMBDA is to be applied to arguments, it must semantically
be a function (or isomorphic to one).

(define (curly-e-lambda var body)
  `(lambda (env)
     ,(triple->function (make-triple var body 'env))))

You need to supply a definition for triple->function.

This mapping is *not* easy to define.  In fact, it is quite a
challenge.  
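
To see why, look at the obvious attempt.  (A sketch: triple-variable,
triple-body & triple-environment are assumed accessors, and eval
stands in for whatever turns generated denotation text back into a
procedure.)

(define (triple->function triple)
  ;; Interpret the stored body text on demand.  Note the circularity:
  ;; curly-e reappears inside the "denotation", so the lambda body is
  ;; still being processed as syntax rather than as a meaning.
  (lambda (arg)
    ((eval (curly-e (triple-body triple)))
     (lambda (v)
       (if (eq? v (triple-variable triple))
           arg
           ((triple-environment triple) v))))))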
0
jrm (1310)
7/8/2004 3:55:33 PM
Matthias Blume <find@my.address.elsewhere> writes:
> Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:
> > Even if a program is /really/ always in one of the "halts" or "not halts"
> > category and no other, that fact does not really help us.
> 
> No conditional necessary.  It *really* always is, simply because
> "does" and "does not" are defined to refer to complementary events.

My point is that it does not matter.

I suppose the philosophical dispute is not about complementary logic, but
rather the definition of terms like "really" and "halts", the *meaning* of the
categorization.

Does "really" even make sense in this context? What does an algorithm do?
Nothing -- it is just a static description. 

"Halts" is misleading. Do we mean "if we ever bothered to execute the
algorithm it would eventually halt"? This is a prediction, unknowable in
general.

Or do we mean "we did execute the algorithm and it halted"? This is a
measurement, and we cannot measure "forever".

Certainly we can reason about a tiny subset of algorithms and deem them to be
"not halting". But that actually means "by our rules of execution we have
proved to our satisfaction that the algorithm would not ever stop".

In general, however, we cannot tell the difference between "running for a very
long time" and "not halting".

Sure, we can think about "halts" and its complement as a logical exercise. The
problem, though, is that it is not at all clear that "not halting" is
physically realizable. It might be, it might not. We can't know, and thus are
not affected by that "truth".

> > Until we have the knowledge, we cannot make practical use of the fact.
> 
> People make use of such facts all the time.  Ever heard of indirect
> proofs?

Give me an example where it is not known that an algorithm halts, but the
fact that either it halts or it doesn't lets you establish something you did
not already know.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYblaaK@STRIPCAPStelus.net                    The Rhythm has my soul.
0
rAYblaaK (362)
7/8/2004 4:53:26 PM
Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:

> Give me an example where it is not known that an algorithm halts, but the
> fact that either it halts or it doesn't lets you establish something you did
> not already know.

The usual proof for the undecidability of the Halting Problem is such
an example.
0
find19 (1244)
7/8/2004 5:04:08 PM
richter@math.northwestern.edu (Bill Richter) writes:

> Will, here's a 350-line proof for an integer version of F-F's ISWIM.
> First I set up the BNF, then discuss the compositionality equations,
> then define E in terms of F-F standard reduction, then finally prove
> the compositionality equation.

FINALLY!

I'll annotate this and show you where the problem is.

> ******* ISWIM in the BNF of Schmidt's Theorem 3.13 *******
>
> M in Expr = B_1 -E--> D_1 = Value_bottom
>
> V in Value
>
> x in Var = B_2 -identity--> D_2 = Var
>
> n in Z = B_3 -identity--> D_3 = Z
>
> alpha in Op = {+, -, *} = B_4  -O--> D_4 = (Z x Z -> Z_bottom)
>
> M ::= x | n | (lambda (x) M) | (M M) | (alpha M M) 
>
> V ::= x | n | (lambda (x) M)
>
> We need Value_bottom for E to be a total function, as Schmidt insists,
> but I'll often skip the "_bottom" for clarity.

You'll need a conditional, but that's a minor change.

 M ::= x | n | (lambda (x) M) | (M M) | (alpha M M) | (ifequal? M M M M)

> Let's note that in F-F's LC_v, alpha-equivalent expressions are
> identified (i.e. renaming of bound variables), and we're forced to do
> that, because evaluation proceeds mainly through beta_v reduction:
>
> ((lambda (x) M) V) |--->_v M[x <- V]
>
> (alpha m n) |--->_v alpha(m, n), for m, n in Z
>
> That is, we're forced to rename bound variables in M in order to avoid
> conflicts with free variables in V.  So let's resolve this problem by
> not modding out by alpha-equivalence in Exp, but only in Value.
>
> Schmidt's 3.13 then demands some f_ij functions.  The f_2j & f_3j
> functions are subsumed by the identity maps indicated above.  The f_4j
> functions are indicated by the O function above, which sends each +, -
> & * to their obvious image as functions Z x Z -> Z_bottom.  Thus, the
> f_2j, f_3j & f_4j functions are denotations of the terminal symbols,
> and given here more straightforwardly as functions.
>
> Now we have the f_1j functions that come from the 5 options of Expr.
> First we have 2 inclusion maps:
>
> f_11 = iota: Var >---> Value
>
> f_12 = iota: Z >---> Value
>
> f_13:  Var x Value ---> Value   
>
> is not given, as lambda is syntax.  As this technically violates
> Schmidt's setup, I'll address this below.
>
> f_14 = Phi: Value x Value ---> Value 
>
> is the hard one, coming from (M N), and not defined yet, but I want to
> give it a name.
>
> f_15:  (Z x Z -> Z) x Value x Value ---> Value_bottom  
>
> is the partial function given by
>
> (alpha, V, W) |---> alpha(m, n),                if V = m, W = n
>                     bottom,                     otherwise
>
> I want to handle the syntax lambda by insisting instead that 
>
> E[ (lambda (x) M) ] = (lambda (x) M)
>
> As I explained in an earlier post, we can fit the syntax lambda under
> Schmidt's umbrella, by bringing in 2 syntactic=semantic domains 
>
> y in Special-Var = B_5 -identity--> D_5 = Special-Var
>
> P in Special-Exp = B_6 -identity--> D_6 = Special-Exp
>
> where Special-Var = Var, Special-Exp = Exp.  Then we change the BNF
> for Exp to 
>
> M ::= x | n | (lambda (y) P) | (M M) | (alpha M M) 
>
> and we now need a function (which doesn't do much)
>
> f_13: D_5 x D_6 ---> D_1
>
> f_13: Special-Var x Special-Exp ---> Value
>
> (y, P) |---> (lambda (y) P).
>
> We've now defined all the syntactic & semantic domains, and defined
> all of the f_ij that Schmidt demands, except for the interesting one
> f_14 = Phi for combinations (M N).

Yes.  There are 3 relevant changes to the semantics of my tiny
language to model this.  The first is in the compositionality equation
which you address below.

The second is in the semantics for lambda expressions.  The change is
easy:

;(define (curly-e-lambda var body)
;  `(lambda (env)
;     (lambda (arg)
;       (,body (lambda (v)
;                (if (eq? v ',var)
;                    arg
;                    (env v)))))))

;;; This is your f_13

(define (curly-e-lambda var body)
  (list 'lambda var body))

The hard part is f_14 (variously known as Phi and curly-e-call).

The prior definition is this:

(define (curly-e-call operator argument)
  `(lambda (env)
     ((,operator env) (,argument env))))

But since lambda expressions no longer denote functions, we will need
to rewrite this function to expect a tuple to appear as the function.
This is *very* hard as we shall soon see.

> ******* the compositionality equations *******
>
> Apart from the terminals symbols handled above, the compositionality
> equations are only for M in Exp:
>
> E: Expr ---> Value_bottom
>
> E[x] = x
>
> E[n] = n
>
> E[(lambda (x) M)] = (lambda (x) M)
>
> E[(M N)] = Phi( E[M], E[N]) 
>
> E[ (alpha M N) ] = alpha(m, n),         if E[M] = m, E[N] = n
>                         bottom,         otherwise

(define (curly-e expression)
  (cond ((numeral? expression)     (curly-e-numeral expression))
        ((variable? expression)    (curly-e-variable expression))
        ((succ? expression)        (curly-e-succ
                                     (curly-e (succ-arg expression))))

        ((ifequal? expression) (curly-e-ifequal
                                     (curly-e (ifequal-left   expression))
                                     (curly-e (ifequal-right   expression))
                                     (curly-e (ifequal-consequent  expression))
                                     (curly-e (ifequal-alternative expression))))

        ((lambda? expression) (curly-e-lambda
                               (lambda-variable expression)
                               (lambda-body  expression)))

        ((call? expression) (curly-e-call
                             (curly-e (call-operator expression))
                             (curly-e (call-argument expression))))

        (else (error "Illegal expression." expression))))

> Any solution of these equations in E and Phi will show that E is
> Bill-compositional, which we have agreed to assume is equivalent to
> Will-compositionality.  I'll first define Phi in terms of E, and then
> define E, and then show these equations are actually satisfied.
>
> So Phi: Value x  Value --->  Value_bottom 
>
> will be defined (after defining E) by 
>
> Phi(U, V) = E[ P[y <- V] ], if U = (lambda (y) P), 
>             bottom,         otherwise
>
> Phi is perfectly well-defined now in terms of E.  The 3rd
> compositionality equation now becomes 
>
> E[(M N)] =  E[ P[y <- E[N]] ], if E[M] = (lambda (y) P), 
>             bottom,            otherwise
>
> and Phi has now been expunged from the discussion.  

You have an error here.  

E[N] is an element in V, but P is a syntactic element.  When you
substitute P[y <- E[N]] you end up with something that is no longer
purely syntactic, thus it is not an acceptable argument to E.

> I think LC_v experts would say this and the other 4 equations are
> obvious, but I'll give a proof, to repeat some induction/ZFC points,
> and because I need practice at evaluation contexts, for Advanced
> Scheme.

It's not quite as obvious as you might think.

Rest of proof snipped --- We'll get to it when you have a fix for the
above issue.

> 1) To me, the issue isn't whether I can do proofs like this.  If you
>    want to evaluate whether I can do real proofs, you should look at
>    my proof of the LC standard reduction theorem, on my web page.

We don't want to evaluate whether you can do proofs like this.  We *need*
proofs like this because this is how we think.  Yes, it is insanely
detailed, but that's what we're comfortable with.

> 2) To me, the issue is whether my claim is true: that we can define
>    compositional valuation functions without Scott models.  We'll work
>    faster if you use your considerable skills here!  So I think I gave
>    too many details to settle this simple case.

No, absolutely not.  This is the level of detail necessary.

0
jrm (1310)
7/8/2004 5:17:23 PM
Matthias Blume <find@my.address.elsewhere> writes:
> Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:
> 
> > Give me an example where it is not known that an algorithm halts, but the
> > fact that either it halts or it doesn't lets you establish something you did
> > not already know.
> 
> The usual proof for the undecidability of the Halting Problem is such
> an example.

I don't think so. The basic proof is to assume a machine M can solve the
halting problem for any input machine T. Then construct a particular T (that
uses M itself) that breaks M, that is, M cannot give an answer for that
input. Contradiction, and so our assumption does not hold.

Where are we making use of the fact that "halts or not halts" is true?

What I do see is that we try out either case and show they both cause a
failure. E.g. Consider this machine T:

T: if M(T) loop forever
   else halt

If T halts, then M(T) returns true, and so T loops forever. But that means
M(T) is false. Contradiction.

If T does not halt, then M(T) is false, and so T halts. But that means M(T) is
true. Contradiction.

But we have not actually used the fact that T (or any other machine) halts or
does not halt. M is certainly implying it because that is what it is supposed to
do: categorize its inputs into either category. But M cannot exist, so that is
not relevant.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYblaaK@STRIPCAPStelus.net                    The Rhythm has my soul.
0
rAYblaaK (362)
7/8/2004 7:05:03 PM
In article <uvfgyjp2i.fsf@STRIPCAPStelus.net>,
 Ray Blaak <rAYblaaK@STRIPCAPStelus.net> wrote:

> Matthias Blume <find@my.address.elsewhere> writes:
> > Ray Blaak <rAYblaaK@STRIPCAPStelus.net> writes:
> > 
> > > Give me an example where it is not known that an algorithm halts, but the
> > > fact that either it halts or it doesn't lets you establish something you did
> > > not already know.
> > 
> > The usual proof for the undecidability of the Halting Problem is such