Is it always possible to write a COBOL program using only 1 sentence per paragraph?

    I'm trying to write a program that reformats the structure of a given 
COBOL program. One of the transformations I'd like to apply is to try to 
eliminate as many periods as possible, using the END-whatever (END-IF, 
END-CALL, END-COMPUTE, etc.) constructs.

    My question is, is it always possible to do this? I don't want to 
"cheat" by adding new paragraph names at the beginning of every sentence 
(which may affect PERFORM statements anyway). My theory is that yes, it is 
always possible, but I haven't been programming in COBOL for that long, so I 
wasn't sure.

    - Oliver Wong
 


owong (6177)
7/11/2005 3:41:16 PM
comp.lang.cobol

On Mon, 11 Jul 2005 15:41:16 GMT, "Oliver Wong" <owong@castortech.com>
wrote:

>    I'm trying to write a program that reformats the structure of a given 
>COBOL program. One of the transformations I'd like to apply is to try to 
>eliminate as many periods as possibles, using the END-whatever (END-IF, 
>END-CALL, END-COMPUTE, etc.) constructs.
>
>    My question is, is it always possible to do this? I don't want to 
>"cheat" by adding new paragraph names at the beginning of every sentence 
>(which may affect PERFORM statements anyway). My theory is that yes, it is 
>always possible, but I haven't been programming in COBOL for that long, so I 
>wasn't sure.
>
It is, but you need to have AT LEAST ONE period at the end of each
paragraph.

Some people will use a period on its own, others will use a
"CONTINUE."

If you are going to do this (I don't agree with it myself) then I
advise you to use the "CONTINUE." way as it makes it very clear that
you have a "." there.
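
To illustrate the two styles, here is a minimal free-format sketch. The program, paragraph, and data names are made up for the example, and it assumes a compiler that accepts free-format source, like the snippets elsewhere in this thread.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ONEPERIOD.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-COUNT  PIC 9(4) VALUE 0.
PROCEDURE DIVISION.
MAIN-PARA.
  PERFORM STYLE-BARE-PERIOD
  PERFORM STYLE-CONTINUE
  STOP RUN
  .
*> Style 1: the only period in the paragraph sits on a line by itself.
STYLE-BARE-PERIOD.
  ADD 1 TO WS-COUNT
  IF WS-COUNT > 0
    DISPLAY "BARE PERIOD STYLE"
  END-IF
  .
*> Style 2: the closing period is attached to a do-nothing CONTINUE.
STYLE-CONTINUE.
  ADD 1 TO WS-COUNT
  IF WS-COUNT > 0
    DISPLAY "CONTINUE STYLE"
  END-IF
  CONTINUE.
```

Either way each paragraph is exactly one sentence; the only difference is how visible the closing period is.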





Frederico Fonseca
email: frederico_fonseca at syssoft-int.com
7/11/2005 5:45:48 PM
On 11-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

>     I'm trying to write a program that reformats the structure of a given
> COBOL program. One of the transformations I'd like to apply is to try to
> eliminate as many periods as possibles, using the END-whatever (END-IF,
> END-CALL, END-COMPUTE, etc.) constructs.

Why?

Is every statement that can use an END- clause going to have one?

>     My question is, is it always possible to do this? I don't want to
> "cheat" by adding new paragraph names at the beginning of every sentence
> (which may affect PERFORM statements anyway). My theory is that yes, it is
> always possible, but I haven't been programming in COBOL for that long, so I
> wasn't sure.

It's not possible to do this when not all of the code is visible.   If some of
the code comes from inaccessible copy members or from precompilers, then it
won't work.
howard (6283)
7/11/2005 6:05:52 PM
"Frederico Fonseca" <real-email-in-msg-spam@email.com> wrote in message 
news:7ub5d1pej9vs5msg7v6mr9s6el06ah3kl5@4ax.com...
> It is, but you need to have AT LEAST ONE period at the end of each
> paragraph.
>
> Some people will use a period on it's own, others will use a
> "CONTINUE."
>
> If you are going to do this (I don't agree with it myself) then I
> advise you to use the "CONTINUE." way as it makes it very clear that
> you have a "." there.

Yes, sorry, I meant I wanted to remove all the periods except for the one 
terminating every paragraph.

    - Oliver 


owong (6177)
7/11/2005 6:25:47 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dauceb$qs1$1@peabody.colorado.edu...
>
> On 11-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:
>
>>     I'm trying to write a program that reformats the structure of a
>> given COBOL program. One of the transformations I'd like to apply is to
>> try to eliminate as many periods as possibles, using the END-whatever
>> (END-IF, END-CALL, END-COMPUTE, etc.) constructs.
>
> Why?

Mainly to facilitate GOTO elimination. For example, given the code:

PROCEDURE DIVISION.
  IF A THEN
    DISPLAY "A"
    GO TO TERM-PROC
  END-IF
  DISPLAY "B".
  DISPLAY "C".
TERM-PROC.
  EXIT PROGRAM.

I'll first remove all the periods (that don't terminate a paragraph) to get

PROCEDURE DIVISION.
  IF A THEN
    DISPLAY "A"
    GO TO TERM-PROC
  END-IF
  DISPLAY "B"
  DISPLAY "C"
  .
TERM-PROC.
  EXIT PROGRAM
  .

and then I can transform it to:

PROCEDURE DIVISION.
  IF A THEN
    DISPLAY "A"
    SET GOTOFLAG TO TRUE
  END-IF
  IF GOTOFLAG IS FALSE THEN
    DISPLAY "B"
    DISPLAY "C"
  END-IF
  .
TERM-PROC.
  EXIT PROGRAM.

Whereas if 'DISPLAY "B"' and 'DISPLAY "C"' had periods after them, it would 
be more difficult for my code to group them together under the new IF THEN 
clause.

> Is every statement that can use an END- clause going to have one?

For simplicity, yes. The output isn't meant to be read by humans, but rather 
it's to serve as the input for yet more code manipulation algorithms.

>>     My question is, is it always possible to do this? I don't want to
>> "cheat" by adding new paragraph names at the beginning of every sentence
>> (which may affect PERFORM statements anyway). My theory is that yes, it
>> is always possible, but I haven't been programming in COBOL for that
>> long, so I wasn't sure.
>
> It's not possible to do this when not all of the code is visible.   If
> some of the code comes from inaccessible copy members or from
> precompilers, then it won't work.

    Hmm, good point. I think my code can handle the COPY statements, but 
it'll probably break in the presence of conditional compilation 
markers. Okay, thanks.

    - Oliver 


owong (6177)
7/11/2005 6:35:57 PM
Oliver Wong wrote:

>"Howard Brazee" <howard@brazee.net> wrote in message 
>news:dauceb$qs1$1@peabody.colorado.edu...
>  
>
>>On 11-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:
>>
>>    
>>
>>>    I'm trying to write a program that reformats the structure of a
>>>given COBOL program. One of the transformations I'd like to apply is to
>>>try to eliminate as many periods as possibles, using the END-whatever
>>>(END-IF, END-CALL, END-COMPUTE, etc.) constructs.
>>>      
>>>
>>Why?
>>    
>>
>
>Mainly to facilitate GOTO elimination. For example, given the code:
>
>PROCEDURE DIVISION.
>  IF A THEN
>    DISPLAY "A"
>    GO TO TERM-PROC
>  END-IF
>  DISPLAY "B".
>  DISPLAY "C".
>TERM-PROC.
>  EXIT PROGRAM.
>
>I'll first remove all the periods (that don't terminate a paragraph) to get
>
>PROCEDURE DIVISION.
>  IF A THEN
>    DISPLAY "A"
>    GO TO TERM-PROC
>  END-IF
>  DISPLAY "B"
>  DISPLAY "C"
>  .
>TERM-PROC.
>  EXIT PROGRAM
>  .
>
>and then I can transform it to:
>
>PROCEDURE DIVISION.
>  IF A THEN
>    DISPLAY "A"
>    SET GOTOFLAG TO TRUE
>  END-IF
>  IF GOTOFLAG IS FALSE THEN
>    DISPLAY "B"
>    DISPLAY "C"
>  ENDIF
>TERM-PROC.
>  EXIT PROGRAM.
>
>Whereas if 'DISPLAY "B"' and 'DISPLAY "C"' had periods after them, it would 
>be more difficult for my code to group them together under the new IF THEN 
>clause.
>
>  
>
>>Is every statement that can use an END- clause going to have one?
>>    
>>
>
>For simplicity, yes. The output isn't meant to be read by humans, but rather 
>it's to serve as the input for yet more code manipulation algorithms.
>
>  
>
>>>    My question is, is it always possible to do this? I don't want to
>>>"cheat" by adding new paragraph names at the beginning of every sentence
>>>(which may affect PERFORM statements anyway). My theory is that yes, it
>>>is always possible, but I haven't been programming in COBOL for that
>>>long, so I wasn't sure.
>>>      
>>>
>>It's not possible to do this when not all of the code is visible.   If
>>some of the code comes from inaccessible copy members or from
>>precompilers, then it won't work.
>>    
>>
>
>    Hmm, good point. I think my code can handle the COPY statements, but 
>it'll probably break with in the presence of conditional compilation 
>markers. Okay, thanks.
>
>    - Oliver 
>
>
>  
>
Wouldn't you want to simply substitute something like:
IF A
    DISPLAY "A"
ELSE
    DISPLAY "B"
    DISPLAY "C"
END-IF
.
There is no need to introduce a variable, set it, and test it.  In my 
opinion, it doesn't "improve" the code; rather, it makes it even harder 
to understand.

After nearly 40 years of programming, and working on the programs of 
others, I can easily believe that you could find an example such as the 
one you cited.  They seem to be everywhere.  However, I would be equally 
unimpressed with the proposed "solution", because it takes what is 
"simple, bad" code and replaces it with "not as simple, still bad" code.
cmcampb (110)
7/11/2005 7:52:54 PM
On 11-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

> Mainly to facilitate GOTO elimination. For example, given the code:
>
> PROCEDURE DIVISION.
>   IF A THEN
>     DISPLAY "A"
>     GO TO TERM-PROC
>   END-IF
>   DISPLAY "B".
>   DISPLAY "C".
> TERM-PROC.
>   EXIT PROGRAM.

...


> and then I can transform it to:
>
> PROCEDURE DIVISION.
>   IF A THEN
>     DISPLAY "A"
>     SET GOTOFLAG TO TRUE
>   END-IF
>   IF GOTOFLAG IS FALSE THEN
>     DISPLAY "B"
>     DISPLAY "C"
>   ENDIF
> TERM-PROC.
>   EXIT PROGRAM.


I'm a big proponent of structured code.    But I would rather leave the GO TOs in
unstructured code than replace them with switches in that same code.    My
distaste for the second piece of code above is real strong.    Structured is a
lot more than GO TO-less code.

Does your restructure handle *real* spaghetti code?   With GO TOs pointing all
over the place?
howard (6283)
7/11/2005 7:59:44 PM
> >
> > PROCEDURE DIVISION.
> >   IF A THEN
> >     DISPLAY "A"
> >     SET GOTOFLAG TO TRUE
> >   END-IF
> >   IF GOTOFLAG IS FALSE THEN
> >     DISPLAY "B"
> >     DISPLAY "C"
> >   ENDIF
> > TERM-PROC.
> >   EXIT PROGRAM.
>
In my opinion this coding style is a lot more unstructured than the original
code!

I have also been coding COBOL for 20 years, and the problem with
unstructured programs is not the GOTOs. It's the lousy algorithms
underneath, it's the use of hopeless variable names, it's the use of switches
or flags ... and, like here, the introduction of new possibilities for error. If
you add another statement after the second END-IF (say the section is
a lot longer and more complex than here), then all of that has to go
inside the second IF, just because we for some reason don't like a
GOTO to an exit :-)

The way that we always code is: always have sections and paragraphs,
always PERFORM, never ever PERFORM THRU, always just GOTOs to exits or to the
top of a section.  Never sections bigger than 3 pages in your favorite
editor. Plus a couple more rules (no ALTER, etc.).

regards
Mette


Mette
7/11/2005 8:21:51 PM
"Colin Campbell" <cmcampb@adelphia.net> wrote in message 
news:TO2dnVxaoruLU0_fRVn-vw@adelphia.com...
> Oliver Wong wrote:
>>PROCEDURE DIVISION.
>>  IF A THEN
>>    DISPLAY "A"
>>    SET GOTOFLAG TO TRUE
>>  END-IF
>>  IF GOTOFLAG IS FALSE THEN
>>    DISPLAY "B"
>>    DISPLAY "C"
>>  ENDIF
>>TERM-PROC.
>>  EXIT PROGRAM.
>
> Wouldn't you want to simply substitute something like:
> IF A
>    DISPLAY "A"
> ELSE
>    DISPLAY "B"
>    DISPLAY "C"
> END-IF
> .

    Yes, if I, as a human, were refactoring the code, I'd write it the way 
you put it. However the transformations are done by machine, and I felt it 
was better to keep the algorithms simple (and thus "obviously correct"), and 
then to perhaps later do an optimization pass to transform multiple IFs in 
the format above to IF ELSEs and other such transformations.

    - Oliver 


owong (6177)
7/11/2005 8:29:20 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dauj3r$1m2$1@peabody.colorado.edu...
> I'm a big proponent of structured code.    But I would rather leave the GO 
> TO in
> unstructured code than replacing them with switches in that same code. 
> My
> distaste for the second piece of code above is real strong.    Structured 
> is a
> lot more than GO TO - less code.

    I understand your distaste (I share it as well), so let's just say that 
"GO TO elimination" is a requirement for the project I'm working on. If the 
code is "easy to understand", that's a plus, but it is a non-negotiable 
requirement for me that there be no GO TO statements.

> Does your restructure handle *real* spaghetti code?   With GO TOs pointing 
> all
> over the place?

    That's the plan. In theory, the algorithms should be able to handle 
everything (PERFORM THRUs, ALTERed GOTOs, fall through logic, the works) 
except for SECTIONS with independent overlaid segments (which is low 
priority because it doesn't seem to appear in any of my sample COBOL files, 
but it's still lying in the back of my mind as a future problem I'll have to 
eventually address).

    In practice, I've just started trying to implement the algorithms, and 
the first obstacle I've encountered is the desire to have just 1 sentence 
per paragraph to make the future transformations easier. So I started to 
wonder if I could always transform a given COBOL program to an equivalent 
one with only 1 sentence per paragraph, and I couldn't think of a reason why 
not, but I thought I'd post on the newsgroup just in case.

    - Oliver 


owong (6177)
7/11/2005 8:38:07 PM
"Mette Stephansen" <mm> wrote in message 
news:42d2d4e0$0$4723$edfadb0f@dread15.news.tele.dk...
> In my opinion this coding style is a lot more unstructured than the 
> original
> code !
>
> I have also been coding Cobol for 20 years - and the problems with
> unstructured programs are note GOTO's .. it's the lousy algorithms
> underneath, it's the use of hopeless variable names, its the use to 
> swicthes
> or flags ... and like here the introduction of error possibilities .... if
> yo just add another statement after the second end_if (saying the section 
> is
> a lot longer and more complex than here) ... then we have to have all that
> inside the second IF ..... just because that we for some reason dont like 
> a
> GOTO to an exit :-)

    Yes, unfortunately because these transformations I want to do are going 
to be done by a computer, there'll be even more hopelessly meaningless 
variable names, and introductions of lots of flags and switches. As I 
mentioned in another post though, the result of this transformation is not 
intended to be read by human programmers, but it's just there to make code 
analysis easier. Right now, all the algorithms I've seen for logic flow 
analysis and the like work a lot better if there are zero GO TO statements 
in the code.

    As for the possibility for introducing a new error, I won't have to 
worry about that too much, because all the modifications are going to be 
done by automatic tools. So assuming that I can prove all my transformations 
won't modify the semantics of the COBOL program, I know the tools will do 
the right thing. Actually, that's the reason I asked this question about 1 
sentence per paragraph in the first place; to make sure that my 
transformation won't change the behaviour of the program!

    - Oliver 


owong (6177)
7/11/2005 8:43:20 PM
> So I started to wonder if I could always transform a given COBOL program to an equivalent
> one with only 1 sentence per paragraph, and I couldn't think of a reason why not,

I can. It has been recently argu^H^H^H discussed here that a NEXT
SENTENCE embedded in a nested imperative IF is allowed (some even think
it was deliberately allowed) and this will transfer control to after
the next full stop which may be at some arbitrary point.

In my view any such usage should cause the converter to put the output
and input to /dev/null, and preferably the coder as well, but it does
mean that such a program cannot have that full stop removed.
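
A small sketch of the problem case, in free-format source with made-up program and data names. Sentence 1 is an old-style nested IF closed only by its period, so NEXT SENTENCE resumes at the DISPLAY that starts sentence 2.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NEXTSENT.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-A  PIC 9 VALUE 1.
01  WS-B  PIC 9 VALUE 1.
PROCEDURE DIVISION.
MAIN-PARA.
*> Sentence 1: both IFs are terminated by the single period.
  IF WS-A = 1
    IF WS-B = 1
      NEXT SENTENCE
    ELSE
      DISPLAY "B IS NOT 1".
*> Sentence 2: NEXT SENTENCE above transfers control to here.
  DISPLAY "SECOND SENTENCE".
*> Sentence 3.
  DISPLAY "THIRD SENTENCE".
  STOP RUN.
```

If a converter closed the IFs with END-IF and dropped the first two periods, the same NEXT SENTENCE would instead jump past both DISPLAY statements to whatever follows the one remaining period. Some compilers also accept NEXT SENTENCE inside an IF that has an END-IF only as an extension, so the merged form may not even compile everywhere.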

riplin (4127)
7/11/2005 9:37:01 PM
In article <AEAAe.105441$HI.1740@edtnps84>,
Oliver Wong <owong@castortech.com> wrote:
>
>"Colin Campbell" <cmcampb@adelphia.net> wrote in message 
>news:TO2dnVxaoruLU0_fRVn-vw@adelphia.com...

[snip]

>> Wouldn't you want to simply substitute something like:
>> IF A
>>    DISPLAY "A"
>> ELSE
>>    DISPLAY "B"
>>    DISPLAY "C"
>> END-IF
>> .
>
>    Yes, if I, as a human, were refactoring the code, I'd write it the way 
>you put it. However the transformations are done by machine, and I felt it 
>was better to keep the algorithms simple (and thus "obviously correct"), and 
>then to perhaps later do an optimization pass to transform multiple IFs in 
>the format above to IF ELSEs and other such transformations.

Mr Wong, permit me to ask: have you ever read any code, before and after 
versions, that's been run through a restructuring program?

DD

docdwarf (6044)
7/11/2005 10:05:33 PM
Just out of curiosity:

Assuming your code restructuring engine works: how does one go about
maintaining the code afterwards?  This isn't meant to be a hostile
question.  But it occurs to me that if, as you've pointed out, all
GOTO's >>must<< be removed, then the program may be totally unlike the
original even if it's functionally identical.  Odds are good that
whatever specs and documentation exist (ha! as if) will now not be
applicable in large part without some sort of similar restructuring. 
That's to say, the documentation to do with input/output/interfaces of
the program will still apply, but the black-box aspect of the
documentation will no longer describe what's under the hood of the
program.

Seems to me that TPB (The Poor Beggar who does maintenance) is going to
have to study the program line-by-line to make changes possible. 
Whatever knowledge of the program existed prior to the conversion will
no longer apply.  Given the supposition that there are very few new
people coming into COBOL, one wonders how maintenance will even get
done.

Respectfully,
Peter Lacey
lacey (134)
7/12/2005 1:18:00 AM
docdwarf@panix.com wrote:

>In article <AEAAe.105441$HI.1740@edtnps84>,
>Oliver Wong <owong@castortech.com> wrote:
>  
>
>>"Colin Campbell" <cmcampb@adelphia.net> wrote in message 
>>news:TO2dnVxaoruLU0_fRVn-vw@adelphia.com...
>>    
>>
>
>[snip]
>
>  
>
>>>Wouldn't you want to simply substitute something like:
>>>IF A
>>>   DISPLAY "A"
>>>ELSE
>>>   DISPLAY "B"
>>>   DISPLAY "C"
>>>END-IF
>>>.
>>>      
>>>
>>   Yes, if I, as a human, were refactoring the code, I'd write it the way 
>>you put it. However the transformations are done by machine, and I felt it 
>>was better to keep the algorithms simple (and thus "obviously correct"), and 
>>then to perhaps later do an optimization pass to transform multiple IFs in 
>>the format above to IF ELSEs and other such transformations.
>>    
>>
>
>Mr Wong, permit me to ask: have you ever read any code, before and after 
>versions, that's been run through a restructuring program?
>
>DD
>
>  
>
DD,
That is a good point.  My experience with "restructured" code is that in 
most cases, a large, poorly structured program gets "structured" (though 
often not in an easily understandable way - for human readers, I mean), 
and the program tends to get about 20% larger, give or take.

Seems like there is very little "profit" in that....
cmcampb (110)
7/12/2005 2:01:47 AM
Oliver Wong wrote:
>    I'm trying to write a program that reformats the structure of a
> given COBOL program. One of the transformations I'd like to apply is
> to try to eliminate as many periods as possibles, using the
> END-whatever (END-IF, END-CALL, END-COMPUTE, etc.) constructs.
>
>    My question is, is it always possible to do this? I don't want to
> "cheat" by adding new paragraph names at the beginning of every
> sentence (which may affect PERFORM statements anyway). My theory is
> that yes, it is always possible, but I haven't been programming in
> COBOL for that long, so I wasn't sure.
>
>    - Oliver Wong

Yes. Hunt around for the ETK Toolkit. It's a collection of about thirty 
compiler-independent routines - some quite clever. The author was positively 
anal about one period per paragraph. 


heybubNOSPAM (643)
7/12/2005 2:28:39 AM
HeyBub wrote:

>Oliver Wong wrote:
>  
>
>>   I'm trying to write a program that reformats the structure of a
>>given COBOL program. One of the transformations I'd like to apply is
>>to try to eliminate as many periods as possibles, using the
>>END-whatever (END-IF, END-CALL, END-COMPUTE, etc.) constructs.
>>
>>   My question is, is it always possible to do this? I don't want to
>>"cheat" by adding new paragraph names at the beginning of every
>>sentence (which may affect PERFORM statements anyway). My theory is
>>that yes, it is always possible, but I haven't been programming in
>>COBOL for that long, so I wasn't sure.
>>
>>   - Oliver Wong
>>    
>>
>
>Yes. Hunt around for the ETK Toolkit. It's a collection of about thirty 
>compiler-independent routines - some quite clever. The author was positively 
>anal about one period per paragraph. 
>
>
>  
>
All I find when I google on "ETK Toolkit" is a response you made at 
another place, suggesting the use of the ETK Toolkit.  Where do I find it?
cmcampb (110)
7/12/2005 4:09:24 AM

Colin Campbell wrote:

> HeyBub wrote:
> 
>> Oliver Wong wrote:
>>  (snip)
>> Yes. Hunt around for the ETK Toolkit. It's a collection of about 
>> thirty compiler-independent routines - some quite clever. The author 
>> was positively anal about one period per paragraph.
>>
>>  
>>
> All I find when I google on "ETK Toolkit" is a response you made at 
> another place, suggesting the use of the ETK Toolkit.  Where do I find it?

Albert Richheimer posted a link here in comp.lang.cobol on June 28th:

===begin quote===

I have it on my website: http://consulting.richheimer.org
Select "Downloads", then "Dokumente", and click on ETKPAK.ZIP

===end quote===

I downloaded it from there, and I think I might just post it on my 
website.


-- 
http://arnold.trembley.home.att.net/

7/12/2005 6:05:28 AM

Colin Campbell wrote:

> docdwarf@panix.com wrote:
> (snip)
>> Mr Wong, permit me to ask: have you ever read any code, before and 
>> after versions, that's been run through a restructuring program?
>>
>> DD
>>
>>  
>>
> DD,
> That is a good point.  My experience with "restructured" code is that in 
> most cases, a large, poorly structured program gets "structured" (though 
> often not in an easily understandable way - for human readers, I mean), 
> and the program tends to get about 20% larger, give or take.
> 
> Seems like there is very little "profit" in that....

Several years ago I used IBM's COBOL Structuring Facility to 
restructure several programs and move them into production.  I've told 
this story a few times before.  My company dropped the license for 
IBM-CSF because hardly anyone ever used it.

I found it to be a pretty impressive product, simply amazing that it 
could do what it did.  For really ugly spaghetti code it had to 
generate switches and branches to eliminate the worst cases of PERFORM 
THRU and ALTER.

On the plus side, it was 100% successful at eliminating GO TO, ALTER, 
and PERFORM THRU.  If you wanted it to generate SECTIONS with every 
SECTION having an EXIT, you could set it up to do that (and still get 
rid of every GO TO and ALTER).  I chose the option to generate 
paragraphs only, with no sections.  I never found any program bugs 
generated by the restructuring facility itself.  It would also 
reformat the code somewhat (consistent indentation of IF's, etc.) and 
optionally generate "called by" comments at the top of each output 
paragraph.  One of the surprises I found using it was the amount of 
dead code that could not be generated into the restructured program. 
I think it was carried over as commented-out code.  Getting rid of 
dead code could be a benefit, or an indication of just how poorly 
written the original program was.

On the down side, the generated program was a bit larger (all those 
"called by" comments), and had fewer paragraph names.  And if the 
original program had uninformative datanames and procedure names, they 
would still be meaningless in the generated program.  In my opinion, 
if you wanted a well-written program, then a programmer still had to 
modify the automatically restructured code.  But the tool saved you 
quite a bit of work getting rid of the spaghetti.

I still have one or two old programs in production that I would like 
to restructure if we still had a license for CSF.


-- 
http://arnold.trembley.home.att.net/

7/12/2005 6:20:18 AM
In article <OoydnRb_uNAWuU7fRVn-vw@adelphia.com>,
Colin Campbell  <cmcampb_at_adelphia.net> wrote:
>docdwarf@panix.com wrote:
>
>>In article <AEAAe.105441$HI.1740@edtnps84>,
>>Oliver Wong <owong@castortech.com> wrote:

[snip]

>>>   Yes, if I, as a human, were refactoring the code, I'd write it the way 
>>>you put it. However the transformations are done by machine, and I felt it 
>>>was better to keep the algorithms simple (and thus "obviously correct"), and 
>>>then to perhaps later do an optimization pass to transform multiple IFs in 
>>>the format above to IF ELSEs and other such transformations.
>>>    
>>>
>>
>>Mr Wong, permit me to ask: have you ever read any code, before and after 
>>versions, that's been run through a restructuring program?
>>
>DD,
>That is a good point.  My experience with "restructured" code is that in 
>most cases, a large, poorly structured program gets "structured" (though 
>often not in an easily understandable way - for human readers, I mean), 
>and the program tends to get about 20% larger, give or take.
>
>Seems like there is very little "profit" in that....

But you get such... *intuitive* paragraph-names... like 
A71342-D0841-PROCESS-RTN.

DD

docdwarf (6044)
7/12/2005 9:29:09 AM
<docdwarf@panix.com> wrote in message news:dauqfd$bh$1@panix5.panix.com...
> In article <AEAAe.105441$HI.1740@edtnps84>,
>
> Mr Wong, permit me to ask: have you ever read any code, before and after
> versions, that's been run through a restructuring program?

    Back in high school, my English teacher would always call me "Mr. Wong" 
whenever I was in trouble, so now I've associated that unpleasant feeling 
with that name. Feel free to call me "Oliver".

    The closest I've come is using the Eclipse IDE to refactor my Java code, 
and I was quite pleased with the result. But really, I think that's 
beside the point. I'm not saying that the code produced from my 
transformation will be anywhere as nice and clean as the transformations 
Eclipse does. I'm not even saying that the transformations will look nice at 
all on an absolute scale.

    All I'm saying is that I've been given the task of writing code that 
eliminates GO TO, ALTER, PERFORM THRU and other similar statements, 
regardless of whether I think the resulting code will look pretty or not. 
Now given these requirements, I'm seeking a bit of help, because I'm not 
familiar with COBOL enough to just wade through and be confident that my 
transformations won't change the behaviour of the program.

    - Oliver 


owong (6177)
7/12/2005 12:50:25 PM
"Richard" <riplin@Azonic.co.nz> wrote in message 
news:1121117821.011458.183950@g14g2000cwa.googlegroups.com...
> I can. It has been recently argu^H^H^H discussed here that a NEXT
> SENTENCE embedded in a nested imperitive IF is allowed (some even think
> it was deliberately allowed) and this will transfer control to after
> the next full stop which may be at some arbitrary point.
>
> In my view any such usage should cause the converter to put the output
> and input to /dev/nul, and preverably the coder as well, but it does
> mean that such a program cannot have that full stop removed.

    Hmm, interesting. I did not find any mention of "NEXT SENTENCE" in any 
of the documentation I initially used as references for the COBOL language, 
but based on the discussions I've found on the web, it looks like NEXT 
SENTENCE is an unconditional jump to the statement immediately following the 
period of the sentence containing the NEXT SENTENCE statement, as you've 
said.

    I think this would be one case where I would have to add a new 
paragraph name (to act as a label), then change the NEXT SENTENCE to a GO TO 
which jumps to that label. All the while, I have to check for any PERFORM 
or PERFORM THRU statements that I might have to update to reflect the 
addition of this new paragraph.
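
A rough sketch of what that rewrite might look like, in free-format source. All names are hypothetical, including the generated label CHECK-PARA-NS-1, and it assumes the original CHECK-PARA held a nested IF with NEXT SENTENCE followed by two more sentences.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NSLABEL.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-A  PIC 9 VALUE 1.
01  WS-B  PIC 9 VALUE 1.
PROCEDURE DIVISION.
MAIN-PARA.
*> Was PERFORM CHECK-PARA; widened so it still covers the split-off code.
  PERFORM CHECK-PARA THRU CHECK-PARA-NS-1
  STOP RUN
  .
CHECK-PARA.
  IF WS-A = 1
    IF WS-B = 1
*>    Was NEXT SENTENCE.
      GO TO CHECK-PARA-NS-1
    ELSE
      DISPLAY "B IS NOT 1"
    END-IF
  END-IF
  .
*> Generated label marking the old sentence boundary.
CHECK-PARA-NS-1.
  DISPLAY "SECOND SENTENCE"
  DISPLAY "THIRD SENTENCE"
  .
```

Any PERFORM ... THRU whose range ends at CHECK-PARA would need the same widening. The GO TO and PERFORM THRU introduced here are only an intermediate form for the later elimination passes to remove.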

    Thanks for the tip, I might have to factor in this new issue into the 
algorithm design.

    - Oliver. 


owong (6177)
7/12/2005 1:03:23 PM
"Peter Lacey" <lacey@mb.sympatico.ca> wrote in message 
news:42D31A48.E1CBDEB9@mb.sympatico.ca...
> Just out of curiosity:
>
> Assuming your code restructuring engine works: how does one go about
> maintaining the code afterwards?  This isn't meant to be a hostile
> question.  But it occurs to me that if, as you've pointed out, all
> GOTO's >>must<< be removed, then the program may be totally unlike the
> original even if it's functionally identical.  Odds are good that
> whatever specs and documentation exist (ha! as if) will now not be
> applicable in large part without some sort of similar restructuring.
> That's to say, the documentation to do with input/output/interfaces of
> the program will still apply, but the black-box aspect of the
> documentation will no longer describe what's under the hood of the
> program.
>
> Seems to me that TPB (The Poor Beggar who does maintenance) is going to
> have to study the program line-by-line to make changes possible.
> Whatever knowledge of the program existed prior to the conversion will
> no longer apply.  Given the supposition that there are very few new
> people coming into COBOL, one wonders how maintenance will even get
> done.
>
> Respectfully,
> Peter Lacey

    Excellent question. Michael Kasten has written an answer to this on "The 
Kasten COBOL Page" (http://home.swbell.net/mck9/cobol/style/rewrite.html) 
which probably phrases the answer much better than I ever could. He points 
out that while the code is still unstructured (in the formal sense of 
the word, i.e. containing 1 or more GO TO statements or similar constructs), 
a lot of changes you'd like to make (e.g. inserting a new paragraph) aren't 
"safe" because you don't know what PERFORM or PERFORM THRU might be present. 
You can't touch any GO TO statements, because there might be ALTER 
statements elsewhere that modify them, and so on. By structuring the code 
using this tool, you can now make a lot of assumptions about how control 
flow progresses through the program. Sure, the resulting program is a mess, 
but it's a mess that can be easily modified and fixed up, as compared to the 
original program, which might have *looked* perfectly fine, but was 
completely resistant to any modifications whatsoever. Anyway, the relevant 
quote from Kasten's page follows:

[BEGIN QUOTE]
     Some re-engineering tools can transform unstructured spaghetti code 
into logically equivalent COBOL with no GO TOs or PERFORM THRUs. Such a tool 
can begin a rewrite, and automate some of the drudgery, but the results 
should not be regarded as a finished product. The tool is no more a 
substitute for a programmer than an anvil is a substitute for a blacksmith.

    [...]

    Spaghetti code is almost always buggy code. The obvious bugs will have 
already been eliminated by the time you inherit the program. The remaining 
ones are subtle; they lie concealed in the convoluted logic until a human 
being flushes them out.

    If the restructuring tool does its job properly, it will carefully 
preserve every bug in the original code. On the other hand, once the tool 
has formally restructured the program, the bugs may be easier to recognize.
[END QUOTE]

    - Oliver 


owong (6177)
7/12/2005 1:13:21 PM
In article <l0PAe.149449$on1.34681@clgrps13>,
Oliver Wong <owong@castortech.com> wrote:
><docdwarf@panix.com> wrote in message news:dauqfd$bh$1@panix5.panix.com...
>> In article <AEAAe.105441$HI.1740@edtnps84>,
>>
>> Mr Wong, permit me to ask: have you ever read any code, before and after
>> versions, that's been run through a restructuring program?
>
>    Back in highschool, my English teacher would always call me "Mr. Wong" 
>whenever I was in trouble, so now I've associated that unpleasant feeling 
>with that name. Feel free to call me "Oliver".

Back in grade school it was common knowledge among the boys that it was 
best not to touch girls *at all* because they were infested with an 
invisible horror called 'cooties' (not to be confused with the Southern 
USA colloquialism 'cooties', or 'lice')... fortunately a few years have 
passed since then and this has been proven wrong.  During my Kollidj Daze 
I attended an institution where all classes had three rules:

1) One person talks at a time.

2) Any opinion, unless substantiated by reasoned argument, can be 
dismissed as 'mere opinion'.

3) Everyone, at all times, is to be referred to as Mr or Ms.

Since, of course, what one learns in college is supposed to be 'higher 
education' (in the sense of 'more advanced') than what one learns in high 
school... well, I'll try not to offend o'ermuch.

>
>    The closest I've come is using the Eclipse IDE to refactor my Java code, 
>and I was quite pleased with the result. But really, I think that's 
>aside from the point. I'm not saying that the code produced from my 
>transformation will be anywhere as nice and clean as the transformations 
>Eclipse does. I'm not even saying that the transformations will look nice at 
>all on an absolute scale.

Wow... I thought 'look nice' was an aesthetic judgement... are there 
absolute scales for those things now?  I'll take that as a 'no, I've never 
read any code, before and after versions, that's been run through a 
restructuring program'.

>
>    All I'm saying is that I've been given the task of writing code that 
>eliminates GO TO, ALTER, PERFORM THRU and other similar statements, 
>regardless of whether I think the resulting code will look pretty or not. 

All I'm saying is that this kind of task might require a bit of knowledge 
and experience that you've yet to garner and whoever assigned it to you 
might do well to reconsider the criteria which are taken into account when 
it comes to doling out work... granted that I do not believe I can lower 
the quality of my thinking far enough to be Management but I'd say that a 
good foundation for this kind of work would be a few years spent doing 
maintenance/enhancement programming.

>Now given these requirements, I'm seeking a bit of help, because I'm not 
>familiar with COBOL enough to just wade through and be confident that my 
>transformations won't change the behaviour of the program.

Given the various ways that the execution of code can be directed it might 
be wise to become more familiar with COBOL.  GO TO, ALTER, PERFORM THRU 
all concern themselves with code contained in sections/paragraphs in that 
they address labels... NEXT SENTENCE, however, is a slightly different 
kettle o' fish.

Bona fortuna!

DD

docdwarf (6044)
7/12/2005 1:21:07 PM
In article <vcPAe.149478$on1.126481@clgrps13>,
Oliver Wong <owong@castortech.com> wrote:

[snip]

>    Hmm, interesting. I did not find any mention of "NEXT SENTENCE" in any 
>of the documentation I initially used as references for the COBOL language, 
>but based on the discussions I've found on the web, it looks like NEXT 
>SENTENCE is an unconditional jump to the statement immediately following the 
>period of the sentence containing the NEXT SENTENCE statement, as you've 
>said.

Leaving aside the persnickety labelling for statement, sentence, 
imperative, instruction, what happens at the end of a paragraph (etc)... 
would someone pass this along to The Standards Folks and ask them for what 
reason - given this level of interpretation shown by a rank neophyte - 
they think the function of NEXT SENTENCE is so difficult to grasp?

Well done, Mr Wong!

DD
docdwarf (6044)
7/12/2005 1:26:00 PM
"HeyBub" <heybubNOSPAM@gmail.com> wrote in message 
news:11d6am6pe4knu76@news.supernews.com...
> Oliver Wong wrote:
>>    I'm trying to write a program that reformats the structure of a
>> given COBOL program. One of the transformations I'd like to apply is
>> to try to eliminate as many periods as possible, using the
>> END-whatever (END-IF, END-CALL, END-COMPUTE, etc.) constructs.
>>
>>    My question is, is it always possible to do this? I don't want to
>> "cheat" by adding new paragraph names at the beginning of every
>> sentence (which may affect PERFORM statements anyway). My theory is
>> that yes, it is always possible, but I haven't been programming in
>> COBOL for that long, so I wasn't sure.
>>
>>    - Oliver Wong
>
> Yes. Hunt around for the ETK Toolkit. It's a collection of about thirty 
> compiler-independent routines - some quite clever. The author was 
> positively anal about one period per paragraph.

    Formally, to prove that it's always possible to write a COBOL program 
with one period per paragraph, it isn't enough to just give examples of 
COBOL programs which are written with one period per paragraph. Rather, you 
need to present some sort of logical proof. However, you can disprove the 
statement by giving an example of a COBOL program which cannot be rewritten 
with just one period per paragraph. (Remember all those science test 
questions which start with "Prove or give a counterexample"?)

    Still, the ETK is a neat library. Thanks.

    - Oliver 


owong (6177)
7/12/2005 1:27:24 PM
On 11-Jul-2005, docdwarf@panix.com wrote:

> Mr Wong, permit me to ask: have you ever read any code, before and after
> versions, that's been run through a restructuring program?
>
> DD

Those of us who have aren't in any hurry to spend the money on another one.
howard (6283)
7/12/2005 1:29:35 PM
> Oliver Wong wrote:
> >    I'm trying to write a program that reformats the structure of a
> > given COBOL program. One of the transformations I'd like to apply is
> > to try to eliminate as many periods as possible, using the
> > END-whatever (END-IF, END-CALL, END-COMPUTE, etc.) constructs.
> >
> >    My question is, is it always possible to do this? I don't want to
> > "cheat" by adding new paragraph names at the beginning of every
> > sentence (which may affect PERFORM statements anyway). My theory is
> > that yes, it is always possible, but I haven't been programming in
> > COBOL for that long, so I wasn't sure.

Well, if this program is for internal use only, you may not have to deal
with "always." If your program logic encounters something you can't handle
(e.g., nesting level too deep, END-whatevers already mixed in, NEXT
SENTENCE, etc), you could always write your output with some kind of token
you could search for and leave those paragraphs/blocks for manual
reformatting after your program has done the bulk of the work.

FWIW, if you really haven't been programming in COBOL all that long and you
intend this to be a 'generic' tool, be forewarned that you are going to
encounter some code construction you would have never imagined.

MCM






7/12/2005 1:32:55 PM
On 12-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

>     All I'm saying is that I've been given the task of writing code that
> eliminates GO TO, ALTER, PERFORM THRU and other similar statements,
> regardless of whether I think the resulting code will look pretty or not.
> Now given these requirements, I'm seeking a bit of help, because I'm not
> familiar with COBOL enough to just wade through and be confident that my
> transformations won't change the behaviour of the program.

I've been given stupid tasks to perform before also.

Do you have code with ALTER in it?
howard (6283)
7/12/2005 1:34:08 PM
On 11-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

>     In practice, I've just started trying to implement the algorithms, and
> the first obstacle I've encountered is the desire to have just 1 sentence
> per paragraph to make the future transformations more easy.

I'm not convinced getting rid of periods will make future transformations more
easy.

> So I started to
> wonder if I could always transform a given COBOL program to an equivalent
> one with only 1 sentence per paragraph, and I couldn't think of a reason why
> not, but I thought I'd post on the newsgroup just in case.

I mentioned invisible code (pre-compiler, copies, etc).   The other issue that
you should be careful about is NEXT SENTENCE.    The standard does not allow NEXT
SENTENCE with END-IF, but some compilers (IBM mainframes) do allow it.   
Certainly if you're concerned about ALTER, you should be concerned with NEXT
SENTENCE.

Some people who believe in the one period per paragraph, use NEXT SENTENCE as a
way to terminate a paragraph.    I dislike this intensely, but I like periods
within my paragraph.     It appears that it is within your mandate to eliminate
NEXT SENTENCE altogether, I'd consider doing so if I were you.   Trouble is,
converting NEXT SENTENCE to CONTINUE isn't a trivial task, and will require
parsing.

Another hard to parse code is with "D" in column 6.
howard (6283)
7/12/2005 1:51:35 PM
Here's some ugly code that compiles with my compiler.    After converting it,
what would your results look like?

 PERFORM SECTION-NAME THROUGH PARAGRAPH-EXIT UNTIL M=N.

 SECTION-NAME SECTION.
 PARAGRAPH-NAME.
     IF A=B
            PERFORM READ-NEXT
D          IF C=D
D                NEXT SENTENCE;
     ELSE
            GO TO PARAGRAPH-EXIT
     END-IF
     PERFORM WRITE-OUTPUT
     PERFORM WRITE-ERROR.
     COPY ABCDE.
     GO TO PARAGRAPH-NAME.
 PARAGRAPH-EXIT.
     EXIT.


where
ABCDE contains:
 ABCDE.
     PERFORM CHECK-SUM
     IF Z=Z1 GO TO ABCDE
     PERFORM CHECK-SUM2.
     PERFORM CHECK-SUM3.
 ABCDE-EXIT.
     EXIT.
howard (6283)
7/12/2005 2:01:17 PM
In article <db0hti$d12$1@peabody.colorado.edu>,
Howard Brazee <howard@brazee.net> wrote:

[snip]

>Another hard to parse code is with "D" in column 6.

I'm familiar with designating debugging code (SOURCE-COMPUTER. xxxxxxx 
WITH DEBUGGING MODE.) by putting a 'D' in column 7... what does putting 
a 'D' in the last column of the card's sequence number do?

DD

docdwarf (6044)
7/12/2005 2:09:30 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db0gss$cdl$1@peabody.colorado.edu...
>
> Do you have code with ALTER in it?

    I'd have to check; I've got about 470 test COBOL source files, but I 
haven't actually read through all of them yet. Regardless of whether an 
ALTER statement actually appears in any of them (though I suspect that it 
probably does appear somewhere at least once), the tool I'm working on is 
expected to eventually be able to handle ALTER statements; so if none of 
the test files contain an ALTER, I'd probably have to write a few new test 
files to be sure the transformation works.

    - Oliver 


owong (6177)
7/12/2005 2:13:26 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db0hti$d12$1@peabody.colorado.edu...
>
> I'm not convinced getting rid of periods will make future transformations
> more easy.

    In one scenario for a GO TO elimination, I have to take a series of 
statements and put them in the body of an IF THEN clause. If these 
statements occur in consecutive, but distinct sentences, then I don't think 
I can actually put them in the THEN clause, because a THEN clause can 
contain a sequence of statements, but not a sequence of sentences. If I can 
convert these sentences into statements (what visually amounts to "removing 
the periods"), then I can put all those statements in the THEN clause 
without much problem.
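
    A minimal sketch of that scenario, assuming hypothetical data and 
paragraph names (WS-FLAG, WS-A, WS-B, BEFORE-FORM and AFTER-FORM are made up 
for illustration). BEFORE-FORM holds three separate sentences; AFTER-FORM 
holds the same statements with the inner periods dropped, so that they fit 
inside a single IF body. This is only safe here because none of the three is 
an unterminated conditional statement.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ONESENT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-FLAG     PIC X     VALUE "Y".
       01  WS-A        PIC 9(4)  VALUE 1.
       01  WS-B        PIC 9(4)  VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM BEFORE-FORM
           PERFORM AFTER-FORM
           STOP RUN.
      * Three separate sentences. Moved into an IF as they stand,
      * the first period would end the IF.
       BEFORE-FORM.
           MOVE WS-A TO WS-B.
           ADD 1 TO WS-B.
           DISPLAY "BEFORE: " WS-B.
      * Same statements without the inner periods, so the whole
      * run fits inside one IF ... END-IF and one sentence.
       AFTER-FORM.
           IF WS-FLAG = "Y"
               MOVE WS-A TO WS-B
               ADD 1 TO WS-B
               DISPLAY "AFTER: " WS-B
           END-IF.
```

    If one of the moved statements were itself an IF without END-IF, it 
would need its own scope delimiter first, otherwise its ELSE or its end 
would bind differently once the period is gone.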

>
>> So I started to
>> wonder if I could always transform a given COBOL program to an equivalent
>> one with only 1 sentence per paragraph, and I couldn't think of a reason 
>> why
>> not, but I thought I'd post on the newsgroup just in case.
>
> I mentioned invisible code (pre-compiler, copies, etc).   The other issue
> that you should be careful about is NEXT SENTENCE.    The standard does not
> allow NEXT SENTENCE with END-IF, but some compilers (IBM mainframes) do
> allow it. Certainly if you're concerned about ALTER, you should be
> concerned with NEXT SENTENCE.
>
> Some people who believe in the one period per paragraph, use NEXT SENTENCE
> as a way to terminate a paragraph. I dislike this intensely, but I like
> periods within my paragraph.

    You mean they make the last statement in a paragraph be "NEXT SENTENCE" 
as sort of an indicator that this is the last statement in the paragraph? 
That does seem strange to me as well.

>     It appears that it is within your mandate to eliminate
> NEXT SENTENCE altogether, I'd consider doing so if I were you.   Trouble
> is, converting NEXT SENTENCE to CONTINUE isn't a trivial task, and will
> require parsing.

    I've already written a COBOL parser to generate a parse tree, so that 
latter requirement isn't a problem for me. As for eliminating NEXT SENTENCE, 
I haven't encountered that in any of my test files, nor in any of the 
reference documentation I've been using (I think they were the references 
for VS COBOL/II, RM COBOL and ANSI 85 COBOL), but it is something that I'll 
probably eventually look into. I posted a strategy in another thread. I 
was wondering if you thought this was a feasible strategy:

[BEGIN QUOTE, with some typos corrected]
    I think this would be one of the few cases where I would have to add a 
new
paragraph name (to act as a label), then change the NEXT SENTENCE to a GO TO
which jumps to that label. All the while, I have to check for any PERFORM
or PERFORM THRU statements that I might have to update to reflect the
addition of this new paragraph.
[END QUOTE]
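
    A rough sketch of what that strategy could look like, assuming made-up 
names throughout (WS-A, OLD-FORM, NEW-FORM, and the generated label 
NEW-FORM-NS1 are hypothetical). OLD-FORM uses NEXT SENTENCE in an 
undelimited IF; NEW-FORM replaces it with a GO TO to a newly inserted 
paragraph name, and the PERFORM that used to name one paragraph has to 
become a PERFORM THRU so that it still covers the same code.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NSTOGOTO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-A        PIC 9(4)  VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM OLD-FORM
      * The PERFORM range has to grow to cover the new label.
           PERFORM NEW-FORM THRU NEW-FORM-NS1
           STOP RUN.
      * Original: NEXT SENTENCE jumps past the period ending the IF.
       OLD-FORM.
           IF WS-A = 0
               NEXT SENTENCE
           ELSE
               DISPLAY "OLD: NONZERO".
           DISPLAY "OLD: SECOND SENTENCE".
      * Rewritten: a generated paragraph name marks the jump target.
       NEW-FORM.
           IF WS-A = 0
               GO TO NEW-FORM-NS1
           ELSE
               DISPLAY "NEW: NONZERO"
           END-IF.
       NEW-FORM-NS1.
           DISPLAY "NEW: SECOND SENTENCE".
```

    The rewrite trades a NEXT SENTENCE for a GO TO and a PERFORM THRU, so 
it only helps if a later pass removes those again.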

> Another hard to parse code is with "D" in column 6.

    Yeah, I wasn't sure what the semantics were of indicator codes in column 
6. From my understanding, a "*" denotes that the line should be ignored (as in 
a comment), and any alphabetic character indicates a conditional compilation 
line, so that all the lines marked with "A" could be toggled on or off 
independently of all the lines with "B" and so on, with "D", by 
convention (but with no particularly special support), being reserved for 
debugging code. Is this understanding correct?

    - Oliver 


owong (6177)
7/12/2005 2:26:55 PM
"Michael Mattias" <michael.mattias@gte.net> wrote in message 
news:bEPAe.592$8y1.417@newssvr17.news.prodigy.com...
> Well, if this program is for internal use only, you may not have to deal
> with "always." If your program logic encounters something you can't handle
> (e.g., nesting level too deep, END-whatevers already mixed in, NEXT
> SENTENCE, etc), you could always write your output with some kind of token
> you could search for and leave those paragraphs/blocks for manual
> reformatting after your program has done the bulk of the work.

    Hmm, that's an interesting strategy, though I think it might not be 
compatible with the requirements I was given. Still, I might adapt something 
similar (inserting special marker tokens for intermediate results between 
two transformations, for example).

> FWIW, if you really haven't been programming in COBOL all that long and 
> you
> intend this to be a 'generic' tool, be forewarned that you are going to
> encounter some code construction you would have never imagined.

    Yeah, I have about 470 test COBOL files, some of which illustrate some 
pretty degenerate cases which I had not expected while writing my first 
prototype parser. Probably the most painful are line comments appearing 
between line continuations, as in:

 [ Line 1: Some COBOL Statement]
*[ Line 2: A Comment]
-[ Line 3: Continuation of line 1]

    which when combined with COPY REPLACE statements (e.g. something half on 
line 1, half on line 3 needs to get replaced) can become quite painful to 
handle.

    - Oliver 


owong (6177)
7/12/2005 2:36:42 PM
<docdwarf@panix.com> wrote in message news:db0gd8$amg$1@panix5.panix.com...

> Leaving aside the persnickety labelling for statement, sentence,
> imperative, instruction, what happens at the end of a paragraph (etc)...
> would someone pass this along to The Standards Folks and ask them for what
> reason - given this level of interpretation shown by a rank neophyte -
> they think the function of NEXT SENTENCE is so difficult to grasp?

Well, I guess I'd qualify as A Standards Folk.

What the standard says NEXT SENTENCE ought to do is very clear, and as
stated above.  The contexts in the standard in which NEXT SENTENCE may
appear are also clear (and limited strictly to undelimited IF and
undelimited SEARCH statements).

The two problems discussed in the '02 standard can be found on Page 833:
"It is a common belief among users that control is transferred to a position
after the scope delimiter rather than to a separator period that follows it
somewhere.  In addition, it is a common source of errors, especially for
maintenance programmers who inadvertently insert a period somewhere before
the actual terminating separator period."

A third unstated issue is that use of NEXT SENTENCE outside the limited
contexts in which the standard allows is a very common implementor extension
(e.g., AT END, INVALID KEY, SIZE ERROR, etc., etc., and their corresponding
NOT phrases).   That complicates its interpretation.

A fourth unstated issue is that at least one implementation has supported
the perpetuation in the user community of the misapprehension as to where
NEXT SENTENCE is supposed to transfer control.   It is my understanding that
this error is in the process of being corrected, but because of the large
body of existing programs that might be dependent on this erroneous
presumption, the vendor's policy and agreements with its customers requires
a long period of stern warnings from the compiler before the change can
actually be made.

A fifth unstated issue is that *before* the introduction of scope delimiters
NEXT SENTENCE could in most cases reasonably be considered an imperative
statement on its own without impacting parsing much if at all.  That
fostered its use in contexts in which the standard did not provide for it
(see issue #3 above).

As the standard says as its closing defense of the archaization of NEXT
SENTENCE, "The CONTINUE statement and scope delimiters can be used to
accomplish the same functionality and such constructs are clearer and less
prone to error."
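
As a small illustration of that last point, here is a sketch with 
hypothetical names (WS-CODE, WS-RESULT, OLD-STYLE and NEW-STYLE are 
invented for the example). OLD-STYLE uses NEXT SENTENCE in the one context 
discussed above that the standard permits, an undelimited IF; NEW-STYLE 
expresses the same logic with CONTINUE and END-IF, and needs only one 
period.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NSCONT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CODE     PIC X     VALUE "X".
       01  WS-RESULT   PIC X     VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM OLD-STYLE
           PERFORM NEW-STYLE
           STOP RUN.
      * Standard usage: NEXT SENTENCE inside an undelimited IF.
       OLD-STYLE.
           IF WS-CODE = "X"
               NEXT SENTENCE
           ELSE
               MOVE "N" TO WS-RESULT.
           DISPLAY "OLD-STYLE RESULT: " WS-RESULT.
      * Same logic with CONTINUE and END-IF; one period suffices.
       NEW-STYLE.
           IF WS-CODE = "X"
               CONTINUE
           ELSE
               MOVE "N" TO WS-RESULT
           END-IF
           DISPLAY "NEW-STYLE RESULT: " WS-RESULT.
```

The two forms are equivalent here only because the statement after the IF 
starts a new sentence in OLD-STYLE; if more statements followed the IF 
inside the same sentence, NEXT SENTENCE would skip them and CONTINUE would 
not.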

     -Chuck Stevens


7/12/2005 2:48:15 PM
On 12-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

> > Do you have code with ALTER in it?
>
>     I'd have to check; I've got about 470 test COBOL source files, but I
> haven't actually read through all of them yet. Regardless of whether an
> ALTER statement actually appears in any of them (though I suspect that it
> probably does appear somewhere at least once), the tool I'm working on is
> expected to eventually be able to handle ALTER statements; so if none of
> the test files contain an ALTER, I'd probably have to write a few new test
> files to be sure the transformation works.

Why do you suspect they might occur at least once?    I haven't seen one in
decades and they are no longer supported.

Depending upon your employment status, and how likely you think they might be,
you might want to propose simply doing exception reporting with ALTER
statements.

Hmm.   If you have ALTER statements, you might also have memory segmentation by
SECTION.   I'd also do exception reporting if you find this, as determining such
needs isn't trivial.     Again - I haven't seen this in decades either.
howard (6283)
7/12/2005 2:48:31 PM
On 12-Jul-2005, docdwarf@panix.com wrote:

> >Another hard to parse code is with "D" in column 6.
>
> I'm familiar with designating debugging code (SOURCE-COMPUTER. xxxxxxx
> WITH DEBUGGING MODE.) by putting a 'D' in column 7... what does putting
> a 'D' in the last column of the card's sequence number do?

My bad.    Column 7 - or with some compilers optionally column 1.
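
For reference, a minimal sketch of a debugging line in fixed format, 
assuming a placeholder computer name (GENERIC-HOST) and made-up data names. 
The "D" sits in column 7, the indicator area. With WITH DEBUGGING MODE 
present the line is compiled; without that clause, debugging lines are 
treated as comment lines.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DBGLINE.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
      * GENERIC-HOST is a placeholder computer name.
       SOURCE-COMPUTER. GENERIC-HOST WITH DEBUGGING MODE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNT    PIC 9(2) VALUE 5.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The next line has D in column 7: a debugging line.
      D    DISPLAY "DEBUG: WS-COUNT = " WS-COUNT
           ADD 1 TO WS-COUNT
           DISPLAY "COUNT IS " WS-COUNT
           STOP RUN.
```

A restructuring tool has to decide up front whether such lines count as 
code or as comments, because the answer changes where statements and 
sentences begin and end.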
howard (6283)
7/12/2005 2:49:47 PM
"Oliver Wong" <owong@castortech.com> wrote in message
news:_zQAe.106822$HI.2697@edtnps84...
> "Michael Mattias" <michael.mattias@gte.net> wrote in message
> news:bEPAe.592$8y1.417@newssvr17.news.prodigy.com...
> > Well, if this program is for internal use only, you may not have to deal
> > with "always." If your program logic encounters something you can't handle
> > (e.g., nesting level too deep, END-whatevers already mixed in, NEXT
> > SENTENCE, etc), you could always write your output with some kind of token
> > you could search for and leave those paragraphs/blocks for manual
> > reformatting after your program has done the bulk of the work.
>
>     Hmm, that's an interesting strategy, though I think it might not be
> compatible with the requirements I was given. Still, I might adapt something
> similar (inserting special marker tokens for intermediate results between
> two transformations, for example).

"Convert anything and everything with a single automated program" is a poor
spec if the goal is "convert all our programs." I have no doubt that most
constructions can be converted virtually 'on sight' by an experienced COBOL
programmer. If your program can handle 90% of situations, the programmer now
only has to "sight" ten (10) percent of the total workload. So how much does
a 90 percent reduction of "sight" save, versus the time necessary to make
your program handle that additional extra ten percent?

To make your program handle that last ten percent seems a waste of time and
money. (Intuitive/visceral, not empirical).

Of course, I have twenty-plus years as an executive accustomed to thinking
'bottom line.'

Then again, you may be working for a government agency, where no such
'bottom line' thinking is allowed.

MCM







7/12/2005 2:56:57 PM
On 12-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

> > Some people who believe in the one period per paragraph, use NEXT SENTENCE
> > as a
> > way to terminate a paragraph. I dislike this intensely, but I like periods
> > within my paragraph.
>
>     You mean they make the last statement in a paragraph be "NEXT SENTENCE"
> as sort of an indicator that this is the last statement in the paragraph?
> That does seem strange to me as well.

I mean they use it to mean EXIT PARAGRAPH.

 PARAGRAPH-NAME.
       READ MY-FILE AT END NEXT SENTENCE
       PERFORM A
       PERFORM B
       PERFORM C
       CONTINUE.
 NEXT-PARAGRAPH.


>     I've already written a COBOL parser to generate a parse tree, so that
> latter requirement isn't a problem for me. As for eliminating NEXT SENTENCE,
> I haven't encountered that in any of my test files, nor in any of the
> reference documentation I've been using (I think they were the references
> for VS COBOL/II, RM COBOL and ANSI 85 COBOL), but it is something that I'll
> probably eventually look into. I posted a strategy in another thread. I
> was wondering if you thought this was a feasible strategy:

It's a lot, lot more common than ALTER.


> > Another hard to parse code is with "D" in column 6.
>
>     Yeah, I wasn't sure what the semantics were of indicator codes in column
> 6.

I meant "7".   Sorry.


> From my understanding, a "*" denotes that the line should be ignored (as in
> a comment), and any alphabetic character indicates a conditional compilation
> line, so that all the lines marked with "A" could be toggled on or off
> independently of all the lines with "B" and so on, with "D", by
> convention (but with no particularly special support), being reserved for
> debugging code. Is this understanding correct?

* is a comment.   I'm not familiar with A and B.

People often comment out code with expectations that they can uncomment it and
have it work.

     IF    A=B
**  AND C=D
           PERFORM DEF
**   DEF is my new routine asked for by Joe Smith
     END-IF.

You may have to decide manually whether a comment is such code and what to do
about it.
howard (6283)
7/12/2005 2:57:25 PM
On 12-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

>  [ Line 1: Some COBOL Statement]
> *[ Line 2: A Comment]
> -[ Line 3: Continuation of line 1]
>
>     which when combined with COPY REPLACE statements (e.g. something half on
> line 1, half on line 3 needs to get replaced) can become quite painful to
> handle.

Don't forget to make sure your formatting doesn't move code into column 73 or
beyond.
howard (6283)
7/12/2005 2:59:13 PM
In article <db0l7v$dec$1@si05.rsvl.unisys.com>,
Chuck Stevens <charles.stevens@unisys.com> wrote:
><docdwarf@panix.com> wrote in message news:db0gd8$amg$1@panix5.panix.com...
>
>> Leaving aside the persnickety labelling for statement, sentence,
>> imperative, instruction, what happens at the end of a paragraph (etc)...
>> would someone pass this along to The Standards Folks and ask them for what
>> reason - given this level of interpretation shown by a rank neophyte -
>> they think the function of NEXT SENTENCE is so difficult to grasp?
>
>Well, I guess I'd qualify as A Standards Folk.

Makes the passing-along rather easy, it seems.

[snip]

>The two problems discussed in the '02 standard can be found on Page 833:
>"It is a common belief among users that control is transferred to a position
>after the scope delimiter rather than to a separator period that follows it
>somewhere.

Given that Mr Wong is a moderate neophyte and concluded from his research 
that 'based on the discussions I've found on the web, it looks like NEXT 
SENTENCE is an unconditional jump to the statement immediately following 
the period of the sentence containing the NEXT SENTENCE statement' it 
seems that he managed to sidestep this 'common belief'... but perhaps those 
who post to this group are, by definition, rather uncommon.

DD
docdwarf (6044)
7/12/2005 3:01:50 PM
"Howard Brazee" <howard@brazee.net> wrote in message
news:db0hti$d12$1@peabody.colorado.edu...

> Some people who believe in the one period per paragraph, use NEXT SENTENCE
> as a way to terminate a paragraph.    I dislike this intensely, but I like
> periods within my paragraph.     It appears that it is within your mandate
> to eliminate NEXT SENTENCE altogether, I'd consider doing so if I were you.
> Trouble is, converting NEXT SENTENCE to CONTINUE isn't a trivial task, and
> will require parsing.

As alluded to earlier, NEXT SENTENCE in the two contexts in which the '85
standard allows it -- non-delimited IF statements and non-delimited SEARCH
statements -- is an archaic element of '02 COBOL, which carries with it the
implication that its use is deprecated.  Thus, getting rid of NEXT SENTENCE
altogether is indeed a laudable goal.

That being said, in standard COBOL, the only effective contexts in which
NEXT SENTENCE can appear immediately before the period terminating a sentence
(whether the last or only sentence of a paragraph or not) are as part of the
ELSE phrase of a non-delimited IF statement, or as part of the last WHEN
phrase of a non-delimited SEARCH.

While CONTINUE is a STATEMENT and can stand alone, NEXT SENTENCE is a
PHRASE.

Put another way, in standard COBOL the following is legal:
PARAGRAPH-1.
    MOVE A TO B.
    CONTINUE.
PARAGRAPH-2.
    ...

while the following is not:
PARAGRAPH-1.
    MOVE A TO B.
    NEXT SENTENCE.
PARAGRAPH-2.
    ...

Note that I have learned from decades of painful experience supporting COBOL
(and other language) compilers that it is unwise to presume "Nobody in his
right mind would dream of even considering the possibility of writing
something like that."

    -Chuck Stevens


7/12/2005 3:06:33 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db0l8b$f2d$1@peabody.colorado.edu...
> Why do you suspect they might occur at least once?    I haven't seen one 
> in
> decades and they are no longer supported.

    The majority of the test files aren't "real" COBOL programs that do 
anything "useful", but rather are intended to illustrate as many different 
"things" that a correct ANSI 85 COBOL compiler should be able to handle. At 
the very least, I'd expect each statement to appear at least once.

> Depending upon your employment status, and how likely you think they might 
> be,
> you might want to propose simply doing exception reporting with ALTER
> statements.
>
> Hmm.   If you have ALTER statements, you might also have memory 
> segmentation by
> SECTION.   I'd also do exception reporting if you find this, as 
> determining such
> needs isn't trivial.     Again - I haven't seen this in decades either.

    I've done some research on the semantics of independent segments in 
SECTION declarations, and this is one of the obstacles I cannot currently 
handle.

    The strategy we're using right now is "start with the easy cases, and 
work your way up". So right now I'm trying to eliminate COBOL programs with 
just a single GO TO, no ALTER or PERFORM/PERFORM THRU or segmentation or 
anything like that. Gotta be able to handle GO TO in IF statements, loops, 
EVALUATE statements, and all those fun cases. Luckily, unlike some other 
languages, it seems like in COBOL you can only use GO TO to jump out of 
these constructs, not into the constructs. Once I've got all those cases 
working, I can start worrying about PERFORM THRUs and ALTERs. And then from 
there, maybe it'll become more clear how to handle segmentation. Worst case, 
as you've said, is for the tool to simply detect segmentation and give up, 
reporting an error.
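
    A sketch of the simplest case I have in mind, with hypothetical names 
(WS-A, OLD-FORM, OLD-FORM-EXIT and NEW-FORM are invented for the example): 
a forward GO TO that jumps out of an IF to an exit paragraph. The rewrite 
negates the condition and moves the skipped code into the IF body, which 
also removes the need for the PERFORM THRU.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GOTOELIM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-A        PIC 9(4)  VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM OLD-FORM THRU OLD-FORM-EXIT
           PERFORM NEW-FORM
           STOP RUN.
      * Original: GO TO jumps out of the IF to the exit paragraph.
       OLD-FORM.
           IF WS-A = 0
               GO TO OLD-FORM-EXIT.
           DISPLAY "OLD: PROCESSING".
       OLD-FORM-EXIT.
           EXIT.
      * Rewritten: condition negated, skipped code moved into the IF.
       NEW-FORM.
           IF WS-A NOT = 0
               DISPLAY "NEW: PROCESSING"
           END-IF.
```

    This only works as shown if nothing else in the program performs or 
jumps to OLD-FORM-EXIT on its own; the tool would have to check that 
before dropping the paragraph.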

    - Oliver 


owong (6177)
7/12/2005 3:07:39 PM
This discussion has encouraged me to pose a question to the group which
I've been contemplating for some time.  One of the goals of tools like
this is to eliminate spaghetti code.  Accepting that as a good thing,
the question arises: exactly what is spaghetti code?  If GOTO =
spaghetti code - that is, if a program that uses GOTO's is automatically
defined as SC, then there's not much to discuss except whether we agree
with the definition or not.  But that strikes me as much too
simple-minded.  Examples in texts that I've seen show program flow
leaping from one end of the program to another, allegedly because the
programmer kept on thinking of things he'd missed or wanting to reuse
code.  It's always struck me that those practices are very much rookie
mistakes.  I can't believe that anyone with some experience under
his/her belt and any level of aptitude for the job would continue doing
so.

I've encountered one example of this - which I called disaster code, not
just spaghetti code - written in BASIC; every statement was followed by
a GOTO to some other part of the program (EVERY statement except the
start & end!); this was a commercial accounting package and the
rationale was to make theft of the code more difficult; by gum, it
succeeded at that!  And there has been just one COBOL program  (by
somebody else, of course) that I have not been able to follow - the only
one I ever gave up in disgust on and rewrote; its problem was that it
had READ's all over the place and I couldn't work out what the heck it
was doing.

Anyway: I'd like to hear some definitions and examples!

Peter
lacey (134)
7/12/2005 3:09:47 PM
"Howard Brazee" <howard@brazee.net> wrote in message
news:db0lp1$fd1$1@peabody.colorado.edu...

>  PARAGRAPH-NAME.
>        READ MY-FILE AT END NEXT SENTENCE
>        PERFORM A
>        PERFORM B
>        PERFORM C
>        CONTINUE.
>  NEXT-PARAGRAPH.

NEXT SENTENCE in AT END is a vendor extension, not standard COBOL.

Presuming '85 COBOL, how about something like:

PARAGRAPH-NAME.
    READ MY-FILE
        AT END
            CONTINUE
        NOT AT END
            PERFORM A
            PERFORM B
            PERFORM C
    END-READ.
NEXT-PARAGRAPH.

    -Chuck Stevens


7/12/2005 3:15:09 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db0ifp$dba$1@peabody.colorado.edu...
> Here's some ugly code that compiles with my compiler.    After converting 
> it,
> what would your results look like?

    Probably not what you'd "want" =P. First of all, the code isn't there 
yet, so if I were to actually post some COBOL code, it would be me as a 
human trying to mechanically follow through the steps of the algorithm, and 
quite probably making mistakes here and there. Second, a lot of simplifying 
assumptions were made during the design of this tool. For example, during 
the parse phase, all conditional compilation lines (e.g. the ones with "D" 
in the indicator field) are treated as comments, and the contents of 
copybooks are inserted right into the code, so that the next phases can 
assume that all preprocessing has been done.

    Please recall that the output produced by this tool isn't meant to be 
read by humans, but rather, will serve as the input for other tools.

    - Oliver 


0
owong (6177)
7/12/2005 3:17:02 PM
"Howard Brazee" <howard@brazee.net> wrote in message
news:db0lsd$ffp$1@peabody.colorado.edu...
>
> On 12-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:
>
> >  [ Line 1: Some COBOL Statement]
> > *[ Line 2: A Comment]
> > -[ Line 3: Continuation of line 1]
> >
> >     which when combined with COPY REPLACE statements (e.g. something half
> > on line 1, half on line 3 needs to get replaced) can become quite painful
> > to handle.
>
> Don't forget to make sure your formatting doesn't move code into column 73
> or beyond.

Implementation-specific.  The location of Margin R has been defined by the
individual implementor since no later than the '74 standard.  I have some
reason to think that ANSI X3.23-1968 may have specified Margin R as being
between columns 72 and 73 of the source image, but in ANSI X3.23-1974 and
subsequent standards (including ISO/IEC 1989:2002 fixed-form reference
format) it's clear that it's up to the compiler writer how long Area B is.
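Since the position of Margin R is up to the implementor, a reformatter probably wants it as a setting rather than a hard-coded 72. A minimal Python sketch of such a check follows; the function name and the default of 72 are illustrative assumptions, and it assumes any identification area (columns 73-80) has already been stripped from the lines.

```python
def lines_past_margin_r(lines, margin_r=72):
    """Return (line_number, length) pairs for lines with text beyond margin_r.

    margin_r is the last column the compiler treats as program text; 72 is
    only a common default, not something the standard guarantees.
    """
    offenders = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip()  # trailing blanks and newlines do not count
        if len(text) > margin_r:
            offenders.append((number, len(text)))
    return offenders
```

Running this over the tool's output after each transformation would flag any statement that a re-indent pushed too far to the right.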

    -Chuck Stevens


0
7/12/2005 3:20:11 PM
"Oliver Wong" <owong@castortech.com> wrote in message 
news:l0PAe.149449$on1.34681@clgrps13...
> <docdwarf@panix.com> wrote in message news:dauqfd$bh$1@panix5.panix.com...
>> In article <AEAAe.105441$HI.1740@edtnps84>,

<snip snip>
>    All I'm saying is that I've been given the task of writing code that 
> eliminates GO TO, ALTER, PERFORM THRU and other similar statements, 
> regardless of whether I think the resulting code will look pretty or not. 
> Now given these requirements, I'm seeking a bit of help, because I'm not 
> familiar with COBOL enough to just wade through and be confident that my 
> transformations won't change the behaviour of the program.
>
>    - Oliver

I wish I worked somewhere that had enough $$$ to spend on this type of 
requirement.

Please, go through the works of Coleridge and remove all instances of 
"dialogue" as a verb..we don't care about the aesthetics or the art of his 
work.  We are looking to satisfy the definition of the word according to 11 
members of the board - and right now there is concern about the confusion 
caused by having "archaic" words populate our world forums.


JCE 


0
defaultuser (532)
7/12/2005 3:28:55 PM
<docdwarf@panix.com> wrote in message news:db0m0u$c4l$1@panix5.panix.com...
> In article <db0l7v$dec$1@si05.rsvl.unisys.com>,
> Chuck Stevens <charles.stevens@unisys.com> wrote:
> Given that Mr Wong is a moderate neophyte and concluded from his research
> that 'based on the discussions I've found on the web, it looks like NEXT
> SENTENCE is an unconditional jump to the statement immediately following
> the period of the sentence containing the NEXT SENTENCE statement' it
> seems that he managed sidestep this 'common belief'... but perhaps those
> who post to this group are, by definition, rather uncommon.

    Writing language tools (e.g. parsers, compilers, etc.) means I had to 
read the language references and documentation much more carefully than 
I might otherwise have done if I were merely interested in learning how to 
program in said language. For example, I felt I was a relatively 
knowledgeable Java programmer, but it wasn't until I wrote Java compilation 
tools that I heard about the strictfp modifier, or that you could 
declare functions in the form "public int foo()[][][] { /*body of 
function*/ }".

    And it is certainly possible that if you use this newsgroup as a sample 
space for surveys of COBOL programmers, your results will be skewed.

    - Oliver 


0
owong (6177)
7/12/2005 3:31:58 PM
"Chuck Stevens" <charles.stevens@unisys.com> wrote in message 
news:db0ma9$e55$1@si05.rsvl.unisys.com...
>
> While CONTINUE is a STATEMENT and can stand alone, NEXT SENTENCE is a
> PHRASE.

    Could you clarify this a bit? I have only a vague understanding of the 
hierarchy of statements and sentences and such. For me, the clearest picture 
of this hierarchy is 
http://www.csis.ul.ie/COBOL/Course/Resources/pics/CobolStructure.gif , but 
the documentation I've seen often refers to phrases and clauses and such, 
and I'm not sure what the significance of all these names are.

    - Oliver 


0
owong (6177)
7/12/2005 3:42:17 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db0lp1$fd1$1@peabody.colorado.edu...
>
> * is a comment.   I'm not familiar with A and B.

    I've encountered code where it seems any alphabetic character can appear 
in the indicator column, and the stuff in the "main code area" seemed to be 
executable COBOL statements (i.e. not merely comments), so I had assumed 
these characters served as some sort of conditional compilation flags. 
Here's a random snippet from one of my test files (hopefully word wrapping 
won't mangle up this code; just in case, I've added an initial full line of 
hyphens to indicate the intended width the code should display at).

--------------------------------------------------------------------------------
025700 WRITE-LINE.                                                      IF1114.2
025800     ADD 1 TO RECORD-COUNT.                                       IF1114.2
025900Y    IF RECORD-COUNT GREATER 42                                   IF1114.2
026000Y        MOVE DUMMY-RECORD TO DUMMY-HOLD                          IF1114.2
026100Y        MOVE SPACE TO DUMMY-RECORD                               IF1114.2
026200Y        WRITE DUMMY-RECORD AFTER ADVANCING PAGE                  IF1114.2
026300Y        MOVE CCVS-H-1  TO DUMMY-RECORD  PERFORM WRT-LN 2 TIMES   IF1114.2
026400Y        MOVE CCVS-H-2A TO DUMMY-RECORD  PERFORM WRT-LN 2 TIMES   IF1114.2
026500Y        MOVE CCVS-H-2B TO DUMMY-RECORD  PERFORM WRT-LN 3 TIMES   IF1114.2
026600Y        MOVE CCVS-H-3  TO DUMMY-RECORD  PERFORM WRT-LN 3 TIMES   IF1114.2
026700Y        MOVE CCVS-C-1  TO DUMMY-RECORD  PERFORM WRT-LN           IF1114.2
026800Y        MOVE CCVS-C-2  TO DUMMY-RECORD  PERFORM WRT-LN           IF1114.2
026900Y        MOVE HYPHEN-LINE TO DUMMY-RECORD PERFORM WRT-LN          IF1114.2
027000Y        MOVE DUMMY-HOLD TO DUMMY-RECORD                          IF1114.2
027100Y        MOVE ZERO TO RECORD-COUNT.                               IF1114.2

    - Oliver 


0
owong (6177)
7/12/2005 3:47:12 PM
On 12-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

>     The strategy we're using right now is "start with the easy cases, and
> work your way up". So right now I'm trying to eliminate COBOL programs with
> just a single GO TO, no ALTER or PERFORM/PERFORM THRU or segmentation or
> anything like that

Decide whether you want to eliminate SECTIONs and all drop through paragraphs.

Figure out how you are going to change copy members or database includes.   
Obviously you can't change one of these without affecting all programs that use
them.

Even if you don't handle these now, you will need to have a planned strategy so
that your "easy cases" fit into the big picture.

Does your new compiler have some of the very newest features such as EXIT
PARAGRAPH?
0
howard (6283)
7/12/2005 3:56:19 PM
On 12-Jul-2005, Peter Lacey <lacey@mb.sympatico.ca> wrote:

> This discussion has encouraged me to pose a question to the group which
> I've been contemplating for some time.  One of the goals of tools like
> this is to eliminate spaghetti code.

Let me try.

Spaghetti code is designed around flow.   It's a highway where each decision has
us leave the highway, perform a task, and then continue the journey.   There is
no requirement that we return to the highway via the same exit we left.   Maybe
I left at exit 180, drove around looking for a restaurant that interested me,
and returned to exit 178 to finish my journey.    It is possible to save time
taking the shortcut back to the freeway, but it is also possible to get lost.
A GO TO can direct you to exit 178 (or 176), or you can have a series of
switches that tell you where to turn at each street until you get to exit 178
(or 176).     Either way is spaghetti code.

Structured flow is more atomic in nature.   It is a step towards OO, but not
really there.   We are using maps.  When we exit on exit 180, we return to exit
180 to continue our journey.    Every time we leave the main path, we return to
the exact same spot.    Each paragraph performs a function and then returns.
Complicated functions have sub-functions, but the concept is the same.    Since
you don't return via exit 178, you don't need the GOTO.
0
howard (6283)
7/12/2005 4:13:07 PM
On 12-Jul-2005, "Chuck Stevens" <charles.stevens@unisys.com> wrote:

> Put another way, in standard COBOL the following is legal:
> PARAGRAPH-1.
>     MOVE A TO B.
>     CONTINUE.
> PARAGRAPH-2.
>     ...
>
> while the following is not:
> PARAGRAPH-1.
>     MOVE A TO B.
>     NEXT SENTENCE.
> PARAGRAPH-2.

Speaking of CONTINUE...

The following is legal:
 MY-PARAGRAPH-EXIT.
D        DISPLAY "REACHED MY-PARAGRAPH-EXIT".
           CONTINUE.

While the legality of the following depends on whether DEBUGGING MODE is on,
because EXIT must be the only sentence in its paragraph:
 MY-PARAGRAPH-EXIT.
D        DISPLAY "REACHED MY-PARAGRAPH-EXIT".
           EXIT.


  
0
howard (6283)
7/12/2005 4:16:38 PM
Top posting; no more below.

The diagram you cite lacks entirely the concept of "phrase".  Statements
contain *phrases* according to their individual syntax diagrams (e.g., the
WHEN phrase of the SEARCH statement, the AT END phrase of the READ
statement).   There should thus be one more level of "nesting" in the
diagram if it is intended to include all concepts.  The appropriate and
ultimate authority on the subject of the component parts of a COBOL source
program that conforms to the 1985 standard is, of course, ANSI X3.23-1985.
A given "phrase" is meaningful only in the context of the "statement" that
includes it in its syntax diagram.

The only places standard COBOL allows the NEXT SENTENCE "phrase" are the
following:

    1)  IF <some-condition> NEXT SENTENCE
    2)  IF <some-condition> <imperative-statement> ELSE NEXT SENTENCE
    3)  IF <some-condition> NEXT SENTENCE ELSE NEXT SENTENCE [yes, this is
allowed!]
    4)  SEARCH ... WHEN <some-condition> NEXT SENTENCE
    5)  SEARCH ALL ... WHEN <key comparison> NEXT SENTENCE

The standard explicitly disallows
    1)  IF ... NEXT SENTENCE ... END-IF [whether the phrase appears before
or after ELSE]
    2)  SEARCH ... NEXT SENTENCE END-SEARCH [whether SEARCH or SEARCH ALL]

Conditional statements include *NON-delimited* varieties of:  EVALUATE; IF;
SEARCH; RETURN; READ with [NOT] AT END or [NOT] INVALID KEY; WRITE with
[NOT] INVALID KEY or [NOT] AT END-OF-PAGE; DELETE, REWRITE or START with
[NOT] INVALID KEY; ADD, SUBTRACT, MULTIPLY, DIVIDE or COMPUTE with [NOT] ON
SIZE ERROR; RECEIVE ... NO DATA; RECEIVE ... WITH DATA; STRING or UNSTRING
with [NOT] ON OVERFLOW; and CALL with ON OVERFLOW or [NOT] ON EXCEPTION.
The *ONLY* ones of these in which NEXT SENTENCE is allowed as an alternative
for an imperative statement are IF and SEARCH.

Delimited versions of the above are *imperative* statements in and of
themselves.

It is a MISTAKE to assume that NEXT SENTENCE can be substituted for any
imperative statement in standard COBOL.

It is a MISTAKE to assume that NEXT SENTENCE can be used as a part of any
conditional statement in standard COBOL.

If an implementor allows either of these assumptions, it does so as an
implementor extension to standard COBOL.
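For a conversion tool, the rules above boil down to a small predicate. This is a rough Python sketch only; the function and constant names are made up for illustration, and it looks only at the verb of the enclosing statement and whether that statement has an explicit scope terminator.

```python
# Verbs whose NON-delimited forms may take NEXT SENTENCE in standard COBOL.
NEXT_SENTENCE_VERBS = {"IF", "SEARCH"}


def next_sentence_is_standard(verb, has_scope_terminator):
    """True if NEXT SENTENCE inside this statement conforms to the standard.

    verb: the verb of the enclosing conditional statement, e.g. "IF", "READ".
    has_scope_terminator: True if the statement ends with END-IF, END-SEARCH,
    and so on, which makes it a delimited (imperative) statement.
    """
    return verb.upper() in NEXT_SENTENCE_VERBS and not has_scope_terminator
```

Anything for which this returns False would have to be treated as an implementor extension and handled according to that compiler's rules.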

    -Chuck Stevens

"Oliver Wong" <owong@castortech.com> wrote in message
news:txRAe.106836$HI.57415@edtnps84...
>
> "Chuck Stevens" <charles.stevens@unisys.com> wrote in message
> news:db0ma9$e55$1@si05.rsvl.unisys.com...
> >
> > While CONTINUE is a STATEMENT and can stand alone, NEXT SENTENCE is a
> > PHRASE.
>
>     Could you clarify this a bit? I have only a vague understanding of the
> hierarchy of statements and sentences and such. For me, the clearest picture
> of this hierarchy is
> http://www.csis.ul.ie/COBOL/Course/Resources/pics/CobolStructure.gif , but
> the documentation I've seen often refers to phrases and clauses and such,
> and I'm not sure what the significance of all these names are.
>
>     - Oliver
>
>


0
7/12/2005 4:27:10 PM
"Oliver Wong" <owong@castortech.com> wrote in message
news:4CRAe.106837$HI.38691@edtnps84...
>     I've encountered code where it seems any alphabetic character can appear
> in the indicator column, and the stuff in the "main code area" seemed to be
> exectuable COBOL statements (i.e. not merely comments), so I had assumed
> these characters served as some sort of conditional compilation flags.

In '85 standard COBOL, the only things that can appear in Column 7 are " ",
"-", "*", "/", "D" and "d".  If other characters are allowed by an
implementor to appear in the Indicator Area, their purpose is entirely up to
the individual implementor and likely will have nothing whatever to do with
what another COBOL implementation might do upon encountering such a
character.   So long as the compiler lets the user know in some way (which
might include by committing hara-kiri) that such things are not standard
COBOL, as far as the standard is concerned the compiler can turn them into
rutabagas or Novembers.
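In a parser, that list of indicator characters can be a simple lookup on column 7. A minimal Python sketch follows; the category names and the function name are illustrative assumptions, and everything outside the '85 list is lumped together as implementor-defined.

```python
# Column 7 characters defined by the '85 standard, per the list above.
STANDARD_INDICATORS = {
    " ": "normal",
    "-": "continuation",
    "*": "comment",
    "/": "comment-with-page-eject",
    "D": "debugging",
    "d": "debugging",
}


def classify_indicator(line):
    """Classify a fixed-form source line by the character in column 7."""
    # Column 7 is index 6; a line too short to reach it counts as blank.
    indicator = line[6] if len(line) > 6 else " "
    return STANDARD_INDICATORS.get(indicator, "implementor-defined")
```

A line such as `025900Y    IF RECORD-COUNT GREATER 42` would come back as implementor-defined, which is the cue to consult that particular compiler's (or preprocessor's) documentation rather than guess.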

    -Chuck Stevens


0
7/12/2005 4:52:12 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db0p7f$hej$1@peabody.colorado.edu...
>
> Decide whether you want to eliminate SECTIONs and all drop through 
> paragraphs.

    Though not a requirement, I will probably end up removing all SECTIONs 
and drop through paragraphs, as a lot of the ideas I'm using are based on 
Michael Kasten's recommendations in "Rewriting Spaghetti Code" 
(http://home.swbell.net/mck9/cobol/style/rewrite.html) in which he 
recommends such eliminations.

> Figure out how you are going to change copy members or database includes.
> Obviously you can't change one of these without affecting all programs 
> that use
> them.

    Hmm, this is a very good point. I realize I haven't fully thought out 
how to handle modification of copy members yet. Thanks.

> Even if you don't handle these now, you will need to have a planned 
> strategy so
> that your "easy cases" fit into the big picture.
>
> Does your new compiler have some of the very newest features such as EXIT
> PARAGRAPH?

    Not sure I understand the question, but I'm tempted to answer yes. Some 
of the code that my tools will have to handle may contain the EXIT PARAGRAPH 
statement, and some of the code that my tools output may contain the EXIT 
PARAGRAPH statement, and as far as I know, there is currently no plan on 
eliminating such statements.

    - Oliver 


0
owong (6177)
7/12/2005 4:57:52 PM
Peter Lacey wrote:
> This discussion has encouraged me to pose a question to the group which
> I've been contemplating for some time.  One of the goals of tools like
> this is to eliminate spaghetti code.  Accepting that as a good thing,
> the question arises: exactly what is spaghetti code?

For some great examples, see recent long program listings in
comp.lang.fortran. :-).

It's not just GOTOs at fault. Calling subroutines peppered around the
program in seemingly random order is also a problem. Here's an
operational definition of spaghetti code in approximate order of
severity:

1. I really can't fathom what the program is doing, even in broad
detail.
2. I print a program listing, then mark it up in pencil, then various
colored markers, then highlighters, etc.
3. I toss the listing into the wastebasket in disgust.
4. I wad the listing into a ball and shoot it into the basket.
5. I light the contents of the wastebasket on fire. :-).
6. I rewrite the program from scratch.
7. I fire the guy who wrote it.
8. I renounce computers and join a monastic order.

0
epc8 (1259)
7/12/2005 5:05:13 PM
"Peter Lacey" <lacey@mb.sympatico.ca> wrote in message 
news:42D3DD3B.38BCDEDC@mb.sympatico.ca...
> the question arises: exactly what is spaghetti code?
[...]
> Anyway: I'd like to hear some definitions and examples!

    I come from a Java background, so this answer might not apply very well 
to (traditional) COBOL, as it relies a bit on using an object oriented 
paradigm, but...

    I believe that in an ideal program, if you wanted to know what a piece 
of code does, you wouldn't need to read anything other than the code in 
question. That is to say, if you were reading through the code listing, and 
saw a method call (I guess the closest COBOL equivalent is a PERFORM 
statement), your first instinct wouldn't be to immediately scroll over to 
read the contents of that method (or paragraph); rather, it would be clear 
what that method does at a high level based on the name of that method.

    That's one reason the presence of GOTOs implies unstructured (or 
"spaghetti") code: Even if you give a really meaningful name to the 
paragraph you're GOTO-ing, the reader still has a strong temptation to 
scroll down and read the code there, to see if you eventually branch back to 
where you came from (maybe this isn't possible in COBOL; I may be thinking 
more of BASIC's GOSUB and RETURN statements).

    This also explains why it's possible (though often unlikely) to write 
structured (i.e. non-spaghetti) code even with the presence of GOTO 
statements. If after reading through the code listing for a bit, you realize 
that the original author was very diligent and organized and so on, then 
you might be confident that this author has arranged it so that whenever she 
GOTOs somewhere, she always GOTOs back; and so you no longer have the 
temptation, upon encountering a GOTO, to scroll down and read that bit of 
code; you're confident that control will eventually come back, and the name 
of the label of the GOTO target made it clear what that subroutine was 
doing.

    The reason spaghetti code, under the definition I'm offering, is 
difficult for humans to understand is that it essentially requires us to 
maintain a "call stack" like structure in our minds. While reading some 
spaghetti code, when we see a branch, we're tempted to go read whatever code 
might be at the target of that branch. So we have to remember where we are 
now, and go check it out. While reading THAT code, we then get tempted 
again, to go look at another part of the code. We have to push a new address 
onto our stack, and go read THAT new section of code, and so on.

    With structured code, there is no temptation to go browsing around at 
other locations, and so our stack height will always remain at 1 (or 0, 
depending on whether you count the "current location" as being on the stack 
or not).

    BTW, this is also why comments at the beginning of methods (or 
paragraphs or whatever) that describe what the method does can reduce the 
symptoms of spaghetti code. Now, if you see a method call for which you're 
tempted to branch off into, rather than actually reading the code of that 
method, you can just read the high level description of the method (i.e. the 
comment at the top), and so your stack never exceeds 2 (or 1, again 
depending on how you count).

    - Oliver 


0
owong (6177)
7/12/2005 5:37:35 PM
On 12-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

> > Does your new compiler have some of the very newest features such as EXIT
> > PARAGRAPH?
>
>     Not sure I understand the question, but I'm tempted to answer yes. Some
> of the code that my tools will have to handle may contain the EXIT PARAGRAPH
> statement, and some of the code that my tools output may contain the EXIT
> PARAGRAPH statement, and as far as I know, there is currently no plan on
> eliminating such statements.

My compiler is old enough that it does not support EXIT PERFORM (EXIT PARAGRAPH
and EXIT SECTION).   If yours has these ways to get out of a loop, you will want
to use them.

Hmmm.    Consider the following code and what you plan to do to it.
      PERFORM UNTIL A=B
            PERFORM GET-SALAD
            ADD 1 to A
            IF SALAD = GREEN
                   GO TO EXIT
            END-IF
            PERFORM GET-DRESSING
      END-PERFORM.
0
howard (6283)
7/12/2005 5:39:18 PM
On 12-Jul-2005, "Chuck Stevens" <charles.stevens@unisys.com> wrote:

> The standard explicitly disallows
>     1)  IF ... NEXT SENTENCE ... END-IF [whether the phrase appears before
> or after ELSE]
>     2)  SEARCH ... NEXT SENTENCE END-SEARCH [whether SEARCH or SEARCH ALL]

But in this thread we have programs with ALTER in them, so I wouldn't assume that
the code is from COBOL compilers which follow this standard either.   (My
compiler allows the above disallowed code).
0
howard (6283)
7/12/2005 5:42:47 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db0v8j$kkc$1@peabody.colorado.edu...
> My compiler is old enough that it does not support EXIT PERFORM (EXIT 
> PARAGRAPH
> and EXIT SECTION).   If yours has these ways to get out of a loop, you 
> will want
> to use them.

    Oh, yes, in that sense, I can indeed output code using the EXIT PERFORM 
statement. Thanks for the tip; it would have probably been very easy for me 
to go overboard with adding flags everywhere and forgetting about the 
existence of the EXIT PERFORM statement (indeed, one of the transformations 
I was planning on using does require a way to break out of loops).

>
> Hmmm.    Consider the following code and what you plan to do to it.
>      PERFORM UNTIL A=B
>            PERFORM GET-SALAD
>            ADD 1 to A
>            IF SALAD = GREEN
>                   GO TO EXIT
>            END-IF
>            PERFORM GET-DRESSING
>      END-PERFORM.

    Okay, again this is just me as a human trying to mechanically apply the 
rules, but I think the transformation step will look like this (assuming no 
malicious ALTERs or anything like that elsewhere in the code):

PERFORM UNTIL A=B
  PERFORM GET-SALAD
  ADD 1 to A
  IF SALAD = GREEN
    SET SHOULD-GOTO-EXIT TO TRUE
    EXIT PERFORM
  END-IF
  PERFORM GET-DRESSING
END-PERFORM
IF SHOULD-GOTO-EXIT THEN
  GO TO EXIT
END-IF
.

Where the transformation of the last GOTO depends on whether EXIT appears 
above or below the GO TO.

    - Oliver 


0
owong (6177)
7/12/2005 5:48:25 PM
"Chuck Stevens" <charles.stevens@unisys.com> wrote in message 
news:db0sgc$hs0$1@si05.rsvl.unisys.com...
> In '85 standard COBOL, the only things that can appear in Column 7 are " 
> ",
> "-", "*", "/", "D" and "d".  If other characters are allowed by an
> implementor to appear in the Indicator Area, their purpose is entirely up 
> to
> the individual implementor and likely will have nothing whatever to do 
> with
> what another COBOL implementation might do upon encountering such a
> character.

    That's very strange, because the code I posted above actually comes from 
a COBOL85 test suite produced by NIST. To quote their page, "The COBOL85 
test suite is a product of the National Computing Centre, UK. It is used to 
determine, insofar as is practical, the degree to which a COBOL processor 
conforms to the COBOL standard (ANSI X3.23-1985, ISO 1989-1985, ANSI 
X3.23a-1989 and ANSI X3.23b-1993.)" So I had assumed that these alphabetic 
characters in the indicator field must have been part of the ANSI 85 COBOL 
standard.

    Anyway, you can download the same test suite I did from 
http://www.itl.nist.gov/div897/ctg/cobol_form.htm , and the file in question 
was "IF111A", and the snippet I quoted begins on line 257.

    - Oliver 


0
owong (6177)
7/12/2005 5:52:50 PM
You do NOT need to add a new paragraph name to handle "next sentence".  The 
following is for a "85 standard conforming environment".

Original:
  If A = "A"
      Next Sentence
  Else
     Perform XYZ
    .
  Move L to M
   .

Converted code:

  Perform
     If A = "A"
         Continue
     Else
         Perform XYZ
     End-IF
   End-Perform
   Move L to M
    .

The following assumes an environment with an EXTENSION (available in Micro 
Focus, IBM, and possibly others) that allows NEXT SENTENCE in a nested IF with 
END-IF at the same level.

Original Source:
  If A = "A"
     If B = "B"
        Next Sentence
     Else
        Perform XYZ
     End-IF
  Else
     Perform A23
 End-If
    .
Move L to M
  .

Modified code:
Perform
 If A = "A"
     If B = "B"
        Continue
     Else
        Perform XYZ
     End-IF
  Else
     Perform A23
   End-If
End-Perform
Move L to M
  .


-- 
Bill Klein
 wmklein <at> ix.netcom.com
"Oliver Wong" <owong@castortech.com> wrote in message 
news:vcPAe.149478$on1.126481@clgrps13...
>
> "Richard" <riplin@Azonic.co.nz> wrote in message 
> news:1121117821.011458.183950@g14g2000cwa.googlegroups.com...
>> I can. It has been recently argu^H^H^H discussed here that a NEXT
>> SENTENCE embedded in a nested imperitive IF is allowed (some even think
>> it was deliberately allowed) and this will transfer control to after
>> the next full stop which may be at some arbitrary point.
>>
>> In my view any such usage should cause the converter to put the output
>> and input to /dev/nul, and preverably the coder as well, but it does
>> mean that such a program cannot have that full stop removed.
>
>    Hmm, interesting. I did not find any mention of "NEXT SENTENCE" in any of 
> the documentation I initially used as references for the COBOL language, but 
> based on the discussions I've found on the web, it looks like NEXT SENTENCE is 
> an unconditional jump to the statement immediately following the period of the 
> sentence containing the NEXT SENTENCE statement, as you've said.
>
>    I think this would be one one case where I would have to add a new 
> paragraph name (to act as a label), then change the NEXT SENTENCE to a GO TO 
> which jumps to that label. All the while, I have to be check for any PERFORM 
> or PERFORM THRU statements that I might have to update to reflect the addition 
> of this new paragraph.
>
>    Thanks for the tip, I might have to factor in this new issue into the 
> algorithm design.
>
>    - Oliver.
> 


0
wmklein (2605)
7/12/2005 5:54:58 PM
What is your position going to be about GO TO DEPENDING?


One disadvantage to removing periods first is that parsing needs to look at
complete sentences.   You read a sentence, get rid of spaces and lines and such,
handle continuations (I presume you need to check for this) etc.   If you remove
the distinction between sentences and paragraphs, the smallest element to be
parsed will be the paragraph.
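
As a rough illustration of reading one sentence at a time, here is a Python sketch that splits a paragraph body at separator periods. It assumes continuation lines have already been joined and comment lines removed, it treats a period as a separator only when followed by whitespace or end of text, and it skips periods inside quoted literals; the function name is hypothetical.

```python
def split_sentences(paragraph_body):
    """Split a COBOL paragraph body into sentences at separator periods."""
    sentences = []
    current = []
    quote = None  # the quote character of the literal we are inside, if any
    for index, ch in enumerate(paragraph_body):
        if quote is not None:
            current.append(ch)
            if ch == quote:
                quote = None  # a doubled quote simply reopens the literal
        elif ch in "\"'":
            quote = ch
            current.append(ch)
        elif ch == "." and (index + 1 == len(paragraph_body)
                            or paragraph_body[index + 1].isspace()):
            sentence = "".join(current).strip()
            if sentence:
                sentences.append(sentence)
            current = []
        else:
            current.append(ch)  # includes the decimal point in 3.5
    tail = "".join(current).strip()
    if tail:
        sentences.append(tail)  # unterminated text at the end of the body
    return sentences
```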
0
howard (6283)
7/12/2005 5:56:22 PM
The (old no longer supported) NIST test suite requires one to run a 
"preparation" program that does some source code modification based on "control 
cards" passed to it.  For example, it tests literal continuation based on input 
on where the R-margin is.

I know of NO test case that has "non-Standard" alphabetic characters in column 7 
*after* the conversion program runs.  If you have an example, please tell me the 
test case name.

FYI,
  If you are using the NIST test suite to "check out" your converter that is 
both good and bad.

A) GOOD - it really DOES test every "standard" COBOL "valid" construction
B) GOOD - it is AWFUL spaghetti code - so it will test your tool
C) BAD - it is NOT typical application code and does NOT use many common 
"constructs" and does use some that I never saw in any real-world application.

-- 
Bill Klein
 wmklein <at> ix.netcom.com
"Oliver Wong" <owong@castortech.com> wrote in message 
news:SrTAe.106860$HI.61039@edtnps84...
>
> "Chuck Stevens" <charles.stevens@unisys.com> wrote in message 
> news:db0sgc$hs0$1@si05.rsvl.unisys.com...
>> In '85 standard COBOL, the only things that can appear in Column 7 are " ",
>> "-", "*", "/", "D" and "d".  If other characters are allowed by an
>> implementor to appear in the Indicator Area, their purpose is entirely up to
>> the individual implementor and likely will have nothing whatever to do with
>> what another COBOL implementation might do upon encountering such a
>> character.
>
>    That's very strange, because the code I posted above actually comes from a 
> COBOL85 test suite produced by NIST. To quote their page, "The COBOL85 test 
> suite is a product of the National Computing Centre, UK. It is used to 
> determine, insofar as is practical, the degree to which a COBOL processor 
> conforms to the COBOL standard (ANSI X3.23-1985, ISO 1989-1985, ANSI 
> X3.23a-1989 and ANSI X3.23b-1993.)" So I had assumed that these alphabetic 
> characters in the indicator field must have been part of the ANSI 85 COBOL 
> standard.
>
>    Anyway, you can download the same test suite I did from 
> http://www.itl.nist.gov/div897/ctg/cobol_form.htm , and the file in question 
> was "IF111A", and the snippet I quoted begins on line 257.
>
>    - Oliver
> 


0
wmklein (2605)
7/12/2005 6:03:54 PM
ALTER is *obsolete in*, but not *gone from*, the '85 standard.  It didn't
get disappeared until '02.

    -Chuck Stevens

"Howard Brazee" <howard@brazee.net> wrote in message
news:db0vf3$ktn$1@peabody.colorado.edu...
>
> On 12-Jul-2005, "Chuck Stevens" <charles.stevens@unisys.com> wrote:
>
> > The standard explicitly disallows
> >     1)  IF ... NEXT SENTENCE ... END-IF [whether the phrase appears before
> > or after ELSE]
> >     2)  SEARCH ... NEXT SENTENCE END-SEARCH [whether SEARCH or SEARCH ALL]
>
> But in this thread we have programs with ALTER in it, so I wouldn't assume that
> the code is from CoBOL compilers which follow this standard either.   (My
> compiler allows the above disallowed code).


0
7/12/2005 6:03:56 PM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:StTAe.227354$ub.180201@fe07.news.easynews.com...
> You do NOT need to add a new paragraph name to handle "next sentence". 
> The following is for a "85 standard conforming environment".
>
> Original:
>  If A = "A"
>      Next Sentence
>  Else
>     Perform XYZ
>    .
>  Move L to M
>   .
>
> Converted code:
>
>  Perform
>     If A = "A"
>         Continue
>     Else
>         Perform XYZ
>     End-IF
>   End-Perform
>   Move L to M
>    .

That's a very interesting usage of the inline PERFORM statement! I'll have 
to look more into this one. Naively, it looks like the trick is just to wrap 
code around an inline PERFORM once statement, and using CONTINUE to break 
out of it. But what happens if there's already some nesting of PERFORM 
statements? Assuming the extensions that allow NEXT SENTENCE in a nested IF 
with END-IF:

PERFORM
  IF A = "A" THEN
    NEXT SENTENCE
  ELSE
    PERFORM XYZ
  END-IF
  MOVE L TO M
.
MOVE M TO K
.

Using the naive algorithm I've outlined above, this becomes:

PERFORM
  PERFORM
    IF A = "A" THEN
      CONTINUE
    ELSE
      PERFORM XYZ
    END-IF
    MOVE L TO M
  END-PERFORM
  MOVE M TO K
END-PERFORM
.

which doesn't seem to do the same thing as the original (when A = "A", the 
original skips MOVE L TO M, whereas the modified version performs it).

    - Oliver 


0
owong (6177)
7/12/2005 6:12:40 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db108j$l84$1@peabody.colorado.edu...
> What is your position going to be about GO TO DEPENDING?

    This will need to be eliminated as well, probably via a 2 step process. 
In the first step, it is converted to a series of GO TOs within some IF THEN 
ELSE statements. Then the individual simple GO TOs will be eliminated 
individually.
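
    The first step might look roughly like the Python sketch below. It is only an illustration: the function name is made up, it handles a single-line statement with a simple unqualified, unsubscripted selector, and it emits a flat series of IF ... END-IF blocks rather than nested ELSEs (equivalent here, since each GO TO leaves the sequence). When the selector is out of range no IF fires and control falls through to the next statement, as with the original GO TO DEPENDING.

```python
import re

# Matches e.g. "GO TO P1 P2 P3 DEPENDING ON SELECTOR." (ON and period optional).
_GO_TO_DEPENDING = re.compile(
    r"^\s*GO\s+TO\s+(?P<targets>.+?)\s+DEPENDING\s+(?:ON\s+)?"
    r"(?P<selector>\S+?)\.?\s*$",
    re.IGNORECASE,
)


def expand_go_to_depending(statement):
    """Rewrite GO TO ... DEPENDING ON as a list of IF/GO TO/END-IF lines."""
    match = _GO_TO_DEPENDING.match(statement)
    if match is None:
        raise ValueError("not a GO TO ... DEPENDING statement")
    # Procedure names may be separated by spaces and/or commas.
    targets = match.group("targets").replace(",", " ").split()
    selector = match.group("selector")
    lines = []
    for position, target in enumerate(targets, start=1):
        lines.append(f"IF {selector} = {position}")
        lines.append(f"    GO TO {target}")
        lines.append("END-IF")
    return lines
```

    Each simple GO TO produced this way can then go through the ordinary GO TO elimination step.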

> One disadvantage to removing periods first is that parsing needs to look 
> at
> complete sentences.   You read a sentence, get rid of spaces and lines and 
> such,
> handle continuations (I presume you need to check for this) etc.   If you 
> remove
> the distinction between sentences and paragraphs, the smallest element to 
> be
> parsed will be the paragraph.

    From my point of view, the transformations would be done on a statement 
by statement basis rather than a paragraph by paragraph (or sentence by 
sentence) basis. And aside from the presence of the NEXT SENTENCE statement, 
if you always use the END-whatever (END-IF, END-CALL, END-COMPUTE, etc.) 
constructs, there's no semantic difference between statements and sentences. 
So to simplify the transformation rules, I'd like to largely ignore the 
existence of sentences; or equivalently, I'd like to assume exactly 1 
sentence per paragraph.

    - Oliver 


0
owong (6177)
7/12/2005 6:18:26 PM
Your conversion does exactly what the original code does.  It would also be 
equivalent if the
   PERFORM XYZ
were
  Go To XYZ

The "End-Perform" is matched ONLY with the nearest "in-line" perform, so I don't 
see any reason that the MOVE would not occur (if it does in the original).

-- 
Bill Klein
 wmklein <at> ix.netcom.com
"Oliver Wong" <owong@castortech.com> wrote in message 
news:sKTAe.106864$HI.103463@edtnps84...
>
> "William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
> news:StTAe.227354$ub.180201@fe07.news.easynews.com...
>> You do NOT need to add a new paragraph name to handle "next sentence". The 
>> following is for a "85 standard conforming environment".
>>
>> Original:
>>  If A = "A"
>>      Next Sentence
>>  Else
>>     Perform XYZ
>>    .
>>  Move L to M
>>   .
>>
>> Converted code:
>>
>>  Perform
>>     If A = "A"
>>         Continue
>>     Else
>>         Perform XYZ
>>     End-IF
>>   End-Perform
>>   Move L to M
>>    .
>
> That's a very interesting usage of the inline PERFORM statement! I'll have to 
> look more into this one. Naively, it looks like the trick is just to wrap code 
> around an inline PERFORM once statement, and using CONTINUE to break out of 
> it. But what happens if there's already some nesting of PERFORM statements? 
> Assuming the extensions that allow NEXT SENTENCE in a nested IF with END-IF:
>
> PERFORM
>  IF A = "A" THEN
>    NEXT SENTENCE
>  ELSE
>    PERFORM XYZ
>  END-IF
>  MOVE L TO M
> .
> MOVE M TO K
> .
>
> Using the naive algorithm I've outlined above, this becomes:
>
> PERFORM
>  PERFORM
>    IF A = "A" THEN
>      CONTINUE
>    ELSE
>      PERFORM XYZ
>    END-IF
>    MOVE L TO M
>  END-PERFORM
>  MOVE M TO K
> END-PERFORM
> .
>
> which doesn't seem to do the same thing as the original (the original did not 
> move L to M, whereas the modified version does).
>
>    - Oliver
> 


0
wmklein (2605)
7/12/2005 6:22:42 PM
Yes, I see that.  I found my vintage-1993 copy of CCVS 85 USER GUIDE VERSION
4.2, and a brief perusal of it leads me to believe the process of building
the test symbol files involves an extraction process (and I'm pretty sure
it's a "self-extraction" process from the monolithic symbol using the first
program included therein and the rest as input data).   The "executive"
appears to be that part of the file that has "EXEC84.2" in the ID field.

Thus, there seems to be a "preprocessor" provided for this file to get its
contents into a state suitable for compilation.

I double-checked ANSI X3.23-1985, and I still don't see anything other than
spaces, forward-slashes, asterisks, hyphens, and upper and lower case D
characters being allowed in the indicator area.

    -Chuck Stevens


"Oliver Wong" <owong@castortech.com> wrote in message
news:SrTAe.106860$HI.61039@edtnps84...

>     That's very strange, because the code I posted above actually comes from
> a COBOL85 test suite produced by NIST. To quote their page, "The COBOL85
> test suite is a product of the National Computing Centre, UK. It is used to
> determine, insofar as is practical, the degree to which a COBOL processor
> conforms to the COBOL standard (ANSI X3.23-1985, ISO 1989-1985, ANSI
> X3.23a-1989 and ANSI X3.23b-1993.)" So I had assumed that these alphabetic
> characters in the indicator field must have been part of the ANSI 85 COBOL
> standard.
>
>     Anyway, you can download the same test suite I did from
> http://www.itl.nist.gov/div897/ctg/cobol_form.htm , and the file in question
> was "IF111A", and the snippet I quoted begins on line 257.


0
7/12/2005 6:30:07 PM
"Oliver Wong" <owong@castortech.com> wrote in message
news:SPTAe.106866$HI.3465@edtnps84...

With respect to GO TO DEPENDING:

>     This will need to be eliminated as well, probably via a 2 step process.
> In the first step, it is converted to a series of GO TOs within some IF THEN
> ELSE statements. Then the individual simple GO TOs will be eliminated
> individually.

What about EVALUATE?  It preserves the structure of GO TO DEPENDING better
than IF ... THEN ... ELSE does, I think.

    -Chuck Stevens


0
7/12/2005 6:34:34 PM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:eCTAe.235768$iM6.212106@fe01.news.easynews.com...
> The (old no longer supported) NIST test suite requires one to run a 
> "preparation" program that does some source code modification based on 
> "control cards" passed to it.  For example, it tests literal continuation 
> based on input on where the R-margin is.
>
> I know of NO test case that has "non-Standard" alphabetic characters in 
> margin 7 *after* the conversion program runs.  If you have an example, 
> please tell me the test case name.

    I actually had to write my own "preparation program", which does the 
appropriate substitution for certain control cards, but to be quite honest, 
I really had no idea what I was doing and was mainly just fumbling around in 
the dark. When the output "looked like" valid COBOL programs, I had assumed 
that I succeeded.

    Anyway, these characters appear in the original (non-prepared) versions 
of the source file. Is there any documentation I could look at to see how to 
interpret these characters for preparation? Or perhaps a link to a 
conversion program?

    - Oliver 


0
owong (6177)
7/12/2005 6:36:24 PM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:STTAe.241992$pI6.149458@fe06.news.easynews.com...
> Your conversion does exactly what the original code does.  It would also be 
> equivalent if the
>   PERFORM XYZ
> were
>  Go To XYZ
>
> The "End-Perform" is matched ONLY with the nearest "in-line" perform, so I 
> don't see any reason that the MOVE would not occur (if it does in the 
> original).
>
>> PERFORM
>>  IF A = "A" THEN
>>    NEXT SENTENCE
>>  ELSE
>>    PERFORM XYZ
>>  END-IF
>>  MOVE L TO M
>> .
>> MOVE M TO K
>> .

    My interpretation of this code is that if A is equal to "A", then the 
NEXT SENTENCE statement causes execution to jump to after the next period 
that occurs, which means it skips over the "MOVE L TO M" statement, and goes 
straight to the "MOVE M TO K" statement.

>> PERFORM
>>  PERFORM
>>    IF A = "A" THEN
>>      CONTINUE
>>    ELSE
>>      PERFORM XYZ
>>    END-IF
>>    MOVE L TO M
>>  END-PERFORM
>>  MOVE M TO K
>> END-PERFORM
>> .

    In this code, if A is equal to "A", then CONTINUE is executed (which 
doesn't do anything), and then execution continues after the END-IF, which 
is the "MOVE L TO M" statement. Then after that "MOVE M TO K" gets executed.

    Is this the correct interpretation? If so, then the former performs only 
1 MOVE statement, whereas the latter performs 2.

    - Oliver 


0
owong (6177)
7/12/2005 6:39:48 PM
Reply to first (general note):

As far as I know it is ALWAYS possible to convert an '85 Standard conforming 
COBOL program to a style of:

A) No Sections
B) only one period per procedure division paragraph
C) no PERFORM THRU statements and no ALTER statements and no GO TO statements

As far as I know it is also possible to do this for most (as far as I know ALL) 
vendor extended COBOL's.  However, I don't know this for certain.  I *do* know 
that IBM provided a product (that Arnold mentioned) called "COBOL/SF" or "COBOL 
Structuring Facility".  This product sold for $100,000 - which may give you an 
idea of its complexity.  This product is no longer sold or supported by IBM. 
Its most frequent use was during the time that many sites were converting 
from OS/VS COBOL (a '68 or '74 Standard compiler) to VS COBOL II (an '85 
Standard compiler).  For more information on this product's development (which 
was interesting in and of itself), see:
   http://www.scisstudyguides.addr.com/papers/cwdiss725paper1.htm
     or
  http://hissa.ncsl.nist.gov/formal_methods/vol2.txt

This brings up a question that I have not seen asked or answered yet within this 
thread:

What compiler is being used for the "unmodified" source code today (and do you 
know what compiler options or directives are being used)?  This can 
SIGNIFICANTLY impact what your source code needs to handle. For example,
  - IBM's OS/VS COBOL supported the
     -- documented ON statement ("weird" control flow)
     -- undocumented syntax such as:
              Sort ...
                  Input Procedure ABC THRU XYZ
  - Micro Focus
     --- has a compiler directive to determine the semantics of NEXT SENTENCE
  - Does your "real" code include preprocessors (mentioned elsewhere in the 
thread)
    -- For example IBM code with either DB2 or CICS can have LOTS of "inserted" 
and flow-control code

What compiler will be the target of your "output" source code?  If it is NOT the 
same as your input, then other conversions may be needed.

  ***

Next, if the "goal" of this whole project is REALLY to get code analysis, then 
have you really looked at all the tools available for this?  Although some work 
BETTER with "structured" code, this is NOT a requirement for all.  For example 
(AND THIS IS ONLY AN EXAMPLE), the "Micro Focus REVOLVE" product provides 
EXCELLENT code analysis for "IBM mainframe" as well as "Micro Focus" COBOL code. 
See:
   http://www.microfocus.com/products/revolve/

   ***

Finally,  as stated initially in this note, I *do* think it is possible to do 
what you are asking about. However, I am with those who question the wisdom 
(and cost-effectiveness) of starting on the project.  If you can "better" 
explain your actual business need and environment, we (comp.lang.cobol) MAY be 
able to suggest a "better" solution.

-- 
Bill Klein
 wmklein <at> ix.netcom.com
"Oliver Wong" <owong@castortech.com> wrote in message 
news:wqwAe.115746$9A2.13677@edtnps89...
>    I'm trying to write a program that reformats the structure of a given COBOL 
> program. One of the transformations I'd like to apply is to try to eliminate 
> as many periods as possibles, using the END-whatever (END-IF, END-CALL, 
> END-COMPUTE, etc.) constructs.
>
>    My question is, is it always possible to do this? I don't want to "cheat" 
> by adding new paragraph names at the beginning of every sentence (which may 
> affect PERFORM statements anyway). My theory is that yes, it is always 
> possible, but I haven't been programming in COBOL for that long, so I wasn't 
> sure.
>
>    - Oliver Wong
>
>
> 


0
wmklein (2605)
7/12/2005 6:40:25 PM
You are correct and I was wrong.  I was thinking of the EXIT PERFORM statement 
(which may or may not be available with your compiler).  If it is available, 
changing the CONTINUE to an EXIT PERFORM would do the conversion correctly.
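
As a minimal sketch of that corrected conversion, assuming a compiler that 
accepts EXIT PERFORM (it is not part of the '85 standard): the program name, 
the data items, and their initial values below are invented purely so the 
example stands on its own. It represents a sentence of the form 
"IF ... NEXT SENTENCE ... END-IF  MOVE L TO M." followed by "MOVE M TO K."

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NEXTSENT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data items; values chosen only for the demo.
       01  A  PIC X VALUE "A".
       01  K  PIC X VALUE "k".
       01  L  PIC X VALUE "l".
       01  M  PIC X VALUE "m".
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    The inline PERFORM wraps what used to be one sentence.
           PERFORM
               IF A = "A"
      *            Stands in for NEXT SENTENCE: leave the wrapper,
      *            skipping the MOVE L TO M below.
                   EXIT PERFORM
               ELSE
                   PERFORM XYZ
               END-IF
               MOVE L TO M
           END-PERFORM
      *    This was the sentence after the original period.
           MOVE M TO K
           DISPLAY "K=" K
           STOP RUN
           .
       XYZ.
           DISPLAY "IN XYZ"
           .
```

With A equal to "A", control leaves the inline PERFORM before MOVE L TO M, 
so only MOVE M TO K runs, which is what NEXT SENTENCE did in the original.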

As just asked in another note:
  - What is your "unmodified" compiler?
  - What is your "modified" compiler?

(Release, operating system, etc - would be useful)

-- 
Bill Klein
 wmklein <at> ix.netcom.com
"Oliver Wong" <owong@castortech.com> wrote in message 
news:U7UAe.106871$HI.61171@edtnps84...
>
> "William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
> news:STTAe.241992$pI6.149458@fe06.news.easynews.com...
>> Your conversion does exactly what the original code does.  It would also be 
>> equivalent if the
>>   PERFORM XYZ
>> were
>>  Go To XYZ
>>
>> The "End-Perform" is matched ONLY with the nearest "in-line" perform, so I 
>> don't see any reason that the MOVE would not occur (if it does in the 
>> original).
>>
>>> PERFORM
>>>  IF A = "A" THEN
>>>    NEXT SENTENCE
>>>  ELSE
>>>    PERFORM XYZ
>>>  END-IF
>>>  MOVE L TO M
>>> .
>>> MOVE M TO K
>>> .
>
>    My interpretation of this code is that if A is equal to "A", then the NEXT 
> SENTENCE statement causes execution to jump to after the next period that 
> occurs, which means it skips over the "MOVE L TO M" statement, and goes 
> straight to the "MOVE M TO K" statement.
>
>>> PERFORM
>>>  PERFORM
>>>    IF A = "A" THEN
>>>      CONTINUE
>>>    ELSE
>>>      PERFORM XYZ
>>>    END-IF
>>>    MOVE L TO M
>>>  END-PERFORM
>>>  MOVE M TO K
>>> END-PERFORM
>>> .
>
>    In this code, if A is equal to "A", then CONTINUE is executed (which 
> doesn't do anything), and then execution continues after the END-IF, which is 
> the "MOVE L TO M" statement. Then after that "MOVE M TO K" gets executed.
>
>    Is this the correct interpretation? If so, then the former performs only 1 
> MOVE statement, whereas the latter performs 2.
>
>    - Oliver
> 


0
wmklein (2605)
7/12/2005 6:45:08 PM
"Chuck Stevens" <charles.stevens@unisys.com> wrote in message 
news:db12ga$lji$1@si05.rsvl.unisys.com...
>
> "Oliver Wong" <owong@castortech.com> wrote in message
> news:SPTAe.106866$HI.3465@edtnps84...
>
> With respect to GO TO DEPENDING:
>
>>     This will need to be eliminated as well, probably via a 2 step process.
>> In the first step, it is converted to a series of GO TOs within some IF THEN
>> ELSE statements. Then the individual simple GO TOs will be eliminated
>> individually.
>
> What about EVALUATE?  It preserves the structure of GO TO DEPENDING better
> than IF ... THEN ... ELSE does, I think.

    I had considered that, and actually I misspoke: there would be no "ELSE" 
clause. So code like:

GO TO P1, P2, P3, P4 DEPENDING ON FLAG

would translate to

IF FLAG = 1 THEN
  GO TO P1
END-IF
IF FLAG = 2 THEN
  GO TO P2
END-IF
IF FLAG = 3 THEN
  GO TO P3
END-IF
IF FLAG = 4 THEN
  GO TO P4
END-IF

    and the rationale for this is that this actually decouples, from an 
abstract syntax tree perspective, the GO TO statements from each other, so 
that they could be moved around independently for easier manipulation. On 
the other hand, if the GO TOs were inside an "EVALUATE FLAG WHEN 1 etc." type 
construct, the GO TO statements need to be handled together in one big chunk.

    And anyway, it turns out that the algorithm for eliminating GO TOs 
within an EVALUATE statement eventually degenerates to the above code (of a 
sequence of IF statements) anyway, except it doesn't look as pretty, being 
machine generated, and also will cause the addition of more variables and 
flags and such.
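
    For comparison, here is a rough sketch of what the EVALUATE form would 
look like. FLAG and P1 to P4 are the names from the example above; the 
program name, FLAG's PICTURE and initial value, and the paragraph bodies are 
placeholders invented so the sketch is complete on its own.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GODEP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Placeholder declaration; the real FLAG could be any integer.
       01  FLAG  PIC 9 VALUE 3.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    All four branches live inside one EVALUATE statement.
           EVALUATE FLAG
               WHEN 1  GO TO P1
               WHEN 2  GO TO P2
               WHEN 3  GO TO P3
               WHEN 4  GO TO P4
      *        Any other value falls through to the next statement,
      *        as it does with GO TO ... DEPENDING ON.
               WHEN OTHER  CONTINUE
           END-EVALUATE
           DISPLAY "NO BRANCH TAKEN"
           STOP RUN
           .
       P1.
           DISPLAY "P1"
           STOP RUN
           .
       P2.
           DISPLAY "P2"
           STOP RUN
           .
       P3.
           DISPLAY "P3"
           STOP RUN
           .
       P4.
           DISPLAY "P4"
           STOP RUN
           .
```

    Here the four GO TOs are children of a single EVALUATE node, whereas in 
the IF version each GO TO sits in its own independent statement.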

    - Oliver 


0
owong (6177)
7/12/2005 6:45:39 PM
"Chuck Stevens" <charles.stevens@unisys.com> wrote in message 
news:db127v$la8$1@si05.rsvl.unisys.com...
> Yes, I see that.  I found my vintage-1993 copy of CCVS 85 USER GUIDE VERSION
> 4.2, and a brief perusal of it leads me to believe the process of building
> the test symbol files involves an extraction process (and I'm pretty sure
> it's a "self-extraction" process from the monolithic symbol using the first
> program included therein and the rest as input data).   The "executive"
> appears to be that part of the file that has "EXEC84.2" in the ID field.
>
> Thus, there seems to be a "preprocessor" provided for this file to get its
> contents into a state suitable for compilation.
>
> I double-checked ANSI X3.23-1985, and I still don't see anything other than
> spaces, forward-slashes, asterisks, hyphens, and upper and lower case D
> characters being allowed in the indicator area.

    Okay, thanks. I had originally written my own "extraction program" (based 
merely on guesswork by inspecting the large monolithic file, and my limited 
knowledge of what a valid COBOL program looks like), but it did seem like 
the EXEC84 program had some sort of special status.

    I'll have to investigate what this means with respect to whether or not 
I'll have to redo the extraction process.

    - Oliver 


0
owong (6177)
7/12/2005 6:55:07 PM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:eCTAe.235768$iM6.212106@fe01.news.easynews.com...
> FYI,
>  If you are using the NIST test suite to "check out" your converter that 
> is both good and bad.
>
> A) GOOD - it really DOES test every "standard" COBOL "valid" construction
> B) GOOD - it is AWFUL spaghetti code - so it will test your tool
> C) BAD - it is NOT typical application code and does NOT use many common 
> "constructs" and does use some that I never saw in any real-world 
> application.

    I haven't read all the NIST test programs, but I did read some of them 
(in particular, the ones that made my parser choke), and yeah, it was pretty 
obvious that there was some code that could only be described as "no sane 
person would ever write code like that, except as a mischievous attempt to 
try and screw up the compiler".

    As for point C, in addition to the NIST test suite, I've also been 
provided with random COBOL programs downloaded from tutorial sites that aim to 
teach COBOL to a novice, as well as a handful of actual code that was once 
(perhaps still is) used in a production environment.

    Generally, for each phase, the test consisted of making sure it worked 
for the novice programs, then making sure it works for the serious programs, 
then making sure it works for the NIST programs. I can parse almost all of 
the NIST files (I still have some issues with line continuation and such 
to work out), and right now my restructure tool is still only working on 
novice-style files.

    - Oliver 


0
owong (6177)
7/12/2005 7:01:21 PM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:UcUAe.295454$3V6.164634@fe04.news.easynews.com...
> I was thinking of the EXIT PERFORM statement (which may or may not be 
> available with your compiler).  If it is available, changing the CONTINUE 
> to an EXIT PERFORM would do the conversion correctly.

Aha, but now if the original code was:

IF B = "B" THEN
  PERFORM
    IF A = "A" THEN
      NEXT SENTENCE
    ELSE
      PERFORM XYZ
    END-IF
    MOVE L TO M
  END-PERFORM
  MOVE K TO M
ELSE
  CONTINUE
END-IF
.

Then the transformation, with an EXIT PERFORM would result in

PERFORM
  IF B = "B" THEN
    PERFORM
      IF A = "A" THEN
        EXIT PERFORM
      ELSE
        PERFORM XYZ
      END-IF
      MOVE L TO M
    END-PERFORM
    MOVE K TO M
  ELSE
    CONTINUE
  END-IF
END-PERFORM

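which again would be slightly different from the original: EXIT PERFORM only 
leaves the inner PERFORM, so MOVE K TO M still executes, whereas NEXT 
SENTENCE jumps past the period and never reaches it.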

    I think the basic idea of using inline "PERFORM once" statements is a 
good idea, but I'll have to think of something clever to work in general, as 
the semantics of EXIT PERFORM are not exactly the same as that of NEXT 
SENTENCE. Worst case, I could just revert to creating new paragraphs in the 
general case, but using your cleaner solution in simpler cases where I can 
detect that it won't change the semantics.

> As just asked in another note:
>  - What is your "unmodified" compiler?
>  - What is your "modified" compiler?
>
> (Release, operating system, etc - would be useful)

    The test COBOL source files come from a variety of sources (as mentioned 
in another thread, they are the NIST test suite complemented with example 
COBOL programs from tutorial sites as well as a few "real", "production" 
code that was once used somewhere), and thus probably would normally be 
compiled with various compilers. As a minimum, I'm supposed to accept COBOL 
programs which comply with ANSI COBOL 85. Beyond that, the tool should try 
to be lenient in accepting as many vendor extensions as feasible. Most of my 
documentation is based on IBM's VS COBOL II and Liant's RM/COBOL language 
reference, so those are the vendor extensions whose support is strongest.

    Given appropriate documentation, I'm open to supporting other 
extensions, as long as it doesn't break existing support elsewhere (so I 
guess it's a first come first serve kind of deal).

    - Oliver 


0
owong (6177)
7/12/2005 7:16:16 PM
>> Some people who believe in the one period per paragraph, use NEXT SENTENCE
>> as a way to terminate a paragraph.

> while the following is not:
> PARAGRAPH-1.
>     MOVE A TO B.
>     NEXT SENTENCE.
> PARAGRAPH-2.

I thought that it was clear he meant 'terminating' the logic flow not
the source text.

0
riplin (4127)
7/12/2005 7:19:11 PM
"Oliver Wong" <owong@castortech.com> wrote in message
news:4GUAe.106878$HI.20772@edtnps84...

>     I think the basic idea of using inline "PERFORM once" statements is a
> good idea, ...

Note that "inline PERFORM" is by no means limited to "PERFORM once"
circumstances.  In the syntax diagrams for PERFORM in ANSI X3.23-1985, note
that [procedure-name-1 [THROUGH procedure-name-2]] shows that the whole
thing is optional.

Note also that the terminating test for a more-than-once-through PERFORM
need have nothing whatever to do with anything you might be varying as part
of the PERFORM.  Presuming the appropriate declarations have been made,
these are all at least theoretically correct:

    PERFORM VARYING I FROM J BY K
                 UNTIL I IS GREATER THAN 500
           CONTINUE
    END-PERFORM.

    PERFORM VARYING I FROM J BY K
                UNTIL FUNCTION CURRENT-DATE (9:2) IS EQUAL TO ZERO
            CONTINUE
    END-PERFORM.

    PERFORM VARYING I FROM J BY K
                UNTIL ITS-A-RUTABAGA
        CONTINUE
    END-PERFORM.

    PERFORM VARYING I FROM J BY K UNTIL
                    I IS GREATER THAN 500  OR
                    FUNCTION CURRENT-DATE (9:2) IS EQUAL TO ZERO OR
                    ITS-A-RUTABAGA
            CONTINUE
    END-PERFORM.

Note also that EXIT PERFORM (new in the '02 standard) is valid only from
within inline PERFORMs.

    -Chuck Stevens


0
7/12/2005 7:44:25 PM
> You do NOT need to add a new paragraph name to handle "next sentence".

> The following assumes an environment with an EXTENSION (available in Micro
> Focus, IBM, and possibly others) that allows NEXT SENTENCE in nested IF with
> END-IF at same level)

Yes, but this is also possible by poor design or by accident:

Original Source:
  If A = "A"
     If B = "B"
        Next Sentence
     Else
        Perform XYZ
     End-IF
  Else
     Perform A23
 End-If
 Move L to M
  .

0
riplin (4127)
7/12/2005 7:44:43 PM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:s8UAe.278728$JR4.23333@fe02.news.easynews.com...
> As far as I know it is ALWAYS possible to convert an '85 Standard 
> conforming COBOL program to a style of:
>
> A) No Sections
> B) only one period per procedure division paragraph
> C) no PERFORM THRU statements and no ALTER statements and no GO TO 
> statements

    Yes, I suppose from a formal, purely logical perspective, this is very 
likely. One would simply need to prove that the COBOL language (minus the 
above mentioned statements) is Turing Complete, and assuming that the COBOL 
language (including the above mentioned statement) is not more powerful than 
a Turing Machine (unlikely), then the two languages would be equal in power.

    I guess what I really mean is, will it be feasible to do this conversion 
by machine? =)

    I can sidestep issues of solving the halting problem and such by 
assuming that my input COBOL source code should always be a valid program 
(which should, given the appropriate inputs, eventually terminate).

> What compiler is being used for the "unmodified" source code today (and do 
> you know what compiler options or directives are being used)?  This can 
> SIGNIFICANTLY impact what your source code needs to handle.

    You posted this question again elsewhere (and I answered it there), but 
just for continuity, I've copied and pasted my response here.

    The test COBOL source files come from a variety of sources (as mentioned
in another thread, they are the NIST test suite complemented with example
COBOL programs from tutorial sites as well as a few "real", "production"
code that was once used somewhere), and thus probably would normally be
compiled with various compilers. As a minimum, I'm supposed to accept COBOL
programs which comply with ANSI COBOL 85. Beyond that, the tool should try
to be lenient in accepting as many vendor extensions as feasible. Most of my
documentation is based on IBM's VS COBOL II and Liant's RM/COBOL language
reference, so those are the vendor extensions whose support is strongest.

    Given appropriate documentation, I'm open to supporting other
extensions, as long as it doesn't break existing support elsewhere (so I
guess it's a first come first serve kind of deal).


>  - Micro Focus
>     --- has a compiler directive to determine the semantics of NEXT 
> SENTENCE

    I was not aware of this. I'll probably have to investigate this further.

>  - Does your "real" code include preprocessors (mentioned elsewhere in the 
> thread)

    For debugging lines, the parser currently assumes debugging is set to 
OFF (that is, all lines with D in the indicator field are treated as 
comments), and the only preprocessor-like statement I'm aware of is the COPY 
and REPLACE pair of statements, which I am expected to eventually handle, so 
yes.

>    -- For example IBM code with either DB2 or CICS can have LOTS of 
> "inserted" and flow-control code

    Embedded languages are currently not supported, though there are plans 
to support them in the future.

> What compiler will be the target of your "output" source code?  If it is 
> NOT the same as your input, then other conversions may be needed.

    See next response.

> Next, if the "goal" of this whole project is REALLY to get code analysis, 
> then have you really looked at all the tools available for this?  Although 
> some work BETTER with "structured" code, this is NOT a requirement for 
> all.  For example (AND THIS IS ONLY AN EXAMPLE), the "Micro Focus REVOLVE" 
> product provides EXCELLENT code analysis for "IBM mainframe" as well as 
> "Micro Focus" COBOL code.

    So yes, the goal of this project really is code analysis, so the output 
target compiler is mostly irrelevant. As for looking into other tools, no I 
haven't really done that. I am interested in hearing about other tools, if 
only to get inspiration for my own, but I don't think I can actually just 
"drop" this project and purchase a license for another tool. The analysis 
module that I'll eventually have to work on actually has to expose a very 
specific API so that it can be controlled by yet another tool, which pretty 
much locks me into having to implement all this stuff myself.

    I'll mention the idea of exploring existing tools to my boss (and will 
probably bring up the example of Revolve in particular), but I doubt that 
that will be the direction we will head in.


> See:
>   http://www.microfocus.com/products/revolve/

    As an aside, I watched the flash movie containing a demo of using 
revolve, and not really knowing anything about JCL or CICS or anything like 
that, the demo looked very impressive. =) While the example task they go through 
(expanding a field by 2 bytes, did they mean 2 characters?) probably won't 
apply to me (I'm mainly a Java programmer, so I have arbitrarily long strings 
and array-like structures available to me), I might still bookmark and send 
a link to that demo to all my non-programmer friends who don't understand 
why simple changes in the requirement can add so much extra development 
time.

> Finally,  as stated initially in this note, I *do* think it is possible to 
> do what you are asking about. However, I am with those who question the 
> wisdom (and cost-effectiveness) of starting on the project.  If you can 
> "better" explain your actual business need and environment, we 
> (comp.lang.cobol) MAY be able to suggest a "better" solution.

    Well, I'm hesitant to go into too much detail, as I'm actually a recent 
university graduate (bachelors in computer science) and this is my first 
(computer-related) job, and I signed an NDA, and I don't have much business 
experience, nor easy access to a lawyer, etc. But the basic gist is given 
above. This GO TO elimination tool is just a small tool in a long chain of 
them. I'm doing period elimination because it makes GO TO elimination easier 
later on. I'm doing GO TO elimination because it makes it easier for the 
analysis I plan on doing later on. I have to write my own analysis tool 
because I have to expose a specific API.

    - Oliver 


0
owong (6177)
7/12/2005 8:09:41 PM
"Oliver Wong" <owong@castortech.com> wrote in message
news:9sVAe.106882$HI.76318@edtnps84...

>     I can sidestep issues of solving the halting problem and such by
> assuming that my input COBOL source code should always be a valid program
> (which should, given the appropriate inputs, eventually terminate).

Hmmm.  If I've read you aright:

There is no requirement that a syntactically-correct COBOL program *must*
eventually terminate, or even *should* eventually terminate.  There is no
requirement that a program execute a STOP RUN statement, or EXIT PROGRAM to
another program that does a STOP RUN.   I've run into many on-line programs
that are intended to run 24/7, for which *any* sort of termination is
considered catastrophic, and in which there is no provision whatever for
graceful *termination* (though such a program would likely include
provisions for a graceful *recovery*).

There is no correspondence between "valid program" and "eventually
terminate"; there are valid application-design reasons to have "outer loops"
that have no terminal condition -- "PERFORM MAIN-LINE UNTIL
HADES-FREEZES-OVER", the ALGOL version "WHILE TRUE DO ...", or even the
Dijkstra-inspired Burroughs B1000 SDL/UPL "DO FOREVER".

STOP RUN (and EXIT PROGRAM) are statements like any other; the programmer
can choose to use them, or not, as the application design requires.

    -Chuck Stevens


0
7/12/2005 8:34:10 PM
> will have to handle may contain the EXIT PARAGRAPH

EXIT PARAGRAPH and SECTION are works of the devil ;-)   In many ways
they have some of the problems of GO TO and NEXT SENTENCE.  As with
those they are 100% clear and definite about what happens when you
look at the words themselves, but cloak what happens when you are
examining the target point.

Or, specifically, if there are any of these anywhere in the program
then it is always necessary to examine every possible context on any
change.  If these are proven to not exist then the amount of context is
much reduced and productivity is much greater.

One of the ways that I simplify code is by breaking it down into
smaller paragraphs that are grouped.  For example if I find that nested
IFs are getting too deeply indented it is easy to rip out imperative
statements into a paragraph and perform it.  EXIT PARAGRAPH and EXIT SECTION
get in the way of doing this, just as GO TO does.
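
A small sketch of the kind of extraction I mean. Every name, value and
condition here is invented for illustration; the inner IF used to be nested
directly inside the outer one and now lives in its own paragraph.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPLITIF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data items; values chosen only for the demo.
       01  A  PIC X VALUE "A".
       01  B  PIC X VALUE "B".
       01  L  PIC X VALUE "l".
       01  M  PIC X VALUE "m".
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    The outer test stays here; its body has been moved out.
           IF A = "A"
               PERFORM HANDLE-A
           END-IF
           DISPLAY "M=" M
           STOP RUN
           .
       HANDLE-A.
      *    Formerly the nested IF inside the test above.
           IF B = "B"
               MOVE L TO M
           ELSE
               DISPLAY "B IS NOT B"
           END-IF
           .
```

Had the moved code contained an EXIT PARAGRAPH, it would now leave HANDLE-A
instead of MAIN-PARA, which is why such statements have to be hunted down
before this kind of change is safe.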

0
riplin (4127)
7/12/2005 9:05:21 PM
Oliver Wong wrote:
> 
> "Peter Lacey" <lacey@mb.sympatico.ca> wrote in message
> news:42D3DD3B.38BCDEDC@mb.sympatico.ca...
> > the question arises: exactly what is spaghetti code?
> [...]
> > Anyway: I'd like to hear some definitions and examples!
> 
>     I come from a Java background, so this answer might not apply very well
> to (traditional) COBOL, as it relies a bit on using an object oriented
> paradigm, but...
> 
>     I believe that in an ideal program, if you wanted to know what a piece
> of code does, you wouldn't need to read anything other than the code in
> question. That is to say, if you were reading through the code listing, and
> saw a method call (I guess the closest COBOL equivalent is a PERFORM
> statement), your first instinct wouldn't be to immediately scroll over to
> read the contents of that method (or paragraph); rather, it would be clear
> what that method does at a high level based on the name of that method.
> 
>     That's one reason the presence of GOTOs imply unstructured (or
> "spaghetti") code: Even if you give a really meaningful name to the
> paragraph you're GOTO-ing, the reader still has a strong temptation to
> scroll down and read the code there, to see if you eventually branch back to
> where you came from (maybe this isn't possible in COBOL; I may be thinking
> more of BASIC's GOSUB and RETURN statements).
> 

As you suspect: that's a GOSUB.  A GOTO is an unconditional transfer of
control with no intrinsic implication that you're going to return, as
opposed to a PERFORM which does imply that.  The program flow may well
require that you will go through this part of the code again - such as
in a loop of any kind - but that's an artifact of the flow, not the
statement.  I'd agree that if somebody did write in such a fashion as to
require that the program to return to the next statement in line after
the GOTO would be spaghetti-coding.  But I haven't seen that done ever
except for my example mentioned before, and about two rookie FORTRAN
programs, a LONG time ago.  Therefore there is no temptation to check
the label to see if it comes back; likely though you'd have to follow it
up anyway to find out what happens there.

Regrettably, all your analysis really applies to PERFORM's (or CALL's).  

>    The reason spaghetti code, under the definition I'm offering, is 
>difficult for humans to understand is that it essentially requires us to 
>maintain a "call stack" like structure in our minds. While reading some 
>spaghetti code, when we see a branch, we're tempted to go read whatever code 
>might be at the target of that branch. So we have to remember where we are 
>now, and go check it out. While reading THAT code, we then get tempted 
>again, to go look at another part of the code. We have to push a new address 
>onto our stack, and go read THAT new section of code, and so on.

It seems to me that since "structured" programming requires nested
PERFORM's (or at least cascaded PERFORM's) then by your definition it's
spaghetti code.  (Correct me if I'm wrong, people, but I don't think
it's possible to write any non-trivial program with PERFORM's only one
level deep).  If you see a PERFORM, you must eventually find out what's
going on there to understand the program logic.


> 
>     BTW, this is also why comments at the beginning of methods (or
> paragraphs or whatever) that describe what the method does can reduce the
> symptoms of spaghetti code. Now, if you see a method call for which you're
> tempted to branch off into, rather than actually reading the code of that
> method, you can just read the high level description of the method (i.e. the
> comment at the top), and so your stack never exceeds 2 (or 1, again
> depending on how you count).
> 
>     - Oliver

Funny, that; about the time that Dijkstra (sp? I never am sure) invented
"structured" programming, another study came out that found that
comments were just as effective in creating better programs; but, the
author said, comments aren't as intellectually exciting as GOTO-less
methods, so probably won't get the same sort of attention.  How right he
was!  But so far as top-level comments go: if they describe exactly what
happens, with input and output variables defined and listed, then you're
right.  But if there is any doubt as to what's happening within the
method - such as a payroll calculation going slightly wrong - somebody
is going to have to examine the thing.  I've always thought that that is
one of the two problems with objects: one is determining exactly what
they do, the other is finding out which ones exist to solve the problem
you're working on.

PL
0
lacey (134)
7/12/2005 9:20:48 PM

William M. Klein wrote:
> Reply to first (general note):
> 
> As far as I know it is ALWAYS possible to convert an '85 Standard conforming 
> COBOL program to a style of:
> 
> A) No Sections
> B) only one period per procedure division paragraph
> C) no PERFORM THRU statements and no ALTER statements and no GO TO statements

I can't prove it, but I believe this is correct.  I used to maintain a 
  macro-level CICS program with 42 ALTER statements in it.  I don't 
know how I survived that.

> 
> As far as I know it is also possible to do this for most (as far as I know ALL) 
> vendor extended COBOL's.  However, I don't know this for certain.  I *do* know 
> that IBM provided a product (that Arnold mentioned) called "COBOL/SF" or "COBOL 
> Structuring Facility".  This product sold for $100,000 - which may give you an 
> idea of its complexity.  This product is no longer sold or supported by IBM. 
> Its primary use (frequency) was during the time that many sites were converted 
> from OS/VS COBOL (a '68 or '74 Standard compiler) to VS COBOL II (an '85 
> Standard compiler).  For more information on this product's development (which 
> was interesting in and of itself), see:
>    http://www.scisstudyguides.addr.com/papers/cwdiss725paper1.htm
>      or
>   http://hissa.ncsl.nist.gov/formal_methods/vol2.txt

Thanks for the links, Bill.  Interesting stuff, but a little over my 
head.  I guess I'm not the "formal methods" type.  I found it very 
interesting that some of the development team for COBOL/SF were 
worried that the project couldn't possibly be done, due to the 
complexities of "legal" COBOL code and the difficulty of modeling it. 
  They even had a PhD in some specialized branch of mathematics to 
help them solve the problem.

It's a shame that COBOL/SF is no longer available.  I could still use 
it, although it would be difficult to justify the cost for the few 
times I could take advantage of it.  Cheaper to have one good 
programmer rewrite the stinker that keeps abending at O Dark thirty.

It almost sounds like Oliver is being tasked with writing a new 
COBOL/SF.  Whatever happened to "ReCoder", which was a competing 
product in the late 1980's, early 1990's?

> 
> This brings up a question that I have not seen asked or answered yet within this 
> thread:
> 
> What compiler is being used for the "unmodified" source code today (and do you 
> know what compiler options or directives are being used)?  This can 
> SIGNIFICANTLY impact what your source code needs to handle. For example,
>   - IBM's OS/VS COBOL supported the
>      -- documented ON statement ("weird" control flow)

I once wrote a COBOL program to convert OS/VS COBOL to COBOL II.
http://home.att.net/~arnold.trembley/cb2align.zip

It also did some minor "beautification" of the output COBOL program. 
It did about 90% of the conversion automatically, and a diagnostic 
compile would find the rest (EXAMINE, TRANSFORM, BEFORE/AFTER 
POSITIONING).  Its best feature was converting CICS SERVICE RELOADS 
and BLL CELLS to SET and POINTER.

I don't think it even tried to convert the "ON number (imperative 
statement)" construct.  As far as I ever knew, IBM intended it to be 
used for testing, i.e. "ON 500 GO TO END-OF-JOB", so you could put it 
into your program and stop after the first 500 iterations against a 
production input file.  But in practice, everybody used it as an easy 
first-time switch "ON 1 PERFORM PRINT-REPORT-HEADERS", or something 
like that.
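
Had I tried to convert the first-time-switch usage, I suppose the output
would have looked something like the sketch below.  This is only my guess
at a reasonable COBOL 85 equivalent, not what my converter produced, and
the switch and paragraph names are made up:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ONDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-FIRST-TIME-SW    PIC X     VALUE "Y".
           88  FIRST-TIME                VALUE "Y".
           88  NOT-FIRST-TIME            VALUE "N".
       01  WS-LINE-NO          PIC 9(2)  VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM PRINT-DETAIL-LINE 3 TIMES
           STOP RUN
           .
       PRINT-DETAIL-LINE.
      * Stands in for "ON 1 PERFORM PRINT-REPORT-HEADERS".
           IF FIRST-TIME
               PERFORM PRINT-REPORT-HEADERS
               SET NOT-FIRST-TIME TO TRUE
           END-IF
           ADD 1 TO WS-LINE-NO
           DISPLAY "DETAIL LINE " WS-LINE-NO
           .
       PRINT-REPORT-HEADERS.
           DISPLAY "REPORT HEADERS"
           .
```

The "ON 500 GO TO END-OF-JOB" testing style would need a counter instead
of a switch, but the idea is the same.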

The oddest thing in converting OS/VS COBOL was the "NOTE" verb, which 
I assume was an IBM extension.  NOTE was a comment.  Anything up to 
the next period was just comments UNLESS the "NOTE" verb was the first 
statement in a paragraph - then the entire paragraph was a comment no 
matter how many periods/full stops were present.

I don't miss OS/VS COBOL, but the sysprogs in my shop can't seem to 
get rid of it.  It seems that when IBM sends you a new OS release, 
z/OS 1.5 for example, if your shop had ever ordered OS/VS COBOL in the 
past it is still included in your new installation media.

That should give DocDwarf a chuckle.


> (snip the rest)   
-- 
http://arnold.trembley.home.att.net/

0
7/13/2005 4:46:31 AM

Chuck Stevens wrote:

> "Oliver Wong" <owong@castortech.com> wrote in message
> news:SPTAe.106866$HI.3465@edtnps84...
> 
> With respect to GO TO DEPENDING:
> 
> 
>>    This will need to be eliminated as well, probably via a 2 step
> 
> process.
> 
>>In the first step, it is converted to a series of GO TOs within some IF
> 
> THEN
> 
>>ELSE statements. Then the individual simple GO TOs will be eliminated
>>individually.
> 
> 
> What about EVALUATE?  It preserves the structure of GO TO DEPENDING better
> than IF ... THEN ... ELSE does, I think.
> 
>     -Chuck Stevens

Without a formal proof, I think it is possible to convert GO TO 
DEPENDING into an EVALUATE statement, but there are some subtle 
differences.  From a performance viewpoint EVALUATE is like a nested 
IF statement.  You can't execute the statements under the fifth WHEN 
clause without failing the first four WHEN statements.  But a GO TO 
DEPENDING is much more efficient at the machine language level, by 
effectively branching to a computed address without executing any 
compare instructions.

There is one problem converting GO TO DEPENDING, and that is that if 
no matching GO TO paragraph is found you just fall through to the 
next COBOL sentence/statement.  And with GO TO DEPENDING, there's no 
way to come back to the next statement unless the programmer coded 
more GOTO's to get back.  I suspect that analyzing subsequent control 
flow could be a fairly complex problem in some cases.

But if I were converting it, EVALUATE would be what I would want to 
convert to.
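
A minimal sketch of the target form, with invented paragraph and data
names.  It assumes the hard part is already done, i.e. that each target
paragraph has been reworked so it simply returns instead of branching
onward.  WHEN OTHER takes over the fall-through case:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TRAN-CODE        PIC 9     VALUE 2.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The original form would have been:
      *    GO TO ADD-REC CHANGE-REC DELETE-REC
      *        DEPENDING ON WS-TRAN-CODE.
      * followed by the fall-through code for any other value.
           EVALUATE WS-TRAN-CODE
               WHEN 1
                   PERFORM ADD-REC
               WHEN 2
                   PERFORM CHANGE-REC
               WHEN 3
                   PERFORM DELETE-REC
               WHEN OTHER
                   DISPLAY "FELL THROUGH, CODE " WS-TRAN-CODE
           END-EVALUATE
           STOP RUN
           .
       ADD-REC.
           DISPLAY "ADD"
           .
       CHANGE-REC.
           DISPLAY "CHANGE"
           .
       DELETE-REC.
           DISPLAY "DELETE"
           .
```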

-- 
http://arnold.trembley.home.att.net/

0
7/13/2005 4:57:20 AM

Peter Lacey wrote:

> This discussion has encouraged me to pose a question to the group which
> I've been contemplating for some time.  One of the goals of tools like
> this is to eliminate spaghetti code.  Accepting that as a good thing,
> the question arises: exactly what is spaghetti code?  If GOTO =
> spaghetti code - that is, if a program that uses GOTO's is automatically
> defined as SC, then there's not much to discuss except whether we agree
> with the definition or not.  But that strikes me as much too
> simple-minded.  Examples in texts that I've seen show program flow
> leaping from one end of the program to another, allegedly because the
> programmer kept on thinking of things he'd missed or wanting to reuse
> code.  It's always struck me that those practices are very much rookie
> mistakes.  I can't believe that anyone with some experience under
> his/her belt and any level of aptitude for the job would continue doing
> so.
> 
> I've encountered one example of this - which I called disaster code, not
> just spaghetti code - written in BASIC; every statement was followed by
> a GOTO to some other part of the program (EVERY statement except the
> start & end!); this was a commercial accounting package and the
> rationale was to make theft of the code more difficult; by gum, it
> succeeded at that!  And there has been just one COBOL program  (by
> somebody else, of course) that I have not been able to follow - the only
> one I ever gave up in disgust on and rewrote; its problem was that it
> had READ's all over the place and I couldn't work out what the heck it
> was doing.
> 
> Anyway: I'd like to hear some definitions and examples!
> 
> Peter

Prior to the big Y2K rollover my shop acquired a COBOL analysis tool 
from VIASOFT.  I think it might have been called VIA-VALIDATE or 
something like that.  One of the metrics it produced when analyzing a 
COBOL program was a count of "Live EXITS".  I was unfamiliar with the 
term, but they described it as having a whole series of paragraphs 
executed by something like PERFORM PARAGRAPH-AA THRU ZZ-EXIT, and 
somewhere in the middle a GO TO was executed so that the EXIT 
paragraph was never executed.  If the perform range was re-entered, 
from the top or, worse yet, in the middle, you couldn't be sure where 
you would end up.

That's not a definition of spaghetti code, just an example of one 
possibly very ugly side-effect of that style.
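
As I understood the term, the smallest case would be something like the
sketch below.  The names are invented, and this is my own reconstruction
of the pattern, not output from the VIASOFT tool:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LIVEEXIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-STATUS           PIC X     VALUE "E".
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM AA-START THRU ZZ-EXIT
           DISPLAY "BACK IN MAIN-PARA"
           STOP RUN
           .
       AA-START.
           IF WS-STATUS = "E"
      * Jumps out of the AA-START THRU ZZ-EXIT range, so ZZ-EXIT
      * is never reached and the PERFORM is left "live".
               GO TO OUTSIDE-RANGE
           END-IF
           .
       ZZ-EXIT.
           EXIT.
       OUTSIDE-RANGE.
           DISPLAY "LEFT THE PERFORM RANGE"
           STOP RUN
           .
```

Here OUTSIDE-RANGE ends the run, so nothing bad happens.  The trouble
starts when the code outside the range carries on and later wanders back
into AA-START or ZZ-EXIT.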


-- 
http://arnold.trembley.home.att.net/

0
7/13/2005 5:09:00 AM
In article <H01Be.1131154$w62.130911@bgtnsc05-news.ops.worldnet.att.net>,
Arnold Trembley  <arnold.trembley@worldnet.att.net> wrote:

[snip]

>I don't miss OS/VS COBOL, but the sysprogs in my shop can't seem to 
>get rid of it.  It seems that when IBM sends you a new OS release, 
>z/OS 1.5 for example, if your shop had ever ordered OS/VS COBOL in the 
>past it is still included in your new installation media.
>
>That should give DocDwarf a chuckle.

zzzzzzzzzz... zzzzaaaAAWWWWKKKkkkkhhhhh... zzzzznnnnoooooopppphhhhh... eh? 
huh? whuh?  Oh, sorry, I was just... resting my eyes, did I miss 
something?

Now that you mention it... let me take a look-see at my current client's 
Prod loadlib... hmmmm... got a bunch of 'em, looks like a couple of 
systems have been merged over the decades... let's see... this'un's got a 
mere 1300 modules, that'll be a start...

.... nope, no ILB0 routines in there.

DD

0
docdwarf (6044)
7/13/2005 12:36:41 PM
Oliver Wong wrote:
>    Formally, to prove that it's always possible to write a COBOL
> program with one period per paragraph, it isn't enough to just give
> examples of COBOL programs which are written with one period per
> paragraph. Rather, you need to present some sort of logical proof;
> However, you can disprove the statement by giving an example of a
> COBOL program which cannot be rewritten with just one period per
> paragraph. (Remember all those science test questions which start
> with "Prove or give a counterexample"?)

And I believe Mark Twain said he never took health advice from books. "I 
might die of a misprint," he concluded. Still, the chances of both 
single-period paragraphs and proper living being correct are great.

>
>    Still, the ETK is a neat library. Thanks.
>
>    - Oliver 


0
heybubNOSPAM (643)
7/13/2005 1:08:54 PM
"Peter Lacey" <lacey@mb.sympatico.ca> wrote in message 
news:42D43430.D9407925@mb.sympatico.ca...
> As you suspect: that's a GOSUB.  A GOTO is an unconditional transfer of
> control with no intrinsic implication that you're going to return, as
> opposed to a PERFORM which does imply that.  The program flow may well
> require that you will go through this part of the code again - such as
> in a loop of any kind - but that's an artifact of the flow, not the
> statement.  I'd agree that writing in such a fashion as to require the
> program to return to the next statement in line after the GOTO would be
> spaghetti-coding.  But I haven't seen that done ever
> except for my example mentioned before, and about two rookie FORTRAN
> programs, a LONG time ago.  Therefore there is no temptation to check
> the label to see if it comes back; likely though you'd have to follow it
> up anyway to find out what happens there.
>
> Regrettably, all your analysis really applies to PERFORM's (or CALL's).

    I was actually trying to argue for the point that PERFORMs or CALLs are 
*better* than GOTOs, because with PERFORMs, you "know" that control is going 
to come back, whereas with GOTOs, you don't know (or I guess you know that 
it won't) come back.

    The basic heuristic I was getting at for determining whether a program 
was written in the spaghetti style of coding was to ask: Do I have to scroll 
up and down the program listing (i.e. "follow the spaghetti trail") to 
understand it? Or can I just read the one paragraph I'm interested in, and 
whenever I see a "PERFORM", I can infer what the performed paragraph does 
based on the name of the paragraph.

    For example, if the code for an online webstore checkout was something 
like (I'm not familiar with the limits of lengths of names of paragraph and 
such, so maybe this isn't valid COBOL code, but it illustrates the idea):

Checkout.
  PERFORM CheckItemsAreStillInStock
  IF error-flag IS TRUE THEN
    GOTO ErrorHandling
  END-IF
  PERFORM ApplyAnyApplicableRebates
  IF error-flag IS TRUE THEN
    GOTO ErrorHandling
  END-IF
  PERFORM CalculateShippingAndTaxSurchage
  IF error-flag IS TRUE THEN
    GOTO ErrorHandling
  END-IF
  GOTO ConfirmSale
  .

    Then I'd say that code isn't really spaghetti, even though it may 
contain GOTOs. I can read that code, and without actually reading any of the 
paragraphs it refers to ("CheckItemsAreStillInStock", "ErrorHandling", 
etc.), I can pretty much tell what the program is doing. I didn't have to 
scroll to another part of the program to figure out what was going on.
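
    For what it's worth, here is my attempt at spelling the same idea in 
something closer to real COBOL. I'm assuming an 88-level condition-name is 
the usual way to write "error-flag IS TRUE" and that GO TO is two words; 
every name is still made up, and the performed paragraphs are just stubs:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHECKOUT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ERROR-SW         PIC X     VALUE "N".
           88  ERROR-FOUND               VALUE "Y".
       PROCEDURE DIVISION.
       CHECKOUT-PARA.
           PERFORM CHECK-ITEMS-STILL-IN-STOCK
           IF ERROR-FOUND
               GO TO ERROR-HANDLING
           END-IF
           PERFORM APPLY-APPLICABLE-REBATES
           IF ERROR-FOUND
               GO TO ERROR-HANDLING
           END-IF
           PERFORM CALC-SHIPPING-AND-TAX
           IF ERROR-FOUND
               GO TO ERROR-HANDLING
           END-IF
           GO TO CONFIRM-SALE
           .
      * Stub paragraphs; real ones would set ERROR-FOUND on failure.
       CHECK-ITEMS-STILL-IN-STOCK.
           DISPLAY "CHECKING STOCK"
           .
       APPLY-APPLICABLE-REBATES.
           DISPLAY "APPLYING REBATES"
           .
       CALC-SHIPPING-AND-TAX.
           DISPLAY "ADDING SHIPPING AND TAX"
           .
       ERROR-HANDLING.
           DISPLAY "CHECKOUT FAILED"
           STOP RUN
           .
       CONFIRM-SALE.
           DISPLAY "SALE CONFIRMED"
           STOP RUN
           .
```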


>
>>    The reason spaghetti code, under the definition I'm offering, is
>>difficult for humans to understand is that it essentially requires us to
>>maintain a "call stack" like structure in our minds. While reading some
>>spaghetti code, when we see a branch, we're tempted to go read whatever 
>>code
>>might be at the target of that branch. So we have to remember where we are
>>now, and go check it out. While reading THAT code, we then get tempted
>>again, to go look at another part of the code. We have to push a new 
>>address
>>onto our stack, and go read THAT new section of code, and so on.
>
> It seems to me that since "structured" programming requires nested
> PERFORM's (or at least cascaded PERFORM's) then by your definition it's
> spaghetti code.  (Correct me if I'm wrong, people, but I don't think
> it's possible to write any non-trivial program with PERFORM's only one
> level deep).  If you see a PERFORM, you must eventually find out what's
> going on there to understand the program logic.

    I was making the assumption that there may exist times when you see a 
PERFORM, and yet you *DON'T* have to actually read the paragraph the PERFORM 
refers to to find out what's going on there, for example, if the name of the 
paragraph is meaningful.

    In the case of GOTO, you pretty much HAVE to go to the paragraph being 
referred to, because the rest of the program logic occurs there, and no 
longer at where you're currently reading. With a PERFORM, the control 
temporarily goes there, but it should eventually come back, and so if you 
can guess at what the PERFORMed paragraph does, you don't need to leave your 
spot to understand what the overall program is doing. You just keep reading 
from where you're at (i.e. the statement following the PERFORM statement).

> But so far as top-level comments go: if they describe exactly what
> happens, with input and output variables defined and listed, then you're
> right.  But if there is any doubt as to what's happening within the
> method - such as a payroll calculation going slightly wrong - somebody
> is going to have to examine the thing.

    The problem I often see with top-level comments is that it's very easy 
to accidentally describe how you implemented the function there, rather than 
what the method actually does. As a perhaps non-realistic trivial example, 
if you were writing a library of math functions, then a not-so-good top 
level comment might be (using "C++/Java style comments", to avoid issues 
with word wrapping):

/* This subroutine uses Newton's method to calculate the square root of a 
number by recursively calling itself. */

    Whereas a better comment might be:

/* Given a non-negative floating point number N, returns a floating point 
number that is within epsilon of the true non-negative square root (i.e. the 
result of this function is never negative unless an error occurred, see 
below).

If N is negative, -1 is returned.

@param N  the number whose square root is desired.
@param e  epsilon, the acceptable error of the answer.
@returns  an approximation of the square root of N with an error no greater 
than epsilon, or -1 if an error occurred.
*/

    The second comment says nothing about how the method is implemented, 
only what it does. Typically, what a method does never changes for a given 
program, so it never needs to be updated. The first comment instead says HOW 
it does something, and that may very well change (perhaps the recursion will 
be unrolled into an iteration for efficiency), and then the comment will 
not match with the code.

> I've always thought that that is
> one of the two problems with objects: one is determining exactly what
> they do, the other is finding out which ones exist to solve the problem
> you're working on.

    The first (determining what they do) is a documentation issue more than 
an "OO versus non-OO" issue, and I've rarely had any problems in this 
regard with the Java or PHP standard API libraries (though I often have this 
type of problem when using 3rd party open source libraries).

    The second is a good point, but I think it's better to have the 
solutions out there, requiring you to put effort into finding those 
solutions, than to not have the solutions out there, and having to implement 
them yourself.

    - Oliver 


0
owong (6177)
7/13/2005 1:20:51 PM
"Chuck Stevens" <charles.stevens@unisys.com> wrote in message 
news:db19gi$pom$1@si05.rsvl.unisys.com...
>
> "Oliver Wong" <owong@castortech.com> wrote in message
> news:9sVAe.106882$HI.76318@edtnps84...
>
>>     I can sidestep issues of solving the halting problem and such by
>> assuming that my input COBOL source code should always be a valid program
>> (which should, given the appropriate inputs, eventually terminate).
>
> Hmmm.  If I've read you aright:
>
> There is no requirement that a syntactically-correct COBOL program *must*
> eventually terminate, or even *should* eventually terminate.  There is no
> requirement that a program execute a STOP RUN statement, or EXIT PROGRAM 
> to
> another program that does a STOP RUN.   I've run into many on-line 
> programs
> that are intended to run 24/7, for which *any* sort of termination is
> considered catastrophic, and in which there is no provision whatever for
> graceful *termination* (though such a program would likely include
> provisions for a graceful *recovery*).
>
> There is no correspondence between "valid program" and "eventually
> terminate"; there are valid application-design reasons to have "outer 
> loops"
> that have no terminal condition -- "PERFORM MAIN-LINE UNTIL
> HADES-FREEZES-OVER", the ALGOL version "WHILE TRUE DO ...", or even the
> Dijkstra-inspired Burroughs B1000 SDL/UPL "DO FOREVER".
>
> STOP RUN (and EXIT PROGRAM) are statements like any other; the programmer
> can choose to use them, or not, as the application design requires.

    You are, of course, completely right. I think I made the above comment 
mainly out of habit from an inside joke amongst my computer science student 
peers. The Halting Problem, as you may already know, was proven by Alan 
Turing to be undecidable by Turing Machines. Furthermore, all known 
computers today have equal or less power than a Turing Machine. Therefore, 
if you could ever show that solving the problem you're working on was 
equivalent to solving the Halting Problem, then you know that the problem is 
impossible to solve in COBOL (or any other programming language running on 
any known computer). So of course we would frequently compare solving the 
assignments our teachers gave us to solving the Halting Problem (i.e. they 
were impossibly difficult).

    The Halting Problem is: given a program written in COBOL (or any other 
programming language, but let's assume COBOL for now, since we're in 
comp.lang.cobol), and given the set of inputs that will be provided to that 
program (if any), determine whether that program will eventually halt or 
not.

    Whenever you do significant analysis on a program source code (as I'm 
trying to do), there's always the lingering fear that you might run into the 
Halting Problem. For example, if you're given the task of determining 
whether a program always returns the number "2" as a result, that's 
at least as hard as solving the Halting Problem (to be sure it'll eventually 
return 2, you'd have to first know that it eventually halts, or terminates).

    I guess what I was initially trying to say is: "I don't think the 
Halting Problem will be an issue for my analysis tool, so don't worry about 
that."

    - Oliver 


0
owong (6177)
7/13/2005 1:36:09 PM
On 13-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

>     The basic heuristic I was getting at for determining whether a program
> was written in the spaghetti style of coding was to ask: Do I have to scroll
> up and down the program listing (i.e. "follow the spaghetti trail") to
> understand it? Or can I just read the one paragraph I'm interested in, and
> whenever I see a "PERFORM", I can infer what the performed paragraph does
> based on the name of the paragraph.

Unfortunately, there is no way to guarantee that what you think is a
well-structured program really is, unless you (or your program) actually look.


> Checkout.
>   PERFORM CheckItemsAreStillInStock
>   IF error-flag IS TRUE THEN
>     GOTO ErrorHandling
>   END-IF
>   PERFORM ApplyAnyApplicableRebates
>   IF error-flag IS TRUE THEN
>     GOTO ErrorHandling
>   END-IF
>   PERFORM CalculateShippingAndTaxSurchage
>   IF error-flag IS TRUE THEN
>     GOTO ErrorHandling
>   END-IF
>   GOTO ConfirmSale
>   .
>
>     Then I'd say that code isn't really spaghetti, even though it may
> contain GOTOs. I can read that code, and without actually reading any of the
> paragraphs it refers to ("CheckItemsAreStillInStock", "ErrorHandling",
> etc.), I can pretty much tell what the program is doing. I didn't have to
> scroll to another part of the program to figure out what was going on.

Possibly.   It depends on what ErrorHandling does.    If it is an abort routine,
the GOTO is fine.   If it prints an error report then does a GOTO ConfirmSale,
then it is not.    I also have to look at ConfirmSale.
0
howard (6283)
7/13/2005 2:20:50 PM
Oliver Wong wrote:
> 
> 
>     I was actually trying to argue for the point that PERFORMs or CALLs are
> *better* than GOTOs, because with PERFORMs, you "know" that control is going
> to come back, whereas with GOTOs, you don't know (or I guess you know that
> it won't) come back.

But why SHOULD it come back??  I'm guessing that you're thoroughly
familiar with the "structured" style and haven't written much in the
GOTO-style.  While of course "structured" thinking works well, it does
impose a certain mind-set: in this case it is that flow must be from the
start of the paragraph to the end (with flow transfers happening but
disguised as PERFORMS etc. - which are of course equivalent to "go
there, do something, come back here").  It's a bit difficult to explain
clearly - as you can see - but that isn't the only way to do it.  Bear
in mind that using  GOTO's in no way precludes PERFORM's - they are used
as is convenient, of course - contrariwise, GOTO's are never (or at
least should not ever, on pain of death) used as a substitute for
PERFORM's.  That way lies madness.

PL
0
lacey (134)
7/13/2005 3:59:02 PM
"Peter Lacey" <lacey@mb.sympatico.ca> wrote in message 
news:42D53A46.DC51EE0D@mb.sympatico.ca...
> Oliver Wong wrote:
>>
>>
>>     I was actually trying to argue for the point that PERFORMs or CALLs 
>> are
>> *better* than GOTOs, because with PERFORMs, you "know" that control is 
>> going
>> to come back, whereas with GOTOs, you don't know (or I guess you know 
>> that
>> it won't) come back.
>
> But why SHOULD it come back??  I'm guessing that you're thoroughly
> familiar with the "structured" style and haven't written much in the
> GOTO-style.  While of course "structured" thinking works well, it does
> impose a certain mind-set: in this case it is that flow must be from the
> start of the paragraph to the end (with flow transfers happening but
> disguised as PERFORMS etc. - which are of course equivalent to "go
> there, do something, come back here").  It's a bit difficult to explain
> clearly - as you can see - but that isn't the only way to do it.

    The only languages I've written "significant amounts of code" in were 
BASIC, Java and PHP; the latter two don't have goto statements (or maybe PHP 
does, but I've never used it), and I had used Microsoft's QuickBasic 4.5 
variant of BASIC, where GOTOs were already being discouraged in favor of 
procedure calls and function calls, so yes, I don't have much experience in 
the GOTO-style of programming.

    So maybe my lack of experience makes me biased, but if I had to think 
about which is "better", "always coming back after a branch" (e.g. 
structured style, OO style) or "not necessarily coming back after a branch" 
(e.g. GOTO-style), I'd vote for "always coming back after a branch".

    The rationale for this is again the idea that I don't want to have to 
scroll the code around in my editor to understand the small section of code 
I'm looking at. If the section of code I'm interested in understanding is 3 
lines long, and the 2nd line is a control branch that I know will eventually 
come back, I can read the 1st line, the 2nd line, and then the 3rd line, and 
assuming the 2nd line's label/paragraph/function name was meaningful, I know 
what these 3 lines do.

    If the code is 3 lines long, where the 2nd line is a branch that doesn't 
come back, and the "3rd" line is located 50 lines down, then I have to read 
the 1st line, the 2nd line, and then scroll my editor, then read the 3rd 
line. It's that "scroll the editor" step that I'm trying to avoid.

    I guess if you want to throw around buzzwords, I'm trying to promote 
spatial locality of source code.

    - Oliver 


0
owong (6177)
7/13/2005 4:27:24 PM
On 13-Jul-2005, Peter Lacey <lacey@mb.sympatico.ca> wrote:

> >     I was actually trying to argue for the point that PERFORMs or CALLs are
> > *better* than GOTOs, because with PERFORMs, you "know" that control is going
> > to come back, whereas with GOTOs, you don't know (or I guess you know that
> > it won't) come back.
>
> But why SHOULD it come back??  I'm guessing that you're thoroughly
> familiar with the "structured" style and haven't written much in the
> GOTO-style.  While of course "structured" thinking works well, it does
> impose a certain mind-set: in this case it is that flow must be from the
> start of the paragraph to the end (with flow transfers happening but
> disguised as PERFORMS etc. - which are of course equivalent to "go
> there, do something, come back here").  It's a bit difficult to explain
> clearly - as you can see - but that isn't the only way to do it.  Bear
> in mind that using  GOTO's in no way precludes PERFORM's - they are used
> as is convenient, of course - contrariwise, GOTO's are never (or at
> least should not ever, on pain of death) used as a substitute for
> PERFORM's.  That way lies madness.

Mainly because if you can count on it coming back, you can follow the flow of
the code more easily.
0
howard (6283)
7/13/2005 4:29:00 PM
<much/all snippage>

My definition of "spaghetti code" is any (COBOL) code in which transfer of 
control "crosses" at least once (and in such programs it is USUALLY multiple 
times) in a one way direction.

For example

*NOT* Spaghetti Code (*not* my style - but neither is it - IMHO spaghetti code)

Perform ABC
Stop Run
  .
ABC.
    Perform XYZ
    If L = "L"
        go to ABC-exit
    Else
        Perform A23
   End-If
        .
 ABC-Exit.
     Exit.
 A23.
    Display "Here"
         .

Spaghetti Code (also *not* my style - but IMHO spaghetti code)

Go to ABC
Stop Run
  .
ABC.
    Perform XYZ
    If L = "L"
        go to A23
    Else
        Go To ABC-Exit
   End-If
     .
 ABC-Exit.
     Exit.
 A23.
    Move X to L
    Go To ABC
         .

-- 
Bill Klein
 wmklein <at> ix.netcom.com 


0
wmklein (2605)
7/13/2005 7:38:49 PM
On 13-Jul-2005, "William M. Klein" <wmklein@nospam.netcom.com> wrote:

> My definition of "spaghetti code" is any (COBOL) code in which transfer of
> control "crosses" at least once (and in such programs it is USUALLY multiple
> times) in a one way direction.

I've seen a lot of spaghetti code with transfers of control in both directions.

Aren't all directions "one way"?
0
howard (6283)
7/13/2005 8:03:14 PM
I'd consider any COBOL code as being spaghetti-suspect if it did a GO TO a
paragraph that appeared earlier in the program source.  "Forward" GO TO's
aren't usually problematic; it's the "backward" ones that are most likely
to prove to be gotchas, I think.  COBOL's got enough loop-control mechanisms
to handle just about anything you might want to throw at it in terms of a
"backward" GO.

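A small sketch of the sort of replacement I have in mind, with invented
names.  WITH TEST AFTER is used on the assumption that the original loop
body always ran at least once before the backward GO TO was tested:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOOPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNT            PIC 9(2)  VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Replaces the backward-GO-TO form:
      *    COUNT-LOOP.
      *        ADD 1 TO WS-COUNT.
      *        IF WS-COUNT < 5 GO TO COUNT-LOOP.
           PERFORM WITH TEST AFTER UNTIL WS-COUNT >= 5
               ADD 1 TO WS-COUNT
           END-PERFORM
           DISPLAY "COUNT IS " WS-COUNT
           STOP RUN
           .
```
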
    -Chuck Stevens

"William M. Klein" <wmklein@nospam.netcom.com> wrote in message
news:c5eBe.440936$581.258709@fe05.news.easynews.com...
> <much/all snippage"
>
> My definition of "spaghetti code" is any (COBOL) code in which transfer of
> control "crosses" at least once (and in such programs it is USUALLY multiple
> times) in a one way direction.
>
> For example
>
> *NOT* Spaghetti Code (*not* my style - but neither is it - IMHO spaghetti code)
>
> Perform ABC
> Stop Run
>   .
> ABC.
>     Perform XYZ
>     If L = "L"
>         go to ABC-exit
>     Else
>         Perform A23
>    End-If
>         .
>  ABC-Exit.
>      Exit.
>  A23.
>     Display "Here"
>          .
>
> Spaghetti Code (also *not* my style - but IMHO spaghetti code)
>
> Go to ABC
> Stop Run
>   .
> ABC.
>     Perform XYZ
>     If L = "L"
>         go to A23
>     Else
>         Go To ABC-Exit
>    End-If
>      .
>  ABC-Exit.
>      Exit.
>  A23.
>     Move X to L
>     Go To ABC
>          .
>
> -- 
> Bill Klein
>  wmklein <at> ix.netcom.com
>
>


0
7/13/2005 8:03:59 PM
The "one way" rule that I was referring to was that I would NOT consider the 
following "spaghetti code"


Para1.
    If A = "A"
       Go to Para2
    Else
       Display "Not equal"
    End-If
    Display "after IF"
      .
Para2.
   Move x to A
   Go To Para1
       .

"functionally" this is the same as a PERFORM PARA2 (because you "go to" a single 
paragraph that, itself, - unconditionally - "comes back" from where you went 
to).

This also shows why I would allow "go to" a previous paragraph (procedure-name).

Needless to say, my PERSONAL PREFERRED way to code the above logic would be:


Para1.
    If A = "A"
       Perform Para2
    Else
       Display "Not equal"
    End-If
    Display "after IF"
      .
Para2.
   Move x to A
       .


-- 
Bill Klein
 wmklein <at> ix.netcom.com
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db3s2g$d1h$1@peabody.colorado.edu...
>
> On 13-Jul-2005, "William M. Klein" <wmklein@nospam.netcom.com> wrote:
>
>> My definition of "spaghetti code" is any (COBOL) code in which transfer of
>> control "crosses" at least once (and in such programs it is USUALLY multiple
>> times) in a one way direction.
>
> I've seen a lot of spaghetti code with transfers of control in both 
> directions.
>
> Aren't all directions "one way"? 


0
wmklein (2605)
7/13/2005 8:49:19 PM
Oliver,
   Just thought I would mention one other area that you will need to consider in 
the analysis phase - if not the "structuring" phase.

That area is the question of what happens with non-Standard PERFORM ranges.  For a 
(semi-)discussion of how a number of vendors handle these, see the PERFORM-TYPE 
compiler directive available from Micro Focus.

FYI,
   If you haven't already, you might want to look at the entire section

"Run-time Behavior"

in the manual
  "Compiler Directives"

available online at:
  http://supportline.microfocus.com//documentation/books/sx40sp1/stpubb.htm

NOTE:
   Micro Focus is KNOWN for its "vast" quantity of directives AS WELL AS its 
ability to (more or less) emulate the behavior of many, MANY, other vendors' 
compilers.  Therefore, whether it comes to "extensions" or "implementor 
defined" behavior, Micro Focus may (probably does) have a directive (or two) to 
provide compatibility with the different semantics for the SAME syntax.

-- 
Bill Klein
 wmklein <at> ix.netcom.com 


0
wmklein (2605)
7/13/2005 9:00:32 PM
> "functionally" this is the same as a PERFORM PARA2

It is only 'functionally the same' from Para1. The whole point of a
'perform para2' is that it can be done from anywhere. Your code has a
'string' that can only be attached in one place. It is these fixed
paths that characterise 'spaghetti' code, thus yours is.

0
riplin (4127)
7/13/2005 9:53:53 PM
Arnold Trembley wrote:

>
> I still have one or two old programs in production that I would like 
> to restructure if we still had a license for CSF.
>
>
Arnold,
I have a working copy of VisualAge for COBOL Enterprise V2 for OS/2, 
which included the COBOL Structuring Facility (that one feature wasn't 
implemented in the Windoze version for V2).  Want to send me some source 
code?

This is probably best discussed offline.
Colin
0
cmcampb (110)
7/14/2005 1:11:08 AM
Oliver Wong wrote:
> 
> "Peter Lacey" <lacey@mb.sympatico.ca> wrote in message
> news:42D53A46.DC51EE0D@mb.sympatico.ca...
> > Oliver Wong wrote:
> >>
> >>
> >>     I was actually trying to argue for the point that PERFORMs or CALLs are
> >> *better* than GOTOs, because with PERFORMs, you "know" that control is going
> >> to come back, whereas with GOTOs, you don't know (or I guess you know that
> >> it won't) come back.
> >
> > But why SHOULD it come back??  I'm guessing that you're thoroughly
> > familiar with the "structured" style and haven't written much in the
> > GOTO-style.  While of course "structured" thinking works well, it does
> > impose a certain mind-set: in this case it is that flow must be from the
> > start of the paragraph to the end (with flow transfers happening but
> > disguised as PERFORMS etc. - which are of course equivalent to "go
> > there, do something, come back here").  It's a bit difficult to explain
> > clearly - as you can see - but that isn't the only way to do it.
> 
>     The only languages I've written "significant amounts of code" in were
> BASIC, Java and PHP; the latter two don't have goto statements (or maybe PHP
> does, but I've never used it), and I had use Microsoft's QuickBasic 4.5
> variant of BASIC, where GOTOs were already being discouraged in favor of
> procedure calls and function calls, so yes, I don't have much experience in
> the GOTO-style of programming.
> 
>     So maybe my lack of experience makes me biased, but if I had to think
> about which is "better", "always coming back after a branch" (e.g.
> structured style, OO style) or "not nescessarily coming back after a branch"
> (e.g. GOTO-style), I'd vote for "always coming back after a branch".
> 
>     The rational for this is again the idea that I don't want to have to
> scroll the code around in my editor to understand the small section of code
> I'm looking at. If the section of code I'm interested in understanding is 3
> lines long, and the 2nd line is a control branch that I know will eventually
> come back, I can read the 1st line, the 2nd line, and then the 3rd line, and
> assuming the 2nd line's label/paragraph/function name was meaningful, I know
> what these 3 lines do.
> 
>  (!!!)   If the code is 3 lines long, where the 2nd line is a branch that doesn't
> come back, and the "3rd" line is located 50 lines down, then I have to read
> the 1st line, the 2nd line, and then scroll my editor, then read the 3rd
> line. It's that "scroll the editor" step that I'm trying to avoid.
> 
>     I guess if you want to throw around buzzwords, I'm trying to promote
> spacial locality of source code.
> 
>     - Oliver

We're failing to communicate here.  I understand what you're getting
at.  What I'm saying is that (a) "coming back always" is not the only way to
do it (b) I'm not accusing you of bias, just of ingrained practice (c)
"spatial locality" is not particular to structured methods.

re your second example (!!!) above: if it's an unconditional GOTO, then
almost  certainly you wouldn't have to come back to the third line to
understand the flow.  If it isn't unconditional, then neither would the
equivalent PERFORM statement be.

Really it doesn't matter.  You will write the program one way, I'll do
it another; both will work and any competent programmer will understand
both.

PL
0
lacey (134)
7/14/2005 1:17:52 AM
> I was actually trying to argue for the point that PERFORMs or CALLs are
> *better* than GOTOs, because with PERFORMs, you "know" that control is going
> to come back, whereas with GOTOs, you don't know (or I guess you know that
> it won't) come back.

One of the 'features' of Cobol is that labels may be used for different
tasks, in fact one label may be the target of a PERFORM, a GO TO, or a
drop through or may be the terminating point for a perform, or all of
these.

Other languages have different label types for the various mechanisms,
one cannot goto a C procedure or drop into it, or 'call' a label. In
these other languages the context of what may happen at any particular
label is usually obvious.

In Cobol, if all control statements are allowed, it is only possible to
determine what happens at any particular label by examining the whole
program. The logic path at any particular time may terminate above the
label, it may drop through the label, or it may start after the label
from some other point, or all of these at different times.

In order to restrict the almost infinite complexity that may occur
certain arbitrary constraints may be placed on the code that is
allowed. For example: "Only perform sections and only GO TO ~exit
paragraph".  This should result in section labels being only ever used
in performs and paragraph labels only ever dropped through or gotoed.
However, different sites have different ideas about what makes the best
set of constraints.

> *better* than GOTOs,

My contention is that it isn't the GOTO that is the problem, it is the
labels. If GOTO is allowed then when inspecting the code at any
particular label it is necessary to inspect the whole program to
determine how the logic paths change at the label, there is no
mechanism, as there may be in other languages, to specify how this
label may or may not be used.  Similarly if THRU is allowed, or
sections.

The more restrictive the coding, the more easily the code can be
understood in small pieces.  For example I don't use section, thru,
goto.  The only mechanism allowed is 'perform paragraph'.  This means
that I can look at any part of the program and know exactly how the
logic flows: it comes in at the label from a perform and exits back to
the perform at the end of the paragraph.  Because I also do not use
'next sentence' or 'exit paragraph/section/perform' I can also
transform one paragraph into two or more (one performing the other(s))
without having to be concerned that this will change the way it works.

It isn't that the GOTO is a bad statement, but if it exists it destroys
the ability to examine the code as localised modules.
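
To make the "inspect the whole program" point concrete, here is a small
Python sketch that scans a source text and records how each paragraph name
is referenced. It is only an illustration: the function name label_uses and
the tags "PERFORM", "THRU-END" and "GO TO" are invented for the example, it
assumes free-form source with each paragraph name alone on its line, and it
does not try to detect drop-through from the preceding paragraph.

```python
import re

NAME = r"([A-Za-z0-9][A-Za-z0-9-]*)"
# Assumption: a paragraph name is a line holding nothing but "NAME."
LABEL_RE = re.compile(r"^\s*" + NAME + r"\s*\.\s*$")
PERFORM_RE = re.compile(
    r"\bPERFORM\s+" + NAME + r"(?:\s+(?:THRU|THROUGH)\s+" + NAME + r")?",
    re.IGNORECASE,
)
GOTO_RE = re.compile(r"\bGO\s+TO\s+" + NAME, re.IGNORECASE)
# One-word statements that look like labels when followed by a period.
NOT_LABELS = {"EXIT", "GOBACK", "CONTINUE"}


def label_uses(source):
    """Map each paragraph name to the set of ways it is referenced."""
    lines = source.splitlines()

    uses = {}
    for line in lines:
        match = LABEL_RE.match(line)
        if match and match.group(1).upper() not in NOT_LABELS:
            uses.setdefault(match.group(1).upper(), set())

    for line in lines:
        for match in PERFORM_RE.finditer(line):
            first = match.group(1).upper()
            last = (match.group(2) or "").upper()
            # Inline PERFORMs (PERFORM UNTIL ...) are ignored because
            # UNTIL, VARYING and so on are not known paragraph names.
            if first in uses:
                uses[first].add("PERFORM")
            if last in uses:
                uses[last].add("THRU-END")
        for match in GOTO_RE.finditer(line):
            target = match.group(1).upper()
            if target in uses:
                uses[target].add("GO TO")
    return uses
```

Under a "perform paragraph only" rule every set would be either empty or
exactly {"PERFORM"}; any label carrying more than one tag is one whose role
cannot be read off locally.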

0
riplin (4127)
7/14/2005 3:23:54 AM
"Oliver Wong" <owong@castortech.com> wrote in message 
news:OnRAe.106833$HI.24039@edtnps84...
> <docdwarf@panix.com> wrote in message 
> news:db0m0u$c4l$1@panix5.panix.com...
>> In article <db0l7v$dec$1@si05.rsvl.unisys.com>,
>> Chuck Stevens <charles.stevens@unisys.com> wrote:
>> Given that Mr Wong is a moderate neophyte and concluded from his research
>> that 'based on the discussions I've found on the web, it looks like NEXT
>> SENTENCE is an unconditional jump to the statement immediately following
>> the period of the sentence containing the NEXT SENTENCE statement' it
>> seems that he managed sidestep this 'common belief'... but perhaps those
>> who post to this group are, by definition, rather uncommon.
>
>    Writing language tools (e.g. parsers, compilers, etc.) means I had to 
> read whatever language reference and documentations much more carefully 
> than I might otherwise have done if I were merely interested in learning 
> how to program in said language. For example, I felt I was a relatively 
> knowlegeable Java programmer, but it was't until I wrote Java compilation 
> tools that I had heard about the strict_fp modifier, or that you could 
> declare functions in the form "public int foo()[][][] { /*body of 
> function*/ }".
>
Oliver, I have followed this thread with some sadness at the reactions your 
perfectly reasonable request has provoked.

There has been denigration of your task by people who really don't 
understand what you are doing or why, there has been complete 
misunderstanding of what you wrote and attempts to throw spanners in the 
works, or divert the conversation.

There have been paranoid knee jerks from people who loftily believe that 
ONLY a human could write 'proper' COBOL...

The world of COBOL programming is, at best, insecure at the moment, so I 
guess much of this is to be expected.

Through all of this you have maintained a dignified and polite composure, 
which is certainly rare here.

I believe you have grasped much more about COBOL in a very short time than 
some posters here have in many years.

The task you are undertaking is a very interesting and worthwhile one. 
Re-factoring code from existing systems, whether it be COBOL or any other 
language, is a valuable exercise. The people requiring you to do this are not 
idiots; it makes sense to derive as much value as possible from an existing 
investment before discarding or upgrading it. Providing a tool to do this is 
a challenging (but nonetheless, rewarding) task if you can make it happen. I 
believe you are well on the right track to doing so and will look forward to 
seeing the finished result.

Understand that some of the views expressed in  this group are necessarily 
from people who, at the very least may not like what you are doing, and at 
the worst, would positively resist it. Either way there is no excuse for 
some of the hostility and rudeness you have encountered.

On the other hand, there are also people who are genuinely interested in 
your problem and have offered sound unbiased advice.

I hope you have had enough feedback to sort out your initial hesitations, 
and I'm completely confident you are able to sort the wheat from the chaff.

If you need specific advice on any specific points don't hesitate to post 
privately.

All the best with your enterprise.

>    And it is certainly possible that if you use this newsgroup as a sample 
> space to perform some surveys on COBOL programmers, your results may 
> certainly be skewed.

....or even screwed...

Pete.




0
dashwood1 (2140)
7/14/2005 3:34:18 AM
Pete Dashwood wrote:
<snip>

> Oliver, I have followed this thread with some sadness at the reactions your 
> perfectly reasonable request has provoked.
> 
> There has been denigration of your task by people who really don't 
> understand what you are doing or why, there has been complete 
> misunderstanding of what you wrote and attempts to throw spanners in the 
> works, or divert the the conversation.

<snip>

> The world of COBOL programming is, at best, insecure at the moment, so I 
> guess much of this is to be expected.

<snip>

> The task you are undertaking is a very interesting and worthwhile one. 
> Re-factoring code from existing systems, whether it be COBOL or any other 
> language is a valuable exercise. The people requiring you to do this are not 
> idiots; it makes sense to derive as much value as possible from an exisitng 
> investment before discarding or upgrading it. Providing a tool to do this is 
> a challenging (but nonetheless, rewarding) task if you can make it happen. I 
> believe you are well on the right track to doing so and will look forward to 
> seeing the finished result.
> 
> Understand that some of the views expressed in  this group are necessarily 
> from people who, at the very least may not like what you are doing, and at 
> the worst, would positively resist it. Either way there is no excuse for 
> some of the hostility and rudeness you have encountered.

<snip>

The ugly truth is that Oliver, a new hire at his first job, is doing 
something that sounds a whole lot more interesting and challenging than 
some of the work the rest of us have been doing for the last 10 or 20 or 
30 years, and we're secretly afraid he'll succeed.

;)

Louis
0
lkrupp847 (386)
7/14/2005 6:56:18 AM
Louis Krupp <lkrupp@pssw.nospam.com.invalid> wrote:

>Pete Dashwood wrote:
>
><snip>
>
>> The task you are undertaking is a very interesting and worthwhile one. 
>> Re-factoring code from existing systems, whether it be COBOL or any other 
>> language is a valuable exercise. The people requiring you to do this are not 
>> idiots; it makes sense to derive as much value as possible from an exisitng 
>> investment before discarding or upgrading it. Providing a tool to do this is 
>> a challenging (but nonetheless, rewarding) task if you can make it happen. I 
>> believe you are well on the right track to doing so and will look forward to 
>> seeing the finished result.
>> 
>> Understand that some of the views expressed in  this group are necessarily 
>> from people who, at the very least may not like what you are doing, and at 
>> the worst, would positively resist it. Either way there is no excuse for 
>> some of the hostility and rudeness you have encountered.
>
><snip>
>
>The ugly truth is that Oliver, a new hire at his first job, is doing 
>something that sounds a whole lot more interesting and challenging than 
>some of the work the rest of us have been doing for the last 10 or 20 or 
>30 years, and we're secretly afraid he'll succeed.
>
>;)

If I understand correctly what Oliver is attempting to do, the
"knock-on" effects if he actually manages to succeed could be massive.
As I see it, in order to re-factor "spaghetti" (or any other variation
of programming "method") it's going to be necessary for the tool to
"understand" the functions that comprise the program to be re-factored
- effectively to understand what the program does and how it does it.
Once you can do that, you have the precursor of a tool that can
*really* convert one language to another - not just as a quick syntax
re-hash, but as a genuine re-creation in another language.

I wish him luck, it sounds a fascinating task..  :-)

-- 
Jeff.         Ironbridge,  Shrops,  U.K.
jeff@xjackfieldx.org (remove the x..x round jackfield for return address)
and don't bother with ralf4, it's a spamtrap and I never go there.. :)

.... "There are few hours in life more agreeable
      than the hour dedicated to the ceremony
      known as afternoon tea.."

         Henry James,  (1843 - 1916).

 
0
ralf4 (132)
7/14/2005 9:25:40 AM
In article <3gbcd19unj1vmj7q1fr928cqejvofe2h48@4ax.com>,
Jeff York  <ralf4@btinternet.com> wrote:

[snip]

>As I see it, in order to re-factor "spaghetti" (or any other variation
>of programming "method") it's going to be necessary for the tool to
>"understand" the functions that comprise the program to be re-factored
>- effectively to understand what the program does and how it does it.

I agree... and disagree.  As I see it, what a program does is a result of 
how it applies the functions of the language and it is those functions 
which the restructuring tool needs to 'understand' and manipulate.

The ability of the restructuring program to manipulate those functions 
would seem to be predicated on the ability of the programmer who writes 
the restructurer to 'understand' the functions of the language and I am... 
cautious about a neophyte's abilities in such things, based on my own 
small experiences with both COBOL and programming neophytes; this group 
has seen a few postings by even Olde Pros where confusion has resulted due 
to difficulties in such matters.

Mr Wong, as noted earlier, seemed to pick up on the function of NEXT 
SENTENCE in fashion so ready that it appeared to contradict one of The 
Standard's reasons for getting rid of the function; if the rest of his 
work shows similar quality the result might be pleasant to see, indeed.

(note - I put quotation marks around 'understand' because I am uncertain 
what this phenomenon entails... what are the symptoms of 'understanding'?  
How do these differ from the criteria of 'understanding?' What goes on 
when a person says 'Ahhhhhh... I understand!'?  Ol' Davey Hume wrote an 
entire Enquiry Concerning Human Understanding which I've heard folks say is... 
beyond understanding; anyhow, I try to avoid using the word, just my 
prejudice.)

>Once you can do that, you have the precurser of a tool that can
>*really* convert one language to another - not just as a quick syntax
>re-hash, but as a genuine re-creation in another language.

It may say more about my powers of association than anything else but when 
I read about converting one language to another the first thing that comes 
to my mind is Larry Wall's 'A Real Programmer can write assembly in *any* 
language.'

>
>I wish him luck, it sounds a fascinating task..  :-)

That it does, Mr York.

DD

0
docdwarf (6044)
7/14/2005 12:24:50 PM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:QhfBe.108603$Cl1.51057@fe03.news.easynews.com...
> Oliver,
>   Just thought I would mention one other area that you will need to 
> consider in the analysis phase - if not the "structuring" phase.
>
> The question of what happens with non-Standard PERFORM ranges.  For a 
> (semi-)discussion of how a number of vendors handle these, see the 
> PERFORM-TYPE compiler directive available from Micro Focus:
>
> FYI,
>   If you haven't already, you might want to look at the entire section
>
> "Run-time Behavior"
>
> in the manual
>  "Compiler Directives"
>
> available online at:
>  http://supportline.microfocus.com//documentation/books/sx40sp1/stpubb.htm
>
> NOTE:
>   Micro Focus is KNOWN for its "vast" quantity of directives AS WELL AS 
> its ability to (more or less) emulate the behavior of many, MANY, other 
> vendor's compilers.  Therefore, either when it comes to "extensions" or 
> "implementor defined" behavior, Micro Focus may (probably?) has a 
> directive (or two) to provide compatibility with the different semantics 
> for the SAME syntax.

    Thank you very much. I had some nagging concerns that perhaps different 
dialects of COBOL might assign different semantics to the same statements 
and keywords, but without an actual reference document to list all these 
differences, I couldn't really make a serious effort to address this issue.

    It looks like my original ideas for restructuring assumed that the 
semantics for PERFORM are as described with the "MF" setting (which I assume 
means these are the semantics Micro Focus uses), because it treats the 
PERFORM statements the most like method calls in Java, and so was the most 
intuitive interpretation for me.

    I'll have to investigate what effects on the restructuring "rules" the 
other semantics may have. I'll be sure to add this site to my bookmarks (too 
bad it uses JavaScript code which seems to break the functionality of the 
"back" and "forward" buttons on my web browser).

    - Oliver 


0
owong (6177)
7/14/2005 12:40:13 PM
On 13-Jul-2005, "Richard" <riplin@Azonic.co.nz> wrote:

> The more restrictive the coding, the more easily the code can be
> understood in small pieces.  For example I don't use section, thru,
> goto.  The only mechanism allowed is 'perform paragraph'.  This means
> that I can look at any part of the program and know exactly how the
> logic flows: it comes in at the label from a perform and exits back to
> the perform at the end of the paragraph.  Because I also do not use
> 'next sentence' or 'exit paragraph/section/perform' I can also
> transform one paragraph into two or more (one performing the other(s))
> without having to be concerned that this will change the way it works.
>
> It isn't that the GOTO is a bad statement, but if it exists it destroys
> the ability to examine the code as localised modules.

Agreed.

It is possible to write a standards checker that reads code before it gets
approved for production - that would limit how a label can be reached.   The
easiest way to do this is to check the following:

1.  No GO TO.
2.  No procedure division SECTIONs.
3.  No THRU clause in performs.
4.  The first paragraph ends with your standard for exiting the program
(GOBACK).

As you say, it isn't the stuff that is proscribed that's the problem - it's that
we don't know how we got to that label.   If the above is done, we know that we
got there via PERFORM processing.
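
A minimal sketch of such a checker, in Python, might look like the
following. Everything named here (check_standards, the rule labels, the list
of words treated as statements rather than paragraph names) is an assumption
made for the example. It works line by line on free-form source, skips only
"*>" comment lines, and would be fooled by statements split across lines or
by the words GO TO inside a literal.

```python
import re

PROC_DIV_RE = re.compile(r"^\s*PROCEDURE\s+DIVISION\b", re.IGNORECASE)
# Assumption: a paragraph name is a line holding nothing but "NAME."
LABEL_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9-]*)\s*\.\s*$")
NOT_LABELS = {"EXIT", "GOBACK", "CONTINUE"}

# Rules 1 to 3 from the list above, each checked one line at a time.
LINE_RULES = [
    ("GO TO used", re.compile(r"\bGO\s+TO\b", re.IGNORECASE)),
    ("SECTION header", re.compile(r"^\s*[A-Za-z0-9-]+\s+SECTION\s*\.", re.IGNORECASE)),
    ("PERFORM with THRU", re.compile(r"\bPERFORM\b.*\b(?:THRU|THROUGH)\b", re.IGNORECASE)),
]


def is_label(line):
    match = LABEL_RE.match(line)
    return match is not None and match.group(1).upper() not in NOT_LABELS


def check_standards(source):
    """Return a list of (line_number, problem) for the procedure division."""
    lines = source.splitlines()
    start = next((i for i, l in enumerate(lines) if PROC_DIV_RE.match(l)), None)
    if start is None:
        return [(0, "no PROCEDURE DIVISION found")]

    # Lines after the division header, with 1-based numbers, minus comments.
    body = [(no, line)
            for no, line in enumerate(lines[start + 1:], start=start + 2)
            if not line.lstrip().startswith("*>")]

    problems = [(no, name)
                for no, line in body
                for name, rule in LINE_RULES if rule.search(line)]

    # Rule 4: the last word of the first paragraph must be GOBACK.
    label_lines = [no for no, line in body if is_label(line)]
    if label_lines:
        first = label_lines[0]
        end = label_lines[1] if len(label_lines) > 1 else float("inf")
        text = " ".join(line for no, line in body if first < no < end)
        words = re.findall(r"[A-Za-z0-9-]+", text.upper())
        if not words or words[-1] != "GOBACK":
            problems.append((first, "first paragraph does not end with GOBACK"))
    return problems
```

An empty result means none of the four checks fired; it does not prove the
program is well structured, only that every paragraph name can be reached by
PERFORM alone.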
0
howard (6283)
7/14/2005 1:26:18 PM
On 13-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:

> There have been paranoid knee jerks from people who loftily believe that
> ONLY a human could write 'proper' COBOL...

I'm close to this.   The reason for structured code is to make it easier for the
human debugger.    Putting in a bunch of switches to make a program "structured"
does not do this.

> The task you are undertaking is a very interesting and worthwhile one.

Interesting and fun yes.   Worthwhile to his employer?   I'm not sold on this.

> Re-factoring code from existing systems, whether it be COBOL or any other
> language is a valuable exercise.

Exercises like this can make him a better programmer, which in the long term can
be useful to his employer.    Exercising makes us stronger.
0
howard (6283)
7/14/2005 1:52:43 PM
Colin,

While it might be very satisfying to fix the last clunker in my 
inventory, I couldn't take you up on your very thoughtful and generous 
offer, for the obvious reasons.

And if it's not obvious to some of the readers here in 
comp.lang.cobol, companies frown on revealing source code to third 
parties, because the business rules in the source code are the 
intellectual property of the company.  They are trade secrets.  During 
the Y2K frenzy it was not too difficult to discuss problems with date 
routines, because those don't really reveal any business rules.

This is also why students rarely can find interesting COBOL source 
code on the net.  If it's a piece of code that has real value to a 
company, then it is not likely to be available for public viewing at 
no charge.

There are only two ways I can imagine that restructuring tools can be 
marketed for COBOL.  The first is as a tool that can be leased or 
purchased, so the user can fix their code and keep it private.  The 
second is probably as a fee for service arrangement, with contracts 
and confidentiality agreements.

In my 25 years of COBOL programming I can only remember one time when 
I was directed to send source code off-site to a third party (not 
counting certain programs we sell or give to customers so they can 
interface with our systems), and that was Y2K related.  We hired a 
third party to analyze a ton of COBOL source code to see if we had 
missed anything in our remediation.  The reports were impressive, and 
they confirmed we had done a very thorough job.  I wished we had such 
sophisticated tools.  But I believe that company went out of business 
shortly after the rollover.

If Oliver Wong can write a COBOL restructuring tool all by himself, I 
would say he must be one of the world's best programmers.  Even if he 
doesn't quite succeed, it's still an incredibly challenging learning 
experience.

While I would love to see a freeware or open-source COBOL 
restructuring tool, or even a low-priced one, I seriously doubt it 
will ever happen.  There needs to be a reasonable return for the 
investment of that much programming effort.

With kindest regards,

(Top post only)


Colin Campbell wrote:

> Arnold Trembley wrote:
> 
>>
>> I still have one or two old programs in production that I would like 
>> to restructure if we still had a license for CSF.
>>
>>
> Arnold,
> I have a working copy of VisualAge for COBOL Enterprise V2 for OS/2, 
> which included the COBOL Structuring Facility (that one feature wasn't 
> implemented in the Windoze version for V2).  Want to send me some source 
> code?
> 
> This is probably best discussed offline.
> Colin

-- 
http://arnold.trembley.home.att.net/

0
7/16/2005 7:03:48 AM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:db5qnq$rev$1@peabody.colorado.edu...
>
> On 13-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>
>> There have been paranoid knee jerks from people who loftily believe that
>> ONLY a human could write 'proper' COBOL...
>
> I'm close to this.   The reason for structured code is to make it easier for the
> human debugger.    Putting in a bunch of switches to make a program "structured"
> does not do this.

I have to disagree with you Howard. (I don't enjoy it, because you are 
usually right and certainly one of the reasonable people here. :-))

I contest your premise "The reason for structured code is to make it easier 
for the human debugger."

That is certainly _A_ benefit of structured code, but there are other 
reasons for structuring, that may be more important than this...benefits 
from avoiding duplication and encapsulating functionality into code blocks, 
for example.

Systems that use generators to pull blocks of code or OO components together 
are still taking advantage of structure, without any human programmers being 
involved.

"Putting in a bunch of switches to make a program "structured"
 does not do this."

No it doesn't, but this is an example of how the true purpose of Oliver's 
exercise gets lost. He is NOT putting in the switches to make the code 
structured.  (The removal of branches, in and of itself, does NOT make the 
code "Structured", but it DOES make it easier to process in a subsequent 
pass...) He is putting in the switches as a step towards converting the 
code.

Dijkstra was one of the first to scientifically analyse computer programming 
and show the fundamental functions it is comprised of: Sequence, Selection, 
and Iteration. He showed that no other functions were required for flow 
control. (At least, not in the Von Neumann model we are familiar with; 
certain other architectures may require their own equivalents and have some 
added ones. Multiple parallel asynchronous processors, quantum computers 
(if we ever see them) and content addressable stores, all have different 
rules..) Out of this has come a huge amount of information and it is of 
particular importance in what Oliver is trying to do. He is implementing 
algorithms that are well understood and guarantee success. His problem is to 
ensure that he matches the language facilities (whether it is COBOL, Java, 
C, Algol, PL/1 or anything else) correctly to his algorithms. (In my opinion 
he has demonstrated excellent judgement in doing this...) He accepts that he 
is not a COBOL expert (but he doesn't have to be; he needs to understand how 
the facilities of COBOL match to his computer science model, and all the 
facilities of COBOL are well documented.) He has checked in here as a belt 
and braces approach that is systematic and thorough. Some good stuff has 
come out of his doing so.
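
For readers wondering what "putting in the switches" can look like, here is
a minimal Python sketch of the classic state-variable construction usually
credited to Bohm and Jacopini: every GO TO becomes an assignment to a
"next paragraph" switch, and a single loop dispatches on it, so only
sequence, selection and iteration remain. Whether Oliver's tool works this
way is not stated in the thread. The paragraph names INIT-PARA, LOOP-PARA
and DONE-PARA, and the COBOL shown in the comments, are made up for the
example.

```python
def run_paragraphs():
    """Emulate three GO TO-linked paragraphs with one loop and one switch.

    Hypothetical COBOL being emulated:
        INIT-PARA.  MOVE 0 TO N.
        LOOP-PARA.  IF N >= 3 GO TO DONE-PARA.
                    ADD 1 TO N.
                    GO TO LOOP-PARA.
        DONE-PARA.  GOBACK.
    """
    n = 0
    trace = []                      # order in which paragraphs are entered
    label = "INIT-PARA"             # the switch that replaces every GO TO
    while label != "END":           # iteration
        trace.append(label)
        if label == "INIT-PARA":    # selection
            n = 0
            label = "LOOP-PARA"     # was: drop through into LOOP-PARA
        elif label == "LOOP-PARA":
            if n >= 3:
                label = "DONE-PARA"  # was: GO TO DONE-PARA
            else:
                n += 1
                label = "LOOP-PARA"  # was: GO TO LOOP-PARA
        elif label == "DONE-PARA":
            label = "END"           # was: GOBACK
    return n, trace
```

The result is branch-free but no easier for a human to read than the
original, which is the distinction drawn above between removing branches
and making code "Structured".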

As COBOL programmers it is easy to start believing that we are implementing 
smart and complex code sequences that could  not possibly be grasped  or 
formulated by a machine. Wrong. The most beautiful, elegant, clever, 
imaginative code that you and I have ever written, can be decomposed 
according to well established (since the 80s) rules and 'simplified' or 
'refactored' into something that will do the exact equivalent.

The rules are so well established that the process can be automated, and 
that is what Oliver is doing.

>
>> The task you are undertaking is a very interesting and worthwhile one.
>
> Interesting and fun yes.   Worthwhile to his employer?   I'm not sold on 
> this.

I talk to managers all the time who are concerned about replacing their 
legacy systems. But they know they have to. It isn't about fashion or "if it 
ain't broke don't fix it"; it is about a dynamic market place where 
competitive edge is hard won and where it simply isn't viable to have 
millions of lines of computer code that has to be manually maintained, when 
there are other options.

So, say the company has its business model embedded in several million lines 
of COBOL and you are responsible for IT. What do you do?  COBOL programmers 
are getting harder to come by and the whole idea of maintaining bespoke code 
in-house is becoming more and more expensive and ponderous. Outsource the 
lot to, say, EDS? That is one option, but you have simply shifted the 
problem. The systems will be no more responsive to change in this scenario 
than if they stayed in-house. Maybe the members of the board have just been 
entertained by Siebel or SAP and are pretty sold on this solution.
Either of them will require tailoring, so the implementation of "New 
Technology" is going to be painful.

Anything you can do to ease the move to the new platforms will be good. If 
you could reuse all your existing COBOL functionality in a way that allowed 
it to work in the new environment, so you had the benefits of the new 
environment (too many to go into here), but bought time to consider 
rebuilding your core processes, wouldn't that be good?

Re-factoring existing code does that. If the COBOL becomes Java (or any 
other OO language) it can become encapsulated beans (functions) that can be 
wrapped appropriately and plugged into the New Technology, just like any 
other component. (Java or non-Java. Could be PHP or ASP or whatever; the New 
Technology has interfaces for all of it.) The source code is really 
irrelevant; it is the functionality that matters. If a given function is 
found not to be doing the job, it is much easier to replace that failing 
component than to try and find the required lines amongst a million lines of 
COBOL code. And then, (as you have pointed out yourself here frequently), do 
all the regression testing...

This is one reason why Oliver is not so concerned with source code (either 
COBOL or generated) beyond making sure that his algorithms can accommodate 
it. Whether he is working on a tool that will allow companies to refactor 
their own code (for their own reasons) or whether it is for a specific 
company who have recognised that they need such a tool and are prepared to 
develop it, what he is doing is valuable, and that was the point we 
disagreed on.

The four paragraphs above are insufficient to cover all the reasons why 
automated re-factoring of code is a major cost saver, but hopefully, they 
may help to persuade you that it can be...

>> Re-factoring code from existing systems, whether it be COBOL or any other
>> language is a valuable exercise.
>
I stand by it; see above...

> Exercises like this can make him a better programmer, which in the long 
> term can
> be useful to his employer.    Exercising makes us stronger.
>
He is already a very capable programmer. However, I don't disagree with you 
on this one :-)

Pete. 



0
dashwood1 (2140)
7/16/2005 12:48:06 PM
"Jeff York" <ralf4@btinternet.com> wrote in message 
news:3gbcd19unj1vmj7q1fr928cqejvofe2h48@4ax.com...
> Louis Krupp <lkrupp@pssw.nospam.com.invalid> wrote:
>
<snip>
> If I understand correctly what Oliver is attempting to do, the
> "knock-on" effects if he actually manages to succeed could be massive.

He will succeed. He is smart and knows what he is doing. Also, it has been 
done by others.


> As I see it, in order to re-factor "spaghetti" (or any other variation
> of programming "method") it's going to be necessary for the tool to
> "understand" the functions that comprise the program to be re-factored
> - effectively to understand what the program does and how it does it.

That would be the understandable reaction of most COBOL programmers, who are 
not necessarily computer scientists.

In actuality, there is no such requirement. The tool sees a tiny window of 
source code and matches it to an algorithm derived from computer science, 
which recognises all the elements of flow in any program (there are only 
three: sequence, selection and iteration, shown to be sufficient by Böhm and 
Jacopini and popularised by Dijkstra...) When Oliver completes his task, ANY 
COBOL source code will be able to be matched. "Understanding" of what the 
code is doing is not a requirement.
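
The three elements of flow usually meant here are sequence, selection and 
iteration. A minimal sketch of all three in one short program follows; the 
program name and data names are made up for illustration, and free-format 
source is assumed (some compilers need to be told about that).

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. THREEWAY.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-N      PIC 9(2) VALUE 0.
01  WS-TOTAL  PIC 9(4) VALUE 0.
PROCEDURE DIVISION.
MAIN-PARA.
    *> Sequence: statements executed one after another.
    MOVE 0 TO WS-TOTAL
    *> Iteration: repeat a block until a condition becomes true.
    PERFORM VARYING WS-N FROM 1 BY 1 UNTIL WS-N > 10
        *> Selection: choose whether to execute a block.
        IF FUNCTION MOD(WS-N, 2) = 0
            ADD WS-N TO WS-TOTAL
        END-IF
    END-PERFORM
    DISPLAY "SUM OF EVEN NUMBERS 1-10: " WS-TOTAL
    STOP RUN.
```

Note that the whole of MAIN-PARA is a single sentence: the only period is 
the one after STOP RUN.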

> Once you can do that, you have the precursor of a tool that can
> *really* convert one language to another - not just as a quick syntax
> re-hash, but as a genuine re-creation in another language.

Precisely. Not only that, but you can add intelligence to it that will 
recognise discrete functions and allow them to be 'converted' as components 
and embedded into a new solution.

Pete.

>
> I wish him luck, it sounds a fascinating task..  :-)
>
> -- 
> Jeff.         Ironbridge,  Shrops,  U.K.
> jeff@xjackfieldx.org (remove the x..x round jackfield for return address)
> and don't bother with ralf4, it's a spamtrap and I never go there.. :)
>
> ... "There are few hours in life more agreeable
>      than the hour dedicated to the ceremony
>      known as afternoon tea.."
>
>         Henry James,  (1843 - 1916).
>
>
> 



0
dashwood1 (2140)
7/16/2005 1:00:19 PM
"Pete Dashwood" <dashwood@enternet.co.nz> wrote in message
news:3jse0aFrgk9nU1@individual.net...
>
> "Howard Brazee" <howard@brazee.net> wrote in message
> news:db5qnq$rev$1@peabody.colorado.edu...
> >
> > I'm close to this.   The reason for structured code is to make it easier
> > for the  human debugger...
>
> I have to disagree with you Howard. ( I don't enjoy it, because you are
> usually right and certainly one of the reasonable people here. :-))
>
> I contest your premise "The reason for structured code is to make it
> easier for the human debugger."
>
> That is certainly _A_ benefit of structured code, but there are other
> reasons for structuring, that may be more important than this...benefits
> from avoiding duplication and encapsulating functionality into code
> blocks, for example.

Call me a semantic nitpicker, Pete, but eliminating duplication and putting
functionality into compact code blocks sure sounds like a direct benefit to
us mere humans.

I have five bucks (US or NZ, your choice) says more than once in your long
and distinguished career you have commented with disparaging adjectives on
the work of others who put the same <expletive> code in seven different
places and you only found and fixed six on your first run; or took a single
business rule and devised a procedure which managed to separate the code
actually used by that procedure with 4000 or more lines of totally unrelated
source code plus two telephone area codes and a partridge in a pear tree.

Sure, there are resource issues resulting from duplication, and potentially
some efficiency issues from 'non-compactness' , but the bottom-line cost of
these is far, far less than the cost of programming - programming done by
humans.

MCM



0
7/16/2005 1:03:08 PM
<docdwarf@panix.com> wrote in message news:db5lii$hkk$1@panix5.panix.com...
> In article <3gbcd19unj1vmj7q1fr928cqejvofe2h48@4ax.com>,
> Jeff York  <ralf4@btinternet.com> wrote:
>
> [snip]
>
>>As I see it, in order to re-factor "spaghetti" (or any other variation
>>of programming "method") it's going to be necessary for the tool to
>>"understand" the functions that comprise the program to be re-factored
>>- effectively to understand what the program does and how it does it.
>
> I agree... and disagree.  As I see it, what a program does is a result of
> how it applies the functions of the language and it is those functions
> which the restructuring tool needs to 'understand' and manipulate.

Depends what you mean by 'functions', Doc. If you consider PERFORM (for 
example) to be a 'function' of COBOL, then I would agree with what you 
stated.

>
> The ability of the restructuring program to manipulate those functions
> would seem to be predicated on the ability of the programmer who writes
> the restructurer to 'understand' the functions of the language and I am...
> cautious about a neophyte's abilities in such things, based on my own
> small experiences with both COBOL and programming neophytes; this group
> has seen a few postings by even Olde Pros where confusion has resulted due
> to difficulties in such matters.

This is where it gets hairy :-)

The 'restructuring program' is not specifically interested in restructuring 
as a goal in itself. It is a means to an end and, really, just a fringe 
benefit (which will never be realised by humans because they will never get 
to see it.)

I share your concern about neophytes. But Oliver is not a neophyte 
programmer, just new to COBOL.

In this case, 'understanding' is required within a precise mathematically 
defined domain, and the 'understanding' is for one statement at a time. 
(That's what prompted this thread in the first place; Oliver needs a single 
statement (or sentence) to be able to parse into his conversion algorithms.) 
Every 'function' (used in the sense above) of COBOL is precisely documented, 
but there are vendor specific extensions and departures from standards, not 
to mention optional constructs, that must be handled. These muddy the water, 
so Oliver wisely posted here to get some clarification.
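
To make the "single sentence" point concrete, here is a minimal sketch of 
the same logic written first as three sentences and then as one. The 
program name, paragraph names and data names are all invented for the 
example, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ONESENT.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-COUNT    PIC 9(3) VALUE 7.
01  WS-STATUS   PIC X(5) VALUE SPACES.
PROCEDURE DIVISION.
MAIN-PARA.
    PERFORM OLD-STYLE
    PERFORM NEW-STYLE
    STOP RUN.

*> Three sentences: the period after "HIGH" is what ends the IF.
OLD-STYLE.
    MOVE "LOW" TO WS-STATUS.
    IF WS-COUNT > 5
        MOVE "HIGH" TO WS-STATUS.
    DISPLAY "OLD: " WS-STATUS.

*> One sentence: END-IF ends the IF, and a single period
*> closes the paragraph.
NEW-STYLE.
    MOVE "LOW" TO WS-STATUS
    IF WS-COUNT > 5
        MOVE "HIGH" TO WS-STATUS
    END-IF
    DISPLAY "NEW: " WS-STATUS
    .
```

Both paragraphs should display the same status. The thing a converter has 
to get right is the END-IF: simply deleting the period after "HIGH" would 
pull the DISPLAY inside the IF and change the behaviour.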

Personally, I have no doubt he will succeed.

>
> Mr Wong, as noted earlier, seemed to pick up on the function of NEXT
> SENTENCE in fashion so ready that it appeared to contradict one of The
> Standard's reasons for getting rid of the function; if the rest of his
> work shows similar quality the result might be pleasant to see, indeed.
>
> (note - I put quotation marks around 'understand' because I am uncertain
> what this phenomenon entails... what are the symptoms of 'understanding'?
> How do these differ from the criteria of 'understanding?' What goes on
> when a person says 'Ahhhhhh... I understand!'?  Ol' Davey Hume wrote an
> entire Critique of Human Understanding which I've heard folks say is...
> beyond understanding; anyhow, I try to avoid using the word, just my
> prejudice.)
>
>>Once you can do that, you have the precursor of a tool that can
>>*really* convert one language to another - not just as a quick syntax
>>re-hash, but as a genuine re-creation in another language.
>
> It may say more about my powers of association than anything else but when
> I read about converting one language to another the first thing that comes
> to my mind is Larry Wall's 'A Real Programmer can write assembly in *any*
> language.'
>
I've also seen it quoted as: "A real programmer can write Fortran in any 
language."   :-)

Either way, it doesn't matter because this exercise is not about source code 
in the final analysis, and even the converted code is unlikely to be 
maintained by humans (certainly not for any period of time). Converting to 
an OO language gives the advantage of pluggability into various containers 
and platforms, collectively referred to as New Technology. (Not to be 
confused with the Windows NT product...). Even if this is not picked up 
immediately, at least the option is available, which it isn't if the source 
remains in standard COBOL.

>>
>>I wish him luck, it sounds a fascinating task..  :-)
>
> That it does, Mr York.
>
He is not the first to attempt it. However, I also wish him success.

Pete.



0
dashwood1 (2140)
7/16/2005 1:20:30 PM
"Michael Mattias" <michael.mattias@gte.net> wrote in message 
news:gA7Ce.1028$4F4.165@newssvr31.news.prodigy.com...
> "Pete Dashwood" <dashwood@enternet.co.nz> wrote in message
> news:3jse0aFrgk9nU1@individual.net...
>>
>> "Howard Brazee" <howard@brazee.net> wrote in message
>> news:db5qnq$rev$1@peabody.colorado.edu...
>> >
>> > I'm close to this.   The reason for structured code is to make it 
>> > easier
>> > for the  human debugger...
>>
>> I have to disagree with you Howard. ( I don't enjoy it, because you are
>> usually right and certainly one of the reasonable people here. :-))
>>
>> I contest your premise "The reason for structured code is to make it
> easier
>> for the human debugger."
>>
>> That is certainly _A_ benefit of structured code, but there are other
>> reasons for structuring, that may be more important than this...benefits
>> from avoiding duplication and encapsulating functionality into code
> blocks,
>> for example.
>
> Call me a semantic nitpicker, Pete, but eliminating duplication and 
> putting
> functionality into compact code blocks sure sounds like a direct benefit 
> to
> us mere humans.

Of course it is. But only if a human gets to see it... :-)

My point is that whether a human gets to see it or not, there is benefit in 
structured code. So, in an automated system, you might well prepare 
structured code, not "to make it easier for the human debugger" but because 
of the other uses it has.

>
> I have five bucks (US or NZ, your choice) says more than once in your long
> and distinguished career you have commented with disparaging adjectives on
> the work of others who put the same <expletive> code in seven different
> places and you only found and fixed six on your first run; or took a 
> single
> business rule and devised a procedure which managed to separate the code
> actually used by that procedure with 4000 or more lines of totally 
> unrelated
> source code plus two telephone area codes and a partridge in a pear tree.
>

Absolutely :-). I wouldn't take the bet. Removal of duplicated code is a 
very good benefit of structuring (whether a human gets to see/maintain it or 
not :-))

> Sure, there are resource issues resulting from duplication, and 
> potentially
> some efficiency issues from 'non-compactness' , but the bottom-line cost 
> of
> these is far, far less than the cost of programming - programming done by
> humans.
>

Sure. No disagreement from me. I think I failed to explain clearly just 
exactly what my disagreement with Howard was. Hope this helps.

Pete.



> MCM
>
>
>
> 



0
dashwood1 (2140)
7/16/2005 1:47:10 PM
I realize I'm late to the discussion; my apologies if I duplicate 
anything others have said...

Oliver Wong wrote:
> "Howard Brazee" <howard@brazee.net> wrote in message 
> news:dauj3r$1m2$1@peabody.colorado.edu...
> 
>>I'm a big proponent of structured code.    But I would rather leave the GO 
>>TO in
>>unstructured code than replacing them with switches in that same code. 
>>My
>>distaste for the second piece of code above is real strong.    Structured 
>>is a
>>lot more than GO TO - less code.
> 
> 
>     I understand your distaste (I share it as well), so let's just say that 
> "GO TO elimination" is a requirement for the project I'm working on. If the 
> code is "easy to understand", that's a plus, but it is a non-negotiable 
> requirement for me that there be no GO TO statements.

I'm not sure what the output product of this project is supposed to be 
(nor what say you may or may not have in the requirements).  However, 
unless this is an academic exercise, this strikes me as a terribly bad 
idea.  Now, I'm just as opposed to GO TO loops and PERFORM THRU logic as 
the next guy - but, there are cases where GO TO is the best tool for the 
job.

Take a 15K-line program that can reject in over 100 different ways. 
That's a *lot* of extra conditional checking (or duplicated code) if the 
ability to say "die now", along with a centralized place to wrap things up 
before doing so, is taken away.

If This-Variable = That-Variable
     Move 1537 to Reject-Code
     Go to Program-Dying-Now
End-If

Nothing wrong with that at all.  It's the best tool for the job. 
Imagine a Java project where you're told you can't catch any specific 
exceptions, just the generic "Exception" - and you can't try but one 
statement per try...catch block.  Sure, it can be done - but with a lot 
of bloated code and possibly less efficient run times.  It may be "gee 
whiz cool" to programmers, but users rarely want bigger, slower software.
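
For comparison, here is roughly what the same reject looks like once the 
GO TO is replaced by a switch. This is only a sketch under assumptions: 
REJECTED-SWITCH and the "STEP 2" / "STEP 3" placeholders are invented for 
illustration, the data values are chosen so that the reject fires, and 
free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NOGOTO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  THIS-VARIABLE      PIC 9(4) VALUE 1.
01  THAT-VARIABLE      PIC 9(4) VALUE 1.
01  REJECT-CODE        PIC 9(4) VALUE 0.
*> Hypothetical switch standing in for the removed GO TO.
01  REJECTED-SWITCH    PIC X    VALUE "N".
    88  REJECTED                VALUE "Y".
PROCEDURE DIVISION.
MAIN-PARA.
    IF THIS-VARIABLE = THAT-VARIABLE
        MOVE 1537 TO REJECT-CODE
        SET REJECTED TO TRUE
    END-IF
    *> Every later step now has to be guarded by the switch.
    IF NOT REJECTED
        DISPLAY "STEP 2 RUNS"
    END-IF
    IF NOT REJECTED
        DISPLAY "STEP 3 RUNS"
    END-IF
    IF REJECTED
        PERFORM PROGRAM-DYING-NOW
    END-IF
    STOP RUN.

PROGRAM-DYING-NOW.
    DISPLAY "REJECTED WITH CODE " REJECT-CODE
    .
```

With two guarded steps the overhead is small; with over 100 reject points 
in a 15K-line program, the guards (or the nesting that replaces them) are 
the extra conditional checking described above.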

>>Does your restructure handle *real* spaghetti code?   With GO TOs pointing 
>>all
>>over the place?
> 
> 
>     That's the plan. In theory, the algorithms should be able to handle 
> everything (PERFORM THRUs, ALTERed GOTOs, fall through logic, the works) 
> except for SECTIONS with independent overlayed segments (which is low 
> priority because it doesn't seem to appear in any of my sample COBOL files, 
> but still laying in the back of my mind as a future problem I'll have to 
> eventually address).
> 
>     In practice, I've just started trying to implement the algorithms, and 
> the first obstacle I've encountered is the desire to have just 1 sentence 
> per paragraph to make the future transformations more easy. So I started to 
> wonder if I could always transform a given COBOL program to an equivalent 
> one with only 1 sentence per paragraph, and I couldn't think of a reason why 
> not, but I thought I'd post on the newsgroup just in case.

So, this is a "first step" processor, an unstringer of spaghetti?  I can 
see where that might be useful - however, there's often little 
substitute for a skilled analyst.  :)  I still don't think removing a 
valid structure that can serve a very understandable purpose is a good 
idea, though.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~   /   \  /         ~        Live from Montgomery, AL!       ~
~  /     \/       o  ~                                        ~
~ /      /\   -   |  ~          daniel@thebelowdomain         ~
~ _____ /  \      |  ~      http://www.djs-consulting.com     ~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~ GEEKCODE 3.12 GCS/IT d s-:+ a C++ L++ E--- W++ N++ o? K- w$ ~
~ !O M-- V PS+ PE++ Y? !PGP t+ 5? X+ R* tv b+ DI++ D+ G- e    ~
~ h---- r+++ z++++                                            ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0
lxi0007 (1830)
7/17/2005 1:58:33 AM
In article <3jsft1Frbmg0U1@individual.net>,
Pete Dashwood <dashwood@enternet.co.nz> wrote:
>
><docdwarf@panix.com> wrote in message news:db5lii$hkk$1@panix5.panix.com...
>> In article <3gbcd19unj1vmj7q1fr928cqejvofe2h48@4ax.com>,
>> Jeff York  <ralf4@btinternet.com> wrote:
>>
>> [snip]
>>
>>>As I see it, in order to re-factor "spaghetti" (or any other variation
>>>of programming "method") it's going to be necessary for the tool to
>>>"understand" the functions that comprise the program to be re-factored
>>>- effectively to understand what the program does and how it does it.
>>
>> I agree... and disagree.  As I see it, what a program does is a result of
>> how it applies the functions of the language and it is those functions
>> which the restructuring tool needs to 'understand' and manipulate.
>
>Depends what you mean by 'function's, Doc. If you consider PERFORM (for 
>example) to be a 'function' of COBOL, then I would agree with what you 
>stated.

My language is not as precise as that of others, Mr Dashwood; I used 
'function' in the sense of 'any group of actions contributing to a larger 
action', the 'things that it does'.  PERFORM is/has a function, as do 
INSPECT, IF, MOVE, etc.

>
>>
>> The ability of the restructuring program to manipulate those functions
>> would seem to be predicated on the ability of the programmer who writes
>> the restructurer to 'understand' the functions of the language and I am...
>> cautious about a neophyte's abilities in such things, based on my own
>> small experiences with both COBOL and programming neophytes; this group
>> has seen a few postings by even Olde Pros where confusion has resulted due
>> to difficulties in such matters.
>
>This is where it gets hairy :-)

Were it simple folks might have already gone out for beer, aye.

>
>The 'restructuring program' is not specifically interested in restructuring 
>as a goal in itself. It is a means to an end and, really just a fringe 
>benefit (which will never be realised by humans because they will never get 
>to see it.)

Ow... it might wind up that way, Mr Dashwood, just as the structural steel 
of a skyscraper is something humans (under most normal conditions) 'never 
get to see'... but humans most certainly *did* see it, when it was being 
put together, and a good thing that was, too.  What we have here might be 
likened to a group of folks with varying degrees of interest and skill 
looking at the building of such a skeleton and responding to the 
constructor's request for comments.

>
>I share your concern about neophytes. But Oliver is not a neophyte 
>programmer, just new to COBOL.

Even the most experienced of drivers might approach handling a new vehicle 
with caution, Mr Dashwood... and what Mr Wong proposes seems to be a 
pushing-of-the-limits that's usually done by the aforementioned Olde Pros.

>
>In this case, 'understanding' is required within a precise mathematically 
>defined domain, and the 'understanding' is for one statement at a time. 
>(That's what prompted this thread in the first place; Oliver needs a single 
>statement (or sentence) to be able to parse into his conversion algorithms.) 
>Every 'function' (used in the sense above) of COBOL is precisely documented, 
>but there are vendor specific extensions and departures from standards, not 
>to mention optional constructs, that must be handled. These muddy the water, 
>so Oliver wisely posted here to get some clarification.
>
>Personally, I have no doubt he will succeed.

As noted below (which pointed out how it was noticed earlier) Mr Wong 
appeared to pick up on at least one thing with a modicum of facility... 
but being the cautious type that I am I still maintain doubts, having been 
trained in the Mac Truck school.

>
>>
>> Mr Wong, as noted earlier, seemed to pick up on the function of NEXT
>> SENTENCE in fashion so ready that it appeared to contradict one of The
>> Standard's reasons for getting rid of the function; if the rest of his
>> work shows similar quality the result might be pleasant to see, indeed.

[snip]

>> It may say more about my powers of association than anything else but when
>> I read about converting one language to another the first thing that comes
>> to my mind is Larry Wall's 'A Real Programmer can write assembly in *any*
>> language.'
>>
>I've also seen it quoted as: "A real programmer can write Fortran in any 
>language."   :-)

How interesting... I am not sure of the date of Wall's statement but the 
earliest reference I can find is 1994 
<http://groups-beta.google.com/group/rec.games.video.atari/msg/12afae38f6cd9147?dmode=source&hl=en>

... while the earliest reference for the FORTRAN version is 1983 
<http://groups-beta.google.com/group/net.micro/msg/4146c2dd3ecab2d1?dmode=source&hl=en>

DD

0
docdwarf (6044)
7/17/2005 2:30:04 AM
<docdwarf@panix.com> wrote in message news:dbcfrc$1hp$1@panix5.panix.com...
> In article <3jsft1Frbmg0U1@individual.net>,
> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>>
>><docdwarf@panix.com> wrote in message 
>>news:db5lii$hkk$1@panix5.panix.com...
>>> In article <3gbcd19unj1vmj7q1fr928cqejvofe2h48@4ax.com>,
>>> Jeff York  <ralf4@btinternet.com> wrote:
>>>
>>> [snip]
>>>
<snipped>>>
>>Personally, I have no doubt he will succeed.
>
> As noted below (which pointed out how it was noticed earlier) Mr Wong
> appeared to pick up on at least one thing with a modicum of facility...
> but being the cautious type that I am I still maintain doubts, having been
> trained in the Mac Truck school.
>
OK, Doc, do tell... What is the Mac Truck school, and what did you 
experience there :-)?

Pete.
<snip> 



0
dashwood1 (2140)
7/17/2005 11:05:56 AM
In article <3juscnFrr1jlU1@individual.net>,
Pete Dashwood <dashwood@enternet.co.nz> wrote:
>
><docdwarf@panix.com> wrote in message news:dbcfrc$1hp$1@panix5.panix.com...
>> In article <3jsft1Frbmg0U1@individual.net>,
>> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>>>
>>><docdwarf@panix.com> wrote in message 
>>>news:db5lii$hkk$1@panix5.panix.com...
>>>> In article <3gbcd19unj1vmj7q1fr928cqejvofe2h48@4ax.com>,
>>>> Jeff York  <ralf4@btinternet.com> wrote:
>>>>
>>>> [snip]
>>>>
><snipped>>>
>>>Personally, I have no doubt he will succeed.
>>
>> As noted below (which pointed out how it was noticed earlier) Mr Wong
>> appeared to pick up on at least one thing with a modicum of facility...
>> but being the cautious type that I am I still maintain doubts, having been
>> trained in the Mac Truck school.
>>
>OK, Doc, do tell... What is the Mac Truck school, and what did you 
>experience there :-)?

Whoopsie... my error and apologies, that should have been Mac*k*, not 
Mac... as in http://www.macktrucks.com/ ... as described, e'er-so-long 
ago, in

<http://groups-beta.google.com/group/comp.lang.cobol/msg/e7026b62826dbf51?dmode=source&hl=en>

--begin quoted text:

I say that unless it is necessary to the process then code should be 
written to comply with both the two-year-programmer standard and to the 
Mack Truck standard ('Anyone can be hit by a Mack Truck so code should be 
written to allow for this'); code should be written to the lowest common 
denominator because that is who is going to be maintaining it.

--end quoted text

... so the Mack Truck is the one which teaches the Mack Truck standard.

(Interestingly enough... 
<http://groups-beta.google.com/group/comp.lang.cobol/msg/25058e66ea9136ee?dmode=source&hl=en>
 shows the phrase/concept being used and the earliest posting I can find I 
made here was on 3 May of that same year.)

DD

0
docdwarf (6044)
7/17/2005 1:54:07 PM
Ah, clear.

Thanks.

Pete.

Top post no more below.
<docdwarf@panix.com> wrote in message news:dbdntv$gjm$1@panix5.panix.com...
> In article <3juscnFrr1jlU1@individual.net>,
> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>>
>><docdwarf@panix.com> wrote in message 
>>news:dbcfrc$1hp$1@panix5.panix.com...
>>> In article <3jsft1Frbmg0U1@individual.net>,
>>> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>>>>
>>>><docdwarf@panix.com> wrote in message
>>>>news:db5lii$hkk$1@panix5.panix.com...
>>>>> In article <3gbcd19unj1vmj7q1fr928cqejvofe2h48@4ax.com>,
>>>>> Jeff York  <ralf4@btinternet.com> wrote:
>>>>>
>>>>> [snip]
>>>>>
>><snipped>>>
>>>>Personally, I have no doubt he will succeed.
>>>
>>> As noted below (which pointed out how it was noticed earlier) Mr Wong
>>> appeared to pick up on at least one thing with a modicum of facility...
>>> but being the cautious type that I am I still maintain doubts, having 
>>> been
>>> trained in the Mac Truck school.
>>>
>>OK, Doc, do tell... What is the Mac Truck school, and what did you
>>experience there :-)?
>
> Whoopsie... my error and apologies, that should have been Mac*k*, not
> Mac... as in http://www.macktrucks.com/ ... as described, e'er-so-long
> ago, in
>
> <http://groups-beta.google.com/group/comp.lang.cobol/msg/e7026b62826dbf51?dmode=source&hl=en>
>
> --begin quoted text:
>
> I say that unless it is necessary to the process then code should be
> written to comply with both the two-year-programmer standard and to the
> Mack Truck standard ('Anyone can be hit by a Mack Truck so code should be
> written to allow for this'); code should be written to the lowest common
> denominator because that is who is going to be maintaining it.
>
> --end quoted text
>
> ... so the Mack Truck is the one which teaches the Mack Truck standard.
>
> (Interestingly enough...
> <http://groups-beta.google.com/group/comp.lang.cobol/msg/25058e66ea9136ee?dmode=source&hl=en>
> shows the phrase/concept being used and the earliest posting I can find I
> made here was on 3 May of that same year.)
>
> DD
>
> 



0
dashwood1 (2140)
7/17/2005 10:50:30 PM
On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:

> I contest your premise "The reason for structured code is to make it easier
> for the
> human debugger."
>
> That is certainly _A_ benefit of structured code, but there are other
> reasons for structuring, that may be more important than this...benefits
> from avoiding duplication and encapsulating functionality into code blocks,
> for example.

OK.   For *me* those other benefits are of little concern.   I want structured
code for the human debugger.   As you indicated later in your post one can also
structure as a step towards encapsulation into objects that can be used in an OO
environment.    I haven't worked anywhere that was interested in this benefit -
but we have very much appreciated the benefit of making it easier to code.

My work experience is more limited than yours, and those other reasons haven't
come up in the companies I have worked at since structured programming was
invented.
0
howard (6283)
7/18/2005 2:16:31 PM
On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:

> "Putting in a bunch of switches to make a program "structured"
>  does not do this."
>
> No it doesn't, but this is an example of how the true purpose of Oliver's
> exercise gets lost. He is NOT putting in the switches to make the code
> structured.  (The removal of branches, in and of itself, does NOT make the
> code "Structured", but it DOES make it easier to process in a subsequent
> pass...) He is putting in the switches as a step towards converting the
> code.

That will work - if it is easier to finish the task by getting rid of switches
than it would have been by getting rid of periods.    It is not obvious to me
that that would be the case.

When the code is converted to good structured code, whether paragraphs have
multiple sentences in them won't make a difference, so if he wants to eliminate
them as a step towards this goal, it won't hurt.    He won't need to add them
back in after.    But if he does this by sticking in switches, he will want to
find a step to take them out again.     I'm not seeing the advantage of this
approach.
0
howard (6283)
7/18/2005 2:21:02 PM
On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:

> Absolutely :-). I wouldn't take the bet. Removal of duplicated code is a
> very good benefit of structuring (whether a human gets to see/maintain it or
> not :-))

I've worked in environments where duplicated code was desired.   So I'll play
Devil's advocate here and ask why an automated system should significantly
benefit by eliminating duplicated code.
0
howard (6283)
7/18/2005 2:25:19 PM
"Howard Brazee" <howard@brazee.net> wrote in message
news:dbge55$gm2$1@peabody.colorado.edu...
>
> On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>
> > Absolutely :-). I wouldn't take the bet. Removal of duplicated code is a
> > very good benefit of structuring (whether a human gets to see/maintain
> > it or not :-))
>
> I've worked in environments where duplicated code was desired.   So I'll
> play Devil's advocate here and ask why an automated system should
> significantly benefit by eliminating duplicated code.

Eliminating duplicated code makes maintenance easier. So in any company
which places a high "point value" on the maintainability of software
relative to execution speed and resource usage, reducing or eliminating
duplicated code is absolutely a benefit.

If maintainability is not so important (e.g. one-time or 'on demand file
fixup' programs), then reducing duplicated code is no big thing.

Me, I place a HUGE value on maintainability, since users are always thinking
of new ways to enhance their software - always. Not to mention,
maintainability means one does not get locked in to a 'single vendor' for
maintenance services. When I was a PHB,  I always had a bad feeling about
being in that situation. Couldn't help but feel that single vendor was
holding something important and was just waiting for the right moment to
inflict massive pain by squeezing.

MCM






0
7/18/2005 2:48:57 PM
On 18-Jul-2005, "Michael Mattias" <michael.mattias@gte.net> wrote:

> > I've worked in environments where duplicated code was desired.   So I'll
> > play Devil's advocate here and ask why an automated system should
> > significantly benefit by eliminating duplicated code.
>
> Eliminating duplicated code makes maintenance easier. So in any company
> which places a high "point value" on the maintainability of software
> relative to execution speed and resource usage, reducing or eliminating
> duplicated code is absolutely a benefit.

Right.   That goes with the whole advantage of structured in making maintenance
easier.   Once learned, structured coding also is easy to design and code.    
But Pete Dashwood was listing advantages of structured that go beyond
maintainability.
0
howard (6283)
7/18/2005 3:22:36 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dbgdt4$gli$1@peabody.colorado.edu...
>
> On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>
>> "Putting in a bunch of switches to make a program "structured"
>>  does not do this."
>>
>> No it doesn't, but this is an example of how the true purpose of Oliver's
>> exercise gets lost. He is NOT putting in the switches to make the code
>> structured.  (The removal of branches, in and of itself, does NOT make 
>> the
>> code "Structured", but it DOES make it easier to process in a subsequent
>> pass...) He is putting in the switches as a step towards converting the
>> code.
>
> That will work - if it is easier to finish the task by getting rid of 
> switches
> than it would have been by getting rid of periods.    It is not obvious to 
> me
> that that would be the case.
>
> When the code is converted to good structured code, whether paragraphs 
> have
> multiple sentences in them won't make a difference, so if he wants to 
> eliminate
> them as a step towards this goal, it won't hurt.    He won't need to add 
> them
> back in after.    But if he does this by sticking in switches, he will 
> want to
> find a step to take them out again.     I'm not seeing the advantage of 
> this
> approach.

    The adding of switches is mainly because this is pretty much the 
technique used by all GOTO-eliminating algorithms I've seen discussed in the 
literature and papers I've read on this topic. If there's a way to perform 
GOTO removal without adding new switches, I'd very much like to know about 
it because generating meaningful names for switches is rather difficult 
(these names WILL probably eventually be seen by some human, at least in the 
form of the outline of the program generated at the end, if not directly in 
the COBOL source code itself).

    Whether or not there will be a "switch eliminating phase" is yet to be 
seen; it's much too early at this point to tell whether it will be 
necessary, feasible, or even desirable. Right now, when confronted with 
"lots of switches" versus "lots of GOTOs", my requirements state that "lots 
of switches" is the lesser of two evils.
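    As a rough illustration of the switch technique described above (a
minimal sketch only, not the output of any particular algorithm from the
literature), the GO TO version is shown as comments and the switch-based
equivalent follows. The program name, data names, and the switch
WS-SKIP-POST-SW are all made up for the example, and a COBOL 85 compiler
with fixed-format source is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GOTODEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMOUNT        PIC S9(5) VALUE -12.
      * Hypothetical switch introduced to replace the GO TO.
       01  WS-SKIP-POST-SW  PIC X     VALUE 'N'.
           88  SKIP-POST              VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Original logic, with branches:
      *     IF WS-AMOUNT < 0 GO TO REJECT-PARA.
      *     DISPLAY 'POSTED'.
      *     GO TO DONE-PARA.
      * REJECT-PARA.
      *     DISPLAY 'REJECTED'.
      * DONE-PARA.
      *     STOP RUN.
      * Switch-based form: the branch becomes "set a flag, then
      * test the flag around the code the GO TO used to skip".
           IF WS-AMOUNT < 0
               SET SKIP-POST TO TRUE
           END-IF
           IF SKIP-POST
               DISPLAY 'REJECTED'
           ELSE
               DISPLAY 'POSTED'
           END-IF
           STOP RUN.
```

    With WS-AMOUNT at -12 this displays REJECTED. In a case this small the
switch could be folded away into a single IF/ELSE, which is the kind of
cleanup a later "switch eliminating phase" would have to perform. Note that
the rewritten paragraph also ends up as a single sentence.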

    - Oliver 


owong (6177)
7/18/2005 3:34:43 PM
"Howard Brazee" <howard@brazee.net> wrote in message
news:dbghgi$ii9$1@peabody.colorado.edu...
> But Pete Dashwood was listing advantages of structured that go beyond
> maintainability.

Give Pete a break.

He's obviously been forced to spend WAY too much time in the Geek wing of
comp.lang.cobol.

MCM



7/18/2005 4:24:02 PM
On 18-Jul-2005, "Oliver Wong" <owong@castortech.com> wrote:

>     Whether or not there will be a "switch eliminating phase" is yet to be
> seen; it's much too early at this point to tell whether it will be
> necessary, feasible, or even desirable. Right now, when confronted with
> "lots of switches" versus "lots of GOTOs", my requirements state that "lots
> of switches" is the lesser of two evils.

Requirements are requirements.    But I wouldn't pay a dime to have someone
replace spaghetti code with lots of GOTOs with spaghetti code with lots of
switches.

I'd pay a lot to have code that looked like native structured code - without
GOTOs and without switches.
howard (6283)
7/18/2005 4:27:45 PM
On 18-Jul-2005, "Michael Mattias" <michael.mattias@gte.net> wrote:

> > But Pete Dashwood was listing advantages of structured that go beyond
> > maintainability.
>
> Give Pete a break.
>
> He's obviously been forced to spend WAY too much time in the Geek wing of
> comp.lang.cobol.

It's a fun wing to spend time in.   I'd rather learn stuff that isn't needed
than to fail to learn stuff that is useful.
howard (6283)
7/18/2005 4:29:13 PM
<all snippage>

I think that a LOT (most?) of the comments in this thread seem to have missed 
Oliver's main point that this is NOT a tool intended to create "improved for 
maintenance by a programmer" code BUT RATHER that this (his project) is intended 
to create code that will be seen (primarily) by ANOTHER automated "analysis" 
tool.

Therefore (to me) the use of switches and other techniques (mentioned or 
discussed) should be judged by

A) can it be automated?
B) is  the "resultant" code "guaranteed" to have the same semantics (run-time 
logic)?
C) is the "initial" code structure (paragraph, section, whatever - names) 
retained and distinguished from "inserted" code so that the analysis tool will 
be able to report on the ORIGINAL code structure?

Given these requirements (NOT will the resultant code be "easier to maintain" or 
even be "maintainable by the original programmer"), then using switches/flags 
seems totally appropriate.   I also don't think that "inserted" new paragraphs 
(for NEXT SENTENCE resolution) is necessarily a problem.

The only "caveat" that I would place is that my experience points to cases where 
clear-cut initial "requirements" often turn into "while you are at it" or "why 
don't we enhance this so that ..." situations and I question whether some 
"future" requirement will make the resultant code "programmer visible" and/or 
that users of the initial tool *WILL* (for whatever reason) start "using" that 
code - rather than the original code.  If/when this happens, then many of the 
"objections" mentioned in the thread will come back as "problems".

I certainly have a reasonable amount of experience with "code generator" 
applications where the product's vendor EXPLICITLY said that users should NEVER 
modify (play with - or even look at) the "generated COBOL code" - but where real 
world programmers actually DID end up using/working with this VERY UGLY code. 
To some extent (from a business reason - of the tool vendor) it may actually be 
beneficial if Oliver's "resultant code" is UGLY and difficult to maintain - as 
it will "encourage" people to use the tool for its intended purpose.

-- 
Bill Klein
 wmklein <at> ix.netcom.com 


wmklein (2605)
7/18/2005 6:47:03 PM
On 18-Jul-2005, "William M. Klein" <wmklein@nospam.netcom.com> wrote:

> I think that a LOT (most?) of the comments in this thread seem to have missed
> Oliver's main point that this is NOT a tool intended to create "improved for
> maintenance by a programmer" code BUT RATHER that this (his project) is
> intended
> to create code that will be seen (primarily) by ANOTHER automated "analysis"
> tool.

Somewhere down the road will be a finished project.   Whether it goes through 2
tools or 100 tools, there is a final goal.   It appears that a partial goal is
"to make the program continue working the way it does now".   Another partial
goal is to "eliminate spaghetti code".

I guess that he wants the other benefits of structured code.   Of those
benefits, the one I am most familiar with is that it makes code easy to write
and maintain.

It is important to keep the final destination in mind as we go through the
preliminary steps of such a journey.


> I certainly have a reasonable amount of experience with "code generator"
> applications where the product's vendor EXPLICITLY said that users should
> NEVER
> modify (play with - or even look at) the "generated COBOL code" - but where
> real
> world programmers actually DID end up using/working with this VERY UGLY code.
> To some extent (from a business reason - of the tool vendor) it may actually
> be
> beneficial if Oliver's "resultant code" is UGLY and difficult to maintain - as
>
> it will "encourage" people to use the tool for its intended purpose.

What's its intended purpose?    To turn working spaghetti code into working
difficult-to-maintain code?    Already have that.
howard (6283)
7/18/2005 8:04:42 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dbge55$gm2$1@peabody.colorado.edu...
>
> On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>
>> Absolutely :-). I wouldn't take the bet. Removal of duplicated code is a
>> very good benefit of structuring (whether a human gets to see/maintain it 
>> or
>> not :-))
>
> I've worked in environments where duplicated code was desired.

I'd be interested to know what those environments were, Howard. Never 
encountered it myself, and am having great difficulty imagining it.


> So I'll play
> Devil's advocate here and ask why an automated system should significantly
> benefit by eliminating duplicated code.

A very fair question. If no human is going to maintain it, you could well 
argue that duplication is irrelevant. However, it isn't. Here are some 
reasons why even a code generator might avoid duplicated code:

1. Isolation and reuse of functionality. Refactoring code into component 
functions means that the code can be shared and reused. Even if you are not 
maintaining it, you might want to export the functionality to other systems 
or include it in new features of existing systems. (This is the main 
advantage of component based systems.)

2. The less code that is generated, the faster it will run.

3. It is easier to maintain an inventory of discrete functions if their 
functionality is not duplicated all over the place.

At a line-by-line level a code generator may well duplicate code because it 
doesn't 'understand' the overall picture. However, subsequent passes of the 
conversion process can analyse this generated code (because it is now in a 
form where it is suitable for automated analysis), and recognise repeated 
code. Optimization can create functions and replace the duplication with 
function (or component) activation. (I didn't say CALLs, because it depends 
on the environment...).
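To make the idea of replacing duplication with a single activation concrete,
here is a minimal sketch in COBOL, assuming a COBOL 85 compiler and
fixed-format source. The names (DEDUP, ADD-PRICE-WITH-TAX, the WS- items) and
the 10% rate are invented for illustration. The comment marks where the same
two statements would originally have been repeated inline.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEDUP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PRICE   PIC 9(5)V99 VALUE 100.00.
       01  WS-TAX     PIC 9(5)V99 VALUE ZERO.
       01  WS-TOTAL   PIC 9(6)V99 VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Before: the COMPUTE/ADD pair in ADD-PRICE-WITH-TAX was
      * coded inline twice, once for each price. After: one
      * shared paragraph, activated with PERFORM.
           PERFORM ADD-PRICE-WITH-TAX
           MOVE 50.00 TO WS-PRICE
           PERFORM ADD-PRICE-WITH-TAX
           DISPLAY 'TOTAL: ' WS-TOTAL
           STOP RUN.
      * Hypothetical paragraph created from the repeated code.
       ADD-PRICE-WITH-TAX.
           COMPUTE WS-TAX ROUNDED = WS-PRICE * 0.10
           ADD WS-PRICE WS-TAX TO WS-TOTAL.
```

Here the activation is a PERFORM; in another environment it could just as
well be a CALL or a method invocation.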

The bottom line is this:

Computer programming is well enough understood now that existing code can be 
'rationalised' and refactored. (We haven't yet got AI generators that are 
capable of creating code, but there are techniques that support this and it 
is a 'next step' in system evolution. (Iterative interaction with a user is 
one possible process whereby a computer could 'create' new functionality. - 
I mentioned this years ago, right here in this forum. I even considered 
actually doing it, but got sidetracked by the need to make a living... :-)) 
There are already tools that can do this in some limited spheres. As 
processor and storage resources increase and become cheaper/faster, 
techniques that were previously non-viable become feasible. As these 
techniques are investigated, they get more refined and better.) Even without 
going to Science Fiction, Computer Science has enabled refactoring tools, 
such as the one Oliver is building, to become a reality.

It is no longer accurate to believe that only a human programmer, who has 
full understanding of a process, can be capable of refactoring that process. 
Indications are that smart software can do it faster, cheaper, and with 
fewer errors.

Pete.



dashwood1 (2140)
7/18/2005 11:02:50 PM
Thanks guys, I appreciate you cutting me some slack :-)

On this occasion it isn't necessary. I have responded to Howard's post and 
what I'm saying isn't geeky, it is accepted practice in environments where 
they use modern techniques as a matter of course.

People who are not COBOL programmers are used to interacting with tools that 
allow them to assemble system functionality, in a way that someone whose 
whole life is COBOL could not imagine. There are 'drag and drop' tools that 
generate system functionality in minutes, that would take days to duplicate 
in COBOL. My point is that there is another world outside of procedural 
programming. And it is not the domain of geeks; people in the work place are 
encountering it in one form or another. And newcomers take it for granted.

Pete.

Top Post, no more below.

"Howard Brazee" <howard@brazee.net> wrote in message 
news:dbgldf$kf2$1@peabody.colorado.edu...
>
> On 18-Jul-2005, "Michael Mattias" <michael.mattias@gte.net> wrote:
>
>> > But Pete Dashwood was listing advantages of structured that go beyond
>> > maintainability.
>>
>> Give Pete a break.
>>
>> He's obviously been forced to spend WAY too much time in the Geek wing of
>> comp.lang.cobol.
>
> It's a fun wing to spend time in.   I'd rather learn stuff that isn't 
> needed
> than to fail to learn stuff that is useful.
> 



dashwood1 (2140)
7/18/2005 11:11:37 PM
Good for you, Bill.

My original post here was making precisely this point.

Pete.

Top post, no more below.

"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:HOSCe.389872$iM6.48195@fe01.news.easynews.com...
> <all snippage>
>
> I think that a LOT (most?) of the comments in this thread seem to have 
> missed Oliver's main point that this is NOT a tool intended to create 
> "improved for maintenance by a programmer" code BUT RATHER that this (his 
> project) is intended to create code that will be seen (primarily) by 
> ANOTHER automated "analysis" tool.
>
> Therefore (to me) the use of switches and other techniques (mentioned or 
> discussed) should be judged by
>
> A) can it be automated?
> B) is  the "resultant" code "guaranteed" to have the same semantics 
> (run-time logic)?
> C) is the "initial" code structure (paragraph, section, whatever - names) 
> retained and distinguished from "inserted" code so that the analysis tool 
> will be able to report on the ORIGINAL code structure?
>
> Given these requirements (NOT will the resultant code be "easier to 
> maintain" or even be "maintainable by the original programmer"), then 
> using switches/flags seems totally appropriate.   I also don't think that 
> "inserted" new paragraphs (for NEXT SENTENCE resolution) is necessarily a 
> problem.
>
> The only "caveat" that I would place is that my experience points to cases 
> where clear-cut initial "requirements" often turn into "while you are at 
> it" or "why don't we enhance this so that ..." situations and I question 
> whether some "future" requirement will make the resultant code "programmer 
> visible" and/or that users of the initial tool *WILL* (for whatever 
> reason) start "using" that code - rather than the original code.  If/when 
> this happens, then many of the "objections" mentioned in the thread will 
> come back as "problems".
>
> I certainly have a reasonable amount of experience with "code generator" 
> applications where the product's vendor EXPLICITLY said that users should 
> NEVER modify (play with - or even look at) the "generated COBOL code" - 
> but where real world programmers actually DID end up using/working with 
> this VERY UGLY code. To some extent (from a business reason - of the tool 
> vendor) it may actually be beneficial if Oliver's "resultant code" is UGLY 
> and difficult to maintain - as it will "encourage" people to use the tool 
> for its intended purpose.
>
> -- 
> Bill Klein
> wmklein <at> ix.netcom.com
>
> 



dashwood1 (2140)
7/18/2005 11:14:09 PM

Pete Dashwood wrote:
> (snip) 
> It is no longer accurate to believe that only a human programmer, who has 
> full understanding of a process, can be capable of refactoring that process. 
> Indications are that smart software can do it faster, cheaper, and with 
> fewer errors.
> 
> Pete.

If you only have one program to restructure, then it might be cheaper 
for a human programmer to do it, but that's only because the tools can 
be pretty expensive.

I would much rather use a refactoring tool because I can absolutely 
guarantee the computer can do it much faster than me and with much 
more accurate results.  Without the tool, I would think long and hard 
about doing it myself.  Who has the time?

-- 
http://arnold.trembley.home.att.net/

7/19/2005 2:37:09 AM

Pete Dashwood wrote:

> Thanks guys, I appreciate you cutting me some slack :-)
> 
> On this occasion it isn't necessary. I have responded to Howard's post and 
> what I'm saying isn't geeky, it is accepted practice in environments where 
> they use modern techniques as a matter of course.
> 
> People who are not COBOL programmers are used to interacting with tools that 
> allow them to assemble system functionality, in a way that someone whose 
> whole life is COBOL could not imagine. There are 'drag and drop' tools that 
> generate system functionality in minutes, that would take days to duplicate 
> in COBOL. My point is that there is another world outside of procedural 
> programming. And it is not the domain of geeks; people in the work place are 
> encountering it in one form or another. And newcomers take it for granted.
> 
> Pete.

Way back in 1996, my employer paid for a class in Visual Basic 
(version 4, I believe).  While I have never had occasion to use it at 
work, it was very interesting since I had no previous (or subsequent) 
exposure to GUI programming.  I wasn't all that keen on writing a 
program by "dragging and dropping", but I was very impressed when 
every student in the class built a working inquiry application against 
an Access table in mere minutes, without writing a line of code.

It would have taken me several days to do the same task in traditional COBOL.

A few months ago I had an introductory class on Java programming.  I 
managed to get through all the in-class assignments, but they never 
got as far as reading or writing a file, or even accepting user 
entered data from the screen.  I guess that must be covered in the 
intermediate or advanced class.


-- 
http://arnold.trembley.home.att.net/

7/19/2005 2:47:19 AM
"Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message 
news:XQZCe.11276$5N3.47@bgtnsc05-news.ops.worldnet.att.net...
>
>
> Pete Dashwood wrote:
>
>> Thanks guys, I appreciate you cutting me some slack :-)
>>
>> On this occasion it isn't necessary. I have responded to Howard's post 
>> and what I'm saying isn't geeky, it is accepted practice in environments 
>> where they use modern techniques as a matter of course.
>>
>> People who are not COBOL programmers are used to interacting with tools 
>> that allow them to assemble system functionality, in a way that someone 
>> whose whole life is COBOL could not imagine. There are 'drag and drop' 
>> tools that generate system functionality in minutes, that would take days 
>> to duplicate in COBOL. My point is that there is another world outside of 
>> procedural programming. And it is not the domain of geeks; people in the 
>> work place are encountering it in one form or another. And newcomers take 
>> it for granted.
>>
>> Pete.
>
> Way back in 1996, my employer paid for a class in Visual Basic (version 4, 
> I believe).  While I have never had occasion to use it at work, it was 
> very interesting since I had no previous (or subsequent) exposure to GUI 
> programming.  I wasn't all that keen on writing a program by "dragging and 
> dropping", but I was very impressed when every student in the class built 
> a working inquiry application against an Access table in mere minutes, 
> without writing a line of code.
>
> It would have taken me several days to do the same task in traditional 
> COBOL.
>

Thanks for offering that, Arnold.  I think sometimes people think I make 
this stuff up... :-)

> A few months ago I had an introductory class on Java programming.  I 
> managed to get through all the in-class assignments, but they never got as 
> far as reading or writing a file, or even accepting user entered data from 
> the screen.  I guess that must be covered in the intermediate or advanced 
> class.
>
Java sees 'files' as streams, and has appropriate classes to manage them. I 
strongly recommend 'Sams Teach Yourself Java 2 in 24 Hours', by Rogers 
Cadenhead. He has a delightful dry wit that makes the book entertaining as 
well as informative.

Pete.



dashwood1 (2140)
7/19/2005 3:49:06 AM
"Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message 
news:pHZCe.11247$5N3.8632@bgtnsc05-news.ops.worldnet.att.net...
>
>
> Pete Dashwood wrote:
>> (snip) It is no longer accurate to believe that only a human programmer, 
>> who has full understanding of a process, can be capable of refactoring 
>> that process. Indications are that smart software can do it faster, 
>> cheaper, and with fewer errors.
>>
>> Pete.
>
> If you only have one program to restructure, then it might be cheaper for 
> a human programmer to do it, but that's only because the tools can be 
> pretty expensive.

That's a fair point. My experience has been that when refactoring is 
required, it is on a scale that affects more than one program. But I agree 
that wouldn't always be the case, and it could make sense to do it manually, 
rather than commit to expensive tools which have a limited life (once you've 
done it, it is done...). I think this is why some companies outsource the 
refactoring process to service providers who have experience in this area, 
and have the tools on hand. Of course, sensitivity surrounding source code 
being what it is, there will definitely be cases where this is not an 
option.

Management need to consider much more than just the technical aspects when 
it comes to refactoring code.
>
> I would much rather use a refactoring tool because I can absolutely 
> guarantee the computer can do it much faster than me and with much more 
> accurate results.  Without the tool, I would think long and hard about 
> doing it myself.  Who has the time?

I'm not sure, Arnold, but I believe you are working in a mainframe 
environment?

If it wouldn't be spilling trade secrets, could you give some insight into 
the kind of experience you have had with this? I'd be interested to hear 
about refactoring tools that have been used successfully in Mainframe 
environments, and the circumstances surrounding it.

Pete.



dashwood1 (2140)
7/19/2005 3:57:46 AM

Pete Dashwood wrote:

> "Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message 
> news:pHZCe.11247$5N3.8632@bgtnsc05-news.ops.worldnet.att.net...
> 
>>
>>Pete Dashwood wrote:
>>
>>>(snip) It is no longer accurate to believe that only a human programmer, 
>>>who has full understanding of a process, can be capable of refactoring 
>>>that process. Indications are that smart software can do it faster, 
>>>cheaper, and with fewer errors.

And I basically agree with this.  If you have several programs to 
restructure, it will definitely be cheaper to use a good tool.  The 
tool already wins on faster and with fewer errors.

>>>
>>>Pete.
>>
>>If you only have one program to restructure, then it might be cheaper for 
>>a human programmer to do it, but that's only because the tools can be 
>>pretty expensive.
> 
> 
> That's a fair point. My experience has been that when refactoring is 
> required, it is on a scale that affects more than one program. But I agree 
> that wouldn't always be the case, and it could make sense to do it manually, 
> rather than commit to expensive tools which have a limited life (once you've 
> done it, it is done...). 

Yes, if you've done a thorough restructuring of a program, it will 
take years of accumulated maintenance before you need to rewrite again 
just to clean up the code.  And if it's more than ten years, you may 
end up rewriting from a clean slate just because the business rules or 
requirements have changed.


> I think this is why some companies outsource the 
> refactoring process to service providers who have experience in this area, 
> and have the tools on hand. Of course, sensitivity surrounding source code 
> being what it is, there will definitely be cases where this is not an 
> option.
> 
> Management need to consider much more than just the technical aspects when 
> it comes to refactoring code.
> 
>>I would much rather use a refactoring tool because I can absolutely 
>>guarantee the computer can do it much faster than me and with much more 
>>accurate results.  Without the tool, I would think long and hard about 
>>doing it myself.  Who has the time?
> 
> 
> I'm not sure, Arnold, but I believe you are working in a mainframe 
> environment?

That is correct.  Three z/990's spread across two sites, CICS, DB2, 
and Enterprise COBOL for z/OS.

> 
> If it wouldn't be spilling trade secrets, could you give some insight into 
> the kind of experience you have had with this? I'd be interested to hear 
> about refactoring tools that have been used successfully in Mainframe 
> environments, and the circumstances surrounding it.
> 
> Pete.

As long as I don't talk about what the programs do in specific 
business terms, there's no risk of spilling trade secrets.

I think my company acquired IBM's COBOL Structuring Facility (COBOL/SF) by 
accident.  They probably got it at a reduced price because they were 
at the cusp of converting from OS/VS COBOL to VS COBOL II, and that 
was another feature available with COBOL/SF.  There were some huge 
changes in CICS between OS/VS COBOL and VS COBOL II in the handling of 
Linkage Section items, and I believe COBOL/SF could fix those too, although 
all the stinkers I actually put into production were plain old batch 
COBOL.

Anyway, I may have been the only one who actually used the product. 
The sysprogs announced they had it, if anyone wanted to use it, and I 
had some spare time.  This was around 1994, I think.

One problem I had learning to use the tool was that it ran as a TSO 
application.  At that time only the sysprogs normally had access to 
TSO/ISPF.  The application programmers used an ISPF lookalike called 
SYSD which was a CICS application that used considerably fewer 
resources.  That was a legacy of the shop having been a VSE shop in 
the early 1980's, before I got there.  I imagined that COBOL/SF should be 
just a batch job, with an input COBOL source file and an output 
restructured COBOL source file.  But COBOL/SF was actually written to use 
TSO panels to specify a rather large number of runtime or conversion 
options, and the tool itself required the PL/I runtime library, which 
tells me that COBOL/SF was written in PL/I.

The tool also produced voluminous reports with metrics (McCabe 
complexity, etc.) in addition to a COBOL source code file that was 
ready to compile.  I forget how it handled copybooks, but I think 
it would need to see them in order to do a thorough job of refactoring.

There were so many options you could specify that I doubt that I can 
remember them all.  Some of them seemed a bit trivial to me, like 
whether you wanted IF statements indented three columns or four 
columns.  Whatever you chose, the output would be "beautified" in a 
consistent way that helped readability.  You could have it generate 
sections only or paragraphs only.  It could generate all PERFORM's 
with THRU if you wanted that.  My choice was paragraphs only, and 
PERFORM with no THRU.  You could have it convert nested IF's to 
EVALUATE's where it deemed them appropriate.  You could have it 
generate comments at the head of each paragraph/section that reported 
the names of all calling procedures.  Since the restructured program 
normally had fewer paragraph names than the input source code, you 
could choose to have it propagate the deleted paragraph names as 
comments in the output program.  It would list all the dead code that 
it left out of the output program.  Sometimes it was a real 
revelation to learn that some critical piece of code could never 
even be reached.  That could be an indication of a bug in the input 
source code, so you would want to know what the "dead code" was.
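To illustrate just the nested-IF-to-EVALUATE option, here is a minimal 
sketch of the kind of rewrite involved.  It is not actual COBOL/SF 
output; the program name and data names are invented, and it assumes a 
COBOL 85 compiler with fixed-format source.  The input style is shown 
as comments, followed by an equivalent EVALUATE.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EVALDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CODE  PIC X     VALUE 'B'.
       01  WS-RATE  PIC 9V99  VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Input style: nested IFs, all closed by a single period.
      *     IF WS-CODE = 'A'
      *         MOVE 0.05 TO WS-RATE
      *     ELSE
      *         IF WS-CODE = 'B'
      *             MOVE 0.10 TO WS-RATE
      *         ELSE
      *             MOVE 0.20 TO WS-RATE.
      * Restructured style: one EVALUATE with an explicit scope
      * terminator, so no period is needed until the paragraph
      * ends.
           EVALUATE WS-CODE
               WHEN 'A'
                   MOVE 0.05 TO WS-RATE
               WHEN 'B'
                   MOVE 0.10 TO WS-RATE
               WHEN OTHER
                   MOVE 0.20 TO WS-RATE
           END-EVALUATE
           DISPLAY 'RATE: ' WS-RATE
           STOP RUN.
```

The EVALUATE keeps every branch at the same indentation level, which 
is presumably why a tool would prefer it once the nesting gets deep.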

You also had the option of having the tool split up the input program 
into multiple output programs, if the analysis reports indicated that 
there might be some benefit in doing so.  But I never chose to do that.

Most of my learning curve time was spent in learning enough TSO so I 
could work through the panels (screens) to set my options and start a 
conversion run.  I don't recall a way to set it up to convert multiple 
programs in a single pass, but that may have been an option.  Once I 
chose the options that were most compatible with local shop standards 
(or my personal preferences), I didn't really change them again.  In 
any event, I only restructured about three or four programs that I 
actually put into production.  There was another one I wish I had 
converted, but I just never got around to it.  I think every one 
of them was also a conversion from OS/VS COBOL to VS COBOL II.

My company was pretty conservative in its approach.  Since we had no 
prior experience with the tool, I was pretty much required to test 
each converted program as if it were a new program I had just 
written.  IBM also would not guarantee perfectly identical execution, 
and recommended testing every converted program.

In the end, my company eventually dropped the license, because we 
didn't use the product enough.  Part of that was that I did not have a 
project to do this, I was free-lancing.  And part of that was one 
department that was convinced that COBOL II produced much less 
efficient code than OS/VS COBOL.  They didn't convert until Y2K, and 
then they had to do that manually.

In my opinion, the good points about IBM COBOL/SF were:

1.  Very fast.  Once you had your options selected, you could convert a 
program in a couple of minutes.  You could spend an hour reading the 
reports to see just how bad the original program was.
2.  Very accurate.  I never found any bugs in the converted programs. 
As far as I could tell, the generated programs produced exactly the 
same results as the original, unstructured programs.  Obviously, I 
never had a generated program that would not compile cleanly.
3.  The tool did an excellent job of getting rid of GO TO, ALTER, and 
PERFORM THRU.  If you had particularly ugly PERFORM THRU logic, it 
would generate working-storage switches with procedure division logic 
to set them and then choose what to execute.  This handled cases where 
you might perform A thru E-exit or C thru E-Exit.
4.  The output program was "beautified" so it was easier to read, and 
it would fix alignment errors.  All the output conditional logic of IF 
and EVALUATE would be consistently indented.  That alone would prevent 
some future maintenance problems.
5.  The analysis reports produced by the restructuring tool could be 
very helpful as well.
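
The switch technique in point 3 can be sketched roughly as follows. 
This is an illustration under stated assumptions, not what COBOL/SF 
actually generated: the switch WS-ENTRY-SW, the merged paragraph 
A-THRU-E, and the DISPLAY statements standing in for real work are all 
invented.  The original is assumed to have paragraphs A-PARA through 
E-EXIT in sequence, entered either at A-PARA or at C-PARA.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. THRUDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical switch recording where the old PERFORM THRU
      * range would have been entered.
       01  WS-ENTRY-SW      PIC X VALUE 'A'.
           88  ENTER-AT-A         VALUE 'A'.
           88  ENTER-AT-C         VALUE 'C'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Was: PERFORM A-PARA THRU E-EXIT
           SET ENTER-AT-A TO TRUE
           PERFORM A-THRU-E
      * Was: PERFORM C-PARA THRU E-EXIT
           SET ENTER-AT-C TO TRUE
           PERFORM A-THRU-E
           STOP RUN.
      * One paragraph replaces the whole range; the switch decides
      * whether the A-PARA and B-PARA work is skipped.
       A-THRU-E.
           IF ENTER-AT-A
               DISPLAY 'A-PARA WORK'
               DISPLAY 'B-PARA WORK'
           END-IF
           DISPLAY 'C-PARA WORK'
           DISPLAY 'D-PARA WORK'
           DISPLAY 'E-PARA WORK'.
```

The first PERFORM displays the work of all five paragraphs and the 
second only that of C, D, and E, matching the two original entry 
points without any PERFORM THRU.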

There was one thing I found a bit disappointing about the results.  If 
the input program had poor datanames and procedure names, the output 
program still had poor names, plus there would be fewer procedure 
names.  That's just a limitation of using an automated tool.  I don't 
see any way that the tool could rename the data and procedures to make 
the program easier for a person to understand in business terms.  But 
the control flow would definitely be much easier to understand and 
manipulate.

From my point of view, if the restructured program was going to be 
maintained by human programmers, then you could get even better 
results by having a human programmer look over the generated program 
and rename the data items and procedures to improve the 
"self-documenting" nature of the program.  That said, if the original 
program was a real dog, anything would be better.  Certainly the 
generated program would be no worse than the original in terms of 
readability, and fixing the control flow made other kinds of 
maintenance much less risky.

I was very favorably impressed that it could do as much as it 
did.  If the decision had been up to me, I would have recommended 
keeping IBM COBOL/SF, and using it as much as possible.  But as you 
point out, once you've refactored a program, it's done.  Eventually 
the tool would have less and less work to do, as you ran out of 
programs to fix.  Although even then, it might be worthwhile to 
restructure a frequently changed program every year, just to keep bad 
practices from creeping back in.

I don't know if that fully answers your question.  If you think of 
something I haven't mentioned, just let me know.

With kindest regards,

-- 
http://arnold.trembley.home.att.net/

7/19/2005 5:49:55 AM
Unfortunately, I choose the interspersed method of response.  It's more 
questions than comments I fear.

JCE

"Pete Dashwood" <dashwood@enternet.co.nz> wrote in message
news:3k2qouFs9h1bU1@individual.net...
>
> "Howard Brazee" <howard@brazee.net> wrote in message
> news:dbge55$gm2$1@peabody.colorado.edu...
>>
>> On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>>
>>> Absolutely :-). I wouldn't take the bet. Removal of duplicated code is a
>>> very good benefit of structuring (whether a human gets to see/maintain
>>> it or
>>> not :-))
>>
>> I've worked in environments where duplicated code was desired.
>
> I'd be interested to know what those environments were, Howard. Never
> encountered it myself, and am having great difficulty imagining it.

I thought of a couple of examples that I've come across and after writing 
them down it becomes clear that there are other solutions and it wasn't a 
"desire" but just the way it ended up.

The most reasonable example was a program I wrote to generate report 
programs from the DB2 catalogue (well actually the DDL as we didn't have 
access to the catalogue).  As each program was generated it was simply the 
same code spat out each time. Each one was modified and tailored for 
specific enhancements in "user areas".   At the time, it made much more 
sense to have each report be a simple report (300 lines or so of code of 
which about 280 was generated and common)....and self documenting and 
encapsulating.  This made it easier to get it into our audit documents and 
into our product asset plan in short term.  It wholly simplified the build 
process.

I told you it wasn't even a good example.  I have a _really_ hard time 
seeing this need too outside of places using the LOC metric.

>> So I'll play
>> Devil's advocate here and ask why an automated system should
>> significantly
>> benefit by eliminating duplicated code.
>
> A very fair question. If no human is going to maintain it, you could well
> argue that duplication is irrelevant. However, it isn't . Here are some
> reasons why even a  code generator might avoid duplicated code:
>
> 1. Isolation and reuse of functionality. Refactoring code into component
> functions means that the code can be shared and reused. Even if you are
> not maintaining it, you might want to export the functionality to other
> systems or include it in new features of existing systems. (This is the
> main advantage of component based systems.)

Imagine the following situation.  I create a component that is well defined 
to do some data access functions.  You like it but want to add some 
functionality.  I ask primarily because you have experience in the area, but 
I've recently (for whatever reason) been thinking about .NET 
versioning...which seems a form of code duplication - the only difference 
being that you can drop support for versions (in essence).

Which is the most common scenario in your experience:

Do you "extend" my component and create a new one based solely on mine ?  If 
you want to do something close but not quite, you may have to duplicate 
the function in order to make your slight change.

Do you update my source code (assuming it's available) ?  Perhaps 
duplicating components code if I create a new version of my component that 
is slightly incompatible with your enhancements (even Java gave up on 
backward compatibility).

I just see an explosion of components and code that becomes very unmanageable 
unless the code is very clearly owned, and that ownership just implies more 
components doing similar things, each created to gain some ownership over how 
the code is used.

If I take the concept down to these levels there seems to be no worry about 
the concept of code duplication across development boundaries.  It 
fundamentally appears to be at its root a "desk policy".  Some might find 
mine messy or otherwise but it doesn't matter because you're not looking at 
my stuff anyway (I actually have an empty desk policy - I don't keep 
anything paper which makes moving simple).  The component hides it from you 
making code duplication a non issue.  Even duplicated function isn't a 
problem if it exists multiple times in many flavours.  It's a failure to 
profit from sound repeatable function that's the problem.

> 2. The less code that is generated, the faster it will run.
> 3. It is easier to maintain an inventory of discrete functions if their
> functionality is not duplicated all over the place.
I would imagine that for a code generator this is a non issue.  I would 
actually think duplicate code is a lot simpler for autotools to manage.

> At a line-by-line level a code generator may well duplicate code because
> it doesn't 'understand' the overall picture. However, subsequent passes of
> the conversion process can analyse this generated code (because it is now
> in a form where it is suitable for automated analysis), and recognise
> repeated code. Optimization can create functions and replace the
> duplication with function (or component) activation. (I didn't say CALLs,
> because it depends on the environment...).
But now who would maintain that code? The generator, the end user, or the 
refactoring tool?
If the process is always A to B to C...who's responsible for the build and 
QA test ? Three tools touching code is almost as dangerous as one person 
touching it :-)  To date the best advice I've received is, "if you cannot 
understand the generated code, then you shouldn't use the generator".  Good 
code is readable regardless of who wrote it.

> The bottom line is this:
> Computer programming is well enough understood now that existing code can
> be 'rationalised' and refactored. (We haven't yet got AI generators that
> are capable of creating code, but there are techniques that support this
> and it is a 'next step' in system evolution. (Iterative interaction with a
> user is one possible process whereby a computer could 'create' new
> functionality. - I mentioned this years ago, right here in this forum. I
> even considered actually doing it, but got sidetracked by the need to make
> a living... :-))

I recently read an article somewhere that was in Software Developer I 
think...I thought of you immediately as they have created a new IDE that 
allows generic modelling of "objects" without dropping you straight into 
code.  They took the reference architecture for J2EE and .NET (the Java 
Petstore I believe) and showed that it took significantly less time and fewer 
lines of code to produce (I believe it used J2EE underneath) than even .NET 
which took less than J2EE.  It made me laugh though because immediately 
after the whole "code generating smart ide" there was a great article on the 
power and need for Subversion - and source code management in general.  When 
popular press is caught between models you know the future is bright!

> There are already tools that can do this in some limited
> spheres. As processor and storage resources increase and become
> cheaper/faster, techniques that were previously non-viable become
> feasible. As these techniques are investigated, they get more refined and
> better.) Even without going to Science Fiction, Computer Science has
> enabled refactoring tools, such as the one Oliver is building, to become a
> reality.
Given the need that "reusable" componentry be bullet proof, it is typical to 
see test frameworks built into the component build.  This requires tight 
communication and understanding of the component, or you will be limited to 
checking variables and data types against limits and so forth.  Unless a 
refactoring tool can extract the dead code, clean up the mess and provide 
reasoned documentation of the program it seems somewhat fruitless. 
Spaghetti code represents spaghetti logic.  You can straighten out the code, 
but it's still messed up logic.

My immediate guess is that COBOL code in general is based on a large 
program methodology where functions are not particularly useful - the common 
code will be getting dates, writing headers, etc.  The real business logic 
is all wrapped up and embedded in there...not a lot of value in extracting. 
I think refactoring in a more pattern based technology (OO) will be more 
useful.  If they got "real" smart, the system should isolate function and 
replace it with standard solutions....how many people wrote their own 
"logging" routines, or "web frameworks" or servlets and applets before log4j 
and struts and jsp custom tags ?  I remember once I wrote something that was 
very similar to a point and shoot JMS solution.....and then came JMS.


> It is no longer accurate to believe that only a human programmer, who has
> full understanding of a process, can be capable of refactoring that
> process. Indications are that smart software can do it faster, cheaper,
> and with fewer errors.
>
> Pete.
But when will smart software do it faster, cheaper, and with autocorrect ? 
=)

JCE



0
defaultuser (532)
7/19/2005 6:22:32 AM
"Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message 
news:XQZCe.11276$5N3.47@bgtnsc05-news.ops.worldnet.att.net...
> Way back in 1996, my employer paid for a class in Visual Basic (version 4, 
> I believe).  While I have never had occasion to use it at work, it was 
> very interesting since I had no previous (or subsequent) exposure to GUI 
> programming.  I wasn't all that keen on writing a program by "dragging and 
> dropping", but I was very impressed when every student in the class built 
> a working inquiry application against an Access table in mere minutes, 
> without writing a line of code.
>
> It would have taken me several days to do the same task in traditional 
> COBOL.

While looking for a tutorial on Visual Basic over the Internet (I'm guessing 
this was around when VB6 was out, maybe around 1998-1999), I found one in 
which the first lesson said that implementing "Hello World" was simply 
too trivial in VB, so the first lesson was writing a text editor. While this 
text editor was more simplistic than some of the "professional" ones out 
there, it had more features than the "Notepad" application that comes with 
Windows.

It's vague in my memory now, but I think it mainly consisted of dragging a 
"textfield" GUI element, and a "menu bar" GUI element, onto the windows 
form; editing a resource file to specify what options would appear in the 
menu bar (and was flexible enough that you could just change the resource 
file to provide your application in different languages); and then writing 
the code for all the menu options, which typically were 3 to 5 lines of code 
each ("save", for example, consisted of displaying a save prompt [which 
could be done in 1 statement], reading the text data from the textfield 
[another statement], and then writing it to a file [a 3rd statement]).

> A few months ago I had an introductory class on Java programming.  I 
> managed to get through all the in-class assignments, but they never got as 
> far as reading or writing a file, or even accepting user entered data from 
> the screen.  I guess that must be covered in the intermediate or advanced 
> class.

    Yeah, VB is what many people call a RAD language, for Rapid Application 
Development. I've never used it in a corporate environment, but I heard that 
some companies use it as a prototyping tool, to quickly develop something to 
show to the client, either within a day, or even right in front of the 
client, to make sure both parties understand exactly what it is the client 
wants. Once they've established an agreement, they then write the 
application in a "real" language (which I assume to them means C++).

    Java is very much NOT a RAD language; simple programs can probably be 
written more easily and more quickly in other languages such as VB. However, 
I think Java provides a lot of support for producing highly maintainable 
code, partly by being designed so as to discourage "hacks". So for larger 
projects (those that might take several months or years), those would 
probably be better written in Java than in VB.

    - Oliver 


0
owong (6177)
7/19/2005 1:05:58 PM
On 18-Jul-2005, Arnold Trembley <arnold.trembley@worldnet.att.net> wrote:

> Way back in 1996, my employer paid for a class in Visual Basic
> (version 4, I believe).  While I have never had occasion to use it at
> work, it was very interesting since I had no previous (or subsequent)
> exposure to GUI programming.  I wasn't all that keen on writing a
> program by "dragging and dropping", but I was very impressed when
> every student in the class built a working inquiry application against
> an Access table in mere minutes, without writing a line of code.
>
> It would have taken me several days to do the same task in traditional
>   COBOL.

At least if you were a beginner like they were.

> A few months ago I had an introductory class on Java programming.  I
> managed to get through all the in-class assignments, but they never
> got as far as reading or writing a file, or even accepting user
> entered data from the screen.  I guess that must be covered in the
> intermediate or advanced class.

That's my experience as well.   Reading and writing files is not a basic Java
function.
0
howard (6283)
7/19/2005 1:32:05 PM
On 18-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:

> > I've worked in environments where duplicated code was desired.
>
> I'd be interested to know what those environments were, Howard. Never
> encountered it myself, and am having great difficulty imagining it.

An easy example of such an environment is with interpreted Basic.    Sometimes
it is more efficient to duplicate the code than to gosub it.

My specific case though was with the way we handled on-line CoBOL with a Univac
9030.
0
howard (6283)
7/19/2005 1:36:22 PM
On 18-Jul-2005, Arnold Trembley <arnold.trembley@worldnet.att.net> wrote:

> If you only have one program to restructure, then it might be cheaper
> for a human programmer to do it, but that's only because the tools can
> be pretty expensive.
>
> I would much rather use a refactoring tool because I can absolutely
> guarantee the computer can do it much faster than me and with much
> more accurate results.  Without the tool, I would think long and hard
> about doing it myself.  Who has the time?

The big variable here is how you are going to use the refactored code.    If it
is going to be maintained by humans, then some guidance to the refactoring might
be very useful.    Ideally we would have some type of interactive environment
where the refactoring program accepted input from the person and accepted
intelligent decision making.

The end goals should always be part of the overall design in such a project.
0
howard (6283)
7/19/2005 1:41:52 PM
On 18-Jul-2005, Arnold Trembley <arnold.trembley@worldnet.att.net> wrote:

> One problem I had learning to use the tool was that it ran as a TSO
> application.  At that time only the sysprogs normally had access to
> TSO/ISPF.  The application programmers used an ISPF lookalike called
> SYSD which was a CICS application that used considerably fewer
> resources.

Our shop dropped SYSD because CICS was a pretty expensive resource which we
didn't use elsewhere.

I've used a tool that converted COBOL II to COBOL 360.   It was easier to do
these minor changes by hand and get them right the first time.
0
howard (6283)
7/19/2005 1:44:49 PM
On 19-Jul-2005, "jce" <defaultuser@hotmail.com> wrote:

> > 1. Isolation and reuse of functionality. Refactoring code into component
> > functions means that the code can be shared and reused. Even if you are
> > not maintaining it, you might want to export the functionality to other
> > systems or include it in new features of existing systems. (This is the
> > main advantage of component based systems.)
>
> Imagine the following situation.  I create a component that is well defined
> to do some data access functions.  You like it but want to add some
> functionality.  I ask primarily because you have experience in the area, but
> I've recently (for whatever reason) been thinking about .NET
> versioning...which seems a form of code duplication - the only difference
> being that you can drop support for versions (in essence).
>
> Which is the most common scenario in your experience:
>
> Do you "extend" my component and create a new one based solely on mine ?  If
> you want to do something close but not quite, you may have to duplicate
> the function in order to make your slight change.

In lots of OO environments, the goal of reusability is overshadowed by the goal
of "not breaking anything that already exists", and the difficulty of
coordinating changes.    I keep seeing duplicated code because it is easier than
modifying existing objects to be more re-used.    What I have seen, the practice
falls well short of the promise.    But my experience is limited here.
0
howard (6283)
7/19/2005 1:50:07 PM
Thanks very much for an interesting and insightful response, Arnold.

I'm keeping a copy of this for personal reference.

Appreciate your time.

Pete.

Top Post no more below.
(Not snipped because it is well worth reading :-))

"Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message 
news:7w0De.11719$5N3.376@bgtnsc05-news.ops.worldnet.att.net...
>
>
> Pete Dashwood wrote:
>
>> "Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message 
>> news:pHZCe.11247$5N3.8632@bgtnsc05-news.ops.worldnet.att.net...
>>
>>>
>>>Pete Dashwood wrote:
>>>
>>>>(snip) It is no longer accurate to believe that only a human programmer, 
>>>>who has full understanding of a process, can be capable of refactoring 
>>>>that process. Indications are that smart software can do it faster, 
>>>>cheaper, and with fewer errors.
>
> And I basically agree with this.  If you have several programs to 
> restructure, it will definitely be cheaper to use a good tool.  The tool 
> already wins on faster and with fewer errors.
>
>>>>
>>>>Pete.
>>>
>>>If you only have one program to restructure, then it might be cheaper for 
>>>a human programmer to do it, but that's only because the tools can be 
>>>pretty expensive.
>>
>>
>> That's a fair point. My experience has been that when refactoring is 
>> required, it is on a scale that affects more than one program. But I 
>> agree that wouldn't always be the case, and it could make sense to do it 
>> manually, rather than commit to expensive tools which have a limited life 
>> (once you've done it, it is done...).
>
> Yes, if you've done a thorough restructuring of a program, it will take 
> years of accumulated maintenance before you need to rewrite again just to 
> clean up the code.  And if it's more than ten years, you may end up 
> rewriting from a clean slate just because the business rules or 
> requirements have changed.
>
>
>> I think this is why some companies outsource the refactoring process to 
>> service providers who have experience in this area, and have the tools on 
>> hand. Of course, sensitivity surrounding source code being what it is, 
>> there will definitely be cases where this is not an option.
>>
>> Management need to consider much more than just the technical aspects 
>> when it comes to refactoring code.
>>
>>>I would much rather use a refactoring tool because I can absolutely 
>>>guarantee the computer can do it much faster than me and with much more 
>>>accurate results.  Without the tool, I would think long and hard about 
>>>doing it myself.  Who has the time?
>>
>>
>> I'm not sure, Arnold, but I believe you are working in a mainframe 
>> environment?
>
> That is correct.  Three z/990's spread across two sites, CICS, DB2, and 
> Enterprise COBOL for z/OS.
>
>>
>> If it wouldn't be spilling trade secrets, could you give some insight 
>> into the kind of experience you have had with this? I'd be interested to 
>> hear about refactoring tools that have been used successfully in 
>> Mainframe environments, and the circumstances surrounding it.
>>
>> Pete.
>
> As long as I don't talk about what the programs do in specific business 
> terms, there's no risk of spilling trade secrets.
>
> I think my company acquired IBM's COBOL Restructuring Facility by 
> accident.  They probably got it at a reduced price because they were at 
> the cusp of converting from OS/VS COBOL to VS COBOL II, and that was 
> another feature available with IBM/CSF.  There were some huge changes in 
> CICS between OS/VS COBOL and VS COBOL II in the handling of Linkage 
> section items, and I believe CSF could fix those too, although all the 
> stinkers I actually put into production were plain old batch COBOL.
>
> Anyway, I may have been the only one who actually used the product. The 
> sysprogs announced they had it, if anyone wanted to use it, and I had some 
> spare time.  This was around 1994, I think.
>
> One problem I had learning to use the tool was that it ran as a TSO 
> application.  At that time only the sysprogs normally had access to 
> TSO/ISPF.  The application programmers used an ISPF lookalike called SYSD 
> which was a CICS application that used considerably fewer resources.  That 
> was a legacy of the shop having been a VSE shop in the early 1980's, 
> before I got there.  I imagined that CSF should be just a batch job, with 
> an input COBOL source file and an output restructured COBOL source file. 
> But CSF was actually written to use TSO panels to specify a rather large 
> number of runtime or conversion options, and the tool itself required the 
> PL/I runtime library, which tells me that COBOL/SF was written in PL/I.
>
> The tool also produced voluminous reports with metrics (McCabe complexity, 
> etc.) in addition to a COBOL source code file that was ready to compile. 
> I forget how it handled copybooks, because I think it would need to see 
> them in order to do a thorough job of refactoring.
>
> There were so many options you could specify that I doubt that I can 
> remember them all.  Some of them seemed a bit trivial to me, like whether 
> you wanted IF statements indented three columns or four columns.  Whatever 
> you chose, the output would be "beautified" in a consistent way that 
> helped readability.  You could have it generate sections only or 
> paragraphs only.  It could generate all PERFORM's with THRU if you wanted 
> that.  My choice was paragraphs only, and PERFORM with no THRU.  You could 
> have it convert nested IF's to EVALUATE's where it deemed them 
> appropriate.  You could have it generate comments at the head of each 
> paragraph/section that reported the names of all calling procedures. 
> Since the restructured program normally had fewer paragraph names than the 
> input source code, you could choose to have it propagate the deleted 
> paragraph names as comments in the output program.  It would list all the 
> dead code that it could not generate to the output program.  Sometimes 
> that was a real revelation to learn that some critical piece of code could 
> never even be reached.  That could be an indication of a bug in the input 
> source code, so you would want to know what the "dead code" was.
>
> You also had the option of having the tool split up the input program into 
> multiple output programs, if the analysis reports indicated that there 
> might be some benefit in doing so.  But I never chose to do that.
>
> Most of my learning curve time was spent in learning enough TSO so I could 
> work through the panels (screens) to set my options and start a conversion 
> run.  I don't recall a way to set it up to convert multiple programs in a 
> single pass, but that may have been an option.  Once I chose the options 
> that were most compatible with local shop standards (or my personal 
> preferences), I didn't really change them again.  In any event, I only 
> restructured about three or four programs that I actually put into 
> production.  There was another one I wish I had converted but that I just 
> never got around to it.  I think every one of them was also a conversion 
> from OS/VS COBOL to VS COBOL II.
>
> My company was pretty conservative in its approach.  Since we had no prior 
> experience with the tool, I was pretty much required to test each 
> converted program as if it were a new program I had just written.  IBM 
> also would not guarantee perfectly identical execution, and recommended 
> testing every converted program.
>
> In the end, my company eventually dropped the license, because we didn't 
> use the product enough.  Part of that was that I did not have a project to 
> do this, I was free-lancing.  And part of that was one department that was 
> convinced that COBOL II produced much less efficient code than OS/VS 
> COBOL.  They didn't convert until Y2K, and then they had to do that 
> manually.
>
> In my opinion, the good points about IBM COBOL/SF were:
>
> 1.  Very fast.  Once you had your options selected, you could convert a 
> program in a couple of minutes.  You could spend an hour reading the 
> reports to see just how bad the original program was.
> 2.  Very accurate.  I never found any bugs in the converted programs. As 
> far as I could tell, the generated programs provided exactly the same 
> results as the original, unstructured programs.  Obviously, I never had a 
> generated program that would not compile cleanly.
> 3.  The tool did an excellent job of getting rid of GO TO, ALTER, and 
> PERFORM THRU.  If you had particularly ugly PERFORM THRU logic, it would 
> generate working-storage switches with procedure division logic to set 
> them and then choose what to execute.  This handled cases where you might 
> perform A thru E-exit or C thru E-Exit.
> 4.  The output program was "beautified" so it was easier to read, and it 
> would fix alignment errors.  All the output conditional logic of IF and 
> EVALUATE would be consistently indented.  That alone would prevent some 
> future maintenance problems.
> 5.  The analysis reports produced by the restructuring tool could be very 
> helpful as well.
>
> There was one thing I found a bit disappointing about the results.  If the 
> input program had poor datanames and procedure names, the output program 
> still had poor names, plus there would be fewer procedure names.  That's 
> just a limitation of using an automated tool.  I don't see any way that 
> the tool could rename the data and procedures to make the program easier 
> for a person to understand in business terms.  But the control flow would 
> definitely be much easier to understand and manipulate.
>
> From my point of view, if the restructured program was going to be 
> maintained by human programmers, then you could get even better results by 
> having a human programmer look over the generated program and rename the 
> data items and procedures to improve the "self-documenting" nature of the 
> program.  That said, if the original program was a real dog, anything 
> would be better.  Certainly the generated program would be no worse than 
> the original in terms of readability, and fixing the control flow made 
> other kinds of maintenance much less risky.
>
> I was very favorably impressed that it could do as much as it did. 
> If the decision had been up to me, I would have recommended keeping IBM 
> COBOL/SF, and using it as much as possible.  But as you point out, once 
> you've refactored a program, it's done.  Eventually the tool would have 
> less and less work to do, as you ran out of programs to fix.  Although 
> even then, it might be worthwhile to restructure a frequently changed 
> program every year, just to keep bad practices from creeping back in.
>
> I don't know if that fully answers your question.  If you think of 
> something I haven't mentioned, just let me know.
>
> With kindest regards,
>
> -- 
> http://arnold.trembley.home.att.net/
> 
> 



0
dashwood1 (2140)
7/19/2005 2:25:36 PM
"jce" <defaultuser@hotmail.com> wrote in message 
news:I_0De.10777$iG6.9945@tornado.tampabay.rr.com...
> Unfortunately, I choose the interspersed method of response.  It's more 
> questions than comments I fear.
>
Yes, it is a bit of a ramble... :-)

Nevertheless, I've tried to address at least some of the points you 
raised...see below...and done some heavy snippage to try and make it 
manageable...



<snip>>
> I thought of a couple of examples that I've come across and after writing 
> them down it becomes clear that there are other solutions and it wasn't a 
> "desire" but just the way it ended up.
>
> The most reasonable example was a program I wrote to generate report 
> programs from the DB2 catalogue (well actually the DDL as we didn't have 
> access to the catalogue).  As each program was generated it was simply the 
> same code spat out each time. Each one was modified and tailored for 
> specific enhancements in "user areas".   At the time, it made much more 
> sense to have each report be a simple report (300 lines or so of code of 
> which about 280 was generated and common)....and self documenting and 
> encapsulating.  This made it easier to get it into our audit documents and 
> into our product asset plan in short term.  It wholly simplified the build 
> process.
>
I have done something very similar to provide 'universal VSAM access' for a 
large VSAM based system. It was in 1979.

However, there is nothing wrong with the 'template' based approach. I still 
use this for certain web based functions. But, at the back of my head, I 
KNOW I could make the templates 'smarter' and combine many of them... I 
think the point is that duplicated code is never a REQUIREMENT; certainly 
not in modern systems where we can utilise reusable functions.

> I told you it wasn't even a good example.  I have a _really_ hard time 
> seeing this need too outside of places using the LOC metric.
>
:-)  Yep, me too...


<snip>>>
>> 1. Isolation and reuse of functionality. Refactoring code into component
>> functions means that the code can be shared and reused. Even if you are
>> not maintaining it, you might want to export the functionality to other
>> systems or include it in new features of existing systems. (This is the
>> main advantage of component based systems.)
>
> Imagine the following situation.  I create a component that is well 
> defined to do some data access functions.  You like it but want to add 
> some functionality.  I ask primarily because you have experience in the 
> area, but I've recently (for whatever reason) been thinking about .NET 
> versioning...which seems a form of code duplication - the only difference 
> being that you can drop support for versions (in essence).
>
> Which is the most common scenario in your experience:
>
> Do you "extend" my component and create a new one based solely on mine ?

Yes. See below.

>If you want to do something close but not quite, you may have to 
>duplicate the function in order to make your slight change.

No, I would incorporate your functionality into a completely new function by 
adding methods and properties that do not HAVE to be used. The new function 
then includes the old one as a subset. I would NOT mess with the 
functionality that already existed in your component. (That's why source 
code is not such a big deal any more...)
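
As a rough illustration of that approach (a sketch only: the class and 
method names are invented for the example, and Python is used just because 
it is compact), the new component keeps every existing member untouched and 
only adds optional ones, so the old interface remains a subset of the new:

```python
class DataAccess:
    """Hypothetical original component with a documented interface."""

    def __init__(self):
        self._rows = {}

    def add(self, key, value):
        self._rows[key] = value

    def read(self, key):
        return self._rows.get(key)


class AuditedDataAccess(DataAccess):
    """Hypothetical new component built on the old one.

    Nothing inherited is overridden; callers written against DataAccess
    keep working. The audit members are additions nobody HAS to use.
    """

    def __init__(self):
        super().__init__()
        self.audit_log = []  # new property, ignored by old callers

    def add_audited(self, key, value, user):
        # New method: reuses the existing add() rather than changing it.
        self.add(key, value)
        self.audit_log.append((user, key))


if __name__ == "__main__":
    component = AuditedDataAccess()
    component.add("a", 1)                    # old usage, unchanged
    component.add_audited("b", 2, "pete")    # new, optional usage
    print(component.read("a"), component.read("b"), component.audit_log)
```

Whether inheritance or wrapping is the better vehicle depends on the 
environment; the only point here is that the original behaviour is left 
alone.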
>
> Do you update my source code (assuming it's available) ?  Perhaps 
> duplicating components code if I create a new version of my component that 
> is slightly incompatible with your enhancements (even Java gave up on 
> backward compatibility).

No. The whole point about components is that they have known, documented 
methods and properties. You should not mess with them, even if you can.

>
> I just see an explosion of components and code that become very 
> unmanageable unless the code is very clearly owned and that ownership just 
> implies more components of similar thing to gain some ownership over the 
> code use.

That could certainly happen if you don't employ the constraints I have 
outlined above. The components are the property of the company and owned by 
everybody. You can see immediately, you don't want people amending the 
functionality. This fact will decide the 'granularity' of the components. 
Personally, I like fairly small building blocks and don't build massive 
functionality as a single component, unless I am really sure that it is a 
closed system which meets every possible requirement of that particular 
arena. There will be immediate cries: "But you can't know every possible 
requirement!!"  The fact is it depends on whether you are dealing with a 
closed universe or not. For example, consider file access. The functions are 
Add, Change, Delete,  and Read. (You can have subfunctions that decide 
whether you read backwards or forwards, but the primary functions depend on 
what you can do with data... it is closed.)
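
To make the "closed universe" idea concrete, here is a minimal sketch, 
assuming a simple keyed file. The interface name, the method names and the 
in-memory stand-in are all hypothetical; only the four primary functions 
and the forward/backward subfunction come from the paragraph above.

```python
from abc import ABC, abstractmethod
from bisect import bisect_left, bisect_right


class KeyedFile(ABC):
    """The four primary functions; the set is closed."""

    @abstractmethod
    def add(self, key, record): ...

    @abstractmethod
    def change(self, key, record): ...

    @abstractmethod
    def delete(self, key): ...

    @abstractmethod
    def read(self, key, direction=None): ...


class InMemoryKeyedFile(KeyedFile):
    """Toy implementation used only to exercise the interface."""

    def __init__(self):
        self._records = {}

    def add(self, key, record):
        if key in self._records:
            raise KeyError(f"duplicate key: {key!r}")
        self._records[key] = record

    def change(self, key, record):
        if key not in self._records:
            raise KeyError(f"no such key: {key!r}")
        self._records[key] = record

    def delete(self, key):
        del self._records[key]

    def read(self, key, direction=None):
        # direction is a subfunction of Read, not a fifth primary function.
        if direction is None:
            return self._records[key]
        keys = sorted(self._records)
        if direction == "forward":
            i = bisect_right(keys, key)
            found = keys[i] if i < len(keys) else None
        elif direction == "backward":
            i = bisect_left(keys, key) - 1
            found = keys[i] if i >= 0 else None
        else:
            raise ValueError(f"unknown direction: {direction!r}")
        return None if found is None else self._records[found]
```

New requirements then show up as subfunctions or options on one of the 
four, rather than as a new primary function.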

It is also important to recognise elemental components and 'assemblies' or 
'sub-assemblies' that incorporate them. Sometimes what are really assemblies 
are registered as components, and then confusion arises. The test is in the 
nature of the interface, but it's too much to go into here.

>
> If I take the concept down to these levels there seems to be no worry 
> about the concept of code duplication across development boundaries.  It 
> fundamentally appears to be at its root a "desk policy".  Some might find 
> mine messy or otherwise but it doesn't matter because you're not looking 
> at my stuff anyway (I actually have an empty desk policy - I don't keep 
> anything paper which makes moving simple).  The component hides it from 
> you making code duplication a non issue.  Even duplicated function isn't a 
> problem if it exists multiple times in many flavours.  It's a failure to 
> profit from sound repeatable function that's the problem.
>
No comment.

>> 2. The less code that is generated, the faster it will run.
>> 3. It is easier to maintain an inventory of discrete functions if their
>> functionality is not duplicated all over the place.
> I would imagine that for a code generator this is a non issue.  I would 
> actually think duplicate code is a lot simpler for autotools to manage.
>
Yes, but you are still looking at your tool as producing code for 
maintenance. I don't do that. I covered the fact that at some point the 
generator will include duplicated code; the difference is that it can be 
further processed to remove the duplication, if desired.

>> At a line-by-line level a code generator may well duplicate code because
>> it doesn't 'understand' the overall picture. However, subsequent passes 
>> of
>> the conversion process can analyse this generated code (because it is now
>> in a form where it is suitable for automated analysis), and recognise
>> repeated code. Optimization can create functions and replace the
>> duplication with function (or component) activation. (I didn't say CALLs,
>> because it depends on the environment...).
> But now who would maintain that code? The generator, the end user, or the 
> refactoring tool?

Nobody. That is my whole point.

> If the process is always A to B to C...who's responsible for the build and 
> QA test ?

The person building the system. Traditionally, a programmer, but it could be 
(and this is happening more and more) a knowledgeable business person, 
interacting with smart software. End users are becoming the architects and 
the software interfaces are being simplified to enable them to become so. 
Besides, there is no need for the builders to understand technicalities; 
computers can do that much more effectively.

> Three tools touching code is almost as dangerous as one person
> touching it :-)  To date the best advice I"ve received is, "if you cannot 
> understand the generated code, then you shouldn't use the generator". 
> Good code is readable regardless of who wrote it.

At this point we diverge strongly. 'Good code' is simply a series of 
iterations, sequences, and selections, produced by standard algorithms, from 
something written by a programmer. The 'Good Code' can then be processed 
further. (Or not)

In my scenario the whole purpose of refactoring is to acquire components for 
re-use. Source code maintenance is neither necessary nor desirable. In your 
scenario the refactoring is to 'straighten out' spaghetti and structure it, 
so it can be maintained. Source code is everything.

Both scenarios have their place. It's just that I, personally, have no 
interest in yours... :-).  (I don't think it is 'wrong'; I just don't think 
it is a good reason to refactor code. To me, if you are hell bent on 
maintaining source code, rewrite the stuff from scratch....  It is certainly 
arguable... :-)).
>
>> The bottom line is this:
>> Computer programming is well enough understood now that existing code can
>> be 'rationalised' and refactored. (We haven't yet got AI generators that
>> are capable of creating code, but there are techniques that support this
>> and it is a 'next step' in system evolution. (Iterative interaction with 
>> a
>> user is one possible process whereby a computer could 'create' new
>> functionality. - I mentioned this years ago, right here in this forum. I
>> even considered actually doing it, but got sidetracked by the need to 
>> make
>> a living... :-))
>
> I recently read an article somewhere that was in Software Developer I 
> think...I thought of you immediately as they have created a new IDE that 
> allows generic modelling of "objects" without dropping you straight into 
> code.  They took the reference architecture for J2EE and .NET (the Java 
> Petstore I believe) and showed that it took significantly less time and 
> less lines of code to produce (I believe underlying it used J2EE) than 
> even .NET which took less than J2EE.  It made me laugh though because 
> immediately after the whole "code generating smart ide" there was a great 
> article on the power and need for Subversion - and source code management 
> in general.  When popular press is caught between models you know the 
> future is bright!
>
LOL! I'd like to have seen that :-)

>> There are already tools that can do this in some limited
>> spheres. As processor and storage resources increase and become
>> cheaper/faster, techniques that were previously non-viable become
>> feasible. As these techniques are investigated, they get more refined and
>> better.) Even without going to Science Fiction, Computer Science has
>> enabled refactoring tools, such as the one Oliver is building, to become 
>> a
>> reality.
> Given the need that "reusable" componentry be bullet proof, it is typical 
> to see test frameworks built into the component build.

Hmmmm..... not sure about that. I'd need persuading and probably a demo.

>This requires tight communication and understanding of the component or you 
>will limit only to variable and data type checking of limits and so forth). 
>Unless a refactoring tool can extract the dead code, clean up the mess and 
>provide reasoned documentation of the program it seems somewhat fruitless.

That's like saying that supermarkets are useless because they don't actually 
put the stuff in your larder for you. :-)

> Spaghetti code represents spaghetti logic.  You can straighten out the 
> code, but it's still messed up logic.

If that particular code works, exactly how is its logic messed up? You are 
seeing only source code maintenance again. Assume for a minute it works, and 
you are never going to maintain it. Does it still matter if it is 'messed 
up'...?

Procedural programming is obsessed with code maintenance, but it is actually 
the functionality that is important. The prime evaluation criterion is that 
the program should work. Elegance of code is necessarily a lower priority 
than achieving correct functionality.

>
> My immediate guess is that COBOL code in general are based on a large 
> program methodology where functions are not particularly useful - the 
> common code will be getting dates, writing headers, etc etc The real 
> business logic is all wrapped up and embedded in there...not a lot of 
> value in extracting. I think refactoring in a more pattern based 
> technology (OO) will be more useful.  If they got "real" smart, the system 
> should isolate function and replace it with standard solutions....how many 
> people wrote their own "logging" routines, or "web frameworks" or servlets 
> and applets before log4j and struts and jsp custom tags ?  I remember once 
> I wrote something that was very similar to a point and shoot JMS 
> solution.....and then came JMS.
>
Exactly. The need for such functionality was recognised. But you didn't 
decline to use it just because you didn't have the source code, did you :-)?
>
>> It is no longer accurate to believe that only a human programmer, who has
>> full understanding of a process, can be capable of refactoring that
>> process. Indications are that smart software can do it faster, cheaper,
>> and with fewer errors.
>>
>> Pete.
> But when will smart software do it faster, cheaper, and with autocorrect ?

Correction is required if something is wrong. As the standard algorithms 
ensure that the generated code ISN'T wrong, autocorrect is a redundant 
feature. :-)

Prediction:

As existing systems (particularly procedural coded ones, where source 
maintenance is a heavy requirement) are phased out, we are likely to see 
more code refactoring being employed. I don't believe it will be done with 
the object of straightening out the existing code, so it can be more easily 
maintained; rather, it will be done so that functionality can be 
encapsulated and embedded into new 'quick build' systems, and corporate 
packages like SAP and Siebel.

The Age of Source Code is almost over; the Age of Functionality is dawning.

Pete.



dashwood1 (2140)
7/19/2005 3:34:18 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dbj0f6$3i0$1@peabody.colorado.edu...
>
> On 19-Jul-2005, "jce" <defaultuser@hotmail.com> wrote:
>
>> > 1. Isolation and reuse of functionality. Refactoring code into 
>> > component
>> > functions means that the code can be shared and reused. Even if you are
>> > not maintaining it, you might want to export the functionality to other
>> > systems or include it in new features of existing systems. (This is the
>> > main advantage of component based systems.)
>>
>> Imagine the following situation.  I create a component that is well 
>> defined
>> to do some data access functions.  You like it but want to add some
>> functionality.  I ask primarily because you have experience in the area, 
>> but
>> I've recently (for whatever reason) been thinking about .NET
>> versioning...which seems a form of code duplication - the only difference
>> being that you can drop support for versions (in essence).
>>
>> Which is the most common scenario in your experience:
>>
>> Do you "extend" my component and create a new one based solely on mine ? 
>> If
>> you want to do something close but not quite, you may have to duplicating
>> the function in order to make your slight change.
>
> In lots of OO environments, the goal of reusability is overshadowed by the 
> goal
> of "not breaking anything that already exists", and the difficulty of
> coordinating changes.    I keep seeing duplicated code because it is 
> easier than
> modifying existing objects to be more re-used.    What I have seen, the 
> practice
> falls well short of the promise.    But my experience is limited here.
>
I believe you. Things are not always what they should be and, under stress 
in the workplace, people have been known to take the line of least 
resistance... :-)

What I will say is this: If you extend an existing component into a 
'superset', there is absolutely no need to duplicate code. The new component 
simply uses the old one just like any application would.

The key to it is the interface. This is one reason why I, personally, do not 
pass parameters in component invocations, as a general rule (I have been 
known to break it under extreme circumstances...). I set the properties, 
invoke the method, then get the properties I want. Doing this, the 
reusability is never a problem and there is no need for duplicated code when 
a component is extended.
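
A minimal sketch of the "set the properties, invoke the method, get the properties" style, written in Java purely for illustration. The RecordFinder class, its properties, and its data are hypothetical and are not taken from any real component library. The point it tries to show is that a property added later (fromEnd here) does not change the method's interface, so existing callers keep working.

```java
class PropertyStyleDemo {
    public static void main(String[] args) {
        RecordFinder finder = new RecordFinder();
        finder.setKey("B");      // 1. set the properties
        finder.find();           // 2. invoke the method (no parameters)
        // 3. get the properties you want
        System.out.println(finder.isFound() + " " + finder.getPosition());

        finder.setFromEnd(true); // opt in to the property that was added later
        finder.find();
        System.out.println(finder.isFound() + " " + finder.getPosition());
    }
}

// Hypothetical component: all inputs and outputs travel as properties.
class RecordFinder {
    private final String[] records = {"A", "B", "C", "B"};

    private String key;              // input property
    private boolean fromEnd = false; // input property added in a later version;
                                     // its default preserves the old behaviour
    private boolean found;           // output property
    private int position = -1;       // output property (index, or -1 if absent)

    void setKey(String key) { this.key = key; }
    void setFromEnd(boolean fromEnd) { this.fromEnd = fromEnd; }
    boolean isFound() { return found; }
    int getPosition() { return position; }

    // Works only from the properties that were set beforehand.
    void find() {
        found = false;
        position = -1;
        int start = fromEnd ? records.length - 1 : 0;
        int step = fromEnd ? -1 : 1;
        for (int i = start; i >= 0 && i < records.length; i += step) {
            if (records[i].equals(key)) {
                found = true;
                position = i;
                return;
            }
        }
    }
}
```

A caller written before fromEnd existed never sets it and still gets the forward search it always got.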

Pete.




dashwood1 (2140)
7/19/2005 3:42:18 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dbivle$31p$1@peabody.colorado.edu...
>
> On 18-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>
>> > I've worked in environments where duplicated code was desired.
>>
>> I'd be interested to know what those environments were, Howard. Never
>> encountered it myself, and am having great difficulty imagining it.
>
> An easy example of such an environment is with interpreted Basic. 
> Sometimes
> it is more efficient to duplicate the code than to gosub it.

Only where the branch and return overhead is more than the actual code. How 
often does that happen? I think this one is flimsy at best.

>
> My specific case though was with the way we handled on-line CoBOL with a 
> Univac
> 9030.
>
Not my experience base, so I cannot comment. I'd still like to see it...:-)

Pete. 



dashwood1 (2140)
7/19/2005 3:45:57 PM
"jce" <defaultuser@hotmail.com> wrote in message 
news:I_0De.10777$iG6.9945@tornado.tampabay.rr.com...
> Do you "extend" my component and create a new one based solely on mine ? 
> If you want to do something close but not quite, you may have to 
> duplicating the function in order to make your slight change.
>
> Do you update my source code (assuming it's available) ?  Perhaps 
> duplicating components code if I create a new version of my component that 
> is slightly incompatible with your enhancements (even Java gave up on 
> backward compatible).

    The code that results from these two actions has two different 
semantics (in Java at least). In fact, I'd say there are 3 ways you can 
"adapt existing code".

The first is, as you say, inheritance - in Java, you'd use the keyword 
"extends" to mean that you're essentially importing all the code from your 
old component into the new one, but you might add new features, or override 
existing features. In this case, while there is "duplicated functionality", 
there is no duplicated code. The code resides in one place, and the compiler 
takes care of routing the calls to the correct methods.

The second is composition, in which your new component "secretly" has an 
instance of the old component, which it uses for its computation. On the 
outside, people will invoke features on your new component, and internally, 
those method calls get forwarded to your old component, and then the results 
returned back to the user, perhaps with some filtering or translation along 
the way.  In this case as well, there is no duplicated code. The code 
resides only in the old component, and the new code you wrote handles 
forwarding the calls to the appropriate methods.

The third is starting from scratch, or modifying your original source code; 
in both cases, there is no relationship, from the compiler's point of view, 
between the new code and the old code. This is where duplicate code can come 
up, especially when I want to use the old version and the new version of the 
code in different locations. This case does occasionally happen, but I find 
it's pretty rare.

The issue is that the object oriented languages give inheritance some very 
special semantics, and if you want these semantics, composition is pretty 
much out of the question. However, if you specifically DON'T want these 
semantics, then inheritance is out of the question.
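
To make the first two options concrete, here is a small sketch under made-up names (BasicCounter, DoubleCounter, and StepCounter are hypothetical, not from any library). DoubleCounter adapts the old class by inheritance, so it can be passed anywhere a BasicCounter is expected. StepCounter adapts it by composition, so it is an unrelated type that merely forwards calls. In neither case is the counting code copied.

```java
class AdaptDemo {
    public static void main(String[] args) {
        // Inheritance: a DoubleCounter is usable wherever a BasicCounter is expected.
        BasicCounter byInheritance = new DoubleCounter();
        byInheritance.increment();

        // Composition: StepCounter is a separate type that forwards to a BasicCounter.
        StepCounter byComposition = new StepCounter(5);
        byComposition.advance();

        System.out.println(byInheritance.value() + " " + byComposition.value());
    }
}

// The "old component".
class BasicCounter {
    private int count;
    void increment() { count++; }
    int value() { return count; }
}

// Inheritance: reuses the old code and overrides one feature.
class DoubleCounter extends BasicCounter {
    @Override
    void increment() {
        super.increment();
        super.increment();
    }
}

// Composition: privately holds an instance of the old component and forwards to it.
class StepCounter {
    private final BasicCounter inner = new BasicCounter();
    private final int step;

    StepCounter(int step) { this.step = step; }

    void advance() {
        for (int i = 0; i < step; i++) {
            inner.increment();
        }
    }

    int value() { return inner.value(); }
}
```

The "special semantics" show up in the first line of main: the compiler accepts a DoubleCounter as a BasicCounter, but it would reject a StepCounter in the same position.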

>> 2. The less code that is generated, the faster it will run.
>> 3. It is easier to maintain an inventory of discrete functions if their
>> functionality is not duplicated all over the place.
>
> I would imagine that for a code generator this is a non issue.  I would 
> actually think duplicate code is a lot simpler for autotools to manage.

Well, this depends on what you're trying to accomplish. If the generator 
"knows" ahead of time that two of the things it's going to generate will 
contain identical code, then it probably has some sort of "planning" phase. 
During this phase, it can make a list of all the functionality it wants to 
generate, checking every time it tries to add a feature that it isn't 
already there, to ensure that no duplicate code gets generated. Even without 
an explicit planning phase, typically when generating code, you'll want to 
refer to it (via method calls or even GOTOs) later on, so you'll need to 
keep some sort of list of what you've generated so far anyway.
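
As a rough sketch of that "list of what you've generated so far", assuming a generator that emits named routines: the RoutineRegistry class and the ROUTINE-n label scheme below are invented for illustration and do not describe any particular tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GeneratorDemo {
    public static void main(String[] args) {
        RoutineRegistry registry = new RoutineRegistry();
        String first = registry.labelFor("MOVE SPACES TO WS-LINE");
        String second = registry.labelFor("ADD 1 TO WS-COUNT");
        String again = registry.labelFor("MOVE SPACES TO WS-LINE");
        System.out.println(first + " " + second + " " + again);
        System.out.println(registry.size());
    }
}

// Hypothetical bookkeeping for a code generator: each distinct routine body
// is recorded once, and later requests for the same body reuse its label.
class RoutineRegistry {
    private final Map<String, String> labelsByBody = new LinkedHashMap<>();

    String labelFor(String body) {
        String label = labelsByBody.get(body);
        if (label == null) {
            // First time this body is requested: allocate a new label for it.
            label = "ROUTINE-" + (labelsByBody.size() + 1);
            labelsByBody.put(body, label);
        }
        return label;
    }

    int size() { return labelsByBody.size(); }
}
```

This only catches bodies that are textually identical. Spotting code that is merely equivalent would need the kind of later analysis pass discussed elsewhere in this thread.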

But yes, most tools I've seen don't do an explicit "duplicate code check" 
phase, examining the code it just generated. I think most tools make a 
"decent" effort towards not generating duplicate code, but if it does, it 
doesn't worry about it too much.

> If the process is always A to B to C...who's responsible for the build and 
> QA test ? Three tools touching code is almost as dangerous as one person 
> touching it :-)

    I think if you can "prove" that a tool doesn't change the semantics of a 
given program, then it doesn't matter how many times you run that tool, each 
time the semantics won't change. Similarly, if you have 3 tools, none of 
which changes the semantics, then you can run any of them in any order as 
many times as you like, and the result at the end will be semantically 
identical to the program at the beginning.

    Of course, very rarely do I see a program accompanied by a formal proof of 
correctness.

    As for QA, if you're confident that the tool is correct, then you might 
have more important things to spend time testing (e.g. code you've manually 
written); the less confident you are about the tool, the higher the priority 
should be given to testing the generated code.

> To date the best advice I"ve received is, "if you cannot understand the 
> generated code, then you shouldn't use the generator".  Good code is 
> readable regardless of who wrote it.

I don't agree with this; when you generate a lexer (the front end of a 
compiler) using a grammar file, usually the lexer will contain completely 
unintelligible code. The lexer I'm working with, for example, has a giant 
array containing what I assume (based on the name of the variable) is the 
next state its internal state machine should transition to encoded as a 
bunch of integers. The code looks like:

jjnextStates = { 485, 486, 468, 487, 488, 497, 499, 501, 502, 490, 492, 494, 
495, 1, 2, 327, 328, 388, 492, 494, 495, 499, 501, 502, 3, 4, 17, 18, 30, 
31, 48, 49, 64, 65, 77, 78, 98, 99, 119, 120, 145, 146, 171, 172, 193, 194, 
223, 224, 253, 254, 288, 289, 323, 324, 5, 6, 7, 9, 10, 12, 13, 14, 15, 19

and so on. I don't understand the code at all. However, I have a basic 
understanding of compiler technology (I know, for example, that lexing can 
be done via regular expressions, and that regular expressions can be 
emulated by DFAs (Deterministic Finite Automata), and that DFAs can be 
implemented as state machines), and I "trust" that my tool works, so I don't 
bother to "test" the lexer that gets generated. So far I haven't been 
disappointed.
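
For a feel of what such a table encodes, here is a tiny hand-written DFA driven by a transition table. It is only a sketch: the table layout, the state numbering, and the pattern (digits with an optional fractional part) are my own assumptions, and a generated lexer's tables, including the jjnextStates array above, may well be organised differently.

```java
class DfaDemo {
    private static final int REJECT = -1;

    // NEXT[state][characterClass]
    // character classes: 0 = digit, 1 = '.', 2 = anything else
    private static final int[][] NEXT = {
        { 1, REJECT, REJECT },  // state 0: start
        { 1, 2,      REJECT },  // state 1: inside the integer part
        { 3, REJECT, REJECT },  // state 2: just saw the '.'
        { 3, REJECT, REJECT },  // state 3: inside the fractional part
    };

    // States in which the input read so far forms a complete number.
    private static final boolean[] ACCEPTING = { false, true, false, true };

    private static int classOf(char c) {
        if (c >= '0' && c <= '9') return 0;
        if (c == '.') return 1;
        return 2;
    }

    // Runs the state machine over the whole input.
    static boolean matches(String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            state = NEXT[state][classOf(input.charAt(i))];
            if (state == REJECT) {
                return false;
            }
        }
        return ACCEPTING[state];
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "3.14", "3.", ".5", "1a"}) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```

Even at this size the table is just numbers. The meaning lives in the grammar it was derived from, which is why reading the generated form is not very rewarding.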

    - Oliver 


owong (6177)
7/19/2005 4:04:02 PM
On 19-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:

> > An easy example of such an environment is with interpreted Basic.
> > Sometimes
> > it is more efficient to duplicate the code than to gosub it.
>
> Only where the branch and return overhead is more than the actual code. How
> often does that happen? I think this one is flimsy at best.

I remember getting a huge efficiency gain by putting a highly iterated piece of
code at the front of the program (right after a GOTO which went to the start of
logic).    That was so the label would be found quickly.     Admittedly, not
much would be saved with a routine that is called rarely enough that it could be
duplicated - but searching through the code for labels was slow.    Compilers
get rid of that problem.
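
A toy model of that effect, assuming the interpreter finds a branch target by scanning line numbers from the top of the program on every GOTO or GOSUB (the actual interpreter's internals are not known here, so treat this purely as an illustration of the cost):

```java
class LabelScanDemo {
    // Number of lines a naive top-down search inspects before it finds the
    // target line number, or -1 if the target does not exist.
    static int linesScanned(int[] lineNumbers, int target) {
        for (int i = 0; i < lineNumbers.length; i++) {
            if (lineNumbers[i] == target) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A 1000-line program numbered 10, 20, 30, ...
        int[] program = new int[1000];
        for (int i = 0; i < program.length; i++) {
            program[i] = (i + 1) * 10;
        }
        System.out.println(linesScanned(program, 20));     // routine near the top
        System.out.println(linesScanned(program, 10000));  // routine at the bottom
    }
}
```

Under that assumption, every call to a routine at the bottom of the program pays for a scan of the whole program, which is why moving a hot routine to the front helps.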
howard (6283)
7/19/2005 4:16:05 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dbivdd$2u4$1@peabody.colorado.edu...
>> A few months ago I had an introductory class on Java programming.  I
>> managed to get through all the in-class assignments, but they never
>> got as far as reading or writing a file, or even accepting user
>> entered data from the screen.  I guess that must be covered in the
>> intermediate or advanced class.

    For what it's worth, reading from console can be done via a call to 
System.in.read(), though it returns only a single byte (more on this later).

> That's my experience as well.   Reading and writing files is not a basic 
> Java
> function.

Yeah, simple reading and writing to text files is typically more annoying 
than in other languages due to the API library's design for streams.

The idea is that you'd use the FileInputStream class to actually open a file 
and read the data one byte at a time.

Then you'd put the FileInputStream into an InputStreamReader, which will 
handle decoding the bytes into characters (using ASCII, EBCDIC, Unicode 
UTF-8, or whatever encoding scheme you specify).

Then you'd put the InputStreamReader into a BufferedReader, which would 
handle buffering the data, and reading a whole line at a time (as opposed to 
just one character at a time).

I think Sun chose this system for its flexibility. You could replace the 
FileInputStream with something which reads from the console instead of a 
file (as in the "System.in" object mentioned above), or with a TCP/IP Input 
Stream reader, or an HTTP Input Stream reader, or an Audio Input Stream 
reader (for reading bytes from the microphone), and the rest of the code 
wouldn't have to change or be aware of where the data is coming from.
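
Put together, the chain looks roughly like the sketch below. The ReadLinesDemo class and the readLines helper are names made up for this example, and the encoding is fixed to UTF-8 here only as an assumption. readLines accepts any InputStream, so the same code reads from a file or from an in-memory stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {
    // bytes -> characters -> buffered lines; the source of the bytes is irrelevant.
    static List<String> readLines(InputStream source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // With a file name argument, read that file; otherwise use in-memory bytes.
        InputStream source = (args.length > 0)
                ? new FileInputStream(args[0])
                : new ByteArrayInputStream(
                        "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        for (String line : readLines(source)) {
            System.out.println(line);
        }
    }
}
```

Passing System.in instead of a FileInputStream would read lines from the console with no change to readLines.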

For the lazy, Sun also provides a "FileReader" class, which allows you to 
just provide a filename, and it'll let you read from that file one character 
at a time (you'd still need a BufferedReader to get one line at a time). 
The FileReader assumes a default character encoding depending on what 
operating system it's running on (which for all the systems I've 
encountered, is Unicode UTF-8).

With most projects I've worked with, we don't actually really read plain 
text data from files anymore. When I work with files in Java, it's almost 
always XML files, and Java has some very powerful support for reading and 
writing XML.

    - Oliver 


owong (6177)
7/19/2005 5:14:58 PM
"Howard Brazee" <howard@brazee.net> wrote in message 
news:dbj90t$7u8$1@peabody.colorado.edu...
>
> On 19-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>
>> > An easy example of such an environment is with interpreted Basic.
>> > Sometimes
>> > it is more efficient to duplicate the code than to gosub it.
>>
>> Only where the branch and return overhead is more than the actual code. 
>> How
>> often does that happen? I think this one is flimsy at best.
>
> I remember getting a huge efficiency gain by putting a highly iterated 
> piece of
> code at the front of the program (right after a GOTO which went to the 
> start of
> logic).    That was so the label would be found quickly.     Admittedly, 
> not
> much would be saved with a routine that is called rarely enough that it 
> could be
> duplicated - but searching through the code for labels was slow. 
> Compilers
> get rid of that problem.
>
OK, thanks for that.

Pete. 



dashwood1 (2140)
7/20/2005 10:53:33 AM
On Tue, 19 Jul 2005 11:02:50 +1200, "Pete Dashwood"
<dashwood@enternet.co.nz> wrote:

>
>"Howard Brazee" <howard@brazee.net> wrote in message 
>news:dbge55$gm2$1@peabody.colorado.edu...
>>
>> On 16-Jul-2005, "Pete Dashwood" <dashwood@enternet.co.nz> wrote:
>>
>>> Ansolutely :-). I wouldn't take the bet. Removal of duplicated code is a
>>> very good benefit of structuring (whether a human gets to see/maintain it 
>>> or
>>> not :-))
>>
>> I've worked in environments where duplicated code was desired.
>
>I'd be interested to know what those environments were, Howard. Never 
>encountered it myself, and am having great difficulty imagining it.
>
>
> >So I'll play
>> Devil's advocate here and ask why an automated system should significantly
>> benefit by eliminating duplicated code.
>
>A very fair question. If no human is going to maintain it, you could well 
>argue that duplication is irrelevant. However, it isn't . Here are some 
>reasons why even a  code generator might avoid duplicated code:
>
>1. Isolation and reuse of functionality. Refactoring code into component 
>functions means that the code can be shared and reused. Even if you are not 
>maintaining it, you might want to export the functionality to other systems 
>or include it in new features of existing systems. (This is the main 
>advantage of component based systems.)
>
>2. The less code that is generated, the faster it will run.

Not necessarily.  Some of the wild optimizations of IBM's mainframe
COBOL include moving the code for PERFORMed paragraphs inline
replacing the PERFORM depending on how many places the paragraph is
PERFORMED from.  I am certain that the same exists in other compilers
for various languages.
>3. It is easier to maintain an inventory of discrete functions if their 
>functionality is not duplicated all over the place.

This is true.  This gets into a whole host of questions on tradeoffs
on maintainability, control of introduction of new functions, the
efficiencies of compile time versus bind time versus run-time binding,
especially with parameter checking.  NOTE CLEARLY, this is not an OO
versus non_OO question since run-time binding is being done in a large
percentage of z/OS COBOL shops today (all CALLs use the DYNAM option).
>> rest snipped
cfmtech (125)
7/20/2005 9:15:32 PM
"Clark Morris" <cfmtech@istar.ca> wrote in message 
news:e1atd11fi2cshuqsgg11l78qp1n840102p@4ax.com...
> On Tue, 19 Jul 2005 11:02:50 +1200, "Pete Dashwood"
> <dashwood@enternet.co.nz> wrote:
>
>>
<snip>>>
>>2. The less code that is generated, the faster it will run.
>
> Not necessarily.

Yes, necessarily. :-)   Three factors affect online performance; two of them 
are load time and capture time. The smaller  and tighter a module is, the 
quicker it loads and it will be likely to require less capture time.

Capture time for a process will be improved if there is less code for the 
process to execute (as long as it provides the same functionality, of 
course.)

Adding code cannot possibly make it execute any faster than it is required 
to. And if everything else is equal, the faster load time of a smaller 
module makes overall execution quicker.  If a small module is being paged in 
and out continually, as opposed to a large resident piece of code, then the 
overall execution of the small module functionality may be longer, but that 
implies that the large piece of code is doing more (otherwise, why is it so 
large?), so the comparison is between apples and oranges.

Given identical functionality and identical residence, the smaller the code 
is, the quicker it will execute. Invariably.

> Some of the wild optimizations of IBM's mainframe
> COBOL include moving the code for PERFORMed paragraphs inline
> replacing the PERFORM depending on how many places the paragraph is
> PERFORMED from.  I am certain that the same exists in other compilers
> for various languages.

And how exactly does that invalidate my statement? :-) If the optimizer 
expands code in line then 'less code' is not generated.

If you are thinking that page boundaries could make your statement true, 
then that fails as well; shorter code is paged quicker because it occupies 
fewer pages...

In modern systems 'locality of reference' is almost irrelevant, especially 
when multitasking is proceeding anyway.

Think about it.

<snipped what we agree on... :-)>

Pete 



dashwood1 (2140)
7/21/2005 12:01:29 AM
In article <3k86utFl4gssU1@individual.net>,
Pete Dashwood <dashwood@enternet.co.nz> wrote:

[snip]

>Given identical functionality and identical residence, the smaller the code 
>is, the quicker it will execute. Invariably.

'Identical residence'?  I'm not sure what this represents, Mr Dashwood, 
but I'd take exception with your assertion that 'less code = faster 
execution' because it appears to ignore differences in the amount of time 
it takes to execute different instructions.

I recall, years ago, talking with a Geezer who said that the first 
airline reservations system he worked on used only register-to-register 
(RR) instructions because they were so fast... in general they found that 
rewriting other single instructions as multiple RR instructions still gave 
them an overall drop in execution time (commonly known as 'faster code').  
Consider the following example.

Using the IBM mainframe platform and every compiler I'm familiar with 
since IKFCBL00: given:

    05  FLDA PIC X.
    05  FLDB PIC X.

then MOVE FLDA TO FLDB will compile to an MVI, IF FLDA = FLDB will compile 
to a CLI.  Given:

    05  FLDA PIC X(n).
    05  FLDB PIC X(n).

... where n is an integer > 1 then MOVE FLDA TO FLDB will compile to an 
MVC, IF FLDA = FLDB will compile to a CLC.

As I was taught, lo, those many years ago - and I do not know if this is 
still valid for COBOL under modern conditions - an MVI executes 
approximately three times faster than an MVC and a CLI thrice faster than 
a CLC.  It can be readily postulated that code can be written which is 
greater in volume of instructions ('larger') than other code ('smaller') 
and yet will execute more rapidly due to faster constructs being employed.

DD
docdwarf (6044)
7/21/2005 1:26:48 AM
Pete,
  Also see my other note (and this may or may not tie into your - unknown to 
me - "identical residence" - term), on shared code.  For (older) IBM CICS 
applications, a larger load module which kept "in storage" all same-logic-path 
code would often "run faster" than code that required "memory" swaps depending 
upon logic flow.

-- 
Bill Klein
 wmklein <at> ix.netcom.com
<docdwarf@panix.com> wrote in message news:dbmtko$3a7$1@panix5.panix.com...
> In article <3k86utFl4gssU1@individual.net>,
> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>
> [snip]
>
>>Given identical functionality and identical residence, the smaller the code
>>is, the quicker it will execute. Invariably.
>
> 'Identical residence'?  I'm not sure what this represents, Mr Dashwood,
> but I'd take exception with your assertion that 'less code = faster
> execution' because it appears to ignore differences in the amount of time
> it takes to execute different instructions.
>
> I recall, years ago, talking with a Geezer who said that the first
> airline reservations system he worked on used only register-to-register
> (RR) instructions because they were so fast... in general they found that
> rewriting other single instructions as multiple RR instructions still gave
> them an overall drop in execution time (commonly known as 'faster code').
> Consider the following example.
>
> Using the IBM mainframe platform and every compiler I'm familiar with
> since IKFCBL00: given:
>
>    05  FLDA PIC X.
>    05  FLDB PIC X.
>
> then MOVE FLDA TO FLDB will compile to an MVI, IF FLDA = FLDB will compile
> to a CLI.  Given:
>
>    05  FLDA PIC X(n).
>    05  FLDB PIC X(n).
>
> ... where n is an integer > 1 then MOVE FLDA TO FLDB will compile to an
> MVC, IF FLDA = FLDB will compile to a CLC.
>
> As I was taught, lo, those many years ago - and I do not know if this is
> still valid for COBOL under modern conditions - an MVI executes
> approximately three times faster than an MVC and a CLI thrice faster than
> a CLC.  It can be readily postulated that code can be written which is
> greater in volume of instructions ('larger') than other code ('smaller')
> and yet will execute more rapidly due to faster constructs being employed.
>
> DD 


wmklein (2605)
7/21/2005 2:05:02 AM
The "old" locality of reference idea was to keep your code and data close 
together (on the same page) to reduce paging.  On IBM mainframe systems (and 
perhaps on others also) that have instruction pipelines/caches,  this "old" 
method can actually cause a performance hit.  This happens when data in the 
cache is changed, causing the cache contents to be invalidated and 
disrupting the instruction pipeline.  The "new" locality of reference idea 
is to keep your data separate from your instructions by at least the size 
of the cache.  One way to do this is to make your programs reentrant.

There is a discussion of this at:

http://groups-beta.google.com/group/bit.listserv.ibm-main/browse_thread/thread/3262c21bc944371f/3daee1483bfbc55e?q=cache&


Top post - no more below.

"Pete Dashwood" <dashwood@enternet.co.nz> wrote in message 
news:3k86utFl4gssU1@individual.net...
>
> "Clark Morris" <cfmtech@istar.ca> wrote in message 
> news:e1atd11fi2cshuqsgg11l78qp1n840102p@4ax.com...
>> On Tue, 19 Jul 2005 11:02:50 +1200, "Pete Dashwood"
>> <dashwood@enternet.co.nz> wrote:
>>

<snip>

>
> In modern systems 'locality of reference' is almost irrelevant, especially 
> when multitasking is proceeding anyway.


0
jghottel (73)
7/21/2005 4:13:57 AM
"Pete Dashwood" <dashwood@enternet.co.nz> wrote in message
news:3k86utFl4gssU1@individual.net...
>
> "Clark Morris" <cfmtech@istar.ca> wrote in message
> news:e1atd11fi2cshuqsgg11l78qp1n840102p@4ax.com...
> > On Tue, 19 Jul 2005 11:02:50 +1200, "Pete Dashwood"
> > <dashwood@enternet.co.nz> wrote:
> >
> >>
> <snip>>>
> >>2. The less code that is generated, the faster it will run.
> >
> > Not necessarily.
>
> Yes, necessarily. :-)   Three factors affect online performance; two of
> them are load time and capture time. The smaller  and tighter a module is,
> the quicker it loads and it will be likely to require less capture time.
>
> Capture time for a process will be improved if there is less code for the
> process to execute (as long as it provides the same functionality, of
> course.)
>
> Adding code cannot possibly make it execute any faster than it is required
> to. And if everything else is equal, the faster load time of a smaller
> module makes overall execution quicker.  If a small module is being paged
> in and out continually, as opposed to a large resident piece of code, then
> the overall execution of the small module functionality may be longer, but
> that implies that the large piece of code is doing more (otherwise, why is
> it so large?), so the comparison is between apples and oranges
>
> Given identical functionality and identical residence, the smaller the
> code is, the quicker it will execute. Invariably.

Perhaps, Mr Dashwood, you had some additional restrictions
in mind; but there are cases where larger code, with 'identical
functionality', will execute faster.

The following program has two procedures to determine the
length of text (a string). The procedures have 'identical functionality',
in that they each accept a 32 character item and return its length
(the position of the last non-space character). They also function
identically by examining the 32 character item, in reverse, one
character at a time. Yet the larger procedure outperforms the smaller.

Test program
--------
       program-id. unroll.
       data division.
       working-storage section.
       1 str pic x(32) value space.
       1 strlen-1 comp-5 pic 9(4) value 0.
       1 strlen-2 comp-5 pic 9(4) value 0.
       1 time-1.
        2 time-1-hour pic 99.
        2 time-1-minute pic 99.
        2 time-1-second pic 99v99.
       1 time-2.
        2 time-2-hour pic 99.
        2 time-2-minute pic 99.
        2 time-2-second pic 99v99.
       1 time-3.
        2 time-3-hour pic 99.
        2 time-3-minute pic 99.
        2 time-3-second pic 99v99.
       1 elapsed-time pic 9(7)v99.
       1 elapsed-time-display pic z(6)9.99.
       procedure division.
       mainline section.
           move spaces to str
           perform time-test
           move 'a' to str (16:1)
           perform time-test
           move 'a' to str (32:1)
           perform time-test
           stop run
           .
       time-test section.
           accept time-1 from time
           perform loop 100000000 times
           accept time-2 from time
           perform unrolled-loop 100000000 times
           accept time-3 from time
           display time-1
           if time-1 > time-2
               move 86400 to elapsed-time
           else
               move 0 to elapsed-time
           end-if
           compute elapsed-time = elapsed-time
             + (time-2-hour - time-1-hour) * 3600
             + (time-2-minute - time-1-minute) * 60
             + (time-2-second - time-1-second)
           move elapsed-time to elapsed-time-display
           display time-2 space elapsed-time-display
               space strlen-1
           if time-2 > time-3
               move 86400 to elapsed-time
           else
               move 0 to elapsed-time
           end-if
           compute elapsed-time = elapsed-time
             + (time-3-hour - time-2-hour) * 3600
             + (time-3-minute - time-2-minute) * 60
             + (time-3-second - time-2-second)
           move elapsed-time to elapsed-time-display
           display time-3 space elapsed-time-display
               space strlen-2
           .
       loop section.
           perform varying strlen-1 from 32 by -1
             until strlen-1 = 0
               if space not = str (strlen-1:1)
                   exit perform
               end-if
           end-perform
           .
       unrolled-loop section.
           evaluate space
             when not = str (32:1) move 32 to strlen-2
             when not = str (31:1) move 31 to strlen-2
             when not = str (30:1) move 30 to strlen-2
             when not = str (29:1) move 29 to strlen-2
             when not = str (28:1) move 28 to strlen-2
             when not = str (27:1) move 27 to strlen-2
             when not = str (26:1) move 26 to strlen-2
             when not = str (25:1) move 25 to strlen-2
             when not = str (24:1) move 24 to strlen-2
             when not = str (23:1) move 23 to strlen-2
             when not = str (22:1) move 22 to strlen-2
             when not = str (21:1) move 21 to strlen-2
             when not = str (20:1) move 20 to strlen-2
             when not = str (19:1) move 19 to strlen-2
             when not = str (18:1) move 18 to strlen-2
             when not = str (17:1) move 17 to strlen-2
             when not = str (16:1) move 16 to strlen-2
             when not = str (15:1) move 15 to strlen-2
             when not = str (14:1) move 14 to strlen-2
             when not = str (13:1) move 13 to strlen-2
             when not = str (12:1) move 12 to strlen-2
             when not = str (11:1) move 11 to strlen-2
             when not = str (10:1) move 10 to strlen-2
             when not = str (9:1) move 9 to strlen-2
             when not = str (8:1) move 8 to strlen-2
             when not = str (7:1) move 7 to strlen-2
             when not = str (6:1) move 6 to strlen-2
             when not = str (5:1) move 5 to strlen-2
             when not = str (4:1) move 4 to strlen-2
             when not = str (3:1) move 3 to strlen-2
             when not = str (2:1) move 2 to strlen-2
             when not = str (1:1) move 1 to strlen-2
             when other move 0 to strlen-2
           end-evaluate
           .
       end program unroll.
-------

Test results
--------
00585676
00591950      22.74 00000
00592449       4.99 00000
00592449
00593828      13.79 00016
00594119       2.91 00016
00594119
00594229       1.10 00032
00594328       0.99 00032
--------
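
For readers comparing the two elapsed-time columns, the ratios can be worked out directly from the figures above. This is only arithmetic on the posted results (each timing covers 100000000 calls, as in the test program); nothing here was re-measured.

```latex
\begin{align*}
\text{all spaces (length 0):}      \quad & 22.74 / 4.99 \approx 4.56 \\
\text{`a' at position 16:}         \quad & 13.79 / 2.91 \approx 4.74 \\
\text{`a' at position 32:}         \quad & 1.10 / 0.99 \approx 1.11 \\[4pt]
\text{per call, all spaces:}       \quad & 22.74\,\text{s} / 10^{8} \approx 227\,\text{ns}
   \quad\text{vs.}\quad 4.99\,\text{s} / 10^{8} \approx 50\,\text{ns}
\end{align*}
```

The gap is widest when the loop has to scan many characters and nearly vanishes when the last character is non-space, which is the case where both procedures stop after a single comparison.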



0
ricksmith (875)
7/21/2005 5:54:03 AM
<docdwarf@panix.com> wrote in message news:dbmtko$3a7$1@panix5.panix.com...
> In article <3k86utFl4gssU1@individual.net>,
> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>
> [snip]
>
>>Given identical functionality and identical residence, the smaller the 
>>code is, the quicker it will execute. Invariably.
>
> 'Identical residence'?  I'm not sure what this represents, Mr Dashwood,
> but I'd take exception with your assertion that 'less code = faster
> execution' because it appears to ignore differences in the amount of time
> it takes to execute different instructions.
>

Might be best not to argue if you don't understand the terms... :-)

Three things govern efficiency in online environments. ( ...those are the 
environments where efficiency actually matters...)

1. Queue wait time.
2. Load time.
3. Capture time.

Load time is simply dependent on module size. The smaller it is the quicker 
it loads. Small is beautiful in online environments.
Capture time is not directly dependent on module size, like load time is, 
but it is certainly more likely to be shorter for smaller modules than large 
ones. (This is affected by OS and TP monitor architecture; timeslicing may 
mean a larger module runs out of CPU time where a smaller one would have fit 
the envelope; interrupt-driven environments tend to encounter an interrupt 
more often in large modules than they do in small ones (statistically, the 
more code there is, the better chance of being interrupted and swapped...) 
In either case, swapping a large module HAS to take more time than swapping 
a small one...)

Bottom line: smaller modules execute faster. It is not just about 
instruction architecture, and you can theorise about instruction sets and 
postulate exception cases, for argument's sake, for as long as you like; 
when you come to run it, it will be quicker if it is smaller... :-)

> I recall, years ago, talking with a Geezer who said that the first
> airline reservations system he worked on used only register-to-register
> (RR) instructions because they were so fast... in general they found that
> rewriting other single instructions as multiple RR instructions still gave
> them an overall drop in execution time (commonly known as 'faster code').
> Consider the following example.
>
> Using the IBM mainframe platform and every compiler I'm familiar with
> since IKFCBL00: given:
>
>    05  FLDA PIC X.
>    05  FLDB PIC X.
>
> then MOVE FLDA TO FLDB will compile to an MVI, IF FLDA = FLDB will compile
> to a CLI.  Given:
>
>    05  FLDA PIC X(n).
>    05  FLDB PIC X(n).
>
> ... where n is an integer > 1 then MOVE FLDA TO FLDB will compile to an
> MVC, IF FLDA = FLDB will compile to a CLC.
>
> As I was taught, lo, those many years ago - and I do not know if this is
> still valid for COBOL under modern conditions - an MVI executes
> approximately three times faster than an MVC and a CLI thrice faster than
> a CLC.  It can be readily postulated that code can be written which is
> greater in volume of instructions ('larger') than other code ('smaller')
> and yet will execute more rapidly due to faster constructs being employed.
>
Well it's a long time since I wrote any BAL and I don't have a green or 
yellow card about the place, but from memory, I'll venture the following:

The MVI contains the data to be moved. In effect it is a move literal (that 
is why it is 'immediate'). The MVC needs more information (registers and 
offsets), so I would expect the instruction lengths to be different. I'm 
sure someone here can confirm or deny this. So, while the MVI might well run 
faster, you would need 256 of them to do the same job as a single MVC 
addressing the max.

To say that an MVI executes three times faster than an MVC is silly, because 
the time the MVC takes is dependent on the length of the operands. (Is it 
three times faster than an MVC moving 1 byte, or an MVC moving 256 bytes? 
Does it matter anyway? There are other more potent factors than instruction 
architecture that will decide how quickly something gets executed (see 
above).) With the 'new' MVCL the limit of 256 bytes has been extended but the 
overall time for the instruction  (in both cases) will be dependent on the 
length being moved.

Despite your 'ready postulate' I disagree. The instructions that 'take 
longer' will either be the same length or longer than the ones that are 
quicker, but they will do more.

Show me an application program (on any platform) that does real work in a 
real environment, where shortening it does NOT make it run faster. (Even if 
you remove code that is never executed, it is STILL faster overall because 
it loads quicker...)

My comments apply to real code running in the real world, not to contrived 
exceptions designed for the sake of an argument. :-)

(And arguing it with you has not changed my mind :-))

Pete.



0
dashwood1 (2140)
7/21/2005 11:38:17 AM
"William M. Klein" <wmklein@nospam.netcom.com> wrote in message 
news:hpDDe.503013$3V6.105715@fe04.news.easynews.com...
> Pete,
>  Also see my other note (and this may or may not tie into your - unknown 
> to me - "identical residence" - term), on shared code.  For (older) IBM 
> CICS applications, a larger load module which kept "in storage" all 
> same-logic-path code would often "run faster" than code that required 
> "memory" swaps depending upon logic flow.
>
Yes, I recognised that in my discourse :-)  Identical residence means what 
it says. :-)

Pete.
<snip>> 



0
dashwood1 (2140)
7/21/2005 11:40:58 AM
Good post, Rick. I appreciate the empirical approach.

Comments below...


"Rick Smith" <ricksmith@mfi.net> wrote in message 
news:11due6mb9fhfac3@corp.supernews.com...
>
> "Pete Dashwood" <dashwood@enternet.co.nz> wrote in message
> news:3k86utFl4gssU1@individual.net...
>>
>> "Clark Morris" <cfmtech@istar.ca> wrote in message
>> news:e1atd11fi2cshuqsgg11l78qp1n840102p@4ax.com...
>> > On Tue, 19 Jul 2005 11:02:50 +1200, "Pete Dashwood"
>> > <dashwood@enternet.co.nz> wrote:
>> >
>> >>
>> <snip>>>
>> >>2. The less code that is generated, the faster it will run.
>> >
>> > Not necessarily.
>>
>> Yes, necessarily. :-)   Three factors affect online performance; two of
>> them are load time and capture time. The smaller  and tighter a module is,
>> the quicker it loads and it will be likely to require less capture time.
>>
>> Capture time for a process will be improved if there is less code for the
>> process to execute (as long as it provides the same functionality, of
>> course.)
>>
>> Adding code cannot possibly make it execute any faster than it is required
>> to. And if everything else is equal, the faster load time of a smaller
>> module makes overall execution quicker.  If a small module is being paged
>> in and out continually, as opposed to a large resident piece of code, then
>> the overall execution of the small module functionality may be longer, but
>> that implies that the large piece of code is doing more (otherwise, why is
>> it so large?), so the comparison is between apples and oranges
>>
>> Given identical functionality and identical residence, the smaller the
>> code is, the quicker it will execute. Invariably.
>
> Perhaps, Mr Dashwood, you had some additional restrictions
> in mind; but there are cases where larger code, with 'identical
> functionality', will execute faster.
>
> The following program has two procedures to determine the
> length of text (a string). The procedures have 'identical functionality',
> in that they each accept a 32 character item and return its length
> (the position of the last non-space character). They also function
> identically by examining the 32 character item, in reverse, one
> character at a time. Yet the larger procedure outperforms the smaller.

This is because the capture time of the smaller module is much greater than 
the capture time of the larger module.

The example has been contrived to show this, and it is valid.

I shouldn't have said 'invariably'. It is not invariably so.

My statement was based on my own experience using real world data, and the 
teaching I received.

The fact is that a smaller load module directly reduces load time, but it is 
only more likely to reduce capture time; it isn't definite.

Your example shows a case where it doesn't help the capture time. I 
overlooked this,  but I should have been more careful in what I stated.

Thanks for posting the example.

However, I still believe that in a real mix of programs, in an online 
environment, small is beautiful.

Pete.
<snipped good example> 



0
dashwood1 (2140)
7/21/2005 12:15:44 PM
Apologies to Clark, see below.

"Pete Dashwood" <dashwood@enternet.co.nz> wrote in message 
news:3k86utFl4gssU1@individual.net...
>
> "Clark Morris" <cfmtech@istar.ca> wrote in message 
> news:e1atd11fi2cshuqsgg11l78qp1n840102p@4ax.com...
>> On Tue, 19 Jul 2005 11:02:50 +1200, "Pete Dashwood"
>> <dashwood@enternet.co.nz> wrote:
>>
>>>
> <snip>>>
>>>2. The less code that is generated, the faster it will run.
>>
>> Not necessarily.
>
> Yes, necessarily. :-)   Three factors affect online performance; two of 
> them are load time and capture time. The smaller  and tighter a module is, 
> the quicker it loads and it will be likely to require less capture time.

Although I said 'likely to' here, I behaved as if it said 'will'.
>
> Capture time for a process will be improved if there is less code for the 
> process to execute (as long as it provides the same functionality, of 
> course.)

Large iterations with indirect addressing are just one case where small code 
may have a large capture time. I was wrong to overlook this, and I'm sorry, 
Clark.

>
> Adding code cannot possibly make it execute any faster than it is required 
> to. And if everything else is equal, the faster load time of a smaller 
> module makes overall execution quicker.  If a small module is being paged 
> in and out continually, as opposed to a large resident piece of code, then 
> the overall execution of the small module functionality may be longer, but 
> that implies that the large piece of code is doing more (otherwise, why is 
> it so large?), so the comparison is between apples and oranges
>
> Given identical functionality and identical residence, the smaller the 
> code is, the quicker it will execute. Invariably.

Nope, not invariably. I was wrong.

Pete.

<snipped> 



0
dashwood1 (2140)
7/21/2005 12:22:19 PM
A very different way of duplicating code is when that duplicated code is
distributed.    Efficiencies in distributed processing are much harder for me to
figure out.
0
howard (6283)
7/21/2005 1:40:46 PM
"Pete Dashwood" <dashwood@enternet.co.nz> wrote in message
news:3k86utFl4gssU1@individual.net...

> >>2. The less code that is generated, the faster it will run.
> >
> > Not necessarily.
>
> Yes, necessarily. :-)   Three factors affect online performance; two of
> them are load time and capture time. The smaller  and tighter a module is,
> the quicker it loads and it will be likely to require less capture time.

Can you define a "module" of, or containing, COBOL code in an
implementation-independent manner?

I know of COBOL compilers that pay virtually no attention to natural
boundaries within a program when deciding where a block of object code
begins and ends, and I know of others on the same platform that allow the
user tight control over the size and end points of the object code segments.
The debate continues to rage over which approach is superior (I happen to
prefer the latter).

> Capture time for a process will be improved if there is less code for the
> process to execute (as long as it provides the same functionality, of
> course.)

It is not clear to me that making the particular piece of code smaller by
eliminating some in-line code and putting it into another piece of code will
always make it run faster.   It seems to me that a "branch-and-link" or
other PERFORM analog to a sequence of machine instructions will ordinarily
run a little *slower* than the same sequence of machine instructions
executed linearly within the larger sequence.  Maybe I don't understand what
it is you're trying to convey.
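
To make the two forms being compared concrete, here is a minimal sketch. The program name perfdemo, the section name bump-counter, and the two counters are made up for this illustration; whether the out-of-line form is measurably slower depends entirely on the compiler and platform, and this sketch makes no claim either way.

```cobol
       identification division.
       program-id. perfdemo.
       data division.
       working-storage section.
       1 counter-a comp-5 pic 9(9) value 0.
       1 counter-b comp-5 pic 9(9) value 0.
       procedure division.
       mainline section.
      * In-line form: the loop body sits inside the PERFORM itself,
      * so the generated code can simply fall through it.
           perform 1000 times
               add 1 to counter-a
           end-perform
      * Out-of-line form: the body lives in its own section and
      * each iteration transfers control there and back.
           perform bump-counter 1000 times
      * Both counters have been incremented 1000 times here.
           display counter-a space counter-b
           stop run
           .
       bump-counter section.
           add 1 to counter-b
           .
       end program perfdemo.
```

The source is smaller per call site in the out-of-line form when the body is reused in several places, which is the trade-off under discussion: less code to load versus a transfer of control on every execution.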

> Adding code cannot possibly make it execute any faster than it is required
> to. And if everything else is equal, the faster load time of a smaller
> module makes overall execution quicker.

OK, presuming the "module" needs to be loaded more than once during the
execution of the program ... a matter of a cost/benefit tradeoff on the
available memory on the system, ISTM.

>If a small module is being paged in
> and out continually, as opposed to a large resident piece of code,

Why should object code *ever* need to be paged *out*, since code can never
be modified in memory anyway?

If the system needs the memory space for a code segment,  basically all it
needs to do is mark it absent in the segment dictionary, and take the
memory.  Next time that code segment gets "touched" during execution, the
system will find space for it somewhere and load it back in.  If that
reaches a point commonly called "thrashing", it's time to take action -- 
reduce the workload, add memory, or live with it.  Simple.

> overall execution of the small module functionality may be longer, but
> that implies that the large piece of code is doing more (otherwise, why is
> it so large?), so the comparison is between apples and oranges

That also seems to presume that all instructions in both modules are always
executed, I think.

> Given identical functionality and identical residence, the smaller the
> code is, the quicker it will execute. Invariably.

I would say "the fewer the instructions executed the quicker the whole will
execute", but I'm not yet convinced that a small "module" will necessarily
execute faster than a large one!

    -Chuck Stevens


0
7/21/2005 3:04:31 PM
On 21-Jul-2005, "Chuck Stevens" <charles.stevens@unisys.com> wrote:

> It is not clear to me that making the particular piece of code smaller by
> eliminating some in-line code and putting it into another piece of code will
> always make it run faster.   It seems to me that a "branch-and-link" or
> other PERFORM analog to a sequence of machine instructions will ordinarily
> run a little *slower* than the same sequence of machine instructions
> executed linearly within the larger sequence.  Maybe I don't understand what
> it is you're trying to convey.

Sometimes compilers for supercomputers actually take code and make multiple
copies - which are executed simultaneously.

But those computers are designed to avoid virtual memory swaps as much as
possible.


> Why should object code *ever* need to be paged *out*, since code can never
> be modified in memory anyway?

VM machines have various degrees of smarts and successes.   The easiest thing to
do is to swap a whole program out when memory is needed by another program.  
But if it has control over various modules, it can be smarter about just
swapping out enough modules to free up the required amount of memory.    On the
other hand - if it picks the small modules to swap out - programs with small
modules will be the ones that are slowed down when this happens.
0
howard (6283)
7/21/2005 3:58:34 PM
Based on information that seems to be between the lines in this thread, the
concept of "module" seems to include both instruction information and data
information, treated as a logical unit which as a whole is either in memory
or it is not.   That's a particular implementation approach, but it is not
an approach that applies to all environments.  In our environment, "virtual
memory swaps" take place *only* for data; there is never a need to write
object code back out to disk; and there's a whole suite of mechanisms that
fine-tune how sticky a particular piece of (code or data) information needs
to be and how readily its space can be reused for something else.    Code
gets read in if it's not there when somebody goes to it or performs it; it
gets marked absent if the system decides it needs the space; but it never
gets written back out.  Thus, the concept of binding code to data in some
way may be a really useful and productive concept in designing an
application, but it doesn't necessarily require that the underlying hardware
platform actually act that way.

(no more below)

    -Chuck Stevens

"Howard Brazee" <howard@brazee.net> wrote in message
news:dbogo5$k74$1@peabody.colorado.edu...
>
> On 21-Jul-2005, "Chuck Stevens" <charles.stevens@unisys.com> wrote:
>
> > It is not clear to me that making the particular piece of code smaller by
> > eliminating some in-line code and putting it into another piece of code will
> > always make it run faster.   It seems to me that a "branch-and-link" or
> > other PERFORM analog to a sequence of machine instructions will ordinarily
> > run a little *slower* than the same sequence of machine instructions
> > executed linearly within the larger sequence.  Maybe I don't understand what
> > it is you're trying to convey.
>
> Sometimes compilers for supercomputers actually take code and make multiple
> copies - which are executed simultaneously.
>
> But those computers are designed to avoid virtual memory swaps as much as
> possible.
>
>
> > Why should object code *ever* need to be paged *out*, since code can never
> > be modified in memory anyway?
>
> VM machines have various degrees of smarts and successes.   The easiest thing to
> do is to swap a whole program out when memory is needed by another program.
> But if it has control over various modules, it can be smarter about just
> swapping out enough modules to free up the required amount of memory.    On the
> other hand - if it picks the small modules to swap out - programs with small
> modules will be the ones that are slowed down when this happens.


0
7/21/2005 4:51:19 PM
In article <3k9fpeFtilonU1@individual.net>,
Pete Dashwood <dashwood@enternet.co.nz> wrote:

[snip]

>(And arguing it with you has not changed my mind :-))

It seems, however, that code someone else posted did... sometimes a 
different method is needed, that is all.

DD

0
docdwarf (6044)
7/21/2005 6:53:38 PM
<docdwarf@panix.com> wrote in message news:dboqvi$64$1@panix5.panix.com...
> In article <3k9fpeFtilonU1@individual.net>,
> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>
> [snip]
>
>>(And arguing it with you has not changed my mind :-))
>
> It seems, however, that code someone else posted did... sometimes a
> different method is needed, that is all.

I guess so :-) But it is still fun to argue (in the classical sense) stuff 
with people capable of doing so... :-)

I always enjoy your posts, Doc.

Pete.



0
dashwood1 (2140)
7/22/2005 12:53:36 AM
<snip>>
>> Given identical functionality and identical residence, the smaller the
>> code is, the quicker it will execute. Invariably.
>
> I would say "the fewer the instructions executed the quicker the whole 
> will execute", but I'm not yet convinced that a small "module" will
> necessarily execute faster than a large one!

Yes. Capture time probably has the largest effect on overall throughput, 
although the other two factors definitely influence it.

My comments were intended for an online environment where multitasking is 
going on.

I was wrong to say 'invariably' and have corrected it elsewhere.

I take no issue with your post and am happy to be in agreement with your 
conclusion. :-)

Pete.




0
dashwood1 (2140)
7/22/2005 1:00:09 AM
Massively snipped - sometimes without notation - to avoid a ramble, which 
didn't work, as this is majorly rambling.  It's not intended for a response. 
It's more that I agree with you in "principle" but am separated by a 
differing view of what I call "realistic probability".  Some people think 
Marx was right and then some people think Wallerstein needed to make some 
corrections to the theories to fit the real world when it became clear that 
Marxism was flawed......I think your theories are right, but don't take into 
account Fortune 500 companies and their goals.  Crap analogy.  "Tis just food 
for thought" is more elegant.

Again, pleasantly surprised by your clc dedication ;-)

JCE

"Pete Dashwood" <dashwood@enternet.co.nz> wrote in message 
news:3k4krvFsnis7U1@individual.net...
> "jce" <defaultuser@hotmail.com> wrote in message
<snip>

>> Which is the most common scenario in your experience:
>> Do you "extend" my component and create a new one based solely on mine ?
> Yes. see below.
>>If you want to do something close but not quite, you may have to 
>>duplicate the function in order to make your slight change.
> No, I would incorporate your functionality into a completely new function 
> by adding methods and properties that do not HAVE to be used. The new 
> function then includes the old one as a subset. I would NOT mess with the 
> functionality that already existed in your component. (That's why source 
> code is not such a big deal any more...)
You have duplicated the function unless your component stays logically 
defined with the old component, or your component "hides" all the 
unnecessary features of the first.  Your component is now out there and 
supported by you whilst the old - supporting the same base function (that 
you rely on) - is no doubt supported by someone else.

From a purely aesthetic standpoint I do agree with you  :-)  I guess the 
point I didn't make clear in my note is that I understand that you're not 
tied to the source code, but you are tied to the new "source" code.  For you 
it's an API, for someone else it's an XMI and for others it's still the old 
source COBOL, C, Java.

I'm less worried about duplicate code, and more about management of 
"worldwide interenterprise crossed with freelance and free source" functions 
"where there are bucks to be made".  _That_ - no one has sufficiently 
addressed.

>> I just see an explosion of components and code that become very 
>> unmanageable unless the code is very clearly owned and that ownership 
>> just implies more components of similar thing to gain some ownership over 
>> the code use.
> That could certainly happen if you don't employ the constraints I have 
> outlined above. The components are the property of the company and owned 
> by everybody. You can see immediately, you don't want people amending the 
> functionality. This fact will decide the 'granularity' of the components. 
> Personally, I like fairly small building blocks and don't build massive 
> functionality as a single component,

This works well in some instances, but software licensing is never that 
easy.  Someone may not _like_ your extensions of _their_ component.  Someone 
may _change_ their component in a way you don't like.  I'm being kind of a 
naysayer but business is business and when people don't have control there 
are two things that can happen:
I don't want to play with you anymore, I'll make "better" friends.
I don't want to play with you anymore, and by the way it's my ball and I'm 
taking it with me.

What happens is that you find different people with different balls - some 
with a Gilbert some with an Adidas, some all weather, some real 
leather...etc.

> It is also important to recognise elemental components and 'assemblies' or 
> 'sub-assemblies' that incorporate them. Sometimes what are really 
> assemblies are registered as components, and then confusion arises.
This is perhaps my hangup.  Most useful offerings are 'assemblies' however, 
so perhaps an easy "mistake?" to make.

>> I would imagine that for a code generator this is a non issue.  I would 
>> actually think duplicate code is a lot simpler for autotools to manage.
> Yes, but you are still looking at your tool as producing code for 
> maintenance. I don't do that. I covered the fact that at some point the 
> generator will include duplicated code; the difference is that it can be 
> further processed to remove the duplication, if desired.
At some point someone will want to either (a) change something, or (b) 
validate the perfect documentation.
In order to change something or validate something you need something in a 
form that you can look at.
The maintenance I speak of is not "source" code, but the "new source".  If 
one built an App using MS Studio GUI, it would be soul destroying to know 
that there is no way of "editing" or "validating" it.

>>> At a line-by-line level a code generator may well duplicate code because
>>> it doesn't 'understand' the overall picture. However, subsequent passes 
>>> of
>>> the conversion process can analyse this generated code (because it is 
>>> now
>>> in a form where it is suitable for automated analysis), and recognise
>>> repeated code. Optimization can create functions and replace the
>>> duplication with function (or component) activation. (I didn't say 
>>> CALLs,
>>> because it depends on the environment...).
>> But now who would maintain that code? The generator, the end user, or the 
>> refactoring tool?
> Nobody. That is my whole point.
I question the interface description - who validates it?  Or, more 
importantly, who can figure out how to change it when validation fails?
I want to make an update, who and what do I use to update it?
I did slip and use the word "code" here which was unintentional - I meant 
source.  If it was as easy as this, then in my old source world, I would 
never keep source, I'd create my object code and delete the program and be 
happy that the compiler did its very best.

>> If the process is always A to B to C...who's responsible for the build 
>> and QA test ?
> The person building the system. Traditionally, a programmer, but it could 
> be (and this is happening more and more) a knowledgeable business person, 
> interacting with smart software.
Good luck finding a knowledgeable business person who has time to do this. 
I'm finding that smart programmers are just moving up the business ladder 
(often out of their depth too).

>> Three tools touching code is almost as dangerous as one person
>> touching it :-)  To date the best advice I've received is, "if you cannot 
>> understand the generated code, then you shouldn't use the generator". 
>> Good code is readable regardless of who wrote it.
> At this point we diverge strongly. 'Good code' is simply a series of 
> iterations, sequences, and selections, produced by standard algorithms, 
> from something written by a programmer. The 'Good Code' can then be 
> processed further. (Or not)
> In my scenario the whole purpose of refactoring is to acquire components 
> for re-use. Source code maintenance is neither necessary nor desirable. In 
> your scenario the refactoring is to 'straighten out' spaghetti and 
> structure it, so it can be maintained. Source code is everything.
Again - a component cannot ever be a black box.  You need to be able to 
validate the documentation which is the "new source", which can arguably be 
done with mere testing and boundary conditions.  But should it ever fail? 
Then what?

Closed software is not the "in thing" - I'm not going to use a credit card 
payment component without knowing that it's auditable.

I've used "code" again, but I mean source :-) I have no issue with code 
generation from a design doc or a nice tool - but I still want to be able to 
understand the source.  Anyone who used the VisualAge visual editor can tell 
you that it was great for creating code...but AWFUL for understanding it - 
even the visual framework was not clear.  So I never looked at the code it 
generated, but the "source" (the design palette) _almost_ forced you to 
abandon it and look at the code.  If I had optimized it, then it would not 
have worked in the visual framework and that's a problem to me.

> Both scenarios have their place. It's just that I, personally, have no 
> interest in yours... :-).  (I don't think it is 'wrong'; I just don't 
> think it is a good reason to refactor code. To me, if you are hell bent on 
> maintaining source code, rewrite the stuff from scratch....  It is 
> certainly arguable... :-)).
Yes, but maintenance is understanding and auditing.  Someone +has+ to do 
that. It's government mandated in my industry.

>>Unless a refactoring tool can extract the dead code, clean up the mess and 
>>provide reasoned documentation of the program it seems somewhat fruitless.
> That's like saying that supermarkets are useless because they don't 
> actually put the stuff in your larder for you. :-)
Not really.  It's like saying, I'm not paying someone to clean up my garage 
if all they do is move stuff around without telling me where my junk is now.

>> Spaghetti code represents spaghetti logic.  You can straighten out the 
>> code, but it's still messed up logic.
> If that particular code works, exactly how is its logic messed up? You 
> are seeing only source code maintenance again. Assume for a minute it 
> works, and you are never going to maintain it. Does it still matter if it 
> is 'messed up'...?
If no one is maintaining it then what's the point of refactoring? If it's to 
reuse the functionality in some other arena, then I would argue that you 
+need+ to understand it.  You could have code that has a rounding error 
somewhere...and then says "if error < $0.02 then display 'We're in balance' 
and skip this 2c".  Again, I'm +auditable+.  The code does "work" - just not 
very well.  (And yes, this is real, and it existed unknown to me until I 
fixed the real rounding problem and someone noticed that the 2c file was 
always empty.)

> Procedural programming is obsessed with code maintenance, but it is 
> actually the functionality that is important.
ALL programming is obsessed with code maintenance.  Less than 10% of the 
code I have ever written is "new" function. The rest is enhancements and, 
unfortunately, "fixing".

> The prime evaluation criterion is that the program should work. Elegance 
> of code is necessarily a lower priority than achieving correct 
> functionality.
And then the priority immediately becomes ability to understand, fix AND 
validate. I don't care if the fix is a code fix, or a gui fix, as long as 
the "source" is readable and simple.

>> should isolate function and replace it with standard solutions....how 
>> many people wrote their own "logging" routines, or "web frameworks" or 
>> servlets and applets before log4j and struts and jsp custom tags ?  I 
>> remember once I wrote something that was very similar to a point and 
>> shoot JMS solution.....and then came JMS.
> Exactly. The need for such functionality was recognised. But you didn't 
> not use it because you didn't have the source code, did you :-)?
The above was all available open source. Log4j is pretty slick and there is a 
Log4p and a Log4c which will make you sick to your stomach :-)

>>> It is no longer accurate to believe that only a human programmer, who 
>>> has
>>> full understanding of a process, can be capable of refactoring that
>>> process. Indications are that smart software can do it faster, cheaper,
>>> and with fewer errors.
For refactoring, I have no doubt that automation is faster, and that it 
accurately reproduces what's there in a GIGO fashion.

> Prediction:
> As existing systems (particularly procedural coded ones, where source 
> maintenance is a heavy requirement) are phased out, we are likely to see 
> more code refactoring being employed. I don't believe it will be done with 
> the object of straightening out the existing code, so it can be more 
> easily maintained; rather, it will be done so that functionality can be 
> encapsulated and embedded into new 'quick build' systems, and corporate 
> packages like SAP and Siebel.
SAP/Siebel which allow Java and even better...their own proprietary 
languages that keep people employed worldwide, whoopee :-)...how many ABAP 
people we got in here?

> The Age of Source Code is almost over; the Age of Functionality is 
> dawning.
I don't see them as separate things.  A different form of source code is 
STILL source code - taken to an extreme, the component you encapsulate is 
still the source; it's just that now, to validate it, I have to understand 
the documentation of that component, as by law in any US business where 
there are ANY finances involved you are +mandated+ to understand it and be 
auditable...the CEOs sign off on it.

JCE


0
defaultuser (532)
7/22/2005 5:28:46 AM
Thanks Oliver.  Your perfectly innocent first post has made a rather 
interesting thread.

"Oliver Wong" <owong@castortech.com> wrote in message 
news:Sv9De.95888$wr.56881@clgrps12...
> "jce" <defaultuser@hotmail.com> wrote in message 
> news:I_0De.10777$iG6.9945@tornado.tampabay.rr.com...
<snip - a very nice explanation on some OO concepts>

>> I would imagine that for a code generator this is a non issue.  I would 
>> actually think duplicate code is a lot simpler for autotools to manage.
> Well, this depends on what you're trying to accomplish. If the generator 
> "knows" ahead of time that two of the things it's going to generate will 
> contain identical code, then it probably has some sort of "planning" 
> phase. During this phase, it can make a list of all the functionality it 
> wants to generate, checking every time it tries to add a feature that it 
> isn't already there, to ensure that no duplicate code gets generated. Even 
> without an explicit planning phase, typically when generating code, you'll 
> want to refer to it (via method calls or even GOTOs) later on, so you'll 
> need to keep some sort of list of what you've generated so far anyway.
> But yes, most tools I've seen don't do an explicit "duplicate code check" 
> phase, examining the code it just generated. I think most tools make a 
> "decent" effort towards not generating duplicate code, but if it does, it 
> doesn't worry about it too much.

I'm not sure why the developers would worry much...You are much more versed 
in this area than me, but the code generators I have experience with are 
EXTREMELY verbose.  Frankly I don't care too much because the code was 
fairly static and it was easy enough to figure out what it was doing, even 
if it was cumbersome.  While we licence the tool it's a non issue - but 
recently, that _was_ an issue.

>> If the process is always A to B to C...who's responsible for the build 
>> and QA test ? Three tools touching code is almost as dangerous as one 
>> person touching it :-)
> I think if you can "prove" that a tool doesn't change the semantics of a 
> given program, then it doesn't matter how many times you run that tool, 
> each time the semantics won't change. Similarly, if you have 3 tools, each 
> of which don't change the action, then you can run any of them in any 
> order as many times as you like, and the result at the end will be 
> semantically identical to the program at the beginning.
> Of course, very rarely do I see program accompanied by formal proof of 
> correctness.

I've yet to find a bug-free piece of software that provided any value.  The 
sheer number of problems and patches created for compilers of all types is, 
I think, testament to that. Spending hours working with the support 
staff on a DBMS precompiler abending (protection exception) proved it to me. 
Now, if a compiler written by the supposed owner of a language cannot get it 
right (in the case of Sun) then I'm standing by my statement...though maybe 
it's more accurate to say "3000 tools are as dangerous as one 'smart and 
conscientious' person".

> As for QA, if you're confident that the tool is correct, then you might 
> have more important things to spend time testing (e.g. code you've 
> manually written); the less confident you are about the tool, the higher 
> the priority should be given to testing the generated code.

Most certainly agree with this - which is why I adjusted my numbers above.

>> To date the best advice I've received is, "if you cannot understand the 
>> generated code, then you shouldn't use the generator".  Good code is 
>> readable regardless of who wrote it.

> I don't agree with this; when you generate a lexer (the front end of a 
> compiler) using a grammar file, usually the lexer will contain completely 
> unintelligible code. The lexer I'm working with, for example, has a giant 
> array containing what I assume (based on the name of the variable) is the 
> next state its internal state machine should transition to encoded as a 
> bunch of integers. The code looks like:
> jjnextStates = { 485, 486, 468, 487, 488, 497, 499, 501, 502, 490, 492, 
> 494, 495, 1, 2, 327, 328, 388, 492, 494, 495, 499, 501, 502, 3, 4, 17, 18, 
> 30, 31, 48, 49, 64, 65, 77, 78, 98, 99, 119, 120, 145, 146, 171, 172, 193, 
> 194, 223, 224, 253, 254, 288, 289, 323, 324, 5, 6, 7, 9, 10, 12, 13, 14, 
> 15, 19

So perhaps they threw you a bone? Maybe if it was the following you wouldn't 
be able to assume anything.
i = {485,486...}
It's clear you don't care, so it's moot anyway :-)

I'm not versed in this area so excuse the over simplification.
I think that in creating a lexer you have an atypical situation, and it is a 
tough one to think about.  You are producing a system to essentially tokenize 
a "source" for input into a "parser".  IMHO the real source that you are 
interested in for this example is the 'grammar file'. The _output_ is the 
lexer.  To me this is similar to a precompile, or in fact, any compile. 
Technically, the output from the compiler is "generated" source but I trust 
that the provider has done a good job.

I think to put it in perspective it would be as if you wrote the grammar 
file to create a lexer and then asked someone to conform to your grammar. 
If your grammar made [0.9] mean any letter and [a.Z] be any integer, people 
would be baffled and say "I don't understand his grammar".

If you are writing the grammar file to create a lexer for an existing 
source, then to me it's a case where you are right - I don't care about the 
source in _this_ instance.

I was talking more about the typical in use code generators - taking UML, 
MetaData, Designs, or GUI whizz bang drag and drop apps to create code in 
which you add functionality.  I know that if you drop a table on a form and 
attach a database query to fill it, you don't have to write ANY code - even 
the SQL has a designer these days - I think you should be able to understand 
what the tool produced.  It _should_ be readable - if it's not the tool 
should be replaced with one that is.  I've yet to work somewhere where a 
generator was bought and the licence was kept for the entire life of the 
application - or worse, where the developer of the generator continued to 
support the same version.  Someone here recently mentioned an IBM structured 
COBOL facility that was supported on OS/2 but not Windows...that's nice.

I _did_ read the other post regarding those who optimize WITHOUT analyzers 
=)

> and so on. I don't understand the code at all. However, I have a basic 
> understanding of compiler technology (I know, for example, that lexing can 
> be done via regular expressions, and that regular expressions can be 
> emulated by DFAs (Deterministic Finite Automata), and that DFAs can be 
> implemented as state machines), and I "trust" that my tool works, so I 
> don't bother to "test" the lexer that gets generated. So far I haven't 
> been disappointed.
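
The chain described in the quoted text (regular expression, then DFA, then a table-driven state machine) can be sketched in a few lines. This is only an illustration in Python: the token classes, the state numbering and the NEXT_STATE table are invented here and have nothing to do with the jjnextStates array quoted above.

```python
# Hypothetical DFA for the regular expression  [0-9]+ | [A-Za-z][A-Za-z0-9-]*
# State 0 = start, 1 = inside a number, 2 = inside a word, -1 = dead state.

def char_class(ch):
    """Map a character to a column of the transition table."""
    if ch.isdigit():
        return 0
    if ch.isalpha():
        return 1
    if ch == "-":
        return 2
    return 3  # anything else


# NEXT_STATE[state][char_class] -> next state; a generator emits tables like this.
NEXT_STATE = [
    [1, 2, -1, -1],   # state 0: start
    [1, -1, -1, -1],  # state 1: number
    [2, 2, 2, -1],    # state 2: word
]
ACCEPTING = {1: "NUMBER", 2: "WORD"}


def tokenize(text):
    """Split text into (kind, lexeme) pairs, taking the longest match each time."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state, end = 0, pos
        while end < len(text):
            nxt = NEXT_STATE[state][char_class(text[end])]
            if nxt == -1:
                break
            state, end = nxt, end + 1
        if state not in ACCEPTING:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((ACCEPTING[state], text[pos:end]))
        pos = end
    return tokens
```

With this table, tokenize("MOVE 32 TO STRLEN-2") yields a WORD, a NUMBER and two more WORDs. The table itself is as opaque as the generated one; what a person reads and maintains is the regular expression in the first comment.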

If it wasn't important to understand any outputs then everything should 
generate just 0's and 1's as an output targeted per CPU, right?

generate allMyBusinessCode.src -target Xeon64 -numprocessors 64

 JCE 


0
defaultuser (532)
7/22/2005 6:17:30 AM
In article <3kauckFtf29sU1@individual.net>,
Pete Dashwood <dashwood@enternet.co.nz> wrote:
>
><docdwarf@panix.com> wrote in message news:dboqvi$64$1@panix5.panix.com...
>> In article <3k9fpeFtilonU1@individual.net>,
>> Pete Dashwood <dashwood@enternet.co.nz> wrote:
>>
>> [snip]
>>
>>>(And arguing it with you has not changed my mind :-))
>>
>> It seems, however, that code someone else posted did... sometimes a
>> different method is needed, that is all.
>
>I guess so :-) But it is still fun to argue (in the classical sense) stuff 
>with people capable of doing so... :-)

I agree... when you find such people would you be so kind as to introduce 
me to them?  I think I'd be able to learn much!

>
>I always enjoy your posts, Doc.

Shucks... you'se jes' easily amused.

DD
0
docdwarf (6044)
7/22/2005 7:54:08 AM
In article <11due6mb9fhfac3@corp.supernews.com>,
 "Rick Smith" <ricksmith@mfi.net> wrote:
>        loop section.
>            perform varying strlen-1 from 32 by -1
>              until strlen-1 = 0
>                if space not = str (strlen-1:1)
>                    exit perform
>                end-if
>            end-perform
>            .

>        unrolled-loop section.
>            evaluate space
>              when not = str (32:1) move 32 to strlen-2
<snip>
>              when not = str (1:1) move 1 to strlen-2
>              when other move 0 to strlen-2
>            end-evaluate
>            .
>        end program unroll.
> -------
> 
> Test results
> --------
> 00585676
> 00591950      22.74 00000
> 00592449       4.99 00000
> 00592449
> 00593828      13.79 00016
> 00594119       2.91 00016
> 00594119
> 00594229       1.10 00032
> 00594328       0.99 00032
> --------


Interesting, but you seem to have intentionally handicapped the loop 
version.

Had you written it like this:

   "perform varying strlen-1 from length of str by -1"

Then you would have had a flexible routine that will work in pretty much 
any case and could be reused frequently.  This benefit would surely 
outweigh the trivial speed gain you would receive from the unrolled-loop 
version.

Consider also that an optimizing compiler is likely to unroll the loop 
version for you if all of the necessary values are known at compile time.

It is likely that the unrolled version of the code will never recover in  
runtime the amount of typing time the programmer spent writing it.
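
As a rough illustration of the flexible form, here is the same backwards scan written in Python, starting from the field's own length rather than a hard-coded 32. The function name trimmed_length is made up for this sketch; it is not taken from the benchmark program.

```python
def trimmed_length(field):
    """Return the position of the last non-space character (0 if all spaces)."""
    # Start from the field's own length, mirroring
    # "perform varying strlen-1 from length of str by -1".
    for pos in range(len(field), 0, -1):
        if field[pos - 1] != " ":
            return pos
    return 0
```

The same routine then serves a 32-character field, an 80-character field or anything else, which is the reuse argument made above; it says nothing about which version runs faster.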
0
7/23/2005 3:54:26 PM
In article <n_PCe.121287$HI.53338@edtnps84>,
 "Oliver Wong" <owong@castortech.com> wrote:
>     The adding of switches is mainly because this is pretty much the 
> technique used by all GOTO-eliminating algorithms I've seen discussed in the 
> literature and papers I've read on this topic. If there's a way to perform 
> GOTO removal without adding new switches, I'd very much like to know about 
> it because generating meaningful names for switches is rather difficult 
> (these names WILL probably eventually be seen by some human, at least in the 
> form of the outline of the program generated at the end, if not directly in 
> the COBOL source code itself).
> 
>     Whether or not there will be a "switch eliminating phase" is yet to be 
> seen; it's much too early at this point to tell whether it will be 
> necessary, feasible, or even desirable. Right now, when confronted with 
> "lots of switches" versus "lots of GOTOs", my requirements state that "lots 
> of switches" is the lesser of two evils.
> 
>     - Oliver 

In many (not all) cases, you can use knowledge of idiomatic COBOL to 
remove the need for switches.  Consider the common "go to exit":

   1234-Some-Para.

         ... some code ...

         If condition
            go to 1234-some-para-exit.

         ... lots of code ...

   1234-Some-Para-Exit.
         exit. 

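This can always be rewritten as follows, as long as any periods inside 
"lots of code" are removed (a period would end the IF early):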

   1234-Some-Para.

         ... some code ...

         If NOT (condition)
            ... lots of code ...
         END-IF.

   1234-Some-Para-Exit.
         exit. 

I have a hunch that you can do this without switches EXCEPT for the case 
where a GO TO jumps over a paragraph boundary (e.g. in a long PERFORM 
THRU).
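
For what it is worth, the text-level rewrite could be sketched like this in Python. It is a toy under strong assumptions, all of them mine: one statement per line, the IF condition on a single line, the GO TO on the very next line, and nothing but simple imperative statements after it. The function name remove_goto_exit is hypothetical.

```python
import re


def remove_goto_exit(body, exit_name):
    """Rewrite 'IF cond / GO TO exit.' plus trailing code as IF NOT (cond) ... END-IF.

    body is the list of source lines between the paragraph header and the
    exit paragraph. Returns a new list; the input is copied unchanged if the
    idiom is not found.
    """
    if_re = re.compile(r"^(\s*)IF\s+(.+?)\s*$", re.IGNORECASE)
    goto_re = re.compile(
        r"^\s*GO\s+TO\s+" + re.escape(exit_name) + r"\s*\.\s*$", re.IGNORECASE
    )
    for i in range(len(body) - 1):
        m = if_re.match(body[i])
        if m and goto_re.match(body[i + 1]):
            indent, cond = m.group(1), m.group(2)
            guarded = []
            for line in body[i + 2:]:
                # A period would end the new IF early, so drop sentence-ending
                # periods. Only safe for simple imperative statements.
                stripped = line.rstrip()
                if stripped.endswith("."):
                    stripped = stripped[:-1]
                guarded.append(("   " + stripped) if stripped.strip() else stripped)
            return (
                body[:i]
                + [f"{indent}IF NOT ({cond})"]
                + guarded
                + [f"{indent}END-IF", f"{indent}."]
            )
    return list(body)
```

A real tool would work on a parse tree rather than on lines, and would have to refuse (or fall back to a switch) when the guarded code contains old-style conditionals that rely on a period to end them.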
0
7/23/2005 4:05:38 PM
In article <11dc2uvk4d1kd66@corp.supernews.com>,
 Louis Krupp <lkrupp@pssw.nospam.com.invalid> wrote:

> The ugly truth is that Oliver, a new hire at his first job, is doing 
> something that sounds a whole lot more interesting and challenging than 
> some of the work the rest of us have been doing for the last 10 or 20 or 
> 30 years, and we're secretly afraid he'll succeed.
> 
> ;)
> 
> Louis

I'm not afraid he will succeed -- I want to license the result for my 
own restructuring projects...
0
7/23/2005 4:06:47 PM
"Joe Zitzelberger" <joe_zitzelberger@nospam.com> wrote in message
news:joe_zitzelberger-2F5770.11542623072005@ispnews.usenetserver.com...
> In article <11due6mb9fhfac3@corp.supernews.com>,
>  "Rick Smith" <ricksmith@mfi.net> wrote:
> >        loop section.
> >            perform varying strlen-1 from 32 by -1
> >              until strlen-1 = 0
> >                if space not = str (strlen-1:1)
> >                    exit perform
> >                end-if
> >            end-perform
> >            .
>
> >        unrolled-loop section.
> >            evaluate space
> >              when not = str (32:1) move 32 to strlen-2
> <snip>
> >              when not = str (1:1) move 1 to strlen-2
> >              when other move 0 to strlen-2
> >            end-evaluate
> >            .
> >        end program unroll.
> > -------
> >
> > Test results
> > --------
> > 00585676
> > 00591950      22.74 00000
> > 00592449       4.99 00000
> > 00592449
> > 00593828      13.79 00016
> > 00594119       2.91 00016
> > 00594119
> > 00594229       1.10 00032
> > 00594328       0.99 00032
> > --------
>
>
> Interesting, but you seem to have intentionally handicapped the loop
> version.
>
> Had you written it like this:
>
>    "perform varying strlen-1 from length of str by -1"
>
> Then you would have had a flexible routine that will work in pretty much
> any case and could be reused frequently.  This benefit would surely
> outweigh the trivial speed gain you would receive from the unrolled-loop
> version.

It is not clear to me that I "intentionally handicapped" anything.
I do not use LENGTH OF because it is non-standard and, with
the exception of COMP-5, I believe the extensions that I did use
(such as, unnamed paragraphs in sections, exit perform, and
partial expressions in the EVALUATE statement) are all
available in the 2002 standard. For flexibility, I would normally
write something like:

    compute x = function length (str)
    perform varying strlen from x by -1 ...

This is, at least, standard. However, flexibility was not a
consideration, code size was.

> Consider also that an optimizing compiler is likely to unroll the loop
> version for you if all of the necessary values are known at compile time.

The optimizing compiler I use does not unroll loops. In fact,
when I enabled some specific optimizations (OPTSPEED and
NOTRICKLE), the loop version slowed and the unrolled loop
sped up for lengths of 0 and 16, but slowed for a length of 32.
There is something happening that I do not yet, and may never,
understand; but the program was unimportant for all but one
reason.

> It is likely that the unrolled version of the code will never recover in
> runtime the amount of typing time the programmer spent writing it.

Which has nothing to do with the reason for the code. The
code was presented to demonstrate that *sometimes* larger
is faster. I was aware of this because I have unrolled loops
before and achieved speed gains on the order of 50 to 800
percent, in long running programs; that is, those programs
where the speed up was important (savings of hours per
execution, sometimes a hundred or more hours per year).



0
ricksmith (875)
7/23/2005 6:43:35 PM
Re:
      compute x = function length (str)
      perform varying strlen from x by -1 ...

In the '02 Standard, one could code a conforming:

   Perform varying strlen from Function Length (str) by -1 ...

And, in fact, if the correct "repository" paragraph was coded (to allow omission 
of "FUNCTION" keyword), one could code:

   Perform varying strlen from Length (str) by -1 ...

which would (in MOST cases) provide the same results as the "extension" code:

   Perform varying strlen from Length of str by -1 ...

-- 
Bill Klein
 wmklein <at> ix.netcom.com
"Rick Smith" <ricksmith@mfi.net> wrote in message 
news:11e5405rddbchf1@corp.supernews.com...
>
> "Joe Zitzelberger" <joe_zitzelberger@nospam.com> wrote in message
> news:joe_zitzelberger-2F5770.11542623072005@ispnews.usenetserver.com...
>> In article <11due6mb9fhfac3@corp.supernews.com>,
>>  "Rick Smith" <ricksmith@mfi.net> wrote:
>> >        loop section.
>> >            perform varying strlen-1 from 32 by -1
>> >              until strlen-1 = 0
>> >                if space not = str (strlen-1:1)
>> >                    exit perform
>> >                end-if
>> >            end-perform
>> >            .
>>
>> >        unrolled-loop section.
>> >            evaluate space
>> >              when not = str (32:1) move 32 to strlen-2
>> <snip>
>> >              when not = str (1:1) move 1 to strlen-2
>> >              when other move 0 to strlen-2
>> >            end-evaluate
>> >            .
>> >        end program unroll.
>> > -------
>> >
>> > Test results
>> > --------
>> > 00585676
>> > 00591950      22.74 00000
>> > 00592449       4.99 00000
>> > 00592449
>> > 00593828      13.79 00016
>> > 00594119       2.91 00016
>> > 00594119
>> > 00594229       1.10 00032
>> > 00594328       0.99 00032
>> > --------
>>
>>
>> Interesting, but you seem to have intentionally handicapped the loop
>> version.
>>
>> Had you written it like this:
>>
>>    "perform varying strlen-1 from length of str by -1"
>>
>> Then you would have had a flexible routine that will work in pretty much
>> any case and could be reused frequently.  This benefit would surely
>> outweigh the trivial speed gain you would receive from the unrolled-loop
>> version.
>
> It is not clear to me that I "intentionally handicapped" anything.
> I do not use LENGTH OF because it is non-standard and, with
> the exception of COMP-5, I believe the extensions that I did use
> (such as, unnamed paragraphs in sections, exit perform, and
> partial expressions in the EVALUATE statement) are all
> available in the 2002 standard. For flexibility, I would normally
> write something like:
>
>    compute x = function length (str)
>    perform varying strlen from x by -1 ...
>
> This is, at least, standard. However, flexibility was not a
> consideration, code size was.
>
>> Consider also that an optimizing compiler is likely to unroll the loop
>> version for you if all of the necessary values are known at compile time.
>
> The optimizing compiler I use does not unroll loops. In fact,
> when I enabled some specific optimizations (OPTSPEED and
> NOTRICKLE), the loop version slowed and the unrolled loop
> sped up for lengths of 0 and 16, but slowed for a length of 32.
> There is something happening that I do not yet, and may never,
> understand; but the program was unimportant for all but one
> reason.
>
>> It is likely that the unrolled version of the code will never recover in
>> runtime the amount of typing time the programmer spent writing it.
>
> Which has nothing to do with the reason for the code. The
> code was presented to demonstrate that *sometimes* larger
> is faster. I was aware of this because I have unrolled loops
> before and achieved speed gains on the order of 50 to 800
> percent, in long running programs; that is, those programs
> where the speed up was important (savings of hours per
> execution, sometimes a hundred or more hours per year).
>
>
> 


0
wmklein (2605)
7/23/2005 7:00:45 PM
"jce" <defaultuser@hotmail.com> wrote in message 
news:_b0Ee.12377$iG6.10246@tornado.tampabay.rr.com...
>
> I was talking more about the typical in use code generators - taking UML, 
> MetaData, Designs, or GUI whizz bang drag and drop apps to create code in 
> which you add functionality.  I know that if you drop a table on a form 
> and attach a database query to fill it, you don't have to write ANY code - 
> even the SQL has a designer these days - I think you should be able to 
> understand what the tool produced.  It _should_ be readable - if it's not 
> the tool should be replaced with one that is.

It depends a lot on what you want to do with the resulting code. I haven't 
done C# programming with Microsoft Visual Studio in a while, but I remember 
that you could have two views of your application: a GUI editing view, and a 
code editing view. You could perform modifications from either view, and 
they would stay in "sync", to a limited degree. That is, MSVS.Net would put 
comments around a block of code basically saying "Don't touch this". If 
you did edit that code, then the GUI editing view would become very confused 
and very bad things would probably happen to your source code. I never really 
explored that path too deeply, so I don't know in detail what happens.

The point is, because the comment made it very clear to me that I wasn't to 
touch that code, at the time I really didn't care what the code looked like, 
or if it was readable (in fact, the MSVS editor hides that portion of the 
code; you could only see it if you opened up your source code file in a 
plain text editor).

Of course, I haven't been in the programming industry long enough to worry 
about "What if my vendor goes bankrupt and/or stops supporting my tools?" I 
can imagine that that is a very valid concern in the long term, but if 
you're relatively confident that you're not gonna need to touch a certain 
piece of code again, and if it's cheaper (in money, time or staff) to use 
the tool that doesn't generate readable code, then certainly you shouldn't 
disregard that option outright.

> If it wasn't important to understand any outputs then everything should 
> generate just 0's and 1's as an output targeted per CPU, right?

    "Everything" is a dangerous word in the above sentence, but of course 
you knew that. I just wanted to point out a situation you might not be 
familiar with in which it's desirable to output source code, even though you 
don't expect anyone to read it.

    Let's say I want to write a tool that generates programs; perhaps one of 
those "instant game makers" where you just drag and drop players and 
enemies, and you'll have a working game, with no programming skills 
necessary. If I knew the C language very well, but I knew almost nothing of 
assembly, it would be a lot easier for me to make my tool generate C code, 
and then call an existing compiler, than to actually study assembly myself, 
as well as compiler theory and optimization theory, and actually generate 
machine code myself.
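
    A small sketch of that idea, by analogy only: here the "tool" emits Python source text and hands it to Python's built-in compile() instead of emitting C and calling a C compiler, so that the example is self-contained. The spec format and the names generate_source, build and make_world are invented for the illustration.

```python
def generate_source(spec):
    """Emit source text for a tiny 'game': each entity gets a start position."""
    lines = ["def make_world():", "    world = {}"]
    for name, (x, y) in sorted(spec.items()):
        lines.append(f"    world[{name!r}] = ({x}, {y})")
    lines.append("    return world")
    return "\n".join(lines) + "\n"


def build(spec):
    """Generate source, then let the existing compiler do the hard part."""
    source = generate_source(spec)
    code = compile(source, "<generated>", "exec")  # built-in compiler
    namespace = {}
    exec(code, namespace)
    return namespace["make_world"]
```

    The generator only has to know how to write source text; everything below that level is delegated to a compiler somebody else already wrote and debugged. Nobody is expected to read the generated text, but it still has to exist as source for that hand-off to work.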

    - Oliver 


0
owong (6177)
7/25/2005 10:03:02 PM