A critique of test-first...

Everyone agrees that testing is important, but what use is it to write 
the unit tests before starting coding (rather than after)?

IMO, its one advantage is that it forces the programmer to focus on the 
problem before starting to code. Of course, this presumes that the 
programmer is one of those who don't spend time thinking about the 
problem, its specification, possible data-structures and algorithms etc.

Other advantages claimed for this rule (I'm quoting from 
http://www.extremeprogramming.org/rules/testfirst.html ) are discussed below.

It is claimed: Test-first helps nail down (document?) the 
requirements/specifications. Clearly, this is true when the 
requirements/specifications are "shallow" - i.e. when one module 
implements one (or many) requirements. On the other hand, as problem 
complexity grows, this becomes false. Consider the requirement "the 
database must implement rollback". This requirement will possibly be 
implemented using multiple modules, each of which will have unit tests. 
These tests, however, won't help nail down any specification/requirement.

It is claimed: Test-first helps define the scope of the module. (This 
para assumes that all "we create our unit tests first", which is 
somewhat contradictory to the test-driven development approach 
recommended 2 paras later). _IF_ one can define the functionality of a 
particular module early, this may make sense. However, I have found that 
the partitioning of work between modules only becomes clearer when a 
certain amount of code has been written.

These two paras make a certain amount of sense in a situation where unit 
tests are a subset of the acceptance tests - i.e. where modules directly 
implement end-user function. They are less applicable in more complex 
problems, where the end-user functionality requires several modules to 
implement.

It is claimed: test-first makes code more testable at a system level. I 
don't see how this follows. Does it matter whether the unit-tests are 
written before or after coding? In fact, does it matter if the 
unit-tests are written by a different team?
Reply ctips (287) 11/15/2004 5:07:06 AM

CTips wrote:

> Everyone agrees that testing is important, but what use is it to write
> the unit tests before starting coding (rather than after)?
>
> IMO, its one advantage is that it forces the programmer to focus on the
> problem before starting to code. Of course, this presumes that the
> programmer is one of those who don't spend time thinking about the
> problem, its specification, possible data-structures and algorithms etc.
>
> Other advantages claimed for this rule (I'm quoting from
> http://www.extremeprogramming.org/rules/testfirst.html ) are discussed
> below.

You may want to read /Code Complete 2nd Edition/ by Steve McConnell on the
topic.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/15/2004 5:14:57 AM


> Everyone agrees that testing is important, but what use is it to write 
> the unit tests before starting coding (rather than after)?

One key benefit, not listed in the page you refer to, is avoiding 
confirmation bias. You start with the hypothesis that your code doesn't 
work, confirm the hypothesis, then write just enough code to "falsify" 
the hypothesis.

Laurent
Reply Laurent 11/15/2004 9:04:57 AM

Laurent Bossavit wrote:
>>Everyone agrees that testing is important, but what use is it to write 
>>the unit tests before starting coding (rather than after)?
> 
> 
> One key benefit, not listed in the page you refer to, is avoiding 
> confirmation bias. You start with the hypothesis that your code doesn't 
> work, confirm the hypothesis, then write just enough code to "falsify" 
> the hypothesis.
> 
> Laurent

There is some truth to that. However, the bias will still exist, partly 
because:
- at the time the test is written, the programmer may also have an 
implementation in mind, so the test and implementation are coupled
- the implementation may pass the test, but have other bugs that will 
show up.

The "write just enough code" approach will make it unlikely that there 
are untested bugs _IF_ the module is simple. Not all modules are. For 
instance, there are modules where one has to write 1kloc+ of code before 
it is possible to run the simplest "real" test.

Alternative approaches to avoid confirmation bias (well, actually to 
ensure that the module is adequately tested) include:
- use a coverage tool to ensure that a particular level of coverage has 
been met
- have a separate tester.
Reply CTips 11/15/2004 12:38:47 PM

On Mon, 15 Nov 2004 00:07:06 -0500, CTips <ctips@bestweb.net> wrote:

>It is claimed: test-first makes code more testable at a system level. I 
>don't see how this follows. Does it matter whether the unit-tests are 
>written before or after coding? In fact, does it matter if the 
>unit-tests are written by a different team?

Yes, it does matter. It's quite common for programmers to write code
that is hard to test when the tests are done after the fact. Much of
the point and value of Michael Feathers' new book on legacy code is
that he shows how to get such code back in testable shape.

When we write the tests first, at a bare minimum, we ensure that tests
can be written. Even if no other benefit accrued, that would be a
significant advantage compared to a lot of code that we see "out
there".

(I couldn't tell from CTips' entire article whether he understands
that TDD is done one test at a time, not a whole bunch of tests, then
code to match. It's really not well-described as "test-first";
"test-driven" does a better job. One alternates between test, code,
back to test.)

Regards,

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/15/2004 12:52:12 PM

On Mon, 15 Nov 2004 07:38:47 -0500, CTips <ctips@bestweb.net> wrote:

>Alternative approaches to avoid confirmation bias (well, actually to 
>ensure that the module is adequately tested) include:
>- use a coverage tool to ensure that a particular level of coverage has 
>been met
>- have a separate tester.

Yes, both these things are of value. A coverage tool will show up
parts of the code that need testing, but it will also often show up
"blind spots" in the testing that we're actually trying to do well.
Observing what's being missed will often help us adjust our style of
work for better coverage.
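
To make the "blind spot" idea concrete, here is a toy sketch in Python. It is not a real coverage tool (for real work one would use an existing tool such as coverage.py); the function classify and the helper lines_executed are hypothetical names invented for this illustration. It records which lines of a function run under a set of test inputs, so the line that never runs stands out.

```python
import sys

def classify(n):            # offset 0
    if n < 0:               # offset 1
        return "negative"   # offset 2
    if n == 0:              # offset 3
        return "zero"       # offset 4
    return "positive"       # offset 5

def lines_executed(func, *args):
    """Return the line offsets (relative to the def line) that ran during func(*args)."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Only record line events that belong to the function under study.
        if frame.f_code is code and event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Restore whatever tracer was active before (a debugger, for example).
        sys.settrace(previous)
    return executed

# The inputs we believed were thorough: one positive, one negative.
covered = lines_executed(classify, 5) | lines_executed(classify, -1)
body = {1, 2, 3, 4, 5}
print("never executed:", sorted(body - covered))
```

With these two inputs the zero case at offset 4 is the line that is never reached, which is the kind of gap a coverage report makes visible.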

Separate testing is also of value. In Extreme Programming, one of the
methods that recommend TDD, this shows up in the "Customer Acceptance
Testing" practice, where independent tests are specified by the
customer team and used to verify overall system behavior.

I am not very favorably inclined to separate testers doing unit
testing, because the feedback comes to me later than I would like,
they often have difficulty understanding what I was doing well enough
to test it, and so on. Even in the case of Customer Acceptance
Testing, I'd like to have the tests automated and available right when
I think I'm done, so that I can run them before passing the code on.

Regards,

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/15/2004 12:56:26 PM

> For instance, there are modules where one has to write 1kloc+ of
> code before it is possible to run the simplest "real" test.

What do you call a "real" test ?

What kind of module would require writing hundreds or a thousand lines 
of code before you could unit test it ?

Laurent
Reply Laurent 11/15/2004 1:34:35 PM

"CTips" <ctips@bestweb.net> wrote in message 
news:10pgebfgsf2jld5@corp.supernews.com...
>
>
> Everyone agrees that testing is important, but what use is it to write the 
> unit tests before starting coding (rather than after)?

As several other posters have mentioned, we're not talking about
writing the tests (plural) before writing the code. In fact, we're
not talking about designing a module, writing the tests, and then
writing the module.

What we're talking about is Test Driven Development, which
says to do some design, write one (singular) failing test, then
write exactly the code required to make that test pass, and
not one keystroke more. Repeat until done.

Designing a module, writing all the tests and then writing
the module is a strategy that has been suggested many times
over the last several decades, and it has never gained
enough traction to be fairly evaluated. It's pretty much regarded
as a failure.

Test Driven Development (write one test, write the code
to make it pass, refactor, repeat) is a success.
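
As a rough illustration of that loop, here is a minimal Python sketch. The leap_year function and its tests are hypothetical and not taken from any project discussed here; the comments record the order in which things would have been written, one failing test at a time.

```python
def leap_year(year):
    # Step 4: added only after test_century_is_not_leap and
    # test_fourth_century_is_leap had been written and seen to fail.
    if year % 100 == 0:
        return year % 400 == 0
    # Step 2: the simplest code that made test_divisible_by_four pass.
    return year % 4 == 0

# Step 1: the first failing test.
def test_divisible_by_four():
    assert leap_year(2004)
    assert not leap_year(2003)

# Step 3: the next failing tests, each written before the code above changed.
def test_century_is_not_leap():
    assert not leap_year(1900)

def test_fourth_century_is_leap():
    assert leap_year(2000)

if __name__ == "__main__":
    for test in (test_divisible_by_four,
                 test_century_is_not_leap,
                 test_fourth_century_is_leap):
        test()
    print("all tests pass")
```

The point of the sketch is the ordering, not the function: at no step is there production code without a test that demanded it.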

I think we've fallen into one of the major traps here: Test Driven
Development is a design technique first, and a testing technique
second. It is (or should be) a well known fact that the suite
of unit tests that come out of TDD is not the same as a professional
tester would write after the fact. That test suite is grossly
insufficient according to classical testing techniques. However,
it works.

> IMO, its one advantage is that it forces the programmer to focus on the 
> problem before starting to code. Of course, this presumes that the 
> programmer is one of those who don't spend time thinking about the 
> problem, its specification, possible data-structures and algorithms etc.

And that's simply wrong. Any developer who doesn't think about
design issues is going to be lost sooner rather than later if they
try to use Test Driven Development.

> Other advantages claimed for this rule (I'm quoting from 
> http://www.extremeprogramming.org/rules/testfirst.html ) are discussed 
> below.
>
> It is claimed: Test-first helps nail down (document?) the 
> requirements/specifications. Clearly, this is true when the 
> requirements/specifications are "shallow" - i.e. when one module 
> implements one (or many) requirements. On the other hand, as problem 
> complexity grows, this becomes false. Consider the requirement "the 
> database must implement rollback". This requirement will possibly be 
> implemented using multiple modules, each of which will have unit tests. 
> These tests, however, won't help nail down any specification/requirement.

We've got another confusion here. You've shifted from programmer
(unit) tests to customer (acceptance) tests. They are two very different
things.

In any case, I wouldn't write "the database must implement rollback"
except to specify which data base to purchase, and that only if the
program required it.

What I would write is something like: "all transactions must leave
the data in the original state if they have to be abandoned before
successful completion." This is a "motherhood" or global requirement;
it should generate tests for each story that deals with a failure.
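
A test for one such failure story might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the transfer function, the InsufficientFunds exception, and the snapshot-and-restore strategy are hypothetical stand-ins, not a claim about how any real database implements this.

```python
import copy

class InsufficientFunds(Exception):
    pass

def transfer(accounts, source, target, amount):
    """Move amount between accounts; restore the original state on failure."""
    snapshot = copy.deepcopy(accounts)
    try:
        # Credit first, so that a failure really does leave work to undo.
        accounts[target] += amount
        if accounts[source] < amount:
            raise InsufficientFunds(source)
        accounts[source] -= amount
    except Exception:
        # Abandoned transaction: put the data back exactly as it was.
        accounts.clear()
        accounts.update(snapshot)
        raise

def test_abandoned_transfer_leaves_data_unchanged():
    accounts = {"alice": 10, "bob": 5}
    before = dict(accounts)
    try:
        transfer(accounts, "alice", "bob", 50)
    except InsufficientFunds:
        pass
    else:
        raise AssertionError("expected the transfer to be abandoned")
    assert accounts == before
```

Each story that can fail would get its own test of this shape; the constraint itself is never "implemented" as a single feature.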

Mike Cohn (User Stories Applied) calls these constraints,
and suggests writing the word "constraint" on story cards
that must be obeyed rather than implemented. See p. 77.

> It is claimed: Test-first helps define the scope of the module. (This para 
> assumes that all "we create our unit tests first", which is somewhat 
> contradictory to the test-driven development approach recommended 2 paras 
> later). _IF_ one can define the functionality of a particular module early, 
> this may make sense. However, I have found that the partitioning of work 
> between modules only becomes clearer when a certain amount of code has been 
> written.
>
> These two paras make a certain amount of sense in a situation where unit 
> tests are a subset of the acceptance tests - i.e. where modules directly 
> implement end-user function. They are less applicable in more complex 
> problems, where the end-user functionality requires several modules to 
> implement.

See the comments I started out with. We don't deal with
module level practices.

> It is claimed: test-first makes code more testable at a system level. I 
> don't see how this follows. Does it matter whether the unit-tests are 
> written before or after coding? In fact, does it matter if the unit-tests 
> are written by a different team?

It's very easy to write difficult-to-test code if you don't have
the tests in front of you. It's even easier if someone else is
doing the testing some time later.

It's practically impossible to write untestable code if you're
doing Test Driven Development. If you're also implementing
story by story, and doing continuous integration, it's practically
impossible to write a system that can't be tested at all
levels.

John Roth 

Reply John 11/15/2004 2:38:15 PM

Have any of the coaches here seen anyone continue to ask questions
like these after they had used TDD, correctly and sustainably, for a few
weeks?

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/15/2004 3:05:22 PM

Laurent Bossavit wrote:

>>For instance, there are modules where one has to write 1kloc+ of
>>code before it is possible to run the simplest "real" test.
> 
> 
> What do you call a "real" test ?
> 
> What kind of module would require writing hundreds or a thousand lines 
> of code before you could unit test it ?
> 
> Laurent

Many compiler optimizations - in one style of writing optimizations, we
perform an analysis phase (which does not perform any observable change
on the output), followed by a transform phase (which does). The
analysis phases can be quite complex - several klocs long. Even if we
break it down to the minimal amount of analysis per transform kind, it
will still amount to 1kloc+. (BTW: this is based on recent experience,
not hypothetical).

Another case is certain kinds of device drivers - you have to get pretty
much all the basic parameters right or the device won't work at all - no
observable behavior. This also might require 1kloc+ code. (Also based on
personal experience).

I suspect that a lot of your experience is in "business applications"
and/or GUI-centric apps, which tend to be "shallow", and other
small-to-medium sized, low complexity apps. Otherwise you'd probably be
able to think of lots of examples of stuff for which passing the "next"
unit test requires writing kloc+ of code.



Reply CTips 11/15/2004 3:42:55 PM

CTips wrote:

> IMO, its one advantage is that it forces the programmer to focus on
> the problem before starting to code. Of course, this presumes that the
> programmer is one of those who don't spend time thinking about the
> problem, its specification, possible data-structures and algorithms
> etc.

Actually, even those programmers who already do all those things might find
writing the tests a helpful tool to concentrate on client needs and to let
the thinking become more concrete.

> Other advantages claimed for this rule (I'm quoting from
> http://www.extremeprogramming.org/rules/testfirst.html ) are discussed
> below.
>
> It is claimed: Test-first helps nail down (document?) the
> requirements/specifications. Clearly, this is true when the
> requirements/specifications are "shallow" - i.e. when one module
> implements one (or many) requirements. On the other hand, as problem
> complexity grows, this becomes false. Consider the requirement "the
> database must implement rollback". This requirement will possibly be
> implemented using multiple modules, each of which will have unit
> tests. These tests, however, won't help nail down any
> specification/requirement.

The web page isn't very clear on this point, but I'd think that it is
speaking more about the implicit requirements on a module - "what does this
class need to do?" You are right that otherwise it doesn't make much sense.

> It is claimed: Test-first helps define the scope of the module. (This
> para assumes that all "we create our unit tests first", which is
> somewhat contradictory to the test-driven development approach
> recommended 2 paras later). _IF_ one can define the functionality of a
> particular module early, this may make sense. However, I have found
> that the partitioning of work between modules only becomes clearer
> when a certain amount of code has been written.

Yes. But you certainly also can't start implementing a module before you
have any idea what it should do. That's why the tests are written in
parallel with the production code, alternating between writing a test and
making it run.

> It is claimed: test-first makes code more testable at a system level.
> I don't see how this follows. Does it matter whether the unit-tests
> are written before or after coding? In fact, does it matter if the
> unit-tests are written by a different team?

If I write the test first, and then the production code to make the test
pass, that production code is testable by definition, isn't it?

Cheers, Ilja


Reply Ilja 11/15/2004 3:59:53 PM

Ronald E Jeffries wrote:
> On Mon, 15 Nov 2004 00:07:06 -0500, CTips <ctips@bestweb.net> wrote:
> (I couldn't tell from CTips' entire article whether he understands
> that TDD is done one test at a time, not a whole bunch of tests, then
> code to match. It's really not well-described as "test-first";
> "test-driven" does a better job. One alternates between test, code,
> back to test.)

I did assume that it was talking about TDD. However, let's look at para 3 
from http://www.extremeprogramming.org/rules/testfirst.html

"It is often not clear when a developer has finished all the necessary 
functionality. Scope creep can occur as extensions and error conditions 
are considered. If we create our unit tests first then we know when we 
are done; the unit tests all run."

This appears to say that test-first limits scope-creep by using the 
tests to specify the functionality. If all the unit-tests are written 
ahead of time, that makes sense. However, if you're using TDD, then this 
para is nonsense. If we can keep adding tests (=> function, scope), then 
we will never "know when we are done" since we can always add more tests.

Also, if you're using TDD, the benefit described in para 2 becomes less 
clear - "Requirements are nailed down firmly by tests." With TDD, we 
will be adding tests incrementally, so in a sense we are making up the 
requirements as we go along.  In fact, the requirements on the module 
won't be completely "nailed down firmly by tests" until the last test is 
written.

There seems to be a disconnect between TDD and those 2 paras. Was it 
ever the case that extreme programming first advocated all (or most) 
tests first, and then switched to TDD?
Reply CTips 11/15/2004 5:05:30 PM

CTips <ctips@bestweb.net> wrote in message news:<10pgebfgsf2jld5@corp.supernews.com>...
> It is claimed: Test-first helps define the scope of the module. (This 
> para assumes that all "we create our unit tests first", which is 
> somewhat contradictory to the test-driven development approach 
> recommended 2 paras later). _IF_ one can define the functionality of a 
> particular module early, this may make sense. However, I have found that 
> the partitioning of work between modules only becomes clearer when a 
> certain amount of code has been written.

I agree, and this is one of the reasons for doing TDD. As you evolve the code,
you refactor the design. The tests are there to buttress those changes. 
Refactoring code may require tests to be modified, but the modifications 
are relatively trivial, and it's good to know that everything is still
working after you've moved things around, tweaked the api, etc. 

CTips, you seem interested enough in TDD and XP to criticise its practices,
but I have not read anything from you about trying it. I think it would
be pretty cool if you went to a conference to try it out. Maybe someone
like Robert Martin would be willing to do a session with you. If you come
out of it with something positive to say, then it would be good PR for
XP, but if not, I think your criticisms would be concrete instead of 
abstract.
Reply vladimir_levin 11/15/2004 5:13:46 PM

CTips wrote:

> I suspect that a lot of your experience is in "business applications"
> and/or GUI-centric apps, which tend to be "shallow", and other
> small-to-medium sized, low complexity apps. Otherwise you'd probably
> be able to think of lots of examples of stuff for which passing the
> "next" unit test requires writing kloc+ of code.

Isn't a basic strategy to handle such complex units to compose them from 
several more simple units?

Curious, Ilja 


Reply Ilja 11/15/2004 5:35:09 PM

  This is a very good thread.  Thank you CTips and all the rest of you.
It cleared up a few misconceptions that I've not heard any of you mention
yet.

  I'm not sold on TDD itself, but still interested in the concept.
Since I have had good teachers and experience, almost everything you
have mentioned is already part of my work plan.  I tend to think
and design rather deep though and most of the refactoring is done
in my head before the base concepts are written down.

  I write tests as I go as well.  This has annoyed other programmers
a bit, since it makes my code base larger.  However, everyone 
seems to understand how to use and generally maintain the code I've
produced.  It has also served well that any failure in a system
should leave some trace of why that happened.  I test all code
changes before checking in and never bother with Release vs. Debug
code -- it's all Release code and must be diagnosable at a distance.
But that is also the world I live in.

  David
Reply David 11/15/2004 5:38:49 PM

Vladimir Levin wrote:
> CTips <ctips@bestweb.net> wrote in message news:<10pgebfgsf2jld5@corp.supernews.com>...
> 
>>It is claimed: Test-first helps define the scope of the module. (This 
>>para assumes that all "we create our unit tests first", which is 
>>somewhat contradictory to the test-driven development approach 
>>recommended 2 paras later). _IF_ one can define the functionality of a 
>>particular module early, this may make sense. However, I have found that 
>>the partitioning of work between modules only becomes clearer when a 
>>certain amount of code has been written.
> 
> 
> I agree, and this is one of the reasons for doing TDD. As you evolve the code,
> you refactor the design. The tests are there to buttress those changes. 
> Refactoring code may require tests to be modified, but the modifications 
> are relatively trivial, and it's good to know that everything is still
> working after you've moved things around, tweaked the api, etc. 
> 
> CTips, you seem interested enough in TDD and XP to criticise its practices,
> but I have not read anything from you about trying it. I think it would
> be pretty cool if you went to a conference to try it out. Maybe someone
> like Robert Martin would be willing to do a session with you. If you come
> out of it with something positive to say, then it would be good PR for
> XP, but if not, I think your criticisms would be concrete instead of 
> abstract.

Here's a question - remember the pseudo-hangman example you posted early 
this year? It took me about 3 hours to code and debug a _complete_ 
hangman, including dictionary management and I/O, _NOT_ using TDD. ( 
source is at http://users.bestweb.net/~ctips/hangman.c )

How long had you been working on the problem, using TDD? Do you think 
if you hadn't tried doing it with TDD but just gone ahead and written 
it, adding tests as and when it seemed appropriate, you might have been 
able to get it done quicker?

Note that I have *tried* to do things TDD; unfortunately, it became 
apparent that TDD is a completely inadequate way of doing things.

- With TDD you write code to pass test #1, then #1 & #2 and so on. If, 
at some point along the process, you discover that the best way of doing 
#1...#n is to use a different approach (say, to use a table-driven 
approach instead of a switch statement) then you have to rewrite the 
code ("refactor"). If, however, you had written it using a table-driven 
approach in the first place you would have saved yourself a lot of time.

- There are many situations in which it is not possible to write tests. 
As an extreme example, consider implementing synchronization primitives. 
They have to be proved to be correct.

- The problem is quite often complex enough that the first non-trivial 
test requires most of the work.

I suspect that TDD works better for tackling simpler problems in less 
complex domains, and at lower productivities.
Reply CTips 11/15/2004 9:21:02 PM

> It took me about 3 hours to code and debug a _complete_ 
> hangman, including dictionary management and I/O, _NOT_ using TDD.

Having written it that way, do you think it would be very difficult to 
do that one over, this time test-driven ? Since you know where you're 
going to end up, massive rewrites shouldn't be a concern. Also...

> #1...#n is to use a different approach (say, to use a table-driven 
> approach instead of a switch statement) then you have to rewrite the 
> code ("refactor").

Refactor doesn't mean "rewrite". It means a small, reversible change 
which leaves all tests running. The accumulation of such transformations 
tends to stress the code in such a way that interface ("what") and 
implementation ("how") drift away from each other; that happens to be a 
desirable property of code.
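
The switch-to-table case quoted above can be sketched in Python as follows. The names apply_branching, apply_table, OPERATIONS and check are hypothetical, and the sketch compresses what would in practice be several smaller steps into a before and an after. What it shows is that the same tests run unchanged against both versions, which is what makes the change a refactoring rather than a rewrite.

```python
import operator

def apply_branching(op, a, b):
    """Before: a chain of branches, grown one test at a time."""
    if op == "+":
        return a + b
    elif op == "-":
        return a - b
    elif op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")

# After: the same behavior expressed as a lookup table.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}

def apply_table(op, a, b):
    func = OPERATIONS.get(op)
    if func is None:
        raise ValueError(f"unknown operator: {op}")
    return func(a, b)

def check(apply):
    """The existing tests, parameterized over the implementation under test."""
    assert apply("+", 2, 3) == 5
    assert apply("-", 2, 3) == -1
    assert apply("*", 2, 3) == 6
    try:
        apply("/", 2, 3)
    except ValueError:
        pass
    else:
        raise AssertionError("unknown operator should be rejected")

if __name__ == "__main__":
    check(apply_branching)
    check(apply_table)
    print("both versions pass the same tests")
```

Whether starting with the table would have been quicker is a separate question; the sketch only shows that the interface and the tests do not move while the implementation does.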

Laurent
Reply Laurent 11/15/2004 9:51:44 PM

> This appears to say that test-first limits scope-creep by using the 
> tests to specify the functionality. [...] If we can keep adding tests
> (=> function, scope), then  we will never "know when we are done" since
> we can always add more tests.

Note that there are two kinds of scope creep - requirements inflation 
(the customer wants more than she originally said) and gold-plating 
(developers beef up the code more than needed to make customer happy).

Tests are involved in limiting scope creep at more than one level of 
abstraction; system-level tests pin down the "big picture" view of what 
the system does - while smaller (unit) tests pin down the implementation 
details. Think of drawing an outline, then filling it in.

There is one level of tests that enjoys a special status - "acceptance 
tests", which serve as a vehicle for requirements. If you go all the way 
up to that level, then requirements inflation might occur, but 
gold-plating is less likely.

Laurent
Reply Laurent 11/15/2004 9:55:55 PM

> Even if we break it down to the minimal amount of analysis per transform kind,
> it will still amount to 1kloc+. (BTW: this is based on recent experience,
> not hypothetical).

I suspect we differ on what we call a "unit" test, or a "real" test.

Here is a thought experiment. Suppose you suspect that, somewhere in 
these 1000 lines of optimizer code for a given transformation, there is 
one nasty bug.

I would suppose that you will instrument the code for debugging - put a 
printf in there, or activate one that you've already put there for that 
purpose.

The next step might be running the compiler on a test source file that 
is carefully designed to exercise, among other paths, one that causes 
the printf statement to be activated.

Now we have a string that has been output to stdout. If the debug 
instrumentation is at all helpful, we have a precise idea what that 
string should be. If the string actually output differs from that, we 
have a "smoking gun" that the bug is indeed, in the main, what and where 
we suspected.

We could make this more efficient by writing a small function that 
examines stdout for us, and compares the particular line in the output 
that we're interested in with the value we expect. Couldn't we ?
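
Such a helper could look something like this Python sketch. It captures stdout in-process rather than from a separately launched compiler, which is a simplifying assumption; count_loads stands in for the instrumented analysis code and, like find_debug_line, is a hypothetical name.

```python
import contextlib
import io

def count_loads(instructions, debug=False):
    """Hypothetical analysis step: count the load instructions in a block."""
    loads = sum(1 for ins in instructions if ins.startswith("load"))
    if debug:
        print("some unrelated progress message")
        print(f"debug: loads={loads}")
    return loads

def find_debug_line(func, prefix, *args, **kwargs):
    """Run func, capture what it prints, and return the first line starting with prefix."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    for line in buffer.getvalue().splitlines():
        if line.startswith(prefix):
            return line
    return None

def test_debug_line_reports_two_loads():
    block = ["load r1", "add r1 r2", "load r3"]
    line = find_debug_line(count_loads, "debug:", block, debug=True)
    assert line == "debug: loads=2"
```

The comparison that was being done by eye is now done by the test, and the unrelated output is ignored.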

Maybe we don't need to exercise the whole compiler, either. The analysis 
is, I suppose, running over part of the partially compiled code - mostly 
assembly with unresolved jump locations, or somesuch. Maybe we can store 
the binary representation of that instead of the source file. And do we 
really need more than a dozen or a few dozen bytes of it in order to 
expose the bug ? I may be wrong (I've only written one compiler in my 
life, with fairly simple peephole optimizations), but I'll assume that's 
about right.

Here's the one part I'm not 100% sure of. Is the 1000 lines of analysis 
code one big function, or is it broken down further into functions ? In 
the latter case, presumably we can massage the input data further still, 
until we know exactly what *would* be passed into the function that has 
the debug printf, if we were encountering the bug in real use. Instead 
of a printf, perhaps we can devise other ways to get the output back - 
say in a return value, or a parameter passed by reference.

So we can write a test function that:
- constructs the input data we need
- passes it into the relevant function of the analysis module
- receives an encoded "status" string as one result of the function
- compares the string it received with an "expected" string
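
The four steps above might look roughly like this in C. This is only a sketch under assumptions: `analyze_add`, the `struct insn` input and the status strings are hypothetical stand-ins for whichever function of the analysis module holds the debug printf; they are not taken from any real compiler.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical instruction: r<dst> = r<src> + imm. */
struct insn {
    int dst;
    int src;
    int imm;
};

/* Hypothetical analysis routine. Instead of printf-ing its verdict,
   it writes an encoded status string into a caller-supplied buffer. */
void analyze_add(const struct insn *in, char *status, size_t len)
{
    if (in->imm == 0)
        snprintf(status, len, "copy r%d<-r%d", in->dst, in->src);
    else
        snprintf(status, len, "keep");
}

/* The unit test: build the input, call the function, and compare
   the status it hands back with the string we expect. */
void test_add_zero_is_a_copy(void)
{
    struct insn in = { 3, 1, 0 };   /* r3 = r1 + 0 */
    char status[32];

    analyze_add(&in, status, sizeof status);
    assert(strcmp(status, "copy r3<-r1") == 0);
}
```

Returning the status through a buffer rather than stdout is the design choice that lets the test avoid running the whole compiler.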

Well, surprise - that test function is an XP-style unit test. Ex 
hypothesi (unless I've made a mistake in the above) it tests something 
that is considerably less than 1000 lines of code. (I haven't put a 
number on that size, but it's "whatever the function length you consider 
reasonable"). Is it a "real" test ? Well, again ex hypothesi, I wrote it 
because of an actual bug, so presumably it tests something relevant.

Instead of writing these tests when a bug surfaces, when arguably it is 
too late, XP suggests that they should be written before the code. This 
helps developers focus on testability in their design (they can't get 
away with a 1000-line function), avoids confirmation bias, and helps 
with debugging in much the same way that instrumentation does, when a 
rare bug does surface. Though actually the beginning of this paragraph 
should read "In addition to", not "Instead of" - in XP it's considered a 
best practice to write a new test for any bug found.

Laurent
Reply Laurent 11/15/2004 10:23:53 PM

On Mon, 15 Nov 2004 12:05:30 -0500, CTips <ctips@bestweb.net> wrote:

>Ronald E Jeffries wrote:
>> On Mon, 15 Nov 2004 00:07:06 -0500, CTips <ctips@bestweb.net> wrote:
>> (I couldn't tell from CTips' entire article whether he understands
>> that TDD is done one test at a time, not a whole bunch of tests, then
>> code to match. It's really not well-described as "test-first";
>> "test-driven" does a better job. One alternates between test, code,
>> back to test.)
>
>I did assume that it was talking about TDD. However, let's look at para 3
>from http://www.extremeprogramming.org/rules/testfirst.html
>
>"It is often not clear when a developer has finished all the necessary 
>functionality. Scope creep can occur as extensions and error conditions 
>are considered. If we create our unit tests first then we know when we 
>are done; the unit tests all run."
>
>This appears to say that test-first limits scope-creep by using the
>tests to specify the functionality. If all the unit-tests are written
>ahead of time, that makes sense. However, if you're using TDD, then this
>para is nonsense. If we can keep adding tests (=> function, scope), then 
>we will never "know when we are done" since we can always add more tests.

"Wisdom begins when we discover the difference between 'that makes no
sense' and 'I don't understand'" -- Mary Doria Russell

Permit me please, to help increase your understanding:

What we find in doing TDD, is that when we switch from adding one bit
of function to writing the next test, it's easier to notice that we
really don't need the next bit. I'm not sure why that would be the
case, but many people report the same effect, so I'm rather sure it's
real.
>
>Also, if you're using TDD, the benefit described in para 2 becomes less 
>clear - "Requirements are nailed down firmly by tests." With TDD, we 
>will be adding tests incrementally, so in a sense we are making up the 
>requirements as we go along.  In fact, the requirements on the module 
>won't be completely "nailed down firmly by tests" until the last test is 
>written.

We are /translating/ the requirements that we have in our head, or in
what the customer told us or wherever we got them. Since TDD is
commonly done at the /unit/ level, it's like a kind of design. We
might have said to ourselves, "We need to write a linked list class",
written down some requirements or just winged them, and coded it up
and then tested. Or using TDD, we'd write a series of tests like
"emptylist.next() returns null", and so on, and again stop when we're
done.
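
As a rough illustration of such a series of tests, here is a sketch in C. The `struct list`, `struct node` and `list_next` names are hypothetical, chosen only to mirror the "emptylist.next() returns null" example; each test would be written, and made to pass, before the next one is added.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list; the names are illustrative only. */
struct node {
    int value;
    struct node *next;
};

struct list {
    struct node *head;
};

/* Return the node after n, or NULL when there is none. */
struct node *list_next(const struct node *n)
{
    return n ? n->next : NULL;
}

/* Test 1: asking an empty list for its next element yields NULL. */
void test_empty_list_next_is_null(void)
{
    struct list empty = { NULL };
    assert(list_next(empty.head) == NULL);
}

/* Test 2, written only after test 1 passes. */
void test_next_of_first_is_second(void)
{
    struct node second = { 2, NULL };
    struct node first = { 1, &second };
    assert(list_next(&first) == &second);
    assert(list_next(&second) == NULL);
}
```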

But when we stop, the tests define the requirements that we used to
write the tests and code. The requirements are now written, whether
they were on paper elsewhere or not, in executable code. Therefore,
"nailed down".
>
>There seems to be a disconnect between TDD and those 2 paras. Was it
>ever the case that extreme programming first advocated all (or most) 
>tests first, and then switched to TDD?

Not to my recollection. In early days we said you had to test
"everything that could possibly break" but we wrote most of the tests
after the fact. Then the test-first thing came along, where we write
them, basically, one at a time.

Regards,

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/16/2004 2:50:02 AM

On Mon, 15 Nov 2004 16:21:02 -0500, CTips <ctips@bestweb.net> wrote:

>Note that I have *tried* to do things TDD; unfortunately, it became 
>apparent that TDD is a completely inadequate way of doing things.

For you, perhaps. It would be interesting to work with you on
something to see what we'd discover. I'd bet we'd each learn something
new.
>
>- With TDD you write code to pass test #1, then #1 & #2 and so on. If, 
>at some point along the process, you discover that the best way of doing
>#1...#n is to use a different approach (say, to use a table-driven 
>approach instead of a switch statement) then you have to rewrite the 
>code ("refactor"). If, however, you had written it using a table-driven 
>approach in the first place you would have saved yourself a lot of time.

Yes, if we had thought of it. But by your hypothesis, we didn't think
of it, in which case, we'd have to rewrite it no matter whether we
were doing TDD or not, n'est-ce pas?
>
>- There are many situations in which it is not possible to write tests. 
>As an extreme example, consider implementing synchronization primitives. 
>They have to be proved to be correct.

Yes, there are some situations where tests are not sufficient --
though I am aware of no situation where tests literally cannot be
written. And sync primitives /should/ probably be proven correct. But
I'd wager that many have been written that were not.

However, the number of such cases, where testing is not adequate,
while perhaps large in absolute numbers, is not in my experience, a
large percentage of the code that needs to be written.
>
>- The problem is quite often complex enough that the first non-trivial 
>test requires most of the work.

I've felt that way sometimes, but what I've found is usually that I
just hadn't thought of the simple starting point yet.
>
>I suspect that TDD works better for tackling simpler problems in less 
>complex domains, and at lower productivities.

Well, I don't know. I've worked in a lot of domains that people
consider complex and I'd use TDD most everywhere. As for productivity,
I know it raises mine, but perhaps if I were as good as you report
your teams to be, it wouldn't help me, or wouldn't help as much. 

For me, TDD avoids defects, and saves time because I do far less
debugging. If a team avoided defects some other way -- especially by
just being incredibly smart, as opposed to some time-consuming way
such as extensive reviews -- then TDD might not help so much.

I can't remember the last time I encountered a team that had a very,
very low defect rate, but I'm sure they are out there somewhere. And
maybe they don't need TDD. Someday maybe I'll get to observe such a
team and find out.

regards,

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/16/2004 2:57:12 AM

Laurent Bossavit wrote:
>>It took me about 3 hours to code and debug a _complete_ 
>>hangman, including dictionary management and I/O, _NOT_ using TDD.
> 
> 
> Having written it that way, do you think it would be very difficult to 
> do that one over, this time test-driven ? Since you know where you're 
> going to end up, massive rewrites shouldn't be a concern. Also...
> 
> 
>>#1...#n is to use a different approach (say, to use a table-driven 
>>approach instead of a switch statement) then you have to rewrite the 
>>code ("refactor").
> 
> 
> Refactor doesn't mean "rewrite". It means a small, reversible change 
> which leaves all tests running. The accumulation of such transformations 
> tends to stress the code in such a way that interface ("what") and 
> implementation ("how") drift away from each other; that happens to be a 
> desirable property of code.
> 
> Laurent

Let's take an example. I want to do a peephole optimization that converts
z = x + 0
into
z = x
OK, so that's test#1, and I write a bunch of C/C++/what-have-you to
implement that.

Now, test #2 is
t = x + K, K is a constant
z = t + L, L is a constant
----------
z = x + (K+L)

and so on and so forth, recognizing more patterns. Each pattern 
generates, say, an average of about 30 lines of C. There are
eventually going to be more than 500 such patterns, resulting in about 
15kloc of code.
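
To make the hand-written approach concrete, here is a minimal sketch of the first two patterns in C. The tiny instruction representation (`struct insn`, `OP_COPY`, `OP_ADDI`) and the function names are hypothetical, invented for this illustration only. Pattern 2 rewrites just the second instruction and assumes a separate pass removes "t = x + K" if t turns out to be dead.

```c
#include <assert.h>

enum op { OP_COPY, OP_ADDI };   /* dst = src   |   dst = src + k */

struct insn {
    enum op op;
    int dst, src, k;
};

/* Pattern 1: z = x + 0  becomes  z = x. Returns 1 if it fired. */
int fold_add_zero(struct insn *i)
{
    if (i->op == OP_ADDI && i->k == 0) {
        i->op = OP_COPY;
        return 1;
    }
    return 0;
}

/* Pattern 2: t = x + K; z = t + L  becomes  z = x + (K+L).
   The guard a->dst != a->src rejects "x = x + K", where x itself
   changes and the rewrite would read the wrong value. */
int fold_add_chain(const struct insn *a, struct insn *b)
{
    if (a->op == OP_ADDI && b->op == OP_ADDI &&
        b->src == a->dst && a->dst != a->src) {
        b->src = a->src;
        b->k += a->k;
        return 1;
    }
    return 0;
}

void test_patterns(void)
{
    struct insn z = { OP_ADDI, 2, 1, 0 };   /* r2 = r1 + 0 */
    struct insn t = { OP_ADDI, 3, 1, 4 };   /* r3 = r1 + 4 */
    struct insn w = { OP_ADDI, 2, 3, 5 };   /* r2 = r3 + 5 */
    int fired;

    fired = fold_add_zero(&z);
    assert(fired && z.op == OP_COPY);

    fired = fold_add_chain(&t, &w);
    assert(fired && w.src == 1 && w.k == 9); /* r2 = r1 + 9 */
}
```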

At some point, it should become clear that it is much more efficient and 
much less error-prone to write a tool that will automatically convert 
descriptions such as
t = x + y
z = t - y
---------
z = x
into the C code required to implement them. This tool takes less than 
2kloc, and the descriptions total another 2 kloc, resulting in a net 
savings of 75% of the total effort (actually, it's much more, because the
descriptions are much harder to get wrong than the code).

At this point you throw away all the implementation code which you wrote 
by hand.

Now, if you had been doing TDD you would either have written 15kloc or 
you would have written some amount of code, then after realizing the 
right approach, had to throw it all away.

If you had been smart, thought about the problem, and decided to 
implement the solution the right way from the beginning, you'd have to 
write about 700loc or so of code before test#1 passed, and probably 
another 700loc before test#2 passed.
Reply CTips 11/16/2004 4:00:36 AM

Laurent Bossavit wrote:
>>This appears to say that test-first limits scope-creep by using the
>>tests to specify the functionality. [...] If we can keep adding tests
>>(=> function, scope), then  we will never "know when we are done" since
>>we can always add more tests.
> 
<snip>
> Tests are involved in limiting scope creep at more than one level of 
> abstraction; system-level tests pin down the "big picture" view of what 
> the system does - while smaller (unit) tests pin down the implementation 
> details. Think of drawing an outline, then filling it in.
> 

That's specious. Tests specify the functionality that the module
implements. If you can add more tests, then you can increase the 
functionality/scope of the module (or vice-versa). TDD does not specify 
when to stop adding tests. Therefore, TDD does not address scope creep 
in any way.
Reply CTips 11/16/2004 4:05:35 AM

David wrote:

>   I'm not sold on TDD itself, but still interested in the concept.
> Since I have had good teachers and experience, almost everything you
> have mentioned is already part of my work plan.  I tend to think
> and design rather deep though and most of the refactoring is done
> in my head before the base concepts are written down.

Everyone can take a given program some distance in their head. Call this
distance X. I'm the first to admit my average X is less than most
programmers. But when X runs out, that's where TDD comes in. You have to do all of X
via TDD so that when you run off the end of your ability to design in your
head, you can still keep going.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/16/2004 4:52:29 AM

CTips wrote:

> That's specious. Tests specify the functionality that the module
> implements. If you can add more tests, then you can increase the
> functionality/scope of the module (or vice-versa). TDD does not specify
> when to stop adding tests. Therefore, TDD does not address scope creep
> in any way.

I think Ron once said, "Add tests until fear turns into boredom".

Each test must show an unbroken chain: Requirement->feature->test->code. If
you can't think of a new test to fulfill a requirement, you are allowed to
add more test-last, but you are not allowed to

Tests, alone, naturally cannot stop the scope creep. However, they provide a
very powerful system for engineers to stop it. Only with wall-to-wall tests
can you _remove_ lines, and see if they weren't needed.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces



Reply Phlip 11/16/2004 5:10:58 AM

CTips <ctips@bestweb.net> wrote in message news:<10pi7f8baa3sibe@corp.supernews.com>...
> Here's a question - remember the pseudo-hangman example you posted early
> this year? It took me about 3 hours to code and debug a _complete_ 
> hangman, including dictionary management and I/O, _NOT_ using TDD. ( 
> source is at http://users.bestweb.net/~ctips/hangman.c )
> 
> How long had you been working on the problem, using TDD? Do you think 
> if you hadn't tried doing it with TDD but just gone ahead and written 
> it, adding tests as and when it seemed appropriate, you might have been 
> able to get it done quicker?

I don't remember how long it took me at the time (2, 3 hours?). I have
written several similar types of programs since then, games like
blackjack, pong, breakout, bouncing ball in a box... I teach some
classes in my spare time, so that gives me a chance to play with
simple programs a fair bit... I'd estimate my productivity with these
to be about the same as without TDD. I'd say the time taken up writing
the tests seems to roughly make up for the dumb mistakes I introduce
into the code as I modify it, which the tests help me to notice. In
the case of the simple ball games, I found the tests to be a very
convenient way to check my collision detection code. This is just my
own experience, but I'd say the code I write with TDD is more "robust"
than without it. I think of more things to test with TDD than I would
if I just sat down, wrote the code, then debugged it. Also, I find
that the code breaks down into smoother separation of implementation
and interface, which is a quality I consider to be desirable...
Anyway, while I am sure you're a very productive programmer, much
better than average, I am not so sure that your productivity would
really suffer from TDD once you got used to it. I think you'd find
benefits such as being able to slide in changes where before you'd
have had to re-write the module entirely to accommodate...
 
> If, however, you had written it using a table-driven 
> approach in the first place you would have saved yourself a lot of time.

This is the kind of situation where I would expect TDD to really shine.
You re-write the implementation, and the tests still pass. Maybe I am
mistaken... I do think XP allows for just hacking out a bunch of code,
but you have to throw it away and re-write it using TDD before you can
actually use it as part of the production codebase. So if you're not
sure how to do something, you can hack some code until you know you're
going in the right direction, then quickly write the "correct"
implementation using TDD. That way you have tests that back up the
fact stuff still works when you start making some changes/enhancements
months later.
 
> - There are many situations in which it is not possible to write tests. 
> As an extreme example, consider implementing synchronization primitives. 
> They have to be proved to be correct.

Fair enough. The whole area of concurrency is pretty difficult. I am
fairly comfortable with the idea that you can abandon TDD if it's too
hard or just not worthwhile in a specific case. But in the end, no
matter what you're developing, I believe that 85% of the time or more,
TDD should not be a problem.
 
> - The problem is quite often complex enough that the first non-trivial 
> test requires most of the work.
> - I suspect that TDD works better for tackling simpler problems in less 
> complex domains, and at lower productivities.

I am inclined to disagree with the 2 comments above. I really believe
you ought to be able to break things down into smaller units and that
this is a desirable goal (see Laurent's posting...). Also, no matter
what you are developing, it is ultimately made up of layers of
components of relatively simpler complexity... Robert Martin pointed
out at a talk he gave recently that the 1,000,000 line program was
once a humble 1,000 line program. There is probably a degenerate case
somewhere that violates this principle. I imagine a true AI will be
some horrible, fundamentally un-scramblable knot of code, but that's
not what the vast majority of programs are about...
Reply vladimir_levin 11/16/2004 6:06:07 AM

Phlip wrote:

> CTips wrote:
>
> > That's specious. Tests specify the functionality that the module
> > implements. If you can add more tests, then you can increase the
> > functionality/scope of the module (or vice-versa). TDD does not specify
> > when to stop adding tests. Therefore, TDD does not address scope creep
> > in any way.
>
> I think Ron once said, "Add tests until fear turns into boredom".
>
> Each test must show an unbroken chain: Requirement->feature->test->code. If
> you can't think of a new test to fulfill a requirement, you are allowed to
> add more test-last, but you are not allowed to

add more failing tests and make them pass.

> Tests, alone, naturally cannot stop the scope creep. However, they provide a
> very powerful system for engineers to stop it. Only with wall-to-wall tests
> can you _remove_ lines, and see if they weren't needed.
>
> -- 
>   Phlip
>   http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
>
>
>


Reply Phlip 11/16/2004 6:24:23 AM

> Lets take an example. I want to do a peephole optimization that converts
> [...]
> Now, if you had been doing TDD you would either have written 15kloc or 
> you would have written some amount of code, then after realizing the 
> right approach, had to throw it all away.

If I had done that, then I would not have been doing TDD.

The rules of TDD require me to write dead-simple, "naive" code for the 
first peephole optimization. You had just the right instinct for which 
optimization to start with, by the way - the simplest one possible.

The rules of TDD require me to again write dead-simple code for the 
second peephole optimization, *but* they also require me to ensure that 
there is no code duplication at all in the analysis module that results. 
I'll repeat that: no duplication at all.

To get rid of duplication, I might well have to start moving toward a 
process driven by data tables, or some such. What matters is that the 
edict, "no duplication", tends to drive the code to ever greater levels 
of abstraction. By the 100th pattern I might be nowhere near 3000 lines.
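
As one possible picture of where removing duplication could lead, here is a minimal table-driven sketch in C. Everything in it (the `identity_rule` table, the opcodes, `apply_identity_rules`) is a hypothetical illustration, not a description of any real optimizer; it only covers patterns of the form "z = x <op> k becomes z = x".

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_COPY, OP_ADD, OP_SUB, OP_MUL };

struct insn {
    enum op op;
    int dst, src, k;            /* dst = src <op> k; k unused for OP_COPY */
};

/* One row per pattern "z = x <op> k  becomes  z = x". */
struct identity_rule {
    enum op op;
    int k;
};

const struct identity_rule rules[] = {
    { OP_ADD, 0 },              /* x + 0 */
    { OP_SUB, 0 },              /* x - 0 */
    { OP_MUL, 1 },              /* x * 1 */
};

/* A single matcher shared by every row; returns 1 if a rule fired. */
int apply_identity_rules(struct insn *i)
{
    size_t n;

    for (n = 0; n < sizeof rules / sizeof rules[0]; n++) {
        if (i->op == rules[n].op && i->k == rules[n].k) {
            i->op = OP_COPY;
            return 1;
        }
    }
    return 0;
}
```

Adding a new identity pattern then means adding a row and a test, not another hand-written matcher.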

If you had already done this before, you might choose to drive the 
implementation toward the approach you mention, a code generator. I'm 
not fond of code generation myself, so I might choose differently.

By this time, a distinction will have started to emerge between the 
coarser-grained tests that match an input pattern of register moves and 
operations to an expected output pattern - and true unit tests, which 
exercise the low-level functionality of the code generator, or pattern 
matcher, or whatever our implementation of choice is. The former 
category describes the functionality of the analysis module, the latter 
describe its design. It's quite likely that by now the test harness is 
capable of using the transformation specs *themselves* as an input
format.

At any rate, by this time the functionality of the analysis module is 
completely covered by tests, so that any regression introduced, say, in 
modifying the generator or matcher to be able to handle a whole new 
class of optimization is detected right away.

I would expect that someone with experience in designing peephole 
optimizers would have no major difficulty driving the design toward a 
solution known to be serviceable, but they would also end up with code 
much less likely to suffer from regressions (more robust in that sense) 
than if they'd written 700LOC then one test, then 700 further LOC and 
one further test.

Laurent
http://bossavit.com/thoughts/
Reply Laurent 11/16/2004 8:12:05 AM

On Mon, 15 Nov 2004 23:05:35 -0500, CTips <ctips@bestweb.net> wrote:

>Laurent Bossavit wrote:
>>>This appears to say that test-first limits scope-creep by using the
>>>tests to specify the functionality. [...] If we can keep adding tests
>>>(=> function, scope), then  we will never "know when we are done" since
>>>we can always add more tests.
>> 
><snip>
>> Tests are involved in limiting scope creep at more than one level of 
>> abstraction; system-level tests pin down the "big picture" view of what 
>> the system does - while smaller (unit) tests pin down the implementation 
>> details. Think of drawing an outline, then filling it in.
>> 
>
>That's specious. Tests specify the functionality that the module
>implements. If you can add more tests, then you can increase the 
>functionality/scope of the module (or vice-versa). TDD does not specify 
>when to stop adding tests. Therefore, TDD does not address scope creep 
>in any way.

One might similarly say that reviewers might miss defects, and
therefore reviews do not address defects in any way.

I find, and many people with whom I talk find, that because a test
/specifies/ a requirement, whereas the code /implements/ a
requirement, TDD helps us stop when it's time to stop.

I suspect it's because as we contemplate writing the next test, "OK,
what if he would like to remove three elements at once and then get
the next element", it's easier to recognize that we're going beyond
our current need. As we write the code, we see that if we do it just
this way, with only a little more effort, we can arrange it so that
you can have an extra parameter that says how many elements to remove
before doing next, so we just do it.

Developing the habit of writing the test is equivalent to writing down
a new requirement, which seems to be enough to get us, frequently, to
realize that we don't really need that thing.

Logically, it might seem that we could write tests forever and
therefore scope creep would not be addressed. In practice, that's not
what happens. I think it's a psychology thing, not a logic thing.

Regards,

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/16/2004 1:36:14 PM

Laurent Bossavit wrote:

> If I had done that, then I would not have been doing TDD.
>
> The rules of TDD require me to write dead-simple, "naive" code for the
> first peephole optimization. You had just the right instinct for which
> optimization to start with, by the way - the simplest one possible.

Okay, I'm seventeen times smarter than Laurent, however I do that too. I'm
so smart that I know a good design for everything, and I'm smart enough to
_refrain_ from implementing it. The code that passes my tests is nothing but

    SCAFFOLDING TO SUPPORT THE TESTS.

Put another way, given a choice between losing the code or the tests, I
would lose the code and keep the tests. They are where all the features and
all the design decisions are stretched out and visible, not crammed together
via refactoring.

> The rules of TDD require me to again write dead-simple code for the
> second peephole optimization, *but* they also require me to ensure that
> there is no code duplication at all in the analysis module that results.
> I'll repeat that: no duplication at all.

Now the fun begins. The tests are like huge magnets forming a confinement
field, and the code is like a droplet of Bose-Einstein condensate in the
middle. The more tests we turn on, the smaller the droplet gets, and the
more superfluid becomes its trapped nuclear matter.

Okay, that's /too/ smart. Try again.

What we mean is that to seek the minimal but most elegant code to pass the
tests, we put the code thru repeated passes of adding features and removing
duplication. This anneals the code. And even if I planned the design it
eventually arrives at, TDD will most likely arrive at a simpler design in a
shorter amount of time than I could have predicted.

> To get rid of duplication, I might well have to start moving toward a
> process driven by data tables, or some such. What matters is that the
> edict, "no duplication", tends to drive the code to ever greater levels
> of abstraction. By the 100th pattern I might be nowhere near 3000 lines.
>
> If you had already done this before, you might choose to drive the
> implementation toward the approach you mention, a code generator. I'm
> not fond of code generation myself, so I might choose differently.

Okay, here the idea was there's a quantum leap (or possibly a very long
leap) between the design that simple tests lead to, and the best design for
all of them. That's still good.

SOMETIMES YOU THROW THAT ELEGANT DESIGN AWAY.

But you do it by following TDD's other rule: "no more than 10 edits before
passing tests". After you grow a design, and get it reviewed by your peers
(even those with 1/17th of your intellijence), you might decide to replace
it.

You do that by adding a test that forces the beginning of the new system to
exist. You leave the other system online while the new system grows. Then
you start replacing the old system with the new one, at its call sites, one
by one. Then you erase the old system, and you then seek opportunities to
refactor _everywhere_ based on the features of the new system. (This is
Substitute Algorithm Refactor.)

TDD gives these benefits:

 - lots of tests, most of which don't care what the design is
 - the ability to remove code and see if tests pass
 - the ability to deploy, release, or deliver, _during_ a refactor
 - the ability to review the design based on its testage
 - the ability to escalate testage into customer tests
 - the force to rapidly find a minimal and elegant design
 - the ability to replace that design without a blackout
 - the ability to continuously integrate
 - the ability to swap modules with colleagues

Other systems give those benefits too, but not so easily.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/16/2004 2:58:55 PM

Laurent Bossavit wrote:
>>Even if we break it down to the minimal amount of analysis per transform kind,
>>it will still amount to 1kloc+. (BTW: this is based on recent experience,
>>not hypothetical).
> 
> 
> I suspect we differ on what we call a "unit" test, or a "real" test.
> 

  <snip: description of method to have analysis phase dump output and 
then check that output for correctness>

Every now and then we have someone who suggests doing something like that.

  However, here's the problem: the analysis phase is broken down into 
several routines. A 1kloc phase might have between 10 and 25 such 
routines; let's call it 20 routines averaging 50 loc.

  Each analysis routine takes in a graph and either annotates the graph 
or creates auxiliary data-structures (or both). So, if the input graph 
has 25 nodes + 25 edges, and creates 4 pieces of information for each, 
that's 200 output values.

So, per test, we'd have to figure out and write 200 lines of test to test 50
loc. Since these routines have multiple paths, I'd expect that the total
testing effort following the suggested strategy would be, oh, let's say
several 1000 entries - probably a 20x to 50x effort compared to actually 
writing the function.

If you wait until the transformation phase, you get to check the 
analysis phase by seeing if the transforms actually happened correctly. 
  That is relatively straight-forward - compile and run the code to see 
if it gives the same result with and without the optimization.
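
A toy version of that "same result with and without the optimization" check can be sketched in C as follows. The three-field instruction format, the `run` interpreter and the `optimize` pass are hypothetical and deliberately tiny; a real compiler would compile and execute actual test programs instead.

```c
#include <assert.h>

struct insn { int dst, src, k; };   /* r[dst] = r[src] + k */

#define NREGS 4

/* Reference interpreter: execute n instructions over registers r. */
void run(const struct insn *code, int n, int *r)
{
    int i;

    for (i = 0; i < n; i++)
        r[code[i].dst] = r[code[i].src] + code[i].k;
}

/* Optimization under test: t = x + K; z = t + L  becomes
   z = x + (K+L). Skipped when the first instruction overwrites x. */
void optimize(struct insn *code, int n)
{
    int i;

    for (i = 0; i + 1 < n; i++) {
        if (code[i + 1].src == code[i].dst &&
            code[i].dst != code[i].src) {
            code[i + 1].src = code[i].src;
            code[i + 1].k += code[i].k;
        }
    }
}

/* Returns 1 if plain and optimized code leave identical registers. */
int same_result(int x)
{
    struct insn plain[] = { { 1, 0, 4 }, { 2, 1, 5 }, { 3, 2, -9 } };
    struct insn opt[]   = { { 1, 0, 4 }, { 2, 1, 5 }, { 3, 2, -9 } };
    int ra[NREGS] = { 0 }, rb[NREGS] = { 0 };
    int i;

    ra[0] = rb[0] = x;
    optimize(opt, 3);
    run(plain, 3, ra);
    run(opt, 3, rb);
    for (i = 0; i < NREGS; i++)
        if (ra[i] != rb[i])
            return 0;
    return 1;
}
```

The check never inspects the intermediate analysis data; it only compares observable results, which is why it is cheap to write.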

Specifically, we can usually exercise a phase "completely" with about 
1x-4x the effort of actually writing it. (Completely means at the very 
least branch level coverage, but usually includes correlated paths - 
i.e. if what happens in A0 impacts B0 then some test will exercise a 
path containing A0 and B0). The extra overhead includes time to
- add extra self-check code
- add extra debug trace code
- write phase-specific tests/test-drivers/test-generators etc.
- find bugs and isolate them

There are other techniques you can use to build confidence in program
correctness and to elicit bugs, other than simply testing (much less TDD).
They are also _much_ higher return for effort. Unfortunately, people who 
have worked on simple problems have never been forced into situations 
where they need to develop skill in those kinds of techniques.
Reply CTips 11/16/2004 3:30:35 PM

Phlip wrote:

> David wrote:
> 
> 
>>  I'm not sold on TDD itself, but still interested in the concept.
>>Since I have had good teachers and experience, almost everything you
>>have mentioned is already part of my work plan.  I tend to think
>>and design rather deep though and most of the refactoring is done
>>in my head before the base concepts are written down.
> 
> 
> Everyone can take a given program some distance in their head. Call this
> distance X. I'm the first to admit my average X is less than most
> programmers. But when X runs out, that's where TDD comes in. You have to do all of X
> via TDD so that when you run off the end of your ability to design in your
> head, you can still keep going.
> 

Next time you have to develop a module do the following: think about the 
problem. What routines need to be implemented? What data-structures are 
needed? What algorithms will be needed? Try and visualize what the final
code will look like. Don't write a single line of code until you're 
confident that you have a good idea about the final picture. In fact, 
don't even sit at the computer - just think. At most, use a pencil and 
paper.

Now, when you're going to start coding write out all the external 
functions that module is going to contain, and the description for them 
(If it was C, I'd say write the .h file first). When you're going to 
write a function, start off by writing comments of the form

   /* first do this ... */
   /* then do this ... */
   /* finally do this ... */

and then start filling in the code.
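
For illustration only, here is what one function might look like after the comments have been filled in. The `reveal_guess` function and its prototype are hypothetical (loosely themed on the hangman program mentioned earlier in the thread, not taken from it); the prototype stands in for the line that would have been written in the .h file first.

```c
#include <assert.h>
#include <string.h>

/* From the (hypothetical) header, written before any code:
   uncover every occurrence of guess in shown, and return how many
   letters were newly uncovered. Hidden letters are shown as '_'. */
int reveal_guess(const char *word, char *shown, char guess);

int reveal_guess(const char *word, char *shown, char guess)
{
    int revealed = 0;
    size_t i;

    /* first walk the secret word ... */
    for (i = 0; word[i] != '\0'; i++) {
        /* then uncover each still-hidden match ... */
        if (word[i] == guess && shown[i] == '_') {
            shown[i] = guess;
            revealed++;
        }
    }
    /* finally report how many letters were uncovered */
    return revealed;
}
```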

After you've done this a few times, you'll get a handle on how to write 
  a module without having to resort to sub-optimal intermediate points.
Reply CTips 11/16/2004 4:26:56 PM

What about TDD as regression testing? Isn't that worth something CTips?
Stede

Reply stedetro 11/16/2004 4:41:37 PM

stedetro@yahoo.com wrote:

> What about TDD as regression testing? Isn't that worth something CTips?
> Stede
> 

It's better than nothing. But is it better than the alternatives?

For instance, consider testing to achieve a particular level of coverage.
- will TDD give you that level of coverage? It won't give path coverage, 
that's for sure. It should give statement coverage (if you write tests to
cover error/undefined scenarios).
- will TDD give a minimal set of tests for that level of coverage? In 
many cases, I don't expect it to.

In fact, post-facto white-box testing will probably yield a "better" (in
the sense of coverage) and smaller test suite than that developed by TDD.
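
The gap between those coverage levels shows up even in a toy function. The sketch below is a hypothetical example (`clamp_and_scale` is invented for this purpose): one call reaches every statement, two calls take each branch both ways, but all four paths need four calls.

```c
#include <assert.h>

/* Two independent branches give four distinct paths. */
int clamp_and_scale(int x, int doubled)
{
    if (x < 0)
        x = 0;
    if (doubled)
        x *= 2;
    return x;
}

/* These two calls take each branch both ways (branch coverage);
   the first call alone already executes every statement. */
void test_branch_coverage(void)
{
    assert(clamp_and_scale(-5, 1) == 0);
    assert(clamp_and_scale(3, 0) == 3);
}

/* Path coverage also needs the two remaining combinations. */
void test_remaining_paths(void)
{
    assert(clamp_and_scale(-5, 0) == 0);
    assert(clamp_and_scale(3, 1) == 6);
}
```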
Reply CTips 11/16/2004 5:02:04 PM

Makes sense CTips. What about Design by Contract? Is that TDD? Do you
feel that will help? Maybe a combination of Design by Contract,
Regression Testing and maybe a little TDD for finding the simplest way
to build a class?

Reply stedetro 11/16/2004 5:09:50 PM

stedetro@yahoo.com wrote:

> Makes sense CTips. What about Design by Contract? Is that TDD? Do you
> feel that will help? Maybe a combination of Design by Contract,
> Regression Testing and maybe a little TDD for finding the simplest way
> to build a class?
> 

Design by contract is an interesting idea. Unfortunately, it's also more 
overhead than it's worth in most situations.

Let's look at the whole idea of preconditions, postconditions and 
invariants. They are expressions on the state of the program that are 
(supposed to be) true before, after, and during the execution of a 
piece of code. Quite often, the actual condition ends up being very hard 
to express correctly.

For instance, how do I specify the post-condition for the factorial 
function?
   n = fact(i);
   /* at the end of this n is i! */
To actually check this would require rewriting the factorial function 
some other way.
   n = fact(i);
   assert( n == fact_1(i) );
So, in practice, one would just leave the post-condition as a comment. 
Which kind of defeats the purpose.

When conditions get very complex, then one has several options:
- leave them as documentation only, in a somewhat fuzzy (non-executable) 
form. This allows any number of misunderstandings to creep through (I 
thought you meant *that*! No, I didn't)
- convert them to an executable form, which may require more work than 
other approaches
- use simpler executable conditions and leave the rest as documentation. 
e.g. replace
	assert( i > 0 && n == fact_1(i) );
with
	assert( i > 0 ); /* n == i! */
- change the behavior of the function so that the conditions can be 
simplified.
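
A minimal sketch of the third option (simple executable conditions, the 
rest as documentation) might look like the following. The upper bound 
FACT_MAX_ARG and the divisibility check are assumptions added for this 
sketch; the sketch also accepts i == 0, which the assert above does not. 
The full post-condition n == i! still lives only in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* Largest argument whose factorial fits in 64 bits: 20! < 2^64 < 21!.
   (Bound chosen for this sketch.) */
#define FACT_MAX_ARG 20

/* Returns i!.  Precondition: 0 <= i <= FACT_MAX_ARG. */
uint64_t fact(int i)
{
    /* executable precondition */
    assert(i >= 0 && i <= FACT_MAX_ARG);

    uint64_t n = 1;
    for (int k = 2; k <= i; k++)
        n *= (uint64_t)k;

    /* Cheap, partial postcondition: n is positive and, for i >= 1,
       a multiple of i.  Full postcondition (n == i!) is documentation
       only, since checking it means computing the factorial again. */
    assert(n >= 1);
    assert(i == 0 || n % (uint64_t)i == 0);
    return n;
}
```

The partial check would catch some bugs (an off-by-one loop bound that 
skips the final multiplication, for example) but nowhere near all of 
them, which is the trade-off being described.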

Another problem with pre-, post- and invariant conditions in complex 
situations is that coming up with the "right" conditions is pretty 
difficult (particularly the invariants). Here "right" means that they 
correctly capture the state, are weak enough to be written succinctly, 
and are strong enough to detect many bugs.
Reply CTips 11/16/2004 6:49:52 PM

>   <snip: description of method to have analysis phase dump output and 
> then check that output for correctness>
> Every now and then we have someone who suggests doing something like that.

That was a thought experiment, not a suggestion. The point was to show 
that it can't be necessary to write 1000 lines of code before you write 
one "real" test, if by "real" you mean "can expose a bug".

In practice I would test-drive the whole thing from scratch, not start 
from a less than ideally testable design as I did in the thought 
experiment.

> So, per test, we'd have to figure out and write 200 lotest to test 50 
> loc.

I refer you to my passages on refactoring, especially the bit about 
removing duplication. Reread the bit about removing duplication. Your 
unit testing effort does not increase linearly with the number of code 
chunks, neither is it a combinatorial function of the dimensions along 
which input data can vary.

Rather, think of the unit tests produced by TDD as that many lemmas in a 
very long proof. Or think of them as executable examples, chosen for 
pedagogical value in the problem domain.

Laurent
Reply Laurent 11/16/2004 8:01:33 PM

Laurent Bossavit wrote:

>>  <snip: description of method to have analysis phase dump output and 
>>then check that output for correctness>
>>Every now and then we have someone who suggests doing something like that.
> 
> 
> That was a thought experiment, not a suggestion. The point was to show 
> that it can't be necessary to write 1000 lines of code before you write 
> one "real" test, if by "real" you mean "can expose a bug".
> 
> In practice I would test-drive the whole thing from scratch, not start 
> from a less than ideally testable design as I did in the thought 
> experiment.

What's the largest program you wrote yourself? How long did it take?
What's the most complex algorithm you've implemented? How long was it? 
Did you do either of them TDD?

I tend to write about 100klocs of production/production-quality code 
every year (I'm not counting things like the test code itself). It's 
slipped now that I'm doing customer/sales/management kind of stuff, but 
I'm still trying to get to about half that. And it's not exactly 
simple stuff either - at this point a large part of what I write 
involves development of new algorithms as well as the coding.

So permit me to say this - if you can hit those kinds of productivities 
using TDD and the kinds of testing approaches you're advocating, more 
power to you. If you can't, then maybe you should consider how to change 
your approach to get there.

> 
>>So, per test, we'd have to figure out and write 200 lotest to test 50 
>>loc.
> 
> 
> I refer you to my passages on refactoring, especially the bit about 
> removing duplication. Reread the bit about removing duplication. Your 
> unit testing effort does not increase linearly with the number of code 
> chunks, neither is it a combinatorial function of the dimensions along 
> which input data can vary.


Sorry, I think you're not getting it. If I have to check for the 
correctness of one run through a function, and that function generates 
200 values, then I have to check 200 values to ensure that it is 
correct. If I am very lucky I might be able to check it using less than 
200 lines of code/test/result-specification {say, for instance, that all 
4 of these nodes must have the same annotation}, but in practice it 
doesn't work that way.

If I have a typical 50 line function then it probably will need 8 
different tests to get minimal (branch) coverage. That means that to 
test a single function adequately, I need to write 1600 lines for the 
tests - that is a 32-to-1 ratio of test code to actual code.

Now, in a 1000 loc block, I have 20 such functions. To test each one 
separately, I'd have to write 32kloc.

Note: no combinatorial explosion anywhere. Just simple multiplication:
200 output values/function/test x 8 tests/function x 20 functions = 
32kloc of test values.
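
The multiplication can be written out as a small sketch. The constants 
are the figures quoted above; the assumption that each checked output 
value costs one line of test is made for this sketch, and the function 
names are hypothetical.

```c
/* Figures from the post above. */
enum {
    VALUES_PER_RUN  = 200,  /* output values checked per test run */
    TESTS_PER_FUNC  = 8,    /* tests for minimal branch coverage  */
    LINES_PER_FUNC  = 50,   /* size of a typical function         */
    FUNCS_PER_BLOCK = 20    /* functions in the block             */
};

/* Lines of test per function, at one line per checked value. */
long test_lines_per_function(void)
{
    return (long)VALUES_PER_RUN * TESTS_PER_FUNC;
}

/* Lines of test for the whole block. */
long test_lines_per_block(void)
{
    return test_lines_per_function() * FUNCS_PER_BLOCK;
}

/* Lines of production code in the block. */
long code_lines_per_block(void)
{
    return (long)LINES_PER_FUNC * FUNCS_PER_BLOCK;
}
```

That is 1600 test lines per function and 32000 test lines against 1000 
lines of code, the same 32-to-1 ratio at both levels.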

> Rather, think of the unit tests produced by TDD as that many lemmas in a 
> very long proof. Or think of them as executable examples, chosen for 
> pedagogical value in the problem domain.
> 
> Laurent


Reply CTips 11/16/2004 8:46:58 PM

Ronald E Jeffries wrote:


> I can't remember the last time I encountered a team that had a very
> very low defect rate, but I'm sure they are out there somewhere. And
> maybe they don't need TDD. Someday maybe I'll get to observe such a
> team and find out.
> 

You know you have a standing invitation. Swing by whenever you're in the 
area. I'll make sure you get access to our bug database, and see if I 
can arrange access to our customer help e-mails.
Reply CTips 11/16/2004 8:49:33 PM

Your thoughts make a lot of sense CTips. Now for the really big
question. How do I get a low defect rate like you without doing TDD?

Stede

Reply stedetro 11/16/2004 8:53:44 PM

"CTips" <ctips@bestweb.net> wrote in message 
news:10pkckbdlk5n36f@corp.supernews.com...
> stedetro@yahoo.com wrote:
>
>> What about TDD as regression testing? Isn't that worth something CTips?
>> Stede
>>
>
> It's better than nothing. But is it better than the alternatives?
>
> For instance, consider testing to achieve a particular level of coverage.
> - will TDD give you that level of coverage? It won't give path coverage, 
> that's for sure. It should give statement coverage (if you write tests to 
> cover error/undefined scenarios).
> - will TDD give a minimal set of tests for that level of coverage? In many 
> cases, I don't expect it to.
>
> In fact, post-facto white-box testing will probably yield a "better" (in 
> the sense of coverage) and smaller test suite than that developed by TDD.

It depends on what you're looking for. TDD isn't
a testing methodology first. It's a design methodology
first and a testing methodology second.

TDD normally gives statement coverage in the high
90s, and branch coverage in the middle 90s. As far
as other coverage metrics, I haven't a clue - I haven't
seen either measurements or theoretical arguments.

We've had some comparisons of TDD and classical
testing methodologies in the past. The general conclusion
was that TDD gave a much *smaller* set of tests than
the testing gurus thought was an adequate set,  by a
factor of at least five.

I normally don't worry overmuch about a "minimal"
set of tests. I understand that's a consideration in
classical testing, but it's irrelevant in XP for one very
simple reason: tests *must* be written to run as fast
as possible, because you will be running literally
hundreds of them on every edit and compile cycle.

As a very high level rule of thumb, I'd expect
an average of around 5 loc for each test written.
Since TDD does one test and then the code to
make it pass on each pass through the loop, the
problem must be constructible in that kind of
very small step. There may very well be areas
where this isn't possible, and compiler optimization
algorithms may very well be one of them.

Let's just say that I have my doubts. What
I'd consider as an adequate demonstration is
someone who has a lot of experience with TDD
trying it and deciding that it's not an appropriate
area to apply that technique.

John Roth 

Reply John 11/16/2004 9:51:07 PM

> Whats the largest program you wrote yourself? 

I haven't tended to use LOC as a metric. I've preferred counting how 
many people used what I wrote, for instance.

Anyway, I could well be wrong about that, but I suspect the program I 
alluded to, which included a compiler, might have been the largest 
single program I wrote by myself. It ran to 30KLOC of Java code, which 
according to Capers Jones is equivalent to about 72KLOC of C code.

There was a C++ project to which I was a contributor for a year and a 
bit, where I might have produced quite a bit more, but I no longer have 
access to the source.

> How long did it take?

I recall a little under three months. I did a number of other projects 
that same year, for an aggregate size of, well, whatever it was. Again, 
I wasn't counting.

> Whats the most complex algorithm you've implemented? How long was it?

I couldn't say - depends on what you mean by "complex". The most 
challenging at the time might have been that RTP protocol implementation 
for an Internet telephony thingie, but it was tiny.

> Did you do either of them TDD?

Nope. I've tended to shift my focus even further away from "pissing 
contest" statistics since I started doing TDD, for reasons semi-directly 
related to my motivations in doing so.

One of my pleasant accomplishments using a mix of techniques derived 
from a then-fresh understanding of TDD was *removing* 15KLoc from a Java 
program of about 45K initially, *adding* functionality in the process 
(and quite a bit of robustness).

By my own assessment, TDD has been a productivity boost, but I couldn't 
care less about showing it in terms of LOC.

Laurent
Reply Laurent 11/16/2004 10:27:36 PM

"John Roth" <newsgroups@jhrothjr.com> wrote in message:

> It depends on what you're looking for. TDD isn't
> a testing methodology first. It's a design methodology
> first and a testing methodology second.

That is really true. I didn't get that until the Fibonacci example in Kent's
book. I guess I am slow.

>
> TDD normally gives statement coverage in the high
> 90s, and branch coverage in the middle 90s. As far
> as other coverage metrics, I haven't a clue - I haven't
> seen either measurements or theoretical arguments.

I don't know what metric you are using but you and CTips have vastly
different numbers. How is something like this provable? How can we have
scientific data?

John, you must understand how frustrating this can get when you hear two
experts with such vastly different data sets. It is like hearing 50% of
economists saying outsourcing is good and another 50% saying it is bad. It
just makes people who want to learn so much less faithful in the whole
process.

> We've had some comparisons of TDD and classical
> testing methodologies in the past. The general conclusion
> was that TDD gave a much *smaller* set of tests than
> the testing gurus thought was an adequate set,  by a
> factor of at least five.

Can you show it? Prove it? I am not saying this in an adversarial tone.

> I normally don't worry overmuch about a "minimal"
> set of tests. I understand that's a consideration in
> classical testing, but it's irrelevant in XP for one very
> simple reason: tests *must* be written to run as fast
> as possible, because you will be running literally
> hundreds of them on every edit and compile cycle.
>
> As a very high level rule of thumb, I'd expect
> an average of around 5 loc for each test written.
> Since TDD does one test and then the code to
> make it pass on each pass through the loop, the
> problem must be constructible in that kind of
> very small step. There may very well be areas
> where this isn't possible, and compiler optimization
> algorithms may very well be one of them.
>
> Let's just say that I have my doubts. What
> I'd consider as an adequate demonstration is
> someone who has a lot of experience with TDD
> trying it and deciding that it's not an appropriate
> area to apply that technique.
>

That certainly makes sense.

Stede

> John Roth
>


Reply Stede 11/16/2004 11:33:57 PM

CTips wrote:

I'm obviously not Laurent, but I will answer anyway:

> Whats the largest program you wrote yourself?

The main project I am currently working on has approximately 0.5 million 
LOC.

> How long did it take?

It has now been running for more than six years. Currently there are 4 developers 
in the core team.

> Whats the most complex algorithm you've implemented? How long was it?
> Did you do either of them TDD?

Probably the most complex algorithm I know of that was implemented TDD style 
is an iso-area algorithm.

Basically that algorithm gets a big number of interpolation points for a 
curved surface. The algorithm interpolates more points of the surface, finds 
points that are at the same height, builds polygons from them, joins 
polygons that are at the same height and touch each other, has to take care 
of holes in the polygons etc. It's actually more complicated than it might 
sound to you - the original algorithm that was recently replaced by the 
test-driven one never worked without nasty bugs.

I will take a look at its code size at work...

> I tend to write about 100klocs of production/production-quality code
> every year (I'm not counting things like the test code itself). Its
> slipped now that I'm doing customer/sales/management kind of stuff,
> but I'm still trying to get to about that half that. And its not
> exactly simple stuff either - at this point a large part of what I
> write involves development of new algorithms as well as the coding.
>
> So permit me to say this - if you can hit those kinds of
> productivities using TDD and the kinds of testing approaches you're
> advocating, more power to you. If you can't, then maybe you should
> consider how to change your approach to get there.

What I found is that with TDD I am more productive *by writing less code*. 
That's not only because it prevents code bloat, but also because it's a 
*design technique* that helps me build better-decoupled, and therefore 
more reusable, code.

> Sorry, I think you're not getting it. If I have to check for the
> correctness of one run through a function, and that function generates
> 200 values, then I have to check 200 values to ensure that it is
> correct.

It certainly doesn't generate all those 200 values at once - that is, there 
probably is some code that handles not more than one value at a time? Could 
you test that code in isolation?

> If I have a typical 50 line function then it probably will need 8
> different tests to get minimal (branch) coverage.

I don't like 50 line functions - they are simply too big for my taste.

> That means that to
> test a single function adequately, I need to write 1600 lines for the
> tests - that is a 32-to-1 ratio of test code to actual code.
>
> Now, in a 1000 loc block, I have 20 such functions. To test each one
> separately, I'd have to write 32kloc.

That seems to assume that you can't reuse any testing code. That's not my 
experience.
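
As a hedged sketch of what reusing test code can look like in C: the 
helper below, first_mismatch, is hypothetical and not taken from 
anyone's code base. It checks a whole array of outputs against one 
expected value, so a property such as "all of these nodes carry the same 
annotation" costs one call rather than one line per value.

```c
#include <stddef.h>

/* Hypothetical shared test helper.
   Returns the index of the first element of values[0..n-1] that
   differs from expected, or n if every element matches. */
size_t first_mismatch(const int *values, size_t n, int expected)
{
    for (size_t i = 0; i < n; i++)
        if (values[i] != expected)
            return i;
    return n;
}
```

A test then reads as a single check, e.g. 
assert(first_mismatch(annotations, count, WANTED) == count), and a 
failure reports which position went wrong. Whether the 200 values in 
question share enough structure for helpers like this is exactly the 
point under dispute.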

Cheers, Ilja 


Reply Ilja 11/16/2004 11:37:19 PM

stedetro@yahoo.com wrote:

> What about TDD as regression testing? 

One of the things that occurs to me is that if you write a test, write 
code, write a test, etc. cycle, you are basically designing your tests 
with assumptions about the implementation.  Now suppose you change the 
implementation.  Let's say you need to increase performance and can do 
this by handling an input parameter x in two ranges: 0 - 15, and 16 or 
greater, rather than a single range as originally designed.  If the 
original test didn't test these ranges separately, the test after change 
might not catch a new bug.  How do you prevent such cracks opening when 
you refactor?
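
To make the scenario concrete, here is a sketch in C. Both functions are 
hypothetical stand-ins: square_ref plays the original single-range 
implementation and square_fast the optimized one with a table for 0 - 15 
and a general path for 16 and up. Comparing the two over a range that 
straddles the new boundary is one possible guard, not a claim about how 
TDD practitioners actually handle it.

```c
/* Hypothetical original: one code path for every input. */
unsigned square_ref(unsigned x)
{
    return x * x;
}

/* Hypothetical optimized version: table lookup for 0..15,
   general path for 16 or greater. */
unsigned square_fast(unsigned x)
{
    static const unsigned table[16] = {
        0, 1, 4, 9, 16, 25, 36, 49,
        64, 81, 100, 121, 144, 169, 196, 225
    };
    if (x <= 15)
        return table[x];
    return x * x;
}

/* Counts inputs in [0, limit] where the two versions disagree.
   Intended for small limits that straddle the 15/16 boundary. */
unsigned count_disagreements(unsigned limit)
{
    unsigned bad = 0;
    for (unsigned x = 0; x <= limit; x++)
        if (square_fast(x) != square_ref(x))
            bad++;
    return bad;
}
```

A test written against the original that only tried, say, x = 3 would 
keep passing even if the 16-and-up path were broken; the values 15 and 
16 only become interesting once the implementation changes.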

Thad

Reply Thad 11/17/2004 12:03:10 AM

Laurent Bossavit wrote:
>>Whats the largest program you wrote yourself? 
> 
> 
<snip> ... It ran to 30KLOC of Java code [in 3 months]
> 
> There was a C++ project to which I was a contributor for a year and a 
> bit, where I might have produced quite a bit more, but I no longer have 
> access to the source.
<snip>
> 
>>Whats the most complex algorithm you've implemented? How long was it?
>
> I couldn't say - depends on what you mean by "complex". The most 
> challenging at the time might have been that RTP protocol implementation 
> for an Internet telephony thingie, but it was tiny.
>
>>Did you do either of them TDD?
>  
> Nope.
 > By my own assessment, TDD has been a productivity boost, but I
 > couldn't  care less about showing it in terms of LOC.

This set of questions wasn't meant to question whether TDD was or was 
not a productivity boost, but to see what size of program you'd 
applied it to. The answer appears to be: small programs only.

Now, the question is what makes you think that it is better than (or 
even remotely comparable to) other approaches for building medium and 
large sized programs with a "reasonable" degree of correctness.

It can't be your personal experience, because you've never used TDD on 
even a medium-sized or medium-complexity program.

Let's hear it from other people - what is the largest or most complex or 
most challenging (or some combination) program that you wrote using TDD?

[Note that I have never worked on a large software program - most of my 
experience is with medium-sized programs, in the 100k to 300kloc range. 
I typically consider  50-100kloc small, 100kloc - 500kloc medium, 
500kloc+ large]
Reply CTips 11/17/2004 12:41:08 AM

CTips wrote:
snipped

> Lets hear it from other people - what is the largest or most complex  or 
> most challenging (or some combination) program that you wrote using TDD?


We are using TDD (and XP in fact) on (IMO) a medium sized application. 
It's a telecomms equipment Network Management System. Multiple Servers 
managing telecomms equipment and providing that data and manageability 
to multiple clients.

We use CORBA as the RPC mechanism between each (n equipment <-> n 
servers <-> n clients)

The system currently supports Fault and Configuration Management and in 
due course will provide Auditing, Performance and Security management of 
  the telecomms equipment.

> 
> [Note that I have never worked on a large software program - most of my 
> experience is with medium-sized programs, in the 100k to 300kloc range. 
> I typically consider  50-100kloc small, 100kloc - 500kloc medium, 
> 500kloc+ large]

These numbers are IMO meaningless. It's totally dependent upon the 
language, persistence mechanisms, RPC technology, etc which are always 
going to be different for people here on this NG to be able to compare.

For example, a C++ program can usually be much larger than its Java 
equivalent, yet the same program developed in Ruby can be even smaller.

Then there's auto-generated code. Our system's CORBA IDL is only a few 
hundred LOC, but generates 1000s of lines of Java code.

Also, anyone can write lots of code. If productivity is measured by LoC, 
then people tend to just write lots of code, rather than well factored 
code that does the same job, but in fewer loc.

How about we compare functionality instead?
Reply Andrew 11/17/2004 12:58:17 AM

Thad Smith wrote:

> One of the things that occurs to me is that if you write a test, write
> code, write a test, etc. cycle, you are basically designing your tests
> with assumptions about the implementation.  Now suppose you change the
> implementation.  Let's say you need to increase performance and can do
> this by handling an input parameter x in two ranges: 0 - 15, and 16 or
> greater, rather than a single range as originally designed.  If the
> original test didn't test these ranges separately, the test after change
> might not catch a new bug.  How do you prevent such cracks opening when
> you refactor?

By making the tests more hyperactive than they need to be (to incidentally
test more details than the end-result needs), and by running all the tests
after the fewest possible edits. If refactoring fails a test, you hit undo
and try again. Either your code will squirm around within the very small
space the tests allow it to, or you will give up and make the most
frequently failing tests fuzzier. These are both good - especially if you
include a dose of "thinking about the problem space" along with the
relentless testing.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/17/2004 1:14:02 AM

Stede Troisi wrote:

> "John Roth" <newsgroups@jhrothjr.com> wrote in message:
> 
> 
>>It depends on what you're looking for. TDD isn't
>>a testing methodology first. It's a design methodology
>>first and a testing methodology second.
> 
> 
> That is really true. I didn't get that until the Fibonacci example in Kent's
> book. I guess I am slow.
> 
> 
>>TDD normally gives statement coverage in the high
>>90s, and branch coverage in the middle 90s. As far
>>as other coverage metrics, I haven't a clue - I haven't
>>seen either measurements or theoretical arguments.
> 
> 
> I don't know what metric you are using but you and CTips have vastly
> different numbers. How is something like this provable? How can we have
> scientific data?
> 
> John, you must understand how frustrating this can get when you hear two
> experts with such vastly different data sets. It is like hearing 50% of
> economists saying outsourcing is good and another 50% saying it is bad. It
> just makes people who want to learn so much less faithful in the whole
> process.
> 

Actually, we're not disagreeing about the coverage - I do agree that it 
is possible to use strict TDD and get 90% statement coverage. Branch 
coverage will depend on the kinds of programs one writes, but for what 
TDD gets used for, it's probably reasonable.

I'm also not arguing about TDD giving a productivity boost. Given how 
non-systematically many (most?) programmers write code, any structure 
will show a productivity boost.

However, after a point, TDD starts getting in the way. Consider an ideal 
scenario - you spend time figuring out what the final code should look 
like, you code it, and then generate a minimal set of tests. Clearly, 
this is faster than coding a test, writing the code, possibly 
refactoring, and so on. You avoid all the overhead of extra tests and of 
extra refactoring.

Advocates of TDD don't believe that this ideal is achievable (or at 
least, not achievable by most programmers). However, all of the most 
productive programmers I know actually work this way.

I don't think that using TDD will cause long term damage to a 
programmer's development. I'd probably say it's like training wheels - 
eventually you need to move beyond them.
Reply CTips 11/17/2004 1:25:13 AM

I get your point now. So I guess you don't use TDD at all because you are
past the training wheel stage? If you do use a little TDD what is the best
way to determine when to use it and when not?

Thanks,
Stede

"CTips" <ctips@bestweb.net> wrote in message
news:10pla3r86k7k013@corp.supernews.com...
> Stede Troisi wrote:
>
> > "John Roth" <newsgroups@jhrothjr.com> wrote in message:
> >
> >
> >>It depends on what you're looking for. TDD isn't
> >>a testing methodology first. It's a design methodology
> >>first and a testing methodology second.
> >
> >
> > That is really true. I didn't get that until the Fibonacci example in
> > Kent's book. I guess I am slow.
> >
> >
> >>TDD normally gives statement coverage in the high
> >>90s, and branch coverage in the middle 90s. As far
> >>as other coverage metrics, I haven't a clue - I haven't
> >>seen either measurements or theoretical arguments.
> >
> >
> > I don't know what metric you are using but you and CTip have vastly
> > different numbers. How is something like this provable? How can we have
> > scientific data?
> >
> > John, you must understand how frustrating this can get when you hear two
> > experts with such vastly different data sets. It is like hearing 50% of
> > economists saying outsourcing is good and another 50% saying it is bad.
> > It just makes people who want to learn so much less faithful in the whole
> > process.
> >
>
> Actually, we're not disagreeing about the coverage - I do agree that it
> is possible to use strict TDD and get 90% statement coverage. Branch
> coverage will depend on the kinds of programs one writes, but for what
> TDD gets used for, its probably reasonable.
>
> I'm also not arguing about TDD giving a productivity boost. Given how
> non-systematically many (most?) programmers write code, any structure
> will show a productivity boost.
>
> However, after a point, TDD starts getting in the way. Consider an ideal
> scenario - you spend time figuring out what the final code should look
> like, you code it, and then generate a minimal set of tests. Clearly,
> this is faster than coding a test, writing the code, possibly
> refactoring, and so on. You avoid all the overhead of extra tests and of
> extra refactoring.
>
> Advocates of TDD don't believe that that this ideal is achievable (or at
> least, not achievable by most programmers). However, all of the most
> productive programmers I know actually work this way.
>
> I don't think that using TDD will cause long term damage to a
> programmers development. I'd probably say its like training wheels -
> eventually you need to move beyond them.


Reply Stede 11/17/2004 1:37:24 AM

Andrew McDonagh wrote:

> CTips wrote:
> snipped
> 
>> Lets hear it from other people - what is the largest or most complex  
>> or most challenging (or some combination) program that you wrote using 
>> TDD?
> 
> 
> 
> We are using TDD (and XP in fact) on (IMO) a medium sized application. 
> Its a telecomms equipment Network Management System. Multiple Servers 
> managing telecomms equipment and providing that data and manageability 
> to multiple clients.
> 
> We use CORBA as the RPC mechanism between each (n equipment <-> n 
> servers <-> n clients)
> 
> The system currently supports Fault and Configuration Management and in 
> due course will provide Auditing, Performance and Security management of 
>  the telecomms equipment.
> 
>>
>> [Note that I have never worked on a large software program - most of 
>> my experience is with medium-sized programs, in the 100k to 300kloc 
>> range. I typically consider  50-100kloc small, 100kloc - 500kloc 
>> medium, 500kloc+ large]
> 
> 
> These numbers are IMO meaningless. It's totally dependent upon the 
> language, persistence mechanisms, RPC technology, etc which are always 
> going to be different for people here on this NG to be able to compare.
> 
> For example, a C++ program can usually be much larger than its Java 
> equivalent, yet the same program developed in Ruby can be even smaller.
> 
> Then there's auto generated code. Our systems CORBA's IDL is only a few 
> hundred LOC, but generates 000s of lines of Java code.
> 
> Also, anyone can write lots of code. If productivity is measured by LoC, 
> then people tend to just write lots of code, rather than well factored 
> code that does the same job, but in few loc.
> 
> How about we compare functionality instead?

Absolutely - I'd consider optimizing compilers, database kernels, 
operating system kernels to be medium functionality programs. Large 
functionality programs would be fighter avionics packages, air-traffic 
control programs and telco switches. Note that many of these tend to be 
written in C (or C++ or assembler (and maybe Ada, though I think many 
avionics packages used to get waivers)).

I suspect that I'd call your program on the small side. Unless of 
course, the program is a distributed fault-tolerant program with support 
for things like recovery after network partitioning. Or unless it 
requires switch-level 5-9s reliability. In which case it definitely 
counts as a medium sized program.

Reply CTips 11/17/2004 1:38:06 AM

"Stede Troisi" <stede@verizon.net> wrote in message 
news:F7wmd.7314$063.6686@trndny03...
>
> "John Roth" <newsgroups@jhrothjr.com> wrote in message:
>
>> It depends on what you're looking for. TDD isn't
>> a testing methodology first. It's a design methodology
>> first and a testing methodology second.
>
> That is really true. I didn't get that until the Fibonacci example in
> Kent's book. I guess I am slow.

I wouldn't say slow. The word "test" seems to mislead
a lot of people, and it's not helped by [name withheld] insisting
that it's the right word because we usually use xUnit
as the vehicle.

As far as testing goes, I (and quite a few others)
would describe the test suite as a regression test,
not a unit or integration test suite in the classical
sense.

>> TDD normally gives statement coverage in the high
>> 90s, and branch coverage in the middle 90s. As far
>> as other coverage metrics, I haven't a clue - I haven't
>> seen either measurements or theoretical arguments.
>
> I don't know what metric you are using but you and CTip have vastly
> different numbers. How is something like this provable? How can we have
> scientific data?
>
> John, you must understand how frustrating this can get when you hear two
> experts with such vastly different data sets. It is like hearing 50% of
> economists saying outsourcing is good and another 50% saying it is bad. It
> just makes people who want to learn so much less faithful in the whole
> process.

Well, empirically it comes out of a number of measurements
that were reported on the XP mailing list. Then we looked
at why we were getting such high numbers, and discovered
that if you do TDD ***exactly*** by the book, not adding
one more keystroke than  you needed to make each test
pass, you _should_ get 100% statement and branch coverage.

As a practical matter, though, it's just soooooo easy to slide
in one more statement because you "know you're going to
need it", and there goes your 100% mark. Things like
Java's exception mechanism also tend to force you to write
untested code unless you really take care to do things like
write the tests for exception handlers first.

CTips, however,  was talking about a number of other
coverage metrics. I have no data on them, and have
no particular reason to think that TDD would do particularly
well, although the tendency to produce lots of little straight-line
methods should have a good effect on path coverage.

>
>> We've had some comparisons of TDD and classical
>> testing methodologies in the past. The general conclusion
>> was that TDD gave a much *smaller* set of tests than
>> the testing gurus thought was an adequate set,  by a
>> factor of at least five.
>
> Can you show it? Prove it? I am not saying this in an adversary tone.

You'd have to find the original e-mail set. I believe
that the discussion was between either Kent or Ron
and a rather highly respected testing guru. Kent said
that a test set of about 5 was adequate, the guru said
it needed about 80!

I can see where they were both coming from;
it's simply a different mindset.

John Roth

>
> Stede
>
>> John Roth
>>
>
> 

Reply John 11/17/2004 2:41:05 AM

"Thad Smith" <ThadSmith@acm.org> wrote in message 
news:419a8807$1_2@omega.dimensional.com...
> stedetro@yahoo.com wrote:
>
>> What about TDD as regression testing?
>
> One of the things that occurs to me is that if you write a test, write 
> code, write a test, etc. cycle, you are basically designing your tests 
> with assumptions about the implementation.  Now suppose you change the 
> implementation.  Let's say you need to increase performance and can do 
> this by handling an input parameter x in two ranges: 0 - 15, and 16 or 
> greater, rather than a single range as originally designed.  If the 
> original test didn't test these ranges separately, the test after change 
> might not catch a new bug.  How do you prevent such cracks opening when 
> you refactor?

That is a truly excellent question, to which there is,
unfortunately, no equally excellent answer.

The best I can say is that there are a number of factors
that have to be considered.

One is that refactoring is, technically, behavior preserving
on some level. Tests written outside of that level shouldn't
be affected, and code written inside that level should have
new tests written for it.

A second is that you have to keep refactoring and
reworking the tests as well as the code. Programmer
tests are not static and you can get quite a boost by
refactoring them occasionally.

John Roth
>
> Thad
> 

Reply John 11/17/2004 2:46:03 AM

Stede Troisi wrote:

> I get your point now. So I guess you don't use TDD at all because you are
> past the training wheel stage? If you do use a little TDD what is the best
> way to determine when to use it and when not?

A good question to which I unfortunately don't necessarily have a good 
answer. Maybe the following will help.

I'm assuming you're using TDD for designing your modules. Next time you 
have to, try and work out the design, preferably in your head, but 
possibly with paper and pencil, and see how far you can get. See how 
much of the module you can actually visualize. Definitely _think_ about 
all the test cases you'd expect to write. Think about all the invariants 
you'd expect to hold. Think about how you expect the module to be used. 
Think about the data-structures you'd use. Think about any algorithms 
that will be required.

Now, if at the end of this process, you think you have a fairly good 
handle on what the final code will look like, go ahead and code it up. 
You may want to code it in phases, and test each phase separately.

  If you find that you are not going back and rewriting code, then you 
know you don't need TDD. If you find that you're rewriting code because 
your understanding of the problem was at fault, then maybe TDD is still 
a good design technique. If you find that you're rewriting code because 
the code becomes cleaner that way, you probably don't need TDD, but 
maybe you do.
Reply CTips 11/17/2004 2:49:05 AM

CTips wrote:
> stedetro@yahoo.com wrote:
> 
>> What about Design by Contract?
> 
> Design by contract is an interesting idea. Unfortunately, its also more 
> overhead than its worth in most situations.
> 
>  Lets look at the whole idea of preconditions, postconditions and 
> invariants. They are expressions on the state of the program that are 
> (supposted to be) true before, after, and during the execution of a 
> piece of code. Quite often, the actual condition ends up being very hard 
> to express correctly.
> 
> For instance, how do I specify the post-condition for the factorial 
> function?
>   n = fact(i);
>   /* at the end of this n is i! */
> To actually check this would require rewriting the factorial function 
> some other way.
>   n = fact(i)
>   assert( n == fact_1(i) );
> So, in practice, one would just leave the post-condition as a comment. 
> Which kind of defeats the purpose.

A partial solution, which I might use, is
   assert ((i == 0 && n == 1) || n == i * fact(i-1));

That requires computing another factorial and doesn't fully test the 
function, but does do one consistency check.  If tested over enough 
values, it should actually confirm correct behavior (assuming no hidden 
states in the function).
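
As a minimal sketch of that consistency check in C: the loop-based fact, the 64-bit result type, and the two helper names are assumptions made for illustration, not code from this thread. The recurrence is guarded with i > 0 so that fact(i-1) is never evaluated for i == 0. Checking the base case plus the recurrence for every i up to a limit amounts to an inductive argument over that range.

```c
#include <assert.h>

/* Hypothetical factorial under test; unsigned long long holds up to 20!. */
unsigned long long fact(unsigned int i)
{
    unsigned long long n = 1;
    for (unsigned int k = 2; k <= i; k++)
        n *= k;
    return n;
}

/* Returns 1 when fact(i) satisfies the base case or the recurrence
   n == i * fact(i-1); returns 0 otherwise. */
int fact_is_consistent(unsigned int i)
{
    unsigned long long n = fact(i);
    return (i == 0 && n == 1) || (i > 0 && n == i * fact(i - 1));
}

/* Sweeps the consistency check over every input in [0, limit]. */
int fact_consistent_up_to(unsigned int limit)
{
    for (unsigned int i = 0; i <= limit; i++)
        if (!fact_is_consistent(i))
            return 0;
    return 1;
}
```

Calling fact_consistent_up_to(20) covers every input whose result fits in 64 bits; beyond that the multiplication wraps and the check says nothing useful.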

> Another problem with pre-, post- and invariant conditions in complex 
> situations is that coming up with the "right" conditions is pretty 
> difficult (particularily the invariants). Here "right" means that they 
> correctly capture the state, are weak enough to be written succintly, 
> and are strong enough to detect many bugs.

Difficult, yes, but I think the exercise of doing it builds confidence 
in a correct approach.

Thad

Reply Thad 11/17/2004 6:33:35 AM

On Tue, 16 Nov 2004 15:46:58 -0500, CTips <ctips@bestweb.net> wrote:

>I tend to write about 100klocs of production/production-quality code 
>every year (I'm not counting things like the test code itself). Its 
>slipped now that I'm doing customer/sales/management kind of stuff, but 
>I'm still trying to get to about that half that. And its not exactly 
>simple stuff either - at this point a large part of what I write 
>involves development of new algorithms as well as the coding.
>
>So permit me to say this - if you can hit those kinds of productivities 
>using TDD and the kinds of testing approaches you're advocating, more 
>power to you. If you can't, then maybe you should consider how to change 
>your approach to get there.

Most programmers are not up to your standards, no matter what
techniques they use. It's problematical to measure any process against
you and people like you, don't you think?

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/17/2004 10:18:48 AM

On Tue, 16 Nov 2004 23:33:57 GMT, "Stede Troisi" <stede@verizon.net>
wrote:

>> TDD normally gives statement coverage in the high
>> 90s, and branch coverage in the middle 90s. As far
>> as other coverage metrics, I haven't a clue - I haven't
>> seen either measurements or theoretical arguments.
>
>I don't know what metric you are using but you and CTip have vastly
>different numbers. How is something like this provable? How can we have
>scientific data?
>
>John, you must understand how frustrating this can get when you here two
>experts with such vastly different data sets. It is like hearing 50% of
>economists saying outsourcing is good and another 50% saying it is bad. It
>just makes people who want to learn so much less faithful in the whole
>process.

Stede, let me offer two notions:

First, someone else's data tells us very little about what will happen
to us. Something, but very little. It might encourage us to try
something, or discourage us from trying it, but our own results are
what matters.

Second, CTips reports productivity from himself and his team which is
almost unprecedentedly high. Taking their figures at face value, their
experience is inapplicable to ordinary mortals. I'd like to get
someone in there to look at what they really do, but so far we haven't
been able to set that up.

Regards,

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/17/2004 10:21:45 AM

On Tue, 16 Nov 2004 21:49:05 -0500, CTips <ctips@bestweb.net> wrote:

>  If you find that you are not going back and rewriting code, then you 
>know you don't need TDD. If you find that you're rewriting code because 
>your understanding of the problem was at fault, then maybe TDD is still 
>a good design technique. If you find that you're rewriting code because 
>the code becomes cleaner that way, you probably don't need TDD, but 
>maybe you do.

I'd suggest that while doing this experiment one would also want to
note time spent debugging, and final defect rates in the released
code.

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/17/2004 10:23:06 AM

On Tue, 16 Nov 2004 17:03:10 -0700, Thad Smith <ThadSmith@acm.org>
wrote:

>One of the things that occurs to me is that if you write a test, write 
>code, write a test, etc. cycle, you are basically designing your tests 
>with assumptions about the implementation.  Now suppose you change the 
>implementation.  Let's say you need to increase performance and can do 
>this by handling an input parameter x in two ranges: 0 - 15, and 16 or 
>greater, rather than a single range as originally designed.  If the 
>original test didn't test these ranges separately, the test after change 
>might not catch a new bug.  How do you prevent such cracks opening when 
>you refactor?

Wouldn't we want to write new tests for each of the ranges, given that
we are trying to (a) improve performance and (b) will very likely be
creating two new methods?

Certainly if we don't, we're in the danger you mention. Therefore ...
we need to think, and write new tests as we see fit. And if we get a
defect later, then, as always, we need to reflect on what we've
learned. (Which will be something like: if we write range-dependent
code, be sure to write tests for all ranges.)
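
A rough sketch of what "tests for all ranges" could look like in C. The function count_bits, its lookup table, and the test name are hypothetical stand-ins for the code Thad describes; only the 0 - 15 versus 16-or-greater split comes from his example.

```c
#include <assert.h>

/* Hypothetical function split into two ranges for speed:
   a lookup table for 0..15 and a general loop for 16 and above. */
static const unsigned char small_counts[16] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

/* Counts the set bits in x. */
unsigned int count_bits(unsigned int x)
{
    if (x <= 15)
        return small_counts[x];

    unsigned int n = 0;
    while (x != 0) {
        n += small_counts[x & 15u];  /* count one 4-bit group at a time */
        x >>= 4;
    }
    return n;
}

/* At least one test per range, plus both sides of the boundary. */
void test_count_bits_ranges(void)
{
    assert(count_bits(0) == 0);    /* bottom of the small range */
    assert(count_bits(15) == 4);   /* top of the small range */
    assert(count_bits(16) == 1);   /* first value of the large range */
    assert(count_bits(17) == 2);
    assert(count_bits(255) == 8);  /* well inside the large range */
}
```

The boundary pair (15 and 16) is the part a single-range test suite would most likely have missed.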

Does that help? Does it raise new questions or answers for you?

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/17/2004 10:26:51 AM

On Tue, 16 Nov 2004 15:49:33 -0500, CTips <ctips@bestweb.net> wrote:

>Ronald E Jeffries wrote:
>
>
>> I can't remember the last time I encountered a team that had an very
>> very low defect rate, but I'm sure they are out there somewhere. And
>> maybe they don't need TDD. Someday maybe I'll get to observe such a
>> team and find out.
>> 
>
>You know you have a standing invitation. Swing by whenever you're in the 
>area. I'll make sure you get access to our bug database, and see if I 
>can arrange access to our customer help e-mails.

I'm sure it will be fascinating. Did Ralph wind up deciding there was
no fit for a visit on his part?

-- 
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.
Reply Ronald 11/17/2004 10:27:55 AM

Thad Smith wrote:
> CTips wrote:
> 
>> stedetro@yahoo.com wrote:
>>
>>> What about Design by Contract?
>>
>>
>> Design by contract is an interesting idea. Unfortunately, its also 
>> more overhead than its worth in most situations.
>>
>>  Lets look at the whole idea of preconditions, postconditions and 
>> invariants. They are expressions on the state of the program that are 
>> (supposted to be) true before, after, and during the execution of a 
>> piece of code. Quite often, the actual condition ends up being very 
>> hard to express correctly.
>>
>> For instance, how do I specify the post-condition for the factorial 
>> function?
>>   n = fact(i);
>>   /* at the end of this n is i! */
>> To actually check this would require rewriting the factorial function 
>> some other way.
>>   n = fact(i)
>>   assert( n == fact_1(i) );
>> So, in practice, one would just leave the post-condition as a comment. 
>> Which kind of defeats the purpose.
> 
> 
> A partial solution, which I might use is
>   assert ((i == 0 && n == 1) || n == i * fact(i-1));
> 
> That requires computing another factorial and doesn't fully test the 
> function, but does do one consistency check.  If tested over enough 
> values, it should actually confirm correct behavior (assuming no hidden 
> states in the function).

If you rewrite your test as:
   assert( n == ( i == 0 ? 1 : i*fact(i-1)) );
it becomes more apparent that you're really just rewriting the factorial 
function, recursively.

>> Another problem with pre-, post- and invariant conditions in complex 
>> situations is that coming up with the "right" conditions is pretty 
>> difficult (particularily the invariants). Here "right" means that they 
>> correctly capture the state, are weak enough to be written succintly, 
>> and are strong enough to detect many bugs.
> 
> 
> Difficult, yes, but I think the exercise of doing it builds confidence 
> in a correct approach.

Oh, this exercise is useful. But DBC is by no means as generally 
applicable, or as powerful as its advocates suggest. Like many ideas it 
works better on small examples, and starts to show limitations once you 
try to apply it to bigger or more complex situations.
Reply CTips 11/17/2004 12:00:13 PM

Ronald E Jeffries wrote:

> On Tue, 16 Nov 2004 15:46:58 -0500, CTips <ctips@bestweb.net> wrote:
> 
> 
>>I tend to write about 100klocs of production/production-quality code 
>>every year (I'm not counting things like the test code itself). Its 
>>slipped now that I'm doing customer/sales/management kind of stuff, but 
>>I'm still trying to get to about that half that. And its not exactly 
>>simple stuff either - at this point a large part of what I write 
>>involves development of new algorithms as well as the coding.
>>
>>So permit me to say this - if you can hit those kinds of productivities 
>>using TDD and the kinds of testing approaches you're advocating, more 
>>power to you. If you can't, then maybe you should consider how to change 
>>your approach to get there.
> 
> 
> Most programmers are not up to your standards, no matter what
> techniques they use. It's problematical to measure any process against
> you and people like you, don't you think?
> 

Why are they not at those standards? Some of it may be talent + desire + 
experience, but some of it is definitely the practices we adopt.

Now, if you look at those practices, and see that they are different 
from the ones you use/advocate, then you have to ask yourself - are your 
current practices holding you back? Are you at your limit, or can you 
get better? Obviously, switching to a new set of practices means that 
your productivity may go down before it goes up again, and it may be 
that the new practices don't work as well for you - but unless you try, 
you'll basically plateau.

For example - I've picked up splint, and am toying with the idea of 
writing my next program splint -strict warning free. I'd like to see its 
impact on my overall productivity. Unfortunately, for the next few 
months most of my coding is going to be VHDL, so it's unlikely I'll be 
able to get around to it soon.

For some of the simpler practices I use, have a look at 
http://users.bestweb.net/~ctips/
Reply CTips 11/17/2004 12:12:43 PM

Ronald E Jeffries wrote:

> On Tue, 16 Nov 2004 15:49:33 -0500, CTips <ctips@bestweb.net> wrote:
> 
> 
>>Ronald E Jeffries wrote:
>>
>>
>>
>>>I can't remember the last time I encountered a team that had an very
>>>very low defect rate, but I'm sure they are out there somewhere. And
>>>maybe they don't need TDD. Someday maybe I'll get to observe such a
>>>team and find out.
>>>
>>
>>You know you have a standing invitation. Swing by whenever you're in the 
>>area. I'll make sure you get access to our bug database, and see if I 
>>can arrange access to our customer help e-mails.
> 
> 
> I'm sure it will be fascinating. Did Ralph wind up deciding there was
> no fit for a visit on his part?
> 
He never got back to me. And I got caught up in work, and didn't push him.
Reply CTips 11/17/2004 12:14:07 PM

Ilja Preuß wrote:

> I will take a look at its code size at work...

5612 LoC in 105 classes/interfaces.

Cheers, Ilja


Reply Ilja 11/17/2004 1:16:26 PM

"CTips" <ctips@bestweb.net> wrote in message 
news:10pmfapc4b5ov70@corp.supernews.com...
> Thad Smith wrote:
>> CTips wrote:
>>
>>> stedetro@yahoo.com wrote:
>>>
>>>> What about Design by Contract?

DBC is an interesting beast. From my viewpoint, it's
an attempt to implement a lot of the formal methods
material from the '70s in a testing environment rather
than a formal or informal proof environment.

Its advocates miss a number of things, one of which
is that it's essentially opportunistic testing. It tests
based on whatever values happen to be generated
during the test run, rather than the carefully selected
values of classical testing or the different (but equally
carefully selected) values of TDD.

And of course, just about everyone who's tried
it seriously points out that you can't write what
you'd really like for postconditions much of the
time.

What I'd really like is a program verifier. I know
that it seems to be a really "hard" research topic,
but I suspect that's part of the problem. What
little I've seen on the subject suggests that they're trying
to verify pre-written modules, and we know
(from an XP standpoint) that we don't really
want to do that. Integrating testing with coding
has really pervasive effects on how we code;
integrating verification with coding should have
equally pervasive effects.

John Roth


Reply John 11/17/2004 1:46:19 PM

> For instance, how do I specify the post-condition for the factorial 
> function?
>    n = fact(i);
>    /* at the end of this n is i! */

I'm not sure why you refer to "the" post-condition - there could be any 
number, depending on which characteristics of fact were important in the 
program.

Factorial would have a precondition of "i >= 0", and offhand we may 
suppose that the postcondition "n > 0" is relevant.

The point of contracts is that they avoid having to resort to defensive 
programming - if your factorial gets passed a negative integer, then 
that's a bug in client code, not in factorial. You can leave out all 
error-handling code related to that condition. By the same token, a 
violated postcondition says that the bug was in your function.
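
A minimal sketch of that contract using C's assert. The precondition i >= 0 and the postcondition n > 0 are the ones named above; the upper bound i <= 20 is an extra assumption added here so the result fits in a long long, and the loop body is only illustrative.

```c
#include <assert.h>

/* Contract: requires i >= 0, ensures result > 0.
   The bound i <= 20 is an added assumption to avoid overflow. */
long long fact(int i)
{
    assert(i >= 0);   /* precondition: a failure here is a bug in the caller */
    assert(i <= 20);  /* assumed extra precondition: 21! overflows long long */

    long long n = 1;
    for (int k = 2; k <= i; k++)
        n *= k;

    assert(n > 0);    /* postcondition: a failure here is a bug in fact */
    return n;
}
```

Note that the body contains no error handling for negative input: a call such as fact(-1) trips the first assertion, which points at the client code rather than at fact.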

Laurent
Reply Laurent 11/17/2004 4:13:26 PM

> Now, the question is what makes you think that it is better than (or 
> even remotely comparable to) other approaches for building medium and 
> large sized programs with a "reasonable" degree of correctness.

I don't remember stating that I think that, so it most certainly isn't 
"the" question. :)

Your experience with large programs will yield valuable insights as to 
how TDD might fare in such a context, and I'm interested in these 
insights, should I ever work on a large program. However, I'd like to be 
sure that your scrutiny bears upon TDD as I practice it, rather than on 
some misunderstanding of it.

Has the conversation thus far cleared up some of the points that were 
unclear for you about TDD ?

In particular, what differences has it revealed between the way you 
attempted to use TDD and the way practitioners use it ?

Laurent
Reply Laurent 11/17/2004 4:33:05 PM

Laurent Bossavit wrote:

> Has the conversation thus far cleared up some of the points that were 
> unclear for you about TDD ?
> 
> In particular, what differences has it revealed between the way you 
> attempted to use TDD and the way practitioners use it ?
>

There are no differences. I did it exactly the way it's advocated. It 
just turns out to be inefficient - extremely so. I'd estimate that I'd 
probably be about 2x to 10x less efficient if I used TDD to program.

 >>Now, the question is what makes you think that it is better than (or
 >>even remotely comparable to) other approaches for building medium and
 >>large sized programs with a "reasonable" degree of correctness.
 >
 >
 > I don't remember stating that I think that, so it most certainly isn't
 > "the" question. :)
 >
 > Your experience with large programs will yield valuable insights as to
 > how TDD might fare in such a context, and I'm interested in these
 > insights, should I ever work on a large program. However, I'd like to be
 > sure that your scrutiny bears upon TDD as I practice it, rather than on
 > some misunderstanding of it.
 >

Like I said, most of my work has been with medium sized programs. I have 
never really worked on a 500kloc+ sized program. So I don't really have 
any insights about large programs - except that I'd try not to write a 
large program - either break it up into medium sized programs, or try to 
find a way of reducing the size, possibly through the use of a little 
language.

I think that only someone who has worked on more than one such project in a 
non-management but non-grunt position is really qualified to talk about 
practices in programs of that scale. And it would help if a couple of 
those projects were after the introduction of workstations (mainframe 
style practices might be a little dated). I don't really think that 
there are too many such people around, unfortunately.
Reply CTips 11/17/2004 4:41:02 PM

> There are no differences. I did it exactly the way its advocated. It 
> just turns out to be inefficient - extremely so.

OK - that clears up doubts I had from some of your comments, such as 
about having to write 1000 lines before you could write a real test.

I'm assuming you tried TDD on a problem you selected specifically as 
being amenable to it, rather than what you would consider a "real" 
programming task ?

Inefficient as full-on TDD turned out to be for you when you tried it, 
did you notice any particular effect on the resulting design of your 
code, or its defect density, or any other characteristic ?

Laurent
Reply Laurent 11/17/2004 5:32:48 PM

Laurent Bossavit wrote:

>>There are no differences. I did it exactly the way its advocated. It 
>>just turns out to be inefficient - extremely so.
> 
> 
> OK - that clears up doubts I had from some of your comments, such as 
> about having to write 1000 lines before you could write a real test.
> 
> I'm assuming you tried TDD on a problem you selected specifically as 
> being amenable to it, rather than what you would consider a "real" 
> programming task ?
> 
> Inefficient as full-on TDD turned out to be for you when you tried it, 
> did you notice any particular effect on the resulting design of your 
> code, or its defect density, or any other characteristic ?
> 
> Laurent

I used it formally to develop a toy example (parsing a formatted text 
file, probably ~200loc) and informally to develop a data-structure 
(lists with length, I believe, also about that length).

Unfortunately, at that size/complexity, I don't inject too many bugs - 
coupled with the kind of defensive programming I do, I can identify any 
bugs with very little testing.

As a matter of fact, in the toy example, because I tried consciously to 
keep from doing anything unnecessary, the time to isolate all bugs 
probably went up.

And the time to implement went way up. I can usually code and debug a 
200 line C module in under an hour (unless it has some hidden 
complexity), but with TDD, I think the time doubled or tripled.
Reply CTips 11/17/2004 6:02:23 PM

While we're on the subject of productivity, I'd like to point to a 
(sub-)thread in comp.arch; in particular look at the following articles 
(I hope this works; I've never tried cut and paste of google archived 
news messages)

http://groups.google.com/groups?selm=3DFAA2EA.FEE1135E%40bestweb.net
http://groups.google.com/groups?selm=atmhu3%24r5h%241%40vkhdsu24.hda.hydro.com
http://groups.google.com/groups?selm=3DFEDB59.34CB27B3%40bestweb.net
http://groups.google.com/groups?selm=nheqta-ih1.ln%40cohen.paysan.nom

Reply CTips 11/17/2004 6:25:33 PM

CTips wrote:
> For instance, how do I specify the post-condition for the factorial
> function?
>    n = fact(i);
>    /* at the end of this n is i! */

> To actually check this would require rewriting the factorial function
> some other way.
>    n = fact(i)
>    assert( n == fact_1(i) );
> So, in practice, one would just leave the post-condition as a comment.
> Which kind of defeats the purpose.

I think the purpose of the postcondition is to prevent your code from
causing bugs in other code. Specifying that the function result should
be correct as a postcondition is useless. If you want to compute the
result with two different algorithms (or two different implementations
of the same algorithms), that will certainly increase bug detection
rates, but is the postcondition the right place to do that?

Other code can expect the factorial to return a positive integer; it
should be clear how big this integer should be, and I'd expect either a
precondition that puts a limit on the input number size, or a
postcondition that mentions that the output can be NaN. I think assert(
i > 0 ) would fail for NaN?

If I replace fact(i) with code that always returns 1, will code outside
break? If it doesn't break, but merely returns incorrect results, that
is no problem.
One expectation of outside code could be that for i > 2, fact(i) > i.

If I put fact(i) > fact(i-1), I must rely on the language to compute
fact(i-1) without invoking the assertions for fact() yet again.
Languages with retrofitted assert() don't do that, do they?
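
To make the concern concrete: C's assert is a plain macro, so a postcondition that calls fact again simply re-runs that nested call's own checks. One possible workaround is a guard flag, sketched below. The flag name checking and the loop-based fact are assumptions for illustration, and the flag is not thread-safe. The check is limited to i > 1 because fact(1) equals fact(0).

```c
#include <assert.h>

/* Nonzero while a postcondition is being evaluated (hypothetical guard). */
static int checking = 0;

unsigned long long fact(unsigned int i)
{
    unsigned long long n = 1;
    for (unsigned int k = 2; k <= i; k++)
        n *= k;

    /* Postcondition: fact(i) > fact(i-1) for i > 1.
       The guard keeps the nested fact(i-1) call from checking itself. */
    if (!checking && i > 1) {
        checking = 1;
        unsigned long long prev = fact(i - 1);
        checking = 0;
        assert(n > prev);
    }
    return n;
}
```

Without the guard the code would still terminate, but each call would trigger a chain of nested checks down to the base case.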

Just my 2c
Michael
-- 
Still an attentive ear he lent        Her speech hath caused this pain
But could not fathom what she meant   Easier I count it to explain
She was not deep, nor eloquent.       The jargon of the howling main
                 -- from Lewis Carroll: The Three Usenet Trolls
Reply Michael 11/17/2004 9:11:08 PM

Ronald E Jeffries wrote:
> On Tue, 16 Nov 2004 17:03:10 -0700, Thad Smith <ThadSmith@acm.org>
> wrote:
> >One of the things that occurs to me is that if you write a test, write
> >code, write a test, etc. cycle, you are basically designing your tests
> >with assumptions about the implementation.  

I'd do that with Unit tests.

> >Now suppose you change the
> >implementation.  Let's say you need to increase performance and can do
> >this by handling an input parameter x in two ranges: 0 - 15, and 16 or
> >greater, rather than a single range as originally designed.  If the
> >original test didn't test these ranges separately, the test after change
> >might not catch a new bug.  How do you prevent such cracks opening when
> >you refactor?
> 
> Wouldn't we want to write new tests for each of the ranges, given that
> we are trying to (a) improve performance and (b) will very likely be
> creating two new methods?

I'd decide to do a second implementation.
I'd write a test to check the two implementations return equal results.
I'd copy the existing implementation and write a second interface to it
to make the test pass.

I'd decide to improve performance for range 0-15.
I'd write a test that tests performance for the main implementation.
I'd implement code that does it.

If the test that compares implementations still passes, 
where's the worry?



The tests show distinctly that
a) there are indeed 2 different implementations
b) the performance requirement you've set is fulfilled
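
The comparison test described above might look roughly like this in C. The triangular-number functions tri_ref and tri_fast are hypothetical stand-ins for the original and the reworked implementation; only the split at 16 comes from the thread. A performance test is left out because any timing threshold would be an invented number.

```c
#include <assert.h>

/* Reference implementation: the original single-range code (hypothetical). */
unsigned long tri_ref(unsigned long x)
{
    unsigned long sum = 0;
    for (unsigned long k = 1; k <= x; k++)
        sum += k;
    return sum;
}

/* New implementation: table for 0..15, closed form for 16 and above. */
unsigned long tri_fast(unsigned long x)
{
    static const unsigned long table[16] = {
        0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120
    };
    if (x <= 15)
        return table[x];
    return x * (x + 1) / 2;
}

/* Returns 1 when both implementations agree on every input in [0, limit]. */
int implementations_agree(unsigned long limit)
{
    for (unsigned long x = 0; x <= limit; x++)
        if (tri_ref(x) != tri_fast(x))
            return 0;
    return 1;
}
```

Because the sweep crosses the 15/16 boundary, a bug introduced in either range shows up as a disagreement with the reference.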


Cheers
Michael
-- 
Still an attentive ear he lent        Her speech hath caused this pain
But could not fathom what she meant   Easier I count it to explain
She was not deep, nor eloquent.       The jargon of the howling main
                 -- from Lewis Carroll: The Three Usenet Trolls
Reply Michael 11/17/2004 9:19:19 PM

CTips wrote:
> Stede Troisi wrote:
> 
>> I get your point now. So I guess you don't use TDD at all because you are
>> past the training wheel stage? If you do use a little TDD what is the 
>> best
>> way to determine when to use it and when not?
> 
> 
> A good question to which I unfortunately don't necessarily have a good 
> answer. Maybe the following will help.
> 
> I'm assuming you're using TDD for designing your modules. Next time you 
> have to, try and work out the design, preferably in your head, but 
> possibly with paper and pencil, and see how far you can get. See how 
> much of the module you can actually visualize. Definitely _think_ about 
> all the test cases you'd expect to write. Think about all the invariants 
> you'd expect to hold. Think about how you expect the module to be used. 
> Think about the data-structures you'd use. Think about any algorithms 
> that will be required.
> 
> Now, if at the end of this process, you think you have a fairly good 
> handle on what the final code will look like, go ahead and code it up. 
> You may want to code it in phases, and test each phase separately.

But when I do that :

- whole design
- coding phase
- testing phase

I observe that during the coding phase, I can't modify my design until 
my testing phase is done (i.e. until I have honest test coverage for the 
code); doing otherwise would be hacking.

So either I continue coding until testing is done (continuing to code 
while acknowledging some design defects... hmm), or I stop coding 
and restart the design for this phase (with a risk of analysis paralysis 
lurking).

For this to work I'd have to do very very good thinking in the design 
phase. A design phase where design flaws would not be an option, in fact.

Given the usual pressure to deliver a bug-free system without being too 
late, this way of trying to get it right at the outset sounds just 
stressful to me.

> 
>  If you find that you are not going back and rewriting code, then you 
> know you don't need TDD. If you find that you're rewriting code because 
> your understanding of the problem was at fault, then maybe TDD is still 
> a good design technique. If you find that you're rewriting code because 
> the code becomes cleaner that way, you probably don't need TDD, but 
> maybe you do.

*my* probability of not going back and rewriting code because my 
design is so good = near 0.
*my* probability of having a less-than-perfect understanding of the 
problem at the outset of the project = near 1.
*my* probability of having to rewrite code because it gets dirty = near 1.

So I'm glad I picked TDD after all. What I hear from what you say is 
that really bright programmers don't need TDD, that it could decrease 
their velocity instead of increasing it. I agree. For the rest of us (a 
majority of my coworkers are like me, not very good at design) TDD just 
rocks !

Regards --ct

Reply Christophe 11/17/2004 9:55:52 PM

CTips wrote:
> get better? Obviously, switching to a new set of practices means that
> your productivity may go down before it goes up again, and it may be
> that the new practices don't work as well for you - but unless you try,
> you'll basically plateau.

This works both ways - if you're as good as you are, it could be a
result of combining the experiences made using both sets of practices:
and in this case, it wouldn't matter whether you used set A and switched
to B, or vice versa: you'd improve in both cases.

If a new set B is introduced, there won't be any people switching from B
to A, so you could get the impression that B is an improvement overall,
even if the improvement is due to the abovementioned effect.

To be really accurate, you'd have to identify the groups you apply the
new methodology to: "To practitioners of practice X, TDD promises a
development speedup of Y %, on average, after a spin-up time of circa
6 weeks". "When used in college-level introductory programming courses,
method B produced on average better statistics (show details) than
method A".

I've not seen statements like this.

Cheers
Michael
-- 
Still an attentive ear he lent        Her speech hath caused this pain
But could not fathom what she meant   Easier I count it to explain
She was not deep, nor eloquent.       The jargon of the howling main
                 -- from Lewis Carroll: The Three Usenet Trolls
Reply Michael 11/17/2004 10:17:44 PM

> Unfortunately, at that size/complexity, I don't inject too many bugs - 

I wouldn't call that unfortunate. :)

> And the time to implement went way up. I can usually code and debug a 
> 200 line C module in under an hour (unless it has some hidden 
> complexity), but with TDD, I think the time doubled or tripled.

It's a new technique for you - it would be surprising if you had the 
same performance on your first few runs. "Trying consciously to keep 
from doing anything unnecessary" would slow you right down - as if you 
were a virtuoso player of some instrument, switching to a rather 
different one; it would be a while before you could stop thinking of 
where what finger goes, and so on.

Also I would be looking for gains from TDD a little later in the 
complexity curve - when that little module has to take on one, two, 
three further features.

Laurent
Reply Laurent 11/18/2004 7:19:53 AM

Laurent Bossavit wrote:
>>Unfortunately, at that size/complexity, I don't inject too many bugs - 
> 
> 
> I wouldn't call that unfortunate. :)
> 
> 
>>And the time to implement went way up. I can usually code and debug a 
>>200 line C module in under an hour (unless it has some hidden 
>>complexity), but with TDD, I think the time doubled or tripled.
> 
> 
> It's a new technique for you - it would be surprising if you had the 
> same performance on your first few runs. "Trying consciously to keep 
> from doing anything unnecessary" would slow you right down - as if you 
> were a virtuoso player of some instrument, switching to a rather 
> different one; it would be a while before you could stop thinking of 
> where what finger goes, and so on.
> 
> Also I would be looking for gains from TDD a little later in the 
> complexity curve - when that little module has to take on one, two, 
> three further features.
> 
> Laurent

In my considered judgement, as complexity goes up, the amount of 
work/time to do TDD goes up (you're definitely writing more tests, and 
you're interrupting the coding of the module). If the time to design 
does not go down, or the time to debug the resulting code does not go 
down enough to counter-balance it, it's a net loss.

Given that going through a sequence of simple steps quite often 
pushes one into a sub-optimal design and requires extensive rewriting, 
it seems quite inefficient. This is going to be more likely in complex 
situations than in simple ones. Alternatively - if you already know 
up-front a near-optimal design, why bother with the TD_D_?

The number of tests required to get statement coverage when written 
post-facto will usually be less than those generated by TDD. (I'm 
talking about white-box style testing). If you're trying to get more than 
statement coverage (which you should, for anything other than the 
simplest modules), you're going to have to generate tests that will 
subsume the TDD generated tests anyway.
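
To make the statement-coverage point concrete, here is a minimal Python sketch. The function `clamp_discount` and both tests are hypothetical, invented purely for illustration. A single test reaches every statement, yet the path where the `if` is skipped is only exercised once a second test is added.

```python
def clamp_discount(price, discount):
    """Apply a discount, never letting the result drop below zero."""
    result = price - discount
    if result < 0:
        result = 0
    return result


def test_statement_coverage_only():
    # This one test executes every statement, including the body of the `if`.
    assert clamp_discount(10, 15) == 0


def test_branch_not_taken():
    # Branch coverage additionally needs the case where the `if` is skipped.
    assert clamp_discount(10, 3) == 7


test_statement_coverage_only()
test_branch_not_taken()
```

Whether a test like the second one gets written first or afterwards is exactly what is being argued here; the sketch only shows that statement coverage alone can leave a path untested.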

So, where is the benefit of TDD? Basically, it's for programmers who 
can't figure out the design of modules by thinking about the problem.

This sparks another thought - TDD may work for figuring out small module 
design. But how are you going to architect & design a large program? 
Either you can try and do it incrementally, or do a decent job upfront.

If you do it incrementally via TDD, and you find that TDD drives you 
through sub-optimal intermediate architectures, you will have to 
rewrite major portions of the program, every time you have to switch 
architectures.

If you try and do a decent job upfront, you're going to have to think 
about the program as a whole, taking into account as much as you know 
about the possible requirements (and their stability/uncertainty) at 
that point. But that means that you have to have the skill to do the 
design up-front for a whole program. If you have that skill, then why 
are you scared of designing a module? If you don't have that skill, then 
how can you ever hope to tackle anything but small, simple projects?

IMO, anyone who aims to tackle medium/large programs had better out-grow 
TDD.
Reply CTips 11/18/2004 1:52:20 PM

CTips wrote:

> So, where is the benefit of TDD? Basically, it's for programmers who
> can't figure out the design of modules by thinking about the problem.

That is another way to say "how to write programs that scale to more
complexity and functionality than can fit in programmers' brains all at the
same time".

> IMO, anyone who aims to tackle medium/large programs had better out-grow
> TDD.

You contradict yourself.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/18/2004 2:05:37 PM

CTips wrote:

 > In my considered judgement, as complexity goes up,
 > the amount of work/time to do TDD goes up (you're
 > definitely writing more tests, and you're
 > interrupting the coding of the module). If the time
 > to design does not go down, or the time to debug the
 > resulting code does not go down enough to
 > counter-balance it, it's a net loss.

My (5 years) experience with TDD shows quite the
opposite. I would go so far as to say that the benefits
of TDD become more and more apparent as complexity
grows.

What made you imagine that TDD takes more work as
complexity grows?

 > If you do it incrementally via TDD, and you find that
 > TDD drives you through sub-optimal intermediate
 > architectures, you will have to rewrite major
 > portions of the program, every time you have to
 > switch architectures.

TDD does indeed drive one through intermediate
architectures which may be sub-optimal for a
hypothetical final program, but they are optimal for
each, real, intermediate program. This makes a lot of
sense, both technically and in business terms, in
situations where the exact scope and details of the
final program are not certain, or when the customer
benefits from getting usable software sooner.

Do you work in an area where the exact scope and
details of the final program are known from the start?

 > If you try and do a decent job upfront, you're going
 > to have to think about the program as a whole, taking
 > into account as much as you know about the possible
 > requirements (and their stability/uncertainty) at
 > that point. But that means that you have to have the
 > skill to do the design up-front for a whole
 > program. If you have that skill, then why are you
 > scared of designing a module? If you don't have that
 > skill, then how can you ever hope to tackle anything
 > but small, simple projects?

I believe I have that skill to a reasonable degree, and
I am certainly not scared. I used to work that way, in
fact I took great pleasure and pride in it.

But I found TDD even more pleasurable, intellectually
interesting, effective, and better adapted to the
reality of changing requirements and of working with
teams. There are many people out there who are far more
brilliant designers than I who have also adopted TDD.

 > IMO, anyone who aims to tackle medium/large programs
 > had better out-grow TDD.

I think anyone who aims to tackle large programs should
first learn TDD.

Dominic Williams
http://dominicwilliams.net

----

Reply Dominic 11/18/2004 2:40:56 PM

Dominic Williams wrote:
> CTips wrote:
> 
>  > IMO, anyone who aims to tackle medium/large programs
>  > had better out-grow TDD.
> 
> I think anyone who aims to tackle large programs should
> first learn TDD.
> 

Out of curiosity, what kinds of programs would you call large?

What is the "largest" program you've worked on personally as a coder (a 
description and a rough LOC would be nice)? What %age of the code was yours?

What is the largest module you've written solo?
Reply CTips 11/18/2004 3:19:25 PM

> If the time to design does not go down, or the time to debug the
> resulting code does not go down enough to counter-balance it, it's
> a net loss.

Where I get a net productivity gain is in the latter - it reduces 
debugging time handsomely.

> Given that going through a sequence of simple steps quite often 
> pushes one into a sub-optimal design and requires extensive rewriting, 

I can see that this can be a concern, but in practice that turns out not 
to be the case.

> So, where is the benefit of TDD? Basically, it's for programmers who 
> can't figure out the design of modules by thinking about the problem.

Exactly. If you're not able to solve the entire design problem (and I 
mean *entire*), you're better off having a way of evolving it 
incrementally *safely*. My tiny brain has trouble handling a loop, so 
TDD starts to help from that point onwards.

Your massive brain can handle up to whatever it is, I'll pull the number 
of 10KLOC out of thin air, then TDD will become a net benefit shortly 
beyond that point.

What is evidence that you are not able to solve the *entire* problem ? 
Whenever you find you have introduced a defect in your code. If you have 
bugs - if you spend any time debugging - you are tackling problems whose 
complexity is beyond you.

> If you don't have that skill, then how can you ever hope to tackle
> anything but small, simple projects?

TDD *is* a skill that allows you to tackle more complexity.

Laurent
Reply Laurent 11/18/2004 6:13:05 PM

CTips wrote:

snipped

> I suspect that I'd call your program on the small side. Unless of 
> course, the program is a distributed fault-tolerant program with support 
> for things like recovery after network partitioning. Or unless it 
> requires switch-level 5-9s reliability. In which case it definitely 
> counts as a medium sized program.
> 

It is a distributed fault tolerant system, and requires 5-9s 
availability - not reliability.

Strange, I'd never use 5-9s availability as a measurement of application 
size, cause I can get that with a simple HelloWorld app.
Reply Andrew 11/18/2004 7:41:25 PM

CTips wrote:
> Dominic Williams wrote:
> 
>> CTips wrote:
>>
>>  > IMO, anyone who aims to tackle medium/large programs
>>  > had better out-grow TDD.
>>
>> I think anyone who aims to tackle large programs should
>> first learn TDD.
>>
> 
> Out of curiosity, what kinds of programs would you call large?
> 
> What is the "largest" program you've worked on personally as a coder (a 
> description and a rough LOC would be nice)? What %age of the code was 
> yours?
> 
> What is the largest module you've written solo?

There you go again with the LoC... :-)
Reply Andrew 11/18/2004 7:42:42 PM

Andrew McDonagh wrote:
> CTips wrote:
> 
> snipped
> 
>> I suspect that I'd call your program on the small side. Unless of 
>> course, the program is a distributed fault-tolerant program with 
>> support for things like recovery after network partitioning. Or unless 
>> it requires switch-level 5-9s reliability. In which case it definitely 
>> counts as a medium sized program.
>>
> 
> It is a distributed fault tolerant system, and requires 5-9s 
> availability - not reliability.
> 
> Strange, I'd never use 5-9s availability as a measurement of application 
> size, cause I can get that with a simple HelloWorld app.

Remember the phrase "switch-level 5-9s reliability" - that means that the 
service will be unavailable something like 5 minutes per year; this will 
include any scheduled maintenance on the servers, as well as any 
software updates, apart from the usual possibility of actual computer 
and/or network failures.
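
As a sanity check on downtime figures like this, the yearly budget implied by a given number of nines can be worked out directly. The sketch below is illustrative only: the helper name `downtime_seconds_per_year` is made up, and it assumes a 365.25-day year. Five nines (99.999%) comes to a little over five minutes a year, while three nines (99.9%) comes to just under nine hours.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: average year length


def downtime_seconds_per_year(nines):
    """Yearly downtime budget for an availability of `nines` nines (5 -> 99.999%)."""
    unavailability = 10 ** (-nines)
    return SECONDS_PER_YEAR * unavailability


for n in (3, 4, 5):
    secs = downtime_seconds_per_year(n)
    print(f"{n} nines: {secs / 60:.1f} minutes/year ({secs / 3600:.2f} hours/year)")
```

Note that the budget has to absorb scheduled maintenance and software updates as well as outright failures.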

It's in the context of what stresses it puts on the system; if you're 
designing for 5-9s availability on a distributed fault-tolerant system 
[even if you're building on top of some framework like Horus/Isis], it 
gets pretty complex.

Some of the nastier problems:
- what happens if the network partitions, and someone makes an update? 
How do you reconcile the information when they rejoin?
- How about things like a bad router table somewhere?
- What kind of failure detectors are you using?
- Are you dealing with Byzantine failure models? What resilience are you 
targeting?
- What happens when A can talk to B and C (and vice versa), but the link 
between B & C is very slow?

Hats off to you if you've already built the infrastructure to test some 
of those corner cases, let alone derive the code from them.
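
To give a flavour of what such test infrastructure involves, here is a minimal Python sketch of a timeout-based failure detector driven by a fake clock, so that a slow or silent link can be simulated deterministically. Everything in it (`HeartbeatFailureDetector`, `FakeClock`, the 2-second timeout) is hypothetical and far simpler than anything a real system such as Horus/Isis would use.

```python
class HeartbeatFailureDetector:
    """Suspects a peer when no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout, clock):
        self.timeout = timeout
        self.clock = clock      # callable returning the current time in seconds
        self.last_seen = {}     # peer name -> time of last heartbeat

    def heartbeat(self, peer):
        self.last_seen[peer] = self.clock()

    def suspected(self, peer):
        last = self.last_seen.get(peer)
        return last is None or self.clock() - last > self.timeout


class FakeClock:
    """Test double: time only advances when the test says so."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
detector = HeartbeatFailureDetector(timeout=2.0, clock=clock)
detector.heartbeat("B")
clock.advance(1.5)
assert not detector.suspected("B")   # still within the timeout
clock.advance(1.0)
assert detector.suspected("B")       # 2.5 s of silence: B is now suspected
```

Injecting the clock is the design choice that matters here: it lets a test reproduce "B is very slow" without real waiting. It does nothing for the harder cases in the list above, such as partitions and Byzantine behaviour.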
Reply CTips 11/18/2004 8:25:39 PM

Andrew McDonagh wrote:

> CTips wrote:
> 
> snipped
> 
>> I suspect that I'd call your program on the small side. Unless of 
>> course, the program is a distributed fault-tolerant program with 
>> support for things like recovery after network partitioning. Or unless 
>> it requires switch-level 5-9s reliability. In which case it definitely 
>> counts as a medium sized program.
>>
> 
> It is a distributed fault tolerant system, and requires 5-9s 
> availability - not reliability.
> 
> Strange, I'd never use 5-9s availability as a measurement of application 
> size, cause I can get that with a simple HelloWorld app.

BTW: you can only get 5-9s availability on a simple HelloWorld app if 
your machine does not go down for more than about 5 minutes a year. Good 
luck, particularly if you were in Florida this year, or in the North-east 
last year.
Reply CTips 11/18/2004 8:41:27 PM

Laurent Bossavit wrote:
>>If the time to design does not go down, or the time to debug the
>>resulting code does not go down enough to counter-balance it, it's
>>a net loss.
> 
> 
> Where I get a net productivity gain is in the latter - it reduces 
> debugging time handsomely.
> 
> 
>>Given that going through a sequence of simple steps quite often 
>>pushes one into a sub-optimal design and requires extensive rewriting, 
> 
> 
> I can see that this can be a concern, but in practice that turns out not 
> to be the case.
> 
> 
>>So, where is the benefit of TDD? Basically, it's for programmers who 
>>can't figure out the design of modules by thinking about the problem.
> 
> 
> Exactly. If you're not able to solve the entire design problem (and I 
> mean *entire*), you're better off having a way of evolving it 
> incrementally *safely*. My tiny brain has trouble handling a loop, so 
> TDD starts to help from that point onwards.
> 
> Your massive brain can handle up to whatever it is, I'll pull the number 
> of 10KLOC out of thin air, then TDD will become a net benefit shortly 
> beyond that point.

The reason we can handle the design of large systems is because of 
layers of abstraction. When we're thinking about a 100kloc program, we 
don't think about the 100kloc simultaneously, we just think of the top 
level functions/data-structures/abstractions. Then, separately, we think 
of how each of the components of the top abstraction layer is to be 
implemented. And so on and so forth.

Of course, it isn't that clean in practice, but the point is still this 
- if you come up with the right kinds of abstractions, you don't have to 
think about the whole program at the same time. System architecture, to 
a large extent, is coming up with the right kinds of top-level 
abstractions. ADTs are just another (lower-level) abstraction mechanism.

On a somewhat different note, the largest program for which I didn't 
really have too many abstraction layers was about 30 kloc. For 
performance reasons, it was exactly one function with the only 
abstraction mechanism being macros. That was quite an experience. 
However, it wasn't too bad; I think it took about 4 months total.

> What is evidence that you are not able to solve the *entire* problem ? 
> Whenever you find you have introduced a defect in your code. If you have 
> bugs - if you spend any time debugging - you are tackling problems whose 
> complexity is beyond you.

I don't see how that follows. You can introduce "defects" into a "hello, 
world" program.

I'd say if you keep changing the structure (architecture) of the 
program, then the complexity is beyond you.

> 
>>If you don't have that skill, then how can you ever hope to tackle
>>anything but small, simple projects?
> 
> 
> TDD *is* a skill that allows you to tackle more complexity.

If you're starting from nothing, certainly TDD is a skill that allows 
you to tackle more complexity. I'm not saying that TDD is not 
appropriate at a certain skill level. It's just that, I suspect, most 
good programmers will outgrow it.

The funny thing is that none of the more productive programmers I know 
use TDD. And a few of them are pretty passionate about trying to 
increase their productivity and probably would have looked at it in the 
past. I haven't actually discussed the issue with them, but I'd be 
surprised if their reasons for not using TDD are different from mine.
Reply CTips 11/19/2004 1:43:24 AM

CTips <ctips@bestweb.net> wrote in message 

> >  > IMO, anyone who aims to tackle medium/large programs
> >  > had better out-grow TDD.

Quite frankly, I am finding your repeated claims to be somewhat
grating. On one hand, you have a right to your opinion, but on the
other hand, you disregard all evidence presented by people such as
Laurent, Iljya, Phlip, Ron, RCM, with respect to fairly large projects
that have been successfully run using XP and TDD. It is one thing to
say "My experience suggests TDD is not worthwhile." It is another to
imply TDD is for poor developers who need "training wheels."

Also, your position is not very logically consistent in my view. You
seem to point in the direction of writing large amounts of code based
on an initial visualized design, yet you also admit that the right
design emerges only after a significant amount of code has been
written. Finally, VERY large projects are inherently difficult. This
is just my opinion, but a VERY large project, which I would say is
anything in the 500,000+ LOC range, is just plain difficult, regardless
of the methodology used. If I were managing a truly enormous project,
I would be much more concerned about the long term maintainability of
my code than about the lines of code being developed per week. I would
want a robust application, and I would expect enhancements and changes
to be done quickly and without adding a lot of new bugs. Also, I would
consider constant wholesale rewrites from scratch all the time to be a
bad thing.

As for the training wheels analogy, I'd say TDD is much more like a
harness for mountain climbing. You may find it uncomfortable
initially if you're not used to it, but at the end of the day, you're
much better off learning to live with it.
Reply vladimir_levin 11/19/2004 3:27:34 AM

Vladimir Levin wrote:
> CTips <ctips@bestweb.net> wrote in message 
> 
> 
>>> > IMO, anyone who aims to tackle medium/large programs
>>> > had better out-grow TDD.
> 
> 
> Quite frankly, I am finding your repeated claims to be somewhat
> grating. On one hand, you have a right to your opinion, but on the
> other hand, you disregard all evidence presented by people such as
> Laurent, Iljya, Phlip, Ron, RCM, with respect to fairly large projects
> that have been successfully run using XP and TDD. It is one thing to
> say "My experience suggests TDD is not worthwhile." It is another to
> imply TDD is for poor developers who need "training wheels."

Also, what evidence do you have that there are any large/medium sized 
projects that have used TDD/XP? Or that XP/TDD leads to decent productivity? 
I've looked at the literature. If there is any such published example, I 
have yet to see it. And I have asked for some such example repeatedly. 
Every time I am pointed to a paper, it turns out that the project was 
small and/or the productivity was abysmal. Worse yet, a lot of them seem 
to be written _after_ the project was canceled.

Now, let's look at some of the medium sized projects whose history is 
well known, and which are successfully in use - things like the gcc 
compiler, the linux kernel (I'm talking about only the kernel), emacs, 
the apache server etc. Definitely _NOT_ done using anything like XP or 
TDD. Also, done by relatively small teams of fairly competent 
programmers. As far as evidence goes, the preponderance of evidence 
suggests that writing medium-sized complex programs is best done by 
small teams of competent programmers using practices other than XP.

> Also, your position is not very logically consistent in my view. You
> seem to point in the direction of writing large amounts of code based
> on an initial visualized design, yet you also admit that the right
> design emerges only after a significant amount of code has been
> written.

I never said that. I said that in _TDD_ the right design emerges only 
after you have thrown away a lot of code.

Of course, with any approach, you will always encounter situations where 
you have to rewrite major portions of the code, or, worse, change the 
entire architecture. This is likely to happen due to requirement 
changes, or performance issues, but can happen because your initial 
visualization was faulty.


> Finally, VERY large projects are inherently difficult. This
> is just my opinion, but a VERY large project, which I would say is
> anything in the 500,000+ LOC range, is just plain difficult, regardless
> of the methodology used. If I were managing a truly enormous project,
> I would be much more concerned about the long term maintainability of
> my code than about the lines of code being developed per week.

Possibly - however, in my experience productive programmers also tend to 
be the ones that write robust, maintainable programs. It's probably 
because they write robust programs that they are productive :)

Also, if you spend time upfront designing the program and ask yourself - 
what can change? How would I make this aspect flexible? you tend to get 
a much more extensible design than otherwise.

> I would
> want a robust application, and I would expect enhancements and changes
> to be done quickly and without adding a lot of new bugs. Also, I would
> consider constant wholesale rewrites from scratch all the time to be a
> bad thing.

So do I - but isn't that what you can get with TDD?

> As for the training wheels analogy, I'd say TDD is much more like a
> harness for mountain climbing. You may find it uncomfortable
> initially if you're not used to it, but at the end of the day, you're
> much better off learning to live with it.

Based on the current evidence, you're at least 1 order of magnitude 
behind the top-end programmers in productivity. How do you think you can 
improve 10x? If TDD/XP will give you that improvement, then yes, TDD is 
a safety net. If they can only get you part of the way there, then 
perhaps you'll have to look beyond them.
Reply CTips 11/19/2004 4:58:07 AM

CTips wrote:

> Also, what evidence do you have that there are any large/medium sized
> projects that have used TDD/XP?

XP and TDD suck, and barely last long enough to sustain writing a book about
whatever your project is. But the books seem to sell. Actually, XP and TDD
obviously are failing to live up to their claims in all sectors where we
tried it, because the kafluffle you hear on all kinds of forums (mailing
lists, Wikis, blogs, USENET, the bus stop, etc.) must really just be part of a
vast righto-leftist conspiracy to get you to post dumb questions.

> Or XP/TDD leads to decent productivity?

RCM keeps claiming "10x reduction in defects released to the field". But he
probably works with the kinds of Fortune 100 companies whose pointy haired
bosses had no direction to go but up.

> I've looked at the literature.

Ah, then you must have read /Agile and Iterative Development: A Manager's
Guide/, by Craig Larman. Its main conclusion (besides "waterfall sucks") is
that given our industry's 70% failure rate for large projects, simply
failing less often would be a better goal than increased productivity.

> If there is any such published example, I
> have yet to see it. And I have asked for some such example repeatedly.
> Every time I am pointed to a paper, it turns out that the project was
> small and/or the productivity was abysmal. Worse yet, a lot of them seem
> to be written _after_ the project was canceled.

That's right. A _scientific_ paper must have a large number of matched
populations of controls and subjects. Then you kill all the programmers and
dissect their brains, looking for evidence of damage. Nope - the Diet
Mountain Dew affected both groups the same. Start again.

> Now, let's look at some of the medium sized projects whose history is
> well known, and which are successfully in use - things like the gcc
> compiler, the linux kernel (I'm talking about only the kernel), emacs,
> the apache server etc. Definitely _NOT_ done using anything like XP or
> TDD. Also, done by relatively small teams of fairly competent
> programmers. As far as evidence goes, the preponderance of evidence
> suggests that writing medium-sized complex programs is best done by
> small teams of competent programmers using practices other than XP.

Oh, my god! Somebody, somewhere, wrote a successful project without XP!!

> I never said that. I said that in _TDD_ the right design emerges only
> after you have thrown away a lot of code.

It "emerges" as a combination of you discovering it and you already knowing
what it is. Kind of like the end of The Wizard of Oz, when the Wikid Witch
of the North told Dorothy that she had the answer with her all along, but
she had to learn it for herself. You just tap the heels of your Ruby
language slippers together three times, and hit the One Test Button.

> Of course, with any approach, you will always encounter situations where
> you have to rewrite major portions of the code, or, worse, change the
> entire architecture. This is likely to happen due to requirement
> changes, or performance issues, but can happen because your initial
> visualization was faulty.

In real life, what typically happens is, after the twister, your house lands
on top of the previous lead architect, and you peer tremulously out the door
at a twisted landscape of big balls of crufty mud, written by short
programmers, their growth stunted by long working hours and junk food, under
the spell of some lesser process than XP.

This leaves you wondering where to start. Then Mike Feathers, wearing a
Wikid Witch of the North costume, appears with a copy of /The Joy of Legacy
Code/, and smacks you across the forehead with it.

   DOROTHY
  But -- after I add tests to all this legacy code, how do I fix it? Do I
  split it down the middle? Do I look for the big common patterns ---

    MIKE
  Just refactor the low hanging fruit.

   DOROTHY
  But that sounds too easy! Shouldn't I make a plan too...

But MIKE floats away inside a soap bubble, leaving you all alone. With the
short programmers crowding around you.

    DOROTHY
  My..!  People turnover so quickly here!
  ...Refactor the low hanging fruit?  Refactor the
  low hanging fruit?

    DBA
  Refactor the low hanging fruit.

    TOOLS GUY
  Refactor the low hanging fruit!

    GUI GUY
  Refactor the low hanging fruit.

    MATH GUY
  Refactor the low hanging fruit.

    ALL
  Refactor the low hanging fruit.
  Refactor the low hanging fruit.
  Refactor, factor, factor, factor,
  Refactor the low hanging fruit.

  Refactor the low-hanging
  refactor the...

So, by finding the simplest possible fixes to the lowest level code within
that ball of mud, and by isolating the effects of your changes with
characterization tests, you can begin to tease apart its design.
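
For anyone who hasn't met the term, a characterization test simply pins down what the legacy code does today, so a later refactoring can be checked against it. A minimal Python sketch, where `legacy_format_name` is a made-up stand-in for real legacy code and the expected strings were obtained by running it rather than from any spec:

```python
def legacy_format_name(first, last):
    # Stand-in for untested legacy code whose exact behaviour nobody remembers.
    if not first:
        return last.upper()
    return last.upper() + ", " + first[0] + "."


def test_characterize_format_name():
    # Record current behaviour as-is, whether or not it is "right".
    assert legacy_format_name("Ada", "Lovelace") == "LOVELACE, A."
    assert legacy_format_name("", "Turing") == "TURING"


test_characterize_format_name()
```

With the behaviour pinned like this, a small fix to the function either keeps the test green or shows exactly what changed.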

.... And ... you ... are ...

    Testing your way to dee-velopment
    Developing tests for the cause
    You'll find it a whiz, so give it a squiz
    Each test just gives one little pause
    The tests you test are bestests tests
    The bests are tests to test the best
    Because ... because
    because, because, because,
    Because of the wonderful code they does.
    You're testing your way to dee-velopment
    The wonderful tests for the cause!

-- 
  [Phlip2004]
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/19/2004 5:46:15 AM

Laurent Bossavit wrote:

> Also I would be looking for gains from TDD a little later in the
> complexity curve - when that little module has to take on one, two,
> three further features.

Or colleagues. TDD is a great way to help folks with less exposure to your
module than you change it while you are doing something else.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/19/2004 11:18:53 AM

In article <HMfnd.25927$5b1.21207@newssvr17.news.prodigy.com>, 
phlip_cpp@yahoo.com says...
> CTips wrote:
> 
> > Also, what evidence do you have that there are any large/medium sized
> > projects that have used TDD/XP?
> 
> XP and TDD suck, and barely last long enough to sustain writing a book about
> whatever your project is. But the books seem to sell. Actually, XP and TDD
> obviously are failing to live up to their claims in all sectors where we
> tried it, because the kafluffle you hear on all kinds of forums (mailing
> lists, Wikis, blogs, USENET, the bus stop, etc.) must really just be part of a
> vast righto-leftist conspiracy to get you to post dumb questions.

But answer came there none.
 
> Ah, then you must have read /Agile and Iterative Development: A Manager's
> Guide/, by Craig Larman. Its main conclusion (besides "waterfall sucks") is
> that given our industry's 70% failure rate for large projects, simply
> failing less often would be a better goal than increased productivity.

So THAT'S where Ed's figure comes from!  

- Gerry Quinn
Reply Gerry 11/19/2004 11:36:30 AM

Gerry Quinn wrote:

> > Ah, then you must have read /Agile and Iterative Development: A Manager's
> > Guide/, by Craig Larman. Its main conclusion (besides "waterfall sucks") is
> > that given our industry's 70% failure rate for large projects, simply
> > failing less often would be a better goal than increased productivity.
>
> So THAT'S where Ed's figure comes from!

His figure comes from eating too many Krispy Kremes.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/19/2004 11:41:12 AM

CTips wrote:

 >>> IMO, anyone who aims to tackle medium/large
 >>> programs had better out-grow TDD.
 >>
 >> I think anyone who aims to tackle large programs
 >> should first learn TDD.
 >
 > Out of curiosity, what kinds of programs would you
 > call large?
 >
 > What is the "largest" program you've worked on
 > personally as a coder (a description and a rough LOC
 > would be nice)? What %age of the code was yours?
 >
 > What is the largest module you've written solo?

I don't attach very much importance to these metrics,
and as a consequence I have not measured or kept such
statistics throughout my 11 years of programming. If it
can induce you to attach more weight to my opinion and
those of others, here are a few indications:

The largest program I worked on was about as big as
they come. It was AICES, the commercial survivor of the
ICES system developed at MIT in the 60s, then continued
by IBM before being developed and used by Bureau
Veritas to certify offshore oil rigs. It's an
Integrated Civil Engineering System, quite complex
because in addition to doing quite complex numerical
analysis it includes its own specialized languages
(ICETRAN, CDL, STRUDL...), compilers, interpreters, its
own memory management etc. Developed in C, FORTRAN,
Assembly.

Arriving as I did in the mid 90's, "my" code was only a
very small part of this of course. At the time,
compiling the whole system (which was almost never
necessary) took 2 or 3 days on an IBM RISC-6000 AIX
system.

Between '98 and 2003 I was developing real-time
distributed mission- and safety-critical automatic
train control systems. The last two were developed
basically from scratch, I was technical leader, wrote
quite a lot of code. They were both done in full XP and
TDD; I evolved from being the principal architect to
being an XP coach who helped the team of developers
agree on an evolving design. Four years, approx. 400
man-months. The second of those alone was 240 KLOC
(physical), 1200 classes.

I've done a number of smaller things in between. I
can't remember what is the biggest module I developed
solo. But apart from an operating system, about which I
don't know very much, I can't think of any kinds of
software projects that I would find too daunting. Yet
given the choice, I would work in TDD with an XP team
every time. Better, faster, more fun.

Regards,

Dominic Williams
http://dominicwilliams.net

----

Reply Dominic 11/19/2004 3:00:52 PM

Dominic Williams wrote:

> I don't attach very much importance to these metrics,
> and as a consequence I have not measured or kept such
> statistics throughout my 11-years programming.

Well, CTips is getting lots of statistics like "in my {10, 15, 30} years of
programming..." here.

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/19/2004 3:08:59 PM

Dominic Williams wrote:
> CTips wrote:
> 
>  >>> IMO, anyone who aims to tackle medium/large
>  >>> programs had better out-grow TDD.
>  >>
>  >> I think anyone who aims to tackle large programs
>  >> should first learn TDD.
>  >
>  > Out of curiosity, what kinds of programs would you
>  > call large?
>  >
>  > What is the "largest" program you've worked on
>  > personally as a coder (a description and a rough LOC
>  > would be nice)? What %age of the code was yours?
>  >
>  > What is the largest module you've written solo?
> 
> I don't attach very much importance to these metrics,
> and as a consequence I have not measured or kept such
> statistics throughout my 11-years programming. If it
> can induce you to attach more weight to my opinion and
> those of others, here are a few indications:
> 
> The largest program I worked on was about as big as
> they come. It was AICES, the commercial survivor of the
> ICES system developed at MIT in the 60, then continued
> by IBM before being developed and used by Bureau
> Veritas to certify offshore oil rigs. It's an
> Integrated Civil Engineering System, quite complex
> because in addition to doing quite complex numerical
> analysis it includes its own specialized languages
> (ICETRAN, CDL, STRUDL...), compilers, interpreters, its
> own memory management etc. Developed in C, FORTRAN,
> Assembly.
> 
> Arriving as I did in the mid 90's, "my" code was only a
> very small part of this of course. At the time,
> compiling the whole system (which was almost never
> necessary) took 2 or 3 days on an IBM RISC-6000 AIX
> system.

I know about it. It's actually a collection of "programs" (loosely 
speaking) rather than a single app.

> Between '98 and 2003 I was developing real-time
> distributed mission- and safety-critical automatic
> train control systems. The last two were developed
> basically from scratch, I was technical leader, wrote
> quite a lot of code. They were both done in full XP and
> TDD; I evolved from being the principal architect to
> being an XP coach who helped the team of developers
> agree on an evolving design. Four years, approx. 400
> man-months. The second of those alone was 240 KLOC
> (physical), 1200 classes.

7200 lines per man-year. Though, of course, it's probably not the entire 
story - you're not counting the other program - so let's say about 
10kloc/year productivity. That's about 4x the productivity claimed in the 
available XP literature. Better.
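
For anyone checking the arithmetic, here is a minimal sketch. The figures are the ones quoted above (240 KLOC for the second system, roughly 400 man-months for both systems); the function name is made up for illustration. Because the effort figure covers both systems and the line count only one, the result is a lower bound.

```python
def loc_per_man_year(total_loc: int, man_months: float) -> float:
    """Average lines of code per man-year of effort."""
    return total_loc / man_months * 12


# 240 KLOC (second system only) over ~400 man-months (both systems).
rate = loc_per_man_year(240_000, 400)
print(rate)  # 7200.0
```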

So, now, how would you push your productivity up to say about 
50kloc/year? Or do you think that would be unachievable?

> I've done a number of smaller things in between. I
> can't remember what is the biggest module I developed
> solo. But apart from an operating system, about which I
> don't know very much, I can't think of any kinds of
> software projects that I would find too daunting.

Try high-performance WAN distributed fault-tolerant systems - with 
support for things like network partitioning/recovery and 
semi-synchronous byzantine failure. *shudder*.

 > Yet
> given the choice, I would work in TDD with an XP team
> every time. Better, faster, more fun.
Reply CTips 11/19/2004 3:28:17 PM

CTips wrote:

> So, now, how would you push your productivity up to say about
> 50kloc/year? Or do you think that would be unachievable?

Naw, an XP team can write the same number of features in 20kloc, and in less
than a year.

Have you figured out yet that _no_ methodology should count lines of code as
a progress metric?

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


Reply Phlip 11/19/2004 3:44:21 PM

CTips wrote:

> So, now, how would you push your productivity up to say about 
> 50kloc/year? Or do you think that would be unachievable?

By doing the XP practices better and more exclusively, 
and using a programming language that is better suited 
to XP and generally more productive.

Regards,

Dominic Williams
http://dominicwilliams.net

----
Reply Dominic 11/19/2004 4:22:39 PM

CTips <ctips@bestweb.net> wrote in message news:<10pqvb8t1srrbbc@corp.supernews.com>...
> Also, what evidence do you have that there are any large/medium sized 
> projects that have used TDD/XP? Or XP/TDD leads to decent productivity? 
> <snip>
> I've looked at the literature. If there is any such published example, I 
> have yet to see it. And I have asked for some such example repeatedly. 

Let me say this. If you are working on a 500KLOC+ project that is
going really well and not having any problems, and you're not using
XP/TDD, then that's great. The methods you have described toward
getting there do not seem applicable in general terms ("just think
really hard and get things right the first time").

I am working for a company where the project I am working on is about
50KLOC. The other project at my company is about 100KLOC. We are happy
with our productivity, in terms of meeting requirements of demanding
customers, and the quality of releases has improved dramatically since
XP was introduced. We have measurements, but I doubt I could publish
them.  I consider our application to be quite representative of what
programming is about. It is a challenging application with many sticky
technical and business features that are not easy to get right. If XP
applies to this app, it applies to 80% of software development efforts
out in the world.

> I never said that. I said that in _TDD_ the right design emerges only 
> after you have thrown away a lot of code.

Maybe the problem is that some TDD examples go to rather silly lengths
to make a point. If you know you're going to use a table driven
approach, write it that way using TDD. It's not as though you have to
discover that. If I am dealing with a list of values, I'm not going to
write a test for adding 1 item, then another item, then another before
I go, gee, all these separate variables I'm creating should really be
items in a list.
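
A minimal sketch of what that could look like, assuming a made-up problem (small Roman numerals) that is not from this thread. The table is chosen up front rather than "discovered" one test at a time; the tests then pin down its behaviour. All names here are hypothetical.

```python
import unittest

# The table-driven design is decided before any test is written.
# Hypothetical example data: covers values 1..39 only.
ROMAN = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n: int) -> str:
    """Convert a small positive integer to a Roman numeral using the table."""
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


class ToRomanTest(unittest.TestCase):
    def test_every_table_row_maps_to_its_symbol(self):
        for value, symbol in ROMAN:
            self.assertEqual(to_roman(value), symbol)

    def test_rows_combine(self):
        self.assertEqual(to_roman(8), "VIII")
        self.assertEqual(to_roman(14), "XIV")


if __name__ == "__main__":
    unittest.main()
```

The tests still come first in the sense that they are written before `to_roman` is filled in; what is skipped is the pretence of not knowing the design.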

> Of course, with any approach, you will always encounter situations where 
> you have to rewrite major portions of the code, or, worse, change the 
> entire architecture. This is likely to happen due to requirement 
> changes, or performance issues, but can happen because your initial 
> visualization was faulty.
 
The extent to which you have to do this, and the practicality of doing
it without introducing loads of bugs varies with each practice. I have
found XP to be a really good way to manage this process, and to reduce
the frequency with which it has to happen.

> Possibly - however, in my experience productive programmers also tend to 
> be the ones that write robust, maintainable programs. Its probably 
> because they write robust programs that they are productive :)

Very different from my experience. On the contrary, many "productive"
programmers I've seen, i.e. people who write a lot of lines of
code/hour, have been sloppy and produced buggy, ill-conceived code.
They also tend to think that bugs are ok. "Oh yeah, all I have to do
is fix this little thingy here and it'll work" then "Oh yeah, that
change caused this other little thing to break too, no worries I'll
fix it in a jiffy."
 
> Also, if you spend time upfront designing the program and ask yourself - 
> what can change? How would I make this aspect flexible? you tend to get 
> a much more extensible design than otherwise.

I do this when I use TDD. I never just jump into coding. But spending
more than a few hours on such an exercise is a waste of time. You need
to build something before you can really see how feasible your ideas truly
are.
 
>  > I would
> > want a robust application, and I would expect enhancements and changes
> > to be done quickly and without adding a lot of new bugs. Also, I would
> > consider constant wholesale rewrites from scratch all the time to be a
> > bad thing.
> 
> So do I - but isn't that what you can get with TDD?

I think so. That was my point.
 
> Based on the current evidence, you're at least 1 order of magnitude off 
> in productivity than the top-end programmers. 

I believe that the difference between a high productivity programmer
and a low productivity programmer will likely remain relatively
constant at all times. With a good approach, both will improve. With a
bad approach, both will deteriorate.

Enough of this discussion for me. Over and out.
Reply vladimir_levin 11/19/2004 6:27:41 PM

Phlip wrote:

snipped the very excellent Wizard of Oz metaphor..

Brilliant, Phlip, just brilliant...
Reply Andrew 11/19/2004 7:26:24 PM

Dominic Williams wrote:
> CTips wrote:
> 
>> So, now, how would you push your productivity up to say about 
>> 50kloc/year? Or do you think that would be unachievable?
> 
> 
> By doing the XP practices better and more exclusively, and using a 
> programming language that is better suited to XP and generally more 
> productive.

At your present rate of improvement, when do you think you'll get there?
Reply CTips 11/19/2004 8:34:07 PM

CTips wrote:

> > By doing the XP practices better and more exclusively, and using a
> > programming language that is better suited to XP and generally more
> > productive.
>
> At your present rate of improvement, when do you think you'll get there?

Ask her what her defect rate is.

Ask her what her "work in progress" metric is - the time between making a
decision about a feature, and delivering to live users (or proxies).

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces



Reply Phlip 11/19/2004 9:04:49 PM

Andrew McDonagh wrote:

> Phlip wrote:

Thank you thank you.

> snipped the very excellent Wizard of Oz metaphor..

Are you a Good Lead Architect, or a Bad Lead Architect?

(but I'm in mourning for not capitalizing "SOAP bubble"...)

-- 
  Phlip
  http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces



Reply Phlip 11/19/2004 9:04:57 PM

Phlip wrote:
> CTips wrote:
> 
> 
>>So, now, how would you push your productivity up to say about
>>50kloc/year? Or do you think that would be unachievable?
> 
> 
> Naw, an XP team can write the same number of features in 20kloc, and in less
> than a year.
> 
> Have you figure out yet that _no_ methodology should count lines of code as
> a progress metric?
> 

No, he persists in using a totally irrelevant measurement - regardless 
of how good a developer he is.
Reply Andrew 11/19/2004 10:28:50 PM

CTips wrote:

> Andrew McDonagh wrote:
> 
>> CTips wrote:
>>
>> snipped
>>
>>> I suspect that I'd call your program on the small size. Unless of 
>>> course, the program is a distributed fault-tolerant program with 
>>> support for things like recovery after network partitioning. Or 
>>> unless it requires switch-level 5-9s reliability. In which case it 
>>> definitely counts as a medium sized program.
>>>
>>
>> It is a distributed fault tolerant system, and requires 5-9s 
>> availability - not reliability.
>>
>> Strange, I'd never use 5-9s availability as a measurement of 
>> application size, cause I can get that with a simple HelloWorld app.
> 
> 
> Remember the word "switch-level 5-9s reliability" - that means that the 
> service will be unavailable something like 9 hours per year; this will 
> include any scheduled maintainence on the servers, as well as any 
> software updates, apart from the usual possibility of actual computer 
> and/or network failures.

5-9's is a little over 5 minutes a year.

Not achievable on a single system with commercially
available hardware that I know of.  A distributed
FT system is the way to go (as noted below).
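
The downtime budget for "N nines" is quick to compute. A minimal sketch, assuming a 365.25-day year; the function name is only for illustration. Three nines comes out a little under 9 hours a year, five nines a little over 5 minutes.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes, averaging in leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year at `nines` nines of availability (5 -> 99.999%)."""
    unavailability = 10.0 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    minutes = downtime_minutes_per_year(n)
    print(f"{n} nines: {minutes:8.2f} min/year ({minutes / 60:.2f} hours)")
```

Note that this budget has to cover everything: scheduled maintenance, software updates, and actual failures.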

> 
> Its in the context of what stresses it puts on the system; if you're 
> designing for 5-9 availability on a distributed fault-tolerant system 
> [even if you're building on top of some framework like Horus/Isis], it 
> gets pretty complex.
> 
> Some of the nastier problems
> - what happens if the network partitions, and someone makes an update? 
> How do you reconcile the information when they rejoin?
> - How about things like a bad router table somewhere?
> - What kind of failure detectors are you using?

Let's add "fault isolation" to your "fault detection"
line above.  Also add "automatic failover" and "failback"
(without hysteresis).  Reboot is NOT an option.

> - Are you dealing with Byzantine failure models? What resilience are you 
> targeting?
> - What happens when A can talk to B and C (and vice versa), but the link 
> between B & C is very slow?
> 
> Hats off to you if you've already built the infrastructure to test some 
> of those corner cases, let alone derive the code from them.

Amen to that remark, CTips.

These are examples of what some
folks call "non-functional" requirements and others
call "supra-functional" requirements.  Others call
them "implied" and "derived" requirements based on
customer expectations.  If you're working on a UI
on an unstable OS in the first place, they will
probably curse the OS rather than you.

I will say, however, that 99.99% of developers may never
have to program in this environment, so it, itself, may
be a "corner case."  At the 5-9's level that you are talking
about, testing is absolutely essential, not just unit
testing, but stress testing and stability testing.
(Two related but subtly different things.)  I also
fail to see how unit testing alone can drive the code
for this case.  I feel that this case needs up-front design
rather than a reactive approach.



-- 
"It is impossible to make anything foolproof
because fools are so ingenious"
  - A. Bloch
Reply Nick 11/19/2004 11:07:37 PM

Nick Landsberg wrote:

> CTips wrote:
> 
>> Andrew McDonagh wrote:
>>
>>> CTips wrote:
>>>
>>> snipped
>>>
>>>> I suspect that I'd call your program on the small size. Unless of 
>>>> course, the program is a distributed fault-tolerant program with 
>>>> support for things like recovery after network partitioning. Or 
>>>> unless it requires switch-level 5-9s reliability. In which case it 
>>>> definitely counts as a medium sized program.
>>>>
>>>
>>> It is a distributed fault tolerant system, and requires 5-9s 
>>> availability - not reliability.
>>>
>>> Strange, I'd never use 5-9s availability as a measurement of 
>>> application size, cause I can get that with a simple HelloWorld app.
>>
>>
>>
>> Remember the word "switch-level 5-9s reliability" - that means that 
>> the service will be unavailable something like 9 hours per year; this 
>> will include any scheduled maintainence on the servers, as well as any 
>> software updates, apart from the usual possibility of actual computer 
>> and/or network failures.
> 
> 
> 5-9's is a little over 5 minutes a year.

Yeah...sorry, I had confused 3-9s and 5-9s - my bad.

> Not achievable on a single system with commercially
> available hardware that I know of.  A distributed
> FT system is the way to go (as noted below).

Well, it needs to be parallel yes; distributed makes the problem a whole 
lot harder (by distributed I mean multiple machines connected by a 
network, by parallel I mean multiple CPUs in the same nest). In a single 
machine with multiple CPUs (such as a zSeries mainframe), the machine is 
a controlled environment which can be exactly characterized for MTBF, 
usually by the manufacturer.

Clusters are somewhat similar, except that the characterization has to 
take into account the network, but presumably the network is completely 
under your control and is not likely to change.

In a distributed environment the environment is much less fixed. There 
may be hardware on the links which are not under your control. The links 
themselves may change. Roughly speaking in the first two cases, the 
system is synchronous, while in the distributed case, the system is 
asynchronous. That causes both theoretical and practical difficulties.

>>
>> Its in the context of what stresses it puts on the system; if you're 
>> designing for 5-9 availability on a distributed fault-tolerant system 
>> [even if you're building on top of some framework like Horus/Isis], it 
>> gets pretty complex.
>>
>> Some of the nastier problems
>> - what happens if the network partitions, and someone makes an update? 
>> How do you reconcile the information when they rejoin?
>> - How about things like a bad router table somewhere?
>> - What kind of failure detectors are you using?
> 
> 
> Let's add "fault isolation" to your "fault detection"
> line above.  Also add "automatic failover" and "failback"
> (without hysteresis).  Reboot is NOT an option.
> 
>> - Are you dealing with Byzantine failure models? What resilience are 
>> you targeting?
>> - What happens when A can talk to B and C (and vice versa), but the 
>> link between B & C is very slow?
>>
>> Hats off to you if you've already built the infrastructure to test 
>> some of those corner cases, let alone derive the code from them.
> 
> 
> Amen to that remark, CTips.
> 
> These are examples of what some
> folks call "non-functional" requirements and others
> call "supra-functional" requirements.  Others call
> them "implied" and "derived" requirements based on
> customer expectations.  If you're working on a UI
> on an unstable OS in the first place, they will
> probably curse the OS rather than you.

<Gasp> No, really - someone's going to use Windows for 5-9 stuff?

> I will say, however, that 99.99% of developers may never
> have to program in this environment, so it, itself, may
> be a "corner case."  At the 5-9's level that you are talking
> about, testing is absolutely essential, not just unit
> testing, but stress testing and stability testing.
> (Two related but subtly different things.)  I also
> fail to see how unit testing alone can drive the code
> for this case.  I feel that this case needs up-front design
> rather than a reactive approach.

However Andrew McDonagh's team has either achieved 5-9 availability with 
TDD, or feel that they will get there by the time they finish. It would 
be interesting to hear how they spec'd the machines and network to get 
to that level - are they using dedicated, non-IP connections between 
zSeries or Tandem machines? And it would be equally interesting to hear 
about the analysis they had to do to hit that.

Also, they're using Java in at least part of their apps. That's an 
interesting choice. One hopes that they will use compiled Java with 
compiled libraries statically linked in, or are using controlled / 
dedicated systems. IMHO, there is very little chance of hitting 5-9s 
using a JVM (except on a dedicated system) - too much chance of someone 
changing the operating environment.
Reply CTips 11/20/2004 3:35:10 AM

CTips wrote:

> Nick Landsberg wrote:
> 
>> CTips wrote:
>>
>>> Andrew McDonagh wrote:
>>>
>>>> CTips wrote:
>>>>
>>>> snipped
>>>>
>>>>> I suspect that I'd call your program on the small size. Unless of 
>>>>> course, the program is a distributed fault-tolerant program with 
>>>>> support for things like recovery after network partitioning. Or 
>>>>> unless it requires switch-level 5-9s reliability. In which case it 
>>>>> definitely counts as a medium sized program.
>>>>>
>>>>
>>>> It is a distributed fault tolerant system, and requires 5-9s 
>>>> availability - not reliability.
>>>>
>>>> Strange, I'd never use 5-9s availability as a measurement of 
>>>> application size, cause I can get that with a simple HelloWorld app.
>>>
>>>
>>>
>>>
>>> Remember the word "switch-level 5-9s reliability" - that means that 
>>> the service will be unavailable something like 9 hours per year; this 
>>> will include any scheduled maintainence on the servers, as well as 
>>> any software updates, apart from the usual possibility of actual 
>>> computer and/or network failures.
>>
>>
>>
>> 5-9's is a little over 5 minutes a year.
> 
> 
> Yeah...sorry, I had confused 3-9s and 5-9s - my bad.
> 
>> Not achievable on a single system with commercially
>> available hardware that I know of.  A distributed
>> FT system is the way to go (as noted below).
> 
> 
> Well, it needs to be parallel yes; distributed makes the problem a whole 
> lot harder (by distributed I mean multiple machines connected by a 
> network, by parallel I mean multiple CPUs in the same nest). In a single 
> machine with multiple CPUs (such as a zSeries mainframe), the machine is 
> a controlled environment which can be exactly characterized for MTBF, 
> usually by the manufacturer.
> 
> Clusters are somewhat similar, except that the characterization has to 
> take into account the network, but presumably the network is completely 
> under your control and is not likely to change.
> 
> In a distributed environment the environment is much less fixed. There 
> may be hardware on the links which are not under your control. The links 
> themselves may change. Roughly speaking in the first two cases, the 
> system is synchronous, while in the distributed case, the system is 
> ansynchronous. That causes both theoretical and practical difficulties.
> 

Tell me about it.  You're preaching to the choir here (at
least in my case).  Simple Markov models may be developed
for such cases which, in turn, drive the "implied"
requirements, e.g. MTBF and MTTR (mean time between
failure and mean time to restore).  Testing MTBF is
a bear, but testing MTTR is eminently possible through
fault injection techniques.
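
The simplest such model has two states (up and down), and its steady-state availability is MTBF / (MTBF + MTTR). A minimal sketch of how that turns an availability target into an MTTR requirement; the 1000-hour MTBF below is a made-up number purely for illustration, and the function names are hypothetical.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a two-state (up/down) Markov model."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def max_mttr_hours(mtbf_hours: float, target: float) -> float:
    """Largest MTTR that still meets `target` availability for a given MTBF."""
    return mtbf_hours * (1.0 - target) / target


# Hypothetical figure: one failure every 1000 hours on average.
mtbf = 1000.0
budget = max_mttr_hours(mtbf, 0.99999)
print(f"MTTR must stay under {budget * 3600:.0f} seconds")
print(f"resulting availability: {availability(mtbf, budget):.5f}")
```

With those assumed numbers the restore budget is about 36 seconds per failure, which is the kind of figure a fault-injection test can actually check.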

>>>
>>> Its in the context of what stresses it puts on the system; if you're 
>>> designing for 5-9 availability on a distributed fault-tolerant system 
>>> [even if you're building on top of some framework like Horus/Isis], 
>>> it gets pretty complex.
>>>
>>> Some of the nastier problems
>>> - what happens if the network partitions, and someone makes an 
>>> update? How do you reconcile the information when they rejoin?
>>> - How about things like a bad router table somewhere?
>>> - What kind of failure detectors are you using?
>>
>>
>>
>> Let's add "fault isolation" to your "fault detection"
>> line above.  Also add "automatic failover" and "failback"
>> (without hysteresis).  Reboot is NOT an option.
>>
>>> - Are you dealing with Byzantine failure models? What resilience are 
>>> you targeting?
>>> - What happens when A can talk to B and C (and vice versa), but the 
>>> link between B & C is very slow?
>>>
>>> Hats off to you if you've already built the infrastructure to test 
>>> some of those corner cases, let alone derive the code from them.
>>
>>
>>
>> Amen to that remark, CTips.
>>
>> These are examples of what some
>> folks call "non-functional" requirements and others
>> call "supra-functional" requirements.  Others call
>> them "implied" and "derived" requirements based on
>> customer expectations.  If you're working on a UI
>> on an unstable OS in the first place, they will
>> probably curse the OS rather than you.
> 
> 
> <Gasp> No, really - someones going to use Windows for 5-9 stuff?

Does the phrase "over my dead body" ring a bell?

> 
>> I will say, however, that 99.99% of developers may never
>> have to program in this environment, so it, itself, may
>> be a "corner case."  At the 5-9's level that you are talking
>> about, testing is absolutely essential, not just unit
>> testing, but stress testing and stability testing.
>> (Two related but subtly different things.)  I also
>> fail to see how unit testing alone can drive the code
>> for this case.  I feel that this case needs up-front design
>> rather than a reactive approach.
> 
> 
> However Andrew McDonagh's team has either achieved 5-9 availability with 
> TDD, or feel that they will get there by the time they finish. It would 
> be interesting to hear how they spec'd the machines and network to get 
> to that level - are they using dedicated, non-IP connections between 
> zSeries or Tandem machines? And it would be equally interesting to hear 
> about the analysis they had to do to hit that.

I would also be interested in seeing the results of that.

> 
> Also, they're using Java in at least part of their apps. Thats an 
> interesting choice. One hopes that they will use compiled java with 
> compiled libraries statically linked in, or are using controlled / 
> dedicated systems. IMHO, there is very little chance of hitting 5-9s 
> using a JVM (except on a dedicated system) - too much chance of someone 
> changing the operating enviornment.

On a dedicated system, I have seen at least one project achieve
that (after beating on SUN to fix a bug in the GC to no avail,
and spending many staff months to find the memory leak and fix it.)
Even so, they had to restart the JVM once every three months,
30 seconds restart time.  The customer grudgingly accepted the
"solution" since it could be scheduled, rather than happen during
busy hour.

NPL

-- 
"It is impossible to make anything foolproof
because fools are so ingenious"
  - A. Bloch
Reply Nick 11/20/2004 1:52:19 PM

Nick Landsberg wrote:
> CTips wrote:
> 
>> Nick Landsberg wrote:
>>
>>> CTips wrote:
>>
>> However Andrew McDonagh's team has either achieved 5-9 availability 
>> with TDD, or feel that they will get there by the time they finish. It 
>> would be interesting to hear how they spec'd the machines and network 
>> to get to that level - are they using dedicated, non-IP connections 
>> between zSeries or Tandem machines? And it would be equally 
>> interesting to hear about the analysis they had to do to hit that.
> 
> 
> I would also be interested in seeing the results of that.

You seem to have more recent experience with this than I do - would 
you use anything other than big-iron (IBM zSeries, Unisys ???, Tandem or 
whatever they're calling it these days) for 5-9 apps? Do the AS/400s or 
any of the Suns make the cut? How about the Opteron based stuff from 
Newisys (though I heard that they got acquired)?

>>
>> Also, they're using Java in at least part of their apps. Thats an 
>> interesting choice. One hopes that they will use compiled java with 
>> compiled libraries statically linked in, or are using controlled / 
>> dedicated systems. IMHO, there is very little chance of hitting 5-9s 
>> using a JVM (except on a dedicated system) - too much chance of 
>> someone changing the operating enviornment.
> 
> 
> On a dedicated system, I have seen at least one project achieve
> that (after beating on SUN to fix a bug in the GC to no avail,
> and spend many staff months to find the memory leak and fix it.)
> Even so, they had to restart the JVM once every three months,
> 30 seconds restart time.  The customer grudgingly accepted the
> "solution" since it could be scheduled, rather than happen during
> busy hour.

Was this a JNI issue? I remember that at least on one implementation, 
stuff would not be GC'd if there was a pointer lying around on the C 
stack, and it turned out that for a weird combination of reasons a 
pointer value persisted on the stack for a long time even though it was 
dead. Had to show them how to tweak stuff so that the stack was zero'd 
on exit.
Reply CTips 11/21/2004 1:27:28 AM

CTips wrote:

> Nick Landsberg wrote:
> 
>> CTips wrote:
>>
>>> Nick Landsberg wrote:
>>>
>>>> CTips wrote:
>>>
>>>
>>> However Andrew McDonagh's team has either achieved 5-9 availability 
>>> with TDD, or feel that they will get there by the time they finish. 
>>> It would be interesting to hear how they spec'd the machines and 
>>> network to get to that level - are they using dedicated, non-IP 
>>> connections between zSeries or Tandem machines? And it would be 
>>> equally interesting to hear about the analysis they had to do to hit 
>>> that.
>>
>>
>>
>> I would also be interested in seeing the results of that.
> 
> 
> You seem to be have more recent experience about this than I do - would 
> you use anything other than big-iron (IBM zSeries, Unisys ???, Tandem or 
> whatever they're calling it these days) for 5-9 apps? Do the AS/400s or 
> any of the Suns make the cut? How about the Opeteron based stuff from 
> Newisys (though I heard that they got acquired)?

If you had asked that question about 5-7 years ago,
my answer would have been yes... use big-iron.
Since then, we have had success with Sun servers
and Intel servers running various versions of Solaris.
Could we get the same with Linux?  Skunkworks project
trying now.  We'll see.

We've achieved 5-9's with a whole spitload of software
assist (and redundancy, of course).  Automatic failover
when the heartbeat fails, data replication while both
systems are up, resyncing when the failed system
comes back up... the usual set of suspects.
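
For readers who haven't seen one, a heartbeat-driven failover can be sketched as below. This is a toy illustration under my own assumptions, not the system described above: class names and the timeout are hypothetical, and it ignores data replication, resyncing, and split-brain entirely.

```python
import time


class HeartbeatMonitor:
    """Treats the peer as failed when no heartbeat arrives within `timeout` seconds."""

    def __init__(self, timeout: float, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so tests can control time
        self.last_seen = clock()

    def beat(self) -> None:
        """Record a heartbeat received from the peer."""
        self.last_seen = self.clock()

    def peer_alive(self) -> bool:
        return self.clock() - self.last_seen <= self.timeout


class StandbyNode:
    """Standby that promotes itself to active once the primary's heartbeat stops."""

    def __init__(self, monitor: HeartbeatMonitor):
        self.monitor = monitor
        self.active = False

    def poll(self) -> None:
        if not self.active and not self.monitor.peer_alive():
            self.active = True  # automatic failover: take over from the primary


# Usage with a fake clock, so the failure can be injected deterministically.
now = [0.0]
standby = StandbyNode(HeartbeatMonitor(timeout=2.0, clock=lambda: now[0]))
now[0] = 5.0  # primary has been silent for 5 seconds
standby.poll()
print(standby.active)  # True
```

Injecting the clock is what makes the failover time testable without waiting in real time, which is the same idea as measuring MTTR by fault injection.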

It's taken many years to get it almost right.

> 
>>>
>>> Also, they're using Java in at least part of their apps. Thats an 
>>> interesting choice. One hopes that they will use compiled java with 
>>> compiled libraries statically linked in, or are using controlled / 
>>> dedicated systems. IMHO, there is very little chance of hitting 5-9s 
>>> using a JVM (except on a dedicated system) - too much chance of 
>>> someone changing the operating enviornment.
>>
>>
>>
>> On a dedicated system, I have seen at least one project achieve
>> that (after beating on SUN to fix a bug in the GC to no avail,
>> and spend many staff months to find the memory leak and fix it.)
>> Even so, they had to restart the JVM once every three months,
>> 30 seconds restart time.  The customer grudgingly accepted the
>> "solution" since it could be scheduled, rather than happen during
>> busy hour.
> 
> 
> Was this a JNI issue? I remember that at least on one implementation, 
> stuff would not be GC'd if there was a pointer lying around on the C 
> stack, and it turned out that for a weird combination of reasons a 
> pointer value persisted on the stack for a long time even though it was 
> dead. Had to show them how to tweak stuff so that the stack was zero'd 
> on exit.

I don't recall the exact issue, sorry.  I *do* know that
the app is now running for 180+ days without a restart.
The customer has not reported any memory growth during that time.
(And they are savvy enough to monitor that, along with
CPU utilization and network utilization.)

Every 6 months there's a new version. It gets loaded
during a "maintenance period" and, obviously, the new
version starts up after that.  The customer soaks this
in the lab for at least a month before foisting it on
real users.

I'm no longer on that project since it's stable.  I get
to poke at the unstable ones and make "suggestions" as
to how to stabilize them.  The verbiage I use is left
as an exercise for the reader :)

NPL

-- 
"It is impossible to make anything foolproof
because fools are so ingenious"
  - A. Bloch
Reply Nick 11/21/2004 3:44:11 AM

> Given that going through a sequence of simple steps quite often yeilds 
> pushes one into a sub-optimal design and requires extensive rewriting, 
> it seems quite inefficient. This is going to be more likely in complex 
> situations than in simple ones. Alternatively - if you already know 
> up-front a near-optimal design, why bother with the TD_D_?

I find this comment very enlightening. I have a theory that most of
the approaches are related and are driving at an accelerated
experience.

If you already have the experience necessary to choose the correct
design then emergent approaches are not necessary.

"Experience trumps formal estimation" but also experience trumps
iterations, refactorings, and all techniques to get to that supple
design.

If you know what you are doing then do it.

Since you are human there will still be a need to catch mistakes in
implementation.

SirGilligan
Reply sirgilligan 11/22/2004 5:08:39 PM

Phlip wrote:
> Andrew McDonagh wrote:
> 
> 
>>Phlip wrote:
> 
> 
> Thank you thank you.
> 
> 
>>snipped the very excellent Wizard of Oz metaphor..
> 
> 
> Are you a Good Lead Architect, or a Bad Lead Architect?
> 
> (but I'm in mourning for not capitalizing "SOAP bubble"...)
> 

Some say I'm a straw man ;-)
Reply Andrew 11/23/2004 8:50:11 PM

[This is a thread from about a year ago; I've included links (and 
excerpts) from a couple of posts in it at the bottom, for those who want 
more context.]

To summarize: Andrew McDonagh posted that his group was developing a 
5-9s availability distributed program using TDD and XP (at least partly 
in Java).

I was wondering what progress had been made on the project. How well did 
TDD work? In particular, how well did TDD do in ensuring 5-9s reliability?

------------------------------------------------------------
http://groups.google.com/group/comp.programming/msg/f8da85007880f7d8
Andrew McDonagh wrote:
> We are using TDD (and XP in fact) on (IMO) a medium sized application.
> Its a telecomms equipment Network Management System. Multiple Servers
> managing telecomms equipment and providing that data and manageability
> to multiple clients.
> 
> We use CORBA as the RPC mechanism between each (n equipment <-> n
> servers <-> n clients)
> 
> The system currently supports Fault and Configuration Management and in
> due course will provide Auditing, Performance and Security management of
>   the telecomms equipment. 


http://groups.google.com/group/comp.programming/msg/3e541ec3bb9cac7c
Andrew McDonagh wrote:
> It is a distributed fault tolerant system, and requires 5-9s 
> availability - not reliability.
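
For anyone unsure what the "5-9s" figure amounts to: 99.999% availability leaves a little over five minutes of downtime per year. Below is a minimal sketch of that arithmetic in Python. The function names and the 365.25-day year are my own illustrative assumptions, nothing taken from the project being discussed.

```python
# Sketch only: yearly downtime budget implied by "N nines" of availability.
# Assumes an average year of 365.25 days; names here are hypothetical.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_budget_minutes(nines: int) -> float:
    """Minutes of downtime per year allowed by `nines` nines (5 -> 99.999%)."""
    unavailability = 10.0 ** -nines
    return MINUTES_PER_YEAR * unavailability


def availability(uptime_minutes: float, downtime_minutes: float) -> float:
    """Availability as the fraction of total time the system was up."""
    return uptime_minutes / (uptime_minutes + downtime_minutes)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: {downtime_budget_minutes(n):.2f} minutes/year")
```

Note that this measures availability (the fraction of time the service is up), which is the distinction drawn in the excerpt above; it says nothing about how often failures occur.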
Reply CTips 10/22/2005 8:02:25 PM

CTips wrote:
> [This is a thread from about a year ago; I've included links (and 
> excerpts) from a couple of posts in it at the bottom, for those who want 
> more context.]
> 
> To summarize: Andrew McDonagh posted that his group was developing a 
> 5-9s availability distributed program using TDD and XP (at least partly 
> in Java).
> 
> I was wondering what progress had been made on the project. How well did 
> TDD work? In particular, how well did TDD do in ensuring 5-9s reliability?
> 
> ------------------------------------------------------------
> http://groups.google.com/group/comp.programming/msg/f8da85007880f7d8
> Andrew McDonagh wrote:
> 
>> We are using TDD (and XP in fact) on (IMO) a medium sized application.
>> Its a telecomms equipment Network Management System. Multiple Servers
>> managing telecomms equipment and providing that data and manageability
>> to multiple clients.
>>
>> We use CORBA as the RPC mechanism between each (n equipment <-> n
>> servers <-> n clients)
>>
>> The system currently supports Fault and Configuration Management and in
>> due course will provide Auditing, Performance and Security management of
>>   the telecomms equipment. 
> 
> 
> 
> http://groups.google.com/group/comp.programming/msg/3e541ec3bb9cac7c
> Andrew McDonagh wrote:
> 
>> It is a distributed fault tolerant system, and requires 5-9s 
>> availability - not reliability.

Hi,

Don't have time today to answer this - but wanted to let you know I will 
ASAP.

Andrew
Reply Andrew 10/23/2005 10:24:47 PM

Andrew McDonagh wrote:
> CTips wrote:
> 
>> [This is a thread from about a year ago; I've included links (and 
>> excerpts) from a couple of posts in it at the bottom, for those who 
>> want more context.]
>>
>> To summarize: Andrew McDonagh posted that his group was developing a 
>> 5-9s availability distributed program using TDD and XP (at least 
>> partly in Java).
>>
>> I was wondering what progress had been made on the project. How well 
>> did TDD work? In particular, how well did TDD do in ensuring 5-9s 
>> reliability?
>>
>> ------------------------------------------------------------
>> http://groups.google.com/group/comp.programming/msg/f8da85007880f7d8
>> Andrew McDonagh wrote:
>>
>>> We are using TDD (and XP in fact) on (IMO) a medium sized application.
>>> Its a telecomms equipment Network Management System. Multiple Servers
>>> managing telecomms equipment and providing that data and manageability
>>> to multiple clients.
>>>
>>> We use CORBA as the RPC mechanism between each (n equipment <-> n
>>> servers <-> n clients)
>>>
>>> The system currently supports Fault and Configuration Management and in
>>> due course will provide Auditing, Performance and Security management of
>>>   the telecomms equipment. 
>>
>>
>>
>>
>> http://groups.google.com/group/comp.programming/msg/3e541ec3bb9cac7c
>> Andrew McDonagh wrote:
>>
>>> It is a distributed fault tolerant system, and requires 5-9s 
>>> availability - not reliability.
> 
> 
> Hi,
> 
> Don't have time today to answer this - but wanted to let you know I will 
> ASAP.
> 
> Andrew

Came across a post by you recently that jogged my memory. If you have 
time ....

Thanks
Reply CTips 6/8/2006 2:37:23 PM

"CTips" <ctips@bestweb.net> wrote in message 
news:128gd5c5pca239c@corp.supernews.com...

> Andrew McDonagh wrote:
>> CTips wrote:

>>> [This is a thread from about a year ago; I've included links (and 
>>> excerpts) from a couple of posts in it at the bottom, for those who want 
>>> more context.]

>>> To summarize: Andrew McDonagh posted that his group was developing a 
>>> 5-9s availability distributed program using TDD and XP (at least partly 
>>> in Java).

>>> I was wondering what progress had been made on the project. How well did 
>>> TDD work? In particular, how well did TDD do in ensuring 5-9s 
>>> reliability?

>>> ------------------------------------------------------------
>>> http://groups.google.com/group/comp.programming/msg/f8da85007880f7d8
>>> Andrew McDonagh wrote:

>>>> We are using TDD (and XP in fact) on (IMO) a medium sized application.
>>>> Its a telecomms equipment Network Management System. Multiple Servers
>>>> managing telecomms equipment and providing that data and manageability
>>>> to multiple clients.

>>>> We use CORBA as the RPC mechanism between each (n equipment <-> n
>>>> servers <-> n clients)

>>>> The system currently supports Fault and Configuration Management and in
>>>> due course will provide Auditing, Performance and Security management of
>>>> the telecomms equipment.

>>> http://groups.google.com/group/comp.programming/msg/3e541ec3bb9cac7c
>>> Andrew McDonagh wrote:

>>>> It is a distributed fault tolerant system, and requires 5-9s 
>>>> availability - not reliability.

>> Hi,

>> Don't have time today to answer this - but wanted to let you know I will 
>> ASAP.

> Came across a post by you recently that jogged my memory. If you have time 
> ....

This is the OAM stuff (mgr apps +/- mgmt agent) for the GSM pico-cell
systems (OMC +/- BSC +/- BTS) built by ip.access, is it not?

An agent on a GSM pico-cell BTS is low-scale.
Without product specs, the BSC/BTS ratio is not known (similarly for
OMC/BSC), and hence neither is the scale of the real-world networks.

It would be difficult for me to do a like-for-like comparison on similar
systems I have worked on (UMTS base station controllers supporting
250-500,000 busy hour call attempts, using iterative development very
similar to the Feature Driven Development method etc.).


Regards,
Steven Perryman 


Reply S 6/9/2006 7:51:01 AM
