Sunday, July 23, 2006

Software Testing

These are some notes from an evening course I've taught at the
University of Calgary in Continuing Education for a few years.
Eventually I hope to organize them better and merge them into
something more whole and coherent, but for now here they are so
I won't forget where I put them!

Unit Testing and Refactoring

The topic of this class is TDD, or Test-Driven Development. TDD is
about developing high quality code, keeping the design simple, and
having a regression framework in place for code maintenance. However,
TDD alone is not sufficient to ensure the quality of commercial or
enterprise applications. I break testing down into three general
categories:

I) Developer Tests (TDD)
II) QA Tests
III) User Acceptance Tests

Each category is somewhat different from the others. The primary
goal of developer testing is a strong design, high code quality, and
low defect rates.

I) Developer Tests (TDD)

In TDD, a developer always writes a test first, before writing any
code. The steps in TDD are always the same:

1) Write a test
2) Make sure the test fails as expected. If there are new classes or
methods involved, then the test won't even compile in a language
like Java or C#. If the function already exists and you are about
to modify the code inside it, then make sure the test
initially fails the way you expect it to fail. Let's say you
have a function which calculates change for a purchase, and you
are writing a test that requires it to break change down differently
depending on the quantities of available denominations: where
it currently returns one quarter, it is supposed to return two
dimes and a nickel. Make sure the test initially fails because the
function returns a quarter (as opposed to crashing, or already
returning the right change before you've modified the code).
3) Write the new code and make sure the tests now all pass.
4) Examine the design and refactor any duplication (I'll discuss
refactoring in more detail in another class).

That's it, now rinse and repeat!
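
As a rough illustration of steps 2 and 3, here is a minimal sketch in plain Java. The class and method names are made up for this example, and the "levelling" rule is assumed to be a simple greedy pass that only hands out coins the machine has in stock; the real algorithm may well differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout. The "levelling" rule is an assumption:
// a greedy pass that only uses coins that are actually in stock.
class ChangeStep {
    static final int[] DENOMINATIONS = {25, 10, 5};

    // The code as it stands before the change: it ignores coin stock.
    static int[] makeChangeOld(int cents) {
        List<Integer> coins = new ArrayList<>();
        for (int d : DENOMINATIONS) {
            while (cents >= d) {
                coins.add(d);
                cents -= d;
            }
        }
        return toArray(coins);
    }

    // Step 3: the new code. stock[i] is the number of DENOMINATIONS[i] coins.
    static int[] makeChangeNew(int cents, int[] stock) {
        List<Integer> coins = new ArrayList<>();
        for (int i = 0; i < DENOMINATIONS.length; i++) {
            int left = stock[i];
            while (cents >= DENOMINATIONS[i] && left > 0) {
                coins.add(DENOMINATIONS[i]);
                cents -= DENOMINATIONS[i];
                left--;
            }
        }
        if (cents != 0) {
            throw new IllegalStateException("cannot make exact change");
        }
        return toArray(coins);
    }

    static int[] toArray(List<Integer> coins) {
        return coins.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] expected = {10, 10, 5};
        int[] noQuarters = {0, 5, 5}; // no quarters, five dimes, five nickels

        // Step 2: the old code returns a quarter, so the test fails as expected.
        System.out.println(Arrays.toString(makeChangeOld(25)));
        // Step 3: the new code satisfies the test.
        System.out.println(Arrays.equals(expected, makeChangeNew(25, noQuarters)));
    }
}
```

The point of running the old code first is to see it fail with a quarter rather than with an exception or an accidental pass.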

I'd like to make it clear that the goals of TDD are somewhat
different from the goals of other kinds of testing. TDD is
a development activity. The goal of TDD is first and foremost
to drive the design of the code itself. Using TDD ought to
generate code in which independent concerns are expressed in
separate pieces of code. Such separation makes it easier to
test the code. If the code is hard to test, that implies the
design of the code is not optimal. If the code is easy to
test, then the design of the code is better. Better code means
it's easier to add new features and it also means the QA people
will find fewer defects, and especially should find almost no
defects related to basic functionality.

You can look at an application as a bunch of nested boxes. The
innermost box is the framework you're building on. It's there
before you've written a single line of your
own code. Then you develop code around that kernel. The code
tends to become organized in ever wider layers, although some
"inner" layers may depend on outer layers, so the layering
is rarely "perfect." Still, good code generally has this sort of
hierarchical structure:
[ __A__ ]
[ [ B ] ]
[ [ [C] ] ]
[ [_____] ]
[_____________] etc...

If you're writing tests for code in B the general approach is
as follows: If necessary, you can "mock" out functionality in C
by substituting a mock, stub, or fake for the real code in C.
This kind of thing is done when the real code in C accesses
external resources which are irrelevant to testing the logic in B.
Having extensively tested B by itself, you would then write
a comparatively small number of tests against the application
as a whole making sure that the code in B is actually used by
the application. Such "sanity" tests will make sure the code in B
really has been included in the app and works for a few basic
cases. You can think of such "functional" tests as poking lines
through the outer layer of the application all the way through.
Note: do not confuse a functional test in TDD with a FIT test, which
is a table-driven user acceptance test (FIT stands for Framework for
Integrated Test). The extensive testing of B has already been done by
the unit tests, so even though these functional tests do not
test every path through B, they make sure it's properly fitted
into the application as a whole.
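
To make the idea of stubbing out C concrete, here is a small sketch in plain Java. Everything in it is hypothetical: `CoinHopper` stands in for layer C (imagine it talks to hardware), and `ExactChangePolicy` stands in for a piece of logic in layer B. The 25-cent threshold is an arbitrary assumption for the example.

```java
// C: a hypothetical interface onto an external resource, such as the
// coin hopper hardware. The real implementation is irrelevant to B's logic.
interface CoinHopper {
    int coinsAvailable(int denomination);
}

// B: the logic under test. It depends on C only through the interface,
// which is what makes it possible to substitute a stub.
class ExactChangePolicy {
    private final CoinHopper hopper;

    ExactChangePolicy(CoinHopper hopper) {
        this.hopper = hopper;
    }

    // Ask for exact change when the dimes and nickels on hand are worth
    // less than 25 cents (an assumed threshold).
    boolean exactChangeOnly() {
        int smallChange = 10 * hopper.coinsAvailable(10)
                        + 5 * hopper.coinsAvailable(5);
        return smallChange < 25;
    }
}

class ExactChangePolicyDemo {
    public static void main(String[] args) {
        // Stubs for C: no hardware involved, just canned answers.
        CoinHopper empty = denomination -> 0;
        CoinHopper stocked = denomination -> 5;

        System.out.println(new ExactChangePolicy(empty).exactChangeOnly());
        System.out.println(new ExactChangePolicy(stocked).exactChangeOnly());
    }
}
```

Because B only sees the interface, the unit tests for B can cover many stock levels quickly, and only a few functional tests need the real hopper.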

A functional test through B looks something like this:
[ _\_A___ ]
[ [ \B ] ]
[ [ [\C] ] ]
[ [_______] ]
[_______________] etc...

You can see it's hard to write enough tests like this to properly
cover B. That's what the unit tests are for.

Here is an example unit test in Java-like syntax (modified from real
Java to make the example more readable):

public void
testCalculateChange_WithChangeLevelling() {
    VendingMachine vm = new VendingMachine();

    int[] change = vm.makeChange(25);

    assertEquals("use nickels and dimes", [10, 10, 5], change);
}

Note: Before the "levelling" algorithm is implemented, makeChange should
return [25] instead of the expected [10, 10, 5], so the test fails. Note
also that this test exercises only the makeChange function. It does not
worry about how the amount of change itself is calculated. A functional
test would be more complicated because more setup would be required, but
there are generally fewer of these per feature:

public void
testFunctionalTest_Purchase_WithChangeLevelling() {
    VendingMachine vm = new VendingMachine();
    vm.addItem("Mars Bar", "$0.75", 15); // 15 Mars bars at 75 cents each

    int[] change = vm.purchase("Mars Bar", "$1.00");

    assertEquals("use nickels and dimes", [10, 10, 5], change);
}
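
The two tests above use simplified syntax and do not show the class being tested. For anyone who wants something that compiles, here is one possible sketch of a `VendingMachine` that would satisfy the functional test. The implementation is entirely assumed: in particular, the coin stock (no quarters, five dimes, five nickels) is hard-coded only so that the example produces the levelled change, and there is no handling of unknown or sold-out items.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical implementation; only the method names come from the tests above.
class VendingMachine {
    private final Map<String, Integer> priceInCents = new HashMap<>();
    private final Map<String, Integer> quantity = new HashMap<>();
    // Assumed stock, as {denomination, count}: no quarters on hand.
    private final int[][] coinStock = {{25, 0}, {10, 5}, {5, 5}};

    void addItem(String name, String price, int count) {
        priceInCents.put(name, toCents(price));
        quantity.put(name, count);
    }

    int[] purchase(String name, String paid) {
        int change = toCents(paid) - priceInCents.get(name);
        if (change < 0) {
            throw new IllegalArgumentException("insufficient payment");
        }
        quantity.merge(name, -1, Integer::sum); // one fewer item in stock
        return makeChange(change);
    }

    int[] makeChange(int cents) {
        List<Integer> coins = new ArrayList<>();
        for (int[] slot : coinStock) {
            while (cents >= slot[0] && slot[1] > 0) {
                coins.add(slot[0]);
                cents -= slot[0];
                slot[1]--;
            }
        }
        return coins.stream().mapToInt(Integer::intValue).toArray();
    }

    // Converts a price such as "$0.75" to 75.
    private static int toCents(String dollars) {
        return new BigDecimal(dollars.replace("$", ""))
                .movePointRight(2)
                .intValueExact();
    }
}

class FunctionalTestSketch {
    public static void main(String[] args) {
        VendingMachine vm = new VendingMachine();
        vm.addItem("Mars Bar", "$0.75", 15);

        int[] change = vm.purchase("Mars Bar", "$1.00");

        System.out.println(Arrays.toString(change)); // prints [10, 10, 5]
    }
}
```

In a real project the check in `main` would be an assertion in a test framework such as JUnit rather than a print statement.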

II) QA Tests

QA testing is done with a build of the application provided by
the developers, so it happens after a certain amount of code has
been developed. It is usually done by QA professionals, often with
support from users or business people, to make sure the developed
software really does work and meets the user requirements. Some QA
testing can be automated with scripts. On our project, we use a tool
called Watir to script interactions with the Web application as if
a real user were clicking on buttons and entering data.

III) User Acceptance Tests

User acceptance tests are generally written before the application
code has been developed. The format of the tests is a table with the
inputs entered by the user and the expected outputs. Later, such
tests are linked in with the application and executed to determine
whether they pass. FIT (Framework for Integrated Test) and FitNesse
are tools commonly used in the XP community for this kind of
table-driven testing.
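
To give a feel for the table format, here is a plain-Java stand-in. It does not use the FIT library or its API; the `changeDue` function and the rows in the table are invented for illustration. Each row holds the inputs a user would enter and the output they expect.

```java
// A plain-Java imitation of a table-driven acceptance test.
// This is NOT the FIT API; all names and values are hypothetical.
class AcceptanceTableSketch {
    // Stand-in for the application code the table is linked to.
    static int changeDue(int priceCents, int paidCents) {
        return paidCents - priceCents;
    }

    // Each row: price, amount paid, expected change (all in cents).
    static final int[][] TABLE = {
        {75, 100, 25},
        {75, 75, 0},
        {120, 200, 80},
    };

    // Runs every row and returns the number of rows that failed.
    static int runTable() {
        int failures = 0;
        System.out.println("price |  paid | expected | result");
        for (int[] row : TABLE) {
            int actual = changeDue(row[0], row[1]);
            boolean pass = actual == row[2];
            System.out.printf("%5d | %5d | %8d | %s%n",
                    row[0], row[1], row[2],
                    pass ? "pass" : "FAIL (got " + actual + ")");
            if (!pass) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(runTable() + " row(s) failed");
    }
}
```

The appeal of the table form is that users or business people can write and read the rows without reading the code behind them.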
