Saturday, November 24, 2007

Pair Programming Redux

I've been working for a few years in a mostly XP pair programming environment. Here's my list of pros and cons of pair programming based on this experience:

Pros:
  • Unlike most other projects I've worked on, I've seen an overall improvement in the code quality over the past few years. As people learn better ways of doing things, pairing helps to disseminate that knowledge in a way that meetings, presentations, and code reviews just don't.
  • In my experience, there hasn't been a problem with people who are working together arguing and consequently getting nothing done. On the contrary, pairing seems to have encouraged an atmosphere of cooperation and friendliness.
  • Sometimes under time pressure one can't resist the urge to copy and paste some code or hack in some functionality without developing tests first. The mutual supervision of pair programming really does seem to have a positive effect on these kinds of transgressions. It's a lot harder to take a nasty shortcut under the watchful eye of the person you're pairing with.
  • I would tend to agree with the principle that working in pairs doesn't really hurt overall productivity. The reason seems to be that the team having a common understanding of the code and the business generally trumps the value of having two people typing code in separately. Integration is the real difficulty in many software projects, so whatever gain in the amount of code written might come from separating pairs, that gain seems to be offset by the greater consistency of code written when pairing.
Cons:
  • Hygiene: Pair programming in our environment means sharing one keyboard and mouse, sitting in close proximity to another person all day long, and switching pairs frequently. We have often had problems with people spreading colds around the office. I think developers in an XP shop should have their own personal wireless mouse/keyboard combination as well as their own chair that they don't have to share with anyone else.
  • Ergonomics: The reality is that sitting at a computer is not a natural position for the body and you can do damage to your back, neck, shoulders and head, and of course your wrists over time. Pairing tends to encourage bad ergonomic habits, because it's inconvenient to change the position of the keyboard tray, monitor, chair, etc when it's time to take one's turn at the keyboard. I really believe that these things need to be considered if you want to do pair programming long term.
  • Personal space: Some people enjoy having some peace and quiet as well as a space where they can keep their things. As much as I am a fan of pair programming, I think developers should have their own desks, away from the common bull pen, that they can retreat to from time to time.
My conclusion is that pair programming does work, but it requires some care. First, developers have to respect each other and be willing to compromise. Second, pair programming 100% of the time doesn't work: sometimes when facing a design problem it helps to go off and work separately, then get back together and discuss later. Finally, pair programming - constantly communicating, asking questions, explaining and justifying one's own ideas throughout the day - is very demanding. After a period of time, burnout can occur. When that happens, I think it's a good idea for people to be able to work on something alone for a while.

Thursday, November 15, 2007

Demystifying Stubs, Fakes, and Mock Objects

This blog entry is my attempt to explain the different kinds of test-only objects that can be substituted for the real thing in automated tests: Stubs, Fakes, and Mocks. It can be easy to get confused, so I hope this entry helps!

Stubs, fakes, and mocks are all objects that stand in for the objects that would normally be used in the actual application. What purpose do they serve? First, using such test-only objects promotes good design: writing code so that classes depend on interfaces and abstract classes - which can be implemented either by full-fledged production code or by a test-only class of some kind - helps to reduce the overall coupling in an application.

Test-only objects frequently come up when an application accesses another system. Making sure that system is available and produces the same response to a given input every time may be a problem (consider a service that provides the current price of stocks, for example), and simply configuring something like a relational database or a web server so that tests can run against it can be an issue. More generally, using a test-only object makes it easier to set up the initial conditions for tests: such objects can make it simpler to provide the initial state that the function under test will respond to. The time it takes for tests to run can also be a factor; simple test-only objects can speed up a test run substantially.

Finally, in some cases one may want to develop logic that depends on classes that haven't been written yet or that you don't have access to - say another team is working on that functionality and you can't see their code. In such cases, you can write code against objects that, for the purposes of testing your own logic, implement the interfaces you expect to see in the yet-to-be-written API. The core idea behind test-only objects is that we often want to write tests for the application logic we're currently working on without having to worry about the behaviour of the external code our logic is using.
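
To make the interface idea concrete, here is a minimal sketch of what it might look like for the stock price case. The names below (StockPriceService, LiveStockPriceService, FixedPriceStockService) are hypothetical, invented just for this illustration; the point is that application code depends only on the interface, so either implementation can be plugged in:

public interface StockPriceService {
    double getCurrentPrice(String tickerSymbol);
}

// Production implementation: would talk to the real quote provider (hypothetical).
public class LiveStockPriceService implements StockPriceService {
    public double getCurrentPrice(String tickerSymbol) {
        // ... network call to the real service would go here ...
        throw new UnsupportedOperationException("omitted from this sketch");
    }
}

// Test-only implementation: always answers with a price chosen by the test.
public class FixedPriceStockService implements StockPriceService {
    private final double price;

    public FixedPriceStockService(double price) {
        this.price = price;
    }

    public double getCurrentPrice(String tickerSymbol) {
        return price;
    }
}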

Let's consider a fuller example. I once wrote a program that allowed a user to schedule polling of different sensors - e.g. read sensor 'A' once an hour, sensor 'B' every minute, and sensor 'C' every second. I wanted to test my scheduling logic, but I wanted to make it independent of the actual sensors: I didn't want to have my application connected to real sensors just to make my unit tests run. The scheduling system also, of course, consulted the computer's system time to see whether it was time to read a given sensor. I wanted my tests to be independent of the actual system time as well: what I wanted was to set the time in my test and make sure my program responded by reading the correct sensors - and avoided reading the wrong ones. I didn't want to have to set the actual computer's system time inside of my tests! So, for both of these cases - connecting to the sensors, and setting and reading the time - I created special test-only objects to stand in for the ones that would be used in the actual application.

If you're interested in ways to instrument your code so that you can substitute these kinds of test-only objects, and the trade-offs involved, Jeff Langr's Don't Mock Me article is a great reference. I should note that his use of the word "mock" is more generic than the one that's often used: he means "mock" in the general sense of any kind of test object. As we'll see a bit further down, the term "mock object" often has a specific meaning that's different from stubs and fakes.

Now, on to Stubs, Fakes, and Mocks.

Stub: A stub is an object that always does the same thing. It's very simple and very dumb. In our sensor-polling example above, the system time is a natural candidate to replace with a stub. Let's suppose our scheduling code is in a class called Scheduler. This class might have a method called getSystemTime(). For the purposes of testing, we might create a TestingScheduler class that extends Scheduler and overrides the getSystemTime() method. Now the test can set the "system time" through the constructor of this test-specific class, e.g.:

public class TestingScheduler extends Scheduler {

    // The fixed "system time" the test wants the scheduler to see.
    private final int timeInMillisForTest;

    public TestingScheduler(int timeInMillisForTest) {
        this.timeInMillisForTest = timeInMillisForTest;
    }

    // Overrides the production method so the rest of Scheduler runs against
    // the time chosen by the test instead of the real clock.
    public int getSystemTime() {
        return timeInMillisForTest;
    }
}

When a TestingScheduler object is used as part of a test, the rest of the Scheduler logic works normally, but it's now getting the time that's been set in the test instead of the actual system time.

Fake: A fake is a more sophisticated kind of test object. The idea is that the object actually exhibits some real behaviour, yet in some essential ways it is not the real thing. Fakes can be used to speed up tests and/or to simplify configuration. For example, a project I am currently working on uses Oracle's Toplink as an object-relational mapper (ORM), which allows data in Java objects to be transparently saved to and retrieved from a relational database. To make tests that use this framework run faster, a much simplified, memory-only implementation of Toplink's interfaces was written. This version doesn't know about transactions and doesn't actually persist data, but it works well enough to let many of our tests run against it - and since the actual Oracle database isn't involved, the tests run over an order of magnitude faster. Going back to the scheduler example, we developed a piece of software that could behave as though it were a real sensor. That way we were able to run a variety of fairly complicated tests to make sure our application could communicate with sensors correctly without actually having to hook the tests up to a real sensor. Any time you write code that simulates an external service - some sensors, a web server, or what have you - you're creating a fake.
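
As a rough illustration, here's what a fake sensor might look like. The Sensor interface and FakeSensor class below are made up for this sketch (not the actual code from that project); the fake simply plays back a canned sequence of readings, so tests can exercise the sensor-handling logic without any hardware attached:

public interface Sensor {
    double read();
}

// Hypothetical fake: replays a fixed sequence of readings supplied by the test.
// Assumes at least one reading is provided.
public class FakeSensor implements Sensor {
    private final double[] readings;
    private int next = 0;

    public FakeSensor(double... readings) {
        this.readings = readings;
    }

    public double read() {
        // Once the canned values run out, keep returning the last one.
        if (next >= readings.length) {
            return readings[readings.length - 1];
        }
        return readings[next++];
    }
}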

You can find another simple example of a fake in the TestNode class in my loop finder example. TestNode implements the Node interface for the purposes of the unit tests. Classes that are actually part of the application have their own, more complex implementation of this interface - but we're not interested in testing their implementation of the Node interface here. This allows us to write tests that run in isolation from the rest of the application. From the perspective of the overall design, this approach helps us reduce coupling between classes: the LoopFinder class depends only on the Node interface rather than on any specific implementation. That's an example of how making code easier to test concomitantly improves the design.

Mock: Mock objects can be the most confusing to understand. First of all, one could argue that the two types of test classes mentioned above are mocks - after all, they both "mock out" or "simulate" a real class, and in fact mocks are a certain kind of stub or fake. However, the additional feature mock objects offer on top of acting as simple stubs or fakes is that they provide a flexible way to specify more directly how the function under test should actually operate. In this sense they also act as a kind of recording device: they keep track of which of the mock object's methods are called, with what kind of parameters, and how many times. If the function under test fails to exercise the mock as specified in the test, the test fails. That's why developing with mock objects is often called "interaction testing." You're not only writing a test which confirms that the state after a given method call matches the expected values; you're also specifying how the objects in the function under test, which of course have been replaced with mocks, ought to be exercised within a given test.

To sum up: A mock object framework can make sure a) that the method under test, when executed, will in fact call certain functions on the mock object (or objects) it interacts with and b) that the method under test will react in an appropriate way to whatever the mock objects do - this second part is not any different from what stubs and fakes offer.

We've already seen how stubs and fakes can be used, so let's create a hand-rolled example of the kind of thing that mock object frameworks can help with. Going back to the scheduler we've already talked about, let's say it processes a queue of ScheduledItem objects (ScheduledItem might be an interface). If it's time to run one of these items, the scheduler calls the item's execute method. In our test, we can create a queue of mock items such that only one of them is supposed to be executed. A simple way of implementing this mock item might look something like this:

public interface ScheduledItem {
    public void execute();
    public int getNextExecutionTime();
}

public class MockScheduledItem implements ScheduledItem {
    private boolean wasExecuted;
    private int nextRun;

    public MockScheduledItem(int nextRun) {
        this.nextRun = nextRun;
    }

    public void execute() {
        wasExecuted = true;
    }

    public int getNextExecutionTime() {
        return nextRun;
    }

    public boolean getWasExecuted() {
        return wasExecuted;
    }
}

Our test might look something like this:
public void testScheduler_MakeSureTheRightItemIsExecuted() {
    //setup
    MockScheduledItem shouldRun = new MockScheduledItem(1000);
    MockScheduledItem shouldNotRun = new MockScheduledItem(2000);
    Scheduler scheduler = new TestingScheduler(1100);
    scheduler.add(shouldNotRun);
    scheduler.add(shouldRun);

    //execute
    scheduler.processQueue();

    //verify
    assertTrue(shouldRun.getWasExecuted());
    assertFalse(shouldNotRun.getWasExecuted());
}
That's a really simple, hand-rolled example of a mock object. The test just makes sure that the processQueue method called the execute method on the first item, but not on the second. Of course this example is very basic. We could make it a little fancier by counting the number of times the execute method is called and making sure it's called exactly once during the test (a sketch of that variation follows below). Then we could start to implement functionality that makes sure functions belonging to a given mock object are called with particular arguments, in a particular order, and so on. Mock object frameworks support this kind of functionality out of the box: you can take any class in your application and create a mock version of it to be used as part of a test. There are mocking frameworks for many different programming languages.
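
Here's what that first refinement might look like, sticking with the hand-rolled style (this is just a sketch of the counting variation mentioned above, not code from a real project):

// Like MockScheduledItem, but records how many times execute() was called.
public class CountingMockScheduledItem implements ScheduledItem {
    private int executionCount;
    private final int nextRun;

    public CountingMockScheduledItem(int nextRun) {
        this.nextRun = nextRun;
    }

    public void execute() {
        executionCount++;
    }

    public int getNextExecutionTime() {
        return nextRun;
    }

    public int getExecutionCount() {
        return executionCount;
    }
}

In the verify step of the test, assertEquals(1, shouldRun.getExecutionCount()) would then check not just that the item ran, but that it ran exactly once.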

Before you dive in, consider my word of caution: in the great spectrum between pure black-box and pure white-box testing, using mock objects is about as "white-box" as it gets. You're saying things along the lines of "I want to make sure that when I call function X on object A (the function and object under test), functions Y on object B and Z on object C will be called in that order, with certain specific arguments." When the test you write is making sure that something *happened* as a result of your test, it tends to be easier to understand what the test is trying to do. On the other hand, if your test is just making sure that some functions were called, what does that really mean? Potentially very little.

Also, because your mock objects are basically fakes or stubs, you have no guarantee that the behaviour of the actual objects being mocked out will stay consistent with the mocks. In other words, you can create a mock version of an object that adds one to its argument while the real function subtracts one. If you change the behaviour of a function that is being mocked in a test somewhere, you have to remember to adjust the mock accordingly. If you don't, you'll wind up with passing tests but may still have introduced a bug into the application. This kind of problem becomes more likely as the sophistication of the fake implementation increases - pure fakes suffer from the same weakness - and mock tests where the specified interactions become complicated, and where the mock itself is a sophisticated fake that can respond to a wide variety of interactions, compound the likelihood of running into it.

Finally, on a more basic level, simply refactoring code can be difficult with mock objects. Mock frameworks sometimes use strings to represent the mock object's methods internally, so renaming a method with a refactoring tool may not actually update the mock, and your tests will suddenly fail just because you renamed a function. Even a slightly more complicated refactoring, like breaking a method in two, can cause mock-based tests to fail trivially, telling you only that yes indeed, you changed some code.

As you can tell, I am not a huge fan of extensive use of mock objects in the sense of specifying interactions. I believe such objects can be useful in specific cases, but that's not how I think about writing my code. When I write a test, I try to keep it simple and concentrate on what I can expect to happen as a result of running that test, not specifically on what the execution path of the function under test will look like. There are of course cases where this type of interaction testing is useful - I think the scheduler example above is a good case in point: you want your test to make sure a method is called, but that's it; you're not interested in what the real implementation of that method may do. All in all, I tend to prefer to stick with simple hand-rolled stubs, fakes, and mocks in my TDD practice. Your mileage may vary. Martin Fowler has also written about the distinction between mocks and stubs/fakes.

I know that when I first encountered the term "mock objects", I had some trouble figuring out exactly what it meant and what all the fuss was about. If you've found this blog entry because you were experiencing the same confusion, I hope it's been of some help.

Monday, November 12, 2007

Even Grandmasters Get The Blues

I read an article some months ago about Vladimir Kramnik's incredible mistake in a game against the computer opponent Deep Fritz. Kramnik was at that point the undisputed world champion in chess - apparently the first person to achieve that status since Kasparov. In this game, however, he made a truly astounding blunder. Deep Fritz had made a move that threatened an immediate checkmate. Even someone like me, who barely knows the rules of chess, can see it fairly easily, but somehow Kramnik missed it. Here's a diagram of what happened:

[Diagram: the position after Deep Fritz's queen move to e4, threatening mate on h7.]

With Deep Fritz's queen on e4 threatening to deliver checkmate on h7 with the very next move, Kramnik, instead of defending, blithely moved his queen from a7 to e3. In an instant, the game was over. Here is an account of the events:

Kramnik played the move 34...Qe3 calmly, stood up, picked up his cup and was about to leave the stage to go to his rest room. At least one audio commentator also noticed nothing, while Fritz operator Mathias Feist kept glancing from the board to the screen and back, hardly able to believe that he had input the correct move. Fritz was displaying mate in one, and when Mathias executed it on the board Kramnik briefly grasped his forehead, took a seat to sign the score sheet and left for the press conference, which he dutifully attended.

It's a fascinating thing. How could Kramnik have made such an error? There could be lots of explanations. He might have been ill or in a bad mood. He might have been experiencing a lot of stress because his opponent was a computer: it must be psychologically difficult to cope with the idea of an opponent with perfect memory who will never misread a position or get tired; an opponent who cannot be intimidated by an aggressive move, or confused, or tricked. Nevertheless, the fact is that this mistake really did happen.

For me this is an interesting example of a phenomenon I'll call the "expert's blunder," which can apply to anything - and in particular, to software development. A beginner in any field has only a small number of things he or she can keep track of. As one learns more and more about a given area, however, one has to keep in mind an ever-expanding tree of concepts and ideas. Everything you do as an expert carries more weight and takes more effort: the more you know, and the more tools are available to you, the more work it takes to choose which tool or approach to use next. It becomes easier to overlook something completely simple that a beginner would spot right away. As Kramnik was playing this ill-fated game, I think he wasn't really looking at the current state of the board, but rather at his own mental picture of it - a cloud of variations and possibilities. He may simply have fallen out of sync with the actual game.

I've noticed this kind of thing happening to me when I've found myself implementing a more complicated piece of code than was necessary. Here's a case I blogged about some time ago for example: My original blog entry, followed by my realization that my code was over-designed. Here I too was getting ahead of myself, thinking of classes and subclasses and interfaces instead of just solving the problem in the most straightforward way. I think this kind of mistake is a good example of why TDD (test-driven development) and pair programming are useful tools in software development. The more we focus on the simple step-by-step design process of TDD and the more we constantly subject our code to critique as we do in pair programming, the less likely we are to develop bloated and over-designed code - however well-intentioned we might have been in writing it in the first place. I think this also promotes the idea of pairing a seasoned veteran with a recent grad: The value flows in both directions, not just from the expert to the novice. Not only beginners can write bad code. By virtue of their experience, people with more knowledge can be just as susceptible. At the very least, it's safe to say that one should always pause and just "look at the board" every so often.

Saturday, November 10, 2007

Fun Introductory Software Projects

I've taught some evening/weekend courses in software development over the years. Usually I've taught adults, but sometimes I've had the chance to work with some talented kids too - from about 12 to 17 years of age. I have to admit, working with the kids was great. Someone recently suggested that I write up some of the projects that were done during these courses. I've always had students come up with their own ideas: I'm a big believer that if you come up with your own idea for something to work on, you'll be genuinely motivated. Therefore, I really suggest coming up with your own concept if you're looking for a project to work on... In case you're looking for some inspiration though, here are some of the types of things people in my courses have done. Have fun!

  • A version of the famous "snake" video game.
  • A really neat, original 80's-style arcade game which involved shooting a mothership and picking up space junk floating around on the screen to augment one's own spaceship. Let me tell you, it's quite an experience to ask how the swarming bullets were done and to get the reply "Oh, it's a very simple algorithm, but the swarming is emergent behaviour" from a 13- or 14-year-old.
  • A simple contact manager which you can enter people's information and pictures into. This program used a file to store its data, so all of the storage/retrieval routines had to be written from scratch as opposed to using a database program, which proved quite instructive to the developer.
  • A very impressive network multi-player game along the lines of Warcraft (albeit much simpler). The programmer did good work with path-finding and making sure all players were in sync. The players were happy faces of different colors which became sad faces as you attacked them.
  • An online pizza-ordering application. I recall some good discussions about the user interface and whether it was OK to let people order without entering their credit-card information.
  • A chat program along the lines of ICQ/Instant Messenger.
  • A "Towers of Hanoi" program in which the user could select the number of disks. The user could play the game him/herself or let the computer solve the puzzle. A challenging bonus (which in this case was not implemented) would be to incorporate a "hint for next move" feature.