Tuesday, April 24, 2007

TDD Is Not An Algorithm Generator!

[Update: more on this topic]

In his blog, Ravi compares Ron Jeffries working out a sudoku solver using TDD (test-driven development) with Peter Norvig's implementation.

Now, what can we learn from the two efforts? Peter does not use TDD to solve the problem, yet his solution is compact and complete, whereas Ron's TDD effort remains just a partial solution. Does that mean that TDD is inferior to design up front? I do not believe that's the point Ravi was trying to make, but perhaps he was trying to say that TDD is not an algorithm generator. In that respect, I completely agree. In fact, one of the first rules I teach my students when I am doing a TDD workshop or teaching a course is precisely that TDD is not an algorithm generator! Solving sudoku is just the kind of problem you want to find an algorithm for first, then implement that algorithm. This is not a question of up-front design. It's just common sense: It's perfectly reasonable to know conceptually how to solve sudoku puzzles before writing any code. Else how would you know whether any of the intermediate methods that you so diligently write tests for will be useful in the end?

There is a real gap between a design, which is a particular structure of code, and an algorithm, which is a description of how to solve a problem. An algorithm can be expressed in any form you choose. It can be a mathematical notation or it can be written down in plain english, or it can live in your head, but if an algorithm is required to complete a story, you should certainly know what it is before you engage in writing your production code. If you doubt what I'm saying, consider a story where I ask you to write code to encrypt and decrypt files. Would you just start hacking out methods (using TDD or otherwise)? You'd be lucky to develop a trivial cipher. You need to choose a cryptographic algorithm before you begin any programming. Once you do start with the programming, I would certainly recommend using TDD as a way of designing the code.

Let's say you're faced with a problem where you want to find the right algorithm yourself, either because you have to or just because you enjoy the challenge. Writing some code to solve a simpler part of the problem may be a good way to start. That approach worked for Peter in his solution to Sudoku - I doubt it's guaranteed to work all the time. In any case, when you're hunting for an algorithm, any code you write should be considered provisional. The main purpose is not to produce high quality maintainable code, it's just to see if getting part way toward a solution will trigger any additional insights. Again, it's once you've understood the correct algorithm that you can begin writing the production code.

I certainly think Ron's example is not a very good indication of how TDD should work. I am also disheartened that Ron doesn't clearly make the point that he is looking for an algorithm. Therefore his efforts to refactor his code during the tdd process, are far too premature. Based on Peter's solution, I'd say Ron never even gets half way toward the correct algorithm and wastes his last few articles on refactoring to objects instead of doing anything particularly useful or enlightening. When you're in an investigative mode where you don't even know how to solve your problem, that harly seems like the time to solidify the design. Even if you're doing agile development, keep in mind that for each XP story, you should be certain that you know at least informally in your own mind the steps needed to solve the problem. Trying to offer TDD as an algorithm generator is dopey and it's just going to make it easy for people not to take TDD seriously as a valid design technique.

So what is the purpose of TDD then? One goal of TDD is to reduce the need to determine ahead of time which classes and methods you're going to implement for an entire story. There's a large body of shared experience in the developer community that trying to anticipate such things tends to lead to paralysis where nothing useful gets done and/or produces bloated, over-designed code. Instead, you can develop one aspect of the story at a time, using each test to keep yourself moving forward and refactoring the design as you go along - think of TDD then as a ratchet or a belay device in climbing. One of the principles of agile development is that it's generally not a good idea to try to comprehensively understand a whole project (or a big part of it) up-front, as is often done in waterfall-ish methodologies. However, that doesn't mean you need to begin every story blindly without understanding how to solve that particular problem.

Updates: UncleBob makes a good comment on reddit: http://programming.reddit.com/info/1kth0/comments/c1lkvh; Read about my TDD Sudoku effort

Sunday, April 22, 2007

Snake-Oil Salesmen, Process Skepticism, and Why I am Not a "Post-Agilist"

The optimist thinks that this is the best of all possible worlds, and the pessimist knows it.
-- J. Robert Oppenheimer, "Bulletin of Atomic Scientists"

While I am a strong proponent of agile methods, I've witnessed a number of people I respect writing about being skeptical of agile processes and complaining about evangelists who try to sell methodology without really knowing what they're talking about. For example, Jonathan Kohl has written about Post-Agilism where he states :
Personally, I like Agile methods, but I have also seen plenty of failures. I've witnessed that merely following a process, no matter how good it is, does not guarantee success. I've also learned that no matter what process you follow, if you don't have skilled people on the team, you are going to find it hard to be successful.

Ravi Mohan has written of Agile as a religion, "with its holy books, many patriarchs, its churches, and commandments." In his blog, Ravi also frequently decries the lack of actual programming experience demonstrated by many self-professed gurus to back up their claims.

On a certain level I agree. It seems evident that any movement, as it grows beyond a certain point, becomes vulnerable to distortion, both by the well meaning but ignorant, and also by those who cynically aim to make a buck from whatever is trendy today. Agile is no exception. One thing I stress whenever I talk about agile methods is that the agile manifesto is self-referencing. The first statement (Ravi would perhaps say commandment!) of the manifesto is clear:

Individuals and interactions over processes and tools

Those "processes and tools" include the methods of Scrum and XP! If you're working on an agile project and the people on the project are trying to blindly follow some agile practice when it clearly doesn't make sense, they are violating the spirit of agile development: Ipso facto, you must use your knowledge of what's actually happening on your particular project, your team, at a particular point in time to decide what to do next. Everything else is just a guideline.

However, things are unfortunately not so simple because it is often not clear when you should abandon a given practice. Let's say you think that something isn't testable and you give up test-first development; then you decide that a given feature really requires a lot of up-front design and you give up iterations. Before long you've entirely given up all the essential aspects of what agile software development is about! Following an agile practice is therefore always a delicate balancing act. You must respond empirically to reality, but you must learn the practical value of agile practices. I've found that many of these practices are worth defending. For example, at times I have been tempted to abandon test-first development when testing became difficult, but upon deeper consideration I realized that there was something wrong with the way I was approaching my problem, and that was the real culprit that made the test-first approach problematic. By re-thinking my approach, I found a way back to test-first development and its benefits.

I guess I would like to emphasize the following point: If you are interested in agile development, you must take the time to learn the value of the practices you're using. Blindly doing anything will never do any good. However, discomfort I have with notion of "post-agilism" is that the malaise people are experiencing, that's causing them to react, is rooted in fundamental problems: Managers will read articles about agile and impose it on their teams; so-called gurus will go into a company dispensing advice without properly understanding what's going on, perhaps without really knowing anything of value in the first place; teams will get caught up in zealous behaviour and start following practices blindly; people will splinter into factions, arguing over minutiae and jealously guarding their particular version of the Truth; incompetence will have its day. Thus dogmatism will set in. The way I see it, these things are inevitable. They are inevitable in the same way that corruption is inevitable in any social or political order. As with corruption, one can only do one's best to manage such problems, to reduce them to a minimum, but one will never get rid of them entirely.

I fear that creating a new movement/idea and calling it "post-agilism" is just sweeping these basic problems under the rug. If the agile movement is a case of taking some pragmatic ideas and codifying them into a general system, or "going meta" with them, as Bob Martin has written, then "post-agilism" is a case of going "meta-meta." I can't see how this will lead to any good. Before long, there will be "post-agilist" gurus and dogma, and because of its second order nature, the attendant problems will be much harder to deal with and untangle. I hope that the agile community will get itself out of this spiral and instead will establish ways to do research and to educate people so that as much as possible software development will be about common sense and about what works instead of dogma. Things will never be perfect, but hopefully we can work toward a system that represents the best possible balance and thus minimizes the inevitable corruption that accompanies any human endeavour.

An agile development process ought to be self-correcting. If that's not working, I'm not sure how embracing "post-agilism" will help: For better and for worse, I think the agile approach represents the best of all possible worlds.

Wednesday, April 18, 2007

TDD: The Bowling Tracker

Some years ago I purchased Bob Martin's book Agile Software Development, Principles, Patterns, and Practices. At the start of the book, he brings up a TDD exercise using 10 pin bowling. At the time I thought it would be a good idea to try my hand at the exercise so I avoided reading that chapter in the book; looking at someone else's ready-made solution to a problem tends to create mental blinders so I always try to do things on my own first, then look at the solution in the back, so to speak. Also, if my solution turns out to be inferior, I can learn a lot more from thinking about the differences between the two. I can't remember any more if I ever actually did the exercise or just forgot about it. Recently however, I was thinking about developing this example for potential use in a presentation or seminar about TDD for Eamug. Here then is my implementation along with all of the tests - so feel free to have a look!

Here are my general observations about the process:
  • It took longer than I thought! I expected to spend a couple of hours, but it actually took over a day. I estimate the total development time was about 10 hours!
  • Because all of the classes except BowlingLane emerged from refactoring, none of the classes used by BowlingLane had unit tests for their methods. Since none of those methods was public - they were only ever called by BowlingLane which provides the public API, I suppose that's ok. I think my rule is that if a method of a class comes about as a result of refactoring, then there should already be a test for that functionality at a higher level - so one doesn't need to worry about writing a specific test for that method.
  • Despite the simplicity of the problem, I was surprised to find quite a few gotchas that broke my tests. I definitely think that using TDD to generate this functionality both produced better code and saved time over trying to hack the entire thing out.
PS: Coincidentally Curtis Schofield made a link available to Bob Martin and Robert Koss' original bowling TDD example so feel free to have a look their implementation as well. They focus on just getting the scoring right, so they don't distinguish between an unplayed frame and a scratch where both throws were gutter balls; they also don't seem to handle displaying an empty cumulative score if the player makes a strike or a spare but hasn't made the subsequent throws required to settle the score for the current frame. I suppose that logic could go into display-specific code they didn't bother writing since their emphasis was just on the scoring itself. I might try their approach of just having an array of ints of size 21, which would make the scoring code a lot simpler. Then I'd have to put the smarts around displaying frames properly and displaying an empty cumulative score when appropriate into code that is specific to the display. The nice thing is that now that I have my external API I could in principle completely re-write the back end and as long as I my tests passed, I'd be all right.

PPS: For bonus marks, I just had a look over my code and I'm pretty sure there is at least one bug! Can you find it? :) Sadly even TDD doesn't remove the possibility of bugs, but I can think of a test that would expose it!

Tuesday, April 17, 2007

EAMUG #2!

We had another presentation for Eamug this evening! This time it was about TDD (Test-Driven Development) and Curtis Schofield was the presenter. He did an interesting and decidedly manic presentation and had a very fun presenting style. I think he had a slide at one point that said "Current development practices are teh suck!" or something along those lines, and his presentation was interspersed with random photos of pineapples among other things. It made me feel like I am too didactic when I do presentations - I'm so intent on making sure I am expressing myself clearly that I sometimes forget to just enjoy and have fun and that's a good lesson for me in any future presentations I do. We'll see if I can actually learn that lesson! The only problem I had with the presentation is that the code he was demonstrating - writing a templating engine in ruby - was too sophisticated for a short presentation. Not only was everything in Ruby which was the least well-known language among the attendees but it was ruby on steroids, full of crazy dynamic reflection meta-programming stuff. I had a lot of trouble following what was going on. On the other hand, I learned about some features of Ruby that I didn't know about, so I guess it wasn't entirely a bad thing. It's always tough when you're trying to use an example from a real project rather than a toy project - conveying the context becomes difficult. On the other hand toy projects are by definition not representative of real world issues. I think introducing TDD in a one hour presentation still requires a toy project though because otherwise it's just too difficult to get everything across clearly. In any case, there was a lively discussion afterward and generally a good feeling in the room and I was really happy to be there.

Monday, April 02, 2007

Zen And The Art of Software Development

I don't know who deserves the credit for this thought, but a few years ago I read (or someone might have said to me) that no matter how complex an application becomes, it must nevertheless start with a single line of code. Every application, no matter how sophisticated it may be, starts off as a simple application. I think that's a rather deep statement and worth thinking about. Here's my contribution: If you look a random piece of code, say about 10 lines, from *any* application, what you see will look much the same. There may be some kind of looping construct, or assignment, or function calls, or some conditional logic. That's true whether you're looking at a Web app, or video game code, or an operating system kernel! That's why I believe so much in the power of simplicity and of abstraction. No matter how devilishly complex or clever your algorithm may need to be, it can be abstracted away into some kind of function call. As for the code around that algorithm, it's probably going to be good old prosaic application logic. That's why using TDD as a design tool is so valuable. It helps to tackle complexity by thinking about interfaces, not gnarly implementation details. It pushes you to organize your code so it is as orthogonal as possible - so that modules of code can work independently and any combination of ways in which that code is executed will still work properly. If your code is like that, you don't *need* to test every possible combination - which is impossible anyway. You can count on things working because you've removed the duplication from your code. That's my rant for today, thanks for reading! :)