Skip to content

Moved to neontapir.com

26 August 2012

I’ve taken the plunge and moved this content to my own domain, neontapir.com. Please join me there!

BetaFish 3 – Scaredy Cat

22 March 2012

This is article 3 in a series of posts in which I introduced the idea of building an F# robot for Robocode. The last installment showed an improved version of the robot that didn’t bump into walls, but did quake at the sight of an enemy.

The twitching behavior I described in the last post is a result of receiving a number of events each turn. When the robot is hit, it tries to evade. However, when it sights an opponent, it turns to attack. Evade, attack, evade, attack… the robot gets stuck in a loop as long as the opponent is facing it and continues to fire. Poor BetaFish. Let’s see if we can fix that:

namespace Neontapir
open Robocode
open System

type ActionType =
    | EndGame
    | Search
    | Evade of float
    | Attack of float
    | AvoidWall
    
type BetaFish() =
    inherit Robot()
   
    let random = Random(DateTime.Now.Millisecond)
    let defaultFirepower = 3.0
    let moveUnit = 20.0
   
     let randomTurn amount (robot:Robot) =
        let direction = random.Next 2
        match direction with
            | 1 -> robot.TurnLeft amount
            | 0 -> robot.TurnRight amount
            | _ -> failwith "Unexpected direction value"
   
     let shouldEvade enemyBearing =
        match enemyBearing with
        | bearing when Math.Abs(bearing : float) < 20.0 -> false
        | _ -> true
   
     let evade (robot:Robot) =
        robot |> randomTurn 90.0
        robot.Ahead moveUnit      
   
     let mutable lastEvent : ActionType = Search
   
     override robot.Run() =        
        try
            while true do
                match lastEvent with
                | EndGame -> ()
                | AvoidWall ->
                    robot.Back moveUnit
                    robot |> randomTurn 30.0
                    lastEvent <- Search           
                | Evade attackerBearing ->
                    match attackerBearing with
                    | bearing when not(shouldEvade bearing) ->
                        lastEvent <- Attack defaultFirepower
                    | _ ->
                        robot |> evade
                        lastEvent <- Search
                | Attack firepower ->
                    robot.Fire firepower               
                | Search
                | _ ->                           
                    robot.Ahead moveUnit
                    robot.TurnRight 40.0
        with _ ->
            lastEvent <- EndGame

    override robot.OnScannedRobot(event) =
        match lastEvent with
        | Attack _ -> () // robot.Out.WriteLine "Scanned robot"
        | _ ->
            robot.TurnRight event.Bearing
            lastEvent <- Attack defaultFirepower
   
    override robot.OnBulletHit(event) =
        let newEvent =
            match lastEvent with
            | Attack strength ->                
                Attack (Math.Min(strength + defaultFirepower, Rules.MAX_BULLET_POWER))
            | _ -> Attack defaultFirepower       
        lastEvent <- newEvent
          
    override robot.OnBulletMissed(event) =
        lastEvent <- Search
    override robot.OnHitByBullet(event) =
        if (event.Bearing |> shouldEvade) then lastEvent <- Evade(event.Bearing)
    override robot.OnHitWall(event) =
        lastEvent <- AvoidWall
    override robot.OnDeath(event) =
        lastEvent <- EndGame
    override robot.OnBattleEnded(event) =
        lastEvent <- EndGame

There are a few minor changes worth mentioning here:

  • I renamed the DoNothing ActionType to EndGame for readability
  • I’ve encapsulated the defaultFirepower and moveUnit values for reuse
  • I’m doing more sophisticated pattern matching, like the nested match in Run‘s lastEvent matching for the Evade type.
  • I dropped the type specifications on the handlers for Robocode events, since F#’s type inference engine can infer them

To solve the twitching issue, I created a shouldEvade method that decides whether to try to evade fire. This version will stand and take it if it’s facing an enemy. I found it odd that I had to specify that bearing was a float type, until I realized that Math.Abs has a number of overloads.

I changed the signature of randomTurn so I could use the piplining operator (|>). This operator allowed to me write robot |> randomTurn 30.0, which reads more natually to me than randomTurn robot 30.0. With pipelining, I’m able to say “with robot, do a random turn”. My code less like Yoda sounds, pipeline it makes.

I wrapped the code in Run to solve another nagging issue. In previous versions, when the battle was over, BetaFish would often throw an exception or the battle simulator would have to terminate it. The try ... with _ -> lastEvent <- EndGame facility allowed me to catch the DeathException I often receive at the end of battle.

This BetaFish version, the one I’m submitting to the code kata, does far better than its predecessors. It almost beats the Fire sample robot. Fire often wins because it turns its gun and its body separately. If we get another go-round, I will try this strategy. I took a quick stab at it, but the first attempt did very poorly.

It still has problems, don’t get me wrong. It runs like a scaredy cat if it gets fired upon a lot. It’s performs poorly because it moves the whole tank body instead of just the gun or the radar. And it is easily defeated by the C# sample robot, that runs fast along the border and fires towards the middle. BetaFish can’t get a lock on it fast enough.

Beyond the robot, there are certainly more F# features to explore. For example, I’m halfway through reading “Real World Functional Programming” by Petricek and Skeet, and a future chapter teases at a better way to handle event registration and waiting on events to fire. I can’t wait!

I hope this article series has been of interest. Drop me a line if you found it interesting and would like to see more.

BetaFish Part 2 – Twitchy

19 March 2012

Last time, I introduced the idea of building an F# robot for Robocode. The second iteration contains some improvements:

namespace Neontapir
open Robocode
open System

type ActionType =
    | DoNothing // end of game
    | Search
    | Evade of float
    | Attack of float
    | AvoidWall

type BetaFish() =
    inherit Robot()
   
    let random = Random(DateTime.Now.Millisecond)
    let firepower = 1.0
   
    let randomTurn (robot:Robot) amount =
        let direction = random.Next 2
        match direction with
            | 1 -> robot.TurnLeft amount
            | 0 -> robot.TurnRight amount
            | _ -> failwith "Unexpected direction value"
    
     let mutable lastEvent : ActionType = Search
   
     override robot.Run() =
        while true do
            match lastEvent with
            | DoNothing -> ()
            | AvoidWall ->
                robot.Back 20.0
                randomTurn robot 30.0
                lastEvent <- Search           
            | Evade(bearing) ->
                randomTurn robot (90.0 - bearing)
                lastEvent <- Search
            | Attack firepower ->
                robot.Fire firepower               
            | Search | _ ->                           
                robot.Ahead 20.0
                robot.TurnRight 40.0
      
    override robot.OnBulletHit(event) =
        let newEvent = match lastEvent with
            | Attack strength -> Attack (strength + 1.0)
            | _ -> Attack 1.0       
        lastEvent <- newEvent
         
    override robot.OnScannedRobot(event : ScannedRobotEvent) = lastEvent <- Attack 1.0         
    override robot.OnBulletMissed(event) = lastEvent <- Search          
    override robot.OnHitByBullet(event : HitByBulletEvent) = lastEvent <- Evade(event.Bearing)
    override robot.OnHitWall(event) = lastEvent <- AvoidWall
    override robot.OnDeath(event) = lastEvent <- DoNothing
    override robot.OnWin(event) = lastEvent <- DoNothing

You may recall me saying that the first version of this robot could be defeated by a wall. While trying to implement this feature, it became clear that firepower didn’t represent enough state for the BetaFish robot to care about. There were a number of intermediate versions between last post’s code and the above, as I experimented with created a class to encapsulate state.

While trying to use multiple classes, I was stymied for a time by Visual Studio. In an F# project, it wasn’t clear to me how to spread my solution over multiple source files. I couldn’t reference my other classes. I discovered that in a functional language, it’s necessary to order the source files to resolve dependencies, so that items do not depend on other items that are compiled afterward. It turned out that the classes I wanted to use came after the main class in alphabetical order. In Visual Studio, you can reorder source files in an F# project by right-clicking the source file in Source Explorer and moving it up or down.

After a number of iterations, I discarded that secondary class in favor of a discriminated union, which I called ActionType. Conceptually, a discriminated union is kind of like a generic C# enumeration, but of course much more useful. For example, notice that the Evade and Attack elements take a float argument, whereas the others don’t. And pattern matching provides a more robust way to examine the contents of an ActionType than C# enumeration’s ever would. To accomplish this in C#, you’d need a hierarchy of classes.

The Run method has become more sophisticated also, now matching against the AttackType instead of just a scalar value. Notice that I’m able to name the union type’s argument inline, such as in the case of Evade‘s bearing. This is a feature of pattern matching, not discriminated unions, as you can see with strength in the OnBulletHit event handler.

While this code looks more impressive, it actually stinks in battle. If you try this robot, you’ll find that once BetaFish engages the enemy, the robot freezes in battle as soon as it’s hit, twitching until it is annihilated. Can you see why? The next article in the series contains the answer.

BetaFish Part 1 – a Robocode robot in F#

15 March 2012

The group here at work chose to compete via Robocode for our next code kata, and this post is the first in a series that describes how I wrote an F# robot for that challenge.

Robocode is a programming game in which a platform is provided, including an abstract Robot battle tank class, and competitors write concrete implementations. Those robots compete in real-time on-screen. The battle simulator has decent graphics, and the platform exposes a few dozen events that robots can subscribe to in their quest to be the last robot standing. For example, when the robot sees an opponent or hits a wall, an event is fired.

Robocode itself is written in Java, but there is a library called JNI (Java-.NET interface) that allows .NET robots to take part in the fun. Since I’m learning F#, it seemed like the language to choose.

First, I visited the site and downloaded the Java platform. I then followed the instructions for creating a .NET robot, which including installing JNI and some helper libraries. The instructions were written for C#, and being F#, there were some subtle nuances that I missed which cost me a couple of hours:

I followed the Visual Studio debug setup instructions included, but found that if I set a breakpoint prior to hitting F5, I got a fatal exception loading the JNI jar file. A colleague wasn’t having this problem, and we figured out he wasn’t setting breakpoints right away. Now, I use this workflow:

  • Disable my breakpoints
  • Hit F5 and let the Robocode battle simulator load
  • Enable my breakpoints
  • Start the battle

I found a post on Zamboch’s blog showing a very basic F# robot. It was enough to get me over the hump. Here’s the first version of the robot:

namespace Neontapir
open Robocode
open System

type BetaFish() =
     inherit Robot()
    
     let random = Random(DateTime.Now.Millisecond)
     let mutable firepower = 1.0 // TODO: handle this immutably instead?
    
     let randomTurn (robot:Robot) amount =
          let direction = random.Next 2
          match direction with
               | 1 -> robot.TurnLeft amount
               | 0 -> robot.TurnRight amount
               | _ -> failwith "Unexpected direction value"
    
     override robot.Run() =
               while true do
                    robot.TurnRight 40.0
                    robot.Ahead 20.0
    
     override robot.OnScannedRobot(event : ScannedRobotEvent) =
               robot.Fire firepower
     override robot.OnBulletHit(event) =
               robot.Ahead 20.0
               firepower <- 1.0 + firepower
    
     override robot.OnBulletMissed(event) =
               robot.Back 20.0
               firepower <- 1.0
     override robot.OnHitByBullet(event : HitByBulletEvent) =
               randomTurn robot (90.0 - event.Bearing)

You can see, it has a lot of similarity to Zamboch’s sample robot. The new piece for me was the randomTurn method. This was the first occasion I’d had to use the pattern matching (matchwith) construct.

Pattern matching is a little like regular expressions for code. Instead of writing a switch statement or a series of if/then blocks, I can use pattern matching to conditionally execute code. In the case above, I’m getting a random result between 0 and 1, and using that to decide whether to turn left or right. The F# compiler won’t allow me to specify an incomplete pattern, so I use the “match anything” (_) symbol to complete the pattern. The underscore is analogous to a default case in a switch statement.

You’ll notice that I have a mutable variable in this robot, firepower. Although F# supports mutables, I wanted to get rid of it. Typically, I’d do this by having a method that returns a new BetaFish robot. However, because the Robocode battle simulator expects a single instance of the Robot, this strategy won’t work. At my present knowledge level, I’ll need at least one mutable variable.
This robot isn’t very smart. In fact, it can be thwarted by running into a wall. In the next article, I’ll add some more basic functionality.

Case Study: Release Planning in Agile

18 February 2012

This is a story about how I applied my CSM experience and training to help a mid-sized project get off the ground.

Beginnings

When I came onto the project, the company was struggling with agile. They had been doing software engineering for a while, but in the context of controlling hardware. This project, an item state management system, was a departure for them. They already had a product in this space which they were essentially reselling from a consulting firm. The existing product was awkward to use, its feature set was limited, and the consulting firm did not wish to re-write the product to scale.

So, the company decided to write a replacement. They chose C# and .NET as the language. “Mason”, the principal engineer on the project, was a Java programmer by trade, so they needed some help. I was hired onto the team as a senior .NET engineer who could help mature Mason’s technical capabilities in .NET, and they had already hired a mid-level .NET engineer who started a few weeks before I did. It quickly became apparent that my CSM knowledge would be just as valuable.

The team had the desire but not the field experience. Mason had written a prototype, and after quick examination of the code, it was clear that his architecture would not support the ambition of the product.

The project manager, “Jason”, had chosen Scrum as the delivery methodology, yet he had no experience with it. He was on the right track with sprint-level work management, but he was having trouble with release-level planning. he asked me if I could help get the project back on track. The product owner, “Kathy”, already had a comprehensive set of use cases outlined. While they were not fully-baked, they contained enough meat to discuss in detail what the project is trying to accomplish.

Theme Sizing

I introduced the idea of themes. Kathy had already grouped the use cases into broad categories, so it seemed natural to think of these categories as themes. Jason and the product owner went off and grouped the use cases. They did it more fine-grained than I’d anticipated, putting them into a couple dozen smaller categories that I’d call epics.

I then introduced the idea of T-shirt sizing for themes. The goal of this was to talk about which parts of the application contained the greatest effort. Some use cases have some risk — we as a team aren’t sure how we’ll tackle them. Others don’t have enough definition. Still others represent a lot of work.

Over the course of a couple of afternoons, the team got them sized. The challenge remained, how do you use T-shirt sizes to create something that resembles the Gantt chart the project sponsors were comfortable with?

Theme Points

It hit me, these T-shirt sizes were a scale, just not a numeric scale. I needed to quantify T-shirt sizes.

Mike Cohn has an article about relating story points to hours in which he shows story points graphed as Bell curves. So I wrote a scale like this on the whiteboard:

Onesie  |  Small  |  Medium  |   Large  |     XL

I picked an arbitrary number to represent a Medium. I wanted it to be on a scale that couldn’t be mistaken for user story points, so I chose 500.

Onesie  |  Small  |  Medium  |   Large  |     XL
                      500    

Of course, a Medium is not exactly 500 points. These represent a broad range, so I declared that a Medium was 500±50%, or 250-750.

Onesie  |  Small  |  Medium  |   Large  |     XL
                  |   500    |
                 250        750

I extrapolated to determine the rest of the scale.

Onesie  |  Small  |  Medium  |   Large  |     XL
  60    |   180   |   500    |   1500   |    5000
       125       250        750        2500

This worked out to roughly powers of 3.

Onesie  |  Small  |  Medium  |   Large  |     XL
   1        x3         x8         x25        x83

Jason took these ranges and our commentary during sizing that Theme X seemed bigger than Y and assigned values to each theme.

So When Will That Be Ready Again?

By this time, we had already completed a sprint or two, enough that we had an idea that our velocity was going to be somewhere near 30. We had enough information to try to correlate these themes against a calendar. Projecting when we’d complete the epic’s functionality and armed with our velocity, Jason threw the numbers into a spreadsheet and started trying out numbers.

Our current epic was rated at 800 theme points. We thought we’d be done with it in 6-7 ideal days, which means 3 developers can get through about 750 theme points in an ideal week. This became 250 points/dev/week, a velocity of sorts.

Thanks to the spreadsheet, we were immediately able to see that calculated out to about 12 months to write the application.

Better yet, we were able to forecast what general area of the application we’d be working on months from now, which allowed us to project some milestones. We need to have this epic done by Halloween to be on track, that set of themes by Thanksgiving, and so on.

Of course, these numbers had a high margin of error. We communicated it would be more like 6-18 months to write it. We reasoned that as the team got a epic or two under our belts, Jason and I should have been able to refine the conversion rate and get a more realistic idea.

What If…

The real fruits of this exercise were that we now had a few variables that we could experiment with.

For example, that velocity of 30 was measured while we were storming as a team and 2/3 of us were new to the domain. Maybe we could expect the work would get easier as we learned the domain. This would be expressed as an increased velocity. If you increase the conversion rate from 250 to 300 after a month, you gain about a month over the life of the project.

A similar gain would be realized if a 4th developer is added after a couple of months. Other projections were made about the impact of adding a QA engineer early, which the project didn’t have until much later.

Course Correction

Of course, the project sponsors wanted it faster. To them, time to market was crucial. The company saw a dwindling market for their current offerings and were looking for ways to increase their reach. Their goals were compelling — it seemed like a good opportunity. This product would do things that others don’t.

Management needed a product that was feature-compatible with its competitors in the marketplace before the venture capital expired, so we were asked to come up with a plan that might possibly work.

We were able to use the spreadsheet to estimate how much of the application we’d have to de-scope to get it done in the timeframe we want, and Kathy was able to identify several themes as not mission critical to beta testing. The idea was that development would continue while the product was in beta, which I had discarded as too risky but Jason felt was viable.

Presentation

We presented the plan. It was, in the parlance of a senior company executive, a “success-oriented plan”. No one was comfortable with it, so some difficult conversations followed. The project was in a bad position.

There were two senior executives who vied for control of the project. One felt that the company should continue to sell their poor solution until the new one was ready. The other felt that the poor solution could not be maintained, should not be sold, and the new one needed to be rushed to market.

The second executive won and demanded a crushing work pace, complete with extended work hours. I knew from experience that such a pace was not sustainable and was prone to error on a product whose core had to be solid the first time. The customers in that market would not forgive a poor first showing.

I could not in good conscience proceed with the project as outlined, so I left. I don’t know exactly how the story ends. The other experienced .NET engineer on the team left for similar reasons. I have spoken to Mason, and he told me the project is very behind the schedule Jason and I laid out.

Conclusion

I think the exercise was worthwhile. Although the schedule Jason and I concocted was wildly speculative, it provided a frame of reference so that valuable conversations could take place. It got prompt and appropriate management attention to the project. The technique allowed us to do the analysis in a timeframe where its recommendations could be acted on.

Many of you reading this article have probably already formed opinions. It’s clear the project is in trouble. For me, it was that no project plan could overcome the fact that the delivery team did not have enough technical experience nor the business knowledge to rapidly develop such a complex product.

Without doing the analysis we did, I don’t think the errors in the original project plan would have surfaced until much later.

Revenge of the Secret Santa Code Kata – F#

17 February 2012

Here’s another solution to the Secret Santa code kata, this time a scouting mission into functional programming with F#.

To recap:

  • My first solution, written in PowerShell, relied on selecting random pairs of people and using a “hill-climbing” algorithm to avoid getting stuck.
  • My second solution I constrained to be deterministic — no randomness.

This one was more about trying to write something meaningful in F# using a problem I’m by now familiar with. Take a look at the code:

open System
open System.IO

// TODO: what's the accepted way to configure an F# program at runtime, App.config?
let lines = File.ReadAllLines(@"names_santa.txt") |> List.ofArray

let surname (x : string) = (x.Split ' ').[1]

// TODO: how do I make this generic enough to use in both cases below?
let lastItem (a : string[]) = a.[a.Length - 1]

let randomizer = new Random(DateTime.Now.Millisecond)

let swap (a: _[]) x y =
    let tmp = a.[x]
    a.[x] <- a.[y]
    a.[y] <- tmp

// shuffle an array (in-place), borrowed from a code snippet site
let shuffle a = 
    let len = Array.length a
    Array.iteri (fun i _ -> swap a i (randomizer.Next(i, len))) a

let linesArray = Array.ofList lines

// TODO: this "do-while" loop, I should be able to rewrite it as a recursive function, but how?
shuffle linesArray 
while surname linesArray.[0] = surname linesArray.[(Array.length linesArray) - 1] do shuffle linesArray

let rec pairings = linesArray
                    |> Seq.windowed 2
                    |> Seq.choose (fun (x:string[]) -> 
                                        match x with
                                        | x when surname x.[0] <> surname x.[1] -> Some(x)
                                        | _ -> None
                                   ) 

let pairingsList = List.ofSeq pairings

List.append pairingsList [[|pairingsList.[pairingsList.Length - 1].[1]; pairingsList.[0].[0]|]] |> 
    Seq.iter (fun (x:string[]) -> printfn "%s gives a gift to %s" x.[0] x.[1])

Console.WriteLine "Done, press Enter to exit"
Console.ReadLine() |> ignore

As you can see, I left myself a few to-do items. I boxed the effort at three hours, which is why I went ahead and borrowed the shuffle logic from the internet.

The first hour was learning to speak F# again, especially the method signatures. I played with F# when it was in beta, so I didn’t go into the kata stone cold. However, it took me a while to troubleshoot why my methods didn’t have the signature I expected.

I spent the most time writing the pairings function. Once I had the insight of using the match operator, the algorithm came together very quickly. I decided that I wasn’t happy with it being deterministic. I found that if I provided a file that was already in an appropriate order, pairings regurgitated the list. Dissatisfied, I went about trying to shuffle the array. Not being familiar with unit testing in F#, I used the F# Interactive console. I got frustrated with trying to implement the shuffle algorithm, so I decided to study a working copy.

The hard part came in integrating it with my code. I resorted to using a while loop, which I know from reading about functional programs means that I’m not thinking of the problem through. I believe I couldn’t convert this into a recursive function because shuffle returns a unit (that is, it does the work inline). If it returned an array, I bet I’ve have more luck. I’ll have to experiment further.

My attempts to rewrite the loop badly broke the program. I used comments as poor man’s source control, and I got lucky. I really should have check in the first working version into my local Subversion repository.

Some time passed. I managed to look at the do-while problem again, and after about 90 minutes, I solved both it and another issue that bugged me.

Here’s the second revision:

open System
open System.IO

let randomizer = new Random(DateTime.Now.Millisecond)

let surname (x : string) = (x.Split ' ').[1]
let lastItem (a : 'a list) = a.[a.Length - 1]

let shuffle items =
    let upperBound = (List.length items) * 100
    let randomlyWeightedItems = List.map (fun item -> item, randomizer.Next(0, upperBound)) items
    let sortedByWeight = List.sortWith (fun (_, leftWeight) (_, rightWeight) -> leftWeight - rightWeight) randomlyWeightedItems
    List.map (fun (item, _) -> item) sortedByWeight

let shuffleUntilEndsDiffer theList =    
    let rec loop a =        
        let shuffled = shuffle a        
        if surname shuffled.Head = surname (lastItem shuffled) then             
            loop shuffled    
    loop theList

// Main part of the program, need to find out how to separate code into files

let lines = List.ofArray (File.ReadAllLines(@"names_santa.txt"))

shuffleUntilEndsDiffer lines

let rec pairings = lines
                    |> Seq.windowed 2
                    |> Seq.choose (fun (x:string[]) -> 
                                        match x with
                                        | x when surname x.[0] <> surname x.[1] -> Some(x)
                                        | _ -> None
                                    )
                    |> List.ofSeq 

let veryFirstPerson = pairings.Head.[0]
let veryLastPerson = (lastItem pairings).[1] 

List.append pairings [[|veryLastPerson; veryFirstPerson|]] 
    |> Seq.iter (fun (x:string[]) -> printfn "%s gives to %s" x.[0] x.[1])

Console.ReadLine() 
    |> ignore

The do-while loop was solved by making the shuffling function return the shuffled contents instead of unit (void in C# parlance).

The second change was to get away from arrays. Arrays in F# are mutable, and I wanted to only use immutable data structures as much as possible.

The shuffler was the main barrier. Once I hit upon the idea to associate each person with external data, in this case a random weighting, the rest fell into place. You may also notice some language usage improvements, such as using a.Head instead of a.[0]

Questions? Comments?

Secret Santa Code Kata Redux in C#

10 February 2012

Here’s another solution to the Secret Santa code kata, this time in my “native” programming language of C#. This is the second solution I’ve written this week to this problem. Code katas are about experimenting with different approaches to simple but not trivial problems.

My first solution, written in PowerShell, relied on selecting random pairs of people and using a “hill-climbing” algorithm to avoid getting stuck. This time, I gave myself the constraint that the solution had to be deterministic — no randomness. I had been toying with the idea of using an Abelian ring. A couple of common examples are the hours on a clock or the modulus (remainder) operator in math. But I couldn’t decide how I’d handle iterating through the members of that ring without duplicates. I determined that I’d need to re-sort the list.

I wrote this solution using test-driven development (TDD), meaning I wrote my tests before implementation — I won’t show all of those tests today. As it turned out, I didn’t need to code a Ring class. I find when I do TDD, I often don’t end up writing half the code I thought I’d need!

Unlike my PowerShell solution which just used strings, I knew I wanted to create a data object to store the incoming rows of data in, which I named the Participant class:

using System;

namespace SantaKataApp
{
    public class Participant
    {
        private Participant(string firstName, string lastName, string email)
        {           
            FirstName = firstName;
            LastName = lastName;
            Email = email;
        }

        public string FirstName { get; private set; }
        public string LastName { get; private set; }
        public string Email { get; private set; }

        public static Participant Create(string descriptor, int id)
        {
            var parts = descriptor.Split();
            var email = parts[2].Replace("<", "").Replace(">", "");
            return new Participant(parts[0], parts[1], email);
        }

        public override string ToString()
        {
            return string.Format("{0} {1} <{2}>", FirstName, LastName, Email);
        }
    }
}

For a while, I thought I might need to implement IEquatable<Participant>, Equals(), and GetHashCode() in order to compare Participants, but again, TDD proved me wrong.

The plan for this approach was to:
1. Parse the incoming list
2. Resort the list
3. Display the results

The act of resorting took the majority of the development time. I created a ListResorter class to do the work.

I started by writing tests….

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using SantaKataApp;

namespace SantaKataTests
{
    [TestClass]
    public class ListResorterTests
    {
        [TestMethod]
        public void SwapItemsInList_Succeeds()
        {
            var list = new List<int>(new[] { 3, 2, 1 });
            var listResorter = new ListResorter<int>(list, null);
            listResorter.Swap(0, 1);
            CollectionAssert.AreEqual(list.ToArray(), new[] { 2, 3, 1 });
        }

        [TestMethod, ExpectedException(typeof(InvalidOperationException))]
        public void SwapWithInvalidIndex_Throws()
        {
            var list = new List<int>(new[] { 3, 2, 1 });
            var listResorter = new ListResorter<int>(list, null);
            listResorter.Swap(74, 1);
        }

        [TestMethod]
        public void CanAdjoin_WithTwoItems_DeterminesIfTheyCanAdjoin()
        {
            var list = new List<int>(new[] { 3, 1, 2 });
            var listResorter = new ListResorter<int>(list, (x,y) => x % 2 != y % 2);
            Assert.IsTrue(listResorter.CanAdjoin(1, 2)); // list[1] = 1, list[2] = 2
            Assert.IsFalse(listResorter.CanAdjoin(0, 1));
        }

        // ... and so on ...
    }
}

These are the first two tests I wrote. I decided the simplest operation I needed to resort a list was the ability to swap two items in that list. Once I had that working, I picked the next piece, the ability to decide if two things can adjoin (“to be close to or in contact with”), and I proceeded from there.

Notice that the LineResorter<T> constructor takes a list to operate against, and a function that compares two instances of type T and determines if they can be adjoining in the list. In the case of my unit tests, I used (x,y) => x % 2 != y % 2, which is a fancy way of saying that two odds or two evens can’t be next to each other in the list. I wanted to use a different type (int) than I’d be using in my real use case to make sure I didn’t make any assumptions about the nature of type T. This comparison was the first one for two numbers that came to mind.

Each time I needed functionality out of the ListResorter, I wrote a test. I watched it fail, then I made all the tests pass. If I saw opportunities to refactor, I took them whenever all my tests were green (passing). By the time I was done, I had about a dozen tests and this class:

using System;
using System.Collections.Generic;
using System.Linq;

namespace SantaKataApp
{
    public class ListResorter<T>
    {
        private readonly List<T> _list;
        private readonly Func<T, T, bool> _canAdjoin;

        public ListResorter(List<T> list, Func<T, T, bool> canAdjoin)
        {
            _list = list;
            _canAdjoin = canAdjoin;
        }

        internal void Swap(int index1, int index2)
        {
            ThrowIfIndexesAreInvalid(index1, index2);
            T temp = _list[index2];
            _list[index2] = _list[index1];
            _list[index1] = temp;
        }

        private void ThrowIfIndexesAreInvalid(int index1, int index2)
        {
            if (_list.Count < index1 - 1 || _list.Count < index2 - 1)
                throw new InvalidOperationException("An index is beyond the length of the array");
        }

        internal bool CanAdjoin(int index1, int index2)
        {
            ThrowIfIndexesAreInvalid(index1, index2);
            return _canAdjoin(_list[index1], _list[index2]);
        }

        public int GetNextIndex(int i)
        {
            if (i >= _list.Count)
                throw new InvalidOperationException("Invalid index");
            if (i == _list.Count - 1)
                return 0;
            return i + 1;
        }

        internal bool CanAllAdjoin()
        {
            return !_list.Where((t, i) => !CanAdjoin(i, GetNextIndex(i))).Any();
        }

        public List<T> Resort()
        {
            var list = _list;

            while (! CanAllAdjoin())
            {
                for (int i=0; i < list.Count; i++)
                {
                    int j = GetNextIndex(i);
                    int k = GetNextIndex(j);
                    if (! CanAdjoin(i, j) && CanAdjoin(i, k))
                        Swap(j, k);
                }
            }
            return list;
        }
    }
}

This class has two public methods, GetNextIndex() and Resort(). That Abelian ring idea still lives in the GetNextIndex() method, which says that the next member after the last in the list is the first, making the list circular. Resort() does what you would expect.

The other methods in the class are marked internal, so that my TDD tests can access them via the [Assembly:InternalsVisibleTo()] attribute in the Assembly.cs code file. After design is done, I would consider rewriting the test methods that talk to internal methods so that they are exercised through the public method. I don’t want my unit tests to be break should someone decide to change the internal implementation of these methods. You can see a bit of this thinking in the ThrowIfIndexesAreInvalid() method. I pulled this method out to avoid duplication of code during refactoring, where I already had unit tests in place and thus I didn’t need to write new ones.

Once I had ListResorter working, writing a console wrapper was easy:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace SantaKataApp
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] fileContents = File.ReadAllLines(args[0]);
            Func<Participant, Participant, bool> surnamesDontMatch = (x, y) => x.LastName != y.LastName;
            List<Participant> names = fileContents.Select(Participant.Create).ToList();

            var listResorter = new ListResorter<Participant>(names, surnamesDontMatch);
            List<Participant> sorted = listResorter.Resort();

            DisplayParticipants(listResorter, sorted);
        }

        internal static void DisplayParticipants(ListResorter<Participant> listResorter, IList<Participant> sorted)
        {
            for(int i=0; i < sorted.Count; i++)
                Console.WriteLine(Display(sorted[i], sorted[listResorter.GetNextIndex(i)]));
        }

        internal static string Display(Participant name, Participant giveToName)
        {
            var giveTo = giveToName != null ? giveToName.ToString() : "NONE";
            return string.Format("{0} gives a gift to {1}", name, giveTo);
        }
    }
}

Most of the work done in the console app is formatting the output for display. The heavy lifting is done by the ListResorter class.

This program outputs data like:

PS C:temp> .SantaKataApp.exe .names_santa.txt
Luke Skywalker <luke@theforce.net> gives a gift to Toula Portokalos <toula@manhunter.org>
Toula Portokalos <toula@manhunter.org> gives a gift to Leia Skywalker <leia@therebellion.org>
Leia Skywalker <leia@therebellion.org> gives a gift to Gus Portokalos <gus@weareallfruit.net>
...

I met my goal: the output is the same every time given the same input file.

I time-boxed the effort at two hours and I came close to reaching my limit. I initially began by writing an iterator, but abandoned it when I realized that I only wanted to traverse the collection once.

There is more that could be done to this program, of course. For example, I’m not happy about DisplayParticipants using GetNextIndex(). The only reason that DisplayParticipants() needs a ListResorter is to use its ring-like GetNextIndex() method. This functionality should have been extracted into its own class.

Comments? Questions?

“Secret Santa” Code Kata in Powershell

25 January 2012

The team at work is doing a Code Kata, the Secret Santa kata from Ruby Quiz. In it, you are given a list of names and emails and need to pair givers with receivers, with the only constraint being that a giver and receiver can’t have the same surname.

I wrote my solution in PowerShell. I wanted to use native PowerShell functionality, so I didn’t import any .NET libraries. I know this post is brief, but I wanted to get my solution posted with a little commentary while it was still fresh in my mind.

When run like this:

PS> .santa.ps1 .names.txt

It gives output like:

Giver                                                Receiver
-----                                               ---------
John Smith <jsmith@example.com>         Jane Doe <jdoe@example.com>
...

Here’s the code:

 param($file)

 function ConvertTo-Array($hashtable) 
 {   
     $result = @()
     $hashtable.GetEnumerator() | % {
         $object = New-Object PSObject   
         Add-Member -inputObject $object -memberType NoteProperty -name Giver -value $_.Name 
         Add-Member -inputObject $object -memberType NoteProperty -name Receiver -value $_.Value 
         $result += $object
     }   
     $result
 }

 $none = "NONE"
 $names = @{}
 gc $file | % { $names.Add($_, $none) }
 $namesToMatch = $names.Keys.Count
 if ($namesToMatch -lt 2) {  throw "Need at least two names to match" }

 while (($names.Keys | ? { $names.$_ -eq $none }).Length -gt 0) 
 {
     $from = $names.Keys | ? { $names.$_ -eq $none } | Get-Random
     $to = $names.Keys | ? { $_ -ne $from -and $names.Values -notcontains $_ } | Get-Random
     #"DEBUG: $from, $to"
     if ($from -ne $null -and $to -ne $null -and $from.split()[1] -ne $to.split()[1]) 
     {
         $names.$from = $to
     }
     else
     {
         $undoMatch = $names.Keys | ? { $names.Values -ne $none `
             -and $from.split()[1] -ne $_.split()[1]} | Get-Random
         #"DEBUG: unset $undoMatch"
         if ($undoMatch -ne $null) 
         {
             $names.$undoMatch = $none
         }
     }
     $percentComplete = 100 * ($names.Values | ? { $names.$_ -ne $none }).Length / $namesToMatch
     Write-Progress -activity "Match in Progress" -status "% Complete:" -percentcomplete $percentComplete
 }
 $results = ConvertTo-Array $names
 $results | ft  

(I apologize in advance for the formatting. I will look into another solution like Pastebin if I do this more often. Please comment if you have recommendations.)

It reads the contents of the file and writes each line to a hashtable, where all the receivers are $none (a special value I created which I used like $null).

As long as there are unmatched people, I go through an algorithm to select a random gift giver who doesn’t have a receiver ($from) and a receiver who doesn’t have a giver ($to). The split(” “)[1] yields a person’s surname.

I used a variant on the “Hill Climbing” algorithm mentioned on the site. If I run into a situation where there are no more matches, I take a random person who could have been a match and throw them back into the pool of candidates. In this way, the algorithm never gets stuck.

At the end, I call my ConvertTo-Array function to beautify the output. Without it, the hashtable columns are labeled “Name” and “Value”, and that didn’t satisfy me. (I couldn’t get Format-Table to help.)

I added a progress bar to see how often this part of the code is invoked, and it gets called once or twice per run on average. However, the script itself has decent performance. It takes a second or so to match 40 names.

Please let me know what you think or ask questions. I’m happy to go into more detail.

Getting off the Ground

23 January 2012

“You don’t need more time, you just need to decide” – Seth Godin

At work, there are a lot of things that I’ve identified for improvement.

Where I work is largely a Java shop, but there are a couple of large applications that are .NET and a growing group of .NET developers. And there are a lot of amazing opportunities here.

The company has a good track record of agile delivery using Scrum. However, I see areas for improvement also. The job of scrum master has become institutionalized as just a meeting facilitator. The responsibility of the scrum master as team coach has been lost in translation from training done there a few years ago.

Our current scrum master wanted the team to attend introductory scrum training. As a certified scrum master, I was anxious to hear what the agile coach on staff would say, and I was gratified to agree with his curriculum. It’s a matter of practice in our group.

Fortunately, I am well-suited to help move the team toward a steady cadence of delivery. We need to get our product owner away from firefighting mode and to get what he and our customers need within the confines a fixed sprint backlog. Having a strong scrum master will help.

And we need to get beyond our large estimates for small projects due to fear of the unknown with this application — turnover has left most of the delivery team new to the code base.

There’s a desire for a .NET community of practice. I’ve started the ball rolling with a book club. The first book we’ll read together is Working Effectively with Legacy Code by Michael Feathers, which is all about getting pre-written systems under unit test. We’re also starting our first code kata this week, an old Ruby Quiz about Secret Santas.

Both the book club and katas are exercises to start talking about software design in a simpler context. These .NET applications grew organically and were written by developers who were more comfortable with SQL than .NET. The application has shortcomings that you might expect: lots of logic in stored procedures, poor separation of concerns between SQL and .NET, and a presentation layer (ASP.NET web forms) that’s highly coupled to the database.

And the application has a lot of room to grow. It is ripe for becoming a data warehouse solution. I have some mixed experience setting up data warehousing from a previous position, but this time, the team is blessed with a talented database developer onboard to guide us through the tough parts. I can’t wait!

The concern I have is being spread too thin. As you can see, there are a number of vectors to pursue. I need to choose one, maybe two of these and drive them to completion before taking on others. I haven’t talked about the nine months of architecture changes that are needed to get the application able to scale to handle the desired demand or some of the expanded features they want for this application this year.

If I have learned anything over the past couple of years, it is that I cannot be all things to all people. I need to pace myself and be ready for some false starts. I also find a way to handing off these projects when (if?) they get off the ground. Here’s where having a good manager will help me, and I am blessed to have one.

I feel energized, and excited about the possibilities!

Parsing large files using PowerShell

17 January 2012

At work, I had the need to parse through a large pipe-delimited file to count the number of records whose 5th column meets and doesn’t meet my criteria.

PS C:temp> gc .items.txt -readcount 1000 | `
  ? { $_ -notlike "HEAD" } | `
  % { foreach ($s in $_) { $s.split("|")[4] } } | `
  group -property {$_ -ge 256} -noelement | `
  ft -autosize

This command does what I want, returning output like this:

  Count Name
  ----- ----
1129339 True
2013703 False

Here’s some explanation in English, for those of you who don’t know PowerShell.

The first command is gc (Get-Content), which reads the file in 1000 (readcount) lines at a time.

The second command is ? (Where-Object), which filters out the HEAD row.

The next command % is an alias for Foreach-Object, where object in this case is a 1000-line chunk. The inner loop is another foreach loop, which is slightly different from Foreach-Object in ways that are unimportant to the matter at hand. Point is, you can’t nest % blocks. The block of the foreach loop splits each line by pipe delimiter and returns just the 5th column (first column is numbered 0).

The next command in the chain is group, an alias for Group-Object, in this case we’re grouping by a calculated property, whether the output of the previous command is greater than or equal to 256. By saying “-noelement”, I’m saying I don’t need an enumerated list of the values, which in this case are unimportant.

Finally, we get to ft (Format-Table). It is necessary because the Count column may be over 99999, in which case the value is truncated. The option “-autosize” causes PowerShell to make it fit instead.

However, for a 500 MB test file, this command takes about 5.5 minutes to run as measured by Measure-Command. A typical file is over 2 GB, where waiting 20+ minutes is undesirably long.

I posted a query to StackOverflow for some ideas.

While I waited, I discovered that 2500 was the optimum value for -ReadCount, getting the command execution time down to about 3.5 minutes.

Within minutes, I got a helpful hint from Gisli to look into using the .NET StreamReader. Here’s what that script looks like:

=== begin Show-SourceCounts.ps1 ===
param($file = $(Read-Host -prompt "File"))
$fullName = (Get-Item "$file").FullName
$sr = New-Object System.IO.StreamReader("$fullName")
$trueCount = 0; 
$falseCount = 0; 
while (($line = $sr.ReadLine()) -ne $null) {
      if ($line -like 'HEAD|') { continue }
      if ($line.split("|")[4] -ge 256) { 
            $trueCount++
      }
      else {
            $falseCount++
      }
}
$sr.Dispose() 
write "True count:   $trueCount"
write "False count: $falseCount"
=== end file ===

This script yields the same results as the first command, but in about a minute. Quite an improvement!