Tuesday, May 18, 2010

Technical Debt

Addressing technical debt is a severely under-valued need of most company codebases. While everyone agrees that it is important, somehow it never quite gets the time it deserves. I want to take a look at why this occurs, what can be done about it, and what some of the unintended side effects can be.

Why is refactoring needed? Simple. Unrefactored code causes slower delivery of buggier features. This isn't news nor is it profound. Everyone agrees that fast delivery of working features is important. Yet, somehow, sparse hours are given to paying back technical debt.

It's not that time isn't planned for doing this. The common approach is "We'll give a couple of hours/days towards the end of the sprint, but ..." (and here's the kicker) "we just need to make sure feature XYZ is finished first." I will let you in on a secret: between production fires, bugs and XYZ growing in scope, that refactoring time will never see the light of day.

Furthermore, the worst time to refactor is right after adding new code that hasn't been QA'ed. It is similar to building sandcastles by the sea. The time to start moving around major chunks of code is after everything has been signed off as working as expected. That way you know when your cleanup work breaks something. Also, this prevents you from being tempted to back out all your refactoring if feature XYZ doesn't pass QA.

This puts the developers in a horrible position. They have two options.

One, they steal time away from their other assignments or work on this during nights and weekends. This sucks. It tells the developers that internal company needs are not important. Also (and often not considered), this precludes sufficiently complex pieces of code from ever being unraveled. If it takes more than a couple of hours to do, developers will never be able to do it. The areas of the codebase that need work the most will rot.

The other option is that the developers don't even try to refactor anything. This leads to a lot of "damp" code (the opposite of DRY) that takes longer than necessary to produce and will likely be buggy. This won't be because the developer can't do it; it's because the developer can't do it on top of the Rube Goldberg codebase that has arisen. This is the best way to take a top-shelf developer and make them feel like a failure. (Hint: if you start seeing your best developers leave, look at the codebase.)

Often this is where most development departments end up.

Others will be fortunate enough to finally gain buy-in for an entire sprint dedicated to refactoring. Here too, I would like to caution the would-be team. Product and Marketing will be all too keen to see this as their opportunity to have you fix every minor bug and one-off that has ever bothered them. The second you crack this door open, expect a flood to wash away this opportunity.

A refactoring iteration should be for the developers and the developers alone to set priorities. They write the code. They know where things are the messiest. They know where the biggest "bang for refactoring buck" lies. Trust and empower them.

Now, with all that said, we can't spend all of our time perfecting the codebase. Companies need to make money; they need to test business plans. So, where is the balance? There is no magic formula, but let me propose the following two rules of thumb.

1) One of every eight sprints should be a refactoring sprint. From experience, this seems to be a good pace to fight programmatic entropy. Refactoring is also a close parallel to Covey's seventh habit, "sharpen the saw," and should get a proportional share of your time.

2) After a company pivot has been proven. If your company has tested the waters and decided that it is time to change directions, then it is time to prepare the codebase for this. Batten down the hatches. Those first months, as a new idea is being actualized, being able to rapidly flesh out the idea is crucial to maintain momentum. If you are slowed down by the needless legacies of prior pivots, you are not even giving this new idea a fair chance to flourish.

The quality of the codebase a developer works in can make the difference between their job being something they do versus something they love. Do what you can to make sure your developers have a job they love.

What do you think? Are there other ways to tackle technical debt? What are some other roadblocks to look out for on your way to a refactoring iteration? Are there any other rules of thumb that can be applied?


Friday, December 11, 2009

Debugging a hung ruby process with GDB

GDB is a great *nix debugger that is hugely useful in tracking down why a Ruby process is hung. Below is a very simple how-to.



1. Install GDB


This differs between operating systems. On Ubuntu, it is as simple as:


sudo apt-get install gdb

2. Find the hung process id


It will typically be the left-most number associated with the process you are looking for.


ps aux | grep ruby

3. Point GDB at that process


Once attached, you will see a lot of text scroll past.


gdb -p <process-id>
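If you would rather skip the interactive session, GDB's batch mode can attach, print the backtrace (step 4) and detach in one shot; substitute the PID you found in step 2:

```shell
# Attach, run "bt", and exit automatically
gdb -p <process-id> -batch -ex bt
```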

4. Print the backtrace


This is the part where you find what you are looking for.


bt

While this may look a little cryptic, it is a C backtrace and will tell you exactly where things are seizing up. In this example, it was a long-running profiling tool that locked things up.




5. Go forth and fix your code


Now you just need to quit out of GDB, thank the world for its existence, and go fix your code.


quit

Thursday, December 3, 2009

The Exception to global ruby variables

Global variables are often the overlooked cousin of Ruby variables. If you have never used them before, here is a quick breakdown.
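A minimal sketch of how they behave (the names here are illustrative): prefixing a variable with "$" makes it visible from every scope.

```ruby
$counter = 0          # the "$" sigil makes this a global variable

def bump
  $counter += 1       # readable and writable from any method
end

class Widget
  def report
    "counter is #{$counter}"  # visible inside classes and modules too
  end
end

bump
bump
puts Widget.new.report  # => counter is 2
```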



Today I took a closer look at how they are used. I have seen a lot of Ruby code that uses "$!" to refer to a caught exception. A simple example:
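Something along these lines (a sketch of the pattern; the strings are made up):

```ruby
# Inside a rescue clause, $! refers to the exception currently being handled.
message = begin
            raise "something went wrong"
          rescue
            $!.message
          end
puts message  # => something went wrong
```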





Well, today, as I was trying to write a style guide for OtherInbox, I started to get really concerned about the thread safety of such code. Since global variables are accessible by any thread, there appeared to be a pretty clear thread-safety issue with using $! to refer to an exception. I was ready to scrub the codebase and rewrite everything to look like this:
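That is, naming the exception explicitly instead of reaching for the global (again, a sketch):

```ruby
# Bind the exception to a local variable rather than using $!.
message = begin
            raise "something went wrong"
          rescue => e
            e.message
          end
puts message  # => something went wrong
```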





Prior to jumping off the deep end, I discussed this with the Capital Thought team. After a rather lengthy discussion about the proper way we should all be handling exceptions (which I hope to share in the future), I wrote a quick test to see if there was a thread safety issue or not.
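A reconstruction of such a test (not necessarily the original script, but it produces the same output): raise in the main thread, raise and rescue a different exception in a second thread, then check whether the second thread clobbered the main thread's $!.

```ruby
log = []

begin
  raise "boom"
rescue
  log << $!.message           # main thread's $! => "boom"

  Thread.new do
    begin
      raise "pow"
    rescue
      log << $!.message       # the other thread's $! => "pow"
    end
  end.join

  log << $!.message           # back in the main thread: still "boom"
end

puts log.join(", ")  # => boom, pow, boom
```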





If you run this script, it will output "boom, pow, boom". This clearly shows that although "$" defines a global variable, in the case of $ERROR_INFO ($!) it has thread-local scope.

Tuesday, November 17, 2009

Why I don't test

Three words and one contraction, and the entire development community probably hates me. Well, almost the entire community (I hope to find a few developers who share my sentiment).



I work for a (not-so-much-anymore) early-stage startup and we have more abandoned ideas than successful ones. This is indicative of many companies. You have to throw a lot of spaghetti against the wall to see what sticks. This is rarely the environment for BDD, TDD, TAFT or perfectionists. This is where you write a couple of functional tests and release your code into the wild. The words "Alpha" & "Beta" are dear to my heart.



Startups are the entrepreneurial spirit at its best. Even if you are smart, motivated and hard working, this environment is largely a numbers game. The more ideas you actualize, the more are given a chance to take hold. The 80/20 rule is in full effect and great code doesn't matter if no one uses it.



As a developer, ask yourself, "How many cool side projects never made it past the first couple days of coding?". I guarantee all of you have at least one and most of you have a lot more. Now, ask yourself, "What is the main reason you didn't get further in your project?". I also guarantee that the most common reason is getting bogged down in the minutiae (like testing). If you had taken the quickest path to throwing up something that worked most of the time, you could have discovered whether your idea was successful.



Not testing isn't as crazy as it sounds. If an idea sticks, then you need to build out a test suite as you handle edge cases and add further functionality. On the other hand, odds are, what you are coding right now isn't going to be around 6 months from now. So, don't waste your time.



I once had to write an IMAP client coded to RFC spec in less than a week. There is no way I could have done this while writing tests the whole way through. The requirements changed constantly, and I would have had to mock an entire IMAP server first. (Fortunately, this idea gained in popularity and now has great test coverage.)



Code is a means to an end and not an end in itself. Tests support the code we write, proving it works well. If your tests don't allow you to code faster, then they are failing (if you work for a startup). Here, we write code to prove ideas work. This is why Ruby on Rails has become so popular: its ability to rapidly prototype (aka quickly birth a concept).




There are many environments where this attitude doesn't work and wouldn't even make sense (contractors, open-source contributions, large companies, proven models, etc.), but TAFT is wrong. There are no hard and fast rules, but if there is one, I would suggest it is this: T.O.S. (Test On Success).



Monday, November 9, 2009

A Rhythm to the Madness

OtherInbox, the company I work for, has been in development for two years now. In that time, the team of developers has varied in size from one to six people. We have explored many avenues of email management, turned on a dime to address emerging technological trends and met some very tight deadlines. As we have been doing this, we have also been constantly defining and refining our team's rhythm.



The first step in doing this was to define a block of time in which we would measure our progress and deliver new features. For us, two weeks was the sweet spot. It is long enough to solve complex problems, but short enough to get rapid feedback from users. We refer to these two week chunks of time as "sprints".



With that defined, many questions arose:
  • What do we need to do to be prepared for a sprint?

  • When should we start our sprint?

  • How can we make sure we will reach our sprint goals?

  • What does it mean for a sprint to be successful?

  • When should we deliver these new features to the website?

  • How do we incorporate user feedback into this rhythm?



To address these questions we plotted what we should be thinking about on a recurring calendar.



[Figure: recurring calendar of our sprint schedule]


The interesting thing we discovered was that to effectively manage all these concerns and keep an open pipeline of progress, we had to be focused on four sprints in any given two-week timespan. For instance, let's look at "Sprint 3", which begins development on September 23rd (9/23). We started planning for that sprint back on 9/7 in a product planning meeting (Sprint 3 - Ideas), where we listed out the ideas we would like to develop and determined which ones were worth investigating further. The next day (9/8), we take a week to draw up, refine and negotiate between our Marketing, Product and Development Departments what these features will look like. From there (9/15), the Development Department takes 4 days (Sprint 3 - Design & Estimation) to translate these specs into tickets and to estimate how complex each ticket is. At the beginning of this ticketing process (9/15), the Development Department also talks to our Customer Service Advocate and incorporates our users' feedback into the tickets available for Sprint 3. On Monday & Tuesday (9/21 - 22), we organize the tickets based on priority. This all leads up to Wednesday morning (9/23): the developers now have an organized list of defined, designed and estimated tickets to work on.



That is just the first half of our rhythm. On October 6th, Sprint 3 will end for our developers, and acceptance testing and deployment concerns begin. Sprint 3 will spend the next five days being tested (10/7 - 10/11) at alpha.otherinbox.com (go check it out!). Then, finally, this sprint, which began on 9/7 with a group of people huddled around a table throwing around ideas, will be deployed to production (my.otherinbox.com) on October 12 and be available to the public.



While this might seem like a long cycle, it's important to note that development never stops. For the two-week period in which we were actually writing the code for Sprint 3, we also:



  • QA'ed Sprint 2

  • Spec'ed Sprint 4

  • Deployed Sprint 2

  • Demo'ed Sprint 2

  • Designed & Estimated Sprint 4

  • Gathered User feedback for Sprint 4

  • Started Dreaming & Spec'ing Sprint 5


You may wonder why we start our sprints on Wednesdays. The answer is simple: we wanted to deploy on Mondays. This gives us the entire week to address any issues that may arise in production that were not caught in QA. Nobody likes working weekends (and that often happens with late week deploys).



On that note, I will leave you with what I think is the most successful aspect of our development cycle: flexibility. Don't have a schedule that is set in stone, or you will leave no room for improvement.