
« Building the Machine That Will Build the Machine

 | Main | 

Here Is My Idea »

Death March in an Information Technology Project

Software development is a problem-solving business. But it seems we sometimes can't even apply common sense. This is a true story of how the leadership of a large IT project (team of 100+) went through various stages of denial and last-ditch adjustment on a long, drawn-out death march.

To make a large problem manageable you generally have to break it down into smaller problems and let separate teams handle the smaller problems. Of course you need to break it down along lines that allow the smaller problems to be solved individually, otherwise you have not really broken it down at all.

Big Design Up Front (BDUF)

The project was launched with a lot of excitement and enthusiasm about doing the job right. The goal was to modernize a whole bunch of legacy systems from various platforms and locations throughout the world into a new design with a set of subsystems (roughly analogous to the original legacy systems but greatly enhanced and interconnected). The project leaders were aware of failures in the past, but they assembled a formidable team of domain experts and were well-versed in the latest software development best practices and technologies. Some of the key players were hot off of other successful (albeit smaller) projects and everyone felt the synergy was just right.

All of the initial goodwill may have gone to some people's heads because there seemed to be a sense that something really quite visionary was going on. They decided that as part of the requirements analysis and design phases, they would create a software framework that was meant to streamline the project by abstracting certain generalizable processes out of the overall application and data development effort. Though it was their pride and joy, I'm guessing that this framework concept may have hurt the design phase by allowing abstractions to dominate instead of tangibles.

The ideal of the BDUF (Big Design Up Front) approach used in virtually all large IT projects is that with good design and specifications to work from, the various teams can go off and implement their parts and later bring them all together and get them working with tweaks here and there. The principle is that it is much more efficient to resolve conflicts during requirements assessment and design than after such a large system is implemented.

Whether specifications can allow teams to go away and implement their parts independently is debatable. But even as a means of defining and integrating the efforts of teams on this large project BDUF had 3 major problems:

  1. The design negotiation process got paralysed. Meetings never seemed to reach a resolution because of the sheer volume of legacy system details and experiences being considered. The more experts that got involved the worse it got. The more they earnestly wanted consensus the longer it took.
  2. Specifications could not cover the enormity of details. Ultimately specifications are only as good as the people writing them can get their minds around the problem. Modeling tools and prototypes helped but the results were not inspiring.
  3. No one read them. Well that's not true, but many times they were only read until a glitch or inconsistency was found, and then the author was contacted anyway.

The bottom line is that BDUF demands something of humans that we simply aren't any good at.

After a year or two when the customers started to have misgivings about whether progress was being made, the design meetings became shouting matches. The project leaders tried to control the egos involved by planning a certain amount of time for each point to be addressed and opinions to be discussed. I went to a couple of these meetings a year apart and had intense déjà vu as I discovered some of the exact same issues were still being debated.

Common sense from trial and error should have told them that unifying the design of a large number of legacy systems on the scale they were attempting was simply not going to happen in meetings and spec-writing iterations. But ultimately the problem was not BDUF; it was a complete unwillingness to break the large problem down into smaller, more manageable problems.

Special teams and new initiatives saddled

Giving true autonomy to teams tends to lead to non-uniform and duplicative solutions (which may nevertheless yield better results but I'll come back to that another time), and this prospect worried them so much that they ruled it out altogether. However, their stubbornness did not stop them from hedging their bet with some experiments. After insisting for years that the BDUF approach was paying off they began to consider other options behind closed doors.

They formed some small special teams to go and quietly start building software outside of the official process. They did this somewhat secretively because they didn't want to be seen as undercutting the process they had championed for so long, and also because they weren't exactly confident about it. But it was precisely because of their timidity that it did not bear much fruit.

The special teams weren't allowed to work freely with the other teams because of their unofficial status and yet were expected to utilize the progress of the other teams so as not to be duplicating effort. In addition they were forced to work within the same framework designed by the team leadership and so-called software architecture experts. The special players made a herculean effort and months of overtime turned into years, to no avail.

At one point a buzz circulated that the project leadership was embracing a new "iterative" approach, but it turned out to be in name only, since iterative development emphasizes building releasable software and they were still only producing a framework and system architecture. The deliverables still required the customers to use their imaginations; it was not a releasable end product. In fact it took a lot of smoke and mirrors to keep the customer from concluding that the emperor had no clothes.

The waterfall methodology (requirements analysis, design, implementation, testing, integration, and maintenance phases) used on this project (and most if not all large IT projects) has built-in mechanisms to measure and predict progress. However, due to its reliance on volumes of specifications that the customers can never hope to digest, it is susceptible to being hijacked by misguided project management. In addition, it appeared that the customers had committed to the framework vision advanced by the project leaders. With all hopes pinned on a framework there was no room for truly introducing any new approaches to avert disaster.

Ultimately every well-meaning effort was undermined by subjugation to the over-architected, meddling, micro-managed whole.

Separating DBA, UI, reports into teams

The high-level unified design being hammered out in meetings played into a manifestation of groupthink that had a dangerous effect on the division of problem areas. For example, in order to build the subsystems of the end product, there were 3 teams (DBA, UI and reports) to manage different aspects of each of the subsystems.

The idea was that the report system would be written once and then each subsystem would be easily added after that, the UI team would utilize common UI components across all subsystems, and the database administrators would design and manage a unified data and application model to greatly reduce potential maintenance down the road. These were meant to be independent teams, not just tasks within a larger team, and were to be judged on their progress separately.

However, on any given subsystem used by a specific group of users there are many details affecting all of the data, UI and reports. Effectively, the 3 teams were expected to each learn the exact same things about every subsystem: the user roles, tables, indexes, and the ways of accessing, displaying and filtering the data, all of which affect each other a great deal (e.g. a DBA cannot design user roles and indexes without knowing about the reports and filters that will be needed for all the affected subsystems). The most sobering problem was that compromises made for the unified design often turned out to be showstoppers for individual subsystems and this would drive the issues back up into the design meetings.
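To make the DBA example concrete, here is a minimal sketch of that dependency. It is not anything from the actual project: the Report class, the two helper functions, and the sample table, column and role names are all hypothetical, made up only to show that the index and grant decisions fall out of the report definitions.

```python
from dataclasses import dataclass


# Hypothetical report definition for one subsystem; every name is illustrative.
@dataclass(frozen=True)
class Report:
    name: str
    table: str
    filter_columns: tuple  # columns the report filters on
    roles: tuple           # user roles allowed to run the report


def indexes_needed(reports):
    """Collect the (table, columns) index candidates implied by the report filters."""
    return sorted({(r.table, r.filter_columns) for r in reports})


def grants_needed(reports):
    """Collect the (role, table) read grants implied by who runs each report."""
    return sorted({(role, r.table) for r in reports for role in r.roles})


# Assumed sample data, standing in for what the reports team would know.
reports = [
    Report("open_orders_by_region", "orders", ("region", "status"), ("clerk", "manager")),
    Report("orders_by_customer", "orders", ("customer_id",), ("manager",)),
]

print(indexes_needed(reports))
print(grants_needed(reports))
```

Both DBA-side outputs are computed entirely from information that, in this sketch, belongs to the reports side. Put the two on separate teams and that information has to be learned twice and kept in sync.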

A pragmatic approach would have been to give one team jurisdiction over all of the data, UI and reports of a particular subsystem and truly relegate the framework to optional services, but that idea was always dismissed as a maintenance nightmare.

Dividing each subsystem between the 3 teams ultimately led to an insurmountable effort, not to mention a dumbing down of the specific requirements that further hindered completion of any one subsystem. The cost of conformity and efficiencies across subsystems was too high a price to pay when it stopped progress in its tracks.

No matter what you do, you cannot get away from the reality that if a problem cannot be solved except in the context of the solutions to other problems, then you are greatly increasing the work by trying to assign it to a separate team.

Faced with increasing pressure and desperation from developers, the leadership eventually adopted a modified team structure based on subsystems with attached DBA and reports people. But they were too slow to ease the mandates on framework conformance to allow substantial productivity to set in. As a friend of mine likes to say, they were just "moving deck chairs around on the Titanic."

The consequences

After some 3 or 4 years, the customers started to make desperate demands to see something concrete. The project leaders chose a small sample subsystem and decided to accelerate it. This was going to be the showcase, the proof that the complex framework (the metaphorical "machine") they had put in place to generate the end product was going to yield vast efficiencies (see "Building the Machine That Will Build the Machine"). However, at every turn, it was the framework that was slowing them down. They took to spending whole days in the rooms with the developers, telling them to skip this and forget about that in a mad rush to get something concrete.

As the deadline approached they completely scrapped the application framework in favor of rapid application development, still maintaining that they would go back and retrofit the solution to their framework later. But they had already clung to framework goals too long to allow the team to adequately address the intricacies of the specific subsystem.

When the customers were presented with a demonstration of this sample program they were sorely disappointed. They recognized immediately that it was only a cursory treatment of the subsystem requirements it was meant to service. Many years and millions upon millions of dollars already invested were the only thing keeping them from bailing immediately. It would be some time still before the project was scaled back and effectively dissolved.

It is an unfortunate story that has affected me deeply. I observed this from a system services role (not any of the special, DBA, UI or report teams mentioned above) and wasn't in a position of influence. In a future article I'll talk about a "market" approach that I argue would have worked well because of this project's natural subsystem organization.

 

None of us...
...is as dumb as all of us.

Despite winning every battle, the overall war was lost, so every success was still viewed as a failure. While unfortunate, I learned so much of 'what not to do' while on the Program, and have to attribute much of my more recent success to the reminder of this failure. Great read, thanks!

submitted by Rainman, 19 Jan 2006 14:09:00 -0500

 

see discussion at Joel's forum:
http://discuss.joelonsoftware.com/default.asp?joel.3.294328

note by original poster Ben Bryant, 21 Jan 2006 11:00:00 -0500

 
