Breakthrough ideas often occur when we bring concepts from one field into new and relatively unfamiliar territory.
When problems in software development persisted at Schneider Electric R&D, the company decided to do just that. Adopting flow principles typically used in operations and manufacturing of physical products, Schneider and partner Vector Consulting Group — a supply-chain operations firm — diagnosed a core issue that afflicts many of today’s software development environments.
Their solution, the “Rapid Feature Flow Model,” uses supply chain fundamentals to tackle Schneider’s complex software development methodology.
Schneider Electric Engineering Workbench, a product from Schneider Electric's Process Automation business, helps automate the engineering and configuration of the instrumentation and control systems of brownfield and greenfield projects. Faster engineering reduces the lead time to start of operations, and thereby improves profitability and reduces the risk of penalties for the automation vendor.
Software requirements (both technical and marketing) are pooled from globally distributed teams of technocrats and marketing personnel and converted into a road map of deliverable products (or software versions). Each product is thus a predefined bundle of features that has to be completed before the decided release date. Schneider’s project management team describes the “user stories” in each feature and then outsources the development of these products to software service companies.
Schneider's outsourcing partners used to follow the Agile methodology, developing the software in sprints of four weeks each. During each sprint, teams of developers and testers created “tasks” for each “user story,” wrote code for it, and tested it. As user stories were completed, customer demos were held, followed by functional testing. The final round of testing was done in the last sprint of a release, called the “hardening sprint.” Once all features cleared these test cycles, the product was declared ready for release.
Unfortunately, over time the company found itself in an environment of poor results, increasing work stress and a strained reputation with customers. The Engineering Workbench experienced poor productivity, significant rework (about 20 percent) and effort overrun (about 30 percent). Half of the defects detected led to working on requirements and design again! Features coming out of the periodic sprint cycles were not “usable,” and there was no predictability, particularly for clients who wanted a firm release schedule for a set of usable features so they could plan their own projects. Further, in spite of many developers regularly working late nights, lead times were stretched (a release would take up to 10 months), and marketing often complained of opportunities missed because of these long lead times. Moreover, when the product was eventually released, customers complained of defects and became unhappy.
They soon realized that they had a serious problem ensuring the flow of products through the value chain. And they needed to re-examine the methodology they had adopted for software development and use supply chain thinking to handle their complex environment.
The most popular software development methodologies used the world over are Waterfall (the traditional approach) and Agile (which is often implemented using Scrum). The Waterfall methodology is thought to be fraught with problems and liable to run projects into chronic delays, so its use for software development is waning. The Agile methodology, which addresses the conceptual and implementation problems with Waterfall, is now considered superior, and an increasing number of companies are adopting it.
Waterfall: Process Over Speed
The Waterfall methodology was designed with two objectives:
For the first need, freezing requirements before starting the project is an absolute must. Once requirements are frozen, the accountability for any effort or time overrun can be clearly traced back to either of the stakeholders. The second need requires one to approach software development like one would development of a new physical product. Developing a new physical product requires the design to be complete before start of the actual building — allowing design to remain fluid during implementation can lead to uncontrolled rework and interruptions.
Borrowing from these principles, a Waterfall methodology requires one to move the project into following phases sequentially with clear “gates”, marking completion of each phase.
Strict processes are usually put in place as a “deterrent” against any rethink late in the process. However, when customer testing is undertaken at the end of the entire project, “hindsight” clarity creates severe conflicts between customer and developer over what was “told” initially in the requirements phase and what was “understood.” So, practically speaking, late scope redefinition is difficult to prevent, more so when the relationship between customer and supplier is not one of equals. The resulting effect is massive rework, leading to uncontrolled delays. In this environment, therefore, contractual conflicts, delays, effort spikes and the associated stress toward the end are quite common.
Agile: Speed Over Process
In sharp contrast, Agile takes a viewpoint that scope and design cannot be frozen upfront. Hence, it adopts a flexible working approach, which can embrace change much later in the development cycle. Instead of taking a “one perfect project delivery” viewpoint of software development, Agile believes in the principles of delivering usable features, good enough for customers to start using, and then keep improving based on frequent feedback. The idea is to deliver workable software as soon as possible and get feedback for further iterative development. This way of working also takes away the need to freeze requirements and design upfront.
So, in the Agile world, one can move away from sequential working between phases of software development. The team can work as a self-organizing group that designs, develops, tests and delivers workable features in an iterative manner. Time is often managed using the concept of sprints (widely used in the version of Agile called Scrum): a “time box” of 15 or 30 days is used to deliver small batches of usable software features. While the ongoing sprint is frozen, subsequent ones can be changed based on customer feedback.
The Problem With Agile
When the Agile methodology evolved, many believed that the “silver bullet” had been found. However, there is little scientific support for many of the claims made by the Agile community. Moreover, numerous blogs have been highlighting the increasing stress on developers, the ever-lengthening backlog of bugs and customer dissatisfaction with the usability of software.
As a counter argument, proponents of the methodology have pointed out that implementation gaps, not conceptual gaps, are the primary reason for these woes. In order to validate these arguments, it is important to define the boundary conditions or assumptions of the theory and check if these are violated in a majority of cases.
Agile advocates “self-organizing teams” and iterative development. The result of any Agile intervention is the reorganization of an erstwhile functional structure of design, development and testing into many small self-organizing teams, which are expected to be self-sufficient in all these tasks. The assumption that small, divided teams are self-sufficient in high-end skills, particularly design and requirements assessment, is highly questionable in most companies. High attrition rates, with many developers changing jobs every two to three years, are hard realities of an industry obsessed with outsourcing and reducing the cost of development.
The high attrition rate does not affect the availability of generic skills, such as expertise in a programming language. But environment-specific skills like design and requirements analysis, which are acquired only with adequate experience, are scarce. So it is highly likely that, once skills are decentralized after an Agile intervention, many Agile teams will fall short on skills, even though each team may have the required numbers to be declared “self-sufficient” on paper.
Fluid Requirements and Design
Without the required skill level or a formal process to control sequential movement in a team, the rigor of design and requirements assessment suffers. Hence, testing becomes the primary source of feedback on design and development gaps. Consequently, continuous “flow back” from testing causes frequent interruptions. The problem of interruptions and rework is further aggravated when different small Agile groups work independently on different sets of features in parallel; design decisions, at times, become “local,” which causes conflicts between teams and creates further rework.
The few SMEs or designers in some self-organizing teams become overloaded. These scarce resources are forced to multitask beyond their capacity; they have to not just work within their own team but also get into issue management for other teams. This further erodes the productivity of the system, as everyone waits for feedback from these heavily loaded resources. The end result is an uncontrolled development process.
Standard Sprint Cycles
Agile (Scrum specifically) has an antidote for this inherently chaotic working approach: sprint deadlines. Since the final product is cut up into smaller features that have to be completed in a sprint, the sprint deadline forces the open work fronts on these features to closure. But the effort spike close to the sprint end, made to somehow complete the sprint, compels compromises: some testing is skipped, some bugs are kept aside, some scope is set aside for later sprints. Moreover, as almost all resources in the team become extremely busy toward sprint closure, any planning for the subsequent sprint is put off to the start of that sprint.
Consequently, in every sprint, the initial time is used up in planning work. Actual development and testing happen only in the second half of a sprint cycle, thus building time pressure, followed by skips and misses. Testing resources stay underutilized at the beginning of a sprint and then get overloaded at the end, when all planned features land together. The sprint-end rush and its compromises become inevitable!
The damaging effect of “forcing” a standard sprint cycle is not just seen in execution but also in planning. The plan arbitrarily “cuts” deliverables based on a standard time. At times, lead sprint managers size out work which may not result in any feature which is usable at the end of the sprint.
When execution is chaotic, and features coming out from a sprint cycle are not “usable”, the predictability for the end customer is lost, particularly for those who might want a firm release schedule for a set of usable features to plan their own projects.
So, in many environments, it is not uncommon to find hidden Waterfall-based project management imposed upon Agile. This brings together the worst of both approaches in terms of productivity losses.
A Hybrid Solution
Typically, an overall project plan is created for “final release” for customer commitment. The longer-term plan is broken into “sprints” with every sprint having its own predefined scope plan. The commitment to the end customer is the final release. Sprints turn into typical milestones of an overall project schedule.
When a long series of consecutive sprints is predefined with scope, the flexibility to re-scope or repeat a sprint based on feedback from previous sprints becomes nearly impossible. So after every sprint, the pressure is to work on the scope of the next sprint rather than to deal with feedback and issues from the previous one.
This creates an ever-increasing bug list and leads to massive effort spiking towards the end of the “final release”. Some companies leave aside a time box of “final sprint” before final release to clear all the old mess of previous sprints. So in the real sense, customers don't get anything usable till the final release.
At times, even the “buffer” of one sprint is not enough to make up for all the past sins. Less available time, pressure to deliver and overload of pending work is a perfect recipe for more compromises. Bad quality products are likely to be delivered to customers who report the bugs back to the development team. Pressure to expand the team to tackle the huge list of bugs of previous releases while developing for new releases increases the cost of development. Stress, poor productivity, delays and bad quality cannot be a rare phenomenon in such an environment.
At the same time, when an overall release project is superimposed on Agile cycles, the typical up-front waste of time and development capacity on assessing the entire project's effort, scope and plan, and on the associated negotiations, becomes a reality. Software development companies lose out on all sides.
The Core Conflict
One of the assumptions of the Waterfall method, which was questioned by Agile, is the ability of customers to clearly specify requirements up front. Many times, a requirement becomes clear only when customers actually see or start using something. At the same time, the need for perfection or the “best” can make the requirements phase go out of control. Hence, Agile did away with the necessity of trying to be perfect in design and requirements gathering, but it struggles with inadequate design. These two standpoints are behind a very generic conflict underlying all software development processes.
On one hand, we should freeze designs and requirements before implementation because this approach ensures that rework is under control and hence, time is under control.
On the other hand, we should not freeze design and requirements because it is impossible to freeze them. Users get clarity on what they actually want only after using the product.
Direction of Solution
The only way out of the above conflict is to understand that there are clearly two types of rework generated in any new product development process.
Type A (Foresight-led rework): These are misses and skips made under time pressure and because of a lack of synchronization between team members. Some people in the team know about the relevant inputs, but because of time pressure the inputs are not incorporated in time; this leads to rework later in the process. These errors cause frustration among team members for missing the obvious. Using testing resources to get feedback on Type A errors is a criminal waste of time and capacity.
Type B (Hindsight-led rework): This is rework created by new knowledge generated after an event of testing or clear visualization. It is, in a way, value adding, because new knowledge is generated in the process. Prototypes generate Type B value-adding rework. Type B rework is inevitable, and no amount of thought experiment can help one visualize Type B errors up front. The only way out is to devise a process that generates Type B errors early in the development cycle.
The Waterfall method implicitly assumes that all errors are Type A and hence should be prevented. This school of thought has inspired the manufacturing slogan “First Time Right.” Process rigidity with checks and “no-go” gates becomes the key to avoiding Type A errors. Agile assumes that all errors are Type B; hence it considers one-time handovers between phases a waste of time and promotes frequent back-and-forth between all phases.
Learning From Auto Manufacturing
Interestingly, this “Agile” way is exactly how automobiles used to be manufactured in the late 19th century. It was called the craft manufacturing method. Small groups of workers with mixed skills were each given a car to assemble from start to end. This way of manufacturing cars was very time consuming and expensive.
When Henry Ford presented the assembly line concept with clear division of labor, car assembly was transformed. The specialization on one type of task, along with avoidance of switching losses (of craft manufacturing) between different types of work enabled a productivity jump of close to five times and dramatically reduced the cost of making a car!
However, an assembly line working approach requires two critical conditions to be successful: 1) Flow backs are negligible. If the nature of work is such that it frequently comes back to previous workstations for rework, then it can create chaos and reduce output. (In such an environment, craft manufacturing, or the Agile method, is a better way). Assembly line working is enabled by standardization of parts, i.e., without the need to file and adjust ill-fitting parts for a specific car, there is no flow back. 2) Transfer batches should be small (ideally a single piece flow).
Transferring of big batches not only expands lead time in every stage but also leads to late detection of quality issues.
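The effect of transfer batch size on lead time can be illustrated with a small calculation. The sketch below is a simplified model, not something taken from the Schneider case: it assumes a serial line in which every stage works on one item at a time, every item takes the same time at every stage, and a batch moves to the next stage only when the whole batch is finished. The function name and parameters are hypothetical.

```python
def transfer_batch_lead_times(num_items, num_stages, time_per_item, batch_size):
    """Return (first_batch_done, last_batch_done) for a serial line.

    Simplifying assumptions (illustrative only): identical stages, one item
    processed at a time per stage, and a batch is handed over only when
    every item in it has been processed by the current stage.
    """
    if num_items % batch_size != 0:
        raise ValueError("num_items must be a multiple of batch_size")
    num_batches = num_items // batch_size
    batch_time = batch_size * time_per_item

    # done[b] is the time at which batch b left the previous stage.
    done = [0] * num_batches
    for _ in range(num_stages):
        stage_free_at = 0
        for b in range(num_batches):
            start = max(done[b], stage_free_at)  # wait for batch and stage
            stage_free_at = start + batch_time
            done[b] = stage_free_at
    return done[0], done[-1]


# 12 items through 4 stages, 1 time unit per item per stage.
print(transfer_batch_lead_times(12, 4, 1, batch_size=12))  # (48, 48)
print(transfer_batch_lead_times(12, 4, 1, batch_size=4))   # (16, 24)
print(transfer_batch_lead_times(12, 4, 1, batch_size=1))   # (4, 15)
```

Under these assumptions, single-piece flow delivers the first item after 4 time units instead of 48, which is also the earliest point at which the final stage can reveal a quality problem.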
The New Rapid Feature Flow Model
These two critical conditions can be used to resolve the inherent conflict of software development process:
Minimizing flow backs. Organizing resources into a requirements assessment group, a design group, a development group and a testing group creates the division of labor required for assembly line working. This allows the limited expert resources to be centralized in the requirements and design groups. This grouping also helps ensure that their capacity is used on the type of work that requires their expertise.
However, without a clear definition of what constitutes the end of requirements and the end of design, assembly-line-style sequential working between resource groups will never materialize because of flow backs. It is also important to put in place a process to check completions; this is imperative to break the bad habit of proceeding without completing the phases.
Requirements phase. This phase should focus not only on defining what should be done in a feature but also defining what will not be given. It is also important to begin with the end in mind. The inputs of all test cases to be used in the final validation should be defined. At times, visual prototypes can also help customers visualize what they really want.
Design phase. The design stage should focus on unearthing detailed dependencies in terms of interactions with different functionalities. Identifying verification test cases up front in this stage ensures that gaps related to dependencies, which can lead to regressions, are minimized. In the absence of proper design, a lot of capacity can be wasted later in verification testing, as fixing one issue might break other things, creating a vicious loop that makes flow unpredictable.
The gates of design and requirements, along with the principle of “beginning with the end in mind,” should help in preventing Type A errors and in unearthing Type B errors early in the process.
Minimizing transfer batch. Just having an assembly line with delivery coming out after a period of six months or a year is also undesirable. When different customers require different features at different points of time, having one delivery date for a large set of features is of no use to anyone. It is unreasonable to expect the customer to wait till all other features are complete. So, moving individual features through the assembly line in order of priority ensures better flow and early delivery of workable features that individual customers want.
The way to make a transfer batch of one feature is to institute a WIP (work in process) limit in the different resource groups. Deployable features can be defined at a usable level, as there is no need to force-fit a feature into a sprint. The work-in-process control implies that a new feature will be introduced to a resource group only when one in progress is complete as per the criteria set for handovers. It also ensures that features are output in a staggered manner and that the load on testing is leveled out (instead of the wave of lumpy testing work that sprint-based scheduling produces). This staggered flow prevents temporary bottlenecks in the system.
The WIP control also aids transparency of queues, which in turn helps in better capacity planning between resource groups. Skill group sizes can be adjusted from time to time to manage any temporary lengthening of a queue.
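The WIP rule described above can be sketched in a few lines. This is a minimal illustration rather than Schneider's actual tooling: the `ResourceGroup` class, its method names and the feature labels are all hypothetical. A group starts a queued feature only while it is below its WIP limit, and a feature moves downstream only once its handover criteria are met.

```python
from collections import deque


class ResourceGroup:
    """Hypothetical resource group (e.g. design, testing) with a WIP limit."""

    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.queue = deque()    # features waiting, in priority order
        self.in_progress = []   # features currently being worked on

    def enqueue(self, feature):
        self.queue.append(feature)

    def pull(self):
        """Start queued features only while WIP is below the limit."""
        started = []
        while self.queue and len(self.in_progress) < self.wip_limit:
            feature = self.queue.popleft()
            self.in_progress.append(feature)
            started.append(feature)
        return started

    def complete(self, feature, handover_criteria_met, downstream=None):
        """Hand a feature over only if the handover criteria are met."""
        if not handover_criteria_met:
            return False  # stays in progress and keeps occupying a WIP slot
        self.in_progress.remove(feature)
        if downstream is not None:
            downstream.enqueue(feature)
        return True


design = ResourceGroup("design", wip_limit=1)
development = ResourceGroup("development", wip_limit=2)
for feature in ["F1", "F2", "F3"]:
    design.enqueue(feature)

design.pull()                                   # starts F1 only
design.pull()                                   # starts nothing: limit reached
design.complete("F1", True, development)        # F1 joins development's queue
design.pull()                                   # now F2 may start
print(len(design.queue), len(development.queue))  # visible queue lengths
```

Because every group's queue is an explicit list, its length at any moment is the kind of visible signal that capacity planning between groups can use.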
The above principles will help in rapid flow of features from this system. The rapid delivery of individual deployable features also helps eliminate the search for perfection in requirements phase. Customers (or marketers of the software) can keep improving the features by initiating work for the next versions.
The suggested model of working, however, has the following drawbacks: 1) The rule of WIP control makes scheduling of tasks redundant. In the absence of a time schedule, predictability is lost. 2) It is in direct conflict with the typical sales planning process of a product company that works on annual or biannual software release plans.
But these drawbacks can be addressed successfully.
The above system exposes queues in the system and helps differentiate touch time (actual work time) from time spent waiting in queues. If the content of features can be standardized as multiples of a unit work package, then it is possible to model the expected arrival time and so provide the desired predictability. When flow improves, with a drastic reduction in interruptions, priority changes and rework, the predictability of the system goes up manyfold.
But it is important to use the time scale only to provide continuous predictability (as Google Maps does by looking at queues on the road) and not as a tool to drive “commitment” in execution or as a schedule. It is a fallacy that schedules, and commitment around schedules, prevent time wastage. In reality, wastage of time is prevented by a daily process of issue resolution and close hand-holding by flow supervisors in the respective groups.
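One way to picture such a forecast is sketched below. It is a deliberately simple model and not the one used at Schneider: it assumes a single queue worked strictly in priority order, features sized in unit work packages, and a constant observed throughput. The function name, feature labels and numbers are hypothetical.

```python
def forecast_arrival_days(queue, units_per_day):
    """Estimate days until each queued feature is done.

    queue: list of (feature_name, size_in_unit_work_packages) in priority
    order. Assumes features are worked one after another at a constant
    observed throughput; both are illustrative simplifications.
    """
    if units_per_day <= 0:
        raise ValueError("units_per_day must be positive")
    forecasts = {}
    units_ahead = 0
    for name, units in queue:
        units_ahead += units  # work queued ahead plus the feature itself
        forecasts[name] = units_ahead / units_per_day
    return forecasts


queue = [("A", 3), ("B", 2), ("C", 5)]
print(forecast_arrival_days(queue, units_per_day=0.5))
# {'A': 6.0, 'B': 10.0, 'C': 20.0}
```

The estimate is meant to be recomputed whenever the queue or the observed throughput changes, in the same way a navigation estimate is refreshed, rather than frozen into a schedule that people are held to.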
Dynamic Software Release Planning
The new model also mandates that one move away from managing software development as a large monolithic project with a detailed plan aiming for a single release. Final releases can be made as frequently as the release overheads allow. Single features or sets of features can be bundled into releases as and when one wants. Instead of committing up front to a release plan for a year, marketers of the software can decide on a release strategy based on the dynamics of the market. This provides the flexibility needed to react to competitors' strategies, and hence to change the queue of features in the waiting list to deal with market dynamics.
Within eight months of implementing the new Rapid Feature Flow Model, Schneider reported the following benefits:
Customer satisfaction. Hitherto, customers had continued to use older versions in spite of new versions being available (though late) because they knew that the new one would be full of bugs. Now they are not only happy with the reliability of product delivery, they also have confidence in the product and so readily upgrade.
Focus on improvement initiatives. At the end of every feature, the origins of obstacles are identified and a fix is incorporated in the next feature itself. Senior management also now concentrates on POOGI (a process of ongoing improvement) activities instead of the erstwhile practice of daily firefighting.
Time released for pure R&D work. With the time now freed from firefighting in development, the company is able to dedicate time to pure R&D work (1 patent filed and 3 RDIs completed).
Stress-free environment. People now not only find the work environment stress-free, they are also going home on time.