Scalable Agile Estimation and Normalization of Story Points: Calibrated Normalization Method for Top-Down Estimation (Part 5 of 5)
In Part 1 of this multi-part blog series, I introduced the topic of the series and provided an overview. Scalable agile estimation methods are required to provide reliable estimates of workload (work effort) and reliable velocity metrics (both estimated velocity and observed velocity) at the team, program and portfolio levels of large-scale agile projects. Without reliable workload estimates and velocity metrics at all levels, costs and return on investment cannot be determined meaningfully, and work cannot be prioritized effectively. For scalable agile estimation methods to work properly, story points across the space dimension (teams, epic hierarchies, feature group hierarchies, goals, project hierarchies, programs, portfolios, etc.) as well as across the time dimension (sprints, release cycles) need to have the same meaning and the same unit of measure. In other words, story points need to be normalized so they represent the same amount of work across the space and time dimensions.
In Part 2 I reviewed the key requirements that must be satisfied for traditional velocity-based planning to work properly. I presented three key challenges associated with traditional velocity-based planning, and how they are exacerbated as agile projects begin to scale up. The three key challenges are:
- A single story point is unlikely to represent the same amount of work across teams and sprints.
- Bottom-up story point data is frequently not available for estimating work during program and portfolio planning.
- Yesterday’s weather model requirements may not hold for one or many of the multiple teams involved in large-scale agile projects.
In Part 3 I presented two published scalable agile estimation methods along with my critique. The first method is covered by Mike Cohn in his Agile Estimating and Planning book. The second method is the story point normalization method promoted by SAFe that I refer to as “1 Point = 1 Developer Day Normalization Method” (1NM for short).
In Part 4 I presented a scalable agile estimation method, called Calibrated Normalization Method (CNM). Part 4 emphasized CNM bottom-up estimation (from teams to programs up to portfolios). CNM (bottom-up estimation) consists of four steps summarized below.
- Decide on the Normalization Basis for an enterprise.
- Estimate the relative sizes of stories using relative sizing techniques.
- Determine the Calibration Size for each team.
- Normalize the story points and enter the NSP for each story in an agile project management tool.
CNM is a general method for story point normalization, and is not tied to any specific agile scalability approach or framework. CNM can be applied to small (even single-team) agile projects, very large agile projects consisting of multiple portfolios and programs, and enterprises with a large number of mostly independent projects.
Before describing how top-down estimation works, it is important to recognize the key responsibilities of portfolio management and program management.
Portfolio management responsibilities:
- Assess, prioritize and select business initiatives to pursue that will create the most value for the company
- Authorize funding for business initiatives, review the progress to provide governance over funding
- Select business initiatives with less details, but gain better insight for governance by reviewing demonstrated progress via working software
Program management responsibilities:
- Coordinate teams and resources in the most efficient manner to deliver on the selected business initiatives
- Decompose business initiatives to identify specific deliverables and required resources
- Serve as the glue that links business strategy and initiatives to execution
- Update the portfolio level with project-level execution that might affect the value from a business initiative
A key component of CNM is the method for performing top-down estimation (from portfolios to programs down to teams). CNM estimates the scope of work at the portfolio and program levels without knowing lower-level story point details, which are typically not yet available. For a portfolio, CNM requires identification of a baseline epic, which is the epic about which the portfolio leadership team has the most knowledge. Similarly, CNM requires identification of a baseline feature in the baseline epic, which is the feature about which the program leadership team has the most knowledge. Finally, CNM requires identification of a baseline story in the baseline feature, which is the story that the program leadership team feels most confident in estimating. Then estimation begins. The baseline story is estimated in normalized story points (NSP). Once the baseline story is estimated, all other stories in the baseline feature are estimated using relative sizing techniques; the roll-up of those story estimates becomes the estimate for the baseline feature. Then the other features in the baseline epic are sized relative to the baseline feature, and the feature estimates roll up to the baseline epic estimate. Finally, the other epics in the portfolio are sized relative to the baseline epic, and the epic estimates roll up to the portfolio estimate. All of these estimates roll up consistently as NSPs.
Clearly these estimates can only be rough, as lower-level story point details for all stories are not available; only the NSP estimate for the baseline story is known, along with the relative sizes of stories (compared to the baseline story), features (compared to the baseline feature) and epics (compared to the baseline epic). This information is still more credible, and more directly tied to the portfolio and programs at hand, than swags (smart wild-ass guesses) or gut-feel numbers. These rough estimates done during portfolio planning are later progressively elaborated and revised during program planning and team-level sprint planning as more information becomes available about lower-level story points.
Portfolio and program-level estimates are used to answer two important and legitimate business questions asked by management.
- For a given set of teams and a fixed amount of time (usually represented in number of sprints and release cycles), estimate the scope of work (represented by planned epics or features). This is called Fix time/Flex scope planning, which is the most common type of agile planning. From the given set of resources and the fixed amount of time, cost (budget) can be derived.
- For a given set of teams and a fixed scope of work represented by planned epics or features, estimate the amount of time (number of sprints and release cycles) required, and estimate the cost that would be incurred to deliver those epics and features. This is called Fix scope/Flex time planning, which is less common than Fix time/Flex scope planning.
While we have discussed team-level planning in Part 4 of this blog series, in practice an enterprise will typically begin portfolio planning, then proceed to program planning, and finally do team-level planning.
Estimation at the portfolio level
A portfolio consists of one or more business initiatives realized by a set of epics. An epic may need one or more release cycles to be fully implemented. As shown in Figure 5, the business initiatives for Portfolio 1 are to be realized by a set of M epics: Epic 1.1, Epic 1.2 … to Epic 1.M. Most of these epics are business facing, but there may be a few architectural epics necessary to build an architectural runway. An enterprise may have multiple portfolios. Each portfolio consists of a set of programs. Program planning is done after the parent portfolio-level planning. A program consists of a set of features needed to realize an epic. Recall that each feature is small enough to be implemented in a single release cycle. Some people suggest taking a swag at estimating effort at the portfolio level by judging the “bigness” of epics on a linear scale of 1 to 5 or 1 to 10, or on a Fibonacci scale of 1, 2, 3, 5, and 8. However, swags are on a different scale than team story points, and are not normalized in any way.
Let us see an example of CNM top-down estimation as illustrated in Figure 5.
Figure 5: Portfolio-level planning and CNM top-down estimation
The baseline story is the lowest level in the hierarchy. In our example, the team estimates that the baseline story in feature 1.1.2 will take 21 ideal hours. This number is divided by the Normalization Basis (40 hours in our example) to calculate the NSP for the baseline story: (21/40) = 0.525, which Figure 5 shows as 0.52 inside the gray oval below the baseline story. Now all other stories in the baseline feature 1.1.2 are estimated in terms of their relative sizes, i.e., Team Story Points (TSPs), compared to the baseline story. This can be done by playing Planning Poker or using any other technique. In Figure 5, one story in the feature is 2x the size of the baseline story, and story 1.1.2.k is the same size (1x) as the baseline story. These TSPs are shown in yellow ovals below those two stories. Therefore, the NSP for the 2x story is (2 * 0.52) = 1.04, and the NSP for story 1.1.2.k is (1 * 0.52) = 0.52. The NSPs of all stories of the baseline feature 1.1.2 are rolled up (simple addition) to calculate the NSP for feature 1.1.2 (4.32 NSP, shown in the gray oval next to feature 1.1.2).
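The story-level arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the CNM template: the function names are my own, and the list of relative sizes passed to `feature_nsp` is hypothetical, because the full set of stories in Figure 5 is not listed in the text.

```python
NORMALIZATION_BASIS_HOURS = 40  # Normalization Basis used in the article's example


def baseline_nsp(ideal_hours, basis_hours=NORMALIZATION_BASIS_HOURS):
    """NSP of the baseline story: ideal hours divided by the Normalization Basis."""
    return ideal_hours / basis_hours


def story_nsp(relative_size, baseline):
    """NSP of any other story: its relative size (TSP) times the baseline story's NSP."""
    return relative_size * baseline


def feature_nsp(relative_sizes, baseline):
    """Roll up a feature by simple addition of its stories' NSPs."""
    return sum(story_nsp(size, baseline) for size in relative_sizes)


exact = baseline_nsp(21)  # 21 ideal hours / 40 hours, before any rounding
shown = 0.52              # the value the article carries forward from Figure 5

print(exact)                 # baseline story, unrounded
print(story_nsp(2, shown))   # the 2x story
print(story_nsp(1, shown))   # the 1x story (1.1.2.k)

# Hypothetical feature with three stories sized 1x, 2x and 1x.
print(feature_nsp([1, 2, 1], shown))
```

Note that the baseline story itself enters the roll-up with a relative size of 1.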
Going one level up to the program level where features are managed, efforts for all features of the baseline epic 1.1 are estimated in terms of their relative sizes (so-called “feature points”) compared to the baseline feature, and those feature points (now expressed in NSPs) are rolled up to the next higher-level, i.e., the epic level. Feature point estimation can again be done by playing Planning poker or some other technique for relative size comparison. In Figure 5, NSPs for all features in the baseline epic are shown, and their roll-up at the baseline epic 1.1 is also shown. The baseline epic 1.1 is thus estimated to take 23.3 NSP.
Going one level up to the portfolio level where epics are managed, efforts for all epics of Portfolio 1 are estimated in terms of their relative sizes (so-called “epic points”) compared to the baseline epic, and those epic points (now expressed in NSP) are rolled up to the next higher level, i.e., the portfolio level. Epic point estimation can be done by playing Planning Poker or some other relative size estimation technique. In Figure 5, NSPs for all epics in the portfolio are shown, and their roll-up at the portfolio level is also shown. Portfolio 1 is thus estimated to take 163.5 NSP, which is the same as (163.5 x 40) = 6,540 ideal hours of work. Of course, this number is a rough estimate.
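The same roll-up applies at every level, so one small helper can serve features, epics and portfolios alike. The sketch below is an assumption-laden illustration: `roll_up` and `nsp_to_ideal_hours` are names I made up, and the feature points `[1.0, 2.0, 0.5]` are hypothetical, since the individual relative sizes in Figure 5 are not given in the text. Only the 4.32 NSP baseline feature, the 163.5 NSP portfolio total and the 40-hour Normalization Basis come from the example.

```python
NORMALIZATION_BASIS_HOURS = 40  # Normalization Basis used in the article's example


def roll_up(baseline_child_nsp, relative_sizes):
    """Total NSP of a parent item.

    relative_sizes holds the size of every child relative to the baseline
    child; the baseline child itself appears in the list as 1.0.
    """
    return sum(size * baseline_child_nsp for size in relative_sizes)


def nsp_to_ideal_hours(nsp, basis_hours=NORMALIZATION_BASIS_HOURS):
    """Convert NSP back to ideal hours using the Normalization Basis."""
    return nsp * basis_hours


# Hypothetical epic: the baseline feature (4.32 NSP) plus two other
# features judged to be 2x and 0.5x its size.
epic_total = roll_up(4.32, [1.0, 2.0, 0.5])
print(epic_total)

# Portfolio total from the article, converted to ideal hours.
portfolio_hours = nsp_to_ideal_hours(163.5)
print(portfolio_hours)
```

Calling `roll_up` again with the epic total as the baseline and a list of epic points would give the portfolio total in the same way.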
Thus the estimation of effort at portfolio, program and team levels follows a common pattern, and hence this approach has fractal (i.e., “self-similar”) elegance. The effort estimate now available at the portfolio-level is better than swag or any other gut-feel estimate. It is also in units that can be converted into estimated cost based on the known loaded cost for one NSP of work (either based on historical data or loaded cost estimate for one NSP for the portfolio). Note that team story points, feature points and epic points are relative sizes of stories, features and epics. They are used for conceptual understanding. CNM converts all of them into a single currency, which is the Normalized Story Points.
Based on this effort, quick calculations can be done to estimate how many sprints or release cycles will be needed to complete the estimated work for the given capacity of each team. This is essentially the Fix Scope – Flex Time agile planning mentioned earlier. For example, if the Agile Capacity for each sprint for the portfolio is 400 hours, then (6,540/400) = 16.35, so at least 17 sprints will be needed.
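As a quick check of the Fix Scope – Flex Time arithmetic, here is a minimal sketch. The function name is my own; the key point is that the quotient is rounded up, because a partially used sprint still has to be scheduled.

```python
import math


def sprints_needed(workload_hours, capacity_hours_per_sprint):
    """Smallest whole number of sprints whose capacity covers the workload."""
    if capacity_hours_per_sprint <= 0:
        raise ValueError("capacity per sprint must be positive")
    return math.ceil(workload_hours / capacity_hours_per_sprint)


# Portfolio 1 from the article: 6,540 ideal hours at 400 hours per sprint.
print(sprints_needed(6540, 400))
```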
If there is a deadline of only 12 sprints, the portfolio capacity is only (12 x 400) = 4,800 hours. As the estimated work is 6,540 ideal hours, approximately (6,540 – 4,800) = 1,740 hours = (1,740/40) = 43.5 NSP worth of the lowest rank-order epics in the portfolio backlog need to be removed from the scope. This is essentially the Fix Time – Flex Scope agile planning mentioned earlier. Let us say you decide to remove Epic X and Epic Y (the lowest rank-order epics) from the portfolio backlog, which removes 35 NSP and falls 8.5 NSP short of the 43.5 NSP scope reduction target. If you were to remove the whole of the next epic, Epic Z (estimated scope of 12 NSP), you would reduce the scope too much. You can deal with this problem by breaking Epic Z into its member features, then estimating and rank-ordering those features to identify a few of the lowest rank-ordered features (with a total of 8.5 NSP) for scope reduction.
The estimates for both Fix Scope – Flex Time and Fix Time – Flex Scope planning are rough estimates only, but they can be explained in a meaningful and consistent way by using the NSP measure; these estimates then become a starting point for a conversation about different options. Portfolio or program planning based on swags or “bigness” gut-feel numbers does not offer these benefits.
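The Fix Time – Flex Scope calculation can be sketched as follows. The function names are my own, and the split of the 35 NSP between Epic X (20) and Epic Y (15) is a hypothetical assumption made only so the example runs; the text gives just their combined size.

```python
def required_scope_cut_nsp(workload_hours, sprints, capacity_hours_per_sprint,
                           basis_hours=40):
    """NSP that must leave the backlog to fit the work into the fixed time."""
    excess_hours = workload_hours - sprints * capacity_hours_per_sprint
    return max(excess_hours, 0) / basis_hours


def cut_whole_epics(lowest_ranked_first, target_nsp):
    """Remove whole epics, lowest rank first, without overshooting the target.

    Returns the removed epic names and the NSP still to be cut, which would
    have to come from individual features of the next epic in line.
    """
    removed = []
    remaining = target_nsp
    for name, nsp in lowest_ranked_first:
        if nsp > remaining:
            break  # removing this whole epic would cut too much scope
        removed.append(name)
        remaining -= nsp
    return removed, remaining


target = required_scope_cut_nsp(6540, sprints=12, capacity_hours_per_sprint=400)

# Hypothetical sizes for Epic X and Epic Y (only their 35 NSP total is given).
backlog = [("Epic X", 20.0), ("Epic Y", 15.0), ("Epic Z", 12.0)]
removed, still_to_cut = cut_whole_epics(backlog, target)
print(target, removed, still_to_cut)
```

The leftover amount is what the feature-level breakdown of Epic Z has to cover.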
Estimation at the program level
The details of estimation at the program level follow the same pattern described above for the portfolio level. The only difference is that the highest level now is the program level instead of the portfolio level. Each program in a portfolio is estimated by its own program leadership team. Each program leadership team does the estimation of effort for the baseline feature at the program level following the approach described above in portfolio estimation.
We now have more information available because for each program in a portfolio, we are identifying the baseline feature; and for the baseline feature of each program, we identify its baseline story. As an example, if we have 4 programs in a portfolio, we identify 4 baseline stories and estimate them in NSP. We now have more lower-level data available, which improves the reliability of estimates. This program-level estimate allows us to revise and refine the estimates done earlier at the higher portfolio level. Based on these revised, new estimates, some decisions taken earlier may be revised or refined. This is how progressive elaboration and estimation should work.
Estimation at the team level
This was already covered in great detail in Part 4 of this blog series. Once you have estimates at the team level, all stories are estimated in NSP across all teams for a sprint. All that information can be rolled up level by level, giving us more detailed, refined and revised estimates than those made earlier during program-level or portfolio-level planning. Based on these revised estimates, some decisions taken earlier at the program and portfolio levels may be revised or refined.
Portfolio-level and program-level planning sessions take place before team-level planning sessions. Sprint planning is done by each team for each sprint (typically 2 to 4 week sprints). Teams in a program should ideally do program-level planning jointly for each release cycle (3 to 6 month release cycles). Portfolio planning is done by the portfolio leadership team covering multiple release cycles (6 to 12 month planning horizon).
The Capacity_Workload_Calculator.xlsx downloadable template handles all the math required for Agile Capacity calculation and for workload estimation at the portfolio, program and team levels. You should start with its Instructions worksheet and follow the instructions to use the remaining worksheets.
- Normalization Basis worksheet: Enter the Normalization Basis for your enterprise.
- Team_Capacity worksheet: Used by each team to calculate its Agile Capacity for each sprint. The team capacity automatically flows to Team_Workload worksheet so you can easily compare the estimated workload for a team against the Agile Capacity for the team to find out if the team is overloaded or has excess capacity.
- Portfolio_Workload worksheet: Used to calculate the estimated workload of a portfolio in NSP and equivalent ideal hours.
- Program_Workload worksheet: Used to calculate the estimated workload of a program in NSP and equivalent ideal hours.
- Team_Workload worksheet: Used to calculate the estimated workload of a team in NSP.
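The capacity-versus-workload comparison mentioned for the Team_Capacity worksheet can be illustrated with a small sketch. This is not a reproduction of the spreadsheet's formulas, which are not described here; the function name, the 9.5 NSP workload and the 400-hour capacity are hypothetical, and the 40-hour Normalization Basis is taken from the example earlier in this post.

```python
def team_load(workload_nsp, capacity_hours, basis_hours=40):
    """Compare a team's estimated workload with its Agile Capacity.

    Returns the workload in ideal hours and the slack in hours; negative
    slack means the team is overloaded, positive slack means excess capacity.
    """
    workload_hours = workload_nsp * basis_hours
    slack_hours = capacity_hours - workload_hours
    return workload_hours, slack_hours


# Hypothetical team: 9.5 NSP planned against 400 hours of Agile Capacity.
workload_hours, slack_hours = team_load(9.5, 400)
status = "overloaded" if slack_hours < 0 else "within capacity"
print(workload_hours, slack_hours, status)
```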
CNM and the Capacity_Workload_Calculator.xlsx downloadable template have been used by my clients on many projects since 2010, leading to several refinements that culminated in the current version of CNM and the template as described in this blog.
How CNM deals with all three challenges listed in Part 2
- Challenge 1: A single story point is unlikely to represent the same amount of work across space and time
SAFe’s 1NM requires each team to find a 1 IDD story as its benchmark, and does not recommend any re-calibration. Its Calibration Size is 1 IDD, and its Normalization Basis is also 1 IDD, i.e., 8 ideal hours of effort. CNM, on the other hand, decouples the Calibration Size from the Normalization Basis. It determines the Calibration Size through sprint-by-sprint re-calibration of the TSP, if needed. This is a more robust normalization technique because it responds to changing reality sprint by sprint. Story points are normalized properly across both the space and time dimensions.
- Challenge 2: Bottom-up story point data is not available for estimating work during program and portfolio planning
In this Part 5 of the blog series, I have presented the details of how CNM addresses the issue of top-down estimation at the portfolio and program levels without requiring the knowledge of story points for all team-level stories.
- Challenge 3: Yesterday’s weather model requirements may not apply.
CNM does not require any historical velocity data. When a team has little or no historical velocity data, it simply calibrates (or recalibrates) its Team Story Point based on a sample of up to three stories. CNM does not depend on yesterday’s weather model requirements. For each sprint, a team recalibrates its Team Story Point and recalculates its Point Conversion Factor if any requirement of yesterday’s weather model is not going to be met; if yesterday’s weather model is applicable, you may use the average velocity (in NSP) from the last 3 to 4 sprints to estimate the velocity for the next sprint (in NSP). Recalibration based on up to 3 sample stories may take approximately 20 minutes (a smaller sample of 1 or 2 stories will take less time), and it is done only once per sprint, during sprint planning. Once the Agile Capacity is calculated (a 15-minute exercise using the template provided with this Part 5 of the blog series), estimating the velocity in NSP for a sprint is a snap. Challenge 3 is a non-issue for CNM, as yesterday’s weather model is neither assumed nor required to hold true.
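For the case where yesterday’s weather model does apply, the velocity estimate is just an average over the most recent sprints. A minimal sketch, with a function name and sample velocities that are purely hypothetical:

```python
def estimate_next_velocity(past_velocities_nsp, window=4):
    """Average the most recent `window` sprint velocities (all in NSP)."""
    if not past_velocities_nsp:
        raise ValueError("no historical velocity; recalibrate instead of averaging")
    recent = past_velocities_nsp[-window:]
    return sum(recent) / len(recent)


# Hypothetical observed velocities in NSP, oldest sprint first.
history = [10.0, 12.0, 11.0, 13.0]
print(estimate_next_velocity(history, window=3))
```

Because every sprint's velocity is already expressed in NSP, the average stays meaningful even if the team recalibrated its Team Story Point between those sprints.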
Comparison between 1NM and CNM
SAFe is a complete framework for managing large scale agile projects. CNM is not a scalability framework, but only a method for scalable agile estimation. Table 1 shows a comparison between SAFe’s 1NM and my method CNM.
Swags use yet another currency of effort in an enterprise besides story points and ideal hours. These three currencies have no relationship. Having three unrelated currencies for estimating effort is confusing and unnecessary. Unless you normalize story points in a large enterprise, story point math and planning based on story points will not make sense. For the same reasons story points may lose meaning without normalization, swags too may lose meaning without normalization. Normalized story points are naturally related to ideal hours and provide a single currency for estimating effort across the space and time dimensions. A single-currency world is much simpler to understand and manage.
Summary and concluding remarks for the blog series
Estimates are estimates, and not commitments or contracts. This fact does not change with scalable agile estimation methods using normalized story points. Scalable agile estimation can be classified based on the degree of centralized vs. distributed decision making, as illustrated in Figure 6. Mike Cohn’s method (covered in Part 3 of this blog series) is an example of the centralized class. Another example of the centralized class is the method of “cross-team synchronization on points with a canonical set” (briefly covered in Part 3) by Larman and Vodde (Practices of Scaling Lean & Agile Development, Chapter 5, pp. 182-183). As explained in Part 3, centralized methods are difficult to scale for large-scale agile projects. SAFe’s normalization method, 1NM (also covered in Part 3), is an example of the semi-distributed class: each team is allowed to select its own benchmark story for 1 normalized story point (a decentralized operation), but each team must select a story of 1 IDD (8 ideal hours) effort as 1 normalized story point (NSP), which is a centralized decision. CNM is fully decentralized, as each team is free to select its own 1 Team Story Point (TSP) story, and no team is required to select 1 TSP of any pre-defined size (let the chips fall wherever they may).
Figure 6: Spectrum of Scalable Agile Estimation Methods
- CNM can be applied to small (even single-team) agile projects, very large agile projects consisting of multiple portfolios and programs, and enterprises with a large number of independent projects.
- CNM deals with various challenges in estimating large-scale agile projects, and offers advantages over centralized and semi-distributed methods.
- CNM is not tied to any specific agile scalability approach or framework, such as SAFe. However, CNM can be used in conjunction with SAFe.
- SAFe’s normalization method, 1NM, can be considered as a special case of CNM.
- CNM is not tied to any commercial agile project management tool.
- A free downloadable template makes agile capacity calculation and story point normalization math quick and easy.
- CNM promotes local, decentralized, and autonomous decision making at the team level by allowing teams to use their own story point scales.
- Using top-down estimation techniques and baseline estimates, portfolio and program estimates based on CNM are more credible than swags or gut-feel numbers.
- CNM-based portfolio-level normalized estimates become more relevant as they are revised when more information becomes available about lower-level story estimates and teams’ actual velocities.
- CNM supports Fix time/Flex scope as well as Fix scope/Flex time planning.
- CNM has been continuously refined based on user feedback from clients since 2010.
- CNM offers a solid foundation for consistent estimation across large-scale agile projects.
If you would like to receive a single file version (as a PDF document) of this 5-part blog series, please send me an email at Satish.Thatte@VersionOne.com requesting the document.
I hope that you have benefited from this comprehensive 5-part blog series. Please provide feedback and/or comments via email at Satish.Thatte@VersionOne.com or on Twitter: @smthatte.
Acknowledgements: I have greatly benefited from discussions and review comments on this blog series from my colleagues at VersionOne, especially Dave Gunther, Andy Powell and Lee Cunningham.
Part 1: Introduction and Overview of the Blog Series – published on 14 October 2013.
Part 2: Estimation Challenges Galore! – published on 4 November 2013.
Part 3: Review of published scalable agile estimation methods – published on 7 November 2013.
Part 4: Calibrated Normalization Method for Bottom-Up Estimation – published on 18 November 2013.