# Scalable Agile Estimation and Normalization of Story Points: Calibrated Normalization Method for Bottom-Up Estimation (Part 4 of 5)

In Part 1 of this multi-part blog series, I introduced the topic and provided an overview. Scalable agile estimation methods are required to provide reliable estimates of workload (work effort) and reliable velocity metrics (both estimated and observed velocity) at the team, program and portfolio levels of large-scale agile projects. Without reliable workload estimates and velocity metrics at all levels, costs, return on investment and project prioritization cannot be determined effectively or meaningfully. For scalable agile estimation methods to work properly, story points across the space dimension (teams, epic hierarchies, feature group hierarchies, goals, project hierarchies, programs, portfolios, etc.) as well as across the time dimension (sprints, release cycles) need to have the same meaning and the same unit of measure. In other words, story points need to be *normalized* so they represent the same amount of work across the space and time dimensions.

In Part 2 I reviewed the key requirements that must be satisfied for traditional velocity-based planning to work properly. I presented three key challenges associated with traditional velocity-based planning, and how they are exacerbated as agile projects begin to scale up. The three key challenges are:

- A single story point is unlikely to represent the same amount of work across teams and sprints.
- Bottom-up story point data is frequently not available for estimating work during program and portfolio planning.
- Yesterday’s weather model requirements may not hold for one or many of the multiple teams involved in large-scale agile projects.

In Part 3 I presented two published scalable agile estimation methods along with my critique. The first method is covered by Mike Cohn in his *Agile Estimating and Planning* book. The second method is the story point normalization method promoted by SAFe that I refer to as the “1 Point = 1 Developer Day Normalization Method” (1NM for short).

In this Part 4 I present a scalable agile estimation method, called the **Calibrated Normalization Method** (CNM), which I have developed, taught and applied in agile training and coaching engagements with clients since 2010. Part 4 emphasizes CNM *bottom-up* estimation (from teams to programs up to portfolios). CNM is a general method for story point normalization, and is not tied to any specific agile scalability approach or framework, such as SAFe. CNM can be applied to small, single-team projects, very large agile projects consisting of multiple portfolios and programs, as well as within enterprises with many independent projects.

**Normalization analogy from the currency conversion domain**

The concept of normalizing estimates is analogous to converting (i.e., normalizing) currencies across different divisions of a multi-national company. This analogy helps us understand the need for story point normalization across the space and time dimensions of a large enterprise.

As shown in Figure 2, a US-based company has divisions in Germany, the UK, Australia and Hong Kong, each reporting its second-quarter revenue in its local currency (Euro, Pound, Australian dollar, and Hong Kong dollar, respectively). US Headquarters needs to calculate corporate revenue in US dollars. The US dollar plays the role of currency “Normalization Basis.” Figure 2 also indicates the currency conversion factors, for example 1 £ = 1.59 US$. In this case, the currency conversion factor for the UK pound is 1.59. It is meaningless to simply add the revenue numbers without first converting (normalizing) the numbers to a standard currency.

Figure 2: Normalizing currencies in a multi-national company

Each division normalizes its local currency to US dollars by applying its currency conversion factor. After currency conversion (normalization), the total revenue for the company can be expressed in US dollars.

**Step-by-Step description of CNM for bottom-up estimation**

CNM (bottom-up estimation) consists of four steps described below.

**Step 1: Decide the Normalization Basis for an enterprise**

All teams, projects, programs and portfolios across the enterprise should agree on the ideal hour equivalent of 1 **Normalized Story Point (NSP)**. The number of hours decided is called the **Normalization Basis**. Thus 1 NSP = the number of ideal hours decided upon = the Normalization Basis.

The Normalization Basis for an enterprise can be an arbitrary value, such as 8, 10, 16, 20, 25, 32, 40, 100, or even 1 ideal hour. It does not matter what the Normalization Basis is, as long as everyone agrees on a single value throughout the enterprise. The Normalization Basis value can be decided simply by fiat or a random draw. As such, Step 1 is literally a one-minute decision step; no discussion is warranted. This number will remain constant across the space and time dimensions for an enterprise.

In this blog series, as an example assume that an enterprise has decided its Normalization Basis = 40 ideal hours or 1 week (1 NSP = 40 ideal hours). 40 ideal hours per 1 NSP becomes the common standard for representing estimates throughout the enterprise covering both the space and time dimensions.

One ideal hour of effort has the same meaning across the enterprise. Similarly, one NSP of effort has the same meaning (40 ideal hours), as per the example above. In one ideal hour of effort, different teams may produce different amounts of output based on their productivity. One ideal hour of effort represents the same amount of effort, but not the same level of output, across all teams, projects, programs, portfolios, and sprints.

**Step 2: Estimate the relative sizes of stories using relative sizing techniques**

Relative sizes of stories are commonly estimated in story points by using techniques such as Planning Poker. Because a story point only has meaning in the context of the team that did the estimation, I call it a Team Story Point (TSP). TSPs for one team are not equivalent to TSPs for any other team. Even for the same team, TSPs may not be comparable across different sprints.

**Step 3: Determine the Calibration size for each team**

In CNM, each team calibrates the size of one TSP by using a sample of up to 3 stories from its sprint backlog for each sprint. This process determines the average number of ideal hours per TSP, or **Calibration Size**, for a team.

Calibration Size = (Total estimated hours of effort for up to 3 sample stories) / (Total Team Story Points for same sample stories) = Team Hours per Team Story Point (TSP)

As an example based on Figure 3:

The team’s Calibration Size = (29 + 62 + 98) / (1 + 2 + 3) = 31.5 ideal hours per TSP

Figure 3: Story Point Calibration

Using the Calibration size, the team can predict how many ideal hours it will take to complete a story of a given size. For example, if a story is estimated to be 3 TSP, then it will likely take 3 * 31.5 = 94.5 ideal hours to complete.
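The calibration arithmetic of Step 3 can be sketched in a few lines of Python. This is my own illustration of the formula above, not part of any tool; the `calibration_size` helper name is an assumption:

```python
def calibration_size(sample):
    """Calibration Size = total estimated ideal hours / total TSP,
    computed over a small sample of stories.
    `sample` is a list of (estimated_ideal_hours, team_story_points) pairs."""
    total_hours = sum(hours for hours, _ in sample)
    total_tsp = sum(points for _, points in sample)
    return total_hours / total_tsp

# The three sample stories from Figure 3 as (ideal hours, TSP) pairs
sample = [(29, 1), (62, 2), (98, 3)]
size = calibration_size(sample)  # (29 + 62 + 98) / (1 + 2 + 3) = 31.5 hours per TSP
print(size)       # 31.5
print(3 * size)   # 94.5 ideal hours predicted for a 3-TSP story
```

Any team can run this same arithmetic each sprint with its own sample, so calibration stays cheap: three hour-level estimates instead of estimating the whole backlog in hours.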

Up to 3 sample stories should be selected from the sprint backlog for determining the Calibration Size. Your sample may contain 2 or 3 small stories, which reduces the risk of using only a single story for calibration, because a single story may be an outlier and skew all estimates. However, if you feel confident in choosing only a single story as your calibration point, you may do so.

If you have any other basis to calculate the Calibration Size, perhaps based on that team’s historical data for the actual effort needed for 3 sample stories, you may do so. You need to be careful that the historical data is truly representative of the team (exactly the same team members working under a very similar “weather pattern”).

Unlike SAFe’s 1NM, CNM does not force you to select a story with 1 IDD effort or any pre-determined quantity of effort. CNM simply lets the “chips fall wherever they may” and calibrates the size of one TSP for each team for every sprint, without making any assumptions.

**Step 4: Normalize the story points and enter NSP for each story in agile project management tool**

I now define an important ratio called the **Point Conversion Factor**, which is the ratio **Calibration Size / Normalization Basis**. The Point Conversion Factor is similar to the currency conversion factor in a multi-national company shown in Figure 2. The Point Conversion Factor allows you to convert team story points into equivalent normalized story points.

In our example, the Point Conversion Factor = 31.5/40 = 0.7875.

To convert the TSP for a story into NSP, multiply the TSP number by the Point Conversion Factor. So if the team has estimated a story to be 3 TSP, then this becomes the equivalent of 3 * 0.7875 = 2.3625 NSP.

The Point Conversion Factor also allows you to convert an NSP into equivalent TSP. To convert the NSP for a story into TSP, divide the NSP number by the Point Conversion Factor. Therefore in our example, 2 NSP = (2/0.7875) TSP ≈ 2.54 TSP.
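Steps 3 and 4 reduce to two one-line conversions. Here is a minimal Python sketch; the function names are mine, and I use the exact factor 31.5/40 = 0.7875 rather than a rounded value:

```python
NORMALIZATION_BASIS = 40.0   # ideal hours per NSP, fixed enterprise-wide (Step 1)
CALIBRATION_SIZE = 31.5      # ideal hours per TSP for this team (Step 3)

# Point Conversion Factor = Calibration Size / Normalization Basis
POINT_CONVERSION_FACTOR = CALIBRATION_SIZE / NORMALIZATION_BASIS  # 0.7875

def tsp_to_nsp(tsp):
    """Normalize this team's story points (multiply by the factor)."""
    return tsp * POINT_CONVERSION_FACTOR

def nsp_to_tsp(nsp):
    """Convert normalized story points back to this team's scale (divide)."""
    return nsp / POINT_CONVERSION_FACTOR

print(tsp_to_nsp(3))  # ≈ 2.3625 NSP
print(nsp_to_tsp(2))  # ≈ 2.54 TSP
```

Note that the two functions are inverses of each other, just as converting dollars to pounds and back should return the original amount in the currency analogy.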

In Part 5 of the blog series, I will provide an Excel-based downloadable template for doing all story point normalization math in a very quick and easy way.

Now that story points are normalized, we can enter and use these values in our agile project management tools (such as VersionOne) to ensure that all story point roll-ups, progress bars, math and reports are meaningful and correct across large-scale agile projects with several teams, programs and portfolios. (Hooray!)

I now describe various applications of CNM.

**Estimating team velocity **

For teams operating under yesterday’s weather model (see Part 2 of this blog series), a reasonable prediction of future velocity can be done by taking an average of the last 3-4 sprints. The velocity for a team could be expressed as TSP/sprint or NSP/sprint. However, especially for large-scale agile projects involving more than one team, it is important that we use the normalized velocity of each team so that the team velocities can be added together or rolled up correctly.

For teams not operating under yesterday’s weather model, it is harder to predict future velocity using past sprint results. In this case, CNM recommends that the teams calculate their **Agile Capacity** for future sprints in hours. Agile capacity for a team is the total hours available from all team members, not including time allocated to planning, meetings, vacations, emails, etc. Note the following relationships.

Estimated maximum normalized velocity = (Agile Capacity / Normalization Basis) NSP per sprint

Estimated maximum team velocity = (Agile Capacity / Calibration Size) TSP per sprint

For example, if the Agile Capacity of a team is 500 hours/sprint, the enterprise Normalization Basis is 40 ideal hours, and the Calibration Size is 20 ideal hours, then the estimated maximum normalized velocity = 500/40 = 12.5 NSP per sprint, and the estimated maximum team velocity = 500/20 = 25 TSP per sprint.
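The two velocity relationships above are straightforward to encode. A small sketch follows; the function name is my own, not from any framework or tool:

```python
def estimated_max_velocities(agile_capacity_hours, normalization_basis, calibration_size):
    """Return (max normalized velocity in NSP/sprint, max team velocity in TSP/sprint)
    for a team with the given Agile Capacity in hours per sprint."""
    max_nsp = agile_capacity_hours / normalization_basis
    max_tsp = agile_capacity_hours / calibration_size
    return max_nsp, max_tsp

# The example from the text: 500 h capacity, 40 h/NSP basis, 20 h/TSP calibration
max_nsp, max_tsp = estimated_max_velocities(500, 40, 20)
print(max_nsp, max_tsp)  # 12.5 25.0
```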

In Part 5 of the blog series, I will present an Agile Capacity calculation template as a worksheet inside the Excel-based template mentioned above for doing all story point normalization math. The template will make agile capacity and all story point normalization calculations very quick and easy. For any given team, the Calibration Size may change from sprint to sprint, but this has no bearing on the estimated maximum velocity expressed in NSP. Estimated maximum normalized velocity depends only on the Agile Capacity of a team and the Normalization Basis chosen for the enterprise.

**Story point normalization deals well with team velocity differences caused by calibration size differences**

Let us assume Team A and Team B have the same Agile Capacity of 400 hours, but Team A takes a sample of small stories for its calibration, and as a result its Calibration Size is 5 ideal hours; Team B’s sample of larger stories yields a Calibration Size of 20 ideal hours. Team A will have an estimated maximum team velocity = (Agile Capacity / Calibration Size) = 400/5 = 80 TSP, while Team B will have an estimated maximum team velocity = 400/20 = 20 TSP. However, both teams are estimated to have a maximum normalized velocity of (Agile Capacity / Normalization Basis) = 400/40 = 10 NSP. If Teams A and B are part of a program, rolling up their velocity numbers (whether estimated or measured) in TSPs to calculate program velocity would be meaningless; but it makes perfect sense to roll up their velocities in NSPs to get program velocity in NSPs.
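The Team A / Team B comparison can be checked numerically. The sketch below is my own illustration of why TSP roll-ups are meaningless while NSP roll-ups add correctly:

```python
NORMALIZATION_BASIS = 40  # ideal hours per NSP, enterprise-wide

teams = [
    {"name": "A", "capacity_hours": 400, "calibration_size": 5},
    {"name": "B", "capacity_hours": 400, "calibration_size": 20},
]

for team in teams:
    # Local velocities differ wildly because the calibration samples differed...
    team["max_tsp"] = team["capacity_hours"] / team["calibration_size"]  # 80 vs 20
    # ...but normalized velocities are identical, as they should be for equal capacity
    team["max_nsp"] = team["capacity_hours"] / NORMALIZATION_BASIS       # 10 each

# Adding TSPs (80 + 20 = 100) would be meaningless; adding NSPs is valid:
program_velocity_nsp = sum(team["max_nsp"] for team in teams)
print(program_velocity_nsp)  # 20.0 NSP, a meaningful program-level number
```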

**Roll-up of NSPs and velocity metrics**

For portfolios of programs, epic hierarchies, feature group hierarchies and goals, NSP numbers should be rolled up the hierarchy (not TSP numbers, as is typically done in many agile projects). An example is illustrated in Figure 4. All story point numbers shown in gray ovals in Figure 4 are estimated NSP numbers (i.e., planned normalized velocity numbers), and all story point numbers shown in green ovals are measured normalized velocity numbers. Therefore, their roll-up math (addition while moving up the hierarchy) is meaningful and correct.

Teams 1.1.1, 1.1.2 through 1.1.j making up Program 1.1 have estimated workloads of 13.4, 14.1 and 11.9 NSP at the beginning of a sprint. They have demonstrated measured velocities of 12.2, 13.8 and 12.0 NSP at the end of that sprint. Program 1.1 has an estimated workload of 61.7 NSP and a measured velocity of 52.8 NSP.

Each NSP number can be easily converted to the equivalent number of ideal hours by multiplying it by the Normalization Basis (40 ideal hours in our running example). If the enterprise has historical data for the loaded cost of one ideal hour of work or one NSP of work, it can easily calculate the estimated cost at the team, program, portfolio and enterprise levels. Similarly, NSP numbers can be added up for a release cycle by adding the numbers for all sprints of that release cycle.
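As a sketch, converting rolled-up NSP numbers into ideal hours and cost looks like this. The loaded cost figure below is a hypothetical assumption for illustration only, not a number from this post:

```python
NORMALIZATION_BASIS = 40       # ideal hours per NSP
COST_PER_IDEAL_HOUR = 75.0     # hypothetical loaded cost in dollars (assumption)

def nsp_to_ideal_hours(nsp):
    """Convert an NSP number to the equivalent ideal hours."""
    return nsp * NORMALIZATION_BASIS

def nsp_to_cost(nsp):
    """Estimate cost from an NSP number, given a loaded hourly rate."""
    return nsp_to_ideal_hours(nsp) * COST_PER_IDEAL_HOUR

# Program 1.1's estimated workload of 61.7 NSP from Figure 4:
print(nsp_to_ideal_hours(61.7))  # ≈ 2468 ideal hours
print(nsp_to_cost(61.7))         # ≈ $185,100 at the assumed rate
```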

Note that the estimated workload and velocity numbers are rolled up from bottom to top by the agile project management tool (such as VersionOne). This requires the lowest team-level workload estimates and velocities in NSP for each team and every sprint. This can easily be done if each team plans its sprint before the actual sprint work starts, and during its sprint planning estimates its stories in TSP and converts the TSP numbers into NSP numbers by following Steps 2 through 4 of CNM described above.

The approach illustrated in Figure 4 for bottom-up roll-up of estimated workload and measured velocities (all expressed in NSP) can be adapted for use by enterprises with many independent projects. This aggregation is meaningful only if the story points for stories at the bottom-most team levels are entered in NSP. These independent projects are likely to have different sprint cadences, and their sprints are most likely not synchronized. It is still possible to report a “Velocity by Date” metric or a “Weekly Throughput” of accepted stories expressed in NSP. For example, an enterprise may report a throughput of 500 NSP per week, which is equivalent to (500 x 40) = 20,000 ideal hours of accepted work. Metrics represented in ideal time units (hours, staff-days, staff-weeks, etc.) are much more meaningful to management than metrics represented in story points.

Needless to say, CNM can be used by small agile projects too. Even a single-team project may find story point normalization useful if one or more requirements of yesterday’s weather model (see Part 2 of this blog series) do not hold for any sprint of that team.

Figure 4: Roll-up of normalized story points from team-level to program-level to portfolio-level to enterprise level

**Why not just estimate each story in ideal hours instead of story points?**

I now answer this legitimate and important question by giving the following reasons:

- Estimating a story in ideal hours requires substantial effort as the story needs to be broken down into its tasks and tests to be estimated in ideal hours. This is a lot more effort compared to relative size estimation of stories.
- Stories carry value, benefits and meaning for customers, while the tasks and tests inside stories do not. Relative sizes of stories remain stable, while estimates in ideal hours may change over time.
- CNM does not require you to estimate each story in a backlog in ideal hours; only up to 3 stories in a sample are estimated in ideal hours to calibrate the size of one TSP. CNM also does not require you to track the actual effort in hours.
- Relative size estimation is a team effort; the discussion among team members for estimating relative sizes increases the collective and shared understanding of each story among all team members. This benefit of the conversation is often of greater value than the result of arriving at a specific story point number for each story.
- NSP numbers can be immediately converted to ideal hours by multiplying a NSP number with Normalization Basis. There is no reason to estimate each story in ideal hours by breaking it into its tasks and tests, and then estimating each task and test in ideal hours (this is a lot of effort as stated above).
- Like SAFe’s 1NM, CNM too is a hybrid method. Both methods establish equivalence between NSPs and ideal hours of work. In SAFe’s 1NM, 1 NSP = 1 IDD = 8 ideal hours. In CNM, 1 NSP = Normalization Basis number of ideal hours.

In Part 5 of the blog series, I will explain how CNM performs *top-down* estimation (from portfolios to programs down to teams). I will also compare and relate SAFe’s 1NM (covered in Part 3) with CNM (covered in Parts 4 and 5). I will provide a downloadable template for calculating Agile Capacity, estimating team workload for bottom-up estimation, and estimating portfolio and program workloads for top-down estimation. The template makes these calculations quick and easy, and helps avoid human errors.

**Acknowledgements:** I have greatly benefited from discussions and review comments on this blog series from my colleagues at VersionOne, especially Dave Gunther, Andy Powell and Lee Cunningham.

**Your feedback:** I would love to hear from the readers of this blog either here or by e-mail (Satish.Thatte@VersionOne.com) or hit me on twitter (@smthatte).

**Part 1:** Introduction and Overview of the Blog Series – published on 14 October 2013.

**Part 2:** Estimation Challenges Galore! – published on 4 November 2013.

**Part 3:** Review of published scalable agile estimation methods – published on 7 November 2013.

**Part 5:** Calibrated Normalization Method for Top-Down Estimation – published on 2 December 2013.

## Join the Discussion

## Stephen

Satish,

I’ve been reviewing your posts for the last couple of days and find them quite helpful. I have a couple of clarifying questions on step 2 of part 4. 1) Is it correct to assume that the hours estimated for the 3 stories (i.e., 29, 62 and 98) are actual estimated hours (60-minute intervals)? In other words, the team in this scenario has stated that it will take 29 hours to complete story A? 2) In this scenario, what does a TSP of 1 for story A, TSP of 2 for story B, etc. mean? How are these numbers determined for each team?

Do you have additional material on your CNM approach?

## Satish Thatte

Hi Stephen,

The hours estimated for three sample stories A, B and C are in ideal hours, and are estimated by the team. It is correct that the team has estimated Story A to take 29 hours, Story B to take 62 hours, and Story C to take 98 hours, in this example. The hours estimated need not be in the 60-minute interval (in this example, that happened to be the case).

Story A is of Team Story Point 1 size (TSP of 1), Story B is of Team Story Point 2 size, and Story C is Team Story Point 3 size. That means that the team has determined that Story B will take twice as much effort to implement as that of Story A, and Story C will take three times as much effort to implement as that of Story A. So the relative sizes of Story A, B and C are 1, 2 and 3. These relative sizes are determined by the team by using the popular technique of Planning Poker, or some other method of its choice. Hope this answers your questions.

I will email you a technical report based on this 5-part blog series. If you still have questions, feel free to contact me by email.

Regards,

Satish Thatte

Agile/Lean Coach and Product Consultant, VersionOne

## Om

Hi Satish,

I would like to understand why you need to take samples to arrive at a Calibration Size. Why not use the 40 ideal hours directly to normalize the story points?

## Satish Thatte

Hi Om,

If a team or a project has a capacity of “N” hours, the question we are trying to answer is “how many story points of work can be completed in that capacity?”

One way to answer the question is to estimate all stories in story points and ideal hours, and see how many of those stories will fit in the capacity of N. Then simply add up the story points of all those stories to arrive at the answer. However, this is a lot of effort as you will need to estimate all stories in story points and ideal hours.

However, if you take a statistical sample of 3 to 5 small stories (typically of story points 1 and 2) and calibrate the average size of a 1-story-point story in ideal hours, that gives you the calibrated story point size. For example, if you take a sample of three stories, say A, B and C of story points 1, 2 and 1, respectively, estimate them in ideal hours, and find that these stories will take 27, 60 and 19 hours, respectively, then one calibrated story point = (27+60+19)/(1+2+1) = 26.5 hours. If the capacity N = 600 hours, you can quickly estimate that the team or project should plan on completing a maximum of (600/26.5) ≈ 23 story points of work. This estimate of 23 story points requires working only with a statistical sample of 3 to 5 small stories; estimating the whole backlog is not required (saving time and effort).

If the Normalization basis is 40 hours, in this example, 23 story points will translate into 23*(26.5/40) = 15 normalized story points. I hope this explanation helps in answering your question.

Regards,

Satish Thatte

## Per-Magnus

Hi there,

Your logic is fairly simple, but I don’t find this very helpful, as I find the whole point of estimation to be about establishing an immovable reference base, so that we can see whether we achieve productivity improvements (or not). E.g., if we invest 1M dollars in automation software and coaches – what impact on output did this have?

To me, estimation of hours can be done in many ways, including your way, but is also missing the main point – tracking productivity progress.

/PM

## Satish Thatte

Hi Per-Magnus:

The whole point of agile estimation is to figure out how much work to undertake in a sprint or a release cycle of a project. Large-scale agile projects present two major challenges in estimation:

1. There are multiple teams often organized into programs and portfolios. Individual teams estimate work often in story points, but the scales followed by different teams are different. Without normalizing those story points, you cannot do any meaningful math on those story points (such as addition, roll-up, progress bars, reports or analytics).

2. When a large-scale project is estimated, at best you have information about some high-level epics, but not details of stories or their story points.

In my 4-part blog series, I have addressed both challenges in great detail and have provided a solution based on fully-distributed normalization method, and have compared my method with other methods (Centralized and Semi-Distributed methods of normalization).

Traditional velocity-based estimation requires a “stable weather pattern”, i.e., identical or very similar conditions need to continue over several sprints to project the velocity of a future sprint for a team based on the average velocity of the last 3 to 4 sprints. In large-scale projects with several teams (often several dozen, and sometimes even a few hundred), this stable weather pattern is extremely unlikely to hold across all teams; there will always be changes going on in one team or another (team composition change, domain change, technology change, etc.).

The normalization method I have presented is impervious to these changes. A team simply calibrates the size of its story points for each sprint (if needed) and calculates its Point Conversion Factor (all done by the normalization math embedded in the Excel template). This method is robust. If a team or a program or a portfolio is improving its productivity (for example, by investing $1M in automation software and coaching, per your example), you will see its normalized velocity improving over a period of time, whether or not you can establish an immovable reference point (which is the same as a “stable weather pattern”).

If you want to track productivity progress, the normalization method described in my blog series will do that extremely well and robustly, whether you have a stable weather pattern (immovable reference base) or not. The normalization process takes care of differences between teams, changes in their weather patterns, etc.

Satish Thatte

## Florian

Satish is right – even though the smaller chunks of work that are the foundation of the agile approach do make estimating easier, getting estimates right can still pose problems. Calibration is one of the factors that can help fine-tune your estimates. This technique can be enabled relatively easily by a flexible Story Map, which can be grouped by Story Points. Have a look at:

## Satish Thatte

Hi Florian,

Thanks for your feedback. In your comment, the web link for the technique you would like us to take a look at is missing, so I have added the web link here:

http://bauer-information-technology.com/story-map_agile-estimation/

Regards,

Satish Thatte