There are four steps to calculating the race scores:
Finding similar races
Calculating the expected score for each runner
Grouping data – selecting runners
Asymmetric regression on selected runners
These four steps are followed for each race to determine the overall score.
Step 1 - Finding similar races
Of all the past races in the database, which could provide insights into this specific race?
We select similar races based on two criteria:
- Time: races that are less than 50 months (4 years 2 months) old.
- Effort: distance and elevation are converted into a basic effort value. Shorter races are included only if their effort is greater than 65% of the effort of this race. Longer races have no limit, so all of them are included.
These selected races form the basis for the next steps.
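The two criteria can be sketched as a simple filter. This is a minimal illustration, not the actual implementation: the `Race` fields, the `basic_effort` helper and the effort formula (distance in km plus elevation gain in metres divided by 100) are assumptions, because the text does not say how distance and elevation are combined. Only the 50-month and 65% thresholds come from the text.

```python
from dataclasses import dataclass

MAX_AGE_MONTHS = 50      # from the text: races less than 50 months old
MIN_EFFORT_RATIO = 0.65  # from the text: shorter races must exceed 65% of the effort


@dataclass
class Race:
    name: str
    distance_km: float
    elevation_gain_m: float
    age_months: float  # months between this past race and the race being scored


def basic_effort(race: Race) -> float:
    # ASSUMPTION: 100 m of climb counts as 1 km of flat distance.
    return race.distance_km + race.elevation_gain_m / 100.0


def is_similar(past: Race, target: Race) -> bool:
    recent = past.age_months < MAX_AGE_MONTHS
    # Longer races always pass this test; shorter ones need more than 65% of the effort.
    enough_effort = basic_effort(past) > MIN_EFFORT_RATIO * basic_effort(target)
    return recent and enough_effort


target = Race("target", 55.0, 3500.0, 0.0)          # effort 90
past_races = [
    Race("short and recent", 30.0, 1500.0, 12.0),   # effort 45, below 65%: rejected
    Race("medium and recent", 50.0, 2500.0, 20.0),  # effort 75, above 65%: kept
    Race("long but old", 100.0, 6000.0, 60.0),      # older than 50 months: rejected
    Race("long and recent", 100.0, 6000.0, 6.0),    # longer race: kept
]
similar = [r.name for r in past_races if is_similar(r, target)]
```

A single effort comparison covers both cases, since any race with more effort than the target automatically exceeds 65% of it.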
Step 2 - Calculating the expected score for each runner
We calculate the predicted performance of each runner in this race based on their past results*.
Or, to put it another way: “Based on their relevant past results, what score do we think each runner would achieve in this race on this day?”
This prediction aims to be both optimistic and realistic.
For each runner individually:
- Take all results that are relevant in both time and effort.
- Give each past result a weighting based on the similarity of its race profile and its recency relative to the race in question.
- Ignore bad performances. Only the best-rated past performances, up to five, are selected.
- Use these to calculate the expected performance of the runner, with newer and more similar races given higher weighting.
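As a rough sketch of the calculation described above, the expected score can be written as a weighted average of a runner's best results. The weighting function here is purely an assumption (similarity multiplied by a linear recency decay over 50 months), as are the `PastResult` fields and the choice to rank "best rated" by raw score. The text only states that newer and more similar races count for more and that up to five results are used.

```python
from dataclasses import dataclass

TOP_N = 5  # from the text: up to the five best-rated past performances


@dataclass
class PastResult:
    score: float
    similarity: float  # hypothetical 0..1 measure of race-profile similarity
    age_months: float


def weight(result: PastResult) -> float:
    # ASSUMPTION: weight = similarity * linear recency decay over 50 months.
    recency = max(0.0, 1.0 - result.age_months / 50.0)
    return result.similarity * recency


def expected_score(results: list[PastResult]) -> float:
    # Keeping only the highest scores is what discards the bad performances.
    best = sorted(results, key=lambda r: r.score, reverse=True)[:TOP_N]
    total_weight = sum(weight(r) for r in best)
    if total_weight == 0:
        raise ValueError("no usable results")
    return sum(r.score * weight(r) for r in best) / total_weight


history = [
    PastResult(700, 1.0, 0),   # weight 1.0
    PastResult(680, 1.0, 25),  # weight 0.5 (older)
    PastResult(660, 0.5, 0),   # weight 0.5 (less similar)
    PastResult(650, 1.0, 0),   # weight 1.0
    PastResult(640, 1.0, 0),   # weight 1.0
    PastResult(400, 1.0, 0),   # sixth best: ignored
]
runner_expected = expected_score(history)  # 665.0
```

Because the 400-point result falls outside the top five, it has no effect on the prediction, which is what keeps the estimate optimistic.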
For each runner we now have an expected score, but not all of these are built with data of the same reliability. So, in addition, each runner is given an overall stability (confidence) score based on the results selected above:
- Experience – how many races are used
- Relevance – how the past races compare in profile
- Variability – how the runner's scores normally vary
- Recency – how recent the selected scores are
Step 3 - Selecting runners
We now have an expected score and a stability score for each runner.
We only want to use the runners with the best data as a representation of the whole race, not all runners.
Firstly, we remove the slowest runners from the sample. We define these as the runners who finished in more than double the time of the fastest runner.
Secondly, we take the runners with the highest stability scores from the previous step. This group is made up of approximately 80% top-ranked runners and 20% regular runners.
This gives us a reduced group of runners, each with an expected score and a stability score (from the previous step).
A small amount of high-quality data is more reliable than a large amount of low-quality data.
For example, OCC 2025:
- 1471 finishers.
- 694 runners with Expected Score.
- 107 runners for final analysis.
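The two filtering rules in this step can be sketched as follows. The `Runner` fields and the fixed `sample_size` parameter are assumptions made for illustration. The text does not say how the size of the final group is chosen, and the 80/20 mix of top-ranked and regular runners is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    name: str
    finish_time_h: float
    expected_score: float
    stability: float


def select_runners(runners: list[Runner], sample_size: int) -> list[Runner]:
    fastest = min(r.finish_time_h for r in runners)
    # From the text: drop anyone who took more than double the fastest time.
    in_range = [r for r in runners if r.finish_time_h <= 2 * fastest]
    # Keep the runners with the highest stability scores.
    # ASSUMPTION: a fixed sample size; the real cut-off rule is not given.
    return sorted(in_range, key=lambda r: r.stability, reverse=True)[:sample_size]


field = [
    Runner("A", 6.0, 850, 0.90),
    Runner("B", 7.0, 700, 0.40),
    Runner("C", 13.0, 450, 0.95),  # more than 2 x 6 h: removed despite high stability
    Runner("D", 8.0, 620, 0.70),
]
sample = [r.name for r in select_runners(field, sample_size=2)]
```

In this toy field, runner C is excluded by the time rule before stability is even considered, so the sample is A and D.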
Step 4 - Regression on selected runners
Analysing this specific race.
We now have three pieces of information for each runner in the sample:
- Expected score
- Stability
- Speed (km/h)
This is depicted graphically as a plot of the data points: the horizontal axis shows speed, the vertical axis shows expected score, and colour shows stability (dark = low, yellow = high).
Regression: in simple terms, fitting a straight line to the data points. In reality, it gets a bit more complex. Each runner in the calculation carries their own stability weighting, so has a greater or lesser impact on the position of the line.
This shape of the data, running from top right to bottom left, is seen in all races: runners who are predicted to have higher scores have a faster race speed.
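A stability-weighted straight-line fit can be illustrated with ordinary weighted least squares. This is a generic textbook technique used here as a stand-in, not necessarily the exact method behind the scores. The function name and the sample numbers are hypothetical.

```python
def weighted_line_fit(speeds, scores, weights):
    """Return (slope, intercept) minimising sum(w * (score - slope*speed - intercept)**2)."""
    total_w = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, speeds)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, scores)) / total_w
    # Weighted variance of speed and weighted covariance of speed and score.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, speeds))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, speeds, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data lying exactly on the line: score = 100 * speed - 100.
speeds = [6.0, 8.0, 10.0]         # km/h
expected = [500.0, 700.0, 900.0]  # expected scores
stability = [0.2, 0.9, 0.5]       # weights: higher means more influence
slope, intercept = weighted_line_fit(speeds, expected, stability)
```

With real data the points do not sit exactly on a line, and a runner with a high stability weight pulls the fitted line towards their point more strongly than a runner with a low one.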
Asymmetric regression
To go even further, we use ‘asymmetric regression’, which limits the upper progression to realistic boundaries using these factors:
- Average steepness of the terrain
- The competition level
- The average altitude
The formula of this line is then used to convert each runner's speed in km/h into the race score. Because all the runners took part in the same race, their speeds are directly comparable with one another, which allows us to convert speed directly into the final score.
For example, if we trace the expected score of 900 to where it intersects the regression line, the speed at that point represents an actual score of 900. In the same way, if we trace the expected score of 500, the corresponding speed gives a final score of 500.
All the runners receive a score based on the race speed using this method.
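The conversion can be illustrated with a plain straight line. The slope and intercept below are made-up values chosen for easy arithmetic, and the asymmetric adjustment is left out, so this only shows the idea of reading a score off the line.

```python
# ASSUMPTION: a hypothetical fitted line, score = SLOPE * speed + INTERCEPT.
SLOPE = 100.0       # score points per km/h (illustrative value)
INTERCEPT = -100.0  # illustrative value


def score_from_speed(speed_kmh: float) -> float:
    """Read the final score off the line for a given race speed."""
    return SLOPE * speed_kmh + INTERCEPT


def speed_for_score(score: float) -> float:
    """Trace a score across to the line and return the matching speed."""
    return (score - INTERCEPT) / SLOPE


speed_900 = speed_for_score(900)             # 10.0 km/h
speed_500 = speed_for_score(500)             # 6.0 km/h
final_score = score_from_speed(speed_900)    # 900.0
```

On this example line, a runner who averaged 10 km/h receives 900 points and a runner who averaged 6 km/h receives 500 points, whatever their expected scores were.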

We can identify the runners who have performed:
- faster than expected
- around as expected
- slower than expected
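These three groups can be separated by comparing each runner's final score with their expected score. The 20-point tolerance used here is a hypothetical threshold chosen for illustration. The text does not define how close counts as "around as expected".

```python
def classify(final_score: float, expected_score: float, tolerance: float = 20.0) -> str:
    # ASSUMPTION: a symmetric tolerance band around the expected score.
    diff = final_score - expected_score
    if diff > tolerance:
        return "faster than expected"
    if diff < -tolerance:
        return "slower than expected"
    return "around as expected"


labels = [
    classify(720, 650),  # 70 points above expectation
    classify(655, 650),  # within the tolerance band
    classify(600, 650),  # 50 points below expectation
]
```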
