How much can you expect a search rank in Google to change? Is dropping by five places worth panicking about? What if it happened on the fifth page, or should you only worry if it were on the first? Being able to make decisions based on this kind of information is important in managing workflow within most Search Engine Optimisation (SEO) projects.

It is easy to assume that the range of changes you observe will always grow the further a position is from the first spot in Google, especially as the further a page sits from the first spot, the greater the rate at which it can climb towards the top.

Ranking data is available and easy to get through a number of services and tools, with SEOmoz, Raven Tools, Advanced Web Ranking (AWR) and Google Webmaster Tools able to manage and automate much of the process.

**Plotting Search Engine Result Page Rank Changes**

Looking beyond just the week-to-week changes requires a little more work. And data. Most good tools provide historic data for keywords that are currently being tracked, and this is the data used for this post. The ranking data was collected over a period of time from multiple sites and across multiple keywords in Google using SEOmoz.

This data was used to create the sample seen in Figure 1, plotting each observed position on Google’s SERPs against how much it had changed by the following week’s rank. In the scatter plot, Change is on the *y* axis, while Current Rank is on the *x* axis.

| Google.AU Current (x, Week 1) | Week 2 | Change (y) |
| --- | --- | --- |
| 2 | 5 | 3 |

*Chart 1*

The points in Figure 1 represent Google.AU Current, and Change, a number derived from the difference between the current rank and the next observed rank for that keyword. Consequently, when Change is a positive number, it represents a movement away from the top position, while a negative number represents movement towards it.
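As a minimal sketch of that derivation, Change can be computed as the next observed rank minus the current rank. The function name and the keyword rows below are hypothetical; only the 2 to 5 row comes from the example table.

```python
def rank_change(current_rank, next_rank):
    """Change = next observed rank minus current rank.

    Positive: the keyword moved away from position one.
    Negative: the keyword moved towards position one.
    """
    return next_rank - current_rank


# Hypothetical weekly observations: (keyword, week 1 rank, week 2 rank).
observations = [
    ("example keyword a", 2, 5),   # the row from the example table
    ("example keyword b", 14, 9),
    ("example keyword c", 1, 1),
]

changes = [rank_change(w1, w2) for _, w1, w2 in observations]
print(changes)  # [3, -5, 0]
```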

As SEOmoz collects ranking data for tracked keywords appearing in the first 50 positions in Google, Bing and Yahoo, the only changes in ranking tracked have to fall within this range (however there are other tools that provide data beyond the first 50). This limit affects the data, skewing the range of change lower than what is probably seen within the population, especially as the rank gets closer to 50 and to 1. Due to these constraints, the data used in this post is a truncated distribution, restricted to observing only changes between 1 and 50.
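To make the truncation concrete, the sketch below lists the smallest and largest Change that can be recorded at a given rank when tracking stops at position 50. The function and constant names are illustrative assumptions, not part of any tool's API.

```python
MAX_TRACKED_RANK = 50  # tracking limit described in the post


def observable_change_bounds(current_rank, max_rank=MAX_TRACKED_RANK):
    """Return (lowest, highest) Change that can be recorded at a rank."""
    lowest = 1 - current_rank          # climbing all the way to position 1
    highest = max_rank - current_rank  # dropping all the way to max_rank
    return lowest, highest


for rank in (1, 25, 50):
    print(rank, observable_change_bounds(rank))
# 1 (0, 49)
# 25 (-24, 25)
# 50 (-49, 0)
```

At position 1 only drops can be seen, and at position 50 only climbs, which is why the observed range narrows at both ends.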

**Range of Change per Position**

The standard deviation for each rank position, seen in Figure 2, is fairly inconsistent, which is not entirely unexpected given the limited nature of the data. The sample does not seem large enough to provide enough observations at each rank.
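A per-position standard deviation can be sketched as follows. Returning the count alongside it shows how thin the data is at each rank. The `(current rank, change)` pairs are made up for illustration and are not the post's data.

```python
from collections import defaultdict
from statistics import stdev


def stdev_by_rank(observations):
    """Map each current rank to (count, sample standard deviation of Change).

    Ranks with fewer than two observations get None, since a sample
    standard deviation is undefined for them.
    """
    grouped = defaultdict(list)
    for current_rank, change in observations:
        grouped[current_rank].append(change)
    return {
        rank: (len(changes), stdev(changes) if len(changes) > 1 else None)
        for rank, changes in sorted(grouped.items())
    }


# Made-up (current rank, change) pairs, for illustration only.
sample = [(1, 0), (1, 2), (1, 1), (3, -2), (3, 4), (17, 6)]
per_rank = stdev_by_rank(sample)
for rank, (count, spread) in per_rank.items():
    print(rank, count, spread)
```

A rank with a single observation, like 17 here, has no usable spread at all, which is the problem a small sample runs into.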

**Four Bins**

| Q | Range | Cases | Cumulative | Change Mean | Change Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| 1 | 1 | 25.2% | 25.2% | 0.34 | 3.08 |
| 2 | [2, 5] | 31.8% | 57.0% | 0.38 | 3.11 |
| 3 | [5, 13] | 18.1% | 75.1% | 0.38 | 3.81 |
| 4 | [13, 50] | 24.9% | 100.0% | -0.52 | 4.84 |

*Chart 2*

The sample is heavily weighted towards the first ten positions, with very little data available for any rank beyond the second page of Google (Figure 3), giving an inter-quartile range of just 2 to 13 from 1 to 50. The first two quartiles range from first to fifth position, and the third only just reaches the second page of Google’s search results.
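A rough sketch of how quartile bins like those in Chart 2 could be built, assuming the bins are cut on Current Rank. The post does not state the exact binning method, so the use of `statistics.quantiles` with the inclusive method is an assumption, and the data is invented.

```python
from statistics import mean, quantiles, stdev


def summarise_by_quartile(observations):
    """Group (current rank, change) pairs into four bins by rank quartile.

    Returns a list of (bin number, count, mean change, stdev of change).
    """
    ranks = [rank for rank, _ in observations]
    cuts = quantiles(ranks, n=4, method="inclusive")  # Q1, median, Q3
    bins = {1: [], 2: [], 3: [], 4: []}
    for rank, change in observations:
        # Count how many cut points the rank exceeds to pick its bin.
        q = 1 + sum(rank > cut for cut in cuts)
        bins[q].append(change)
    return [
        (q, len(c), mean(c), stdev(c) if len(c) > 1 else None)
        for q, c in bins.items()
        if c
    ]


# Invented (current rank, change) pairs, for illustration only.
data = [(1, 0), (2, 2), (3, -1), (4, 1), (5, 3), (6, -3), (7, 5), (8, -5)]
summary = summarise_by_quartile(data)
for row in summary:
    print(row)
```

When many observations share the same rank, as with position 1 in this sample, cut points can coincide and the bins will not hold equal shares of the cases.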

Even within these limitations, it can easily be shown that there is a difference in expected movement as a site’s position falls further down the rankings. Figure 4 displays the distribution of individual observations, standard deviation of change per quartile with the black error bars, and the mean of change with a 95% confidence interval in red.
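The two kinds of error bar can be sketched as below. The post does not say how its 95% confidence interval was calculated, so the normal approximation (mean ± 1.96 standard errors) is an assumption, as are the function name and the sample values.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_ci(changes, z=1.96):
    """Return (mean, stdev, ci_low, ci_high) for a list of Change values.

    z=1.96 gives an approximate 95% interval under a normal approximation.
    """
    m = mean(changes)
    s = stdev(changes)
    half_width = z * s / sqrt(len(changes))
    return m, s, m - half_width, m + half_width


# Invented Change values for one bin, for illustration only.
bin_changes = [-2, 0, 2, 4, -4, 0]
m, s, ci_low, ci_high = mean_with_ci(bin_changes)
print(m, s, ci_low, ci_high)
```

The standard deviation describes how far individual changes spread, while the confidence interval describes how well the mean is pinned down, so the interval shrinks as a bin gains observations even if the spread stays the same.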

Using quartiles produces a series of standard deviations of change closer to what you would expect: a greater range of observed changes in rank as you move away from position one. While the data supports the hypothesis, the range of positions covered by the last 25% is too large relative to the sample population to be meaningful.

***k*-means Clustering**

Another approach taken with the data was *k*-means clustering, with *k* set to four clusters. More than four clusters failed to break up the one to six range, accounting for about 63% of observations, and reduced the number of observations in the other groupings below a useful level. Even at four clusters, the groups outside the one to six range never accounted for more than 17% of the sample.

| k | Range | Skewness | Cases | Change Mean | Change Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| 1 | [19, 34] | -1.412 | 11.912% | 0.204 | 4.335 |
| 2 | [33, 50] | -4.012 | 8.915% | -1.488 | 5.856 |
| 3 | [7, 18] | 1.542 | 16.120% | 0.004 | 3.046 |
| 4 | [1, 6] | 10.147 | 63.053% | 0.408 | 3.337 |

*Chart 3*
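For readers unfamiliar with the technique, here is a minimal sketch of *k*-means (Lloyd's algorithm) applied to rank alone. This is an assumption made for simplicity: the overlapping ranges in the table suggest the original clustering may have used more than one variable, and the post does not say which tool or settings were used. The ranks are invented.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's algorithm on a list of numbers.

    Initial centroids are taken from evenly spaced positions in the
    sorted data, so the result is deterministic.
    Returns (centroids, labels).
    """
    ordered = sorted(values)
    n = len(ordered)
    centroids = [float(ordered[int((i + 0.5) * n / k)]) for i in range(k)]
    labels = [0] * n
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels


# Invented ranks, for illustration only.
ranks = [1, 1, 2, 3, 10, 11, 12, 30, 31, 48, 49, 50]
centroids, labels = kmeans_1d(ranks, k=4)
print(centroids)
print(labels)
```

Cluster numbers are arbitrary labels. Here they happen to come out in rank order, but in general there is no guarantee that cluster 1 covers the lowest positions.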

Looking at the skewness across each of the clusters seems to suggest that the further down from one a position goes, the more it skews towards going up. Unfortunately for clusters 1 and 2, this is a deceptive number. As there are no observations outside of the top 50, the closer you get to 50, the fewer drops in rank will be observed, rendering the data biased towards increases in rank.

Unfortunately this was inevitable. As seen in the quartile ranges, the positions between one and five accounted for at least 57% of all observations. This distribution of data is an artefact of how the sample was created, with the keywords selected by non-random means.

The skew is also certain to be a product of the limitations of the data collected, as the only observations included are changes in position occurring between 1 and 50. Unsurprisingly, the same tendency towards a greater range of change the further away from the top of Google can be seen within the *k*-means clusters.

Much like Figure 4, Figure 6 includes black error bars representing one standard deviation from the mean, and red error bars for a 95% confidence interval of the mean for each cluster. The clusters are not in order of the positions in Google they represent:

| k | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| Range | [19, 34] | [33, 50] | [7, 18] | [1, 6] |

*Chart 4*

The data in Chart 4 revealed that the range of change increased from cluster 1 to cluster 2. Both groups fall within the last 25% of all observations, the final group in Chart 2. *k*-means clustering can also highlight outlier populations within a data set.

Partitioning the sample data into six clusters highlighted one group of observations within the first ten positions. This group showed a significantly higher than average change in rank compared to other values in this range. This group is also reflected in the skewness of cluster 4 in Chart 3.

**Making Sense of the Data**

There are a number of issues with the sample used for this blog post. These limitations mean that the data presented here is not a good selection of the query spaces in which the sites used exist. A few of the problems include:

- Only 1400 records were used
- Massive convenience sampling issues such as:
  - Keywords are selected by inconsistent, non-random criteria
- SEOmoz data has no visibility past 50, which limits ability to observe changes involving any rank beyond that point
- No differentiation between keywords such as taxonomy or competitiveness
- No allowance for known algorithm changes

Convenience sampling is a significant issue with the data selected. Tracking terms selected for campaign and client management is certainly best practice from an SEO perspective. However, data collected this way will create a false impression of how search engines behave in a broader sense, and only provide insight into one of the search environments as defined by the objectives of those involved. It is almost certain that this will focus the sample on vanity and short/head terms, with little tracking of long tail queries.

The data SEOmoz collects is a truncated distribution with no visibility on behaviour past 50. In practical terms, this means that the highest change it is possible to observe in this set is either 49 or -49. Terms dropping down to below 50 are not included in the data set, nor are terms coming up from below this rank.

Even within these limitations, the data did demonstrate an increase in the average rate of change, either up or down, the further a position sits from the top. Unfortunately, the sample was not large enough, nor did it cover enough of a range, to provide any heuristics for most of the positions observed.