Note: I know and have worked briefly with Nick Drewe. He is rather good at what he does and it usually involves numbers, the Internet and spreadsheets.

The Warmest 100

Spoiler Alert: The Warmest 100

Every year Australia’s major youth-focused radio station holds a poll to pick the 100 most popular tracks of the last twelve months. This poll is called the Hottest 100; you may have heard of it. A reasonable proportion of the voting happens online, and unsurprisingly, the site gives voters the option of sharing their responses online. A few Brisbane locals, @nickdrewe (http://nickdrewe.com), @jacktmurphy (http://jacktmurphy.com) and @andythelander (http://thelanded.com), built a site to track this activity on Twitter to show which song might end up on top. Their site is the Warmest 100, at http://warmest100.com.au.

It was interesting watching all the commentary surrounding this site, ranging from discussion of its impact on betting for the winners to the creators being accused of ruining all the things and generally being ‘UnAustralian’. Some journalists even described deriving insights from a set of just over 35,000 data points as using ‘Big Data’. The dominant narrative, though, focused on the role social media had in providing information and its potential use in predicting trends. At least until the countdown started; then the conversation turned to where the Warmest 100 was wrong as much as to where it was right.

When is Enough Data Enough?

Late on Sunday night, just two hours before the close of voting, Drewe conducted his last collection of data, pushing his sample up to a massive 35,081 votes – or roughly 2.7% of the anticipated final vote – further refining his countdown’s final tally.

Hottest 100 cracked as Triple J outsmarted

A 2.7% sample does not sound like a lot, until you consider that Gallup frequently uses far smaller samples in their polls, such as their survey on attitudes towards gun laws with just over 1,000 respondents. Gallup goes to great lengths to ensure that their sample is random. Their methodology is interesting: respondents are chosen at random by phone number so that all groups within the population are represented, and the responses are weighted appropriately. While Gallup polls are subject to some biases (they won’t ever represent groups without a phone, for example), they are acknowledged as generally being representative of the population.
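
For a sense of scale, the margin of error for a proportion shrinks quickly with sample size. The short sketch below is a quick illustration, assuming simple random sampling and a 95% confidence level (which, as argued below, is exactly what the Twitter data does not give you); the point is that randomness, not size, is the hard part.

```python
import math

# Margin of error for a proportion at 95% confidence under simple random sampling.
def margin_of_error(n, p=0.5, z=1.96):
    return z * math.sqrt(p * (1 - p) / n)

print(f"n = 1,000:  +/- {margin_of_error(1000):.1%}")   # roughly +/- 3.1%
print(f"n = 35,081: +/- {margin_of_error(35081):.1%}")  # roughly +/- 0.5%
```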

It is hard to say the same for people choosing to post their vote to Twitter. The Warmest 100 was not using a random sample. Out of the population of people who voted in the Hottest 100, it was by design selecting those who had a Twitter account and decided to share their vote. This biases the information towards one group who share a behaviour that may not be present evenly across the whole population.

The data scraped from Twitter is a convenience sample with a strong self-selection bias. These biases do not automatically make the data worthless, though. If adoption of Twitter and its use in this way is uniform enough among the population, the sample could still be representative.

How Right Should it Have Been?

Percentage of vote in sample to rank

The percentage of the vote each rank received in the sample

When the Hottest 100 countdown began, it quickly emerged that not every result matched the Warmest 100’s list. This isn’t surprising. Looking at the distribution of the votes for the top 200 responses captured by the Warmest 100 shows that while the top three tracks were well ahead of the rest, the remaining results tend to group together. The distance between each point in the scatter plot shrinks as you move further away from number one, and outside the top fifty there is no reason to assume that a track’s rank in the sample would match its rank in the population.

Track | Hottest 100 | Warmest 100 | Vote % | Plus 2 SE | Minus 2 SE | Difference
Macklemore & Ryan Lewis – Thrift Shop (Ft. Wanz) | 1 | 1 | 2.60% | 2.80% | 2.40% | 0
Of Monsters And Men – Little Talks | 2 | 2 | 2.46% | 2.66% | 2.26% | 0
Alt-J – Breezeblocks | 3 | 3 | 2.24% | 2.42% | 2.05% | 0
Flume – Holdin On | 4 | 6 | 1.68% | 1.85% | 1.52% | 2
Mumford & Sons – I Will Wait | 5 | 7 | 1.67% | 1.84% | 1.51% | 2
Major Lazer – Get Free (Ft. Amber Coffman) | 6 | 8 | 1.62% | 1.78% | 1.46% | 2
Tame Impala – Elephant | 7 | 5 | 1.70% | 1.87% | 1.54% | -2
Frank Ocean – Lost | 8 | 4 | 1.86% | 2.03% | 1.69% | -4
Tame Impala – Feels Like We Only Go Backwards | 9 | 9 | 1.50% | 1.65% | 1.34% | 0
The Rubens – My Gun | 10 | 10 | 1.40% | 1.55% | 1.25% | 0

The top ten from the Warmest 100, while not spot on, were still very accurate, with no track more than four spots out of place compared to the Hottest 100. Although “Frank Ocean – Lost” actually ranked 8th in the Hottest 100, and not 4th as predicted, Lost’s lowest predicted share of the vote fell between the sample shares for the 9th- and 7th-ranked tracks in the Hottest 100, meaning that this result still fell within two standard errors of the predicted position.
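
The ±2 SE bounds in the table were produced with the bootstrap (described below), but a rough cross-check is possible with the normal approximation for a proportion. The sketch below uses the shares and sample size from the table; it gives slightly narrower, but broadly similar, bounds.

```python
import math

# Normal-approximation check on the table's +/- 2 SE bounds.
# Shares and sample size are taken from the table above.
n = 35081
for track, share in [("Thrift Shop", 0.0260), ("Lost", 0.0186)]:
    se = math.sqrt(share * (1 - share) / n)
    print(f"{track}: {share:.2%} in the sample, 2 SE = +/- {2 * se:.2%}")
```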

Difference in rank between real and sample data

The difference between the real rank and its position in the sample.

Ideally we would want the total votes each track got in the Hottest 100, so we could compare each track’s share of the vote to the figures from the sample. This information would make it easier to establish how representative the sample was of the voting behaviour of the population as a whole. However, reality is rarely so accommodating. What is clear is that the further down the Warmest 100’s list you go, the more likely it is that a track’s rank isn’t the same as in the Hottest 100, and the greater the difference between the placements in both.

Sample rank with Error Bars

Sample rank with +/-2 Standard Error

A Confidence Interval (CI) of +/- 2 Standard Error (SE) for each track’s votes determined using the Bootstrap method.

Considering the range that each track’s vote could fall within compared to the others, the unreliability of the tail end of the sample makes sense. For example, the upper limit of the confidence interval for track 100 overlaps the lower limit of every track ranked beyond about 57 in the sample. The total number of votes each track has in the sample also falls dramatically as you move down the list. While the first three results are significantly different from each other, those further down the list are much closer together, meaning even a small amount of error could alter their order, and the chance that their rank in the Warmest 100 will match the Hottest 100 is low.
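
For readers who want to reproduce this kind of interval, the sketch below is a minimal version of a bootstrap confidence interval for a track’s share of the vote. The per-track counts are reconstructed from the published percentages and the 35,081-vote sample size, so treat them as approximate, and the resampling treats each captured vote as independent, which ignores the fact that each ballot contained several tracks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-track vote counts reconstructed from the published shares of the
# 35,081-vote sample (approximate; for illustration only).
counts = {"Thrift Shop": 912, "Little Talks": 863, "Breezeblocks": 786}
n_total = 35081
other = n_total - sum(counts.values())

labels = list(counts) + ["(all other tracks)"]
votes = np.repeat(np.arange(len(labels)), list(counts.values()) + [other])

# Resample the captured votes with replacement and recompute each track's share.
n_boot = 2000
boot_shares = np.empty((n_boot, len(labels)))
for b in range(n_boot):
    resample = rng.choice(votes, size=n_total, replace=True)
    boot_shares[b] = np.bincount(resample, minlength=len(labels)) / n_total

for i, name in enumerate(counts):
    share = counts[name] / n_total
    two_se = 2 * boot_shares[:, i].std()
    print(f"{name}: {share:.2%} +/- {two_se:.2%} (2 SE, bootstrap)")
```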

Relative frequency of percent of total vote.

Ranks in the sample with equal, or nearly equal, shares of the responses

As the actual number of votes each track received in the Hottest 100 is not available, it is not possible to see how close the actual results came to those from the sample in terms of share of the whole vote. Without this information, the expectation that the Warmest 100 would have predicted each song’s share of the vote to within +/-2 SE at least 95% of the time cannot be tested. It does seem that the Warmest 100 is mostly correct for the results with a large number of responses, with the accuracy declining as the number of observed votes falls.

Data Works*

*But sometimes your sample doesn’t.

Positive and negative differences in rank between real and sample data

The positive and negative difference between the real rank of a track and its position in the sample.

The Warmest 100 got the top three tracks right. The top ten were never off by more than four spots, and the distance between what Twitter said and what the general population said grows wider from there. The further you move down the list, the greater the impact an error of only a few votes will have. Even in the top ten, a handful of additional votes would change the position of many of the tracks. As for the lower-ranked tracks, as the number of votes in the sample gets smaller, any difference in the votes recorded, even within the 95% CI range, can change a track’s position significantly.

Comment on accuracy

Conservative estimate.

Even with the potential bias issues mentioned above, it seems that Twitter is fairly accurate for picking large trends for this population. As inaccurate as it was for most of the Hottest 100, it did pick the top three. There is more than one reason why the sample did not accurately represent the other 90, and it isn’t limited to sampling error.

The small amount of data for the tracks further down the list is also a problem, exaggerating the effect of any rate of error on their order. In this case, Twitter provided enough information to predict the top three, but once the number of responses per track fell out of the hundreds, its accuracy predictably declined.

The data used for this analysis and for generating the graphs can be found here.

Travel times and distance in your SERPs

Answering short travel questions on the results page.

You can see Ana’s thoughts on the topic over on her post titled “Latest Changes in Google Search Results”.

The kinds of information Google displays in their results keep getting more interesting and complicated. Recently an SEO specialist I work with, Ana Diaz, noticed something new: a number of travel queries came with a map, with a route, distance and travel time displayed. As with the special dates details seen last month, there was not much about this on Google’s blogs or elsewhere online.

While adding this information to the main search results seems to be a new thing from Google, displaying this kind of information isn’t much of a departure for the search engine. Google Now, available on Android, provides the same kind of information automatically based on your location and time of day. Google Maps has also provided this kind of information for a few years now, and has even been providing its traffic data via an API.
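
For comparison, the same origin and destination data has long been available programmatically. The sketch below queries the public Google Maps Directions API for route distance and travel time using the requests library; the API key is a placeholder, and the response handling assumes a successful lookup.

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

# Ask the Directions API for a driving route between two Brisbane suburbs.
response = requests.get(
    "https://maps.googleapis.com/maps/api/directions/json",
    params={
        "origin": "St Lucia, Brisbane",
        "destination": "Newstead, Brisbane",
        "key": API_KEY,
    },
)
leg = response.json()["routes"][0]["legs"][0]
print(leg["distance"]["text"], leg["duration"]["text"])
```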


Google Now’s travel information

Understanding the Query

A less than semantic search

Unusually easily confused, for Google.

Unfortunately, as cool as this new kind of result is, it is easily confused. The map with travel details does not display consistently across queries. While it will show for “St Lucia to Newstead” and “St Lucia from Newstead” (two Brisbane suburbs), it won’t for “Drive from St Lucia to Newstead” or “Travel to St Lucia from Newstead”. It won’t support more than two destinations in a sequence either. For now, it appears to only return a result for “location” “direction” “location”. It will also accept other qualifying location terms, like “St Lucia to Newstead Brisbane” or “St Lucia from Newstead Australia”, much like the fields available in Google Maps.
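
The observed behaviour boils down to a very rigid query shape. The sketch below is a rough illustration of that shape; the regular expression and the qualifier list are my own guesses at the observed pattern, not Google’s actual parsing logic.

```python
import re

# "<place> to|from <place>", optionally followed by a qualifying location.
# The qualifier list is illustrative only.
PATTERN = re.compile(
    r"^(?P<first>.+?)\s+(?P<direction>to|from)\s+"
    r"(?P<second>.+?)(?:\s+(?P<qualifier>Brisbane|Australia))?$",
    re.IGNORECASE,
)

match = PATTERN.match("St Lucia from Newstead Brisbane")
if match:
    print(match.group("first"), match.group("direction"),
          match.group("second"), match.group("qualifier"))
    # -> St Lucia from Newstead Brisbane
```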

A close match in syntax

As in Google Maps, so it is in Google search, for now.

It is interesting how closely the queries used in Google’s main search need to match to the format seen in Google Maps. It seems to indicate that this feature is not as closely integrated with Google’s search as you would expect.

As cool as it is, there are a number of ways this tool seems to fail. A search for “Ascot to Manly” won’t return a map, which isn’t surprising as a number of cities have suburbs with these names. Adding a location qualifier like “Ascot to Manly Brisbane” doesn’t help, nor does the fact that Google is pretty certain I am in Brisbane, QLD. However, it will work with “Surry Hills to Paddington”, even though there are suburbs called Paddington in both Sydney and Brisbane.


Paddington Sydney or Paddington Brisbane?

How good is it really?


Not so vague in Google Maps

Travel time and distance results in Google’s main search results are interesting, and they do appear to be new. When they work, they are useful, and the feature does not appear to use any information Google has not had for a while, or already used in the same way elsewhere. It is interesting how sensitive to query structure this feature is, especially given how good Google usually is with poorly structured and badly spelled searches. Even Google Maps seems able to cope with some of the searches that stymied this new feature.

Facebook lets you Google Social

Getting search in your social

Facebook is finally bringing social search. Or more accurately, adding search to social. Facebook has actually been providing a social layer to search for a while now, with both Bing and Blekko using Facebook information to help curate their results for signed in users. Facebook’s Graph Search is different. From the information available to date, it seems to be far more about adding search to their users’ social experience.

The initial rollout will be very limited, with Facebook saying that it will only cover the following areas at first:

People: “friends who live in my city,” “people from my hometown who like hiking,” “friends of friends who have been to Yosemite National Park,” “software engineers who live in San Francisco and like skiing,” “people who like things I like,” “people who like tennis and live nearby”

Photos: “photos I like,” “photos of my family,” “photos of my friends before 1999,” “photos of my friends taken in New York,” “photos of the Eiffel Tower”

Places: “restaurants in San Francisco,” “cities visited by my family,” “Indian restaurants liked by my friends from India,” “tourist attractions in Italy visited by my friends,” “restaurants in New York liked by chefs,” “countries my friends have visited”

Interests: “music my friends like,” “movies liked by people who like movies I like,” “languages my friends speak,” “strategy games played by friends of my friends,” “movies liked by people who are film directors,” “books read by CEOs”

As it is only in limited beta, there isn’t much discussion about its ability to handle natural language queries or what kind of information is available. But there is one other interesting thing to come from today’s announcement: Bing’s involvement.

Bing and Facebook

On their own blog shortly after the announcement from Facebook, Bing outlined how they were involved in “Evolving Search on Facebook”. Bing has worked with Facebook since 2008, starting by powering its web search, with adCenter placing ads next to the organic results. Since then the relationship appears to have worked: Bing has continued to provide search while Facebook took over the ads in 2010, and as early as October 2009 Bing hinted at bringing public data from Facebook’s API into its own search experience. Far more recently, Facebook data has been integrated into Bing’s search results.

Putting the Value into the Search in Social

If all the rage surrounding Facebook’s attempts to find the perfect way to handle EdgeRank is to be believed, they have a real discoverability problem. The news feed works just fine for some light social stalking, and with their existing search features all but broken, it is almost impossible to unearth the kind of information that Facebook seems to want to target with Graph Search, and somehow advertise against.

Graph Search should fix discoverability and provide more navigation tools to their users, allowing Facebook to turn one of the world’s larger collections of user information into something both engaging and useful. Or at least as engaging and useful as the quality of the information it has collected would allow.

Seen from a data perspective, the acquisition of Instagram and Facebook’s integration of services such as Spotify make a lot of sense. While Facebook is far more device-agnostic than it used to be, it still does not completely own its users’ online lives. Providing a platform and social integration for services such as Spotify, and outright buying others like Instagram, extends Facebook’s ability to collect information on its users’ behaviour beyond its own touch points.

Why this Matters

Ultimately Facebook needs impressions. Facebook’s main source of revenue is advertising, from banners to sponsored stories. These ads lose value and more importantly traffic when there is no-one there to look at them. To ensure that there are enough eyeballs to go around for the advertising inventory that Facebook needs to sell to keep the shareholders happy and the servers running, their users need to be kept on the site. And this is where engaging, sticky content comes in. Despite the Internet’s love of a false dichotomy, Graph Search does not need to be about beating Google in search. It merely needs to improve Facebook for its users, so they can help Facebook meet its business objectives.

The gap between where you are and where you should be


3am is not the busiest time of day for search. Nor is it the best for conversions. For organic traffic, this does not matter. Ranks do not change depending on the time of day or week in the search results. The same is not true for paid search.

Search activity and volume change over time, and most of the time, in predictable ways. There are patterns that repeat from day to day, from week to week and from year to year. These changes over time can be important for managing organic search engine marketing. Understanding how demand and interest change and when is important for planning site development and content.

Typically the cycles in demand that are important to search engine optimisation (SEO) are longer than those that matter in paid search (though updates like Caffeine have shortened these). Generating visibility for specific changes in search within the organic results is not as straightforward as a media buy.

Timing and effectiveness are limited by how appropriate the content is for the targeted queries, how successful promotional and linking activity was and how the search engines crawl and rank the content. Paid search does not have the same limitations.

Visibility, Productivity and Competitive Bidding

AdWords Ad Schedule

AdWords Ad Schedule feature

Paid search and display advertising platforms such as AdWords let advertisers manage their campaigns by hour of day and day of the week. How an optimised campaign uses these tools will depend on industry trends, on how and when the target market uses search, and on the advertiser’s own objectives. Unfortunately, there is usually more than one advertiser doing this.

Choosing when to push for more traffic and impression share on AdWords and other realtime bidding-based platforms is important. Cost per acquisition (CPA), click through rate (CTR) and conversion rate (CVR) are all good indicators for what is performing and what isn’t.
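
As a concrete example of how those indicators can drive scheduling decisions, the sketch below aggregates performance by hour of day and flags hours with below-average CPA and above-average CVR as candidates for bid increases. It uses pandas, and the file name and column names are assumptions about an hourly report export.

```python
import pandas as pd

# Hypothetical hourly report export; the file and column names are assumptions.
df = pd.read_csv("adwords_hourly.csv", parse_dates=["datetime"])
df["hour"] = df["datetime"].dt.hour  # the same idea works grouped by day of week

by_hour = df.groupby("hour").agg(
    impressions=("impressions", "sum"),
    clicks=("clicks", "sum"),
    cost=("cost", "sum"),
    conversions=("conversions", "sum"),
)
by_hour["ctr"] = by_hour["clicks"] / by_hour["impressions"]
by_hour["cvr"] = by_hour["conversions"] / by_hour["clicks"]
by_hour["cpa"] = by_hour["cost"] / by_hour["conversions"]

# Hours that convert well and cost less than average are candidates for bid increases.
candidates = by_hour[(by_hour["cpa"] < by_hour["cpa"].mean())
                     & (by_hour["cvr"] > by_hour["cvr"].mean())]
print(candidates)
```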

Another factor to take into consideration is competition. If a day or time works for you, it is likely that it works just as well for your competitors, and is just as desirable to them as a result. Consequently, when a particular time or day stops performing as well as expected, it could be due to increased competition rather than just the market. AdWords provides a number of metrics that make it possible to analyse for competitive pressure:

  • Search impression share
  • Average position
  • Average cost per click (CPC)
  • CTR

Search impression share is the most straightforward of these: it does what it says. The information is available down to ad group level, and assuming these are tightly themed, it will give a good indication of what kind of share of voice you have within those query groups.

The other metrics do not directly measure competition, but they can show its effects. Average position is not a reliable metric: it represents an average of all the positions the ads have appeared in for the period, but gives no indication of spread. It can, however, indicate large general movements. Changes in CPC and CTR are more reliable indications of competitive activity. A change in CPC can signal a change in competitors’ bidding, and a change in CTR can indicate a change in position. Together these two metrics can point to changes in competitor activity, barring other confounding factors.
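
Putting those signals together, one simple way to watch for competitive pressure is to compare each week’s CPC, CTR and impression share against a trailing baseline. The sketch below does this with pandas; the column names and the 10% thresholds are assumptions, and it only flags weeks worth investigating rather than proving a competitor is responsible.

```python
import pandas as pd

# Hypothetical daily ad group report; column names and thresholds are assumptions.
df = pd.read_csv("adgroup_daily.csv", parse_dates=["date"]).set_index("date")

weekly = df[["avg_cpc", "ctr", "impression_share"]].resample("W").mean()
baseline = weekly.rolling(4).mean().shift(1)  # trailing four-week average

cpc_up = weekly["avg_cpc"] > 1.10 * baseline["avg_cpc"]    # CPC rose more than 10%
ctr_down = weekly["ctr"] < 0.90 * baseline["ctr"]          # CTR fell more than 10%
share_down = weekly["impression_share"] < baseline["impression_share"]

# Weeks where all three move together are worth a closer look for competitor activity.
print(weekly[cpc_up & ctr_down & share_down])
```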

The Why of Benchmarking

Monitoring changes within the account can only provide insights when compared to something. Benchmarking makes the difference between identifying a shift in the market and chasing noise. What to benchmark campaign changes against will depend on what other information is available. Other campaigns that experience the same user behaviour and market similar products are one possibility; organic search traffic for the same kinds of terms over a similar time period is another.

While there are a few ways to approach analysing and processing this information to create actionable insights, setting some kind of benchmark matters. Because of AdWords’ nature, many of the changes in cost and behaviour are as likely to be caused by your competition as by changes made to the campaigns or changes in actual user behaviour. Consequently, it is important to be able to differentiate between them.
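
One simple way to make that comparison is to index the paid series against a benchmark that shares the same demand but not the same auction, such as organic traffic for similar terms. The sketch below indexes both series to their first week and looks at where they diverge; the file and column names are assumptions.

```python
import pandas as pd

# Hypothetical weekly exports; file and column names are assumptions.
paid = pd.read_csv("paid_weekly.csv", parse_dates=["week"], index_col="week")["clicks"]
organic = pd.read_csv("organic_weekly.csv", parse_dates=["week"], index_col="week")["sessions"]

# Index both series to their first observation so they can be compared directly.
paid_idx = paid / paid.iloc[0]
organic_idx = organic / organic.iloc[0]

# Where paid search falls away from the benchmark, the cause is more likely to be
# competition or campaign changes than a shift in overall demand.
divergence = (paid_idx - organic_idx).dropna()
print(divergence.sort_values().head())
```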

Finding dates on Google


Event dates such as the one above seem to be another example of Google’s drive to provide a richer, more informative Search Engine Result Page (SERP). It certainly is interesting that this kind of information seems to have only just started to show up, especially as earlier this week Google announced a great new tool for webmasters, the Data Highlighter.

Now available through Google Webmaster Tools (GWT), the Data Highlighter makes it easier to help Google identify structured data on your site. For now the tool is only available for English-language sites and for event-related information like concerts, festivals and sporting events.

As it is rolled out, the Data Highlighter will likely become a popular alternative to marking up pages with Schema.org. It makes some forms of structured data easier to implement while requiring fewer resources, and it is included with one of Google’s own widely used tools.
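
For comparison, this is roughly the kind of event information Schema.org describes and that the Data Highlighter lets site owners point out without touching their templates. The sketch below assembles a minimal, made-up schema.org/Event description as JSON-LD, one serialization of the Schema.org vocabulary; in practice the same properties can be embedded in a page with microdata or RDFa.

```python
import json

# A minimal, made-up schema.org/Event description expressed as JSON-LD.
event = {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Example Music Festival 2013",
    "startDate": "2013-03-02",
    "location": {
        "@type": "Place",
        "name": "Example Venue",
        "address": "123 Example St, Brisbane QLD",
    },
}

print(json.dumps(event, indent=2))
```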

Structured Data’s Slow Burn


Movie Times on Google

Structured data in search isn’t new. Microformats have been around for a while and have been used by search engines to provide a richer search experience; Schema.org is simply the latest. When Schema.org was released, it was overshadowed by Google+, which was announced at the same time.

Unsurprisingly, as Google has increased the amount of information it displays directly in the search results, structured data has been attracting more and more attention. Google’s release of the Knowledge Graph, the inclusion of structured data preview tools and Bing’s own snapshot feature have certainly indicated an accelerating shift in how search engines serve their customers.

Decision Engines and Portals


The Hobbit Movie

Search is now far more than just a list of links, as Bing, Google and newer products such as Siri and Google Now become better able to answer a user’s question directly, without sending them to a different site or application. With the amount of information now available on the SERPs, and tools such as calculators and converters usable from the search bar, it is almost as if Google is becoming a portal.

For many stakeholders in search this is a good thing. Users get the information they want more easily and quickly by avoiding poorly designed sites and obtuse navigation. Providing a better experience for searchers allows Google to maintain its position in the market, which in turn means the search engine can offer its advertisers a large potential audience.