Search is an interesting creature. As well as being a way to generate traffic, it is an interesting study of language and intention. Ignoring for a moment how search engines also function as a Skinner box, and the effect this has on consumer behaviour, what someone types into a search engine is an indicator of where they are in the sales funnel and what their intention is.
With long-tail search queries it is hard to see clearly what is working and what is not, unless you group traffic around commonalities. With search traffic, the most relevant variable is the actual phrase, as this reflects user behaviour and can provide a guide for future SEO activity. Time of day, search engine used and the user’s browsing history are also useful.
Multivariate statistics are good for this, especially cluster analysis. I pulled a quick sample of search query data from Google Webmaster Tools for a demonstration. I am aware that there is more than one search engine, and I know that data on the terms a site appears for is meaningless without information on clicks or search volume per query. This is what you might call a convenience sample.
As I do not have SAS Enterprise Miner on this machine, this analysis will be simple. Each cluster will be split on a commonality that is shared by more than 20% of its queries. If there is no such commonality, the cluster is exhausted and is not split any further.
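The splitting rule can be sketched in a few lines of Python. This is only an illustration under some assumptions of mine: a "commonality" is taken to be a single word shared between queries, a word must occur in at least two queries to count as shared, and each split uses the most frequent word that has not already been used. The function name `split_on_commonality` and the sample queries are hypothetical, not the Webmaster Tools data.

```python
from collections import Counter


def split_on_commonality(queries, threshold=0.20, path=()):
    """Recursively split queries into clusters of shared words.

    Returns a list of (shared_words, queries) pairs. A cluster is
    "exhausted" when no unused word is shared by more than `threshold`
    of its queries (and by at least two of them).
    """
    queries = list(queries)
    # Count how many queries contain each word not already used on this path.
    counts = Counter(
        word
        for q in queries
        for word in set(q.lower().split())
        if word not in path
    )
    if not counts:
        return [(path, queries)]

    # Most frequent word; ties are broken alphabetically for repeatability.
    word, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    if n < 2 or n / len(queries) <= threshold:
        return [(path, queries)]  # exhausted: nothing left to split on

    with_word = [q for q in queries if word in q.lower().split()]
    without_word = [q for q in queries if word not in q.lower().split()]

    clusters = split_on_commonality(with_word, threshold, path + (word,))
    if without_word:
        clusters += split_on_commonality(without_word, threshold, path)
    return clusters


# Hypothetical sample queries, for illustration only.
sample = [
    "cluster analysis seo",
    "cluster analysis tutorial",
    "seo cluster",
    "skinner box marketing",
    "sales funnel",
]

for shared, members in split_on_commonality(sample):
    print(shared or "(long tail)", members)
```

Queries that end up with no shared words form the dissimilar long tail. A proper clustering tool would use a distance measure rather than a single most-common word, so treat this as a rough stand-in.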
As the sample demonstrates, there is still a significant, dissimilar long tail. A few very niche groups were also identified. Ultimately, this data is not a true representation of user behaviour. Just because a number of different individuals found your site using queries from the same small cluster does not automatically mean that they are after the same thing. More information is required to draw those conclusions. This is just a model. It can help guide your decisions, and it can indicate points of interest worth investigating. What it is not is gospel.
Tags: Cluster Analysis, data, internet, Marketing, Multivariant Statistics, Online, Search, SEM, SEO, Statistics

