data

You are currently browsing articles tagged data.

EVE Online (or Spreadsheets in Space, as it is also known) is an MMORPG with strong PVP gameplay. There are a large number of other ways to play EVE Online, from the market through to PVE, but it is PVP that stands out in the game. It has been called a ganking game, which is a fair comment, as there is a real risk of loss of gear and skills (comparable to levels in other games). Loss of gear and skills creates behaviours aimed at minimising this risk while maximising rewards. In other MMORPGs with little or no chance of loss, PVP activity tends to be restricted to the market.

Winning at PVP in EVE Online

Winning at Spreadsheets in Space

PVP in EVE Online is not fair. In fact, the challenge in PVP in EVE Online is in setting up these unfair encounters. In most MMORPGs, the actual act of combat consists of a few mouse clicks and some waiting. EVE Online is no different. It is the risk of losing stuff that makes players focus a lot more on everything before the actual combat. Player skill starts to make a real difference in taking the right mix of ships, avoiding being outnumbered and cornered by a superior foe, and acting before the opponent even knows they are in a fight.

Why SEM is like EVE Online PVP

Search Engine Marketing (SEM) is very similar to PVP. It is a zero-sum environment where operators compete for a resource through actions governed by a set of rules and environmental factors generated through user behaviour. There are a few principles that carry over from EVE Online PVP to SEM.

  • Situational awareness is king
    • Know how the advertising network works
    • Understand competitive activity
    • Understand how the market behaves
  • Observe, act and assess
    • Analysis without an accompanying action is useless
    • Assess the effectiveness of activity & reassess decision making model
    • And repeat…
  • Know where you can compete and where you can’t
    • Don’t waste time & resources competing directly with advertisers intent on outspending you
    • Find alternative ways of reaching potential customers.

Information is the key. Understanding how the query space works, having good situational awareness, and knowing where in the sales funnel certain terms are is valuable. It won’t save you from the SEM equivalent of a gate camp (high margin and ‘branding’ campaigns with large budgets), but it is essential for remaining competitive without burning through your budget.

In my last post, “Too soon for decisions”, I discussed applying a consistent set of rules to campaigns to assess the performance of new ads and targeting. However, in practice, assessing and tracking an AdWords or Facebook campaign can be an interesting exercise.

The data generated by a campaign is not a true representation of the population. The data is a snapshot limited by the markets targeted and the visibility available for the budget spent. Any single campaign can be exposed to direct competition over the whole market or specific subgroups. For example, just because “Campaign A” does not get traffic from Victoria does not mean that no-one in that state is searching for “Keyword B”.

A competitor could simply be focused on that market and value the traffic more. Other factors to consider are the effectiveness of the competition’s creatives and offers, the appeal of their product, the efficiency of their site in turning clicks into sales and how much they return per conversion. All of these factors will influence their budget, and how much they are willing to spend per click or impression. Tools provided by the advertising networks that increase the efficiency of campaigns, like Remarketing, are also worth considering.

According to Wikipedia, a confidence interval is defined as:

…a particular kind of interval estimate of a population parameter. Instead of estimating the parameter by a single value, an interval likely to include the parameter is given. Thus, confidence intervals are used to indicate the reliability of an estimate. How likely the interval is to contain the parameter is determined by the confidence level or confidence coefficient. Increasing the desired confidence level will widen the confidence interval.

In use here, it is assumed that between similar competitors, the average Cost Per Acquisition (CPA) within the group is likely to be within a 95% confidence interval of the known CPA.

Confidence interval and estimated CPA

A confidence interval can give you an estimate of what other bidders may be paying for a conversion, assuming they are operating as efficiently as you are. In the graph included above, the confidence interval of the CPA is used to estimate the highest CPA at which a campaign is still likely to compete. In conjunction with Cost Per Click data, it is fair to assume that the competitors in the query space are willing to spend over the highest likely observed CPA. Reasons for their bidding strategy can vary from shutting out competitors by absorbing a short term loss, to a higher sustainable CPA. In a query space where a number of different verticals are competing for the same traffic, this metric is considerably less useful and your mileage may vary. For comparing CPA campaigns, creating a model for understanding the market, or simply assessing which ads are potentially performing a lot better or worse than your target in the face of direct competition, it is a useful tool.
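As a rough illustration of the arithmetic, a 95% confidence interval around an average CPA might be calculated along these lines. This is only a sketch: the function name and the daily CPA figures are made up for the example, and it assumes a normal approximation (z = 1.96), which may not be how the graph above was produced.

```python
from math import sqrt
from statistics import mean, stdev


def cpa_confidence_interval(cpas, z=1.96):
    """Return (lower, upper) bounds of an approximate 95% confidence
    interval for the mean CPA, using a normal approximation."""
    centre = mean(cpas)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    margin = z * stdev(cpas) / sqrt(len(cpas))
    return centre - margin, centre + margin


# Hypothetical daily CPA figures in dollars (assumed, not campaign data).
daily_cpa = [42.0, 38.5, 45.0, 40.0, 39.5, 44.0, 41.0, 43.5]

lower, upper = cpa_confidence_interval(daily_cpa)
print(f"Mean CPA: {mean(daily_cpa):.2f}")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

The upper bound is the figure of interest here, as a guess at the highest CPA a similar competitor is likely to be working to. With only a handful of observations, a t-distribution critical value would give a wider and more honest interval than 1.96.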

A confidence interval can be a guide to how much your competitors expect to spend per conversion, assuming a lot of similarity in product and business practices. Arbitrage and industries with heavily commodified products are prime candidates for this, as well as campaigns with a very aggressive high cost bidding strategy, such as those competing directly with another member of your industry.

Most apparent behaviour on the Internet is a result of people attempting to complete a task using a distributed network of connections and tools. Not all of this takes place on the World Wide Web. There are a lot of different tools that use the Internet to locate, collate and move information, such as Kindle, iTunes, Steam and BitTorrent clients.

Even through the web, the methods used to locate and consume information are diverse. From search to portals to socially generated recommendations, there is a huge range of navigational nodes online that shape the user’s experience. Focusing on what information is consumed and where, rather than on what the user is trying to do, can be very myopic.

Find music with only the Internet

For example, what if the user wants to listen to music? With access to a computer and a browser they may go to YouTube, or the site of a band recommended in an email from a friend. With P2P software like a BitTorrent client they may use inbuilt search tools and download it via a peer network. They could also use iTunes to find, choose and purchase a song without touching a browser. Ultimately all these methods use the Internet, but only one is dependent on the World Wide Web. This does not even start to consider devices other than a traditional computer.

Nodes and Friction

Each mode of content location and acquisition uses a different set of nodes and they can range from invisible to obstructive. Each one is another opportunity for the gatekeeper of the node to create friction and shape experience. Search and social sites have contextual advertising, Internet Explorer treats incorrect URLs as search queries, DNS services can redirect mistyped or incorrect URLs and the iPad does not support Flash.

Nodes such as portal sites, search engines, social networks and applications such as Steam control and direct attention in different ways. Each gives the user different tools to discover content, with different levels of friction placed between the user and what they wish to do.

Some sites use a disruptive model and place ads in front of the user, using available data to tailor their ads. Applications like iTunes and Steam operate as shopfronts and work to minimise friction between the user and the buy button. They help the user find, acquire and consume the media with the least effort, and generate sales in the process.

Why attention matters more

The Internet is an environment where there is almost limitless content and space to display it in. The scarcity is with attention. The limits on the size of the audience and the amount of time they spend online are far more immediate than potential advertising inventory. Unlike traditional media, the Internet does not have a page limit, nor is it restricted by spectrum or the number of hours in a day. The low cost of storing and moving data, the asynchronous nature of most content and the ability to generate more content automatically or through user activity change its value. There is no value in just existing; there is no ‘only two papers in town’ or ‘only three TV networks’ online. Online, the value of a node is in how much attention it affects. Each one is an opportunity to distribute attention among the next group of nodes in the chain.

Why the Task Model

How most people use and access the Internet has changed over the last few years. The ubiquity of Internet capable devices is as significant a factor as the prevalence of fast and wireless home connections. While the Internet on a phone in some form is not new, large numbers of people with fast and easy access is. A proliferation of applications designed to give access to content independent of the World Wide Web is significant too.

Social networks, both formalised like forums and Facebook, and ad hoc such as email, will remain a factor, as will portal sites like Yahoo! and search engines. These tools for content distribution and discovery are not being replaced; they are just being supplemented.

The user’s aims and knowledge determine their actions online. The channels they use do have an impact on the kinds of information they access, how they access it and how they locate new material. As the Internet becomes richer in content and tools, it will also continue to fragment and change. We have come a long way from the Internet being tied to a desktop computer through just a browser or email client, but the user will always have a want or desire that they wish to meet, and they will use whatever tools are available to do it.

For value for money it is hard to top Facebook. It costs nothing and in return you can host photos and videos, communicate with people all over the world, consume vast amounts of content, create groups and participate in various communities. To create and host something similar yourself would cost a lot of time and money.

Free sites and services like Facebook, YouTube, and Google Search still have to pay their developers, provide hosting, repay investors and generate revenue to keep it all going. Traffic, registered users and great PR do not pay the bills by themselves; at some point cash needs to be involved. This is where the interests of those providing the services and those using them diverge.

Free at a price

There will always be a cost to the end user, and if it is not cash it will be something else. Lack of technical support, poor documentation, slow bug fixes, compromised privacy and exposure to advertising are a few ways operating costs are managed and paid for. Some paid services suffer some of these issues as well, but they are not the norm.

When the user pays, there is a clear cost in losing them and therefore higher expectations of service. When the service is free and the costs are paid for by advertisers and investors, creating value for them is important for the business. The advertising model is often the first choice for generating revenue, so targeted traffic or impressions and richer forms of display advertising become more important. When the user pays, creating value for them becomes important to the business.

Facebook appears to be going through this process now. A lot of the recent changes seem to create more value for advertisers than for some segments of their community. With Facebook being such a dominating presence, this is generating a lot of discussion. With all this focus on user control over data and experience, Diaspora could not have begun development at a better time.

Will you pay?

Diaspora as a social media platform will be interesting, and potentially very disruptive to this space. It looks easily accessible for many users, either through Turnkey or individually installed and operated servers. As a distributed network of easily installed and managed ‘Seeds’ across a variety of servers, Diaspora can be compared to WordPress. Based on the offers on Kickstarter for funding, it seems that while the software will be free, access to Turnkey servers and technical support will cost money.

Diaspora at the very least will place a dollar value on privacy and control over your social media profiles, and it will ask one other question: Will you pay for access to a social media platform, either through hosting or a Turnkey server?

Almost no-one accesses the Internet. What most users access is a selection of information determined to varying degrees by their own behaviour and the nature of the gatekeepers, such as ISPs, browsers, applications, DNS, platforms, language, social networks and online nodes such as search engines and portal sites. The Internet has always been a medium especially prone to creating silos of information and homogeneous communities; however, behavioural and real world factors are increasingly having a greater impact than before.

Organic factors such as user interest and active social behaviour have always influenced what a person might see and experience online. Someone with no interest in esoteric information like Babylon 5, and no friends who are interested in it, is not likely to hear much about it if they do not actively seek it. There is less chance of being exposed to information they have no interest in on the Internet than in most mass media. Of course the larger the social network of the individual and the greater their direct involvement, the more information outside of their immediate sphere of interest they will be exposed to.

None of these factors are unique to the Internet. The tools available online make it possible to interact with more people in some way than was ever possible before. The speed and diversity of content that can be shared online has no parallel in history, but ultimately, it’s just people being people online. What has become increasingly important is the influence of location, device, software and sites or platforms that actively use user generated data to shape your online experience.

The technology to change what is shown by IP, cookies, logged in accounts, OS and browser is not new. Its implementation online is becoming more apparent, with more obvious use and a proliferation of Internet capable devices in the population. This trend covers commercial sites, social media, news and search engines. It affects content from advertising and articles through to search listings.

Personalised Search

Currently, one of the most interesting things about Personalised Search is the average user’s complete ignorance of it. Personalisation of content thought to be consistent for all who access it will have interesting social ramifications. Most users are not actually aware that their own behaviour, and at some point the behaviour of people they are connected to through their Google products, will have an effect on what appears where in their search results.

Google has for a lot of people become a portal, with users retrieving information through the search box with keywords they have learnt, or been told to use. This change in user experience of information retrieval for sites other than brand and generic terms may discourage users from being so totally dependent on Google Search acting as a replacement for sharing and directly accessing URLs.

Cross Platform Content Consumption

Not all content works on all platforms. Mobile browsers are far more sophisticated than they were when WAP was the standard, and most web content is now easily accessible on mobile devices, with a few notable exceptions such as Flash. Due to differences in screen size and interface, some sites will serve a different site to different devices.

Location Aware

With IP addresses, the ability to serve different content to users from different locations has been available for ages. Nowhere is this more apparent than in search. What is new is how location aware applications are now using device APIs to access information from the GPS chip. This location data makes it possible to serve information at a far more granular level than is possible through IP addresses.

User Experience and Advertising

Delivering the right message to the right person in the right place at the right time is as important for advertising as it can be for sharing information. Delivering relevant information from trusted sources in the right place and time to a user who has demonstrated an interest does go a long way towards managing the huge volume of information available. There is a cost associated with this, including privacy and an increasingly myopic view of the Internet, especially with content that is currently assumed by the average user to be consistent for all.

Bing has had a relationship with Facebook since 2008, and it has just become more involved. The latest announcement regarding Bing’s relationship to Facebook appeared on Bing’s blog on the 5th of this month.

Briefly, web search through Facebook will continue to be developed, including the integration of more of Bing’s features, and this will be rolled out both in and outside of the USA. Facebook will provide all display advertising within their site; however, Microsoft will continue to provide the search ads. In the whole post, it is the following paragraphs that I find the most interesting:

Bing will continue to exclusively power the web search results on Facebook. This change will also enable Microsoft to continue its focus on driving strong performing campaigns across our own social media and communications tools, including Windows Live Messenger and Hotmail, and via rich content environments across MSN and Xbox Live.

This is an exciting time for us as we continue to work with Facebook on great new experiences for customers. As you know, Bing has been very focused on helping customers make important decisions. We believe that counsel from family and friends can be a big part of that process. Going deeper in web search experiences with Facebook, in addition to the collaboration we announced last October about bringing public data from Facebook’s API into the search experience, will enable us to do great things together for our customers.

By providing Facebook with its web search functionality, Bing can get around the strong Google brand, and get its search tools in front of people where they already are, rather than attempting to change existing habits the hard way. Bing’s features are comparable to what Google provides through their search experience, and Bing’s intent to become a decision engine and their integration of Wolfram Alpha have a lot of potential.

When you stop and consider the social networks that Microsoft is already heavily invested in, working with Facebook makes a lot of sense. The quote: “…bringing public data from Facebook’s API into the search experience…” is interesting in what it points to. Google is leveraging their own social data through Gmail, Google Reader, etc, to further enhance their own search results, and to provide a better ad product.

How Bing will use the data it collects from Facebook will be very interesting. As a recommendation engine, Facebook is incredibly effective, and the volume of data that they collect is very cool, or very scary, depending on your views. Data mining how links, information, video and photos are shared, tagged and recommended would be invaluable for Bing. Between these announcements and the development of Google Social, search will be a very different place in five years’ time, at the latest.

It is very silly to write off Microsoft too quickly. While they did release Vista and they have gone through anti-trust litigation, they still have considerable resources, and large companies can still be creative and agile.

How someone searches is a result of every single query they have ever made. It is a learnt behaviour shaped by an ongoing conversation between the user and the algorithm.

The results page for each query is feedback on how well the question was articulated. If the results are relevant enough to end your search, that is good. If they are not, then you try a new approach and adopt a new method. For most web literate people, this process is repeated dozens of times every day.

What appears on the results page is determined by three factors: the query itself, how the search engine interprets the query, and what sites appear to be the most relevant to this query. The user only has control over the query that they put in. However, over time, this will be influenced by the interaction between the search engine, the sites it indexes, and what the user deems relevant enough.

In low competition namespaces where optimisation activity is low to nonexistent, it is only feedback from the search engine that shapes user behaviour. Assuming an effective algorithm, the user may not have to try as many different query structures, refinements or synonyms to find a site that would be relevant enough.

In a more competitive namespace where the optimisation activity keeps pace with or exceeds the search engine’s ability to control what is deemed relevant for a specific query, the user’s behaviour is affected by both the search engine and SEOs who consider that namespace to deliver a good return. Assuming that the optimised sites do deliver an experience that is relevant, then this will have minimal impact on user behaviour. In namespaces where there is more than one potential subject, then optimisation activity for one may force a shift in the search behaviour of the users seeking the other.

A crowded namespace can have another interesting effect on search behaviour: an increased use of a site’s branded terms to locate it by existing customers. Where there is a high level of competition on generic product terms, or the most relevant site for the namespace is outranked by less relevant results, the user can be taught to use branded instead of generic terms.

Search engines are Skinner Boxes. Each time the user conducts a query, they get feedback on how closely it relates to their intent. In response to this feedback, their behaviour changes. The feedback they receive comes from two sources: the search engine itself, and those optimising for it. These in turn influence how the user describes the product online, and can encourage them to home in on a more focused range of queries.

Search is an interesting creature. As well as a way to generate traffic, it is an interesting study of language and intention. Ignoring for a moment how search engines also function as a Skinner box, and the effect this will have on consumer behaviour, what someone types into a search engine is an indicator of where they are in the sales funnel and what their intention is.

With long tail search queries it is hard to clearly see what is working and what is not, unless you group traffic around commonalities. With search traffic, the most relevant commonality is the actual phrase, as this reflects user behaviour and can provide a guide for future SEO activity. Time of day, search engine used and the user’s browsing history are also useful.

Multivariate statistics are good for this, especially Cluster Analysis. I pulled a quick sample of some search query data via Google Webmaster Tools for a demonstration. I am aware that there is more than one search engine, and I know that data on terms a site appears on is meaningless without information on clicks or search volume per query. This is what you might call a convenience sample.

As I do not have SAS Enterprise Miner on this machine, this analysis will be simple. Each cluster will be split on a commonality that is greater than 20%. If there is no such commonality, then it is exhausted.
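The splitting rule described above could be sketched roughly as follows. This is an illustration rather than the method used for the chart: it assumes a "commonality" means a word shared by more than 20% of the queries in a cluster, the function name and the sample queries are invented, and it requires at least two queries to share the word before splitting.

```python
from collections import Counter


def split_cluster(queries, used=frozenset(), threshold=0.2):
    """Recursively split queries on the most common not-yet-used word.

    Returns a list of (defining_words, queries) pairs. A cluster is
    exhausted when no unused word is shared by more than `threshold`
    of its queries (and by at least two queries).
    """
    counts = Counter()
    for query in queries:
        counts.update(set(query.split()) - used)
    if not counts:
        return [(sorted(used), queries)]
    # Highest count wins; ties are broken alphabetically for repeatability.
    word, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    if n < 2 or n / len(queries) <= threshold:
        return [(sorted(used), queries)]  # exhausted
    with_word = [q for q in queries if word in q.split()]
    without_word = [q for q in queries if word not in q.split()]
    clusters = split_cluster(with_word, used | {word}, threshold)
    if without_word:
        clusters += split_cluster(without_word, used, threshold)
    return clusters


# Hypothetical sample of search queries (not the Webmaster Tools data).
sample = [
    "cheap flights sydney",
    "cheap flights melbourne",
    "cheap hotels sydney",
    "blue widgets",
    "red widgets",
    "history of rome",
]

for words, members in split_cluster(sample):
    label = " + ".join(words) if words else "(long tail)"
    print(f"{label}: {members}")
```

Clusters with no defining words are the leftover long tail. Splitting on single words is crude; stemming, stop-word removal or phrase matching would change the groups considerably.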

Cluster Analysis

Cluster Analysis and Search Queries

As is demonstrated within the sample, there is still a significant dissimilar long tail. A few very niche groups were also identified in the sample. Ultimately, this data is not a true representation of user behaviour. Just because a number of different individuals found your site using the same small cluster does not automatically mean that they are after the same thing. More information is required to draw those conclusions. This is just a model. It can help guide your decisions, and it can indicate points of interest worth investigating. What it is not, is gospel.

Like every other year, 2009 has been a busy one for search, with a merger, a new algorithm looming from the market leader and a pile of new tools and so-called ‘Google killers’.

The biggest change to come out of 2010 is probably going to come from Personalised Search and a Google Labs project called Social Search.

Personalised Search has been discussed as far as possible with the information currently available. What was only available to those signed into their Google accounts has now been implemented for those who are not via cookies. Data is collected on searches made, and this information is used to determine exactly what the searcher actually means by “australian coach” or “ctr ppc social online marketing” for the mutual benefit of end user and advertiser.

Using previous searches to place subsequent queries in context will make keyword selection much more important and far more involved. It will also mean that any company that is able to frame the conversation around its product and industry can create a real competitive advantage, greater than what is currently possible. Where this will become more interesting is when or if Social Search becomes a part of Universal Search, and if it becomes a ranking factor for Personalised Search.

Currently the biggest limitation for Social Search is the scope of sources used to map out a social network. Drawing on Google properties alone is a real limitation, especially as there is a shift away from email and RSS as the main way to share links. I expect Social Search to incorporate other channels such as other social networks and additional tools such as goo.gl.

In short, get ready to pay more attention to either leveraging or creating social networks, and doing something worth talking about. Otherwise, you may find all the programmatic SEO in the world won’t keep you visible.

The relationship between Cost Per Click, Average Position and Click-Through Rate is very interesting. While the relationship between Average Position and Click-Through can be demonstrated without digging too deeply, looking at these factors with Cost Per Click as well requires more data.

Position CTR

The data discussed here was taken from a single keyword that maintained a consistent Quality Score. There were changes in Cost Per Click, Average Position and Click-Through. The data was taken from a few months of operation, and from just Google Search. The keyword experiences a high, stable level of activity, and did not experience any spikes of interest from advertising, PR or related news. Any shift in Click-Through, Cost Per Click and Average Position will probably only relate to changes in the other two variables.

Position CPC

The relationships between Click-Through and Position, Cost Per Click and position, and Click-Through and Cost Per Click exist, but do not appear to be very strong. Which of the three had the strongest relationship to the others was not clear either at this point. A quick look at correlation between all three variables showed the following:

              Avg CPC        Avg Position   Avg CTR
Avg CPC       1              -0.500039708   -0.492482923
Avg Position  -0.500039708   1              -0.143205517
Avg CTR       -0.492482923   -0.143205517   1

Correlation coefficient of CTR, CPC and Average Position

CPC CTR

For both Average Position and Click-Through, Cost Per Click has the strongest relationship, even though it is not very strong. These figures are not conclusive, but they do serve as a guide to the relationship between Cost Per Click, Click-Through and Average Position.
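For anyone wanting to run the same kind of check on their own keyword data, a correlation matrix of this sort can be produced with a few lines of code. The sketch below assumes the Pearson correlation coefficient, which may or may not be what was used for the figures above, and the weekly numbers are invented purely to make it runnable.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Build a nested dict of pairwise correlations between named series."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical weekly averages for one keyword (assumed, not real data).
keyword_data = {
    "Avg CPC": [1.20, 1.35, 1.10, 1.50, 1.25, 1.40],
    "Avg Position": [2.8, 2.4, 3.1, 2.0, 2.7, 2.3],
    "Avg CTR": [0.031, 0.035, 0.029, 0.038, 0.030, 0.036],
}

matrix = correlation_matrix(keyword_data)
for row_name, row in matrix.items():
    cells = "  ".join(f"{value:+.3f}" for value in row.values())
    print(f"{row_name:<13} {cells}")
```

A coefficient near zero only rules out a straight-line relationship, so it is still worth looking at the scatter plots before drawing conclusions.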
