getstats – promoting the understanding of statistics
Data is becoming more and more important in every sphere of society. This is underlined by companies like Google, whose mission is to organize the world’s information and make it universally accessible and useful. Major consulting firms are acknowledging data-driven decision making as an emerging global trend. This trend is not limited to the business world: we are increasingly exposed to statistics and data in our everyday lives.
The Royal Statistical Society is launching its 10-year campaign for statistical literacy on World Statistics Day: 20/10/2010. The vision for the campaign, known to its friends as getstats, is “a society in which our lives and choices are enriched by an understanding of statistics”. Please visit http://www.getstats.org.uk for more information and to show your support. As a company operating in a data-driven industry, we are proud to support this global initiative.
Drill down to search query level in Google Analytics
In our second Google Analytics post focusing on paid search, we illustrate the use of some very useful reporting tools in Google Analytics for gaining quick and actionable insights into your paid search campaigns. A useful three-tier strategy is to:
- Focus on the major performance shifts at a campaign-level
- Identify the keywords in the campaigns that are driving those shifts in performance
- Zero in on the actual search queries to look for opportunities
Let’s consider the above strategy using an example from a multi-product retail site.
Step 1 – Identify major shifts in Campaign-level performance
Below we show quarter-on-quarter shifts in campaign revenue for all paid search campaigns for the first 2 quarters of 2010, from the Campaigns Report under Traffic Sources. We include filters to focus on the higher traffic non-brand campaigns. Predominantly there has been strong growth in revenue across most campaigns. We can now drill down to a keyword-level to understand which keywords are driving the strong growth in some of the categories.
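The campaign-level step can be sketched in a few lines of Python once the report has been exported. This is only an illustration: the campaign names and revenue figures below are made up, and the helper name `quarter_on_quarter_change` is our own rather than anything provided by Google Analytics.

```python
# Hypothetical campaign revenue for two quarters (illustrative numbers only).
revenue = {
    "Campaign A": {"Q1": 12000.0, "Q2": 15000.0},
    "Campaign B": {"Q1": 8000.0, "Q2": 7200.0},
    "Campaign C": {"Q1": 5000.0, "Q2": 6500.0},
}


def quarter_on_quarter_change(q1, q2):
    """Relative change in revenue from the first to the second quarter."""
    return (q2 - q1) / q1


shifts = {
    name: quarter_on_quarter_change(r["Q1"], r["Q2"])
    for name, r in revenue.items()
}

# Rank campaigns by the size of the shift, largest first, so that the
# campaigns worth drilling into appear at the top.
ranked = sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True)
for name, change in ranked:
    print(f"{name}: {change:+.1%}")
```

The campaigns at the top of the ranking are the natural candidates for the keyword-level drill-down in the next step.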
Google Analytics insights for enhancing paid search performance
If you are not using web analytics to provide extra insights into your PPC campaigns, I suspect Louis Gossett Jnr is lurking somewhere in the background like in the Namibian brewer’s beer commercial, ready with his catch phrase: “What are you doing Quentin?”
The availability of free and robust web analytics tools such as Google Analytics removes any excuse for viewing your paid search campaigns in isolation. Paid search marketers often treat search campaigns as though they exist in a marketing vacuum. A tool like Google Analytics will quickly help you understand how search campaigns perform as an integrated part of the greater marketing effort that may include SEO, e-mail and above-the-line campaigns. Furthermore, Google Analytics will provide you with valuable behavioral data on the paid search traffic you are driving to your paid search landing pages.
Over the next while we will look at some simple but effective ideas for using Google Analytics to improve your paid search performance. Today we will start by looking at site search behavior.
Leverage the power of site search to improve the quality of your site links
In previous posts we showed how the inclusion of site links in paid search ad copy can improve the click-through rate on your high traffic terms. It will typically be the brand search terms that will display site links, although Google has relaxed this recently. You can use behavioral data from Google Analytics for suggesting effective site links.
Many customers will know your brand, and also that they can find their desired product on your site. Many of them will navigate to your site using a search engine. If they click on a PPC advertisement you will typically land them on the home page and they will navigate their way to their desired product page, with the amount of effort depending on the quality of your site. If you know what these visitors are typically looking for, why not give them the option to make the journey to that conversion shorter?
With very little effort you can enable site search tracking in Google Analytics and create a custom segment to focus on brand search traffic. The Search Terms Report for the above segment can then give you a quick overview of what most people that land on your site through a brand search typically search for on your site. In the example report below it is clear that many of them are looking for iPod products. This advertiser is also based in a geographical area that is currently experiencing winter, so heaters and dehumidifiers also seem to be in seasonal demand. After looking at the e-commerce tab for this report (not shown here), it is also clear that these products are converting quite well. By including site links to the relevant product pages you preempt visitor intent and shorten the conversion funnel, which is likely to improve conversion.
If Louis Gossett Jnr were selling paid search he would have said: “Always keep it real, use Google Analytics to enhance your paid search campaigns.”

Clicks2Customers and RingRevenue Honored with LinkShare award
Clicks2Customers and RingRevenue Honored with LinkShare’s Golden Link Award for Innovative Publisher of the Year
Pay-Per-Call Partners Recognized for Taking Search Marketing to the Next Level
CAPE TOWN (28 June 2010) - Last week at leading USA Performance Marketing Network LinkShare’s Symposium in New York, pay-per-call technology provider RingRevenue (www.RingRevenue.com) and its publishing partner Clicks2Customers (www.clicks2customers.com), the leading South African Search Engine Marketing agency, were honored with LinkShare’s Golden Link Award for Innovative Publisher of the Year. The award is given out annually to a publisher who makes use of new technology or innovative methods to drive higher sales and deliver great returns. Using LinkShare’s Pay-Per-Call program, powered by RingRevenue, Clicks2Customers was able to create an innovative performance marketing campaign focused on mobile search that delivered over 550 calls within the first two months of launching and is converting at over 30%.
“Pay-Per-Call is gaining a lot of traction in the USA and we have started rolling it out for many of our USA clients” said Michael Leeman, USA Country Manager at Clicks2Customers. “The USA search market tends to be at the cutting edge of global search innovation. We strive to introduce our clients to the critical new search developments that have the ability to significantly ramp up their online sales. Pay-Per-Call is a clear potential game changer due to the explosive growth of mobile search, whilst Facebook advertising is also currently receiving a lot of our attention.”
According to Jonathan Gluckman, Joint Managing Director of Clicks2Customers in Cape Town, “our achievements with Pay-Per-Call highlight our unique search offering to our global clients. The majority of our clients are in South Africa and Australasia, with targeted growth in South America, Africa and South East Asia. Our USA operation services select Top 100 online retailers and enables us to stay at the cutting edge of search in the leading USA market and then transfer this knowledge and insights seamlessly to our global clients. This ensures that they enjoy global best practice whilst still benefiting from our emerging ecommerce market understanding and competitive cost base.”
About Clicks2Customers
Clicks2Customers is a focused Paid Search specialist with a team of 46 including 16 Google advanced campaign managers, making it the largest qualified search team outside of the USA, as well as 5 Google Analytics specialists, proprietary technology, an excellent performance-based track record, strong pricing algorithms and a blue chip client list. Clicks2Customers is based in Cape Town and has satellite offices or representation in London, San Diego and Melbourne.
Contact Details
Jonathan Gluckman
Jonathan.g@clicks2customers.com
+27 21 442 5040
Become a better keyword vintner
A key objective in PPC is to grow revenue within efficiency targets. This can be achieved through good bid management on the existing keyword set, as well as through the addition of new keywords. It is easy to add thousands of keywords using product feeds or a machine generation method. These keyword sets often add very little long-term value to a PPC campaign. Quantity is no substitute for quality. It is important to track the quality of the keywords you add to a PPC campaign over time, and possibly by the source of the keyword, e.g. product feed, keyword research, search query mining etc.
Some questions you may want to answer include:
- What is the average additional revenue we generate per keyword added (and is it within efficiency targets)?
- Is our initial evaluation period for new keywords sufficient, or are we potentially missing opportunities?
- What is the best source for new keywords?
- Is the quality of new keywords improving over time?
The answers to the above questions are not always apparent from traditional keyword performance reports. Keyword portfolios have already been compared to investment portfolios when it comes to keyword bidding. We use a similar analogy which compares keyword portfolios to loan portfolios. This allows us to apply the idea of a vintage curve for a loan portfolio to keyword portfolios. A loan vintage is a group of accounts that originated in a given time period. For example, all new customers from 2009 make up the 2009 vintage. Vintage groupings can be monthly, quarterly, or annual, depending on the application or data available. Different groups of customers from the same vintage can also be compared to assess their relative quality (e.g. walk-in customers versus directly solicited pre-selected customers). A performance metric such as the proportion of accounts more than 90 days in arrears is often chosen to compare different loan vintages by time on book. If we make the analogy that new keywords represent new customers, we can extend the idea of loan vintages to that of keyword vintages. Similar to loans, keywords are added at different times and can originate from different sources (e.g. product feeds or a break-down of search query reports). We can track performance metrics by “keyword age”; these may include the percentage of new keywords with a click per impression, or the revenue per new keyword after a certain time period.
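As a rough illustration of the keyword vintage idea, the sketch below groups keywords by the month in which they were added and computes revenue per originally added keyword at each keyword age. The record layout, the sample numbers and the function name `vintage_curve` are assumptions made for this example, not part of any reporting tool.

```python
from collections import defaultdict

# Hypothetical records: (keyword, month added, month of revenue, revenue).
# Months are numbered consecutively so that age = revenue month - month added.
# Every added keyword is assumed to appear at least once, even with zero revenue.
records = [
    ("kw1", 4, 4, 10.0), ("kw1", 4, 5, 6.0), ("kw1", 4, 6, 4.0),
    ("kw2", 4, 4, 2.0), ("kw2", 4, 5, 2.0),
    ("kw3", 5, 5, 8.0), ("kw3", 5, 6, 4.0),
]


def vintage_curve(records):
    """Return {vintage: {keyword age: revenue per keyword originally added}}."""
    keywords = defaultdict(set)    # vintage -> keywords added in that month
    revenue = defaultdict(float)   # (vintage, age) -> total revenue
    for keyword, added, month, amount in records:
        keywords[added].add(keyword)
        revenue[(added, month - added)] += amount

    curve = defaultdict(dict)
    for (vintage, age), total in sorted(revenue.items()):
        # Divide by all keywords in the vintage, not only those that earned
        # revenue in that month, so the curve reflects the original cohort.
        curve[vintage][age] = total / len(keywords[vintage])
    return dict(curve)


curves = vintage_curve(records)
for vintage, by_age in curves.items():
    print(f"vintage {vintage}: {by_age}")
```

Adding a source field (product feed, keyword research, search query mining) to the grouping key would allow vintages from different sources to be compared in the same way.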
Below we will illustrate one application of a vintage analysis on real campaign data to gain some insight into new keywords. Many more applications are also possible. In figure 1 we consider 8,467 new keywords added to a PPC campaign from April to August 2009. Suppose we start a large set of new keywords on a fairly aggressive initial bid in order to gather performance data with which to optimize keyword bids. We would expect that a large proportion of them would be bid down over time in order to achieve an acceptable or agreed efficiency target, until we reach a stable overall bid level for these keywords as a group. When we study a larger set of campaign data across various advertisers, this seems to happen on average after about 3 months. A similar trend is apparent in figure 1 for our example campaign. It suggests that we achieved an average monthly long-term revenue of about $4 per original keyword added. It is clear how new keywords were bid down over time as performance was optimized.
Figure 1: Revenue per new keyword added versus Average Bid
There is an important variable we did not consider in the above view, and that is the efficiency metric for the new keywords over time. In the case of the above campaign the efficiency metric is cost of sales (i.e. PPC cost/sales). A target of around 8% (allowing for statistical noise) on non-brand terms is an acceptable benchmark for this campaign. In figure 2 below we show the average revenue per keyword (as shown in figure 1) against the cost of sales by keyword age. Despite consistently decreasing bids on new keywords over time, the overall efficiency varied around the 8% cost of sales benchmark throughout. This suggests that we potentially did not give new keywords enough time to prove themselves before decreasing bids on them. If we had kept the average bid higher for a longer period, we could potentially have achieved a higher average long-term revenue than $4 per keyword added, within the efficiency target. The keyword bidding in month 8 seems to try to address this by raising the average bid. Hopefully we have not priced too many of the original keywords out of contention by this stage.
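The efficiency check described above amounts to a simple ratio per keyword age. The sketch below uses invented monthly totals, and the size of the noise allowance is an assumption of ours; only the 8% benchmark comes from the campaign discussed in the text.

```python
TARGET_COST_OF_SALES = 0.08   # benchmark quoted in the text
NOISE_TOLERANCE = 0.005       # assumed allowance for statistical noise

# Hypothetical totals for one keyword vintage:
# (keyword age in months, PPC cost, sales)
monthly_totals = [
    (1, 900.0, 10000.0),
    (2, 640.0, 8000.0),
    (3, 420.0, 6000.0),
]


def cost_of_sales(ppc_cost, sales):
    """Efficiency metric used in the text: PPC cost divided by sales."""
    return ppc_cost / sales


report = []
for age, ppc_cost, sales in monthly_totals:
    ratio = cost_of_sales(ppc_cost, sales)
    within_target = ratio <= TARGET_COST_OF_SALES + NOISE_TOLERANCE
    report.append((age, ratio, within_target))
    print(f"month {age}: cost of sales {ratio:.1%}, within target: {within_target}")
```

If the ratio stays at or below the target while bids keep falling, as in the campaign above, that is a hint that bids may have been lowered earlier than necessary.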
Figure 2: Revenue per new keyword versus Efficiency
Similar to the above curves you can track the average revenue that you gain from each new keyword added for different cohorts of keywords. This often decreases over time if your inventory remains fairly stable over time, as the most relevant keywords are often selected upfront if your initial campaign setup and research processes are well developed. Treating your keywords as ‘vintages’ can provide useful insight about the quality and performance of your new keywords over time. It also gives you an approximation of the average revenue you can expect per keyword added and at what efficiency. I hope 2010 will be a good vintage for your PPC keywords!
The paid search keyword tail
There is no doubt that the keyword tail is important, provided it is effectively managed in a resource-efficient manner, for example by using a bid management system. The difficulty in assessing just how much the tail contributes lies in the exact definition of the tail itself. The contribution of the tail to total campaign revenue is quite sensitive to this definition. There are numerous definitions of the keyword tail and head out there. A good practical view of the keyword head is to consider it as the number of keywords an analyst can manage manually with reasonable ease. Head sizes of 250 to 500 keywords seem to be popular choices. Other definitions specify fairly arbitrary click thresholds. In this post we will consider a visual way of assessing the size of your campaign’s tail. It is also important to assess whether your keyword tail meets your campaign’s overall efficiency targets. The tail can add great value to your campaign, but it needs to be managed effectively, just like the rest of your campaign.
A useful way to view the tail is to sort all keywords by their traffic contribution over a specified period. This enables us to divide keywords into different segments based on their contribution to total traffic, and then to compute the contribution of each of these keyword segments to total revenue and their relative efficiency. The choice of segments needs some care due to the discrete nature of keyword traffic for lower volume keywords. Figure 1 below shows how this data can be visually represented to give you an overview of an entire campaign. The data represents the performance of all non-brand keywords for an online retailer over the last 3 months of 2009. The blue bar represents 3,818 keywords (23% of the 16,601 keywords with an impression over the period). If we consider this group of keywords as the head keywords, it will provide a fairly conservative estimate of the tail. In this definition the tail keywords still contribute about 24% of total revenue at an acceptable efficiency given the campaign’s efficiency targets.
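The segmentation described above can be sketched as follows. The keyword data is invented, and the 80% traffic cut-off for the head is an arbitrary assumption chosen for illustration; as noted above, the result is sensitive to where that line is drawn.

```python
# Hypothetical keyword data: (keyword, clicks, revenue, cost)
keywords = [
    ("a", 500, 2000.0, 200.0),
    ("b", 250, 1500.0, 120.0),
    ("c", 120, 600.0, 42.0),
    ("d", 70, 400.0, 30.0),
    ("e", 40, 300.0, 21.0),
    ("f", 20, 200.0, 15.0),
]


def split_head_tail(keywords, head_traffic_share):
    """Sort by clicks and assign keywords to the head until the head
    covers the requested share of total traffic; the rest is the tail."""
    ordered = sorted(keywords, key=lambda k: k[1], reverse=True)
    threshold = head_traffic_share * sum(k[1] for k in ordered)
    head, tail, running_clicks = [], [], 0
    for kw in ordered:
        (head if running_clicks < threshold else tail).append(kw)
        running_clicks += kw[1]
    return head, tail


def summarize(segment, total_revenue):
    """Revenue contribution and cost of sales for one keyword segment."""
    revenue = sum(k[2] for k in segment)
    cost = sum(k[3] for k in segment)
    return {
        "keywords": len(segment),
        "revenue_share": revenue / total_revenue,
        "cost_of_sales": cost / revenue,
    }


total_revenue = sum(k[2] for k in keywords)
head, tail = split_head_tail(keywords, head_traffic_share=0.8)
head_summary = summarize(head, total_revenue)
tail_summary = summarize(tail, total_revenue)
print("head:", head_summary)
print("tail:", tail_summary)
```

Running the split at several cut-offs shows how much the tail's revenue share depends on the definition chosen.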
Measuring Site Link Performance: Volume I – Click-through rate
In a recent post Tom Van den Berckt discussed site links as a new feature on Google Adwords. Google’s new ad formats generated quite a bit of discussion at SMX West 2010. They include site links, product ads, local ads and comparison ads. Site links have been around the longest and the first performance data is starting to surface. At the recent SMX West conference, Google reported an average 30% to 40% increase in the click-through rate (CTR) for ad copy displaying site links. At the same conference, Clicks2Customers reported a similar increase in CTR based on our own experience across multiple clients. The general consensus was that site links have a positive impact on click-through rates. Google hinted at the expansion of site links to a wider set of keywords and also a potential increase in the maximum of 4 site links that are currently being displayed. Currently, site links only appear on the ads with the highest AdRank, which are typically those linked to brand terms.
At present, it is not possible to track site link performance directly using AdWords. We track site link performance through our own campaign tracking system, by assigning a unique identifier to each site link within an ad group. This enables us to distinguish clicks that occur via the main ad copy link from those that occur via each of the site links for each ad group. We have been running site links on several client campaigns since November 2009 and outline our initial statistical findings on site link performance below.
We identify the earliest day that site link clicks started appearing on an ad group and then compare the performance of that ad group before and after the start of site links. Below we focus on the highest volume brand terms that have been displaying site links since November 2009. Before we consider changes in CTR, we measured what proportion of ad copy clicks occurred via site links rather than via the main ad copy link. In figure 1 it is shown that a relatively small proportion of overall clicks occur on the actual site links, across all the ad groups that displayed site links. It is important to note that site links do not always trigger and we are not able to tell how often they did trigger at this stage. This means that the true percentage of clicks that site links attract will be understated in figure 1.
Figure 1: Percentage of total ad group clicks attributable to site links for a range of advertisers
Next we consider click-through rate. Our aim is to establish if there has been a significant change in click-through rate after the introduction of site links. For illustrative purposes, we focus on a high traffic brand ad group for one of the above advertisers (Advertiser A).
Visual inspection
In any statistical analysis a visual inspection of the data is a good place to start, before embarking on more formal inference. In figure 2 we present our data graphically. The two plots at the top are two different ways to show the distribution of daily click-through rate before and after the introduction of site links. There is a clear upward shift in the distribution of click-through rate after the introduction of site links. The density plots show an estimate of the actual densities and are just a different view of what the boxplots present; they also clearly show that click-through rates have shifted upward since the introduction of site links. The bottom left figure shows the mean CTR before and after site links with a 95% confidence interval in each case. This plot suggests the difference in CTR is significant given the variability in the data. However, it is also apparent that there has been a slight shift in average position after the introduction of site links, which should be taken into account in any inference. It is important to separate any effect on CTR caused by position from that caused by the introduction of site links.
Figure 2: Visualization of click-through rate distributions before and after the introduction of site links on a branded ad group.
The box-and-whisker plots (or boxplots) in the top left plot are much underrated graphical tools and warrant a brief explanation. In figure 3 we show that the black line that divides the “box” represents the median of the data. The lower and upper edges of the box are the lower and upper quartiles of the data. This means that 50% of the observations fall within the range of the box, while 25% fall below the lower quartile (Q1) and 25% fall above the upper quartile (Q3). The whiskers mark the values which are 1.5 * IQR from the upper and lower quartiles. The IQR is the interquartile range: the distance between Q1 and Q3. Observations that fall more than 1.5 * IQR or more than 3 * IQR beyond the quartiles are considered mild and extreme outliers, respectively.

Figure 3: Explanation of boxplots
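The quantities behind a boxplot are easy to compute by hand. The sketch below uses made-up daily CTR values (in percent) and Python's standard `statistics.quantiles` function; different software packages use slightly different quartile conventions, so the exact values can differ a little from those drawn by a plotting tool.

```python
import statistics

# Hypothetical daily click-through rates in percent (illustrative only).
daily_ctr = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 9.5]

# Quartiles using the "inclusive" method, which treats the data as the
# whole population of interest rather than a sample from a larger one.
q1, median, q3 = statistics.quantiles(daily_ctr, n=4, method="inclusive")
iqr = q3 - q1

# Fences at 1.5 * IQR and 3 * IQR beyond the quartiles.
mild_low, mild_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
extreme_low, extreme_high = q1 - 3 * iqr, q3 + 3 * iqr

extreme = [x for x in daily_ctr if x < extreme_low or x > extreme_high]
mild = [
    x for x in daily_ctr
    if (x < mild_low or x > mild_high) and x not in extreme
]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"mild outliers: {mild}, extreme outliers: {extreme}")
```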
Statistical Inference
The visual inspection of the data in figure 2 suggests that there has been an increase in click-through rate since the introduction of site links. We will now consider how we can perform more formal statistical inference on the data, in order to establish whether the observed differences are statistically significant given the variability in the data.
A simple and reasonable approach would be to perform a t-test. This test can easily be performed in any statistical software or Excel. It compares the observed CTR after the introduction of site links to the value you believe the CTR to have centered on before the introduction of site links. It produces a p-value: the probability of observing a difference at least as large as the one seen if there had in fact been no change. A small p-value (typically less than 0.05) implies a significant difference. When we apply this test to the data represented in figure 2 we obtain a p-value of 0.00001, suggesting that there has been a significant increase in CTR after the introduction of site links.
There are two shortcomings to this approach:
- Binomial response: strictly speaking one should use a Chi-square test to conduct inference about the mean of a binomial variable, as the standard t-test assumes a normal population; this is particularly important when the sample size is relatively small;
- Position: If there has been any shift in average position before and after the introduction of site links, this needs to be accounted for in the inference. As we know, there is a very clear correlation between position and CTR, with a higher position resulting in a higher CTR. In order to get the most accurate measure of the direct impact of site links on CTR, we need to account for any position effects. This can be achieved through some more sophisticated statistical modeling.
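To illustrate the first point, a Chi-square test on click counts before and after can be sketched with nothing more than the standard library. The click and impression counts are invented, the function name `chi_square_2x2` is ours, and this is the plain Pearson statistic without a continuity correction. It does not address the second point: any shift in position is ignored.

```python
import math


def chi_square_2x2(clicks_before, impressions_before, clicks_after, impressions_after):
    """Pearson Chi-square test for a difference between two click-through
    rates, returning the test statistic and the p-value (1 degree of freedom)."""
    observed = [
        [clicks_before, impressions_before - clicks_before],
        [clicks_after, impressions_after - clicks_after],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom the Chi-square upper tail probability
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: 10% CTR before, 13% CTR after site links.
statistic, p_value = chi_square_2x2(1000, 10000, 1300, 10000)
print(f"chi-square = {statistic:.2f}, p-value = {p_value:.2g}")
```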
If you are satisfied with basic statistics you should skip to the discussion. However, if you want to take your campaign measurement to a new level and/or you enjoy some more sophisticated statistics, this is for you. In an earlier blog we introduced logistic regression as a very useful statistical modeling tool for modeling binomial responses as a function of other variables. Once again it will prove to be useful here.
Logistic regression is part of a wider group of statistical models called generalized linear models or GLMs. By using logistic regression, we can model the CTR as a function of any number of other explanatory variables. In our case these are average position and an indicator variable that shows if the observation was taken before or after the introduction of site links:
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 \cdot \text{position} + \beta_2 \cdot \text{sitelinks}$$
where $p$ is the click-through rate and $p/(1-p)$ is defined as the odds of a click-through, so we are actually modeling the log-odds of the CTR (for theoretical reasons we will not go into here). In our model both the position and site link variables are deemed to be significant. The output of a logistic regression model is typically interpreted in terms of odds ratios. The model parameters can be used to compute the odds of a click-through for different values of an explanatory variable, all else being equal. The odds ratios with their corresponding confidence intervals can then be used to conduct inference about the modeled data. For example, we can compute the odds ratio before and after the introduction of site links, at any given position, as
$$\text{OR} = \frac{\text{odds}(\text{sitelinks} = 1)}{\text{odds}(\text{sitelinks} = 0)} = e^{\beta_2} = 1.33$$
where $\beta_2$ is the regression parameter for the site link indicator variable. The odds ratio of 1.33 implies that the odds of a click-through have increased by 33% after the introduction of site links in any given position. It is always important to compute a confidence interval for the odds ratio in order to assess the significance of the change in the context of the data. A 95% confidence interval for the above odds ratio is [1.319, 1.335]. The relatively narrow confidence interval suggests that we have high confidence that there is a real increase in the odds of a click of about 33%. We can also visualize our fitted model by re-expressing the model above in terms of the CTR and plotting it over a range of positions. This visualization is presented in figure 4.
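The interpretation steps above can be sketched numerically. Only the odds ratio of about 1.33 is taken from the text; the intercept, the position coefficient and the standard error below are invented so that the example runs, and the resulting interval is not meant to reproduce the one reported above.

```python
import math

# Assumed coefficients for illustration. Only the site link odds ratio of
# about 1.33 comes from the text; the other values are hypothetical.
INTERCEPT = -1.5
BETA_POSITION = -0.6
BETA_SITELINKS = math.log(1.33)
SE_SITELINKS = 0.003


def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Odds ratio and approximate 95% confidence interval for a coefficient."""
    return (
        math.exp(beta),
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )


def predicted_ctr(position, sitelinks):
    """Re-express the log-odds model as a click-through rate."""
    log_odds = INTERCEPT + BETA_POSITION * position + BETA_SITELINKS * sitelinks
    return 1.0 / (1.0 + math.exp(-log_odds))


odds_ratio, lower, upper = odds_ratio_with_ci(BETA_SITELINKS, SE_SITELINKS)
print(f"odds ratio {odds_ratio:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")

# Predicted CTR with and without site links over a range of positions.
for position in (1, 2, 3):
    without = predicted_ctr(position, sitelinks=0)
    with_links = predicted_ctr(position, sitelinks=1)
    print(f"position {position}: {without:.1%} without, {with_links:.1%} with site links")
```

Note that a 33% increase in the odds translates into a somewhat smaller relative increase in the CTR itself, and the gap narrows further as the baseline CTR rises.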
Discussion
In this post we showed how we can make use of more sophisticated statistical tools to evaluate the performance of site links. We conclude that site links do increase the click-through rate, with the odds of a click on an advertisement with site links being more than 30% higher than without site links. In figure 1 we show that the overall percentage of clicks that happen through site links is relatively small and not enough in itself to drive the observed increase in click-through rate. Our conclusion is that the inclusion of site links increases the actual size and visibility of the ad copy, while pushing organic results further down the page. This seems to drive more clicks to the main ad copy link, thereby increasing the overall click-through rate. This agrees well with some of our initial observations about the introduction of site links, and it is something we will continue to research. Our CTR results agree quite closely with the preliminary data presented by Google at the recent SMX West conference in Santa Clara. It seems like a good idea to start experimenting with site links on your top keywords.
Beyond an increase in CTR, it would also be of interest to see whether site links can be used to affect the conversion behavior of advertisements. In volume II of this post we will present some initial data on this aspect of site link behavior, as well as data on testing different types of site links.
All the graphics and inference in this post were produced using R (http://www.r-project.org), a leading free software environment for statistical computing and graphics that we use internally for statistical modeling.
Quality score dynamics can vary between different advertisers and PPC markets
In an earlier post, it was shown that the relationship between CTR and position can vary for different advertisers, especially in the higher positions. These differences can potentially be explained by differences in brand traffic, or they may be real differences in click-through related to the dynamics of geographical PPC markets or the nature of the advertiser’s business. It is interesting to note that the CTR in the higher positions seems to be considerably better for advertisers in less mature online markets, namely Australia and South Africa. We investigated this further by focusing only on non-brand keywords. To achieve this we included a factor indicating the brand or non-brand status of keywords in our model. Figure 1 compares the four advertisers on all non-brand keywords with a QS of 7. Even after non-brand differentiation we continue to observe substantially higher CTR in the top positions for Australian and South African advertisers compared to their UK and US counterparts used in the same model. This could reflect a lower level of competition relative to more developed markets, resulting in a smaller number of competitive ads on a search results page, which in turn drives higher click-through in top positions. From this it is also clear that there are no fixed CTR thresholds for determining quality score; rather, it is determined by relative CTR, which varies across advertisers and potentially also geographical regions.
Figure 1: CTR by position across four advertisers for all non-brand keywords with a quality score of 7.
Effects of Site Links on Adwords Campaigns
One of the recent new features on Google Adwords has been the ability to add site links to certain campaigns. These campaigns are the ones that Google deems eligible to display additional links in the Adwords advert, similar to what they’ve been doing for quite a while on the organic search results (see fig. 1). On the SERPs, these additional organic links provide the user with a quick way to navigate to subsections of a website without having to go to the homepage first, effectively saving the user the effort of an extra click.
Fig. 1: Site links on organic results
So at first sight, it seems like a good idea for Google to implement the same feature on Sponsored Links because it adds value and relevance to the user. But could there be other motives behind Google’s action?
The first thing to point out is that in reality, Google (currently) only seems to display site links on ads with a very high Adrank, which basically means ads linked to brand (or trademark) keywords. Now there has been a lively debate in the search marketing world over whether or not one should bid on one’s own trademark. After all, why would you spend money on clicks from users who would find you in the (free) organic results anyway? That is a valid argument but unfortunately it is also a bit naïve. As Fig. 2 shows, if you don’t bid on your brand, someone else eventually will and, no matter how good your organic rankings, you will lose clients to your competitors. Because the truth is, Google does not make any money from organic results and if possible they will try to put a sponsored link above the organic results. Your competitor doesn’t even have to bid on your brand: Adwords’ Expanded Broad Matching might automatically match your competitor’s brand to yours. (Example: Google will match the broad match keyword ‘Avis’ to a user searching for ‘Hertz’, i.e. without Avis actively bidding on the Hertz keyword.) Have a proper look at your search query reports and you will be surprised at how frequently this occurs.
Fig. 2: Competitors reducing the effect of organic site links
With the site links feature, Google provides a trademark owner the ability to appear more prominently on the SERPs but only if they choose to bid on their trademarked keywords. If we look at fig. 3 it is obvious that site links make a sponsored link appear more prominent for a trademark owner, even as the organic results are being pushed lower down the page.
Fig. 3
This post is based on qualitative observations of a popular brand that does not belong to (and is not related to) our client portfolio, but we have quantitative data to back up our theories (more on that in a next post). Unlike in the organic results, site links are not only intended to increase user relevance, but also to increase Google’s revenue by making the competition for prime SERP advertising space more fierce. Organic (free) listings are being pushed lower down the page, forcing lower ranking websites to start paying for more prominent sponsored listings. For trademark owners site links provide a competitive advantage, but at the bottom of the ladder we suspect that competition and click prices will increase.
In a next post, we’ll take a more detailed look at the impact of site links on a campaign’s performance, based upon our experience of the last few months.
The relationship between quality score and click-through rate by position
The factors that affect quality score are hotly debated in many blogs and online forums. This motivated us to look for answers in our own data. From what we see in our own data, we have little doubt that click-through rate (CTR) is the most important factor in determining quality score (QS). When studying the relationship between QS and CTR it is important to take into account the effect of position, which is often overlooked in other analyses. Clicks2Customers works with clients in a wide variety of industries and PPC markets around the world, which allowed us to compare the above dynamics across four clients from PPC markets in different geographical regions.
Logistic regression is a useful and appropriate statistical tool for studying the relationship between a binomial variable or rate, such as CTR, and other factors (in our case quality score and position). We selected 4 clients representing a range of industries and geographical locations and modeled the observed CTR as a function of QS and position. It is important to note that QS does not affect CTR (in fact the causality runs the other way); however, the regression model is a useful descriptive tool for highlighting the underlying relationship between the above factors.
In figure 1 we plotted the above relationship model for the four selected clients. Similar to the analysis referred to above, it was observed that quality scores of 8 and 9 are rare, therefore the focus was on quality scores of 6, 7 and 10 where there is more data to model. Furthermore, we restricted our analysis to the top positions. The relationship between CTR and both QS and position proved to be significant, though perhaps predictable. Clearly, a higher quality score implies a higher CTR at a given position. It is clear that the QS for a keyword is determined by the CTR relative to the position the keyword is at. For example, let us consider the data for the large US retailer in figure 1. Keywords in position 1 with a QS of 10 have a CTR of around 10%, while a QS of 10 corresponds to a CTR of around 5% in position 3. If you are concerned about how position affects the use of CTR in determining QS, you need not be, as it seems that Google has thought it through.






