Not So Smart
Google’s Smart Keyword Evaluation Tool – what exactly are the advantages, to the advertiser, to Google, to the consumer? It’s supposed to disable terms that are underperforming, so that they are taken out of the rankings to improve the user experience. I’m simply not convinced it’s working at all as intended. I don’t even think anyone at Google really understands how their own tool works.
We recently had a client who was offering an online broadcast of March Madness college basketball games. It was approved by the NCAA, and, in fact, the URL was on their site. Can’t get much more relevant. This was an unusual PPC campaign, as there was a very narrow window of consumer interest. We started off with a test campaign using a core of college basketball and March Madness related terms, and then expanded to specific team and school names as the brackets were announced.
Traffic was slow on the test campaign… and some of the key terms got disabled. Quickly. Terms like “march madness basketball” – which were getting an acceptable clickthrough rate according to Google’s own definition. Look in the FAQ, and it says 0.5% for first position. Part of the problem may be that the numbers you see in the client center and the numbers Google uses to determine a keyword’s status come from different pools of data. You are looking at the total, including performance on partner sites; Google is using Google-only search numbers. If they are able to separate these numbers for their own use, and they are making decisions that affect your campaign, why can’t the data be displayed separately for the advertiser as well? And how does it help Google, the advertiser, or the user if terms that are performing well on partner sites are no longer allowed to be displayed on those sites?
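To make the data-pool problem concrete, here is a toy sketch with entirely made-up numbers (the function and figures are hypothetical, not Google’s actual calculation): a keyword can look perfectly healthy in the blended stats the advertiser sees, yet fall below the 0.5% threshold when only Google.com traffic is counted.

```python
def ctr(clicks, impressions):
    """Clickthrough rate as a percentage."""
    return 100.0 * clicks / impressions

# Hypothetical stats for one keyword:
google_clicks, google_impressions = 4, 1000      # Google.com search only
partner_clicks, partner_impressions = 16, 1000   # partner sites

# What the advertiser sees in the client center: the blended total.
blended = ctr(google_clicks + partner_clicks,
              google_impressions + partner_impressions)

# What the evaluation reportedly uses: Google-only numbers.
google_only = ctr(google_clicks, google_impressions)

THRESHOLD = 0.5  # the FAQ's stated minimum CTR for first position

print(f"Blended CTR (what you see):  {blended:.2f}%")     # 1.00%
print(f"Google-only CTR (evaluated): {google_only:.2f}%")  # 0.40%
print("Passes in the client center:", blended >= THRESHOLD)    # True
print("Fails the evaluation:", google_only < THRESHOLD)        # True
```

The same keyword is simultaneously “fine” and “underperforming,” depending on which pool of data you look at – which is exactly why showing advertisers only the blended number is so misleading.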
There was a huge spike in March Madness-related search terms once the games got underway; pre-season, there was very little. So, by starting the campaign early, we were penalized for an early lack of interest and were not allowed to use terms which were highly relevant – and which were drawing far more traffic than when the campaign began. The “smart” tool is not smart enough to factor in seasonal differences.
Then, as we added terms to the campaign, we encountered another problem. Terms were put on hold or even disabled – with no history whatsoever. How can any algorithm determine that any term will not perform well for any given client if it is never given a chance to be displayed?
During this campaign, I had lunch with a few people I work with at Google, and asked for an explanation. I was told it may be because of the performance of “similar” terms in the campaign, like plural versus singular. The problem was, most of the new terms had no similar terms with enough history for such an evaluation. The Optimizer I spoke with seemed just as baffled and frustrated: accounts she uploaded new terms for often ended up with several ‘on hold’ or disabled from the start. She said, vaguely (and I have heard this elsewhere, in just as vague terms), that the number of terms on hold had to do with the total number in the account: if you upload too many new terms at once, several of them may be inactive. This also makes no sense. Why can I start a campaign with 2000 terms and be fine, but if I start with 200 and add 1000, several are inactive?
The standard advice is to choose terms which are more specific. This, however, does not explain how a term like “gonzaga bulldogs college basketball game” – which is very specific – would be disabled as soon as it’s uploaded. Apparently, the Smart Keyword Evaluation Tool bases some of its decisions on the history of a term across accounts. So, if several other advertisers have not had success with a term, it might be disabled from day one when you upload it. This does not seem to make much sense. There is no way that an algorithm can evaluate relevance without history for the particular client that is advertising… and if a term is disabled for poor performance, a change in ad copy should allow that term another chance.
I think this is a tool created as an attempt to shortcut the time necessary for human review of relevance. It is, however, shortsighted and flawed. It does not take conversions into account – a term with a low clickthrough rate but high conversions would seem like a good one. It does not account for differences between advertisers using the same keywords. It does not account for seasonal spikes in a term’s relevance. The only way to tell whether a term is relevant for the user is either to review it manually (with editors, as Overture does) or – as has been Google’s strength in the past – to let the users decide.
And so, I say to you, oh Google-powers-that-be, let the keywords run! Let the people decide!