Google 'Instant Previews' hit Google Analytics with fake traffic

Real-time page fetch

Update: This story has been updated with comment from Google, and it has been clarified to show that Google did tell webmasters when "Instant Previews" launched that it would be doing real-time fetches in some cases. We've also added stats concerning the fetches from The Register's site logs.

Google's new "Instant Previews" search tool is skewing traffic stats for sites using Google Analytics, creating page views before pages are actually viewed.

Rolled out across Google's search engine earlier this month, Instant Previews lets searchers, yes, preview sites before they visit them. Users click on a small icon that appears beside a search result, and this launches an image of the site in question on the right-hand side of Google's results page.

As Google pointed out when "Instant Previews" was launched, Google is – in some cases – fetching these previews in real time. Soon after the tool's launch, webmasters posting to Google's help forums noticed that these real-time fetches were skewing Google Analytics numbers. And as noticed by Search Engine Land, a Google employee later confirmed this with a post of his own.

The employee confirms that these real-time fetches are executing JavaScript used by Google Analytics, the company's own web analytics tool, and this is skewing traffic numbers. But he indicates that a fix is on the way. "We're working on a solution for this, to prevent Google Instant Preview on-demand fetches from executing Analytics JavaScript," the Google employee says. "I'm not sure about the timeframe, but I'll drop a note here when I have more to share. Thanks for your patience."

This same employee goes on to reiterate that the preview fetches use their own user agent, so webmasters can filter them out if they're using other analytics methods.

"It is my understanding that these page-views are currently only counted (the Google Analytics JavaScript executed) when we render the preview image on-demand (when a user chooses to view it and when we don't have one cached already)," he says. "Because of that, you may see a temporary change for that particular user-agent. The Analytics and Instant Previews teams are aware of this and looking into a solution.

"If you are using other website metrics tracking solutions, it might make sense to also filter that user-agent out."

The company has now posted a FAQ that details the user agent in question:

Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
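
For webmasters using their own metrics tools, filtering on this user agent might look something like the Python sketch below. It is only an illustration: the function names and the dictionary-based hit records are hypothetical, and the check assumes the "Google Web Preview" token from the string above is what identifies these fetches.

```python
# Token taken from the user agent string Google published in its FAQ.
PREVIEW_TOKEN = "Google Web Preview"


def is_instant_preview(user_agent):
    """Return True if the user agent looks like an Instant Previews fetch."""
    # Case-insensitive substring match; None or empty agents never match.
    return PREVIEW_TOKEN.lower() in (user_agent or "").lower()


def drop_preview_hits(hits):
    """Return only the hits that did not come from the preview fetcher.

    `hits` is assumed to be a list of dicts with a "user_agent" key,
    a hypothetical record shape used here purely for illustration.
    """
    return [h for h in hits if not is_instant_preview(h.get("user_agent"))]


if __name__ == "__main__":
    sample = [
        {"path": "/", "user_agent": "Mozilla/5.0 (en-us) AppleWebKit/525.13 "
         "(KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13"},
        {"path": "/", "user_agent": "Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"},
    ]
    print(len(drop_preview_hits(sample)))
```

Matching on the token rather than the full string is a design choice: the version numbers around it could change, while the token is the part Google itself names.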

Asked to comment, Google told us: "Webmasters have the ability to control whether Instant Previews are counted as page views. This works in the same way they control how crawls by regular Googlebot count as page views. Instant Previews sometimes gets enough information from Google’s regular crawl. Occasionally, Google will need to refetch this information when the user needs it, and in these situations we will do so using the 'Google Web Preview' useragent. Webmasters can configure their sites to treat this useragent in the same way that they handle crawls by googlebot."

The FAQ page explains that Google fetches previews in real time when it lacks a cached copy of the page previously collected by its crawl bots. "We mostly generate preview images based on content we’ve crawled with Googlebot," it says. "When we don’t have a cached preview image (which primarily happens when we can’t fetch the contents of important resources), we may choose to create a preview image on-the-fly based on a user’s request."

The company also says that because the preview fetches use a separate user agent, the previews may include data that webmasters have blocked the crawl bots from collecting. "As on-the-fly rendering is only done based on a user request (when a user activates previews), it’s possible that it will include embedded content which may be blocked from Googlebot using a robots.txt file."

The Google Analytics situation is reminiscent of the AVG Linkscanner, which started spewing fake traffic across the net in early- to mid-2008. In late February of that year, AVG paired its anti-virus engine with a real-time malware scanner that would vet search results before users clicked on them. If you searched Google, for instance, it would automatically visit each address that turned up on Google's results page.

According to the company, more than 20 million people had downloaded the new AVG 8 by late June 2008, and this caused a huge uptick in traffic on sites across the web. Under pressure from webmasters, the company soon disabled its real-time scanning.

But judging from site logs at The Register, the number of real-time preview fetches from Google is relatively small. Over the past 24 hours, we've had 1244 page requests for the user agent in question from a mere 60 unique IPs. ®
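
A tally like the one above can be pulled from a web server access log along these lines. This is a rough sketch, not the method The Register used: it assumes logs in the common "combined" format (client IP first, user agent as the last quoted field), and the function name is made up for illustration.

```python
import re

# Assumed "combined" access log layout:
# ip ident user [time] "request" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

PREVIEW_TOKEN = "Google Web Preview"


def preview_stats(lines):
    """Return (request_count, unique_ip_count) for preview fetches."""
    requests = 0
    ips = set()
    for line in lines:
        match = LOG_LINE.match(line)
        # Skip lines that do not parse or that come from other agents.
        if match and PREVIEW_TOKEN in match.group("ua"):
            requests += 1
            ips.add(match.group("ip"))
    return requests, len(ips)


if __name__ == "__main__":
    import sys

    # Usage: python preview_stats.py < access.log
    total, unique = preview_stats(sys.stdin)
    print(f"{total} preview requests from {unique} unique IPs")
```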
