Google Apps sics crawlers on public docs and sheets
Beware what you publish
Google will soon allow search engines to crawl and index documents, spreadsheets, and presentations published to the web via its online office suite, Google Apps.
On Friday, in a letter to Google Apps users, the web giant said the change would arrive "in a few weeks." A Google spokeswoman confirmed this in an email to The Reg, pointing out that on the Google Apps "help center" site, the company says the change is no more than a fortnight away.
"We will be launching a change for published docs. The change will allow published docs that are linked to from a public website to be crawled and indexed, which means they can appear in search results you see on Google.com and other search engines," Google says.
This only applies to files explicitly published using the suite's "publish as web page" or "publish/embed" options and linked to from a public webpage. This does not apply to files shared via the "Allow anyone with the link to view (no sign-in required)" option, which provides for document sharing without links to the public web.
Google warns that if you don't want your publicly published documents crawled, you can de-publish them. Instructions for de-publishing are here.
At the help center, one Google Apps user has asked if, in light of the change, the company could provide a clear indication of which docs are public and which are not. "I think this makes it very important that you bring back the indication on the docs listing of those files that are published," the user says. "Maybe a separate label/folder of published docs/spreadsheets?"
Indeed, as it stands, the Google Apps master view does not tell you which docs are publicly published and which aren't. ®