By Sarah D Scalet
Ask Google anything - what's happening to GE's stock price, how to get to 881 Seventh Ave. in New York, where Mission Impossible 3 is showing, whatever happened to Brian Smith after he moved away in the ninth grade - and you'll get an answer. That's the power of this $US6 billion search engine sensation, which is so good at what it does that the company name became a verb.
That kind of power keeps Google on the front page of the news - and sometimes under unfavourable scrutiny, as demonstrated by Google's recent clashes with the US Department of Justice and also with critics displeased by the search giant's stance on Chinese government censorship.
CSOs and CISOs have a different reason to think carefully about Google and the implications of having so much information online, instantly accessible by almost anyone. Although these issues relate to all search engine companies, Google gets most of the attention - not only because of its huge share of the Web search market but because of its unabashed ambitions to catalogue everything from images and libraries to Earth, the moon and Mars.
"We always get enamoured of a new technology, and it takes us a while to understand the price of that technology," says Robert Garigue, vice president of information integrity and chief security executive of Bell Canada Enterprises in Montreal, Canada. For security pros, the price is that Google can be used to dig up network vulnerabilities and locations of sensitive facilities, to enable fraud and cause other sorts of mayhem against the enterprise. Here, CSO examines the ways Google is shaking the security world, and what companies can do about them.
1. Google Hacking (strictly defined)
What it is: Using search engines to find system vulnerabilities. Hackers can use carefully crafted searches to find things like open ports, overly revealing error messages or even (egads!) password files on a target organization's computer systems. Any search engine can do this; blame the popularity of the somewhat imprecise phrase "Google hacking" on Johnny Long. The author of the well-read book Google Hacking for Penetration Testers, Long hosts a virtual swap meet (http://johnny.ihackstuff.com) where members exchange and rate intricately written Google searches.
How it works: Google works by "crawling" the Web, indexing everything it finds, caching the index information and using it to create the answers when someone runs a Web search. Unfortunately, organizations sometimes set up their systems in a way that allows Google to index and save a lot more information than they intended. To look for open ports on CSO's Web servers, for instance, a hacker could search Google.com for INURL:WWW.CSOONLINE.COM:1, then INURL:WWW.CSOONLINE.COM:2, and so on, to see if Google has indexed port 1, port 2 and others. The attacker also might search for phrases such as "Apache test page" or "error message", which can reveal configuration details that are like hacker cheat sheets. Carefully crafted Google searches sometimes can even unearth links to sloppily installed surveillance cameras or webcams that are not meant to be public.
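For readers who want to see the port-by-port search spelled out, here is a minimal Python sketch. It only builds the query strings; it does not contact Google or any other search engine. The function name port_queries and the three-port range are assumptions made for illustration, and the operator is written in lowercase here, as it is usually typed.

```python
def port_queries(host, ports):
    """Build one 'inurl:' search string per port for the given host."""
    return [f"inurl:{host}:{port}" for port in ports]


if __name__ == "__main__":
    # Hypothetical range: the example in the text starts at port 1 and counts up.
    for query in port_queries("www.csoonline.com", range(1, 4)):
        print(query)
```

Each printed line is a search a person could paste into a search box by hand; whether anything comes back depends entirely on what the search engine has indexed.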
Why it matters: Suppose someone is scanning all your ports. Normally, this activity would show up in system logs and possibly set off an intrusion detection system. But search engines like Google have Web crawlers that are supposed to regularly read and index everything on your Web servers. (If they didn't, let's face it - no one would ever visit your Web site.) By searching those indices instead of the systems themselves, "you can do penetration testing without actually touching the victims' sites", points out consultant Nish Bhalla, founder of Security Compass.
What to do: Beat hackers at their own game: Hold your own Google hacking party (pizzas optional). Make Google and other search engines part of your company's routine penetration testing process. Bhalla recommends having techies focus on two things: which ports are open, and which error messages are available.
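One way to make this routine is to keep a checklist of searches covering Bhalla's two focus areas and rerun it with each penetration test. The Python sketch below is only an illustration under stated assumptions: the name audit_queries, the domain example.com and the port list are hypothetical, and the site: operator (which limits results to one domain) is not mentioned in this article. The two phrases are the ones quoted earlier.

```python
def audit_queries(domain, ports, phrases):
    """Return a checklist of search strings for one domain.

    Covers the two focus areas: open ports and revealing error pages.
    """
    # One search per port, in the same form as the port example earlier.
    queries = [f"inurl:{domain}:{port}" for port in ports]
    # Assumption: 'site:' restricts results to the named domain.
    queries += [f'site:{domain} "{phrase}"' for phrase in phrases]
    return queries


if __name__ == "__main__":
    # Hypothetical domain and ports, chosen only for illustration.
    checklist = audit_queries(
        "example.com",
        ports=[21, 8080],
        phrases=["Apache test page", "error message"],
    )
    for query in checklist:
        print(query)
```

The sketch stops at producing the list; running the searches and judging the results is left to the testers.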
When you find a problem, your first instinct may be to chase Google off those parts of your property. There is a way to do this - sort of - by using a commonly agreed-upon protocol called a "robots.txt" file. This file, which is placed in the root directory of a Web site, contains instructions about files or folders that should not be indexed by search engines. (For a notoriously long example, view the White House's file at www.whitehouse.gov/robots.txt.) Many companies that run search engines heed the instructions in this file.
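To see what such instructions look like and how a well-behaved crawler would read them, consider the short Python sketch below. It uses the standard library's urllib.robotparser module. The file contents, the paths /admin/ and /logs/, the example.com URLs and the helper name is_crawlable are all hypothetical. The check shows only what a crawler that honours the file would do; it cannot stop one that ignores it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /logs/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if a crawler that heeds robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A path under a Disallow rule is off limits to compliant crawlers.
    print(is_crawlable(ROBOTS_TXT, "https://www.example.com/admin/login"))
    # A path no rule mentions remains open to indexing.
    print(is_crawlable(ROBOTS_TXT, "https://www.example.com/index.html"))
```

Running the same check against a copy of your own file is a quick way to confirm that the directories you meant to exclude are in fact covered.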