Google highlights trouble in detecting web-based malware

No one system is effective in detecting increasingly complex malicious software, the company said

Google issued a new study on Wednesday detailing how it is becoming more difficult to identify malicious websites and attacks, with antivirus software proving to be an ineffective defense against new ones.

The company's engineers analyzed four years' worth of data comprising 8 million websites and 160 million web pages from its Safe Browsing service. Safe Browsing is an API (application programming interface) that feeds data into Google's Chrome browser and Firefox and warns users when they hit a website loaded with malware.

Google said it displays 3 million warnings of unsafe websites to 400 million users a day. The company scans the Web, using several methods to figure out if a site is malicious.

"Like other service providers, we are engaged in an arms race with malware distributors," according to a blog post from Google's security team.

That detection process is becoming more difficult due to a variety of evasion techniques employed by attackers that are designed to stop their websites from being flagged as bad, according to the report.

The company uses a variety of methods to detect dangerous sites. It can test a site against a "virtual machine honeypot," which is a virtual machine that visits a website and notes its behavior. It also uses browser emulators, which record an attack sequence, for the same purpose. The browser emulator consists of an HTML parser and a modified open-source JavaScript engine.

Other methods include ranking a website by reputation based on its hosting infrastructure. Antivirus software is another line of defense.

Many sites are rigged to automatically deliver an exploit and execute an attack if an unpatched software program is found. One of the ways hackers get around VM-based detection is to hold back the attack until the victim performs a mouse click.

Google describes it as a kind of social engineering attack, since the malicious payload appears only after a person interacts with the browser. Google is working around the issue by configuring its virtual machines to do a mouse click.

Browser emulators can be confused by attacks when the malicious code is scrambled, a method known as obfuscation. Since the browser emulator isn't a real browser, it won't necessarily execute the obfuscated JavaScript code in the same way as a real browser. The only explanation for the more complex JavaScript is that it is designed to halt emulated browsers and make manual analysis of the code more difficult, the engineers wrote.

Google is also encountering "IP cloaking," where a malicious website will refuse to serve harmful content to certain IP ranges, such as those known to be used by security researchers. In August 2009, Google found that some 200,000 sites were using IP cloaking. The technique forces researchers to scan the sites from IP ranges that are "unknown by the adversary," the report said.
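The decision an IP-cloaking site makes can be sketched in a few lines of Python. This is only an illustration of the idea described in the report, not code from any real attack or from Google. The names `CLOAKED_RANGES`, `is_cloaked` and `select_response` are hypothetical, and the address ranges are reserved documentation ranges rather than real scanner addresses.

```python
import ipaddress

# Hypothetical ranges the attacker believes belong to security scanners.
# These are reserved documentation ranges (RFC 5737), used here as placeholders.
CLOAKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_cloaked(client_ip: str) -> bool:
    """Return True if the visitor's address falls inside a cloaked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in CLOAKED_RANGES)


def select_response(client_ip: str) -> str:
    """Label the kind of page a cloaking site would serve to this visitor."""
    return "benign" if is_cloaked(client_ip) else "harmful"


if __name__ == "__main__":
    # A scanner inside a known range sees only a harmless page.
    print(select_response("192.0.2.10"))
    # A visitor from an address the attacker does not recognize gets the real content.
    print(select_response("203.0.113.5"))
```

The sketch also shows why scanning from addresses "unknown by the adversary" works: the check relies entirely on the attacker's list of ranges being complete.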

Antivirus software programs rely on signatures as one method to detect attacks. But the engineers wrote that the software often misses code that has been "packed," or compressed in such a way that it is unrecognizable but will still execute.
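A toy example shows why packing defeats a naive signature check. It assumes the simplest possible scanner, one that looks for a fixed byte pattern, and uses ordinary zlib compression as a stand-in for a packer. Real antivirus engines and real packers are considerably more sophisticated. `SIGNATURE` and `signature_match` are hypothetical names, and the "payload" is harmless placeholder text.

```python
import zlib

# Hypothetical byte pattern that a scanner has on file for a known threat.
SIGNATURE = b"EXAMPLE-MALICIOUS-PATTERN"


def signature_match(data: bytes) -> bool:
    """Naive scanner: flag the data if the known pattern appears verbatim."""
    return SIGNATURE in data


# Harmless placeholder content that happens to contain the pattern.
payload = b"padding " * 8 + SIGNATURE

# "Packing" here is plain compression, standing in for a real packer.
packed = zlib.compress(payload)

if __name__ == "__main__":
    print("original flagged:", signature_match(payload))
    print("packed flagged:", signature_match(packed))
    # Unpacking at run time restores the original bytes exactly.
    print("unpacked flagged:", signature_match(zlib.decompress(packed)))
```

The packed bytes no longer contain the pattern, yet decompressing them recovers the original content unchanged, which is the gap the engineers describe.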

Since it can take time for AV vendors to refine their signatures and remove ones that cause false positives, the delay allows the malicious content to stay undetected.

"While AV vendors strive to improve detection rates, in real time they cannot adequately detect malicious content," the Google researchers wrote. "This could be due to the fact that adversaries can use AV products as oracles before deploying malicious code into the wild."

The study was authored by Moheeb Abu Rajab, Lucas Ballard, Nav Jagpal, Panayiotis Mavrommatis, Daisuke Nojiri, Niels Provos and Ludwig Schmidt.


Stories by Jeremy Kirk
