A true story of combating a large-scale DDoS attack

Every website is subject to cyberattack. The difference is in how you plan for the threat and how well your infrastructure is able to alert and respond.

Real tales of cyberattack response and recovery are hard to come by because organizations are reluctant to share details for a host of legitimate reasons, not the least of which is the potential for negative financial fallout. However, if we never tell our stories we doom others to blindly walk our path. Better to share our real-world battlefield experiences and become contributors to improved threat intelligence.

We are a SaaS-based supplier of Web content management for mid- and large-size enterprises. Our customers manage hundreds, sometimes thousands, of websites around the globe in high-profile industries such as pharmaceuticals and financial services. The customer in this story prefers to remain anonymous, but to provide some context, the company is a large, public healthcare services organization focused on helping providers improve financial and operational performance. The company counts thousands of hospitals and healthcare providers as clients, managing billions in spending.

The scale of this particular DDoS attack was enormous: at its peak, 86 million concurrent users were hitting our customer's website from 100,000+ hosts around the world. The FBI was called. When it was all over 39 hours later, we had mounted a successful defense in what proved to be an epic battle. Here's how it happened.

Initial Attack Vector

On the eve of the company's annual conference, where it was set to host 15,000 attendees, we received a troubling alert. The company's web servers were receiving unbelievable amounts of traffic. The company is a SaaS provider of content and analytics for its clients, so the resulting slowdown had the potential to dramatically impact service availability and reputation. There was no time to waste.

On the initial attack vector:

  • All of the requests were 100% legitimate URLs, so we couldn't easily filter out malicious traffic
  • Attacks were originating from all around the world including North Korea, Estonia, Lithuania, China, Russia and South America
  • 60% of the traffic was coming from inside the United States
  • The attack was de-referencing DNS and attacking IP addresses directly

We were able to successfully defend against this initial wave by going in behind the scenes to our courtesy domain in Amazon's Route 53, rearranging things a bit and immediately cutting out the traffic to those IP addresses. Things returned to normal and we breathed a sigh of relief, thinking in our ignorance that everything was going to be OK. As it turned out, that was only the first wave. Next came the tsunami.

That evening the attackers came back, and came back with a vengeance, targeting the site via its DNS name, which meant we couldn't employ the same IP blocking tactic we had used earlier. Traffic shot up dramatically.

The existential question you ask yourself in moments like this is, "Are we going to lie down and die, or are we going to step it up?" This led to a seminal moment in conversation with the customer CIO, and we decided with a handshake to step it up. As SaaS companies, our ability to deliver continuous, reliable service is paramount, so both of our reputations were on the line. We agreed to share the cost -- potentially tens of thousands of dollars -- to fight the good fight.

Looking at the second wave's traffic, we realized there were a few immediate mitigations that we could easily implement:

* This particular organization only serves U.S. customers, yet a lot of the traffic was coming from outside the country. We quickly implemented some firewall rules that had been battle tested from our work with Federal agencies, which would admit only U.S.-based traffic. This immediately stopped 40% of the traffic at the front door.

* We inserted a web application firewall behind our AWS Route 53 configuration and scaled up some HAProxy servers, which would gather a lot of logging information for the FBI (who had by now become our new best friends) to analyze after the fact.

* We intentionally broke our auto scaling configuration. Auto scaling has triggers for scale-up and scale-down. We changed the scale-down trigger to make it much higher than the scale-up trigger. What that meant was the system would scale up properly as more traffic came in, but would never hit the scale-down threshold. As a result, every instance that was launched would stay in service permanently, leaving its logging information intact for harvesting by the FBI.
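The effect of the auto scaling change can be illustrated with a small simulation. This is only a sketch of the behavior described, not our actual configuration: the `ScalingPolicy` class, the load-per-instance metric, the threshold values and the load curve are all hypothetical, and the scale-down condition is simply set to a value that can never be met so that instances are added but never removed.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical threshold policy; names and numbers are illustrative only."""
    scale_up_above: float    # add an instance when load per instance exceeds this
    scale_down_below: float  # remove an instance when load per instance drops below this


def simulate(policy: ScalingPolicy, total_load: list[float], start: int = 2) -> list[int]:
    """Return the instance count after each load sample."""
    instances = start
    history = []
    for load in total_load:
        per_instance = load / instances
        if per_instance > policy.scale_up_above:
            instances += 1
        elif per_instance < policy.scale_down_below and instances > 1:
            instances -= 1
        history.append(instances)
    return history


# Assumed load curve in arbitrary units: a surge followed by a lull.
load = [100, 300, 500, 700, 700, 200, 100, 50]

# A conventional policy that sheds instances once the load drops.
normal = ScalingPolicy(scale_up_above=70, scale_down_below=30)
# Scale-down made unreachable: load per instance is never negative.
pinned = ScalingPolicy(scale_up_above=70, scale_down_below=-1)

print("normal:", simulate(normal, load))
print("pinned:", simulate(pinned, load))
```

With the conventional policy the instance count falls again during the lull, and the logs on the terminated instances go with them. With the pinned policy the count only ever grows, which is the property that kept the evidence in place.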

It was now 1:00 a.m. We put our game faces on. The arms race had begun.

DDoS Day Two

Our attackers scaled up. Amazon Web Services scaled up. Our attackers scaled up some more. AWS scaled up some more. This continued into Day Two. At this point we were providing hourly updates to our customer's board of directors.

At the height of the DDoS attack, we had 18 very large, compute-intensive HAProxy servers deployed and almost 40 large web servers. The web server farm was so large because, even though we had excluded the non-U.S. traffic component representing 40% of the overall load, the remaining 60% consisted of legitimate URLs originating from within the United States, most of which were accessing dynamic services that could not easily be cached. Traffic was hitting an extremely large, globally-distributed infrastructure. Our highly-scaled web server farm was deployed behind a very substantial HAProxy firewall/load-balancer configuration. This in turn sat behind CloudFront, AWS' globally-distributed content delivery network, which itself was deployed behind Route 53, AWS' globally-redundant DNS platform. This was an infrastructure of very significant dimensions, scalable and secure at every tier.

At around 7 p.m. that evening, something fantastic happened. We scaled up... but the bad guys didn't scale up anymore. At this point we were sustaining 86 million concurrent connections from more than 100,000 hosts around the world. We measured the traffic, and were shocked to see that we were handling 20 gigabits per second of sustained traffic through the AWS infrastructure. This equates to 40 times the industry median as observed in DDoS attacks in 2014, according to Arbor Networks. We continued to serve the website at a response time of about 1-3 seconds per page.
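To put those numbers in perspective, the short calculation below derives a few figures from the ones quoted above. It is a back-of-the-envelope sketch: the variable names are ours, the per-host figure is an average that says nothing about how connections were actually distributed, and because the attack came from "more than" 100,000 hosts that average is an upper bound.

```python
# Figures quoted in the article.
peak_gbps = 20                        # sustained traffic at peak, gigabits per second
multiple_of_median = 40               # "40 times the industry median" for 2014
concurrent_connections = 86_000_000   # concurrent connections at peak
attacking_hosts = 100_000             # quoted as "more than 100,000"

# Median attack size implied by the two figures above.
implied_median_mbps = peak_gbps / multiple_of_median * 1000

# Average connections held open per host (upper bound, since the host count is a floor).
max_avg_connections_per_host = concurrent_connections / attacking_hosts

# The same peak throughput expressed in gigabytes per second (8 bits per byte).
peak_gigabytes_per_second = peak_gbps / 8

print(f"Implied 2014 median attack size: {implied_median_mbps:.0f} Mbps")
print(f"Average connections per host (upper bound): {max_avg_connections_per_host:.0f}")
print(f"Peak throughput: {peak_gigabytes_per_second} GB/s")
```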

Our attackers had run out of gas. They hammered us and hammered us until they simply gave up and went home. At the end, the company's CIO told us that if they had hosted the site in their own data center they would have been out of options and unable to respond a mere eight hours into the attack. Remember that handshake agreement to share the cost of defense? At the end of the day, the total cost in Amazon Web Services fees to successfully respond and defend against this assault for 36 hours amounted to less than $1,500.

How to Prepare for a DDoS Attack

We survived because we were prepared, but this experience gave us additional insights for surviving a large-scale DDoS attack. Here are some things you can do to fortify your data center and protect your corporate website(s):

I firmly believe that together as an industry we need to collaborate on a better collective understanding of our adversaries' tactics, techniques and procedures to stay one step ahead of the bad guys.

CrownPeak is a leading Web Experience Management solution provider. Visit www.crownpeak.com

By Adrian Newby, CTO, CrownPeak
