After the debacle that has been Click Frenzy, I'm going to focus on availability. Click Frenzy was a coordinated advertising promotion involving a large number of Australian online shopping websites. It sounded like a great idea, and many retailers paid good money to be part of it. The problem was that the Click Frenzy website struggled under the load, and so did a few of the online retailers, resulting in a vicious backlash on social media.
Here are a few things to consider when trying to keep an internet-accessible web application available in a wide variety of circumstances:
1. Architect the solution to a set of realistic requirements. Consider peak load requirements, business continuity and scalability.
2. Test the requirements. You can build a test environment and use software such as HP's LoadRunner to generate load. Alternatively, there are now cloud-based load testing services you can use, such as SOASTA and LoadStorm. It's also important to test the performance of the application under unexpected conditions, such as the failure of one data centre or of key components. This can be achieved by simply turning off the network ports that the devices are attached to during the tests.
3. Consider using a hosting provider with a Distributed Denial of Service (DDoS) protection solution that they can enable for you if you are attacked. If your website may be subject to extortion or attacks from a dodgy competitor, you may want to put this in place straight away and test it as well.
4. Based on your requirements, it is likely you will be using a load balancing solution to provide "cross site load balancing". Products from vendors such as Radware, F5 and Cisco let you run two hot data centres and scale horizontally by adding more web servers, which can then be linked to more application servers and more database servers.
5. The use of virtualisation is also likely. Virtualisation enables you to scale horizontally, by cloning web servers, and vertically, by quickly adding more resources to existing virtual machines. One thing to consider is the maximum capacity of the hardware hosting the hypervisor and how much of that capacity is available to you. If you run out of capacity on a blade chassis and can't stick any new blades in there, it's a lot harder to add more capacity, as you need to forklift in a new chassis.
6. Consider the database. The "secret sauce" in many large web applications is the database design. A common weak point is the interface between the application and the database: either a hard limit on the number of connections available to the database, or a large performance hit as the number of database connections rises. You will also have to consider the implications of synchronising data between separate database instances if you scale the database horizontally.
7. Penetration test the web application to make sure there aren't any vulnerabilities that could be leveraged by an attacker to conduct a denial of service attack.
8. Penetration test the network infrastructure to make sure there aren't any unnecessary network protocols enabled that could be used to amplify a denial of service attack.
9. Have change management and release procedures in place, so no-one makes unapproved and untested changes to the application or infrastructure.
10. Have an incident management process and procedures in place, and have support personnel available onsite in times of expected high load. Perform a Post Incident Review after an outage or slowdown to find and rectify the root causes.
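To make points 1, 4 and 6 a little more concrete, here is a rough sizing sketch in Python. It is only an illustration under stated assumptions: the function names, the headroom figure and the idea of reserving some database connections for admin use are all hypothetical, and the real numbers should come out of your own load testing. The first function estimates how many web servers each site needs so that the surviving sites can carry the full peak if one data centre fails. The second estimates how many application workers fit under the database's connection limit.

```python
import math


def servers_per_site(peak_rps, rps_per_server, sites=2, headroom=0.25):
    """Web servers needed at EACH site so the peak load is still served
    after one whole site fails. All inputs are assumptions you would
    measure yourself (e.g. rps_per_server from load testing)."""
    if sites < 2:
        raise ValueError("need at least two sites to survive a site failure")
    # Throughput we allow ourselves per server once headroom is held back.
    usable_rps = rps_per_server * (1 - headroom)
    # With one site down, the remaining sites share the full peak.
    surviving_sites = sites - 1
    return math.ceil(peak_rps / (usable_rps * surviving_sites))


def max_app_workers(db_max_connections, pool_size_per_worker, reserved=10):
    """Number of application workers that fit under the database's hard
    connection limit, keeping a few connections back for admin and
    monitoring (the 'reserved' figure is a hypothetical default)."""
    available = db_max_connections - reserved
    if available < 0:
        raise ValueError("reserved connections exceed the database limit")
    return available // pool_size_per_worker


if __name__ == "__main__":
    # Example figures only: 5000 requests/sec at peak, 200 per server.
    print(servers_per_site(5000, 200, sites=2))
    # Example figures only: 500 connections, 20 per worker pool.
    print(max_app_workers(500, 20))
```

With two hot data centres, each site has to be sized to carry the whole peak on its own, which is why the example divides by the number of surviving sites rather than the total. The connection calculation is a reminder that cloning more web or application servers does not help once the database's connection limit has been reached.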