Dealing with outages -- are we ready?

This vendor-written opinion has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.

The powerful storms that recently hit the mid-Atlantic region caused electrical outages that in turn disrupted Amazon's EC2 services. The impact on the consumers and businesses that depend on the Amazon cloud has been well chronicled.

Amazon is a leader in reliability; if an outage can happen to Amazon, it can and will happen to any service provider. The point is that, to maintain business continuity, enterprises need to take responsibility for planning their own outage contingencies.

The problem is that business processes, applications and computing infrastructure are too intertwined and dependent on each other. If the infrastructure isn't configured just right or is unavailable, the business process stops. The industry has made great strides in abstracting the physical computing infrastructure from the applications it supports. Amazon and VMware have created tremendous value and built businesses by abstracting (or insulating) applications and users from hardware diversity and failures.

However, the industry has only started to abstract the business process from the applications and infrastructure that support it. To work around an Amazon EC2 outage, organizations need to use more than one provider to avoid a single point of failure. Yet for that to work, the business must be able to reroute and rerun the process in its own data center or at an alternative service provider. This is where higher-level process automation comes in.

The recent outages at Royal Bank of Scotland, BATS Global Markets and others demonstrate an inability not only to abstract the process from the infrastructure, but also to see the interdependencies and failures that plague complex IT systems. In those particular outages, it took minutes to fix the problem but days to find it.

Process automation that keeps track of the complex interdependencies between applications, infrastructure and business workflows can help identify, or even predict, problems. Then, in the case of an unavoidable outage, the business workflows can be rerouted to an available data center.
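As a rough illustration of the idea above, the following Python sketch walks a dependency map to find which business workflows are hit by a failed component, then picks the first healthy site to rerun them. Everything named here (the workflow, application and site names, and the functions `affected_workflows` and `pick_site`) is hypothetical and invented for the example; it is a minimal sketch, not a description of any vendor's product.

```python
from collections import deque

# Hypothetical dependency map: each item lists what it directly depends on.
DEPENDS_ON = {
    "order-to-cash": ["billing-app", "crm-app"],
    "payroll-run": ["hr-app"],
    "billing-app": ["db-east"],
    "crm-app": ["ec2-us-east"],
    "hr-app": ["onprem-dc"],
    "db-east": ["ec2-us-east"],
}
# Hypothetical set of items that are business workflows (not apps or sites).
WORKFLOWS = {"order-to-cash", "payroll-run"}


def affected_workflows(failed, depends_on=DEPENDS_ON, workflows=WORKFLOWS):
    """Return the workflows that depend, directly or indirectly, on `failed`."""
    # Invert the map so we can walk from a component up to its dependents.
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(item)

    # Breadth-first walk upward from the failed component.
    seen, queue = {failed}, deque([failed])
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen & workflows)


def pick_site(preferred_sites, healthy):
    """Return the first site in preference order that is healthy, else None."""
    for site in preferred_sites:
        if healthy.get(site, False):
            return site
    return None


if __name__ == "__main__":
    hit = affected_workflows("ec2-us-east")
    target = pick_site(
        ["ec2-us-east", "onprem-dc"],
        {"ec2-us-east": False, "onprem-dc": True},
    )
    print(f"Reroute {hit} to {target}")
```

In a real environment the dependency map and health status would come from monitoring and configuration data rather than hard-coded dictionaries; the point of the sketch is only that the reroute decision is made at the workflow level, not the server level.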

Most process automation done today covers low-level IT administrative tasks: provisioning servers, handling backup or startup routines, and generally doing infrastructure tasks that require little decision making that could affect the line of business. This is necessary and important, but not sufficient to preserve the user experience or business process integrity in the face of increasingly complex IT environments where, statistically, something is always failing.

Enterprises must step up their IT process automation to the point that they can manage business workflows, not just servers or IT tasks.

If the businesses dependent on Amazon had had these capabilities, they could have drastically reduced the outages they experienced. Orchestrating business workflows and associated data across applications and infrastructure is easier said than done. However, it can be done, and many enterprises are doing it to assure service levels.

Rolling back failed system updates to previous working versions, spotting process failures before they create an unrecoverable backlog, and running a workflow on newly provisioned environments are the types of higher-level process automation that abstract inevitable outages from the user or business experience.
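The roll-back capability mentioned above can be sketched in a few lines of Python. The function name `apply_update`, the version labels and the `health_check` callback are all hypothetical, and the sketch assumes a non-empty deployment history; it is meant only to show the shape of the control flow, under those stated assumptions.

```python
def apply_update(history, new_version, health_check):
    """Deploy `new_version`; roll back to the previous one if the check fails.

    `history` is a non-empty list of deployed versions (last item is live).
    `health_check` is a caller-supplied function returning True when the
    newly deployed version is working. Returns the version left live.
    """
    previous = history[-1]
    history.append(new_version)  # "deploy" the update
    try:
        healthy = health_check(new_version)
    except Exception:
        # A check that crashes is treated the same as a failed check.
        healthy = False
    if not healthy:
        history.pop()  # roll back to the last known working version
        return previous
    return new_version


if __name__ == "__main__":
    deployed = ["v1"]
    live = apply_update(deployed, "v2", lambda version: False)
    print(live, deployed)
```

The same pattern applies at the workflow level: keep the last known working state, verify the new one, and revert automatically rather than waiting for someone to notice the failure.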

As enterprises get more serious about higher-level process automation, they will spend more time abstracting their processes from specific infrastructures and application environments. This abstraction is not only key to quickly managing an outage, it's also key to efficiently dealing with the growing IT complexity created by today's hyper-competitive business environment.

Whether IT is ready or not, the business is doing whatever it takes to respond to changing market and customer demands by pushing IT to develop new applications at a faster pace and deploy them quickly (on highly virtualized infrastructure). Add that up and you get a lack of organization, infrastructure sprawl, and more fluidity as to where applications actually run, resulting in IT complexity and skyrocketing application-to-infrastructure dependencies.

It's at this point that the need for process abstraction and automation becomes acute, because these interdependencies, which represent potential breakage points, are beyond human ability alone to manage. IT organizations are now forced to deal with these new realities while Cloud, Big Data, DevOps and ITaaS pressures get added to the mix in the name of providing more business agility. With all these moving parts, something needs to be stable and act as the IT backbone. It's increasingly obvious that this backbone is the process and process control.

The days of designing the process to accommodate the shortcomings of infrastructure are over. Enterprises must abstract, insulate and protect their business processes from the applications and infrastructures that support them. The need for improved IT process automation is rising as the service and brand impact of online outages grows.

UC4 Software is the world's largest independent IT Process Automation software company. UC4 automates tens of millions of operations a day for over 2,000 customers worldwide. Rethink IT automation at

Stories by Randy Clark, CMO of UC4 Software