Creating a cloud SLA from diagnostic data
05 August, 2011 05:18
As a CSO or CIO, you may be wondering why I crafted a diagnostic for understanding your most critical web products. Its original purpose was to discern which applications can be ported successfully to a service provider's cloud, and how. The diagnostic determines which cloud IaaS products (storage components, network components, and virtual machines) an application needs. It addresses the platform components (server/operating system and web server) in the PaaS layer. Lastly, it focuses on the software application in the SaaS layer.
Once this analysis is done accurately, all of a corporation's top 10 applications can be mapped to a public/private cloud. Simple ports (a different version of the virtualization software or a later release of an operating system) can occur as long as they do not impact the software running in the SaaS layer. A matrix can be made for all ten applications, and it may be possible to integrate more than one application on different portions of the infrastructure products. So what should be tested to ensure success?
I argued in previous articles that testing disaster recovery is the most important and most difficult work a service provider can show to a potential cloud client. Hence, testing a disaster recovery plan is a very high priority. It also happens that the information collected for disaster recovery is a superset of the information necessary for porting your main web applications to the cloud service provider. Thorough analysis of each application shows the service provider how to port and test it in the cloud. The provider tests the basic applications in its own configuration (what is needed for all ten applications) and also the additional functionality needed for a successful disaster recovery of those same applications.
More importantly, the diagnostic helps determine what functionality is required as input to an SLA (Service Level Agreement) between a company and its cloud service provider. The accurate creation of the SLA is another end goal of the analysis. How does one gain confidence that the SLA covers all the various types of failure by the service provider? What should be analyzed first? Performance metrics are still needed for each supported application. Here are some other questions that need to be answered in order to arrive at those performance metrics. Again, this is a representative list; it is not necessarily complete.
* What is the web browser application's best user response time?
* What is the web browser application's response time under load (20% concurrent users)?
* For each application's components in each cloud layer, how long will recovery from a component failure take?
* Will component failure impact a given application's browser experience?
* If a total data center disaster occurs, how long will recovery take?
* What is the application cost for each different type of failure?
* Will any potential income be lost for each application?
* What are the reputation (brand) costs associated with the failure of this application?
* What is the probability of a lawsuit if an application fails?
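Once these questions have answers, they can be combined into a rough cost figure per failure type. The following is a minimal sketch in Python, not a method prescribed by the diagnostic. The field names, the expected failure frequency, and the way the terms are added together are all assumptions made for illustration, and every number in the example is invented.

```python
from dataclasses import dataclass


@dataclass
class FailureScenario:
    """One failure type for one application (all fields are hypothetical)."""
    name: str
    failures_per_year: float     # assumed expected frequency of this failure
    recovery_hours: float        # how long recovery takes
    lost_income_per_hour: float  # income lost while the application is down
    brand_cost: float            # reputation cost per incident
    lawsuit_probability: float   # chance of a lawsuit per incident, 0.0 to 1.0
    lawsuit_cost: float          # estimated cost if a lawsuit does occur


def cost_per_incident(s: FailureScenario) -> float:
    """Downtime income loss plus brand cost plus probability-weighted lawsuit cost."""
    downtime_loss = s.recovery_hours * s.lost_income_per_hour
    expected_lawsuit = s.lawsuit_probability * s.lawsuit_cost
    return downtime_loss + s.brand_cost + expected_lawsuit


def expected_annual_cost(s: FailureScenario) -> float:
    """Scale the per-incident cost by how often the failure is expected."""
    return s.failures_per_year * cost_per_incident(s)


# Invented figures for a single application, for illustration only.
scenarios = [
    FailureScenario("component failure", 2.0, 0.5, 4000.0, 1000.0, 0.0, 0.0),
    FailureScenario("data center disaster", 0.25, 24.0, 4000.0, 50000.0, 0.25, 200000.0),
]

for s in scenarios:
    print(f"{s.name}: {cost_per_incident(s):.0f} per incident, "
          f"{expected_annual_cost(s):.0f} expected per year")
```

A simple additive model like this is only a starting point, but it forces each question in the list to produce a number that can later be negotiated into the SLA.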
Lastly, a grid of information, with one row per application, should be inserted into the SLA. In the SLA, each application should have functionality requirements, performance metrics, and financial penalties for the various types of downtime errors. Note that the service provider is not required to deploy an architecture identical to the client's, with the same products, software, models, and releases. The provider must, however, deliver similar functionality and response times. Areas of architectural risk should also be noted in the SLA. For example, the database product in the PaaS layer may come from only one vendor. A weakness in that vendor's software release could impact all the applications that use the given database, which is a systemic risk. This should be noted and the cost of downtime calculated. The cost of brand loss (if the application handles confidential data from the public) and of potential lawsuits should also be included in the SLA for each application.
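The grid and the systemic-risk check described above can be sketched as a small data structure. This is an illustration under stated assumptions, not a standard SLA format. The row fields, the application and product names, and the penalty rule (a flat hourly charge for downtime beyond the agreed recovery time) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SlaRow:
    """One row of the SLA grid: a single application (fields are hypothetical)."""
    application: str
    max_response_seconds: float  # agreed response time under load
    max_recovery_hours: float    # agreed recovery time after a failure
    penalty_per_hour: float      # assumed penalty for each hour beyond that
    components: dict = field(default_factory=dict)  # layer -> product in use


def shared_components(rows, min_apps=2):
    """Return (layer, product) pairs used by at least min_apps applications.

    A product shared by several applications is a candidate systemic risk.
    """
    users = defaultdict(list)
    for row in rows:
        for layer, product in row.components.items():
            users[(layer, product)].append(row.application)
    return {key: apps for key, apps in users.items() if len(apps) >= min_apps}


def downtime_penalty(row, outage_hours):
    """Penalty owed for one outage, charged only past the agreed recovery time."""
    hours_over = max(0.0, outage_hours - row.max_recovery_hours)
    return hours_over * row.penalty_per_hour


# Invented grid with three applications, for illustration only.
grid = [
    SlaRow("orders", 2.0, 4.0, 500.0, {"PaaS database": "VendorX DB"}),
    SlaRow("billing", 3.0, 8.0, 250.0, {"PaaS database": "VendorX DB"}),
    SlaRow("catalog", 1.5, 4.0, 100.0, {"PaaS database": "VendorY DB"}),
]

for (layer, product), apps in shared_components(grid).items():
    print(f"systemic risk: {product} ({layer}) is shared by {', '.join(apps)}")

print("penalty for a 6 hour orders outage:", downtime_penalty(grid[0], 6.0))
```

Keeping the components in each row makes the single-vendor exposure visible in the same place as the penalties, so the two can be reviewed together.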
In summary, a diagnostic can be used to analyze critical web-based applications for a corporation considering putting its applications in a provider's cloud. The provider need not use the same architecture or products, but it must meet the SLA for each application. Besides the diagnostic questions, some performance data for each web application needs to be collected so that the SLA can be complete.