Introduction
As one of the hottest concepts in IT today, “cloud computing” proposes to transform the way IT is consumed and managed, with promises of improved cost efficiencies, accelerated innovation, faster time-to-market, and the ability to scale applications on demand.
While the market is rife with hype and confusion, the underlying potential is real — and is beginning to be realized. In particular, SaaS applications and public cloud platforms have already gained traction with small and startup businesses. These offerings enable companies to gain fast, easy, low-cost access to systems that would otherwise cost them millions of dollars to build. At the same time, cloud computing has drawn the cautious but serious interest of larger enterprises in search of its benefits of efficiency and flexibility.
However, as companies begin to implement cloud solutions, the realities of the cloud itself come into focus. Most cloud computing services are accessed over the Internet, and thus fundamentally rely on an inherently unpredictable and insecure medium. In order for companies to realize the potential of cloud computing, they will need to overcome the performance, reliability, and scalability challenges the Internet presents. This whitepaper provides a framework for understanding the cloud computing marketplace by exploring its enabling technologies and current offerings, as well as the challenges it faces given its reliance on the Internet.
With an understanding of these challenges, we will examine Akamai’s unique role as provider of the critical optimization services that will help cloud computing fulfill its promise to deliver efficient, on-demand, business-critical infrastructure for the enterprise.

Understanding the Cloud
Simply defined, cloud computing refers to computational resources (“computing”) made accessible as scalable, on-demand services over a network (the “cloud”). And yet, cloud computing is far from simple.
It embraces a confluence of concepts — virtualization, service-orientation, elasticity, multi-tenancy, and pay-as-you-go — manifesting as a broad range of cloud services, technologies, and approaches in today’s marketplace. To facilitate our discussion of this diverse marketplace, we first lay out a framework that gives structure to the different offerings in the cloud computing space. We will also explore the role of public and private clouds in the marketplace.

The Cloud Computing Framework
Our cloud computing framework has five key components. The first, virtualization technology, can be thought of as an underpinning of cloud computing.
By abstracting software from its underlying hardware, virtualization lays the foundation for enabling pooled, shareable, just-in-time infrastructure. On top of this technology base, cloud computing’s principal offerings can be categorized into three main groups: Infrastructure-as-a-Service, Platform-as-a-Service, and Software-as-a-Service. Cloud optimization is the final, critical piece of the framework — encompassing the solutions that enable cloud computing to scale and to deliver the levels of performance and reliability required for it to become part of a business’s core infrastructure.
Virtualization Technology
According to Gartner, roughly 80% to 90% of enterprise computing capacity is unused at any given time. Virtualization enables these once-idle CPU cycles to be used. Taking the concept of server virtualization to the cloud means extending it — going beyond the more efficient use of a single physical machine or cluster — to the aggregation of computing resources across multiple data centers, applications, and tenants, and allowing each to scale up or down on demand. This enables cloud providers to efficiently manage and offer on-demand storage, server, and software resources for many different customers simultaneously.
Significant cloud virtualization technologies include:
- Microsoft (Hyper-V)
- VMware (ESX, as well as multiple related VMware offerings)
- Xen (open source hypervisor, used by Amazon EC2 and in Citrix XenServer)
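To make the idea of on-demand elasticity concrete, the following is a minimal sketch of the control loop a provider or tenant might run on top of a virtualized resource pool. It is illustrative only: provision_vm and release_vm are hypothetical stand-ins for whatever hypervisor or cloud API (Hyper-V, ESX, Xen, and so on) actually creates and destroys virtual machines, and the thresholds are arbitrary examples.

```python
# Illustrative autoscaling loop over a virtualized resource pool.
# provision_vm() and release_vm() are hypothetical stand-ins for a real
# hypervisor or cloud-provider API; thresholds are arbitrary examples.

def provision_vm(pool):
    """Carve a new (pretend) virtual machine out of the shared pool."""
    pool.append(f"vm-{len(pool) + 1}")

def release_vm(pool):
    """Return a (pretend) virtual machine's capacity to the shared pool."""
    if pool:
        pool.pop()

def autoscale(pool, avg_cpu_utilization,
              scale_up_at=0.75, scale_down_at=0.25, min_vms=1):
    """Grow or shrink the tenant's VM count based on measured load."""
    if avg_cpu_utilization > scale_up_at:
        provision_vm(pool)                 # demand spike: add capacity
    elif avg_cpu_utilization < scale_down_at and len(pool) > min_vms:
        release_vm(pool)                   # idle capacity: give it back

# Example: a traffic spike followed by a quiet period.
vms = ["vm-1", "vm-2"]
for load in [0.9, 0.85, 0.4, 0.1, 0.1]:
    autoscale(vms, load)
    print(load, vms)
```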
Infrastructure-as-a-Service
Infrastructure-as-a-Service (IaaS) describes the category of cloud computing offerings that make basic computational resources — such as storage, disk space, and servers — available as on-demand services. Rather than using physical machines, IaaS customers get access to virtual servers on which they deploy their own software, generally from the operating system on up.
IaaS offers cost savings and risk reduction by eliminating the substantial capital expenditures required when deploying infrastructure or large-scale applications in-house. Cloud providers generally offer a pay-as-you-go business model that allows companies to scale up and down in response to real-time business needs, rather than having to pay up front for infrastructure that may or may not get used, or having to overprovision resources to address occasional peaks in demand.
To date, IaaS has seen heaviest adoption among small to mid-sized ISVs and businesses that don’t have the resources or economies of scale to build out large IT infrastructures. Examples of cloud IaaS offerings include:
- Akamai (NetStorage and CDN services)
- Amazon (Elastic Compute Cloud/EC2 and Simple Storage Service/S3)
- GoGrid (Cloud Servers and Cloud Storage)
- Joyent (Accelerator)
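As a concrete, if simplified, illustration of the pay-as-you-go model, the sketch below uses the AWS SDK for Python (boto3) to launch a virtual server on Amazon EC2 and terminate it when it is no longer needed. The AMI ID, instance type, and region are placeholders, and a real deployment would also handle credentials, networking, and error cases.

```python
# Minimal sketch: rent a virtual server on demand, then give it back.
# The AMI ID, instance type, and region below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# "Scale up": launch one virtual server; billing runs only while it exists.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder machine image
    InstanceType="t3.micro",           # placeholder size
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print("Launched", instance_id)

# ... deploy software and serve traffic while demand lasts ...

# "Scale down": terminate the instance so the meter stops running.
ec2.terminate_instances(InstanceIds=[instance_id])
```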
Platform-as-a-Service
A fast-growing category of cloud computing offerings is Platform-as-a-Service (PaaS), which consists of offerings that enable easy development and deployment of scalable Web applications — without the need to invest in or manage any underlying infrastructure. By providing higher-level services than IaaS, such as an application framework and development tools, PaaS generally provides the quickest way to build and deploy applications, with the trade-off being less flexibility and potentially greater vendor lock-in than with IaaS.
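To illustrate the level of abstraction PaaS provides, the sketch below is a complete (if trivial) web application of the kind one might deploy to a platform such as Google App Engine: the developer writes only application code, while the platform supplies the runtime, scaling, and infrastructure. The use of Flask here is an assumption for illustration; each PaaS has its own supported frameworks and deployment workflow, typically a small configuration file plus a deploy command.

```python
# A complete "application" from the PaaS developer's point of view:
# no servers, operating systems, or load balancers to manage.
# Flask is used here purely as an illustrative web framework.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    # The platform routes requests here and scales instances as needed.
    return "Hello from a platform-managed application!"

if __name__ == "__main__":
    # Local test run only; on a PaaS the platform hosts the app.
    app.run(port=8080)
```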
The PaaS landscape is broad and includes vendors such as:
- Akamai (EdgeComputing)
- Elastra and RightScale (platform environments for Amazon’s EC2 infrastructure)
- Google (App Engine)
- Microsoft (Azure)
- Oracle (SaaS Platform)

Software-as-a-Service
The best enterprise-ready examples of cloud computing are in the Software-as-a-Service (SaaS) category, where complete end-user applications are deployed, managed, and delivered over the Web.
SaaS continues the cloud paradigm of low-cost, off-premise systems and on-demand, pay-per-use models, while further eliminating development costs and lag time. This gives organizations the agility to bring services to market quickly and frees them from dependence on internal IT cycles. The speed and ease with which SaaS applications are purchased and consumed has made this category of cloud computing offerings the most widely-adopted today.
Cloud Optimization
The value of cloud optimization services can be understood as a direct function of application adoption, speed, uptime, and security. Without optimization services, cloud offerings are at the mercy of the Internet and its many bottlenecks — and the resulting poor performance has a direct impact on the bottom line. For example, a site leveraging IaaS components that fail to scale for a flash crowd will lose customers and revenue.
Likewise, a SaaS application that is slow or unresponsive will suffer from poor adoption. Thus, cloud optimization is essential for cloud computing services to be able to meet enterprise computing requirements. In the Anatomy of a Cloud section below, we will take a closer look at the root causes of the Internet’s bottlenecks. This will lay the foundation for understanding why Akamai, with its highly-distributed network of servers, is uniquely positioned to provide the critical optimization services that can transform the Internet into a high-performance platform for the successful delivery of cloud computing services.

Public Clouds and Private Clouds
Most of the early spend and traction for cloud computing (including, for example, the IaaS, PaaS, and SaaS vendors mentioned above) have been focused on public cloud services — those that are accessed over the public Internet. Public cloud offerings embody the economies of scale and the flexible, pay-as-you-go benefits that have driven the cloud computing hype. More recently, the concept of private clouds (or internal clouds) has emerged as a way for enterprises to achieve some of the efficiencies of cloud computing with an infrastructure internal to their organization, thus increasing perceived security and control.
By implementing cloud computing technologies behind their firewall, enterprise IT teams can enable pooling and sharing of compute resources across different applications, departments, or business units within their company. Private clouds require significant up-front development costs, ongoing maintenance, and internal expertise, and therefore provide a much different benefit profile compared to public clouds. Private clouds are most attractive to enterprises that are large enough to achieve economies of scale in-house and where the ability to maintain internal control over data, applications, and infrastructure is paramount.
Even private clouds, however, often have at least partial dependence on the public Internet, as these large enterprises must support workers in dispersed geographic locations as well as telecommuting or mobile employees. In reality, most enterprise cloud infrastructures will be hybrid in nature — where even a single application can run across a combination of public, private, and non-cloud environments. For example, a company may run highly-sensitive components strictly on-premise (in a non-cloud environment) while leveraging public cloud offerings for other application components to achieve cost-effective scalability.
So regardless of the path that enterprise adoption of cloud computing takes, the public cloud — that is, the Internet — will play a vital role. And the taming of that cloud — with its inherent performance, security, and reliability challenges — is an element essential to cloud computing’s success.

Anatomy of a Cloud
When wrapped up in the hype of cloud computing, it is easy to forget the reality — that cloud computing’s reliance on the Internet is a double-edged sword. On one hand, the Internet’s broad reach helps enable the cost-effective, global, on-demand accessibility that makes cloud computing so attractive. On the other hand, the naked Internet is an inherently unreliable platform — fraught with inefficiencies that adversely impact the performance, reliability, and scalability of applications and services running on top of it. We now take a closer look at the causes of these bottlenecks and the impact that different cloud computing architectures can have in addressing them.

The Middle Mile Conundrum
The infrastructure that supports any Web-based application, including cloud computing services, can be split into three basic parts: the first mile, or origin infrastructure; the last mile, meaning the end user’s connectivity to the Internet; and the middle mile, or the paths over which data travels back and forth across the Internet, between origin server and end user. Each of these components contributes in different ways to performance and reliability problems for Web-based applications and services.
A decade ago, the last mile of the Internet was likely to be one of the biggest bottlenecks, as end users struggled with slow dial-up modems. Today, however, high levels of global broadband penetration — over 400 million subscribers worldwide — along with continually increasing broadband speeds, have not only made the last-mile bottleneck history, they have also increased pressure on the rest of the Internet infrastructure to keep pace.
First-mile bottlenecks are fairly well understood and, more importantly, fall within the origin provider’s control.
Perhaps the biggest first mile challenge lies in the ability to scale the origin infrastructure to meet variable levels of demand. Not only is it difficult to accurately predict and provision for demand, but it is costly to have to overprovision for occasional peaks in demand — resulting in infrastructure that is underutilized most of the time.
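A simple back-of-the-envelope calculation, using hypothetical numbers, shows why provisioning the first mile for peak demand is so costly:

```python
# Hypothetical demand profile: steady baseline with an occasional spike.
average_demand = 100      # servers needed on a typical day
peak_demand = 400         # servers needed during a rare traffic spike

# Static, in-house provisioning must be sized for the peak.
static_capacity = peak_demand
average_utilization = average_demand / static_capacity
print(f"Average utilization when built for peak: {average_utilization:.0%}")
# -> 25%: three quarters of the infrastructure sits idle most of the time.

# An elastic, pay-as-you-go model pays only for capacity actually used,
# so utilization of purchased capacity stays close to 100%.
```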
This leaves the middle mile — the mass of infrastructure that comprises the Internet’s core. Indeed, the term middle mile is itself a misnomer in that it refers to a heterogeneous infrastructure that is owned by many competing entities and typically spans hundreds or thousands of miles. While we often refer to the Internet as a single entity, it is actually composed of 13,000 different networks, joined in fragile co-opetition, each providing access to some small subset of end users.
The largest single network accounts for only about 8% of end-user access traffic, and per-network share drops off dramatically from there, spreading across a very long tail. This means the performance of any centrally-hosted Web application — including cloud computing applications — is inextricably tied to the performance of the Internet as a whole, including its thousands of disparate networks and the tens of thousands of connection points between them. Given this complex and fragile web, there are many opportunities for things to go wrong.
We now take a closer look at four of the key causes of Internet middle-mile performance problems.

Peering Point Problems
Internet capacity has evolved over the years, shaped by market economics. Money flows into the networks from the first and last miles, as companies pay for hosting and end users pay for access. First- and last-mile capacity has grown 20- and 50-fold, respectively, over the past five to 10 years. On the other hand, the Internet’s middle mile — made up of the peering and transit points where networks trade traffic — is effectively a no man’s land.
Here, economically, there is very little incentive to build out capacity. If anything, networks want to minimize traffic coming into their networks that they don’t get paid for. As a result, peering points are often overburdened, causing packet loss and service degradation, and, in turn, slow and uneven performance for cloud-based applications. The further away a cloud service is from its end customers, the greater the impact of Internet congestion. For enterprises that are accustomed to LAN-based speeds, this performance bottleneck can seriously affect the adoption of cloud computing.
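The performance cost of congested peering points can be made concrete with the widely used TCP throughput rule of thumb from Mathis et al., which bounds sustained throughput by roughly MSS / (RTT × √loss). The numbers below are purely illustrative, but they show how even modest packet loss, combined with the longer round-trip times of a distant cloud service, sharply caps achievable throughput.

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.22):
    """Mathis et al. approximation of steady-state TCP throughput (bits/s)."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))

MSS = 1460  # typical maximum segment size in bytes

# Nearby service: short round trip, light loss.
near = tcp_throughput_bps(MSS, rtt_s=0.020, loss_rate=0.001)

# Distant service reached through a congested peering point:
# longer round trip and higher packet loss.
far = tcp_throughput_bps(MSS, rtt_s=0.150, loss_rate=0.01)

print(f"Nearby:  ~{near / 1e6:.1f} Mbit/s")   # roughly 22 Mbit/s
print(f"Distant: ~{far / 1e6:.1f} Mbit/s")    # roughly 1 Mbit/s
```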
The fragile economic model of peering can have even more serious consequences. For example, major network provider Cogent de-peered with Telia and Sprint for several days in March and October 2008, respectively, over peering-related business disputes. In both cases, the de-peering partitioned the Internet. This means, for example, that users on the Sprint network, as well as on any other networks single-homed to Sprint, would not have been able to reach any cloud services or applications hosted on Cogent (or on any network single-homed to Cogent). According to the Internet analyst firm Renesys, the Cogent-Sprint de-peering left more than 3,500 networks with significantly impaired connectivity.
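The effect of such a partition can be illustrated with a toy model of the routing graph: if the only path from a single-homed customer network to a cloud service runs through the severed peering link, removing that one edge makes the service unreachable. The network names and topology below are a deliberately simplified illustration, not a representation of the actual 2008 events.

```python
from collections import deque

def reachable(graph, src, dst):
    """Breadth-first search over an undirected adjacency-list graph."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False

# Toy topology: a customer single-homed to Sprint, a cloud service
# single-homed to Cogent, and one peering link joining the two networks.
peering = {
    "customer": ["Sprint"],
    "Sprint": ["customer", "Cogent"],
    "Cogent": ["Sprint", "cloud-service"],
    "cloud-service": ["Cogent"],
}
print(reachable(peering, "customer", "cloud-service"))  # True

# De-peering: drop the Sprint-Cogent link and the path disappears.
peering["Sprint"].remove("Cogent")
peering["Cogent"].remove("Sprint")
print(reachable(peering, "customer", "cloud-service"))  # False
```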