Thor Henning Hetland Blog

It's time for auto scaling - avoid peak load provisioning for web applications

Good blog post on an old trick:

  • The missing piece - middleware virtualization

If you take pretty much any existing application, add another machine to the network, and watch what happens, you won't be surprised to find that nothing much happens at all.

Non-virtualized middleware

Today's applications aren't able to dynamically take advantage of new computing resources that become available. That's also true for cloud based environments like EC2 - the fact that you can now add machines easily is nice, but it doesn't mean that the application can do anything with it. What is missing is a layer that helps the application take advantage of these new resources dynamically, as they are being added to the system. This is where middleware virtualization comes to the rescue.
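To make the idea of such a layer concrete, here is a minimal sketch in Python. It is not taken from the quoted post: `ResourcePool`, `add_node` and `pick` are hypothetical names, and the round-robin policy is an assumption chosen for brevity. The point is only that callers ask the pool for a node on every request, so a machine added at runtime starts receiving work without any change to the application code.

```python
import itertools
import threading


class ResourcePool:
    """Hypothetical registry: callers pick a node per request, nodes may join at runtime."""

    def __init__(self, nodes=()):
        self._lock = threading.Lock()
        self._nodes = list(nodes)
        self._counter = itertools.count()

    def add_node(self, node):
        # Called when a new machine becomes available.
        with self._lock:
            self._nodes.append(node)

    def pick(self):
        # Round-robin over whatever nodes are registered right now.
        with self._lock:
            if not self._nodes:
                raise RuntimeError("no nodes registered")
            index = next(self._counter) % len(self._nodes)
            return self._nodes[index]


pool = ResourcePool(["node-a"])
before = [pool.pick() for _ in range(2)]   # only node-a exists so far
pool.add_node("node-b")                    # a machine is added to the network
after = [pool.pick() for _ in range(4)]    # both nodes now receive work
```

A real middleware layer would also have to handle node removal, health checks and state placement; the sketch leaves all of that out.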


VMware - All your clouds are belong to us

VMworld: VMware is introducing a new vSphere architecture and management product to manage a data centre as an internal or external cloud of services.

The idea, introduced by VMware CEO Paul Maritz at VMworld in Cannes, is to have a set of interfaces looking downwards at the IT plumbing and another set looking at the applications. There is a vCompute interface to look at the compute resources, see what is available and provision them. A vStorage layer gets told by storage resources - the arrays for example - what they can do, such as block copy or deduplication. Then vSphere admin staff, and ultimately users themselves, can provision storage resources.


Trouble In The Clouds - Gmail Turns Into Gfail

Thousands of Twitter messages carrying the words "gmail" or "gfail" will teach you that Google's free web-based e-mail platform is currently down. A Google spokesperson told Pocket Lint that their engineers are working on it but have no clue why the errors are turning up.

Meanwhile, Google posted this on a discussion forum:

We're aware of a problem with Gmail affecting a small subset of users. The affected users are unable to access Gmail. We will provide an update by February 24, 2009 6:30 AM PST detailing when we expect to resolve the problem. Please note that this resolution time is an estimate and may change.

(POP3 / IMAP seems to be still functioning, and the problem doesn't appear to affect Google Apps at this point)

I'm not buying the small subset part, and considering that Pocket Lint says the problem started occurring around 10:20am GMT, 3 hours before even telling everyone what's going on is an incredibly long timeframe in my opinion.


Clouds only for tactical projects until 2012 - Gartner misses the point... (again)

The cloud computing market is in a period of excitement, growth and high potential, but will still require several years and many changes in the market before cloud computing — or service-enabled application platforms (SEAPs) — is a mainstream IT effort, according to Gartner, Inc.

Gartner said that technologically aggressive application development organizations should look to cloud computing for tactical projects through 2011, during which time the market will begin to mature and be dominated by a select group of vendors.


Cloud definitions in despair.. (Gartner)

Cloud definitions are vague.

That is - how can we expect IT people to be able to strategize and decide on IT direction and tactics if we can't even describe for them what the real issues are in any consistent way. For that, we need a commonly accepted definition, even if it is not great.

Let's at least ask the experts to start their definitions with actual definitions.


Open Source and Community Clouds


StratusLab is an informal collaboration between CNRS/LAL, GRNET, SixSq Sàrl, and UCM. The collaboration is open to anyone who would like to participate. The collaboration focuses on cloud technologies and how those technologies can be used productively in research and commercial environments.

The key issue with productive use of the technologies is effective management of the cloud resources. For broad adoption, cloud resources must be manageable with the same (or similar) techniques currently used by administrators of data centers. The initial activities of the collaboration will investigate how different management techniques can be adapted to cloud resources.


EUCALYPTUS - Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems

EUCALYPTUS - Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems - is an open-source software infrastructure for implementing "cloud computing" on clusters. The current interface to EUCALYPTUS is compatible with Amazon's EC2 interface, but the infrastructure is designed to support multiple client-side interfaces. EUCALYPTUS is implemented using commonly available Linux tools and basic Web-service technologies making it easy to install and maintain.

The current release is version 1.3 and it includes the following features:

  • Interface compatibility with EC2 (both Web service and Query interfaces)
  • Simple installation and deployment using Rocks cluster-management tools
  • Stand-alone RPMs for non-Rocks RPM based systems
  • Secure internal communication using SOAP with WS-security
  • Overlay functionality requiring no modification to the target Linux environment
  • Basic "Cloud Administrator" tools for system management and user accounting
  • The ability to configure multiple clusters, each with private internal network addresses, into a single Cloud.



Platform components as a service will hit the software market hard in 2009 and 2010, but until developers and architects understand how to leverage the platform components in a clear and consistent way, they will add more pain than salvation... You are advised to read up on some of the architecture axioms and distributed architectures, and to analyse your current design and architectures, before moving to platform component services.


Clouded Vision

Cloud computing platforms offer many benefits including:

  1. Cheaper operational costs.
  2. Dynamic scaling in response to load spikes.
  3. Roll-on, roll-off deployments for e.g. newspaper archive processing.
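Benefit 2 is what lets you avoid provisioning for peak load. As a rough illustration (a sketch under stated assumptions, not any vendor's scaling algorithm), the function below sizes a fleet from the current request rate. The name `desired_instances`, the per-instance capacity and the hourly load figures are all hypothetical.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      minimum=1, maximum=20):
    """Return how many instances the current load needs, clamped to [minimum, maximum]."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(minimum, min(maximum, needed))


# Hypothetical load over four hours, in requests per second.
hourly_load = [120, 80, 950, 300]
capacity = 100  # assumed requests per second one instance can serve

scaled = [desired_instances(load, capacity) for load in hourly_load]
scaled_hours = sum(scaled)                   # instance-hours with dynamic scaling
peak_hours = max(scaled) * len(hourly_load)  # instance-hours if sized for the peak
```

With these made-up numbers, dynamic scaling uses 16 instance-hours while peak provisioning uses 40. The gap depends entirely on how spiky the load is.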

These platforms exist as the result of the investment of companies such as Amazon, Google and Microsoft in developing cost-effective infrastructure with system to administrator ratios of 2500:1 (whilst the average enterprise manages around 150:1 and inefficient properties manage maybe 10:1).

Key to allowing these infrastructures to be efficient and in turn deliver the benefits above is having applications architected such that:

  1. They don't require masses of administrator intervention when they go wrong.
  2. They can be installed with minimal administrator effort because there's no need to worry about tweaking URLs, IP addresses, database connections etc.
  3. They readily support horizontal scaling e.g. because they contain an abstraction that can support sharding of data-storage.

In essence an application must be designed for zero administrator intervention and fully automated deployment. It should also have a variable workload component that magnifies the savings of the architectural properties above.
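Point 3 above mentions an abstraction that supports sharding of data storage. A minimal sketch of what that could look like follows. `shard_for` and `ShardedStore` are hypothetical names, and the in-memory dictionaries stand in for real database shards. The calling code never names a shard, which is what keeps horizontal scaling out of the application logic.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Hypothetical in-memory stand-in for a data store split across shards."""

    def __init__(self, shard_count: int):
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        self._shards = [{} for _ in range(shard_count)]

    def put(self, key: str, value) -> None:
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key: str):
        return self._shards[shard_for(key, len(self._shards))].get(key)

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


store = ShardedStore(shard_count=4)
for user_id in range(100):
    store.put(f"user-{user_id}", {"id": user_id})
```

Plain modulo sharding moves most keys when the shard count changes; consistent hashing is one common way to reduce that, at the cost of a more involved router.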

Strange then that many a developer expects to move their existing application, full of enterprise DNA (static configuration, vertical clusters, no horizontal scaling, high administration costs) to such an offering with minimal change. They even complain when it proves difficult because all those "enterprise features" aren't present. Why does this happen?

I believe it's because these developers have fundamentally misunderstood how cloud computing delivers its benefits. They see the cheap prices but don't stop to consider where the cost saving comes from. Some of it is achieved by cloud platform vendors getting large discounts on huge hardware orders but a significant proportion comes from the fact that they don't need to provide (via human resources or APIs) the sysadmin functions required for conventional hosting solutions.

Quite simply, typical applications, their architectures and associated administration practices are not set up for cloud platforms. Some of them may be able to run on these platforms with sufficient hackery, brute force and associated cost. However, if the motivation for a move to the cloud is merely to reduce kit costs, one might well be better off looking for a cheaper conventional hosting solution.

In summary, making the best of the cloud requires that we take an architectural view, something that we've proven remarkably bad at over and over. Simply deploying an application unchanged to the cloud is unlikely to deliver much benefit.


Sinnataggen and some Cloud development thoughts...

"If the cloud were a child, it would be an angry two year old. The challenge for the industry now is how to make it through the terrible twos."

According to Gartner, the important criteria for a cloud development platform include:

  • Interoperability: how well does the platform integrate with other web assets like OpenID and Google Maps?
  • Collaboration: how well does the platform support source code control and social programming (Facebook meets SVN)?
  • RIA & mobile clients: cross browser and cross smart-phone support. According to Mark, reach wins over richness - supporting more browsers is more important than supporting more widgets.
  • Legacy: ability to integrate with enterprise data, security and web services
  • Performance: ability to scale significantly with no additional effort/programming
  • Longevity: the market momentum of the platform vendor - will they be around in 3 years? The winner will be less about the raw technology and more about the quality of partners and customers the vendor has attracted.


Cloud definition misses the point....

Here are the three criteria I have for determining whether something is a cloud service or not:

1. The service is accessible via a web browser (non-proprietary) or web services API.
2. There is zero capital expenditure required to get started.
3. You pay only for what you use as you use it.

Free SimpleDB Beta Cloud from Amazon

For "at least" the next six months, Amazon will provide a certain amount of free cloud sitting. Every month, SimpleDB users will receive 25 machine hours, 1GB of data transfers in, 1GB of transfers out, and 1GB of storage at no charge. Once those freebies are exhausted, you'll pay $0.14 per machine hour, $0.10 per GB in, $0.10 to $0.17 per GB out, and $0.25 per GB of storage. Storage pricing is down 83 per cent from the cloud's limited beta.

You can also transfer an unlimited amount of data from EC2 at no charge.
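To see what those numbers mean for a monthly bill, here is a small worked calculation in Python. It uses only the prices quoted above, with two assumptions: outbound transfer is charged at the top of the quoted range ($0.17 per GB) because the excerpt does not give the tiers, and the usage figures in the example are made up. The function name `monthly_cost` is hypothetical.

```python
from decimal import Decimal

# Free monthly allowance and per-unit prices as quoted in the excerpt.
FREE = {
    "machine_hours": Decimal("25"),
    "gb_in": Decimal("1"),
    "gb_out": Decimal("1"),
    "gb_stored": Decimal("1"),
}
RATE = {
    "machine_hours": Decimal("0.14"),
    "gb_in": Decimal("0.10"),
    "gb_out": Decimal("0.17"),  # assumption: upper bound of the $0.10 to $0.17 range
    "gb_stored": Decimal("0.25"),
}


def monthly_cost(**usage):
    """Return the monthly charge in dollars after subtracting the free allowance."""
    total = Decimal("0")
    for item, rate in RATE.items():
        used = Decimal(str(usage.get(item, 0)))
        billable = max(used - FREE[item], Decimal("0"))
        total += billable * rate
    return total


# Hypothetical month: 100 machine hours, 5 GB in, 3 GB out, 10 GB stored.
example = monthly_cost(machine_hours=100, gb_in=5, gb_out=3, gb_stored=10)
print(example)  # 13.49
```

The breakdown is 75 x $0.14 + 4 x $0.10 + 2 x $0.17 + 9 x $0.25 = $13.49. Usage that stays inside the free allowance costs nothing, and transfers from EC2 would not count towards the inbound figure.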

Why clouds should be more like operating systems

Let's say you are running multiple different Amazon Machine Images (AMIs), which contain your applications, libraries, data and configuration settings on Amazon's Elastic Compute Cloud (EC2), and you are using Amazon's S3 for storage. Don't think running everything in the cloud will abstract away potential management problems. You'll still have a system administration headache until you script something or update your AMIs with your new software and application code.

A better - and obvious - answer would be if you could have all of your images, code and applications available in a dashboard where you could simply update everything on the fly.

An even better answer would be to not have to perform any system administration functions at all. Currently, the only way to make that happen with Amazon is through third-party tools like 3Tera and RightScale.

Cloud Computing- Are You Looking for IaaS or PaaS Provider?

Microsoft offers a way to host your .NET applications on the Cloud, with a pricing model yet to be officially announced, and offers integration with some Microsoft services and applications.

Amazon, on the other hand, does not offer a way to host your web applications out of the box on the Cloud, but simply provides virtualized hardware on which you can do whatever you like (well, as usual it's a bit more complex than that, but that's pretty much it).

So basically, the Google and Microsoft offerings are PaaS solutions: they offer a Platform on which you can deploy your applications. Amazon, on the other hand, offers an IaaS solution: an Infrastructure which you can use.

PaaS or IaaS: do you want the Cloud Provider to offer you a way to host your applications (if you can accommodate their technical restrictions), or an infrastructure that lets you host your applications the way you want (without restrictions)?

Amazon CloudFront

Amazon CloudFront delivers your content using a global network of edge locations. Requests for your objects are automatically routed to the nearest edge location, so content is delivered with the best possible performance.

Geek footnote: the bigass images on Dopplr's new city pages are served from Amazon's CloudFront CDN. And it was really easy.

Very happy with Amazon CloudFront: about 100-120 ms for static files like images, JS, etc. in Spain (going to France). Faster than S3.

Considerations in Building Web Applications for the Amazon Cloud
  1. Licensing - If you stay Open Source you are OK. Commercial licences may kill your budget.
  2. Persistence - Application persistence: Amazon does not like filesystems, so use a DB for persistence.
  3. Horizontal scalability - Not sure that this gets the point; vertical scalability should work even better on Amazon. Anyone?
  4. Disaster recovery - It's a distributed system: cope and be happy.