A diverse set of real-world Java benchmarks shows Google is fastest, Azure is slowest, and Amazon is priciest.
The sales pitch is seductive because the cloud offers many advantages. There are no utility bills to pay, no server room staff who want the night off, and no crazy tax issues for amortizing the cost of the machines over N years. You give them your credit card, and you get root on a machine, often within minutes.
To test out the options available to anyone looking for a server, I rented some machines on Amazon EC2, Google Compute Engine, and Microsoft Windows Azure and took them out for a spin. The good news is that many of the promises have been fulfilled. If you click the right buttons and fill out the right Web forms, you can have root on a machine in a few minutes, sometimes even faster. All of them make it dead simple to get the basic goods: a Linux distro running what you need.
At first glance, the options seem close to identical. You can choose from many of the same distributions, and from a wide range of machine configuration options. But if you start poking around, you’ll find differences — including differences in performance and cost. The machines may seem like commodities, but they’re not. This became more and more evident once the machines started churning through my benchmarks.
Fast cloud, slow cloud
I tested small, medium, and large machine instances on Amazon EC2, Google Compute Engine, and Microsoft Windows Azure using the open source DaCapo benchmarks, a collection of 14 common Java programs bundled into one easy-to-start JAR. It’s a diverse set of real-world applications that will exercise a machine in a variety of different ways. Some of the tests will stress CPU, others will stress RAM, and still others will stress both. Some of the tests will take advantage of multiple threads. No machine configuration will be ideal for all of them.
Some of the benchmarks in the collection will be very familiar to server users. The Tomcat test, for instance, starts up the popular Web server and asks it to assemble some Web pages. The Luindex and Lusearch tests will put Lucene, the common indexing and search tool, through its paces. Another test, Avrora, will simulate some microcontrollers. Although this task may be useful only for chip designers, it still tests the raw CPU capacity of the machine.
I ran the 14 DaCapo tests on three different Linux machine configurations on each cloud, using the default JVM. The instances aren’t perfect “apples to apples” matches, but they are roughly comparable in terms of size and price. The configurations and cost per hour are broken out in the table below.
Cloud machines under test
| Instance | Virtual CPUs or cores | RAM | Cost per hour |
|---|---|---|---|
| Amazon m1.medium | 1 | 3.75GB | 12 cents |
| Amazon c3.large | 2 | 3.75GB | 15 cents |
| Amazon m3.2xlarge | 8 | 30.00GB | 90 cents |
| Google n1-standard-1 | 1 | 3.75GB | 10.4 cents |
| Google n1-highcpu-2 | 2 | 1.80GB | 13.1 cents |
| Google n1-standard-8 | 8 | 30.00GB | 82.9 cents |
| Windows Azure Small VM | 1 | 1.75GB | 6 cents |
| Windows Azure Medium VM | 2 | 3.50GB | 12 cents |
| Windows Azure Extra Large VM | 8 | 14.00GB | 48 cents |
I gathered two sets of numbers for each machine. The first set shows the amount of time the instance took to run the benchmark from a dead stop: it fired up the JVM, loaded the code, and started to work. This is a reasonable simulation of real use, because many servers launch Java code from command-line scripts.
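A standing-start measurement of this kind can be approximated by timing a whole process from launch to exit, so that startup is included in the figure. The sketch below is one assumed way to do that, not the harness used for these tests. The function name `time_cold_start` is illustrative, and the jar path in the comment is a placeholder.

```python
import subprocess
import sys
import time


def time_cold_start(command):
    """Return wall-clock seconds for one fresh process, startup included."""
    start = time.perf_counter()
    subprocess.run(
        command,
        check=True,                  # raise if the process exits with an error
        stdout=subprocess.DEVNULL,   # discard output so printing does not skew timing
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


# Harmless demonstration: time the startup of a fresh Python interpreter.
# For a DaCapo run the command would look something like
# ["java", "-jar", "dacapo.jar", "tomcat"], where the jar path is a placeholder.
demo_seconds = time_cold_start([sys.executable, "-c", "pass"])
print(f"fresh interpreter: {demo_seconds:.3f} s")
```

Because each call starts a new process, nothing is carried over from a previous run, which is what distinguishes this number from a warmed-up one.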
To add another dimension, the second set reports the times using the “converge” option. This runs the benchmark repeatedly until consistent results appear. This sometimes happens after just a few runs, but in a few cases, the results failed to converge after 20 iterations. This option often resulted in dramatically faster times, but sometimes it only produced marginally faster times.
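The idea of running until the results settle can be sketched as a loop that stops once the most recent timings agree with one another. This is only an illustration under stated assumptions: the window of three runs, the 3 percent tolerance, and the use of the coefficient of variation are choices made for the example and are not taken from DaCapo itself. The 20-iteration cap mirrors the limit mentioned above.

```python
import statistics


def run_until_converged(run_once, window=3, tolerance=0.03, max_iterations=20):
    """Repeat run_once() until the last `window` timings agree closely.

    Returns (timings, converged). The stopping rule, a coefficient of
    variation at or below `tolerance`, is an illustrative assumption.
    """
    timings = []
    for _ in range(max_iterations):
        timings.append(run_once())
        if len(timings) >= window:
            recent = timings[-window:]
            spread = statistics.pstdev(recent) / statistics.mean(recent)
            if spread <= tolerance:
                return timings, True
    return timings, False


# Simulated timings in seconds: a slow first pass, then steadier repeats.
simulated = iter([9.0, 5.0, 4.1, 4.0, 4.0, 4.0])
timings, converged = run_until_converged(lambda: next(simulated))
print(len(timings), converged)
```

With a rule like this, a benchmark whose timings keep bouncing around never satisfies the tolerance and simply runs out of iterations, which matches the cases that failed to converge.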
The results (see charts and tables below) will look like a mind-numbing sea of numbers to anyone, but a few patterns stood out:
- Google was the fastest overall. The three Google instances completed the benchmarks in a total of 575 seconds, compared with 719 seconds for Amazon and 834 seconds for Windows Azure. A Google machine had the fastest time in 13 of the 14 tests. A Windows Azure machine had the fastest time in only one of the benchmarks. Amazon was never the fastest.
- Google was also the cheapest overall, though Windows Azure was close behind. Executing the DaCapo suite on the trio of machines cost 3.78 cents on Google, 3.8 cents on Windows Azure, and 5 cents on Amazon. A Google machine was the cheapest option in eight of the 14 tests. A Windows Azure instance was cheapest in five tests. An Amazon machine was the cheapest in only one of the tests.
- The best option for misers was Windows Azure’s Small VM (one CPU, 6 cents per hour), which completed the benchmarks at a cost of 0.67 cents. However, this was also one of the slowest options, taking 404 seconds to complete the suite. The next cheapest option, Google’s n1-highcpu-2 instance (two CPUs, 13.1 cents per hour), completed the benchmarks in less than half the time (193 seconds) at a cost of 0.70 cents.
- If you cared more about speed than money, Google’s n1-standard-8 machine (eight CPUs, 82.9 cents per hour) was the best option. It turned in the fastest time in 11 of the 14 benchmarks, completing the entire DaCapo suite in 101 seconds at a cost of 2.32 cents. The closest rival, Amazon’s m3.2xlarge instance (eight CPUs, 90 cents per hour), completed the suite in 118 seconds at a cost of 2.96 cents.
- Amazon was rarely a bargain. Amazon’s m1.medium (one CPU, 12 cents per hour) was both the slowest and the most expensive of the one-CPU instances. Amazon’s m3.2xlarge (eight CPUs, 90 cents per hour) was the second fastest instance overall, but also the most expensive. However, Amazon’s c3.large (two CPUs, 15 cents per hour) was truly competitive: nearly as fast overall as Google’s two-CPU instance, and faster and cheaper than Windows Azure’s two-CPU machine.
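The cost figures in the list above follow from simple arithmetic: elapsed seconds divided by 3,600, multiplied by the hourly rate. The sketch below reworks four of the quoted figures. It assumes charges are pro-rated by the second and ignores any minimum billing increment a provider may apply. The helper name `cost_in_cents` is illustrative. Because the quoted times are rounded to whole seconds, a result can differ from the published figure by about a hundredth of a cent.

```python
def cost_in_cents(seconds, cents_per_hour):
    """Cost of keeping an instance busy for `seconds` at an hourly rate."""
    return seconds / 3600 * cents_per_hour


# (suite time in seconds, hourly rate in cents), as quoted in the text
quoted = {
    "Windows Azure Small VM": (404, 6.0),
    "Google n1-highcpu-2": (193, 13.1),
    "Google n1-standard-8": (101, 82.9),
    "Amazon m3.2xlarge": (118, 90.0),
}

for name, (seconds, rate) in quoted.items():
    print(f"{name}: {cost_in_cents(seconds, rate):.2f} cents")
```

The same calculation shows why a faster, pricier instance can still be the cheaper way to finish a job: doubling the hourly rate costs nothing extra if the work completes in half the time.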
These general observations, which I drew from the “standing start” tests, are also borne out by the results of the “converged” runs. But a close look at the individual numbers will leave you wondering about consistency.
Some of this may be due to the randomness hidden in the cloud. While the companies make it seem like you’re renting a real machine that sits in a box in some secret, undisclosed bunker, the reality is that you’re probably getting assigned a thin slice of a box. You’re sharing the machine, and that means the other users may or may not affect you. Or maybe it’s the hypervisor that’s behaving differently. It’s hard to know. Your speed can change from minute to minute and from machine to machine, something that usually doesn’t happen with the server boxes rolling off the assembly line.