When moving to AWS, it can be difficult to pinpoint where your previously high-functioning system is underperforming. AWS has its own internal rhyme and reason, and anyone moving to AWS must be familiar with how AWS operates. One tool that can help you on this front is AWS CloudWatch, which provides metrics and monitoring on core system performance. The focus of this article is how to configure your IO subsystem in AWS. This requires using Cloudwatch to understand your subsystem’s performance. Are you hitting throughput limits? Are you utilizing all your allocated IOPS? What’s the average queue length?
To define all these terms from scratch is out of the scope of this article – we assume you have some basic familiarity with AWS I/O terminology or at minimum have some links to get you started. Our goal here is instead to provide some high-level guidelines on how to begin thinking through your I/O performance in AWS.
It’s Not Just the Volume
Let’s take a step back for a second to let the implications of using EBS with our instances sink in. EBS (Elastic Block Store) is a type of durable, persistent, network-attached storage available for AWS EC2 instances. Emphasis here goes on “network-attached”: whenever you read/write to an EBS volume, there is an unavoidable network component involved that sits directly between the EC2 instance and the EBS network-attached storage. In many cases, this network component will be your bottleneck unless you are running on a particularly capable instance network setup (2), since it isn’t very difficult to surpass the aforementioned instance limits.
The first step in assessing your IO woes is validating this network connection from your instance to your EBS volume. To further complicate things, your performance is not only limited by the volume you provision, but the instance you attach said volume to! Amazon lists the network limits per instance type, ranging from 4k IOPS and 500MiB/s limits to 65K IOPS and 10GiB/s limits, so check your documentation! It is incredibly important that you take into consideration your expected disk throughput and IOPS when selecting an instance. A perfectly performant IO subsystem can’t outshine a low-throughput network connection to an instance, so be sure to test in every application.
When asking “what’s bottlenecking me now?”, the EBS volume type and the instance type have to be considered when setting up your system. You could provision 20000 IOPS to an io1 disk, but because of instance bottlenecks, max out at around 8000 IOPS for a throughput of 260MiB/s. That’s 12000 IOPS you’re paying for wasted, all because you forgot to consider your instance type as a bottleneck.
When consulting AWS documentation about your instance, one should be cautious about taking the listed IO limits at face value. In a fair bit of documentation, AWS achieves the listed instance benchmarks only by assuming a read-only workload in “perfect weather.” The reality is that these numbers are rounded approximations of best-case scenario performance, using block sizes that may not be relevant to your application. Read your documentation carefully and be prepared to test your assumptions. Of course, this is just the tip of the iceberg – you’ll need to weigh in other factors as well, such as the file system being used, the read/write workload, and whether options like encryption are enabled on the EBS volumes connected to the instance.
What About the Disk?
Go ahead and start monitoring your instance to quickly determine where you are particularly struggling in the IO department (if at all!). AWS CloudWatch metrics make it very easy to validate if your high throughput I/O application is having disk-related performance bottlenecks. Combining this with standard disk benchmarking tools at the OS level will allow clear picture of your IO subsystem to emerge. We’ve found it is fairly simple to get an IO subsystem to outperform the EBS-to-Instance network connection (RAID setups come to mind), so after validating the network pipe limitations, you’ll be much better equipped to provision your EBS volume appropriately for desired IOPS and/or throughput.
Appropriate drive selection depends heavily on your primary performance attribute (2), IOPS or bandwidth. For higher IOPS applications, general SSD’s are available in general purpose burst type (gp2) and provisioned (io1) formats. SSD’s are more expensive than their HDD counterparts, and are optimized around low latency smaller block size operations. Most general purpose and mission critical applications are just a matter of appropriately sizing one of these two SSD types. The other kinds of EBS storage options offered by AWS are throughput optimized (st1) and cold (sc1) HDD volumes. What is unique about these storage types is their maximum bandwidth per drive, 500MiB/s and 250MiB/s respectively. These HDD volumes require a larger block size and a preferably sequential workload to labor effectively due to their painfully low IOPS limits (500/250 IOPS respectively).
At the end of the day, most AWS applications outside of big data/log processing will boil down to a choice between either io1 or gp2 volumes. HDD volumes have their place, but they must be carefully considered due to their extreme IOPS limitations even though their admirably high bandwidth limits might be appealing.
EBS snapshots: A Lesson in Prudence
Let’s turn now to some of the functionality provided in tandem with EBS volumes. AWS offers powerful snapshot functionality in conjunction with EBS volumes, allowing the user to make point-in-time backups of a given volume. Better yet, these snapshots are incremental – your first snapshot of a volume will take time, but any subsequent snapshot of the same volume will only snapshot blocks that have been modified since the last snapshot. AWS accomplishes this through a bit of smoke and mirrors that ultimately boils down to two things: pointers and S3. In short, all snapshotted blocks of a given volume are stored in S3. When a new EBS volume is spun up from a snapshot, AWS will initialize the volume using the newest blocks from all snapshots associated with the original drive. This means that an end user can set up a rolling backup schedule for a given EBS volume, and be assured of both data integrity to a point in time, as well as speedy incremental snapshots.
Hearing all this, it is easy to get carried away. Incremental snapshots + speedy initialization of a drive from a snapshot + programmatic control of instances and volumes via AWS API calls = all sorts of possibilities, right? One might think of backup systems, easy setup / maintenance of slave nodes or QA environments, or perhaps a data refresh scheme using snapshots? Sounds good, but stop for a second and remember: THERE IS NO FREE LUNCH. Not here, not anywhere. We’re dealing with computers, and everything has a trade-off cost. So where are we paying here?
Remember when we mentioned how AWS stores snapshot blocks in S3? So how do we get those blocks back from S3, an entirely separate system, into a drive that we initialize from a snapshot? Indeed, AWS does you no favors (or perhaps one too many favors!) on this – when you navigate a volume spun up from a snapshot, all your files look like they’re there!
Or are they? Perhaps you might notice slight performance degradation when you access a file from your snapshot-initialized volume. The first time you open a file, your IO performance is poor; the second time, much better. This pattern repeats for every file on your volume.
There it is – no free lunch. When a drive is initialized from a snapshot (whose blocks are stored on S3) and then attached to an instance, AWS fetches blocks from S3 lazily. Until you access a given block on your disk, it is not physically on your volume. So the first time you access a block, you’re paying two costs – network latency to fetch from S3, and then IO latency for reading the block from disk. And to make things worse – recall the rest of the article on how an EBS volume is essentially storage accessed through a network, which means you have more layers of latency and bottlenecks to work through for an IO read!
Okay, so what can we do? Well, if your volume is small, and / or you have time on your side, there are a few options. The main option suggested by AWS is outlined here: in essence, use the dd tool(3) to pull all blocks from your newly-initialized volume, and redirect the output to /dev/null. In other words, read everything such that all blocks are pulled from S3 (the drive is “pre-warmed”), and then discard those reads. Now you’ve eliminated the S3 latency penalty.
Of course, depending on the size of your drive, this process can be prohibitively slow. It also requires extra manual work or automation from your engineers, which can backfire depending on the complexity of your implementation.
None of this is meant to deter you from making use of snapshots – they’re a wonderful, well-implemented tool. But as with any tool, they come with drawbacks and gotchas, and this is a biggie. So just remember: EBS snapshots are back-ended by S3, and blocks for snapshot-initialized volumes are fetched lazily.
AWS is Not Magic
Your best bet is to test the network connection directly via an I/O load test utility (such as SQLIO.exe or diskspd.exe in the Windows world). Generating an insanely high IO/bandwidth workload is very easy when you’re using the right utility, and it allows you to validate the EBS-to-Instance pipe and disk limitations at not only the AWS layer (CloudWatch), but at the OS layer as well. You can easily find the network pipe limit in terms of bandwidth or I/O using an overly performant drive setup and a correctly tuned load tester utility, such as RAID 0 volumes and diskspd.exe (RAID0 being what it is, of course, exercise caution!).
AWS offers flexibility and power that far exceeds that of your average data center, but (at risk of a groan) with great power does indeed come with great responsibility. You MUST be prepared to play by AWS’s rules and not go by conventional datacenter wisdom. This means a few things:
- Live and breathe AWS documentation.
- At the same time, believe nothing and test everything in documentation.
- Perform your proof-of-concepts at scale and as robustly as possible.
- Remember that AWS services are nothing more than complex abstractions – you will face bottlenecks from things AWS lists, and bottlenecks from things they never mention once. We refer you back to points one through three.
If anything else, the intent of this article is to convey the complexity of moving to AWS through the lens of configuring I/O, and to offer some suggestions and principles on how to navigate the world of AWS. The payoff of moving to AWS can be wonderful, but AWS is not magic. Learn to navigate by its rules, and you’ll be rewarded – otherwise, you’ll wonder what all the hype was about!