Storage Design in 4 Easy Steps
I’ve worked in IT operations for 20 years, and if there’s one lesson I’ve learned, it’s that you should ALWAYS overbuild. Yes, I’ll say it again: ALWAYS overbuild; as a result, I’ve always had happy storage customers. This will be the first of a two-part series on storage for those of you who may be new to the field or are just looking for solid design guidance. First we’ll look at how to design arrays for performance, then we’ll cover some of the softer qualities of arrays that may influence your decision.
This week we’ll cover what I consider my four fundamental design pillars.
Storage Capacity (The Easy Part)
In general, this is the easy part of the storage design. A customer or the business may come to you and say: “Mr. Engineer, we currently have an array that’s 100TB and we anticipate 30% growth every year; please design us an array. Oh, and by the way, we’re doing 50,000 IOPS…” Many times this is all the information people will think to give you.
Either way, this first requirement is the simple part of the task. If we wanted to lay out a three- and five-year sizing, it would look like this:
| Year | Growth by Percent | Sizing Required in TB |
|---|---|---|
| Year One End | 30% | 130 |
| Year Two End | 30% | 169 |
| Year Three End | 30% | 220 |
| Year Four End | 30% | 286 |
| Year Five End | 30% | 371 |
The math is pretty straightforward: you simply add 30% to each year’s total, every year. As you can see, there’s a massive difference from year one to year five. Most people don’t appreciate how aggressive this compound rate of growth can be.
Finally, make sure you’re following your vendor’s best practices for capacity utilization. Most vendors don’t suggest running their arrays 100% full, and most companies seem to like to run their arrays around 80% full. That will very much depend on your company, but add whatever “head room” you want to your sizing.
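Putting the two ideas together, here’s a minimal sketch of the compound-growth sizing plus headroom. The 100TB starting point, 30% growth, and 80% fill target are the numbers from this article; swap in your own.

```python
# Compound the current used capacity by the growth rate, then add headroom
# so the array is never run completely full (80% fill target assumed here).

def size_capacity(start_tb, growth_rate, years, max_fill=0.8):
    """Return (usable_tb_needed, raw_tb_to_buy) after `years` of growth."""
    usable = start_tb * (1 + growth_rate) ** years
    raw = usable / max_fill  # leave headroom per vendor best practice
    return round(usable), round(raw)

for year in range(1, 6):
    usable, raw = size_capacity(100, 0.30, year)
    print(f"Year {year}: {usable} TB used -> buy ~{raw} TB raw")
```

Year five works out to 371TB used, meaning roughly 464TB of raw capacity at an 80% fill target.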
Workload (The Holy Trinity)
This is where we start to get into a tricky area, and where the hard work comes in. There are three factors that will affect the performance of our array. Watch out for any storage vendor who comes in and simply states “our array will give you X IOPS!” It’s most likely a lie… All-flash arrays (AFAs) do much better here, but there are still tradeoffs among these three things. Let’s look at each characteristic more closely:
2a. IOPS & Block Size
On a hybrid or traditional array, IOPS and block size are closely related. If you think about a simple hard drive, you can format it with a block size. If you’ve ever worked with MS SQL, you’ve probably formatted your hard drives with a 64k block size. That’s because SQL (in general) writes larger chunks of data to disk than, say, a web server saving tweets or social media posts. When you save a large chunk of data, the hard drive’s write head simply has to travel farther along the platter. A larger chunk of data like 64k or 128k may take 6ms, while a small chunk like 4k may take 1ms. On a traditional or hybrid storage array, larger data blocks simply mean fewer IOPS. To compensate, you will generally need to stripe the data across more spindles so your workload isn’t bottlenecked.
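A quick back-of-the-envelope sketch of why block size drives spindle count: bigger blocks mean fewer IOPS per drive, so you need more drives to hit the same target. The per-drive IOPS figures below are illustrative assumptions, not vendor specs.

```python
import math

# Assumed IOPS a single spindle can deliver at each block size (illustrative).
DISK_IOPS_BY_BLOCK = {4: 180, 64: 120, 128: 80}

def spindles_needed(target_iops, block_kb):
    """Minimum drive count to reach target_iops at a given block size."""
    per_disk = DISK_IOPS_BY_BLOCK[block_kb]
    return math.ceil(target_iops / per_disk)

print(spindles_needed(50_000, 4))    # 278 drives for a small-block workload
print(spindles_needed(50_000, 128))  # 625 drives for a large-block workload
```

Same 50,000 IOPS target, more than double the spindles once the blocks get large; that’s the striping penalty in action.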
Sizing IOPS for the future is another key point. You saw how much our storage capacity grew from year one to year five. If you did a straight linear projection on your IOPS, you would calculate your required IOPS something like this:
| Year | Growth by Percent | IOPS Required |
|---|---|---|
| Year One End | 30% | 65,000 |
| Year Two End | 30% | 84,500 |
| Year Three End | 30% | 109,850 |
| Year Four End | 30% | 142,805 |
| Year Five End | 30% | 185,647 |
HOWEVER, that may be a mistake… Many companies are adding more workload to their arrays, meaning the actual work required per unit of storage is going up! We always need to ask the business what kind of new projects they may have that will drive storage performance. We may want to assume that the performance requirement per TB will increase over the next three to five years at a rate of 10%, or whatever number the business has given us. What do our calculations look like in that case?
First we start by calculating our current IOPS per TB: in our case, 50,000 IOPS divided by 100TB is 500 IOPS per TB. We’ll add 10% to that for year one, and so on. Here’s the table calculating the IOPS per TB required over a five-year period:
| Year | Growth by Percent | Sizing Req. in TB | IOPS per TB Growth | IOPS per TB Required |
|---|---|---|---|---|
| Year One End | 30% | 130 | 10% | 550 |
| Year Two End | 30% | 169 | 10% | 605 |
| Year Three End | 30% | 220 | 10% | 666 |
| Year Four End | 30% | 286 | 10% | 732 |
| Year Five End | 30% | 371 | 10% | 805 |
So by the end of the fifth year you will need 371TB of capacity, and each of those TB needs to perform 805 IOPS. Your total IOPS sizing would be:

371TB × 805 IOPS/TB = 298,655 IOPS
This doesn’t mean you need to buy this performance day one, but what it does mean is that the storage processors you select should be able to handle that performance if you’re going to keep the array for five years.
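The five-year math above can be sketched in a few lines: capacity compounds at 30% per year while IOPS density (IOPS per TB) compounds at 10% per year, so total IOPS grows far faster than the straight linear projection.

```python
# Combined capacity-growth x density-growth sizing from the tables above.

def total_iops(start_tb, start_iops, cap_growth, density_growth, years):
    tb = start_tb * (1 + cap_growth) ** years
    iops_per_tb = (start_iops / start_tb) * (1 + density_growth) ** years
    return tb * iops_per_tb

year5 = total_iops(100, 50_000, 0.30, 0.10, 5)
# ~299,000 IOPS, close to the article's 298,655 (which used rounded table
# values), and well above the 185,647 from the linear projection.
print(round(year5))
```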
2b. Bandwidth or Throughput
Bandwidth (in general) is the other side of the IOPS coin. When we talked about how IOPS go down with large block sizes, what we didn’t say is that bandwidth goes up! I often use this analogy with customers: you have 20 students who need to get from point A to point B, and there are two options for student transport.
- The school bus: you load all the kids on the bus and drive at a safe rate of speed.
- The sport bike: you load one kid up at a time, but you’re blazing down the road at 120MPH!
Doesn’t option two sound like more fun? Well, that’s probably why storage vendors measure things in IOPS. No storage array gets 1 MILLION GB/s OF BANDWIDTH! However, storage arrays can get 1 MILLION IOPS! AAAAAAH, marketing madness! So you really need to understand what’s more important to the applications sitting on top of your storage array: IOPS or bandwidth.
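The two numbers are connected by simple arithmetic: throughput is just IOPS multiplied by block size. That’s why a “1 million IOPS” headline means very different things at 4k versus 128k blocks.

```python
# Convert an IOPS figure at a given block size into MB/s of throughput.

def bandwidth_mb_s(iops, block_kb):
    return iops * block_kb / 1024  # KB/s -> MB/s

print(bandwidth_mb_s(1_000_000, 4))  # 3906.25 MB/s: 1M IOPS of tiny blocks
print(bandwidth_mb_s(10_000, 128))   # 1250.0 MB/s: far fewer, much larger IOs
```

The school bus (10k big IOs) moves a third of the data of the sport bike’s million-IOPS headline; know which one your application actually needs.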
2c. Latency
Latency is what you will generally hear end users complain about; it’s commonly thought of as the response time of the array. For traditional or hybrid arrays, latency is affected by block size and throughput. As we called out in point 2a, the larger the block size, the slower the response time. This is really where all-flash arrays shine: within reason, they don’t suffer the same penalties as spinning disks, because there are no heads floating over a platter, just a simple electronic bit change. Our options for reducing latency are to spread the writes out over more disks (making each individual write smaller), leverage caching, or move to a hybrid or all-flash array.
3. Caching
It’s critical to understand how caching works in any array you’re going to look at. In a traditional array, the cache is either flash disk or sometimes RAM that the “working set” is moved into, to achieve better performance than spinning disk can provide.
There are really two types of caching:
1. Where data is moved up from a slower disk type to a faster disk type, and back, depending on access patterns
2. Where data is held in a centralized cache, in addition to its main storage area, for faster response
In EMC terms, type one is referred to as FAST VP and type two as FAST Cache. Almost all arrays come with some form of “FAST Cache”-style technology these days; they just work very differently.
Some forms of caching are always on; high-end arrays like EMC’s VMAX are constantly looking at the data you’re using and will promote that data into their cache tier. Mid-tier arrays like the VNX and Unity tier on a schedule, which can be problematic for certain workloads. If you have an important batch job that only runs once a week, its data will most likely tier down to slow disk; when you need to run that job, the response time could be quite slow. Again, you have to understand your application, the use cases, and even the timing of when things will run.
Another thing to understand about type one caching is that implementations are not all created equal. I was once in a situation where we were competing with another vendor. The competing array seemed less expensive, but when you looked at the technical details, its caching tier could only work on block sizes of 64k or less. From the data we had gathered, the average block size for this customer was 128k and greater, so that cache wouldn’t have been much use to them. It’s CRITICALLY important to read the in-depth white papers on technologies you’re going to invest in. Some people buy the cheapest product, and it ends up causing issues in their environment for the next three to five years…
4. Skew or Working Set
The skew, or working set, is important because when you’re designing a traditional or hybrid array, you want to understand how much flash or performance disk you need. Out of our original 100TB array, how much of that data is “hot” at any given point? Most storage vendors’ performance tools will show you your working set, and it absolutely should be used in the design of the next array. For most of the customers I see, the working set is about 20%. That means if you have a storage array with 100TB of used capacity, only 20TB of that data will be worked on at any given time. If you want outstanding performance, you may choose to put 20TB of SSD in your array and let the inactive data sit on a decent tier of spinning disk, like 10,000 RPM spindles.
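The tier split above is just a percentage, but it’s worth writing down because the flash tier has to track the *grown* capacity, not today’s. A small sketch, using this article’s 20% rule-of-thumb skew:

```python
# Size the flash tier to the hot fraction of used capacity; everything else
# can sit on slower spinning disk. The 20% skew is a rule of thumb, not a
# measured value -- use your vendor's performance tools to get the real number.

def tier_split_tb(used_tb, skew=0.20):
    """Return (flash_tb, spinning_tb) for a hybrid array layout."""
    flash = used_tb * skew
    return flash, used_tb - flash

print(tier_split_tb(100))  # (20.0, 80.0): today's array
print(tier_split_tb(371))  # the same skew applied to the year-five capacity
```

Note that at year five the same 20% skew calls for roughly 74TB of SSD, not 20TB; working-set sizing compounds right along with capacity.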
Odds and Ends
Of course, I don’t do all these calculations by hand every time I need to size an array. I have a spreadsheet I’ve created that helps me with my work, which I’m posting to share. You can find the download link here: Steven’s Sizing Sheet