by Brett Daniel on Apr 22, 2020 3:53:16 PM
There are a bunch of RAID levels, so many, in fact, that it can be a challenge to distinguish between the differences and benefits of each. In this blog post, we'll discuss a few common RAID levels, highlighting both their differences and benefits.
Update 05/05/2020: Check out our blog post on JBOD vs. RAID to learn more about RAID and how it relates to JBOD enclosures.
When choosing a RAID level for your storage array, it’s important to consider what you value most: speed, fault tolerance or both.
In this blog post, we’ll review five common RAID levels - RAID 0, RAID 1, RAID 5, RAID 6 and RAID 10 - as well as how each level stores data. We’ll also briefly review software RAID vs. hardware RAID, as well as the history of the RAID technology itself. By the end, you should be ready to rock ‘n RAID.
And if you're interested in a made-in-USA rugged storage system that can utilize these RAID types, don't hesitate to drop us a line to learn more about what we can do for you.
Photo: A Trenton Systems JBOD/JBOF Rugged Storage unit. The 24 NVMe SSDs on this unit can be RAID-configured for better performance and a higher degree of fault tolerance.
What is RAID storage?
Redundant Array of Independent Disks (RAID) is a storage technology that creates a data loss fail-safe by merging two or more hard disk drives (HDDs) or solid-state drives (SSDs) into one cohesive storage unit, or array.
RAID storage protects against the total loss of a disk drive’s data by repeating or recreating that data and storing it on the additional drive or drives, a process also known as data redundancy.
Total data loss, which may occur in the event of a disk drive failure, can devastate businesses and organizations that require frequent access to stored information to carry out their daily responsibilities.
Graphic: Consequences of disk drive failure in mission-critical applications
Total data loss can be especially devastating for mission-critical applications, whereby a potential failure could result in financial loss, public disapproval, serious injury and even death. Thanks to RAID, mission-critical personnel can continue focusing on executing their essential duties without having to worry about the potentially dire ramifications of a disk drive failure.
But as you’ll see, not all levels of RAID storage protect against data loss. Of the five levels covered here, only one doesn’t. Configurations that offer data loss protection are referred to as “fault-tolerant.” This just means that the array will continue to function successfully and provide recoverable data in the event of a disk drive failure.
Graphic: Software RAID vs. Hardware RAID. Hard drive icon made by Surang at Flaticon
RAID Types: Software RAID vs. Hardware RAID
Implementation and management of RAID storage can be executed via software RAID, whereby a driver on a computer executes RAID processing, or by hardware RAID, whereby a dedicated RAID controller card, installed in one of the motherboard’s PCI Express slots, is used.
The effectiveness of software RAID depends on the processing power of the computer. It’s not ideal for complex RAID configurations. For those high-performance configurations, you’ll need a dedicated RAID controller, the sole purpose of which is to execute RAID processing.
Average desktop users can get by with software RAID, since most operating systems, including macOS and Windows, support RAID. Plus, software RAID is the cheaper option. Bigger, more complex RAID applications, however, will need to go with hardware RAID to achieve the highest possible performance.
History of RAID
The acronym “RAID” was coined by University of California, Berkeley computer scientists David Patterson, Garth Gibson and Randy Katz in their research paper, “A Case for Redundant Arrays of Inexpensive Disks (RAID),” presented at the 1988 annual conference of the Association for Computing Machinery’s Special Interest Group on Management of Data (SIGMOD). RAID originally stood for “Redundant Array of Inexpensive Disks,” reflecting the cost argument at the heart of Patterson, Gibson and Katz’s paper.
The trio argued that one array of multiple inexpensive disks could technically outperform their larger, more expensive counterpart: the mainframe disk drive.
Although the concept of combining disk drives to improve performance wasn’t a new one, the trio’s paper sparked commercial interest in RAID.
Several RAID levels have since been standardized by the Storage Networking Industry Association (SNIA).
Configurations are typically evaluated based on their level of fault tolerance, their read and write speeds and their storage capacity.
There are many RAID levels in use today, several of which are rare.
The most common RAID configurations are RAID 0, RAID 1, RAID 5, RAID 6 and RAID 10.
Graphic: RAID 0 configuration
What is RAID 0?
RAID 0, the simplest RAID storage design, utilizes data striping, a process that separates files into segments for storage.
The data segments are then distributed across all the disk drives in the array.
A RAID 0 setup increases read and write speeds, given that every drive contributes concurrently to the array’s overall read/write workload.
For example, if you’re storing a 1GB file in a two-disk RAID 0 configuration, that 1GB file is split so that each disk stores 500MB of it, cutting write time roughly in half.
Read time is also cut roughly in half, since retrieving 500MB from each of two disks at the same time takes less time than retrieving the entire 1GB file from one disk.
So, essentially, the more disks in the RAID 0 array, the faster the read and write speeds, at least until another component, such as the controller or the interface, becomes the bottleneck.
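To make the striping idea concrete, here is a minimal Python sketch. It is an illustration only, not how any particular RAID controller or driver works: the function names `stripe` and `unstripe`, the tiny 4-byte chunk size and the use of in-memory byte strings as "disks" are all assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list[bytes]:
    """Distribute fixed-size chunks of data across disks in round-robin order."""
    disks = [bytearray() for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        target = (offset // chunk_size) % num_disks
        disks[target] += data[offset:offset + chunk_size]
    return [bytes(disk) for disk in disks]


def unstripe(disks: list[bytes], chunk_size: int = 4) -> bytes:
    """Reassemble the original data by reading chunks back in the same order."""
    total = sum(len(disk) for disk in disks)
    offsets = [0] * len(disks)
    out = bytearray()
    current = 0
    while len(out) < total:
        start = offsets[current]
        out += disks[current][start:start + chunk_size]
        offsets[current] += chunk_size
        current = (current + 1) % len(disks)
    return bytes(out)


data = b"ABCDEFGHIJKLMNOP"
disks = stripe(data, num_disks=2)
print(disks)            # each disk holds half of the file
print(unstripe(disks))  # the original data, reassembled
```

Each simulated disk ends up holding only half of the file, which is why both disks can work on a read or write at the same time. It is also why losing either one makes the file unrecoverable.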
Now, why might you want to use a RAID 0 disk configuration?
RAID 0 is great for storage applications that require swift read and write speeds and have a relatively low risk of total data loss.
It’s perfect for PC gamers, who generally prefer shorter save and load times, as well as photographers, videographers and music producers, who frequently save and load large files using editing software.
Unfortunately, RAID 0 lacks data redundancy, so it is not a fault-tolerant array. If one of the disk drives in the array fails, all the data is lost.
In other words, RAID 0 should be avoided like the plague in mission-critical applications, where a total loss of data could have catastrophic consequences.
An added plus, however, is that RAID 0 users can utilize the entire capacity of the disk drives. So, if you’re using four 1TB disk drives in your RAID 0 array, you have access to 4TB of space.
This is not the case in the other common RAID configurations, where mirrored copies or parity data are created to improve fault tolerance and, as a result, take up some of the array’s space.
Graphic: RAID 1 configuration
What is RAID 1?
RAID 1 utilizes disk mirroring, which creates copies of the same file for storage.
In RAID 1, the original file is stored on one disk drive, and identical copies of the file are stored on the other disk drives in the array.
As a result, RAID 1 produces disk drives that are mirrored copies of each other.
Unlike RAID 0, RAID 1 provides data redundancy, creating a fault-tolerant array.
So, in a two-disk RAID 1 configuration, if one disk drive fails, the second disk drive contains the same data, so nothing is lost and the data can be easily recovered. As a result, fault tolerance has been achieved.
That protection costs capacity: two 1TB drives in RAID 1 provide 2TB of total capacity but only 1TB of usable storage.
The same logic extends to larger arrays: in a five-disk RAID 1 configuration, if three disk drives fail, the fourth and fifth disks still hold two complete copies of the same data.
The more disk drives in the array, the more copies of the files that are created, and in turn, the greater the degree of fault tolerance.
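The mirroring behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `MirrorArray` class, its method names and the idea of modeling each disk as a dictionary are hypothetical and exist only for illustration.

```python
class MirrorArray:
    """Toy RAID 1 array: every write goes to every healthy disk."""

    def __init__(self, num_disks):
        # Each "disk" is modeled as a dict mapping file names to contents.
        self.disks = [{} for _ in range(num_disks)]
        self.failed = set()

    def write(self, name, data):
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[name] = data

    def fail(self, index):
        """Simulate a drive failure by discarding that disk's contents."""
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, name):
        # Any surviving mirror can serve the read.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[name]
        raise OSError("all mirrors have failed")


array = MirrorArray(num_disks=5)
array.write("report.txt", b"quarterly numbers")
for index in (0, 1, 2):
    array.fail(index)
print(array.read("report.txt"))  # still readable from a surviving disk
```

With five mirrors, the file stays readable after three failures and only becomes unavailable once every disk in the array has failed.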
RAID 1 is useful for mission-critical applications, where total loss of vital, sensitive data is unacceptable.
Doctor’s offices, accounting firms, law firms, banks, police departments, health departments and other government offices could all benefit from a RAID 1 storage configuration.
But even average, everyday users can benefit from RAID 1’s data cloning abilities.
The gamers, the photographers, the videographers and the music producers, each of whom would undoubtedly benefit from the increased read and write speeds of RAID 0, would arguably be better off sacrificing the speed of RAID 0 for the data redundancy and fault tolerance of RAID 1.
RAID 0 vs. RAID 1
When deciding between RAID 0 and RAID 1, keep in mind that RAID 1 users can take comfort in knowing that, in the event of a disk drive failure, their files have been safely duplicated.
Because RAID 1 data is not segmented, every surviving disk holds complete files, so nothing has to be pieced back together after a disk drive failure.
RAID 1 should not replace regular backups, however.
Another downside of the classic two-disk RAID 1 configuration is that its storage capacity is only half of its total disk drive capacity.
So, if you’re using two 2TB disk drives, for a total of 4TB of storage capacity, you’re technically being allotted 2TB of space because the second drive contains the same data as the first.
Now, if you add a 2TB disk drive to that same RAID 1 array, you’re still only being allotted 2TB of space, because now, the other two 2TB disk drives contain the same data as the first.
And in a four-disk RAID 1 array with, for example, two 2TB disk drives and two 4TB disk drives, for a total of 12TB of storage capacity, you’re still only being allotted 2TB of space.
Why? Because if you’re saving 2TB of data to the array, the first 2TB drive is filled with the original data, the second 2TB drive is filled with a copy of that data, and each of the two 4TB drives holds another copy that fills only half of its capacity.
So, in this setup, you’d have 6TB being used for data protection and 4TB of unused space.
Compare each of these scenarios to a RAID 0 configuration, in which the array’s storage capacity would be equivalent to the total disk drive capacity.
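The RAID 1 capacity arithmetic above can be checked with a short Python sketch. The function name `raid1_breakdown` and its return format are assumptions for illustration; the rule it encodes (usable space equals the smallest drive) follows from the scenarios described above.

```python
def raid1_breakdown(drive_sizes_tb):
    """Split a RAID 1 array's raw capacity into usable, mirrored and unused TB."""
    usable = min(drive_sizes_tb)                    # limited by the smallest drive
    mirrored = usable * (len(drive_sizes_tb) - 1)   # one copy on every other drive
    unused = sum(drive_sizes_tb) - usable - mirrored
    return {"usable": usable, "mirrored": mirrored, "unused": unused}


print(raid1_breakdown([2, 2]))        # two 2TB drives
print(raid1_breakdown([2, 2, 2]))     # three 2TB drives
print(raid1_breakdown([2, 2, 4, 4]))  # two 2TB drives and two 4TB drives
```

For the mixed four-disk array, the breakdown comes out to 2TB usable, 6TB of mirrored copies and 4TB unused, matching the figures above.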
The trade-off for RAID 0 is fault tolerance, and the trade-off for RAID 1 is speed and efficiency.
This RAID stuff can get rather complex, huh? Well, you haven’t seen anything yet!
Be sure to check out Synology’s RAID storage calculator to test out different RAID arrays and storage combinations.
Graphic: RAID 5 configuration
What is RAID 5?
RAID 5 is perhaps the most common RAID configuration, and unlike RAID 0 and RAID 1, requires a minimum of three disk drives to function.
RAID 5 utilizes data striping, whereby data are separated into segments and stored onto the separate disk drives in the array.
But RAID 5 also utilizes a process called parity, whereby parity checksums are calculated from the data segments and distributed across the drives in the array as well.
So, in a four-disk RAID 5 array, the data and their parity checksums would be striped and distributed onto all four drives for safekeeping.
And in the event of a disk drive failure, the parity checksums allow for the recreation of the stored data.
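As a rough illustration of how parity allows lost data to be recreated, the sketch below uses bytewise XOR, the operation commonly used for single parity. It models just one stripe of a four-disk array (three data blocks plus one parity block); the helper name `xor_blocks` and the sample data are assumptions, and a real array also rotates the parity block across drives, which is omitted here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second data block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # the lost block, recreated from the survivors
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block. That is also why a second simultaneous failure cannot be repaired with a single parity block.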
The downside to RAID 5 is that it can only withstand one disk drive failure.
Thankfully, RAID 5 arrays typically support hot swapping, meaning a failed disk drive can be replaced while the others in the array remain fully functional.
Unfortunately, if a second disk drive fails while the data from the first is being recreated, then all the data in the array are lost.
In terms of storage capacity, since the space of one disk drive is required to store the parity checksums, total array storage capacity in RAID 5 configurations is reduced by one whole drive.
For example, in a five-disk RAID 5 configuration with five 1TB disk drives, for a total of 5TB of storage capacity, only 4TB can be utilized since the parity checksums take up the space of one whole disk drive.
Similarly, in an eight-disk RAID 5 configuration with eight 2TB disk drives, for a total 16TB of storage capacity, only 14TB can be utilized.
RAID 5 outshines RAID 0 in terms of fault tolerance and offers more usable storage capacity than a RAID 1 array built from the same drives.
Like RAID 0, RAID 5 read speeds are fast due to the concurrent output contribution of each drive, but unlike RAID 0, the write speeds of RAID 5 suffer because the parity checksums must be calculated and written alongside the data.
RAID 5 vs. RAID 6
RAID 5 and RAID 6 are not so different. They both utilize the data striping and parity processes.
The main differences between the two configurations are that RAID 6 requires a minimum of four drives to function, and it utilizes double parity, whereby two checksums are created instead of one.
In RAID 6, two disk drives can fail without total data loss occurring. This means better fault tolerance than RAID 5, but it also means even slower write speeds since one additional checksum must be created.
Graphic: RAID 10 configuration
What is RAID 10?
RAID 10 utilizes both data striping and disk mirroring to achieve data redundancy and thus a high degree of fault tolerance.
RAID 10 is sometimes referred to as “RAID 1+0,” since it combines the mirroring and striping processes found in the RAID 1 and RAID 0 configurations, respectively.
In a RAID 10 configuration, which requires a minimum of four disks, data is segmented before being duplicated onto the drives in the array.
As in RAID 0, where a file is segmented and stored onto two separate disk drives in a two-disk configuration, RAID 10 segments each file, but each of those segments is also mirrored, requiring additional storage space.
A standard four-disk RAID 10 setup can withstand one drive failure in each mirrored pair of disk drives.
If both drives in the same mirrored pair fail, total data loss occurs.
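The failure rule for RAID 10 can be expressed as a small Python check. This is a sketch under an assumed layout in which disks 0 and 1 form one mirrored pair, disks 2 and 3 the next, and so on; the function name `raid10_survives` is hypothetical, and real controllers may pair drives differently.

```python
def raid10_survives(num_disks, failed):
    """Return True if every mirrored pair still has at least one healthy disk."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch expects an even number of disks, at least four")
    # Assumed layout: (0, 1), (2, 3), ... are the mirrored pairs.
    pairs = [(first, first + 1) for first in range(0, num_disks, 2)]
    return not any(a in failed and b in failed for a, b in pairs)


print(raid10_survives(4, {0}))     # one failure
print(raid10_survives(4, {0, 2}))  # one failure in each pair
print(raid10_survives(4, {0, 1}))  # both disks of the same pair
```

The first two cases leave the array intact. The third wipes out an entire mirrored pair, so the striped data can no longer be reassembled.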
And as with the standard two-disk RAID 1 configuration, total storage capacity of RAID 10 is halved. So, six 1TB disk drives will only net you 3TB of usable space.
Indeed, RAID 10 combines the strengths of RAID 0 and RAID 1, boasting fast read and write speeds and a high degree of fault tolerance, at the cost of half the array's total storage capacity.
RAID Configurations: Processes and Fault Tolerance
RAID Level | RAID 0 | RAID 1 | RAID 5 | RAID 6 | RAID 10 |
Process | Data Striping | Disk Mirroring | Striping + Parity | Striping + Double Parity | Mirroring + Striping |
Tolerance | Not Fault-Tolerant | Fault-Tolerant | Fault-Tolerant | Fault-Tolerant | Fault-Tolerant |
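To compare the five levels side by side, the capacity rules discussed in this post can be collected into one Python helper. Treat it as a sketch: the function name `usable_capacity_tb` is hypothetical, it assumes all drives are the same size, and the two-drive minimum for RAID 0 and RAID 1 plus the "total minus two drives" rule for RAID 6's double parity are standard figures rather than numbers stated above.

```python
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}


def usable_capacity_tb(level, num_drives, drive_tb):
    """Usable capacity in TB for an array of identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == 10 and num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    if level == 0:
        return num_drives * drive_tb        # striping only, nothing reserved
    if level == 1:
        return drive_tb                     # every drive holds the same data
    if level == 5:
        return (num_drives - 1) * drive_tb  # one drive's worth of parity
    if level == 6:
        return (num_drives - 2) * drive_tb  # two drives' worth of parity
    return num_drives // 2 * drive_tb       # RAID 10: half is mirrored


print(usable_capacity_tb(0, 4, 1))   # four 1TB drives in RAID 0
print(usable_capacity_tb(5, 8, 2))   # eight 2TB drives in RAID 5
print(usable_capacity_tb(10, 6, 1))  # six 1TB drives in RAID 10
```

The three printed cases reproduce the worked examples from earlier sections: 4TB for RAID 0, 14TB for RAID 5 and 3TB for RAID 10.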
Which RAID is best?
The best RAID configuration for your storage system will depend on whether you value speed, data redundancy or both.
If you value speed most of all, choose RAID 0.
If you value data redundancy most of all, remember that the following drive configurations are fault-tolerant: RAID 1, RAID 5, RAID 6 and RAID 10.
Determine your RAID goals by reviewing the following scenarios:
Q: Are you a large business or organization with multiple servers and numerous employees who need consistent access to the data stored on those servers?
A: Choose RAID 5, RAID 6 or RAID 10, and go with a hardware RAID controller.
Q: Are you a small business or organization where speed isn’t as much of a priority as proper record-keeping?
A: Choose RAID 1 or RAID 5, and choose your operating system’s software RAID driver.
Q: Are you a mission-critical business or organization where a loss of sensitive or other vitally important data could result in headache, financial ruin, serious injury or even death?
A: Choose RAID 6 or RAID 10, and choose a hardware RAID controller.
Q: Are you a gamer, photographer, videographer, music producer or other user who values speed and efficiency over fault tolerance?
A: Choose RAID 0 with your operating system’s software RAID driver. But be sure to conduct regular backups.
Conclusion
We hope this blog post helped clear up the differences between common RAID levels and how each of them can offer a unique benefit to your program or application.
Trenton Systems manufactures customizable rugged storage systems that can utilize RAID technology. These systems include our JBOF/JBOD and 5U Rugged Storage systems.
Not only do we offer RAID-capable rugged storage systems, but we design, manufacture, assemble, integrate and support these systems right here in the United States. That's right. We do it all in one USA facility to provide you with high-quality rugged computing solutions, cut down on potential security hazards, support the American economy and American jobs, and offer super fast, in-house customer service and support.
Trenton Systems creates rugged computer systems to help customers around the world meet their rugged computing needs. We stress-test our computer systems to the max, ensuring that customers can carry out industry-specific operations comfortably, effectively and smack dab in the middle of the world's harshest conditions. In other words, we stress so you don't have to.