RAID (Redundant Array of Independent Disks; originally, and more informally, “…Inexpensive Disks”) is a technology that uses two or more hard disk drives simultaneously to achieve better performance, greater reliability, larger data volumes, or a combination of these.
“RAID” is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple hard disk drives. RAID’s various designs all involve two key design goals: increased data reliability and increased input/output performance. When several physical disks are set up to use RAID technology, they are said to be in a RAID array. This array distributes data across several disks, but the computer user and the operating system see it as a single disk. RAID can be set up to serve several different purposes.
Purpose and basics
Redundancy means that extra data is written across the array, organized so that the failure of one (sometimes more) of the disks in the array will not result in loss of data. A failed disk may be replaced by a new one, and the data on it reconstructed from the remaining data and the extra data. A redundant array stores less data than the same disks could hold if used independently. For instance, a 2-disk RAID 1 array loses half of the total capacity that would otherwise have been available using both disks independently, and a RAID 5 array with several disks loses the capacity of one disk. Other RAID levels are arranged so that they are faster to write to and read from than a single disk.
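The reconstruction described above can be illustrated with XOR parity, the kind of "extra data" used by parity-based levels such as RAID 5. The following is a minimal sketch rather than a real controller's implementation: the helper name xor_blocks and the three sample blocks are hypothetical, and each block stands in for the contents of one disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of several equal-length blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three made-up data blocks, one per data disk.
data = [b"RAID", b"disk", b"test"]

# The parity block is the "extra data" written to a further disk.
parity = xor_blocks(data)

# Simulate the failure of the second disk: only the other data blocks
# and the parity block remain readable.
survivors = [data[0], data[2], parity]

# XOR of all surviving blocks yields the missing block again.
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

This works because XOR is its own inverse: combining the parity with every surviving data block cancels them out and leaves only the lost block. It also shows why one disk's worth of capacity is given up to parity.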
There are various combinations of these approaches giving different trade-offs of protection against data loss, capacity, and speed. RAID levels 0, 1, and 5 are the most commonly found, and cover most requirements.
RAID level 3 and RAID level 4 differ in the granularity of striping: RAID 3 uses byte-level striping with a dedicated parity disk, whereas RAID 4 uses block-level striping with a dedicated parity disk.
RAID can involve significant computation when reading and writing information. With traditional “real” RAID hardware, a separate controller does this computation. In other cases the operating system or simpler and less expensive controllers require the host computer’s processor to do the computing, which reduces the computer’s performance on processor-intensive tasks (see “Software RAID” and “Fake RAID” below). Simpler RAID controllers may provide only levels 0 and 1, which require less processing.
RAID systems with redundancy continue working without interruption when one, or sometimes more, disks of the array fail, although they are vulnerable to further failures. When the bad disk is replaced by a new one the array is rebuilt while the system continues to operate normally. Some systems have to be shut down when removing or adding a drive; others support hot swapping, allowing drives to be replaced without powering down. RAID with hot-swap drives is often used in high availability systems, where it is important that the system keeps running as much of the time as possible.
RAID is not a good alternative to backing up data. Data may become damaged or destroyed without harm to the drive(s) on which they are stored. For example, part of the data may be overwritten by a system malfunction; a file may be damaged or deleted by user error or malice and not noticed for days or weeks; and of course the entire array is at risk of physical damage.
RAID combines two or more physical hard disks into a single logical unit by using either special hardware or software. Hardware solutions are often designed to present themselves to the attached system as a single hard drive, so that the operating system is unaware of the technical workings. For example, if you configure a 1 TB RAID 5 array from three 500 GB hard drives in hardware RAID, the operating system is simply presented with a “single” 1 TB disk. Software solutions are typically implemented in the operating system, and present the RAID array as a single drive to applications running on that operating system.
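The capacity arithmetic behind examples like the one above can be sketched in a few lines. This is an illustrative helper only: the function name usable_capacity_gb is hypothetical, it assumes all disks in the array are the same size, and it covers only levels 0, 1, and 5.

```python
def usable_capacity_gb(level, disk_count, disk_size_gb):
    """Usable capacity of an array of equal-size disks (hypothetical helper)."""
    if level == 0:
        # Striping only: no redundancy, so every disk holds data.
        if disk_count < 2:
            raise ValueError("RAID 0 needs at least two disks")
        return disk_count * disk_size_gb
    if level == 1:
        # Mirroring: every disk holds the same copy.
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least two disks")
        return disk_size_gb
    if level == 5:
        # Distributed parity: one disk's worth of space goes to parity.
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disk_count - 1) * disk_size_gb
    raise ValueError("only levels 0, 1 and 5 are covered by this sketch")

# Three 500 GB drives in RAID 5 appear as a single 1000 GB (1 TB) disk.
print(usable_capacity_gb(5, 3, 500))
```

The same helper gives 500 GB for a 2-disk RAID 1 array of 500 GB drives, which is half of the raw capacity.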
There are three key concepts in RAID: mirroring, the copying of data to more than one disk; striping, the splitting of data across more than one disk; and error correction, where redundant data is stored so that problems can be detected and possibly fixed (known as fault tolerance). Different RAID levels use one or more of these techniques, depending on the system requirements. The main aim of a RAID setup can be to improve the reliability and availability of data, so that important data such as a database of customer orders remains available. Equally, its aim could merely be to improve the speed of access to files, for example on a system that delivers video-on-demand TV programs to many viewers.
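To make the difference between striping and mirroring concrete, the sketch below maps a logical block number to physical locations. It is a simplification under stated assumptions: the function names are hypothetical, the stripe unit is a single block, and blocks are distributed round-robin, which real implementations may not do in exactly this way.

```python
def stripe_location(logical_block, disk_count):
    """Striping: each logical block lives on exactly one disk.

    Returns (disk index, block offset on that disk), assuming a
    round-robin layout with a stripe unit of one block.
    """
    return logical_block % disk_count, logical_block // disk_count

def mirror_locations(logical_block, disk_count):
    """Mirroring: each logical block is copied to every disk at the same offset."""
    return [(disk, logical_block) for disk in range(disk_count)]

# With three striped disks, consecutive blocks land on different disks,
# so a sequential read can be served by several disks at once.
for block in range(6):
    print(block, stripe_location(block, 3))

# With two mirrored disks, either copy can serve a read of block 5.
print(mirror_locations(5, 2))
```

In the striped layout, losing any one disk loses part of every large file. In the mirrored layout, every block survives on the other disk.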
The configuration affects reliability and performance in different ways. The problem with using more disks is that it becomes more likely that one of them will fail, but with error checking the system as a whole can be made more reliable, because it can survive and repair the failure. Basic mirroring can speed up reading, as the system can read different data from each of the disks, but it may slow down writing if the configuration requires both disks to confirm that the data has been written correctly. Striping is often used for performance, since it allows sequences of data to be read from multiple disks at the same time. Error checking will typically slow the system down, as data needs to be read from several places and compared. The design of a RAID system is therefore a compromise, and understanding the requirements of the system is important. Modern disk arrays typically provide the facility to select the appropriate RAID configuration.
PC Format Magazine claims that “in all our real-world tests, the difference between the single drive performance and the dual-drive RAID 0 striped setup was virtually non-existent. And in fact, the single drive was ever-so-slightly faster than the other setups, including the RAID 5 system that we’d hoped would offer the perfect combination of performance and data redundancy”.
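The point that more disks make a failure more likely, while redundancy can still make the whole system more reliable, can be illustrated with a rough calculation. This sketch rests on strong simplifying assumptions that are not from the text above: disks fail independently, each with the same probability p over some fixed period, and no failed disk is replaced during that period. The function names and the figure of 5% are made up for illustration.

```python
def p_any_disk_fails(p_single, disk_count):
    """Probability that at least one of the disks fails (independence assumed)."""
    return 1 - (1 - p_single) ** disk_count

def p_two_or_more_fail(p_single, disk_count):
    """Probability that two or more disks fail.

    This is the data-loss case for an array that tolerates exactly
    one disk failure and is not repaired within the period.
    """
    p_none = (1 - p_single) ** disk_count
    p_exactly_one = disk_count * p_single * (1 - p_single) ** (disk_count - 1)
    return 1 - p_none - p_exactly_one

p = 0.05  # made-up failure probability for one disk over the period
n = 4     # number of disks in the array

print(p_any_disk_fails(p, n))    # chance that some disk in the array fails
print(p_two_or_more_fail(p, n))  # chance of data loss with single-failure tolerance
```

Under these assumptions, some disk in a four-disk array is more likely to fail than one particular disk, yet the chance of losing data is lower than for a single disk, because the array survives the first failure.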
Source: Wikipedia