RAID
*RAID 0+1: striped sets nested in a mirrored set (minimum four disks; even number of disks) provides fault tolerance and improved performance, but increases complexity.
:The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. The array continues to operate with one or more drives failed in the same mirror set, but if drives fail on both sides of the mirror the data on the RAID system is lost.
*RAID 1+0: mirrored sets in a striped set (minimum two disks but more commonly four disks to take advantage of speed benefits; even number of disks) provides fault tolerance and improved performance but increases complexity.
:The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. In a failed disk situation, RAID 1+0 performs better because all the remaining disks continue to be used. The array can sustain multiple drive losses so long as no mirror loses all its drives.<ref name="layton-lm">
Jeffrey B. Layton: "Intro to Nested-RAID: RAID-01 and RAID-10", Linux Magazine, January 6, 2011</ref>
*RAID 5+1: mirrored striped set with distributed parity (some manufacturers label this as RAID 53).
Whether an array runs as RAID 0+1 or RAID 1+0 in practice is often determined by the evolution of the storage system. A RAID controller might support upgrading a RAID 1 array to a RAID 1+0 array on the fly, but require a lengthy offline rebuild to upgrade from RAID 1 to RAID 0+1. With nested arrays, sometimes the path of least disruption prevails over achieving the preferred configuration.
==RAID Parity==
Many RAID levels employ an error protection scheme called "parity". Parity calculation, in and of itself, is a widely used method in information technology to provide fault tolerance in a given set of data.
In [[Boolean logic]], there is an operation called [[exclusive or]], or for short, "XOR", meaning "one or the other, but not both." For example:
: <code>0 XOR 0 = 0</code>
: <code>0 XOR 1 = 1</code>
: <code>1 XOR 0 = 1</code>
: <code>1 XOR 1 = 0</code>
The XOR operator is central to how parity data is created and used within an array; it is used both for the protection of data and for the recovery of missing data.
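As a quick illustration (a minimal Python sketch, not tied to any particular RAID implementation), the truth table above is exactly Python's bit-wise `^` operator, and the property that makes parity recovery possible is that XOR is its own inverse:

```python
# The XOR truth table, using Python's bit-wise ^ operator:
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} XOR {b} = {a ^ b}")

# The property that parity recovery relies on: XOR is self-inverse,
# so XOR-ing the same value in twice cancels it out (a ^ b ^ b == a).
a, b = 0b00101010, 0b10001110
assert (a ^ b) ^ b == a
```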
As an example, consider a simple RAID made up of 6 hard disks (4 for data, 1 for parity, and 1 for use as a hot spare), where each drive is capable of holding just a single byte's worth of storage. An initial RAID configuration with random values written to each of our four data drives would look like:
: <code>Drive #1: 00101010 (Data)</code>
: <code>Drive #2: 10001110 (Data)</code>
: <code>Drive #3: 11110111 (Data)</code>
: <code>Drive #4: 10110101 (Data)</code>
: <code>Drive #5: -------- (Hot Spare)</code>
: <code>Drive #6: -------- (Parity)</code>
Every time data is written to the data drives, a parity value is calculated in order to be able to recover from a disk failure. To calculate the parity for this RAID, the XOR of all drives' data is calculated. The resulting value is the parity data.
: <code>00101010 XOR (Drive 1 byte)</code>
: <code>10001110 XOR (Drive 2 byte)</code>
: <code>11110111 XOR (Drive 3 byte)</code>
: <code>10110101 (Drive 4 byte)</code>
: <code> (XOR applied bit-wise down the columns)</code>
: <code>11100110 (This is the value of the Parity byte)</code>
The parity data "11100110" is then written to the dedicated parity drive:
: <code>Drive #1: 00101010 (Data)</code>
: <code>Drive #2: 10001110 (Data)</code>
: <code>Drive #3: 11110111 (Data)</code>
: <code>Drive #4: 10110101 (Data)</code>
: <code>Drive #5: -------- (Hot Spare)</code>
: <code>Drive #6: 11100110 (Parity)</code>
In order to restore the contents of a failed drive, e.g. Drive #3, the same XOR calculation is performed against all the remaining drives, substituting the parity value (11100110) in place of the missing/dead drive:
: <code>00101010 XOR (Drive 1 byte)</code>
: <code>10001110 XOR (Drive 2 byte)</code>
: <code>11100110 XOR (Parity byte in place of failed Drive 3 byte)</code>
: <code>10110101 (Drive 4 byte)</code>
: <code> (XOR applied bit-wise down the columns)</code>
: <code>11110111 (This is the value of the failed Drive 3 byte)</code>
With the complete contents of Drive #3 recovered, the data is written to the hot spare, and the RAID can continue operating.
: <code>Drive #1: 00101010 (Data)</code>
: <code>Drive #2: 10001110 (Data)</code>
: <code>Drive #3: --Dead-- (Data)</code>
: <code>Drive #4: 10110101 (Data)</code>
: <code>Drive #5: 11110111 (Hot Spare)</code>
: <code>Drive #6: 11100110 (Parity)</code>
At this point the dead drive has to be replaced with a working one of the same size. When this happens, the hot spare's contents are then automatically copied to it by the array controller, allowing the hot spare to return to its original purpose as an emergency standby drive. The resulting array is identical to its pre-failure state:
: <code>Drive #1: 00101010 (Data)</code>
: <code>Drive #2: 10001110 (Data)</code>
: <code>Drive #3: 11110111 (Data)</code>
: <code>Drive #4: 10110101 (Data)</code>
: <code>Drive #5: -------- (Hot Spare)</code>
: <code>Drive #6: 11100110 (Parity)</code>
This same basic XOR principle applies to parity within RAID groups regardless of capacity or number of drives. As long as there are enough drives present to allow for an XOR calculation to take place, parity can be used to recover data from any single drive failure. (A minimum of three drives must be present in order for parity to be used for fault tolerance, since the XOR operator requires two operands, and a place to store the result.)
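The full parity-and-recovery walkthrough above can be condensed into a short Python sketch (the byte values are the ones from the example; `parity` and `recover` are illustrative helper names, not part of any real RAID controller API):

```python
from functools import reduce

def parity(blocks):
    """XOR all blocks together bit-wise; the result is the parity block."""
    return reduce(lambda x, y: x ^ y, blocks)

def recover(survivors, parity_block):
    """Rebuild one missing block from the surviving blocks plus parity."""
    return parity(survivors) ^ parity_block

# The four data bytes from the example above (Drives #1-#4):
drives = [0b00101010, 0b10001110, 0b11110111, 0b10110101]

p = parity(drives)
print(f"Parity byte: {p:08b}")        # 11100110, as written to Drive #6

# Simulate the failure of Drive #3 and rebuild its contents:
survivors = drives[:2] + drives[3:]   # Drives #1, #2 and #4
rebuilt = recover(survivors, p)
assert rebuilt == drives[2]           # 11110111 recovered
```

The same sketch scales to any number of data drives, matching the observation above that recovering from a single failure only needs the surviving members plus the parity block.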
==RAID 10 versus RAID 5 in Relational Databases==
A common myth (and one which serves to illustrate the mechanics of proper RAID implementation) is that in all deployments, RAID 10 is inherently better for relational databases than RAID 5, due to RAID 5's need to recalculate and redistribute parity data on a per-write basis.
While this may have been a hurdle in past RAID 5 implementations, the task of parity recalculation and redistribution within modern SAN appliances is performed as a back-end process transparent to the host, not as an in-line process which competes with existing I/O (i.e. the RAID controller handles this as a housekeeping task to be performed during a particular spindle's idle timeslices, so as not to disrupt any pending I/O from the host). The "write penalty" inherent to RAID 5 has been effectively masked over the past ten years by a combination of improved controller design, larger amounts of cache, and faster hard disks. The effect of a write penalty when using RAID 5 is mostly a concern when the workload has a high amount of random writes (such as in some databases), while in other workloads modern RAID 5 systems can be on par with RAID 10 performance.
In the vast majority of enterprise-level SAN hardware, any writes which are generated by the host are simply acknowledged immediately, and destaged to disk on the back end when the controller sees fit to do so. From the host's perspective, an individual write to a RAID 10 volume is no faster than an individual write to a RAID 5 volume; a difference between the two only becomes apparent when the write cache at the SAN controller level is overwhelmed, and the SAN appliance must reject or gate further write requests in order to allow write buffers on the controller to destage to disk. While rare, this generally indicates poor performance management on the part of the SAN administrator, not a shortcoming of RAID 5 or RAID 10. SAN appliances generally service multiple hosts which compete with one another both for controller cache and for spindle time. This contention is largely masked, in that the controller is generally intelligent and adaptive enough to maximize read cache hit ratios while also maximizing the process of destaging data from write cache.
The choice of RAID 10 versus RAID 5 for the purposes of housing a relational database depends upon a number of factors (spindle availability, cost, business risk, etc.) but, from a performance standpoint, it depends mostly on the type of I/O the database can expect to see. For databases that are expected to be exclusively or strongly read-biased, RAID 10 is often chosen because it offers a slight speed improvement over RAID 5 on sustained reads. If a database is expected to be strongly write-biased, RAID 5 becomes the more attractive option, since RAID 5 does not suffer from the same write handicap inherent in RAID 10; all spindles in a RAID 5 can be utilized to write simultaneously, whereas only half the members of a RAID 10 can be used. However, for reasons similar to those that have masked the write penalty in RAID 5, the reduced ability of a RAID 10 to handle sustained writes has been largely masked by improvements in controller cache efficiency and disk throughput.
What causes RAID 5 to be slightly slower than RAID 10 on sustained reads is the fact that RAID 5 has parity data interleaved within normal data. For every read pass in RAID 5, there is a probability that a read head may need to traverse a region of parity data. The cumulative effect of this is a slight performance drop compared to RAID 10, which does not use parity and therefore never encounters a circumstance where the data underneath a head is of no use. For the vast majority of situations, however, most relational databases housed on RAID 10 perform equally well on RAID 5. The strengths and weaknesses of each type only become an issue in atypical deployments, or deployments on overcommitted or outdated hardware.
There are, however, considerations other than performance which must be taken into account. RAID 5 and other non-mirror-based arrays offer a lower degree of resiliency than RAID 10 by virtue of RAID 10's mirroring strategy. In a RAID 10, I/O can continue even with multiple drive failures. By comparison, in a RAID 5 array, any simultaneous failure of more than one drive renders the array itself unusable, since parity recalculation becomes impossible to perform. For many, particularly in mission-critical environments with enough capital to spend, RAID 10 becomes the favorite as it provides the lowest level of risk.
Additionally, the time required to rebuild data on a hot spare in a RAID 10 is significantly less than in a RAID 5, because all the remaining spindles in a RAID 5 rebuild must participate in the process, whereas only half of all spindles ''need to participate'' in a RAID 10 rebuild. In modern RAID 10 implementations, all drives generally participate in the rebuilding process as well, but only half are required, allowing greater degraded-state throughput than RAID 5 and overall faster rebuild times.
Again, modern SAN design largely masks any performance hit while the RAID array is in a degraded state, by virtue of being able to perform rebuild operations either in-band or out-of-band with respect to existing I/O traffic. Given the rare nature of drive failures in general, and the exceedingly low probability of multiple concurrent drive failures occurring within the same RAID array, the choice of RAID 5 over RAID 10 often comes down to the preference of the storage administrator, particularly when weighed against other factors such as cost, throughput requirements, and physical spindle availability.
In short, the choice of RAID 5 versus RAID 10 involves a complicated mixture of factors. There is no one-size-fits-all solution, as the choice of one over the other must be dictated by everything from the I/O characteristics of the database, to business risk, to worst case degraded-state throughput, to the number and type of disks present in the array itself. Over the course of the life of a database, you may even see situations where RAID 5 is initially favored, but RAID 10 slowly becomes the better choice, and vice versa.
==New RAID classification==
In 1996, the RAID Advisory Board introduced an improved classification of RAID systems{{Citation needed|date=May 2010}}. It divides RAID into three types: Failure-resistant disk systems (that protect against data loss due to disk failure), failure-tolerant disk systems (that protect against loss of data access due to failure of any single component), and disaster-tolerant disk systems (that consist of two or more independent zones, either of which provides access to stored data).
The original "Berkeley" RAID classifications are still kept as an important historical reference point, and also to recognize that RAID Levels 0–6 successfully define all known data mapping and protection schemes for disks. Unfortunately, the original classification caused some confusion due to the assumption that higher RAID levels imply higher redundancy and performance. This confusion was exploited by RAID system manufacturers, and gave birth to products with names such as RAID-7, RAID-10, RAID-30, RAID-S, etc. The new system describes the data availability characteristics of the RAID system rather than the details of its implementation.
The following list provides the criteria for all three classes of RAID:
* Failure-resistant disk systems (FRDS) (meets a minimum of criteria 1–6):
1.  Protection against data loss and loss of access to data due to disk drive failure<br />
2.  Reconstruction of failed drive content to a replacement drive <br />
3.  Protection against data loss due to a "write hole"<br />
4.  Protection against data loss due to host and host I/O bus failure <br />
5.  Protection against data loss due to replaceable unit failure <br />
6.  Replaceable unit monitoring and failure indication
* Failure-tolerant disk systems (FTDS) (meets a minimum of criteria 7–15):
7.  Disk automatic swap and hot swap <br />
8.  Protection against data loss due to cache failure <br />
9.  Protection against data loss due to external power failure <br />
10. Protection against data loss due to a temperature out of operating range <br />
11. Replaceable unit and environmental failure warning <br />
12. Protection against loss of access to data due to device channel failure <br />
13. Protection against loss of access to data due to controller module failure <br />
14. Protection against loss of access to data due to cache failure <br />
15. Protection against loss of access to data due to power supply failure
* Disaster-tolerant disk systems (DTDS) (meets a minimum of criteria 16–21):
16. Protection against loss of access to data due to host and host I/O bus failure <br />
17. Protection against loss of access to data due to external power failure <br />
18. Protection against loss of access to data due to component replacement <br />
19. Protection against loss of data and loss of access to data due to multiple disk failure <br />
20. Protection against loss of access to data due to zone failure <br />
21. Long-distance protection against loss of data due to zone failure
==Non-standard levels==
{{Main|Non-standard RAID levels}}
Many configurations other than the basic numbered RAID levels are possible, and many companies, organizations, and groups have created their own non-standard configurations, in many cases designed to meet the specialised needs of a small niche group. Most of these non-standard RAID levels are [[Proprietary software|proprietary]].
* Storage Computer Corporation (now defunct) used to market a cached version of RAID 3 and RAID 4 as ''RAID 7''.
* [[EMC Corporation]] used to offer ''RAID S'' as an alternative to RAID 5 on their [[Symmetrix]] systems. Their latest generations of Symmetrix, the DMX and the V-Max series, do not support RAID S (instead they support RAID 1, RAID 5 and RAID 6.)
* The [[ZFS]] filesystem, available in [[Solaris (operating system)|Solaris]], [[OpenSolaris]] and [[FreeBSD]], offers ''[[RAID-Z]]'', which solves RAID 5's [[Standard RAID levels#RAID 5 disk failure rate|write hole]] problem.
* [[Hewlett-Packard]]'s ''Advanced Data Guarding'' (ADG) is a form of RAID 6.
* [[NetApp]]'s Data ONTAP uses RAID-DP (also referred to as "double", "dual", or "diagonal" parity), a form of RAID 6 that, unlike many RAID 6 implementations, does not use distributed parity as in RAID 5. Instead, two unique parity disks with separate parity calculations are used. This is a modification of RAID 4 with an extra parity disk.
* Accusys ''Triple Parity'' (RAID TP) implements three independent parities by extending RAID 6 algorithms on its FC-SATA and SCSI-SATA RAID controllers to tolerate three-disk failure.
* [[Linux]] [[Non-standard RAID levels#Linux MD RAID 10|MD RAID10]] (RAID 10) implements a general RAID driver that defaults to a standard RAID 1 with two drives, and a standard RAID 1+0 with four drives, but can have any number of drives, including odd numbers. MD RAID 10 can run striped and mirrored, even with only two drives with the f2 layout (mirroring with striped reads, giving the read performance of RAID 0; normal Linux software RAID 1 does not stripe reads, but can read in parallel).<ref>[], question 4</ref><ref>{{cite web|url= |title=Main Page - Linux-raid | |date=2010-08-20 |accessdate=2010-08-24}}</ref><ref name="layton-lm"></ref>
* Infrant (now part of [[Netgear]]) ''X-RAID'' offers dynamic expansion of a RAID 5 volume without having to back up or restore the existing content: add larger drives one at a time, let each resync, then add the next drive until all drives are installed, and the volume capacity is increased without user downtime. (This is also possible in Linux using the [[Mdadm]] utility, and has been possible in the EMC Clariion and HP MSA arrays for several years.) The newer ''X-RAID2'', found on x86 ReadyNas (that is, ReadyNas with Intel CPUs), offers dynamic expansion of a RAID 5 or RAID 6 volume without having to back up or restore the existing content (''X-RAID2'' dual redundancy is not available on all x86 ReadyNas). A major advantage over ''X-RAID'' is that with ''X-RAID2'' you do not need to replace all the disks to get extra space; you only need to replace two disks using single redundancy, or four disks using dual redundancy, to get more redundant space.
* BeyondRAID, created by [[Data Robotics]] and used in the [[Drobo]] series of products, implements both mirroring and striping simultaneously or individually dependent on disk and data context. It offers expandability without reconfiguration, the ability to mix and match drive sizes and the ability to reorder disks. It supports [[NTFS]], [[HFS+]], [[FAT32]], and [[EXT3]] file systems.<ref>{{cite web|url= |title=Data Robotics, Inc | |date= |accessdate=2010-08-24}}</ref> It also uses [[thin provisioning]] to allow for single volumes up to 16&nbsp;TB depending on the host operating system support.
* [[Hewlett-Packard]]'s EVA series arrays implement vRAID - vRAID-0, vRAID-1, vRAID-5, and vRAID-6. The EVA allows drives to be placed in groups (called Disk Groups) that form a pool of data blocks on top of which the RAID level is implemented. Any Disk Group may have "virtual disks" or LUNs of any vRAID type, including mixing vRAID types in the same Disk Group - a unique feature. vRAID levels are more closely aligned to Nested RAID levels - vRAID-1 is actually a RAID 1+0 (or RAID 10), vRAID-5 is actually a RAID 5+0 (or RAID 50), etc. Also, drives may be added on-the-fly to an existing Disk Group, and the existing virtual disks data is redistributed evenly over all the drives, thereby allowing dynamic performance and capacity growth.
* [[IBM]] (Among others) has implemented a RAID 1E (Level 1 Enhanced). With an even number of disks it is similar to a RAID 10 array, but, unlike a RAID 10 array, it can also be implemented with an odd number of drives. In either case, the total available disk space is n/2. It requires a minimum of three drives.
* [[Hadoop]] has a RAID system that generates a parity file by xor-ing a stripe of blocks in a single HDFS file. More details can be found here <ref>{{cite web|url= |title=Hdfs Raid | |date=2009-08-28 |accessdate=2010-08-24}}</ref>
==Data backup==
A RAID system used as a main system disk is not intended as a replacement for [[backup|backing up]] data. In parity configurations it will provide a backup-like feature to protect from catastrophic data loss caused by physical damage or errors on a single drive. Many other features of backup systems cannot be provided by RAID arrays alone. The most notable is the ability to restore an earlier version of data, which is needed to protect against [[Computer software|software]] errors causing unwanted data to be written to the disk, and to recover from user error or malicious deletion. RAID can also be overwhelmed by catastrophic failure that exceeds its recovery capacity and, of course, the entire array is at risk of physical damage by fire, natural disaster, or human forces. RAID is also vulnerable to controller failure since it is not always possible to migrate a RAID to a new controller without data loss.<ref>{{cite web|url=,1640.html|title= The RAID Migration Adventure|accessdate= 2010-03-10}}</ref>
RAID drives can serve as excellent backup drives when employed as removable backup devices to main storage, and particularly when located offsite from the main systems. However, the use of RAID as the ''only'' storage solution does not replace backups.
{{Merge from| Vinum volume manager |discuss=Talk:Vinum volume manager |date=November 2008}}
(''Specifically, the section comparing hardware / software raid'')
The distribution of data across multiple drives can be managed either by dedicated [[hardware]] or by [[software]]. When done in software, the software may be part of the operating system, or it may be part of the firmware and drivers supplied with the RAID card.
===Software-based RAID===
Software implementations are now provided by many [[operating systems]]. A software layer sits above the (generally [[block device|block]]-based) disk [[device driver]]s and provides an abstraction layer between the [[logical disk|logical drives]] (RAIDs) and [[disk drive|physical drives]]. The most common levels are RAID 0 (striping across multiple drives for increased space and performance) and RAID 1 (mirroring two drives), followed by RAID 1+0, RAID 0+1, and RAID 5 (data striping with parity). New filesystems like [[btrfs]] may replace traditional software RAID by providing striping and redundancy at the filesystem object level.
* Apple's [[Mac OS X Server]]<ref>{{cite web|url=|title= Apple Mac OS X Server File Systems|accessdate= 2008-04-23}}</ref> and [[Mac OS X]]<ref>{{cite web|url=|title= Mac OS X: How to combine RAID sets in Disk Utility|accessdate= 2010-01-04}}</ref> support RAID 0, RAID 1 and RAID 1+0.
* [[FreeBSD]] supports RAID 0, RAID 1, RAID 3, and RAID 5 and all layerings of the above via [[GEOM]] modules<ref>{{cite web|url=|title=FreeBSD System Manager's Manual page for GEOM(8)|accessdate=2009-03-19}}</ref><ref>{{cite web|url=|title=freebsd-geom mailing list - new class / geom_raid5|accessdate=2009-03-19}}</ref> and ccd,<ref>{{cite web|url=|title=FreeBSD Kernel Interfaces Manual for CCD(4)|accessdate=2009-03-19}}</ref> as well as supporting RAID 0, RAID 1, RAID-Z, and RAID-Z2 (similar to RAID 5 and RAID 6 respectively), plus nested combinations of those via [[ZFS]].
* [[Linux]] supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6 and all layerings of the above, as well as "RAID10" (see above).<ref>{{cite web|url=|title=The Software-RAID HOWTO|accessdate=2008-11-10}}</ref><ref>{{cite web|url=|title=RAID setup|accessdate=2008-11-10}} {{Dead link|date=September 2010|bot=H3llBot}}</ref> Certain reshaping/resizing/expanding operations are also supported.<ref>{{cite web|url=|title=RAID setup|accessdate=2010-09-30}}</ref>
* [[Microsoft]]'s server operating systems support RAID 0, RAID 1, and RAID 5. Some Microsoft desktop operating systems also support RAID; for example, Windows XP Professional supports RAID 0, in addition to spanning multiple disks, but only when using dynamic disks and volumes. Windows XP can be made to support RAID 0, 1, and 5 with a simple file patch.<ref>{{cite web|url=,925.html |title=Using WindowsXP to Make RAID 5 Happen | |date= |accessdate=2010-08-24}}</ref> RAID functionality in Windows is slower than hardware RAID, but allows a RAID array to be moved to another machine with no compatibility issues.
* [[NetBSD]] supports RAID 0, RAID 1, RAID 4 and RAID 5 (and any nested combination of those like 1+0) via its software implementation, named RAIDframe.
* [[OpenBSD]] aims to support RAID 0, RAID 1, RAID 4 and RAID 5 via its software implementation softraid.
* [[Solaris]] [[ZFS]] supports ZFS equivalents of RAID 0, RAID 1, RAID 5 (RAID Z), RAID 6 (RAID Z2), and a triple parity version RAID Z3, and any nested combination of those like 1+0. Note that RAID Z/Z2/Z3 solve the RAID 5/6 [[Standard RAID levels#RAID 5 disk failure rate|write hole]] problem and are therefore particularly suited to software implementation without the need for battery backed cache (or similar) support. The boot filesystem is limited to RAID 1.
* Solaris SVM supports RAID 1 for the boot filesystem, and adds RAID 0 and RAID 5 support (and various nested combinations) for data drives.
* Linux and Windows [[FlexRAID]] is a snapshot RAID implementation.
* HP's OpenVMS provides a form of RAID 1 called "Volume shadowing", giving the possibility to mirror data locally and at remote cluster systems.
Software RAID has advantages and disadvantages compared to hardware RAID. The software must run on a host server attached to storage, and the server's processor must dedicate processing time to run the RAID software. The additional processing capacity required for RAID 0 and RAID 1 is low, but parity-based arrays require more complex data processing during write or integrity-checking operations. As the rate of data processing increases with the number of disks in the array, so does the processing requirement. Furthermore, all the buses between the processor and the disk controller must carry the extra data required by RAID, which may cause congestion.
Over the history of hard disk drives, the increase in speed of commodity CPUs has been consistently greater than the increase in speed of hard disk drive throughput.<ref>{{cite web|url=|title=Rules of Thumb in Data Engineering|accessdate=2010-01-14}}</ref> Thus, for a given number of hard disk drives, the percentage of host CPU time required to saturate them has been dropping over time. For example, the Linux software md RAID subsystem is capable of calculating parity information at 6&nbsp;GB/s (100% usage of a single core on a 2.1&nbsp;GHz Intel "Core2" CPU as of Linux v2.6.26). A three-drive RAID 5 array using hard disks capable of sustaining a write of 100&nbsp;MB/s will require parity to be calculated at the rate of 200&nbsp;MB/s. This will require the resources of just over 3% of a single CPU core during write operations (parity does not need to be calculated for read operations on a RAID 5 array, unless a drive has failed).
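The CPU-share figure quoted above follows from simple arithmetic (the throughput numbers are the ones assumed in the text):

```python
# Assumed figures from the text: Linux md computes parity at ~6 GB/s on one
# core; each drive sustains 100 MB/s; a three-drive RAID 5 stripes data over
# two data drives per parity block, so 200 MB/s of parity must be computed.
parity_rate_mb_s = 6_000        # ~6 GB/s of parity throughput on one core
data_write_mb_s = 2 * 100       # two data drives at 100 MB/s each

cpu_share = data_write_mb_s / parity_rate_mb_s
print(f"CPU share: {cpu_share:.1%}")   # just over 3% of a single core
```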
Software RAID implementations may employ more sophisticated algorithms than hardware RAID implementations (for instance with respect to disk scheduling and command queueing), and thus may be capable of increased performance.
Another concern with operating system-based RAID is the boot process. It can be difficult or impossible to set up the boot process such that it can fall back to another drive if the usual boot drive fails. Such systems can require manual intervention to make the machine bootable again after a failure. There are exceptions to this, such as the LILO bootloader for Linux, the loader for FreeBSD,<ref>{{cite web|url=|title=FreeBSD Handbook|work=Chapter 19 GEOM: Modular Disk Transformation Framework|accessdate= 2009-03-19}}</ref> and some configurations of the [[GNU GRUB|GRUB]] bootloader, which natively understand RAID 1 and can load a kernel from it. If the BIOS recognizes a broken first disk and refers bootstrapping to the next disk, such a system will come up without intervention, but the BIOS might or might not do that as intended. A hardware RAID controller typically has explicit programming to decide that a disk is broken and fall through to the next disk.
Hardware RAID controllers can also carry battery-powered cache memory. For data safety in modern systems, the user of software RAID might need to turn off the disk's write-back cache (though some drives have their own battery or capacitors backing the write-back cache, a UPS, or implement [[atomicity (database systems)|atomicity]] in various ways). Turning off the write cache has a performance penalty that can, depending on the workload and how well the disk system supports command queuing, be significant. A battery-backed cache on a RAID controller is one way to obtain a safe write-back cache.
Finally, operating system-based RAID usually uses formats specific to the operating system in question, so it cannot generally be used for partitions that are shared between operating systems as part of a multi-boot setup. However, it allows RAID disks to be moved from one computer to another computer with an operating system or file system of the same type, which can be more difficult with hardware RAID (e.g. #1: when one computer uses a hardware RAID controller from one manufacturer and another computer uses a controller from a different manufacturer, drives typically cannot be interchanged. e.g. #2: if the hardware controller dies before the disks do, data may become unrecoverable unless a hardware controller of the same type is obtained, unlike with firmware-based or software-based RAID).
Most operating system-based implementations allow RAIDs to be created from [[Disk partitioning|partitions]] rather than entire physical drives. For instance, an administrator could divide an odd number of disks into two partitions per disk, mirror partitions across disks, and stripe a volume across the mirrored partitions to emulate [[Non-standard RAID levels#IBM ServeRAID 1E|IBM's RAID 1E configuration]]. Using partitions in this way also allows mixing reliability levels on the same set of disks. For example, one could have a very robust RAID 1 partition for important files, and a less robust RAID 5 or RAID 0 partition for less important data. (Some BIOS-based controllers offer similar features, e.g. [[Intel Matrix RAID]].) Using two partitions from the same drive in the same RAID is, however, dangerous. (e.g. #1: having all partitions of a RAID 1 on the same drive will, obviously, make all the data inaccessible if that single drive fails. e.g. #2: in a RAID 5 array composed of four drives 250 + 250 + 250 + 500 GB, with the 500&nbsp;GB drive split into two 250&nbsp;GB partitions, a failure of this drive will remove two partitions from the array, causing the loss of all data in the array.)
===RAID hardware===

Revision as of 9 January 2011, 13:59

This article is currently being edited by rz. If you wish to take part in the editing process, first ask the author's permission on his talk page.

'''RAID''' is an acronym for ''Redundant Array of Independent Disks'': a technology that provides increased storage capacity and reliability by combining multiple physical disks into a single logical unit in which the disks in the array are interdependent. The various RAID schemes pursue two key goals: increased data reliability and increased I/O performance. Multiple physical disks configured to use RAID technology are said to be in a RAID array. The array distributes data across the disks, but the array is addressed by the operating system as a single disk. RAID can be used for several purposes.

Standard levels

Initially, five RAID levels were conceived; many variants have since evolved from them, most notably several nested levels and many non-standard levels (generally proprietary software).

  • RAID 0 (block-level striping without parity or mirroring) provides improved performance and additional storage space, but no redundancy or fault tolerance (which means RAID 0 is not truly RAID according to the definition of the acronym). However, because of its similarity to RAID (especially the need for a controller to distribute data across multiple disks), simple stripe sets are commonly referred to as RAID 0. The failure of any disk destroys the array, and the probability of failure grows in proportion to the number of disks in the array (with the minimum number of disks, data loss is twice as likely as with a single non-RAID disk). A single failure destroys the entire array because, when data is written to a RAID 0 volume, it is split into fragments called blocks. The number of blocks is dictated by the stripe size, a configuration parameter of the array. The blocks are written to their disks simultaneously at the same sector, which allows segments of a piece of data to be written to or read from the disks in parallel, increasing bandwidth and thus read/write speed. RAID 0 implements no error checking. More disks in the array mean higher performance, but also a higher risk of data loss.
  • RAID 1 (mirroring without parity or striping): data is written identically to multiple disks ("mirrored"). Although most implementations use two disks, a set may contain three or more. The array provides tolerance against disk errors or failures and continues to operate as long as at least one disk in the array survives. With appropriate operating-system support, this can improve read performance at the cost of only a minimal reduction in write speed. Using RAID 1 with a separate controller for each disk is sometimes called duplexing.
  • RAID 2 (bit-level striping with dedicated Hamming-code parity)
  • RAID 3 (byte-level striping with dedicated parity)
  • RAID 4 (block-level striping with dedicated parity) is identical to RAID 5 but keeps all parity data on a single disk, which can create a performance bottleneck. In this configuration, files can be distributed across multiple disks. Each disk operates independently, which allows I/O requests to be served in parallel, but data transfer speed can suffer because of the parity scheme. Error detection is performed through dedicated parity, with the parity data stored on a single dedicated drive.
  • RAID 5 (block-level striping with distributed parity) distributes both the data and the parity data across multiple disks and requires N-1 functional disks to operate, where N is the total number of disks in the array. Failed drives require replacement, but the array is not destroyed by a single disk failure. After a disk fails, subsequent reads can be computed from the distributed parity, so the failure is invisible to the user. The array will suffer data loss if a second disk fails, and it remains vulnerable until the failed disk is replaced. A disk failure degrades the performance of the entire array until the failed disk is replaced and the array is rebuilt.
  • RAID 6 (block-level striping with double distributed parity) tolerates the failure of two drives. This makes larger RAID groups more practical, especially for high-availability systems. It matters for large storage capacities because high-capacity drives take longer to rebuild after a single drive failure. Single-parity RAID levels are as vulnerable as a RAID 0 array in the interval between a disk failure and its replacement; the larger the physical disk, the longer the rebuild takes. Double parity makes it possible to rebuild the array after one disk fails without risking the integrity of the array's data should a second disk fail during the rebuild.
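The parity mechanism behind RAID 4 and RAID 5 can be illustrated with XOR: the parity block is the XOR of the data blocks in a stripe, so any single missing block can be recomputed from the survivors. A minimal sketch (illustrative only, not controller code; block contents are made up):

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # one stripe across three data disks
parity = xor_blocks(data)            # stored on the parity disk

# Disk 1 fails: rebuild its block from the remaining data plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # b'BBBB'
```

This is also why single parity tolerates only one lost member: with two blocks missing from a stripe, the single XOR equation no longer determines either of them.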

Since no basic RAID level higher than 9 exists, nested RAIDs are usually described by juxtaposing the digits of the RAID levels used, sometimes joined by a "+" sign. The order of the digits is the order in which the nested array is built: for RAID 1+0, pairs of drives are first combined into two or more RAID 1 arrays (mirrors), and the resulting RAID 1 arrays are combined into a RAID 0 array (stripes). The reverse combination (RAID 0+1) is also possible. The result is the top-level array. When the top-level array is RAID 0 (as in RAID 10 or RAID 50), the "+" sign is generally omitted, though RAID 5+0 is clearer.

  • RAID 0+1: striped sets nested in a mirrored set (minimum four disks; even number of disks) provides fault tolerance and improved performance, but increases complexity.
  • RAID 1+0: mirrored sets in a striped set (minimum two disks, but more commonly four disks to take advantage of the speed benefits; even number of disks) provides fault tolerance and improved performance, but increases complexity.
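The asymmetry between the two nested layouts can be sketched for a four-disk array. This is an illustrative model only (disk indices and function names are invented): RAID 1+0 survives any two-disk failure that does not empty a mirror pair, while RAID 0+1 needs at least one entire striped set intact.

```python
# RAID 1+0: a stripe over two mirrored pairs; each pair needs >= 1 survivor.
def raid10_survives(failed):
    pairs = [{0, 1}, {2, 3}]
    return all(pair - failed for pair in pairs)

# RAID 0+1: a mirror of two striped sets; >= 1 set must be fully intact.
def raid01_survives(failed):
    stripes = [{0, 1}, {2, 3}]
    return any(not (stripe & failed) for stripe in stripes)

print(raid10_survives({0, 2}))  # True: each mirror pair keeps one disk
print(raid01_survives({0, 2}))  # False: both striped sets are broken
```

With disks 0 and 2 failed, RAID 1+0 still serves data from disks 1 and 3, whereas RAID 0+1 has lost one member of each striped set and therefore the whole array.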

RAID parity

The new RAID classification

Data backup


Software RAID

Hardware RAID

Hot spares

Other concerns

Although RAID can protect against the physical failure of hard drives, the data remains exposed to destruction caused by software or hardware defects.

Software RAID vs. hardware RAID