For a RAID-10 of 4 disks, the loss of 2 disks can already be critical, i.e. only a single disk failure is guaranteed never to break the data. If I want a software RAID-10 in which the failure of two disks is guaranteed not to destroy the data, how many disks do I need to put in the array? In various online calculators I get the same result even with 8 disks in RAID-10 (whatever the number of disks per group): only one failed disk is guaranteed never to destroy the data. How do I correctly calculate the number of disks in a RAID-10 so that the data survives when two or three disks fail?

  • If the replication factor is 2 and that many mirrored disks fail, the entire array collapses. If I were you, I would look toward RAID-5/6; I have not read up on the details, but they seem to be aimed at larger numbers of disks and more fault tolerance than 0/1. - etki
  • Thank you for responding. Write speed is critical here, and it has to be software RAID, hence RAID-10. What I do not understand is whether increasing the number of disks/groups in a RAID-10 really cannot increase its fault tolerance... - Alex
  • Well, look: RAID-0 provides the increase in speed, and RAID-1 provides the mirroring and thus the fault tolerance. So you can increase speed N times by making a RAID-0 array of N devices, and raise the tolerable disk loss to M-1 by making each RAID-0 member a RAID-1 array of M disks. However, the loss of any one RAID-0 member is enough to lose the whole cluster, and for that to happen every physical disk in that RAID-1 member must fail. So your fault tolerance is really limited only by the number of disks in a RAID-1 group. - etki
  • RAID-0 is responsible solely for performance. Moreover, such an array works at the speed of its slowest disk, so you may not get the expected performance gain. So I advise once again to look toward 5/6; software versus hardware does not matter here: if you implement it in software, you lose just as much on 0/1 as on 5/6. - etki
  • And in your opinion, what is the maximum number of disks (in a RAID-10 of eight disks) that can fail without data loss? - Alex

1 answer

RAID-10 in its standard duplication mode will survive the loss of any one disk without data loss; surviving the loss of a second disk is not guaranteed. Whether it is assembled as a mirror over a stripe, the other way round as a stripe over mirrors, or as mdadm's raid10 (which is neither) makes no difference: survival is guaranteed only for a single disk. Pay particular attention to the word "any".

Each data block in RAID-10 is mirrored on two disks, which is why half of the raw capacity is lost. But for the same reason, if you are unlucky and out of, say, 10 disks exactly the two holding the mirrors of the same data sector drop out, that sector is no longer readable. With great luck, i.e. when only the "right" disks fail, you can lose up to half of the drives in the array.
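
A back-of-the-envelope sketch of that unlucky case, assuming a classic pairwise RAID-10 with independent, equally likely failures (an assumption; mdadm layouts can differ): after one disk has already failed, a second random failure destroys data only if it hits the mirror twin of the first, i.e. 1 of the N-1 surviving disks.

    # chance that a second random failure lands on the mirror twin
    # of the already-failed disk in an N-disk pairwise raid10
    N=8
    echo "scale=3; 1 / ($N - 1)" | bc   # prints .142, roughly 1 in 7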

For example, Linux software RAID aka mdadm lets you specify how many copies of the data are replicated across the disks. Say, 3 copies of the data on 6 disks will let you survive the loss of any two disks (though not of any four). The price is the usable capacity of the array: you will have only the capacity of two disks out of the 6.
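
A minimal sketch of such a setup, assuming six spare disks /dev/sdb through /dev/sdg (hypothetical device names); the --layout=n3 option requests three "near" copies of every block:

    # 6-disk raid10 keeping 3 copies of each block: survives any 2 failures,
    # usable capacity = raw capacity / 3
    mdadm --create /dev/md0 --level=10 --raid-devices=6 --layout=n3 /dev/sd[b-g]
    cat /proc/mdstat                           # watch the initial resync
    mdadm --detail /dev/md0 | grep -i layout   # should report near=3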

RAID5 and RAID6, which were suggested to you in the comments, survive the loss of one and two disks respectively. The failure of any second disk in RAID5, or any third disk in RAID6, is fatal and loses the entire array. The point of these RAID levels is to insure against the death of a disk, but more cheaply than a mirror does: RAID5 reduces the formatted capacity of the array by the size of only one disk, RAID6 by the capacity of only two disks, and not by half as RAID1 or RAID10 do.

For example, from 12 disks of 1 TB each you can build (the arithmetic is sketched below the list):

  • RAID5 with a capacity of 11 TB, you can lose any 1 disk
  • RAID6 with a capacity of 10 TB, you can lose any 2 disks
  • RAID10 with a capacity of 6 TB, you can lose any 1 disk
  • RAID10 with a capacity of 4 TB, if configured accordingly, you can lose any 2 disks
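
A small shell sketch of that arithmetic (sizes in TB, rules as stated above; the n2/n3 copy counts follow mdadm's layout naming):

    N=12; SIZE=1                                 # 12 disks of 1 TB each
    echo "RAID5:     $(( (N - 1) * SIZE )) TB"   # parity costs one disk
    echo "RAID6:     $(( (N - 2) * SIZE )) TB"   # parity costs two disks
    echo "RAID10 n2: $(( N * SIZE / 2 )) TB"     # two copies of every block
    echo "RAID10 n3: $(( N * SIZE / 3 )) TB"     # three copies of every block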

Why, then, is RAID10 used so actively despite such a difference in capacity? Answer: performance. With RAID10, a read request can be served by either disk of a pair, which means that in a properly built RAID10 read requests can be parallelized across different disks. In RAID5/6, each source data block is stored in only one place; to reconstruct it from the redundant data, that segment has to be read from all the disks at once and a bit of math applied. RAID5/6 is also slower on writes, and the difference becomes far more dramatic in degraded mode, i.e. when a disk has dropped out. RAID5/6 loses more than noticeably on performance.

So that settles how many disks can be lost. Let me just remind you that when the failed disk is replaced with a new one and the array resynchronization begins, this is a very dangerous time: the load on the old disks rises sharply and another one of them may die. That is why RAID5 is used rather infrequently; RAID6 is not much more expensive for these tasks, but it insures you for the duration of the array rebuild.

And one more important point that should always be made when talking about RAID: RAID is not a backup. You still need backups.

  • Thanks for the information. Did I understand correctly that for a RAID-10 to be able to lose (at most) any 2 disks you need at least 12 disks, and it also has to be configured specially, i.e. losing more than half the capacity, as in your example? Could you give an example of such a configuration? Maybe there is some sensible info on the internet for estimating the write speed of such a build. - Alex
  • Regarding mdadm only: to survive the loss of any 2 disks you need any number of disks starting from 3 (the degenerate case behaves like RAID1). 4 disks, 5 disks - not a problem; that is why the answer says that mdadm raid10 is something separate. The capacity of the array is computed trivially: divide the raw capacity of the devices by the number of copies of the data. Checked the man page: you need the --layout= argument. For example, mdadm -C /dev/md0 -l 10 --layout=n3 -n 6 /dev/sd[a-f]1 for an array of 6 disks with 3 copies of the data. - Shallow
  • "in linux raid aka mdadm, it is possible to indicate how many copies of the data should be replicated on disks" - I will read about this topic, thanks for the hint ... just I don’t find anything sensible on the topic ... I thought there is a ready-layout / table a calculator that will show the dependency for raid-10: the number of harmless failures of the screws to their number in the raid and ~ speed (read / write) for each option. - Alex
  • What is needed here is a software RAID on Debian made of identical disks. The requirements are, roughly, maximum write performance and resistance to the loss of two disks at once (at first I was thinking of three). There are 8 drives, and the number can be increased if necessary. But I do not understand the price of the question, i.e. I would like a dependency table... tied to the write speed... maybe with 10-12 drives one can work 'miracles', or the other way around... - Alex
  • Are the writes sequential? A rough estimate: write speed of one disk * number of disks / number of copies of the data being written (see the sketch below). raid10 is simple and straightforward, so it scales quite distinctly with the number of disks. But there is no avoiding the fact that the redundant copies of the data have to be written to several places. Reads grow in proportion to the number of disks, for both linear and random requests. By playing with the chunk size you can tune exactly which block sizes get spread across the disks. - Shallow
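
A minimal arithmetic sketch of that estimate, assuming a hypothetical 150 MB/s sequential write speed per disk (real throughput depends on the disks, chunk size and workload):

    # rough raid10 throughput: per-disk speed * disks / copies
    SPEED=150   # MB/s per disk, an assumed figure
    DISKS=8
    COPIES=2    # default raid10; use 3 for --layout=n3
    echo "write: ~$(( SPEED * DISKS / COPIES )) MB/s"   # ~600 MB/s
    echo "read:  ~$(( SPEED * DISKS )) MB/s"            # reads can use all disks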