
AccelStor: its own take on how All Flash should work

Flash drives are steadily taking over as the storage medium of choice in the Enterprise segment. This growing adoption is driving their cost down significantly and the capacity of individual drives up. SSDs are now widely used where, until recently, only mechanical hard drives were found. This applies not only to internal drives in client systems but also to the disk subsystems of servers and storage systems. Within this segment, a special place belongs to storage configurations that use SSDs as the only medium: the so-called All Flash systems.





First of all, it is worth clarifying what an All Flash storage system is. The name obviously implies that it uses only flash drives. However, not all All Flash systems are alike. Broadly, they fall into three categories.


1. Traditional SSD Storage Systems


This is by far the most common type of All Flash storage system, because nothing is easier for a manufacturer than to fill its existing storage system with SSDs. Leading vendors, of course, go beyond sticking on a new nameplate (“All Flash storage system”): they also optimize the firmware for working with SSDs and for higher overall system speed. Others do not bother much and simply offer a bundle of a conventional storage system and a set of SSDs. As a result, offerings on the market range from an All Flash QNAP NAS (we will leave the practicality of such a solution out of scope, but technically you cannot argue with it: it is All Flash!) to the monstrous multi-tier NetApp FAS.


The main advantage of this approach is, above all, reasonable cost. Each vendor naturally adds its own brand premium, but in general the price of an All Flash system (meaning the “head” with the controllers) differs little from that of a classic storage system, and next to the cost of the SSDs it is small change.


The downside is the modest overall performance of the solution. All such All Flash systems built on modern hardware deliver about 300K IOPS on writes (4K, 100% random). We look at write mode because it is much harder on a storage system than reading; read performance is, of course, much higher. A figure well below this value most likely points to a serious flaw in the firmware, while higher figures suggest better caching and/or optimization for specific SSD models. In any case, saturation sets in at roughly 10-20 drives. Adding further drives only increases the available capacity, not the speed.


The main reason for this performance ceiling is the use of classic RAID algorithms. They were developed long ago for mechanical hard drives and completely ignore the specifics of solid-state drives. Unlike an HDD, an SSD cannot simply overwrite a block of data in place. It has to write the entire page containing the changed block to a new location and free up the old location for later writes. On top of the standard RAID write penalty, this produces a huge overhead for overwrite operations.
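The RAID part of that overhead can be estimated with simple arithmetic. The sketch below uses the commonly quoted textbook write penalties (back-end I/Os per random host write). The function name and the per-drive IOPS figure in the usage note are made up for illustration, and the estimate ignores controller limits and flash-level write amplification, so treat the result as an upper bound rather than a prediction.

```python
# Textbook back-end I/O count per random host write.
# These are generic rule-of-thumb values, not vendor measurements.
RAID_WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def effective_write_iops(raw_iops_per_drive, drives, level):
    """Upper bound on host random-write IOPS for one RAID group."""
    raw_total = raw_iops_per_drive * drives
    return raw_total // RAID_WRITE_PENALTY[level]
```

For instance, ten drives assumed to sustain 30,000 random-write IOPS each yield at most 75,000 host IOPS in RAID 5, a quarter of the raw total.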


2. All Flash arrays with proprietary hardware


To overcome the bottlenecks of traditional storage systems, a completely different hardware and software architecture is needed. Examples of such solutions are Pure Storage and IBM FlashSystem products. They have neither RAID in the usual sense (parity and fault tolerance are, of course, still present) nor SSDs as such (they use their own proprietary “drives” instead). The result is extremely high performance and especially low latency. The cost, however, is sky-high.


3. Software-defined storage


Standing apart from this whole “zoo” of All Flash arrays is software-defined storage (SDS). SDS is software that runs on ordinary x86 hardware and “emulates” a storage system. We put the term in quotes deliberately, because the boundary between hardware and software controllers is now quite blurred, unlike in the past. Modern storage systems most often use the standard x86 architecture and run Linux-like operating systems. Additional controllers may be used to offload certain operations, but the main difference from SDS is that both the hardware and the software are closed to the user. SDS, by contrast, lets you use almost any recommended hardware and make moderate modifications to the software components.


However, when SDS is used not just as a storage system but as an All Flash array, letting the user freely choose the server platform and install the software independently is a mistake. The main reasons are that the stated performance (the primary reason for choosing All Flash in the first place) cannot be guaranteed, and that supporting a wide range of hardware is difficult. This is why so-called appliances exist on the market: complete solutions consisting of a server platform with preinstalled and configured software and the required number of SSDs, which together deliver the stated performance.


The subjects of our review, All Flash arrays from AccelStor, belong to exactly this type of solution (SDS appliance).


AccelStor: its own take on how All Flash should work


AccelStor was founded as a startup in 2014. Its key investor (in effect, the owner of the project) is the well-known IT giant Toshiba. Even before its commercial launch, the company attracted attention by winning top awards at various events dedicated to flash technologies. One of the top awards on its list was received at the well-known and prestigious Flash Memory Summit (2016).



AccelStor Awards


All of these awards were given for an innovative approach to working with flash memory, implemented in the proprietary FlexiRemap technology found in all AccelStor NeoSapphire arrays.


FlexiRemap is a special algorithm for working with SSDs that removes performance bottlenecks and maximizes drive lifespan. The main idea is to convert random write requests into sequential chains. That is, incoming data blocks are combined into chains whose length is a multiple of the page size, and only then written to the SSDs. As a result, new data is written sequentially from the drives' point of view, which ultimately yields high performance.
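The general idea of turning random writes into sequential ones can be shown with a toy log-structured writer. This is only a generic remapping sketch, not AccelStor's implementation: the class name, the page size of four blocks, and the in-memory list standing in for the SSD are all assumptions made for the example.

```python
class RemappingLog:
    """Toy log-structured writer: random logical writes become sequential appends."""

    def __init__(self, blocks_per_page=4):
        self.blocks_per_page = blocks_per_page  # hypothetical page size, in blocks
        self.pending = []      # (lba, data) pairs waiting to fill a page
        self.flash = []        # append-only stand-in for the SSD
        self.mapping = {}      # logical block address -> index in self.flash
        self.page_writes = 0   # number of sequential page-sized writes issued

    def write(self, lba, data):
        """Buffer a block; flush once a full page has been collected."""
        self.pending.append((lba, data))
        if len(self.pending) == self.blocks_per_page:
            self.flush()

    def flush(self):
        """Append all buffered blocks in one sequential write."""
        if not self.pending:
            return
        for lba, data in self.pending:
            # An overwritten address simply points at its newest copy;
            # the old copy is left behind for garbage collection.
            self.mapping[lba] = len(self.flash)
            self.flash.append(data)
        self.pending.clear()
        self.page_writes += 1

    def read(self, lba):
        """Return the newest copy, checking the unflushed buffer first."""
        for pending_lba, data in reversed(self.pending):
            if pending_lba == lba:
                return data
        return self.flash[self.mapping[lba]]
```

Writing to scattered addresses such as 907, 13, 5521, and 42 results in a single sequential page write here, and a later overwrite of address 13 lands in the next page instead of modifying the old one in place.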


While running, the FlexiRemap algorithm tracks how frequently each data block is accessed. On overwrite, data is automatically ranked by access frequency so that all “hot” data ends up as close together as possible. When those blocks are later modified, they are moved to new pages together, which again allows the more efficient sequential write mode to be used on the SSDs, in contrast to the traditional approach. This mechanism resembles a kind of virtual tiering, and it also speeds up garbage collection, since the garbage collector likewise works sequentially.


Although RAID is not used here, the data is still protected. All SSDs are divided into two symmetrical groups, and all I/O is distributed evenly (striped) across both. In addition to the data, each group stores checksums so that operation can continue if a single drive in the group fails. In total, the array can withstand the failure of two SSDs (one in each group), which is comparable to a RAID layout built from two groups.
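How a group can keep working after losing one drive is easiest to see with plain XOR parity. The article only says that each group stores “checksums”, so single XOR parity is an assumption here, and the helper names and sample bytes are invented for the illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(group, failed_index):
    """Recover the block of one failed drive from all surviving drives."""
    survivors = [blk for i, blk in enumerate(group) if i != failed_index]
    return xor_blocks(survivors)


# One stripe of a hypothetical group: three data drives plus one parity drive.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
group = data + [xor_blocks(data)]
```

Calling `rebuild(group, 1)` returns the contents of the second data drive. With two such groups, one failure per group can be tolerated, but two failures in the same group cannot be recovered by this scheme.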



Data array organization


Writes use a round-robin mechanism, which distributes data as evenly as possible across all drives. In addition, each SSD has its own weight factor that depends on its remaining write endurance. A drive that is more worn than the others therefore receives new data less often until the endurance figures even out. Compared with the traditional RAID approach, FlexiRemap significantly extends drive service life through this uniform wear.
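A wear-weighted round robin of this kind can be sketched with the well-known smooth weighted round-robin scheme. How AccelStor actually derives and applies its weights is not described in the article, so the integer weights, drive names, and function name below are all hypothetical.

```python
def wear_aware_schedule(endurance_left, writes):
    """Return the drive chosen for each of `writes` chunks.

    endurance_left maps a drive name to an integer weight;
    a higher weight means the drive is less worn.
    """
    total = sum(endurance_left.values())
    credit = {drive: 0 for drive in endurance_left}
    order = []
    for _ in range(writes):
        # Every drive earns credit in proportion to its weight.
        for drive, weight in endurance_left.items():
            credit[drive] += weight
        # The drive with the most credit takes the write (ties: first listed).
        chosen = max(credit, key=credit.get)
        credit[chosen] -= total
        order.append(chosen)
    return order
```

With weights of 5, 5, and 2, twelve consecutive writes go to the three drives 5, 5, and 2 times respectively, and the writes to the worn drive are spread out rather than bunched together.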



FlexiRemap vs. RAID


The mechanism for preserving data integrity when a drive fails also deserves mention. The group containing the failed SSD is automatically switched to read-only mode. This is done so that the rebuild onto the hot spare drive completes as quickly as possible. Once the group is restored, it can again take part in all types of operations, and the write-endurance leveling mechanism described earlier kicks in automatically.


When talking about an SDS appliance, keep in mind that it is essentially a server with preinstalled software. In storage terminology, it is therefore by definition a single-controller system. Although many workloads do not require redundant storage controllers, storage vendors have long taught us that “proper” storage has two (or more) controllers. AccelStor has its answer to this as well: Shared Nothing technology, which lets two nodes operate as a cluster.


Two-node AccelStor NeoSapphire models come either in a single chassis (based on twin servers) or as two separate servers. The latter can be placed up to 100 m apart for disaster recovery purposes. In either case, an external 56G InfiniBand link synchronizes data between the nodes, with an additional heartbeat check over Ethernet.



The organization of synchronization between nodes


Unlike a conventional dual-controller storage system, what is duplicated here is not only the controllers (nodes), along with their mandatory cooling modules and power supplies, but also the data itself. Each AccelStor NeoSapphire node is completely independent and holds a full copy of the data thanks to continuous synchronous replication. Both nodes operate in Symmetric Active-Active mode, without forwarding requests to each other (ALUA) as classic storage systems do. As a result, failover time on AccelStor really does approach zero, and having two copies of the data significantly improves reliability compared with the traditional architecture.


Continuing the topic of reliability, AccelStor arrays do not cache data during write operations, because they work in synchronous mode. All intermediate processing of data by the FlexiRemap algorithm takes place in the controller's RAM, but the array acknowledges a successful operation to the host only after the data has been physically written to the SSDs. For this reason, AccelStor All Flash arrays contain no batteries or capacitors: they are simply not needed.


In addition to their unique All Flash technologies, AccelStor NeoSapphire arrays offer standard Enterprise functionality: Thin Provisioning, Redirect-on-Write snapshots that can be backed up to and restored from external CIFS/NFS shares, asynchronous replication, compression, and deduplication. The Free Clone function deserves separate mention: it creates volume copies that take up no physical space, because they are essentially references to the source volume. This can be very useful in VDI, for example.
All modern operating systems and virtualization platforms are, of course, supported. There is a plug-in for the VMware vSphere Web Client that can manage volumes and fully implements the Free Clone functionality.


An important advantage of AccelStor NeoSapphire as software-defined storage is that it runs on ordinary x86 hardware with completely standard SSDs. The manufacturer does not let you choose the hardware platform freely, though: it chooses for you. This is done primarily to guarantee predictable performance and to eliminate compatibility problems. Every AccelStor All Flash array is built for a specific customer in the required configuration and thoroughly tested before shipping. The standard warranty for all arrays is 3 years NBD with advance parts replacement. Because the vendor has a presence in Russia, technical support is also available in Russian.



When ordering an AccelStor NeoSapphire All Flash array, you can flexibly choose the required capacity. This capacity is what is actually available to hosts, regardless of the physical organization of the disk space. Keep in mind that all models ship fully populated with drives. There are no free slots, so drives cannot be added later. This again follows from the performance and reliability requirements mentioned earlier. If you need more capacity in the future, it can be added with expansion shelves (available for the higher-end models). You also need to decide in advance how many nodes (controllers) the array will have, since upgrading an existing array to dual-node mode is not supported.


All models offer a choice of 10G iSCSI or 16G Fibre Channel as the host interface, with 56G InfiniBand available as an option. iSCSI models additionally support the CIFS and NFS file protocols alongside block access. The number of ports is chosen to match the system's stated performance so that they do not become a bottleneck (usually 2-6 ports per node).
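A rough feel for why port count follows from the performance rating comes from converting IOPS into bandwidth. The sketch below is a back-of-the-envelope estimate under loud assumptions: “4K” is taken as 4096 bytes, the nominal port speed is treated as fully usable payload bandwidth, and protocol overhead and redundant paths are ignored. The function name is invented for the example.

```python
import math


def ports_needed(iops, block_bytes, port_gbit):
    """Minimum number of ports whose combined nominal rate covers the workload."""
    required_gbit = iops * block_bytes * 8 / 1e9  # workload in gigabits per second
    return math.ceil(required_gbit / port_gbit)
```

At 700K IOPS with 4096-byte blocks the workload is about 23 Gbit/s, which needs at least three 10G ports or two 16G ports. Real configurations would add headroom and redundancy on top of this minimum.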


Standard Enterprise-class SSDs are used as drives, most often with a SATA interface, since access to a drive from two controllers is not required. There are also All Flash array models based on NVMe drives.


Using standard server platforms and SSDs significantly reduces the cost of the solution as a whole. At the same time, AccelStor provides service under its own name for the entire solution, regardless of who manufactured the individual components in the array.


And one very important point: there are no paid licenses! All functionality is available right out of the box, and when new features are added, they become available with a firmware update.


Putting it to the test


AccelStor offers a wide range of models with different rated performance. The entry-level NeoSapphire 3401 with 8 SSDs delivers 300K IOPS @ 4K, and the top-end P710 with 24 SSDs delivers 700K IOPS @ 4K. Among the NVMe models, the same 700K IOPS @ 4K is reached by the NeoSapphire P310 with just 8 SSDs! Note that these figures are for sustained writes, that is, the hardest mode for the array; read figures and any peak values are higher.


We tested the dual-node NeoSapphire H710 with 48 SSDs (24 per node) and 27 TB of usable capacity. AccelStor rates this model at no less than 600K IOPS (4K, random write). Testing was done with IOmeter from three servers connected via Fibre Channel.



In synthetic tests, the All Flash array performed even better than the specification promises, which in our opinion is a clear plus in a market segment where every figure is questioned (thanks to marketing numbers divorced from reality!).


It is important to note that one of the main advantages of the FlexiRemap algorithm is high write performance that does not degrade over time. That is, the sustained rate is the same after 10 minutes, an hour, or longer of continuous operation. To confirm this, we ran the IOmeter test (4K, 100% random write) for several hours from a single host. The claim holds: performance hardly changes over time.



Verdict


When choosing an All Flash array, most users by default consider traditional storage systems filled with SSDs. If ~280K IOPS (4K, random write) is enough for you, that is the right direction. But business tasks increasingly demand that equipment run at well over 100%. A conventional storage system, alas, cannot jump over its own head, and something like an IBM FlashSystem costs astronomical money. This is where AccelStor All Flash arrays fit in very well. Solid performance, high reliability, flexible configuration choices, and adequate technical support are only part of their advantages. Add the complete absence of hidden license fees and the longer service life of the SSDs, and you get not just an interesting product but a worthy tool for running your business.


So AccelStor's place in the sun on the market for ultrafast arrays is bound to grow. Who knows what heights it may yet reach.




Source: https://habr.com/ru/post/437210/