
A new achievement by Microsoft scientists could pave the way for exabyte-scale drives



The prospect of huge DNA-based data repositories has become quite real thanks to a new method for extracting data.

Microsoft already sees synthetic DNA as a promising storage medium that can meet the demand for storing large volumes of data. Previous studies have shown that just a few grams of DNA can store an exabyte of data, with a storage lifetime of about 2,000 years.

The main drawbacks of the technology are the high cost and slow speed of writing, which involves converting zeros and ones into nucleotides. Reading data back from DNA requires sequencing the molecules and then translating the result back into zeros and ones. Finding and extracting specific files is also a major problem.
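To make the conversion between bits and nucleotides concrete, here is a minimal sketch that assumes the simplest possible mapping of two bits per base. The mapping table and the function names `encode` and `decode` are illustrative assumptions, not the scheme used by the researchers, which the article does not describe.

```python
# Illustrative mapping only: two bits per nucleotide (an assumption,
# not the encoding actually used by Microsoft Research).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Convert a nucleotide string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

Each byte becomes four bases under this mapping. A practical scheme would likely add further constraints and redundancy, which this sketch leaves out.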

But that is now in the past: for the first time, scientists from Microsoft Research and the University of Washington have achieved random access to DNA storage at large scale. As they explained, without random access, that is, the ability to selectively extract files from DNA storage, the user must sequence and decode the entire data set to find the files they need. Random access reduces the amount of sequencing required.

To achieve random access to DNA, the scientists created a library of primers that are attached to each DNA sequence and serve as targets for selecting the desired fragments.
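The idea of using primers as selection targets can be sketched in software as a simple filter over a pool of strands. This is only an analogy, assuming each strand begins with the primer of the file it belongs to; the primer sequences, the pool contents, and the function name `select_by_primer` are hypothetical, and in the laboratory the selection is a chemical process rather than a string comparison.

```python
def select_by_primer(pool: list[str], primer: str) -> list[str]:
    """Return the payloads of all strands that start with the given primer.

    Software analogy only: strands tagged with other primers are skipped,
    so they never need to be sequenced or decoded.
    """
    return [strand[len(primer):] for strand in pool if strand.startswith(primer)]


if __name__ == "__main__":
    # Hypothetical primers, one per stored file.
    primer_video = "ACGTAC"
    primer_text = "TTGCAA"

    # Mixed pool: primer followed by a payload fragment.
    pool = [
        primer_video + "GGTT",
        primer_text + "CCAA",
        primer_video + "TTAA",
    ]

    print(select_by_primer(pool, primer_video))  # ['GGTT', 'TTAA']
```

Only the strands tagged with the requested primer are returned, which mirrors why random access cuts down the amount of sequencing.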

The researchers also developed an algorithm for more efficient decoding and data recovery. Microsoft Senior Researcher Sergey Yekhanin noted that the new algorithms are more tolerant of errors in writing and reading DNA sequences, which reduces the amount of sequencing and data processing needed to recover the data.
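The article does not describe the decoding algorithm itself. As a rough illustration of how redundancy lets a decoder tolerate read errors, here is a sketch of a much simpler, swapped-in technique: a per-position majority vote over several noisy reads of the same strand. The function name `consensus` is hypothetical, and the sketch assumes equal-length reads with substitution errors only.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Rebuild a strand by majority vote at each position.

    Assumes all reads have the same length and contain only substitution
    errors. This is a simplified stand-in, not the researchers' algorithm.
    """
    if not reads:
        raise ValueError("at least one read is required")
    if any(len(read) != len(reads[0]) for read in reads):
        raise ValueError("all reads must have the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


if __name__ == "__main__":
    # Three reads of the same strand, two of them with one wrong base.
    reads = ["ACGT", "ACGA", "TCGT"]
    print(consensus(reads))  # ACGT
```

The more errors a decoder can absorb, the fewer reads it needs per strand, which is the kind of saving the paragraph above refers to.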

In total, 200 megabytes of data were encoded in synthetic DNA, comprising 35 files ranging from 29 kilobytes to 44 megabytes. The files include HD video, audio, images, and text. The scientists believe that their random-access method will scale to physically isolated DNA pools, each capable of holding several terabytes.

Source: https://habr.com/ru/post/410549/