Fair warning up front: this question is part of a test assignment I was given when applying for a job at juna, so please don't beat me with sticks; just point me in the right direction. The task is limited to .NET 3.5.
Now, the actual question:
There is a file larger than 30 GB, and the task is to compress it properly. The planned algorithm (a rough code sketch follows the list):
- Open two FileStream instances, one for reading and one for writing.
- Read the file in small portions (10 MB each) and put them into an array of source data.
- Start a couple of threads that take blocks from this array, compress them, and put them into an array of compressed data.
- The writing FileStream picks up blocks from the array of compressed data and writes them to the output file.
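Roughly the shape I have in mind, in code. This is only a minimal sketch of the plan above; the class and field names are mine, and the Compress placeholder is filled in by the GZipStream snippet at the end of the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

class CompressPipeline
{
    const int BlockSize = 10 * 1024 * 1024;                        // 10 MB per block

    readonly Queue<byte[]> rawBlocks = new Queue<byte[]>();        // "array of source data"
    readonly Queue<byte[]> compressedBlocks = new Queue<byte[]>(); // "array of compressed data"
    readonly object rawLock = new object();
    readonly object zipLock = new object();
    volatile bool readingDone;

    // Role 1: the reading FileStream fills the source queue in 10 MB portions.
    void ReadLoop(string inputPath)
    {
        using (var input = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
        {
            byte[] buffer = new byte[BlockSize];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                byte[] block = new byte[read];
                Array.Copy(buffer, block, read);
                lock (rawLock) { rawBlocks.Enqueue(block); }       // grows without bound -> problem 1 below
            }
        }
        readingDone = true;
    }

    // Role 2: a couple of these run in parallel and compress blocks.
    void CompressLoop()
    {
        while (true)
        {
            byte[] block = null;
            lock (rawLock) { if (rawBlocks.Count > 0) block = rawBlocks.Dequeue(); }
            if (block == null)
            {
                if (readingDone) break;
                Thread.Sleep(10);                                  // nothing to compress yet
                continue;
            }
            byte[] packed = Compress(block);
            lock (zipLock) { compressedBlocks.Enqueue(packed); }   // arrival order not guaranteed -> problem 2 below
        }
    }

    // Role 3 (the writer) would mirror CompressLoop, draining compressedBlocks
    // into the output FileStream.

    // Placeholder only; the real body is the GZipStream snippet at the end.
    static byte[] Compress(byte[] block)
    {
        return block;
    }
}
```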
Several questions arise at once:
If the file is larger than RAM, I cannot keep an array with all the blocks of this huge file in memory, because it simply does not fit.
Even if I somehow manage to remove already-compressed blocks from the source array, the same problem remains for the array of compressed blocks.
Multithreading, even with just two worker threads, does not guarantee that blocks keep their positions in either array, since the threads can run at different speeds. As a result, the output turns into mush.
The conclusion is sad: this algorithm has to be thrown out.
What would a correct algorithm look like, so that the processing runs in several threads and the amount of data being processed still fits in RAM?
UPD: With a fresh head, the idea came to lean on the hard disk instead of RAM. Unfortunately, tons of reading about FileStream did not give me an understanding of what FileStream.Read() actually returns, or whether it can be used in this context.
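As far as I can tell from the documentation, Read() returns the number of bytes it actually put into the buffer, which can be less than requested, and it returns 0 at end of file. So filling a fixed-size block seems to need a loop, something like this helper (the names are mine):

```csharp
using System;
using System.IO;

static class BlockReader
{
    // Read() returns how many bytes were actually copied into the buffer: it can
    // be less than requested, and it returns 0 once the end of the file is
    // reached, so a fixed-size block has to be filled in a loop.
    public static byte[] ReadBlock(FileStream input, int blockSize)
    {
        byte[] buffer = new byte[blockSize];
        int total = 0;
        while (total < blockSize)
        {
            int read = input.Read(buffer, total, blockSize - total);
            if (read == 0) break;                // end of file
            total += read;
        }
        if (total == 0) return null;             // nothing left in the file
        if (total == blockSize) return buffer;   // full block
        byte[] tail = new byte[total];           // shorter final block
        Array.Copy(buffer, tail, total);
        return tail;
    }
}
```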
UPD: GZipStream is to be used for the compression.
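The only way I see to get a compressed block back as a plain byte[] is to run GZipStream over a MemoryStream, roughly like this (the helper name is mine):

```csharp
using System.IO;
using System.IO.Compression;

static class BlockCompressor
{
    // Runs one block through GZipStream on top of a MemoryStream, so the
    // compressed data comes back as a plain byte[] that can be queued for writing.
    public static byte[] CompressBlock(byte[] block)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(block, 0, block.Length);
            }                                    // GZipStream must be closed to flush everything
            return output.ToArray();             // ToArray works even on the closed MemoryStream
        }
    }
}
```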