I have the following code:

    void DeadSecCoding(string strfile)
    {
        using (FileStream fs = new FileStream(strfile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 1024 * 1024, true))
        using (BinaryReader sr = new BinaryReader(fs))
        {
            byte byte1;
            long nBytesRead = fs.Length;
            while (nBytesRead > 0)
            {
                byte1 = sr.ReadByte();
                int i = Convert.ToInt32(byte1);
                i *= 1234567;
                using (StreamWriter writer = new StreamWriter(strfile + ".mql", true))
                {
                    writer.WriteAsync(i.ToString() + "@");
                }
                nBytesRead--;
            }
        }
    }

What it does: read a byte from the stream, convert it to an int, multiply the int by a number, convert it to a string with a separator appended, and write it to a text file. The output file contains (plain text):

 5372635@5272736@2437362@827463637@262627@ 

And so on; everything is written on one line. The code is very slow because a new StreamWriter is opened on every iteration. How can I move the file writing out of the while loop so that converting a large input file doesn't end in an OutOfMemoryException?

I'm thinking of using an intermediate buffer: accumulate 8 * 1024 ints (or bytes) in it and then write them out through the StreamWriter in a foreach loop.
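Something along these lines is what I have in mind (a rough sketch; the method name ConvertBuffered and the details are placeholders, only the 8 * 1024 buffer size is fixed):

    // Sketch: accumulate converted values in an intermediate buffer and flush
    // them through a single StreamWriter instead of reopening it per byte.
    void ConvertBuffered(string strfile)              // hypothetical name
    {
        const int BufferSize = 8 * 1024;              // size suggested above
        var buffer = new int[BufferSize];

        using (var fs = new FileStream(strfile, FileMode.Open, FileAccess.Read))
        using (var writer = new StreamWriter(strfile + ".mql", true))
        {
            int count = 0;
            int nextByte;
            while ((nextByte = fs.ReadByte()) != -1)
            {
                buffer[count++] = nextByte * 1234567;
                if (count == BufferSize)
                {
                    foreach (var value in buffer)     // flush the full buffer
                        writer.Write(value.ToString() + "@");
                    count = 0;
                }
            }
            for (int i = 0; i < count; i++)           // flush the partial tail
                writer.Write(buffer[i].ToString() + "@");
        }
    }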

Does anyone have ideas on how to implement something like this, or suggestions for a better approach?

P.S. I have some thoughts of my own, but I'd like to hear other opinions. Thanks!

  • writer.WriteAsync without await is simply wrong. - VladD
  • Sorry, it was await; it got lost when copying :-) - Jack Black

3 answers

    const int ReadBufferSize = 64 * 1024;
    const int WriteBufferSize = 64 * 1024;

    var readBuffer = new byte[ReadBufferSize];
    var writeBuffer = new StringBuilder(WriteBufferSize);

    using (var fs = new FileStream(strfile, FileMode.Open))
    using (var sw = new StreamWriter(strfile + ".mql", true))
    {
        while (true)
        {
            writeBuffer.Clear();
            int read = fs.Read(readBuffer, 0, ReadBufferSize);
            if (read == 0)
                break;
            for (int i = 0; i < read; ++i)
            {
                writeBuffer.Append((int)readBuffer[i] * 1234567);
                writeBuffer.Append('@');
            }
            sw.Write(writeBuffer.ToString());
        }
    }

The hard drive will be the bottleneck anyway.
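If the disk really is the limit, one more knob to try is a larger internal buffer for the StreamWriter; a rough sketch, assuming ASCII output and an arbitrary 1 MB size:

    // Sketch: a StreamWriter with a larger internal buffer flushes to disk
    // less often. The 1 MB size and ASCII encoding are arbitrary choices.
    // Requires: using System.Text;
    using (var sw = new StreamWriter(strfile + ".mql", true, Encoding.ASCII, 1024 * 1024))
    {
        // ... same write loop as above ...
    }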

  • The main thing is to have enough memory. Files of 2 GB and larger will have to go through a FileStream. - Jack Black

Try something like this:

    using (var infile = File.OpenRead(strfile))
    using (var outfile = File.CreateText(strfile + ".mql"))
    {
        var buf = new byte[65536];
        while (true)
        {
            var actuallyRead = infile.Read(buf, 0, buf.Length);
            if (actuallyRead == 0)
                break;
            var results = buf.Take(actuallyRead)
                             .Select(b => ((int)b * 1234567).ToString() + "@");
            outfile.Write(string.Concat(results));
        }
    }

For an async variant, replace the loop body with:

    var actuallyRead = await infile.ReadAsync(buf, 0, buf.Length);
    if (actuallyRead == 0)
        break;
    var results = buf.Take(actuallyRead)
                     .Select(b => ((int)b * 1234567).ToString() + "@");
    await outfile.WriteAsync(string.Concat(results));
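For completeness, a sketch of how that loop body might sit inside an async method (the name DeadSecCodingAsync and the 64 KB buffer are assumptions):

    // Sketch only: the async loop body wrapped in an async method.
    // Requires: using System.IO; using System.Linq; using System.Threading.Tasks;
    async Task DeadSecCodingAsync(string strfile)     // hypothetical name
    {
        using (var infile = File.OpenRead(strfile))
        using (var outfile = File.CreateText(strfile + ".mql"))
        {
            var buf = new byte[65536];
            while (true)
            {
                var actuallyRead = await infile.ReadAsync(buf, 0, buf.Length);
                if (actuallyRead == 0)
                    break;
                var results = buf.Take(actuallyRead)
                                 .Select(b => ((int)b * 1234567).ToString() + "@");
                await outfile.WriteAsync(string.Concat(results));
            }
        }
    }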
  • And does using Select(b => ((int)b * 1234567).ToString() + "@") severely affect performance? - Jack Black
  • @JackBlack: It shouldn't; your bottleneck is elsewhere. - VladD

It's useful to rummage through old source code; I found another implementation:

    void DeadSecCoding(string strfile)
    {
        using (FileStream fs = new FileStream(strfile, FileMode.Open, FileAccess.Read))
        {
            using (StreamWriter sw = new StreamWriter(strfile + ".dat", true))
            {
                long nBytesRead = fs.Length;
                int nBytesToRead = 0;
                for (int i = 0; i < nBytesRead; i++)
                {
                    int nextByte = fs.ReadByte();
                    nextByte *= 1234567;
                    sw.Write(nextByte.ToString() + '~');
                    nBytesToRead++;
                }
            }
            fs.Close();
        }
    }

I tested it on large input data and didn't get an OutOfMemoryException :-) If I had found it earlier, this question wouldn't even have come up.