There is an xls file with a lot of lines. I use C #, interop library. Reading takes place in the form of creating a list of objects (one object represents one row) with properties a1, a2 ... a8, which contain the values ​​of the columns of this row. Then the next line, etc. loop for, until I meet an empty string. Then with the help of these objects I conduct analysis. The problem is very slow reading. On the Internet, it is advised to use EPPlus. But, it works with xlsx, I have xls. Hands, through ctrl + a, ctrl + v copying a large number of cells to another sheet takes only a couple of seconds. By analogy, is it possible to somehow select, copy the entire range of filled cells into, say, a two-dimensional array and then work with it? In theory, the speed of an order to increase? If so, how to do it? The number of lines in different files is different. Thank.

My code snippet

//считываем ячейки, пока не встретим пустаю while (true) { excelcells = excelworksheet.get_Range("A" + row.ToString(), Type.Missing); if (excelcells.Value2 == null) { break; } excelcells = excelworksheet.get_Range("C" + row.ToString(), Type.Missing); int AA = Int32.Parse(excelcells.Value2.ToString()); excelcells = excelworksheet.get_Range("H" + row.ToString(), Type.Missing); double BB = (double)excelcells.Value2; excelcells = excelworksheet.get_Range("I" + row.ToString(), Type.Missing); double CC = (double)excelcells.Value2; new ClassRs(AA, BB, CC); row++; } 
  • Give an example of code that reads lines. - MindCleaner
  • why not create a dynamic array and fill it by walking along the cells by matching the positions? - Mike Waters
  • @MindCleaner code added to original question - pucher
  • @MikeWaters I understand that the problem lies in the low speed of cell search. And not in the fact that the data are driven in the form of objects - pucher
  • You can first export xls to csv, and already the csv-file can be quickly read into an array. Or read xls as a database - use OleDb. - Alexander Petrov

3 answers 3

If you need to copy the entire contents of the first sheet into an array, then:

 var oExcel = new excel.Application(); var oBook = oExcel.Workbooks.Open(filePath, false, true); var sh = oBook.Worksheets[1]; var arr = sh.UsedRange.Value2; 
    1. Open Excel through Interop , then resave (convert) to *.xlsx in any temporary directory. Re-saving occurs almost instantly.

    2. Next, open a new file through EpPlus and read the cells at the speed of light.

    Problem solved.

    The only negative is that for the first step you need interop.

    Another option is to select the Range on the page and call the Application method Copy . Then all this will be copied to the clipboard. I looked at the buffer via the Free Clipboard Viewer and there is an html view => you can parse it, but my first option is better.

      As it turned out, one of the problems with the low speed of the interop library is the use of "range" methods and objects when accessing cells. Therefore, it is necessary to minimize their use. To read the cells, instead of the code given in the original question, you need to do this:

       var arrayRsu = (object[,])excelworksheet.get_Range("A" + startRow.ToString(), "P" + lastRow.ToString()).Value; 

      Where the right side - immediately ALL the necessary range of cells. After that, work with each cell is carried out through the declared array. So on the same file from 540 seconds, it was possible to reduce the processing time to either 23 or 8 seconds. The same meaning when writing to the cell. First, add the necessary data to the array, then assign this array to the range. Instantly.