I am currently working on a program that is supposed to read very large (up to 80 columns and ~65,000 rows) Excel outputs from a modelling program and extract chunks of data from a few columns (specified by the user). The output needs to be in CSV format. There are several summarisation options, most of which I already have working. My problem is that I cannot find an efficient way of outputting raw (unsummarised) data, mainly because I haven't figured out how to write data to my CSV column by column. The chunks are offset from each other (they do not line up across the selected columns), and, because this is a modelling file, each row corresponds to a date, so if a column's data isn't needed for a certain date range, those cells should be left blank.
My current (very slow and inefficient) method is to go down the rows of the file, asking whether any column needs this row, and writing a blank for each column that doesn't. I had thought of selecting all the data I need in one column, copying it to an array, moving on to the next column, and finally dumping the array into the CSV once every column is done. I haven't managed to get this working yet, mainly because when I copy an Excel 'range' of data into an array, all it does is create a new array within the array.
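To make the column-by-column idea concrete, here is a minimal sketch of it. It is written in Python purely as a language-neutral illustration, not in the .NET code the program actually uses, and it does not touch Excel at all. The names `build_grid` and `write_csv`, and the layout of `chunks` as one `(start_row, values)` pair per selected column, are assumptions made up for the example.

```python
import csv
import io


def build_grid(n_rows, chunks):
    """Fill an in-memory grid one column at a time.

    chunks holds one (start_row, values) pair per output column;
    start_row is 0-based. Cells outside a chunk stay blank.
    """
    grid = [[""] * len(chunks) for _ in range(n_rows)]
    for col, (start, values) in enumerate(chunks):
        for offset, value in enumerate(values):
            grid[start + offset][col] = value
    return grid


def write_csv(grid, out):
    """Write the finished grid to a file-like object in one pass."""
    csv.writer(out, lineterminator="\n").writerows(grid)


# Two columns whose chunks do not line up: rows 0-1 and rows 1-3.
chunks = [(0, [1.0, 2.0]), (1, [7, 8, 9])]
grid = build_grid(4, chunks)

buf = io.StringIO()
write_csv(grid, buf)
csv_text = buf.getvalue()
```

The point of the design is that each column is read once and the file is written once, instead of every row being tested against every column. As for the nested array: in the Interop library, reading the value of a multi-cell range returns a single two-dimensional array, so storing that result as one element of another array would likely produce the "array within the array" effect; copying its elements into the grid one by one avoids that.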
I am using an Excel module to read in the files (Microsoft.Office.Interop.Excel), and a stream writer to write the CSV file (currently line by line).
Any help or suggestions about how to get this data across would be great. I have tried to include all the relevant info; if you need more, just ask.
0 Replies - 3006 Views - Last Post: 09 January 2008 - 12:30 PM