I want to count the occurrences of each distinct word in column H using C#, ignoring null (empty) values. Afterwards, I have to display the output in a ListBox. I am using the Microsoft.Office.Interop.Excel reference to open the workbook and access the first worksheet.
I tried the following code:
private void button3_Click(object sender, EventArgs e)
{
    Excel.Application xlApp;
    Excel.Workbook xlWorkbook;
    Excel.Worksheet xlWorksheet;
    object misValue = System.Reflection.Missing.Value;
    xlApp = new Excel.Application();
    xlWorkbook = xlApp.Workbooks.Open("ABC.xlsx", 0, true, 5, "", "", true, Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
    xlWorksheet = (Excel.Worksheet)xlWorkbook.Worksheets.get_Item(1);
    Excel.Range bColumn = xlWorksheet.UsedRange.Columns[4, Type.Missing].Columns.Count;
    List<string> dataItems = new List<string>();
    foreach (object o in bColumn)
    {
        Excel.Range row = o as Excel.Range;
        string s = row.get_Value(null);
        dataItems.Add(s);
    }
    listBox1.DataSource = dataItems;
    xlWorkbook.Close(true, misValue, misValue);
    xlApp.Quit();
    releaseObject(xlWorksheet);
    releaseObject(xlWorkbook);
    releaseObject(xlApp);
}
Please help me out with the code and suggest the fastest possible approach as the worksheet contains more than a thousand rows.
Thanks in advance for the help!
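Read the whole column into memory in one call instead of looping over the cells. For example, if you want all the values in column C:
object[,] columnValues = (object[,])xlWorksheet.Range["C:C"].Value2;
columnValues then holds every value in that column as a two-dimensional array (Value2 returns object[,] for a multi-cell range, with null for empty cells). Note that C:C covers the entire column, so narrowing the range to the rows you actually use keeps the array small.
Working in memory is a lot faster than making one COM call per cell. First convert the array to a list of strings, skipping the nulls: columnValues.Cast<object>().Where(a => a != null).Select(a => a.ToString()).ToList()
After that you can apply whatever logic you need with LINQ, such as Distinct to get the distinct values, or GroupBy with Count to get the number of occurrences of each value.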
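The counting step described above could look roughly like the sketch below. It runs without Excel: the sample array is a hypothetical stand-in for what Value2 returns for column H, and the names WordCounter and CountOccurrences are made up for illustration. Treating "apple" and "Apple" as the same word and trimming whitespace are assumptions; drop the comparer and the Trim call if you need exact matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounter
{
    // Counts how often each non-empty value occurs in a 2-D array such as
    // the one Range.Value2 returns for a multi-cell range.
    public static Dictionary<string, int> CountOccurrences(object[,] cells)
    {
        return cells
            .Cast<object>()                       // flatten the 2-D array
            .Where(v => v != null)                // ignore empty cells
            .Select(v => v.ToString().Trim())     // assumption: trim whitespace
            .Where(s => s.Length > 0)
            // Assumption: case-insensitive grouping of words.
            .GroupBy(s => s, StringComparer.OrdinalIgnoreCase)
            .ToDictionary(g => g.Key, g => g.Count(),
                          StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // Hypothetical sample standing in for the values read from column H.
        object[,] sample = { { "apple" }, { null }, { "pear" }, { "Apple" }, { "pear " } };

        foreach (KeyValuePair<string, int> pair in CountOccurrences(sample))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

To show the result in the list box from the question, you could bind a list of formatted strings, for example listBox1.DataSource = counts.Select(p => p.Key + ": " + p.Value).ToList(); where counts is the dictionary returned by CountOccurrences.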