c# – Strip all formatting from Excel file on load

Posted by: admin, May 14, 2020

Questions:

I want to strip all formatting (borders etc.) from an Excel file when it is loaded, before the data is filled into a DataTable.

When I run my code, the updateExcel_Click handler fills column C of every row with the text from the ConsigneeCombo box. However, if the file I am processing has formatting, for example 10 rows with borders of which only 8 contain text, it updates all 10 rows because of the formatting.

EDIT

Rather than stripping out the borders, what about changing updateExcel_Click so that it only updates rows that contain text?

private void updateExcel_Click(object sender, EventArgs e)
{
    for (int i = 0; i < dataGridView1.RowCount - 1; i++)
    {
        dataGridView1[2, i].Value = ConsigneeCombo.Text;
    }
}

My current GetData code is:

    private DataTable GetData(string userFileName)
    {
        string dirName = Path.GetDirectoryName(userFileName);
        string fileName = Path.GetFileName(userFileName);
        string fileExtension = Path.GetExtension(userFileName);
        string connection = string.Empty;
        string query = string.Empty;
        switch (fileExtension)
        {
            case ".xls":
                connection = $@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={userFileName};" +
                             "Extended Properties=\"Excel 8.0; HDR=Yes; IMEX=1\"";
                string sheetNamexls;
                using (OleDbConnection con = new OleDbConnection(connection))
                {
                    con.Open();
                    var dtSchema = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
                    sheetNamexls = dtSchema.Rows[0].Field<string>("TABLE_NAME");
                }

                if (sheetNamexls.Length <= 0) throw new InvalidDataException("No sheet found.");

                query = $"SELECT * FROM [{sheetNamexls}]";
                break;

            case ".xlsx":
                connection = $@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={userFileName};" +
                             "Extended Properties=\"Excel 12.0; HDR=Yes; IMEX=1\"";
                string sheetName;
                using (OleDbConnection con = new OleDbConnection(connection))
                {
                    con.Open();
                    var dtSchema = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
                    sheetName = dtSchema.Rows[0].Field<string>("TABLE_NAME");

                }

                if (sheetName.Length <= 0) throw new InvalidDataException("No sheet found.");

                query = $"SELECT * FROM [{sheetName}]";
                break;
            case ".csv":
                connection = $@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={dirName};" +
                               "Extended Properties=\"text; HDR=Yes; IMEX=1; FMT=Delimited\"";
                query = $"SELECT * FROM [{fileName}]";
                break;
        }
        return FillData(connection, query);
    }

I have tried adding the ClearFormats(); method but cannot get it to work.

Full code:

How to & Answers:

I am in agreement with @Maciej Los: your question appears to be about Excel, but the code that adds the ComboBox text to the third column of every row operates on a DataGridView, not on Excel. This is confusing, so I will start from the perspective of the DataGridView, since that is what the current code uses.

From your comment…

….. If the file i am processing has formatting, for example 10 rows
with borders but only 8 of them rows with text it updates all 10
because of the formatting.

This is not quite accurate. The code is not updating them because of the formatting; it is updating them because there are ten (10) rows in the grid. The posted code simply loops through all the rows. It does not check for any formatting, nor does it check whether a row is empty.

When you read an Excel file that has cell formatting in otherwise empty cells (as you described), those cells will get picked up on the read, and each such row becomes a row in the data source even though all of its cells may be empty. This is an Excel issue, and I know of a solution that removes these empty cells before your code reads the Excel file, eliminating the empty rows from the start.

I hope I am not missing something….

To do this with the DataGridView, you could create a small method that, given a row in the grid, returns true if the row is empty of text, and call it from the existing updateExcel_Click handler so that only rows with text are updated.
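A minimal sketch of such a check, not the answerer's original code. It assumes a hypothetical helper class named RowChecks and takes the cell values rather than a row index, so that it does not depend on WinForms; the commented handler shows one way it might be wired into the grid.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the name RowChecks is an assumption for illustration.
public static class RowChecks
{
    // Returns true when every value is null, DBNull, or whitespace-only text.
    public static bool IsRowEmpty(IEnumerable<object> cellValues)
    {
        return cellValues.All(v =>
            v == null || v is DBNull || string.IsNullOrWhiteSpace(v.ToString()));
    }
}

// Possible use inside the form (needs System.Windows.Forms):
//
// private void updateExcel_Click(object sender, EventArgs e)
// {
//     for (int i = 0; i < dataGridView1.RowCount - 1; i++)
//     {
//         var values = dataGridView1.Rows[i].Cells
//             .Cast<DataGridViewCell>()
//             .Select(c => c.Value);
//
//         // Skip rows that only exist because of formatting.
//         if (!RowChecks.IsRowEmpty(values))
//             dataGridView1[2, i].Value = ConsigneeCombo.Text;
//     }
// }
```

Rows skipped this way stay empty, so clicking the button again still leaves them untouched.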

As for removing the empty formatted cells from the Excel file itself, this question may help:

Fastest method to remove Empty rows and Columns From Excel Files using Interop

I am aware this uses Interop; however, I am confident it would not be difficult to implement it using OLEDB. Basically, the UsedRange of an Excel sheet is read into an object array, which drops the formatting.
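If you stay with OLEDB, one simpler alternative (a different technique from the Interop approach in the linked question) is to leave the file alone and drop the empty rows from the DataTable after it has been filled. A rough sketch, assuming a hypothetical helper named DataTableCleanup:

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical helper; the name DataTableCleanup is an assumption for illustration.
public static class DataTableCleanup
{
    // Removes rows whose every field is null, DBNull, or whitespace-only text.
    // Returns the number of rows removed.
    public static int RemoveEmptyRows(DataTable table)
    {
        // Collect the matches first: rows cannot be removed while enumerating table.Rows.
        var emptyRows = table.Rows.Cast<DataRow>()
            .Where(r => r.ItemArray.All(v =>
                v == null || v is DBNull || string.IsNullOrWhiteSpace(v.ToString())))
            .ToList();

        foreach (DataRow row in emptyRows)
        {
            table.Rows.Remove(row);
        }

        return emptyRows.Count;
    }
}
```

In the question's GetData, this could be called on the result of FillData(connection, query) before the table is bound to the grid.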

Please let me know if I am missing something important. Hope this helps.

Answer:

When i run my code, the updateExcel_Click part updates column C with
what is in ConsigneeCombo box for each row, however if the file i am
processing has formatting, for example 10 rows with borders but only 8
of them rows with text it updates all 10 because of the formatting

Matt, I’m sorry, but the code you’ve posted is not related to Excel. It updates the dataGridView1 cells unconditionally. So, if you want to update only some of the cells, you have to add a condition inside the loop that skips the rows you do not want to change.

But I really do believe that this isn’t what you’re looking for, because you’re using the OleDb provider to fetch the Excel data.

Note that the OleDb provider supports CRUD-style operations: you can INSERT (create), SELECT (read) and UPDATE (modify) Excel data through OleDbCommand. DELETE (destroy) is the usual exception, since the Excel driver generally rejects deleting rows from a worksheet.

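So, if you would like to UPDATE data, use an UPDATE statement against the worksheet.

You have to pass it to the OleDbCommand as a string, either through its CommandText property or through its constructor.

But I have to warn you: the OleDb provider for JET/ACE does not recognize named parameters! Parameters are matched by position, so you have to add them to the OleDbCommand in the same order as their placeholders appear in the statement.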
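As a rough illustration of such an UPDATE with positional parameters (not the answerer's original statement): the class name ExcelUpdater and the column arguments are assumptions, and the connection string is assumed to allow writes. Import mode (IMEX=1), as used in the question's read connection strings, is commonly reported to block updates.

```csharp
using System.Data.OleDb;

// Hypothetical helper; ExcelUpdater and its parameters are assumptions for illustration.
public static class ExcelUpdater
{
    // Builds the UPDATE text with two positional '?' placeholders.
    public static string BuildUpdateSql(string sheetName, string targetColumn, string keyColumn)
    {
        return $"UPDATE [{sheetName}] SET [{targetColumn}] = ? WHERE [{keyColumn}] = ?";
    }

    // Sets targetColumn to newValue on rows where keyColumn equals keyValue.
    // Returns the number of rows affected.
    public static int UpdateCell(string connectionString, string sheetName,
        string targetColumn, string newValue, string keyColumn, string keyValue)
    {
        string sql = BuildUpdateSql(sheetName, targetColumn, keyColumn);

        using (var con = new OleDbConnection(connectionString))
        using (var cmd = new OleDbCommand(sql, con))
        {
            // The names are ignored by the provider; only the order of the calls matters.
            cmd.Parameters.AddWithValue("@newValue", newValue); // first '?'
            cmd.Parameters.AddWithValue("@keyValue", keyValue); // second '?'

            con.Open();
            return cmd.ExecuteNonQuery();
        }
    }
}
```

The sheet name is expected in the form the schema query returns it, for example Sheet1$, and with HDR=Yes the column names are the header cells of the sheet. Swapping the two AddWithValue calls would silently swap the values, which is the pitfall the warning above is about.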

Finally, I’d suggest rethinking your application and separating business logic from data access. See:
Creating a Data Access Layer (C#)
Creating a Business Logic Layer (C#)
Writing a Portable Data Access Layer

The articles above are written for ASP.NET pages, but the same layering applies to WinForms!

A DAL class for the Excel file (ExcelDAL, say) would own the connection string and expose the data operations as methods.

Feel free to adapt the ExcelDAL class to your needs.
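The sketch below is only an illustration of what such a class might contain, not the answerer's original ExcelDAL. The method names BuildConnectionString and ReadSheet are assumptions, and the connection strings mirror the ones in the question's GetData.

```csharp
using System;
using System.Data;
using System.Data.OleDb;
using System.IO;

// Illustrative sketch; member names are assumptions, not the original class.
public class ExcelDAL
{
    private readonly string _connectionString;

    public ExcelDAL(string connectionString)
    {
        _connectionString = connectionString;
    }

    // Connection string for .xls or .xlsx files, as used in the question.
    public static string BuildConnectionString(string path)
    {
        bool isXls = Path.GetExtension(path)
            .Equals(".xls", StringComparison.OrdinalIgnoreCase);
        string provider = isXls ? "Microsoft.Jet.OLEDB.4.0" : "Microsoft.ACE.OLEDB.12.0";
        string version = isXls ? "Excel 8.0" : "Excel 12.0";

        return $"Provider={provider};Data Source={path};" +
               $"Extended Properties=\"{version}; HDR=Yes; IMEX=1\"";
    }

    // Reads a whole sheet (for example "Sheet1$") into a DataTable.
    public DataTable ReadSheet(string sheetName)
    {
        var table = new DataTable();

        using (var con = new OleDbConnection(_connectionString))
        using (var adapter = new OleDbDataAdapter($"SELECT * FROM [{sheetName}]", con))
        {
            // Fill opens the connection if it is closed and closes it again afterwards.
            adapter.Fill(table);
        }

        return table;
    }
}
```

The form would then call something like new ExcelDAL(ExcelDAL.BuildConnectionString(userFileName)).ReadSheet(sheetName) and bind the result to the grid, keeping the OleDb details out of the UI code.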

Good luck!