sql server – SSIS (C# or VB): delete rows 1-12 in all excel files in directory

Posted by: admin May 14, 2020

Questions:

Before importing data from multiple Excel files, I need to get rid of the first 12 rows in each worksheet. I am going to use the code from this solution for the bulk-processing Script Task.

Questions:

  • What code should I insert into the script to delete the rows? (I suppose right after the comment "//Load the DataTable with Sheet Data so we can get the column header"); or
  • How do I modify this code to make it read the Excel files starting from row 13; or, alternatively
  • What SSIS task should I insert before the script for bulk row deletion?
Answers:

This is a method for looping through sheets:

Create a Data Flow Task to read the sheet names into an ADO object.

[Image: Data flow]

The first item is a Script Component configured as a source.
I have a variable that holds the connection string to the Excel spreadsheet:

[Image: connstr]

I created an output named SheetName:

[Image: Output Setup]

Here’s the code to read tab names:
C#

The code basically opens the spreadsheet with OLE DB, puts the table names into a DataTable, and then loops through the DataTable, writing each row to the output.

Make sure to close the connection! Leaving it open may cause errors later.
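A minimal sketch of what such a script might look like is shown below; it is not necessarily the exact code used in this answer. The class and method names (`SheetNameReader`, `NamesFromSchema`, `ReadSheetNames`) are made up for illustration. The commented usage assumes the Script Component keeps its default buffer name `Output0Buffer`, that the output column is called `SheetName`, and that the connection string variable is called `connstr` and is listed under ReadOnlyVariables.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.OleDb;

public static class SheetNameReader
{
    // Pulls the TABLE_NAME column out of a schema table
    // (one row per table that the provider reports for the workbook).
    public static List<string> NamesFromSchema(DataTable schema)
    {
        var names = new List<string>();
        foreach (DataRow row in schema.Rows)
        {
            names.Add(row["TABLE_NAME"].ToString());
        }
        return names;
    }

    // Opens the workbook through OLE DB, reads the table list,
    // and closes the connection when the using block ends.
    public static List<string> ReadSheetNames(string connectionString)
    {
        using (var connection = new OleDbConnection(connectionString))
        {
            connection.Open();
            DataTable schema =
                connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
            return NamesFromSchema(schema);
        } // Dispose() closes the connection, even if an exception was thrown
    }
}

// Inside the SSIS Script Component (only compiles inside SSIS),
// the usage would look roughly like this:
//
//   public override void CreateNewOutputRows()
//   {
//       foreach (string name in SheetNameReader.ReadSheetNames(Variables.connstr))
//       {
//           Output0Buffer.AddRow();
//           Output0Buffer.SheetName = name;
//       }
//   }
```

Wrapping the connection in `using` is one way to guarantee that it is closed, so the workbook is not left locked for the later steps of the package.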

The next step is a Conditional Split, because for some reason the result contains duplicates of the tab names, all ending in '_'.

[Image: Conditional Split]

The next step is a Derived Column that cleans the extra ' characters out of the sheet name:

[Image: DerivedCol]
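The Conditional Split and Derived Column steps are configured with SSIS expressions, which are not shown here. As a rough illustration of the same logic in C#, the sketch below drops every name ending in '_' and strips apostrophes from the rest. The class `SheetNameCleaner` and its methods are hypothetical names, and the sample sheet names in the usage note are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SheetNameCleaner
{
    // Mirrors the Conditional Split: entries ending in '_' are treated as duplicates.
    public static bool IsDuplicateEntry(string rawName)
    {
        return rawName.EndsWith("_", StringComparison.Ordinal);
    }

    // Mirrors the Derived Column: remove every apostrophe from the name.
    public static string StripQuotes(string rawName)
    {
        return rawName.Replace("'", "");
    }

    // Applies both steps in order and keeps the original ordering.
    public static List<string> Clean(IEnumerable<string> rawNames)
    {
        return rawNames
            .Where(n => !IsDuplicateEntry(n))
            .Select(n => StripQuotes(n))
            .ToList();
    }
}
```

For example, `SheetNameCleaner.Clean(new[] { "Sheet1$", "Sheet1$_", "'My Sheet$'" })` returns a list containing `Sheet1$` and `My Sheet$`.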

Create a variable of type Object; I named mine ADO_Sheets.

Insert a Recordset Destination:
1. Set its variable to the variable you just created
2. Map the column for the cleaned sheet name

Now go back to the Control Flow and set up a Foreach Loop container:

Configure the foreach…
Enumerator: Foreach ADO Enumerator
Source: ADO_Sheets
Variable Mapping: Set to a variable called SheetName

I have a Function Task inside the loop, but it is there mostly for ease of understanding; the same thing could have been done in the variables:
SQL

This variable now holds the SELECT statement for extracting the data from that sheet.
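To tie this back to the question of skipping rows 1-12: one commonly used approach with the ACE OLE DB provider is to put a cell range after the sheet name in the SELECT, so that reading starts at row 13. The sketch below only builds that string. `SheetQuery.BuildSelect` is a hypothetical helper, the last column "Z" is an arbitrary assumption to adjust to the real data, and 1048576 is the row limit of an .xlsx worksheet. It assumes the sheet name still ends with "$", as it does when it comes from the schema table.

```csharp
using System;

public static class SheetQuery
{
    // Builds "SELECT * FROM [<sheet>A<firstRow>:<lastColumn><lastRow>]".
    // sheetName is expected to include the trailing '$', e.g. "Sheet1$".
    public static string BuildSelect(string sheetName, int firstRow = 13,
                                     string lastColumn = "Z", int lastRow = 1048576)
    {
        if (firstRow < 1 || lastRow < firstRow)
        {
            throw new ArgumentOutOfRangeException("firstRow");
        }
        return string.Format("SELECT * FROM [{0}A{1}:{2}{3}]",
                             sheetName, firstRow, lastColumn, lastRow);
    }
}
```

`SheetQuery.BuildSelect("Sheet1$")` produces `SELECT * FROM [Sheet1$A13:Z1048576]`. With HDR=YES in the connection string, the first row of that range (row 13) is treated as the header row.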

The last item is the Data Flow Task you want to run.

Lots of work, but I use this so often that I thought I would share!

Adding info about connection strings to Excel (xlsx)

Excel 2010
Xlsx files
Connect to Excel 2007 (and later) files with the Xlsx file extension. That is the Office Open XML format with macros disabled.

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myExcel2007file.xlsx;
Extended Properties="Excel 12.0 Xml;HDR=YES";

"HDR=Yes;" indicates that the first row contains column names, not data. "HDR=No;" indicates the opposite.

Source: https://www.connectionstrings.com/ace-oledb-12-0/
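As a small illustration of the connection string above, the following sketch assembles it in C# with the HDR setting as a parameter. `ExcelConnection.Build` is a hypothetical helper, not part of any library, and it does no escaping of the path.

```csharp
using System;

public static class ExcelConnection
{
    // Builds an ACE OLE DB connection string for an .xlsx file.
    // firstRowIsHeader selects HDR=YES (first row holds column names) or HDR=NO.
    public static string Build(string path, bool firstRowIsHeader)
    {
        string hdr = firstRowIsHeader ? "YES" : "NO";
        return string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" +
            "Extended Properties=\"Excel 12.0 Xml;HDR={1}\";",
            path, hdr);
    }
}
```

Calling `ExcelConnection.Build(@"c:\myFolder\myExcel2007file.xlsx", true)` yields the same connection string as shown above, on a single line.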