.net – Why is SSIS saving the last row as NULL when transferring data from Excel to CSV?

Posted by: admin, May 14, 2020

Questions:

Excel:

Input file contains the data as shown below.

Column
------
123456
234567
ABCDEF

CSV/Text:

Output file contains the data as shown below.

Column
------
123456
234567
NULL

Why does the SSIS package write NULL instead of ABCDEF in the last row when transferring data from Excel to CSV?

Answers:

The issue is that the column in the Excel file contains mixed data, both numeric values and strings. The Excel provider samples the first few rows and infers the column's data type as numeric, which is wrong in this case. When you create an Excel Source to read this file, you will notice that the column is defined as a number and is treated that way. The string cannot be converted to a number, so it is written to the output file as NULL.
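To make the effect concrete, here is a toy model in Python of the type guessing described above. It is only a sketch under stated assumptions, not the provider's actual algorithm: the function names, the majority rule, and the default sample size of 8 (the commonly cited TypeGuessRows default) are illustrative.

```python
def is_number(value):
    """Return True if the cell text parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def read_column(values, imex=False, sample_size=8):
    """Toy model of reading one Excel column through the provider.

    Assumptions (hypothetical, for illustration only):
    - the type is guessed from the first `sample_size` cells,
    - the majority type in the sample wins,
    - with imex=True a mixed sample is read as text.
    """
    sample = values[:sample_size]
    numeric = sum(is_number(v) for v in sample)
    mixed = 0 < numeric < len(sample)

    if imex and mixed:
        # Import mode: keep every cell as text, nothing is lost.
        return [str(v) for v in values]

    if numeric * 2 >= len(sample):
        # Column guessed as numeric: cells that do not fit become NULL (None).
        return [float(v) if is_number(v) else None for v in values]

    # Column guessed as text.
    return [str(v) for v in values]


cells = ["123456", "234567", "ABCDEF"]
print(read_column(cells))             # numeric guess, last cell is lost
print(read_column(cells, imex=True))  # all cells kept as text
```

With the three cells from the question, the first call turns the last cell into None, while the second keeps all three cells as strings.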

You need to modify the ConnectionString property of the Excel connection manager to include IMEX=1 to indicate that the data source might contain values of different data types.

IMEX stands for intermixed. Read more about it here: Connection strings for Excel

Here is an example to illustrate the difference.

I created two identical Excel files as per the data provided in the question.

Excel_1

Excel_2

I created an SSIS package with the following connection managers.

Excel_1 had the following connection string:

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\temp\ExcelFile_1.xls;Extended Properties="Excel 8.0;HDR=YES";

Excel_2 had the following connection string. The only difference is the additional IMEX=1; entry. You need to add this manually to the ConnectionString property of the Excel Connection Manager. To view the properties, click on the Excel Connection Manager and press F4.

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\temp\ExcelFile_2.xls;Extended Properties="EXCEL 8.0;IMEX=1;HDR=YES";
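If you have several packages to patch, the manual edit can be scripted. The following Python sketch inserts IMEX=1 into the quoted Extended Properties part of a connection string like the ones shown above. The helper name add_imex is hypothetical, and the function assumes the Extended Properties value is wrapped in double quotes.

```python
import re


def add_imex(connection_string):
    """Return the connection string with IMEX=1 in Extended Properties.

    Hypothetical helper: assumes Extended Properties is double-quoted,
    as in the connection strings shown in this answer.
    """
    match = re.search(r'Extended Properties="([^"]*)"',
                      connection_string, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no quoted Extended Properties found")

    # Split the quoted value into its individual entries.
    props = [p.strip() for p in match.group(1).split(";") if p.strip()]

    # Add IMEX=1 only if no IMEX setting is present yet.
    if not any(p.upper().startswith("IMEX=") for p in props):
        props.insert(1, "IMEX=1")  # right after the "Excel 8.0" entry

    return (connection_string[:match.start(1)]
            + ";".join(props)
            + connection_string[match.end(1):])


original = (r'Provider=Microsoft.Jet.OLEDB.4.0;'
            r'Data Source=C:\temp\ExcelFile_1.xls;'
            r'Extended Properties="Excel 8.0;HDR=YES";')
print(add_imex(original))
```

Running the helper on a string that already contains an IMEX entry leaves it unchanged.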

Connections

I designed the data flow as shown below to transfer Excel_1.xls to FlatFile_1.csv and Excel_2.xls to FlatFile_2.csv.

Package

You can see in the output that the first flat file does not have a value in the third row, but the second file does. The reason is that the first Excel connection manager inferred that the column type is numeric, which is not true. The second connection manager, however, treated the column as text, so the string value was preserved.

FlatFile_1

FlatFile_2
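If the output files are large, a quick script can confirm which rows lost their value. This Python sketch scans CSV text and reports the data rows whose first column is empty or the literal NULL. The function name and the set of NULL markers are assumptions for illustration, and the sample text mirrors the data in the question rather than the actual files.

```python
import csv
import io


def rows_missing_value(csv_text, column_index=0, null_markers=("", "NULL")):
    """Return 1-based data row numbers whose cell is empty or a NULL marker."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    missing = []
    for row_number, row in enumerate(reader, start=1):
        # A short or blank row counts as a missing value.
        value = row[column_index].strip() if len(row) > column_index else ""
        if value in null_markers:
            missing.append(row_number)
    return missing


without_imex = "Column\n123456\n234567\nNULL\n"
with_imex = "Column\n123456\n234567\nABCDEF\n"
print(rows_missing_value(without_imex))
print(rows_missing_value(with_imex))
```

To check a real file, read it with open(path, newline="") and pass its contents to the function.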

You can right-click on the Excel data source and click Show Advanced Editor...

Excel data source

On the Advanced Editor, click Input and Output Properties, expand Excel Source Output and then expand External Columns. Click Column.

On the first Excel data source, which uses the connection manager Excel_1, the data type of the column is set to double-precision float [DT_R8].

Excel_1 Advanced

On the second Excel data source, which uses the connection manager Excel_2, the data type of the column is set to Unicode string [DT_WSTR].

Excel_2 Advanced

Hope that helps.