I need to import data from an Excel spreadsheet into SQL Server, but the data isn’t in a relational/normalized format so the import wizard isn’t going to cut it (as far as I know).
The data is in this format:
Category    SubCategory    Name       Description
Category#1  SubCategory#1  Product#1  Description#1
Category#1  SubCategory#1  Product#2  Description#2
Category#1  SubCategory#2  Product#3  Description#3
Category#1  SubCategory#2  Product#4  Description#4
Category#2  SubCategory#3  Product#5  Description#5
(apologies I’m lacking the inventiveness to come up with ‘real’ data at this time in the morning…)
Each row contains a unique product, but the category structure is duplicated. I want to import this data into three tables: Category, SubCategory and Product.
(I know SubCategory should really be contained within Category, DB was not my design)
I need a way to import unique rows based on the Category and then SubCategory columns, and then when importing the other columns into Product, obtain a reference to the SubCategory based on name.
Short of scripting this, is there any way to do it using the import wizard or some other tool?
I had a similar problem a while ago, and did not find any easy way to do this using an import wizard. Since it was a one-off task, and not something that was going to hang around, I resolved the import by creating a simple macro (VBA) in Excel that would call a stored proc once per row, passing that row's cells as the parameters.
The stored proc would intelligently insert each parameter (column), and then grab the ID to use as the foreign key on the next parameter insert.
    DECLARE @CategoryID INT
    DECLARE @SubCategoryID INT

    -- Check that the Category exists
    IF NOT EXISTS (SELECT * FROM tblCategories WHERE CategoryName = @pCategoryName)
    BEGIN
        -- Your insert statement here, then grab the ID
        SET @CategoryID = SCOPE_IDENTITY()
    END
    ELSE
    BEGIN
        -- Set the category ID here
    END
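To make the "insert if missing, then reuse the ID as the foreign key" idea concrete, here is a minimal sketch of the same per-row logic in Python. It uses an in-memory SQLite database as a stand-in for SQL Server, so `lastrowid` plays the role of `SCOPE_IDENTITY()`. The schema, the column names and the helper functions are all assumptions made up for illustration (in particular, it assumes SubCategory carries a CategoryID and that subcategory names are unique), so adapt them to the real tables.

```python
import sqlite3

# Hypothetical schema for illustration only; the real tables will differ.
SCHEMA = """
CREATE TABLE Category (
    CategoryID   INTEGER PRIMARY KEY,
    CategoryName TEXT UNIQUE
);
CREATE TABLE SubCategory (
    SubCategoryID   INTEGER PRIMARY KEY,
    SubCategoryName TEXT UNIQUE,
    CategoryID      INTEGER REFERENCES Category(CategoryID)
);
CREATE TABLE Product (
    ProductID     INTEGER PRIMARY KEY,
    Name          TEXT,
    Description   TEXT,
    SubCategoryID INTEGER REFERENCES SubCategory(SubCategoryID)
);
"""

def get_or_create_category(conn, name):
    """Return the ID of the named category, inserting it first if missing."""
    row = conn.execute(
        "SELECT CategoryID FROM Category WHERE CategoryName = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute("INSERT INTO Category (CategoryName) VALUES (?)", (name,))
    return cur.lastrowid  # ID generated by the insert we just ran

def get_or_create_subcategory(conn, name, category_id):
    """Return the ID of the named subcategory, inserting it first if missing."""
    row = conn.execute(
        "SELECT SubCategoryID FROM SubCategory WHERE SubCategoryName = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO SubCategory (SubCategoryName, CategoryID) VALUES (?, ?)",
        (name, category_id),
    )
    return cur.lastrowid

def import_row(conn, category, subcategory, name, description):
    """Handle one spreadsheet row: resolve both lookups, then insert the product."""
    category_id = get_or_create_category(conn, category)
    subcategory_id = get_or_create_subcategory(conn, subcategory, category_id)
    conn.execute(
        "INSERT INTO Product (Name, Description, SubCategoryID) VALUES (?, ?, ?)",
        (name, description, subcategory_id),
    )

# The sample rows from the question, minus the header row.
ROWS = [
    ("Category#1", "SubCategory#1", "Product#1", "Description#1"),
    ("Category#1", "SubCategory#1", "Product#2", "Description#2"),
    ("Category#1", "SubCategory#2", "Product#3", "Description#3"),
    ("Category#1", "SubCategory#2", "Product#4", "Description#4"),
    ("Category#2", "SubCategory#3", "Product#5", "Description#5"),
]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for row in ROWS:
    import_row(conn, *row)
conn.commit()
```

Because every lookup happens before the insert that depends on it, the rows can arrive in any order and duplicated categories are only stored once.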
The VBA macro had code similar to:
    Private Sub CommandButton1_Click()
        Dim cnt As ADODB.Connection
        Dim wbBook As Workbook
        Dim wsSheet As Worksheet
        Dim intActiveRow As Long

        ' Get our connection
        Set cnt = CreateConnection()

        ' Read the input sheet
        Set wbBook = ActiveWorkbook
        Set wsSheet = wbBook.Worksheets(1)

        ' Ignore the header row
        intActiveRow = 2

        ' Process every row into the database, stopping at the first empty cell in column A
        Do While wsSheet.Cells(intActiveRow, 1).Value <> ""
            ' Execute the stored procedure; GenerateScript would create your SQL
            cnt.Execute GenerateScript(wsSheet, intActiveRow)

            ' Move on to the next row
            intActiveRow = intActiveRow + 1
        Loop

        ' Cleaning up
        cnt.Close
        Set cnt = Nothing
        Set wbBook = Nothing
        Set wsSheet = Nothing
    End Sub
You might want to investigate SSIS (SQL Server Integration Services), formerly known as DTS (Data Transformation Services).
In SSIS you can use Excel as a data source and specify filters and transformations of the data for loading into the appropriate SQL Server tables. It might take a little research, but it is a pretty robust tool, and it also supports creating a Script task if you need to do something that isn't available out of the box.
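An SSIS package is put together in a designer rather than written as code, so the following is not SSIS. It is only a rough sketch of the transformation such a package would need to perform: take the distinct categories, then the distinct subcategories, then resolve each product's subcategory by name. It is written in Python against an in-memory SQLite database standing in for SQL Server, and the `Staging` table and all column names are hypothetical. It also assumes subcategory names are unique, since the product lookup is by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical staging table (a raw copy of the sheet) plus the three targets.
conn.executescript("""
CREATE TABLE Staging (Category TEXT, SubCategory TEXT, Name TEXT, Description TEXT);
CREATE TABLE Category (
    CategoryID   INTEGER PRIMARY KEY,
    CategoryName TEXT UNIQUE
);
CREATE TABLE SubCategory (
    SubCategoryID   INTEGER PRIMARY KEY,
    SubCategoryName TEXT UNIQUE,
    CategoryID      INTEGER REFERENCES Category(CategoryID)
);
CREATE TABLE Product (
    ProductID     INTEGER PRIMARY KEY,
    Name          TEXT,
    Description   TEXT,
    SubCategoryID INTEGER REFERENCES SubCategory(SubCategoryID)
);
""")

# Load the flat rows unchanged, as a plain import of the sheet would.
rows = [
    ("Category#1", "SubCategory#1", "Product#1", "Description#1"),
    ("Category#1", "SubCategory#1", "Product#2", "Description#2"),
    ("Category#1", "SubCategory#2", "Product#3", "Description#3"),
    ("Category#1", "SubCategory#2", "Product#4", "Description#4"),
    ("Category#2", "SubCategory#3", "Product#5", "Description#5"),
]
conn.executemany("INSERT INTO Staging VALUES (?, ?, ?, ?)", rows)

# Split the flat rows into the three tables, parents first.
conn.executescript("""
-- 1. One row per distinct category name.
INSERT INTO Category (CategoryName)
SELECT DISTINCT Category FROM Staging;

-- 2. One row per distinct subcategory, linked to its category by name.
INSERT INTO SubCategory (SubCategoryName, CategoryID)
SELECT DISTINCT s.SubCategory, c.CategoryID
FROM Staging AS s
JOIN Category AS c ON c.CategoryName = s.Category;

-- 3. Every product, with its subcategory resolved by name.
INSERT INTO Product (Name, Description, SubCategoryID)
SELECT s.Name, s.Description, sc.SubCategoryID
FROM Staging AS s
JOIN SubCategory AS sc ON sc.SubCategoryName = s.SubCategory;
""")
conn.commit()
```

The three `INSERT ... SELECT` statements are plain SQL, so the same shape should carry over to SQL Server once the sheet has been loaded into a staging table.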
A good piece of software developed specifically for this type of work is Relational Excel. There's a trial edition, but it can be used past the trial period; it just shows nag screens every once in a while. www.relationalexcel.com
A very quick solution is to use the “Analyze Table” tool in MS Access, which will normalize your table. Give it a try!