I have a client who needs to import rows from a LARGE Excel file (72K rows) into their SQL Server database. This file is uploaded by users of the system. Performance became an issue when we tried to process the file at upload time. Now we just save it to disk, and an admin picks it up, splits it into 2K-row chunks, and runs each chunk through an upload tool one by one. Is there an easier way to accomplish this without hurting performance or hitting timeouts?
If I understand your problem correctly, you get a large spreadsheet and need to upload it into a SQL Server database. I’m not sure why your current process is slow, but 72K rows is not a data volume that should be inherently slow to load.
Depending on what development tools you have available it should be possible to get this to import in a reasonable time.
SSIS can read from Excel files. You could schedule a job that wakes up periodically and checks for a new file. If it finds one, it uses a data flow task to import it into a staging table, then a SQL task to run the processing on it.
If you can use .NET, you could write an application that reads the data out through the Excel COM automation API and loads it into a staging table through SqlBulkCopy. You can read an entire range into a variant array in one call, which avoids cell-by-cell access. This is not super-fast, but it should be fast enough for your purposes.
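The key to making the bulk-copy step fast is sending rows in batches rather than one insert per row. The sketch below illustrates that batching pattern in Python (the real thing would be .NET's `SqlBulkCopy` with its `BatchSize` property); `send_batch` is a hypothetical callback standing in for the actual server round trip:

```python
from typing import Callable, Iterable, Iterator, Sequence

Row = Sequence[object]

def batched(rows: Iterable[Row], batch_size: int) -> Iterator[list[Row]]:
    """Yield rows in fixed-size batches, analogous to SqlBulkCopy's BatchSize."""
    batch: list[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

def bulk_load(rows: Iterable[Row],
              send_batch: Callable[[list[Row]], None],
              batch_size: int = 2000) -> int:
    """Push every batch to the loader; returns the number of rows sent."""
    total = 0
    for batch in batched(rows, batch_size):
        send_batch(batch)  # one round trip / commit per batch, not per row
        total += len(batch)
    return total
```

With 2K-row batches, a 72K-row file becomes 36 round trips instead of 72,000, which is usually the difference between seconds and minutes.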
If you don’t mind using VBA, you can write a macro that does something similar. However, traditional ADO has no bulk-load feature, so you would need to export a .CSV (or something similar) to a drive visible from the server and then BULK INSERT from that file. You may also need a bcp format file describing the layout of the .CSV.
Headless imports from user-supplied spreadsheets are always troublesome, so there is quite a bit of merit in doing this through a desktop application. The principal benefit is error reporting: a headless job can really only send an email with some status information, whereas with an interactive application the user can troubleshoot the file and make repeated attempts until they get it right.
I could be wrong, but from your description it sounds like you were doing the processing in application code (i.e. the file is uploaded and the code that handles the upload then processes the import, possibly row by row).
In any event, I’ve had the most success importing large datasets like that with SSIS. I’ve also set up a spreadsheet as a linked server, which works but always felt a bit hacky to me.
Take a look at this article which details how to import data using several different methods, namely:
- SQL Server Data Transformation Services (DTS)
- Microsoft SQL Server 2005 Integration Services (SSIS)
- SQL Server linked servers
- SQL Server distributed queries
- ActiveX Data Objects (ADO) and the Microsoft OLE DB Provider for SQL Server
- ADO and the Microsoft OLE DB Provider for Jet 4.0