I have a csv file with 350,000 rows, each row has about 150 columns.
What would be the best way to insert these rows into SQL Server using ADO.Net?
The way I’ve usually done it is to create the SQL statement manually. I was wondering if there is any way I can code it to simply insert the entire datatable into SQL Server? Or some short-cut like this.
By the way I already tried doing this with SSIS, but there are a few data clean-up issues which I can handle with C# but not so easily with SSIS. The data started as XML, but I changed it to CSV for simplicity.
Make a class “CsvDataReader” that implements IDataReader. Just implement Read(), GetValue(int i), Dispose() and the constructor : you can leave the rest throwing NotImplementedException if you want, because SqlBulkCopy won’t call them. Use read to handle the read of each line and GetValue to read the i’th value in the line.
Then pass it to the SqlBulkCopy with the appropriate column mappings you want.
I get about 30000 records/per sec insert speed with that method.
If you have control of the source file format, make it tab delimited as it’s easier to parse than CSV.
Edit : http://www.codeproject.com/KB/database/CsvReader.aspx – tx Mark Gravell.
SqlBulkCopy if it’s available. Here is a very helpful explanation of using SqlBulkCopy in ADO.NET 2.0 with C#
I think you can load your XML directly into a DataSet and then map your SqlBulkCopy to the database and the DataSet.
Hey you should revert back to XML instead of csv, then load that xml file in a temp table using openxml, clean up your data in temp table and then finally process this data.
I have been following this approach for huge data imports where my XML files happen to be > 500 mb in size and openxml works like a charm.
You would be surprised at how much faster this would work compared to manual ado.net statements.