Page 1 of 1

Using Excel and XML data in VB.NET and SQL Server Rate Topic: ***** 1 Votes

#1 Psyguy  Icon User is offline

  • D.I.C Regular
  • member icon

Reputation: 76
  • View blog
  • Posts: 326
  • Joined: 12-January 11

Posted 13 January 2012 - 05:16 PM

I had a customer come to me the other day and ask if I could put together some data for him. It was only 18 months worth of shipped products so I naturally said "Sure! That won't be too hard. What database is it in?" He replied, "Well, we don't have the data stored in a database...what we have is all of the pallets of product that we have shipped in an Excel file." Now, at this point I got a little scared because that meant that part of the data was in an Excel file, and the other part was in the database. I needed to join them together and make one cohesive report.

I don't imagine this is too atypical for a small to medium sized business. Many folks are still stuck in the Excel glory days where everything that you needed in table form was done in Excel. Not to worry, though. Today I am going to talk about the information I got while searching for the answer to this problem. Anyone who has read my other Excel related blog will notice that I have reused many of the functions that I had previously written. Remember, this may not be the only way to do it and may not even be the best way to do it, but it does work. Lets get started.

The tools that I am using are Visual Studio 2010 set to framework 3.5 and SQL Server 2005.
Needed assemblies: I am using COM assembly Microsoft Excel 11.0 Object Library

Needed Imports:
Microsoft.Office.Interop
System.IO
System.Data.OleDb
System.Xml
System.Data

Lets assume you have the following Excel file filled with information that someone finds very important:

Attached Image

So now you need to go into the database and join some other information on the order number because the sale and customer information are only on this Excel sheet. Bare in mind that in my particular case, there were roughly 9000 rows of data so doing it by hand was not an option.

So now we have an Excel file and we need to read it into our .NET program so we can generate the XML that will be able to be read into SQL Server.

Import the data from an Excel file into a provided data set. The data for the specified sheet will be loaded into Table(0) of the data set.
Private Sub ImportExcelToDataTable(ByRef ds As DataSet)

        Dim ofd As New OpenFileDialog

        If ofd.ShowDialog = vbOK Then
            Dim full As String = ofd.FileName
            Dim path As String = String.Empty
            Dim fileN As String = String.Empty
            Dim sheet As String = String.Empty
            'get the file name
            For i As Integer = full.Length - 1 To 0 Step -1
                If full.Substring(i, 1) <> "\" Then
                    Dim str As String = full.Substring(i, 1) & fileN
                    fileN = str
                Else
                    Exit For
                End If
            Next

            path = full.Substring(0, full.Length - fileN.Length)
            sheet = InputBox("Please enter the name of the sheet you want to import:", "Sheet Name", "Sheet1")
            If Not sheet = String.Empty Then
                ds = GetExcelSheet(path, fileN, sheet)
                dgvImportedExcel.DataSource = ds.Tables(0)
            Else
                MsgBox("You did not enter a valid sheet name.")
            End If
        End If

    End Sub

Private Function GetExcelSheet(ByVal filePath As String, ByVal fileName As String, ByVal sheetName As String) As DataSet
        '' Excel 2003
        Dim connectionString, sqlString As String
        Dim ds As New DataSet
        Dim xlConn As OleDbConnection
        Dim xlCmd As OleDbCommand
        Dim xlDA As OleDbDataAdapter
        Dim xlFile As FileInfo

        connectionString = String.Format("Provider=Microsoft.Jet.OLEDB.4.0;" &
                                         "Data Source={0}", filePath + fileName) &
                                         ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1;"""

        sqlString = "SELECT * FROM [" & sheetName & "$]"

        xlFile = New FileInfo(fileName)
        If (xlFile.Extension.ToLower() = ".xlsx") Then
            ' Excel 2007
            connectionString = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;" &
                                             "Data Source={0}", filePath + fileName) &
                                             ";Extended Properties=""Excel 12.0 Xml;HDR=YES;"""
        End If

        xlConn = New OleDbConnection(connectionString)

        Try
            xlConn.Open()
            xlCmd = New OleDbCommand()
            xlCmd.Connection = xlConn
            xlCmd.CommandText = sqlString

            xlDA = New OleDbDataAdapter()
            xlDA.SelectCommand = xlCmd

            xlDA.Fill(ds)
        Catch ex As Exception
            GetExcelSheet = Nothing
        Finally
            xlConn.Close()
        End Try

        GetExcelSheet = ds
    End Function  



Attached Image

As you can see, I use an OpenFileDialog to capture the path of the Excel file I want to import. I then assign the table that has been populated with my Excel data to a data grid view, so I can make sure it is correct (I also use that dgv to edit the content prior to creating the XML file).

Next, we need to export this data to a form that SQL Server can use. You could always use .NET to execute the query, thereby bypassing the process I will talk about using SQL Server Management Studio, but I figured you would be able to follow along easier if I went through it this way.

Private Sub CreateXMLFile(ByRef srcTbl As DataTable)

        Try
            Dim sfd As New SaveFileDialog

            If sfd.ShowDialog = vbOK andalso Not sfd.FileName = String.Empty Then
                If srcTbl IsNot Nothing Then
                    If sfd.FileName.Substring(sfd.FileName.Length-4, 1) = "." then
                        srcTbl.TableName = sfd.FileName.Substring(0,sfd.FileName.Length - 4)
                        srcTbl.WriteXml(sfd.FileName)
                    Else
                        srcTbl.TableName = sfd.FileName
                        srcTbl.WriteXml(sfd.FileName & ".xml")
                    End If                    
                End If
            End If
        Catch ex As Exception
            MsgBox("There was a problem saving your file.  Check the name you entered to make sure it conforms to Microsoft standards.")
        End Try

    End Sub



Again, we are starting off this code with a dialog. This time, however, I used a SaveFileDialog, as we are saving the data to an XML file. The user can then enter whatever name they would like to give the file or overwrite an existing file. I parse through the selected file name to see if the user added the file extension and make allowances for either. Now you have an XML file containing all of your data from the Excel file.


For this next part, you will need SQL Server Management Studio (SSMS) opened.
Here is a portion of the XML code just for clarity:

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <C_x003A__x005C_Documents_x0020_and_x0020_Settings_x005C_jpriddy_x005C_My_x0020_Documents_x005C_jpTest>
    <OrderNum>1</OrderNum>
    <Date>2012-01-01T00:00:00-08:00</Date>
    <Customer>Tom</Customer>
    <Sale>10</Sale>
  </C_x003A__x005C_Documents_x0020_and_x0020_Settings_x005C_jpriddy_x005C_My_x0020_Documents_x005C_jpTest>
  <C_x003A__x005C_Documents_x0020_and_x0020_Settings_x005C_jpriddy_x005C_My_x0020_Documents_x005C_jpTest>
    <OrderNum>2</OrderNum>
    <Date>2012-01-02T00:00:00-08:00</Date>
    <Customer>Dick</Customer>
    <Sale>10</Sale>
  </C_x003A__x005C_Documents_x0020_and_x0020_Settings_x005C_jpriddy_x005C_My_x0020_Documents_x005C_jpTest>
</NewDataSet>


As you can see, it uses the file path that we set as the table name, for the second level name and NewDataSet as the top level name. Also notice that the dollar sign was removed when it was transferred from Excel into .NET. I'm not sure if there is a way to prevent this, but for our purposes, its not a huge deal.

The following SQL transaction will take the XML text that you previously pasted into it and create a temporary table in memory, then it will save the handle (a unique ID that allows us to call on that temporary table).

Declare @myXML xml

Set @myXML = 'XML file text'
Declare @myHandle int

EXEC sp_xml_preparedocument @myHandle OUTPUT, @myXML 

Select @myHandle


Attached Image

NOTE: You only need to run this part of the query ONE time. If you run it multiple times, you will have multiple copies of your table floating around in memory. While not a big deal if you do it two or three times, you don't want to make a habit of creating unneeded clutter. If you have two or more Excel files you want to join together, you can run this code and get a new handle for each of them thereby allowing you to use the Excel files as SQL tables. Additionally, for SQL Server 2005, the stored procedure used above needs to be in all lower case. This caused me many hours of problems because I couldn't figure out why it wouldn't work, as the book I read had the syntax slightly different. When you run this code, you will get a returned value that you need to make note of. It will be used in the next step.

To use the data that we loaded from the XML file and generate the new XML with our joined data we execute the following query in SSMS:

Declare @iNum int
SET @iNum = 1--Result of Select @myHandle

Select Orders.*
From
   (SELECT * FROM OpenXML(@iNum, '/NewDataSet/C_x003A__x005C_Documents_x0020_and_x0020_Settings_x005C_jpriddy_x005C_My_x0020_Documents_x005C_jpTest')
		--desiredColName variable type 'ActualColNameinXMLFile',
	WITH (OrderNum varchar(max) 'OrderNum',
		  Date varchar(max) 'Date',
		  Customer varchar(max) 'Customer',
		  Sale varchar(max) 'Sale')) as Orders --any additional joins will go here, just like any other table joins


which produces the following:

Attached Image

As you can see, the data is in SQL just like any other table. This allows us to perform any of SQL Servers numerous transactions on it.

Once your data is organized the way you want it, you need to get it back into XML format so we can read it back into our .NET application (or so you can capture the XML text with your application). To do this you simply add one line to the query you created with the above code (including any other transactions you decided to add)

Declare @iNum int
SET @iNum = 1--Result of Select @myHandle

Select Orders.*
From
   (SELECT * FROM OpenXML(@iNum, '/NewDataSet/C_x003A__x005C_Documents_x0020_and_x0020_Settings_x005C_jpriddy_x005C_My_x0020_Documents_x005C_jpTest')
		--desiredColName variable type 'ActualColNameinXMLFile',
	WITH (OrderNum varchar(max) 'OrderNum',
		  Date varchar(max) 'Date',
		  Customer varchar(max) 'Customer',
		  Sale varchar(max) 'Sale')) as Orders --any additional joins will go here, just like any other table joins
FOR XML Raw, Root


Attached Image

If you click on the link in the table, SSMS will open a new tab with your new XML code on it and you can save it using normal save methods:

Attached Image

Now that you have a new XML file with all of your precious data in it, lets say you want to write it back to the Excel file you got it from.
Private Sub ImportXMLToDataTable(ByRef destination As DataSet)

        Dim filePath As String = String.Empty
        Dim ofd As New OpenFileDialog
        Dim fs As System.IO.FileStream = Nothing

        If ofd.ShowDialog = vbOK Then
            filePath = ofd.FileName
            If filePath = String.Empty Then Exit Sub
        Else
            Exit Sub
        End If

        Try
            destination = New DataSet
            fs = New System.IO.FileStream(filePath, System.IO.FileMode.Open)

            destination.ReadXml(fs)
            dgvImportedExcel.DataSource = destination.Tables(0)
        Catch ex As Exception
            MsgBox(ex.ToString)
        Finally
            If Not fs Is Nothing Then fs.Close()
        End Try

End Sub



So now we have our XML file back in our .NET application. I wont show a picture since we have not actually joined any information to the table, it will look the same as above.

The next step you may want to take is to put all that newly joined, grouped, or whatevered data back into an Excel spreadsheet. To do this we are going to use the same technique I describe in my tutorial on exporting data to Excel. I will go over the basic idea though, for completeness.

Private Sub bgwDataToExcel_DoWork(sender As Object, e As System.ComponentModel.DoWorkEventArgs) Handles bgwDataToExcel.DoWork

        Dim excelApp As New Excel.ApplicationClass
        Dim location As String = e.Argument
        Dim WkBk As Excel.Workbook = excelApp.Workbooks.Open(location, 0, False, 5, "", "", False, Excel.XlPlatform.xlWindows, "", True, False, 0, True, False, False)

        Try
            Dim sheet As String = InputBox("New sheet name (if sheet name already exists the save will fail):", "Worksheet", "Sheet1")

            AddExcelSheet(WkBk, sheet, dgvImportedExcel)
            MsgBox("Excel file saved.")
        Catch ex As Exception
            MsgBox("Data Population failed.  Make sure you rename any sheets," &
                   "in your Excel file, which have the name you are trying to use" &
                   "before trying again.")
        Finally
            CloseWorkbook(Wkbk)
            ShutDownExcel(excelApp)
            e.Result = location
        End Try



As you can see, i use a background worker to perform this task as it can take quite a bit of time (depending on the size of your datatable). You must initially set up the Excel Applicationclass object, the path of the file to save to can be passed to the background worker as an argument, and set up the Excel Workbook object.

I have the user decide what to name the new sheet, as this will just add a sheet to an existing Excel file, not create a new file. This has the added benefit of retaining the original data, should the new data be corrupted somehow. The methods used in the AddExcelSheet method are contained in the code block below:

''' <summary>
    ''' Makes the excel headers.
    ''' </summary>
    ''' <param name="sheet">The sheet.</param>
    ''' <param name="dgv">The DGV.</param>
    Private Sub MakeExcelHeaders(ByRef sheet As Excel.Worksheet, ByVal dgv As DataGridView)

        For i = 0 To dgv.Columns.Count - 1
            sheet.Range(GetColumnLetter(i + 1) & "1").Value = dgv.Columns(i).HeaderText
        Next

    End Sub

    ''' <summary>
    ''' Formats the excel sheet.
    ''' </summary>
    ''' <param name="sheet">The sheet.</param>
    ''' <param name="dgvColCount">The DGV col count.</param>
    Private Sub FormatExcelSheet(ByRef sheet As Excel.Worksheet, ByVal dgvColCount As Integer)

        With sheet
            'change the column headers to bold font
            .Range(GetColumnLetter(1) & "1", GetColumnLetter(dgvColCount) & "1").Font.Bold = True
            'autofit the column width to the data
            .Columns.AutoFit()
            'center the values in the column
            .Columns.HorizontalAlignment = Excel.Constants.xlCenter
        End With

    End Sub

    ''' <summary>
    ''' Returns the letter string for excel when given the column number
    ''' </summary>
    ''' <param name="ColumnNumber"></param>
    ''' <returns></returns>
    ''' <remarks>Taken from http://www.freevbcode.com/ShowCode.asp?ID=4303</remarks>
    Function GetColumnLetter(ByVal ColumnNumber As Integer) As String
        If ColumnNumber > 26 Then
            ' 1st character:  Subtract 1 to map the characters to 0-25,
            '                 but you don't have to remap back to 1-26
            '                 after the 'Int' operation since columns
            '                 1-26 have no prefix letter
            ' 2nd character:  Subtract 1 to map the characters to 0-25,
            '                 but then must remap back to 1-26 after
            '                 the 'Mod' operation by adding 1 back in
            '                 (included in the '65')
            GetColumnLetter = Chr(Int((ColumnNumber - 1) / 26) + 64) & _
                           Chr(((ColumnNumber - 1) Mod 26) + 65)
        Else
            ' Columns A-Z
            GetColumnLetter = Chr(ColumnNumber + 64)
        End If
    End Function

    ''' <summary>
    ''' Writes the data, from a datagridview, to excel.
    ''' </summary>
    ''' <param name="sheet">The sheet.</param>
    ''' <param name="dgv">The DGV.</param>
    Private Sub WriteDGVToExcel(ByRef sheet As Excel.Worksheet, ByRef dgv As DataGridView)

        For r = 0 To dgv.Rows.Count - 1
            bgwDataToExcel.ReportProgress(Math.Round((r / (dgv.Rows.Count - 1)) * 100))
            For c = 0 To dgv.Columns.Count - 1
                sheet.Range(GetColumnLetter(c + 1) & (r + 2).ToString).Value = dgv.Rows(r).Cells(c).Value
            Next
        Next

    End Sub  

    ''' <summary>
    ''' Adds an Excel sheet to the Workbook and populates it with data from the DataGridView.
    ''' </summary>
    ''' <param name="wkbk">The WKBK.</param>
    ''' <param name="name">The name.</param>
    ''' <param name="dgv">The DGV.</param>
    Private Sub AddExcelSheet(ByRef wkbk As Excel.Workbook, ByVal name As String, ByVal dgv As DataGridView)

        Dim sheet As Excel.Worksheet
        'add the sheet to the passed in workbook and rename it
        sheet = wkBk.Worksheets.Add
        sheet.Name = name
        'populate the sheet with the data from the passed in datagridview
        exportDGV(dgv, sheet)

    End Sub

    ''' <summary>
    ''' Reads the excel sheet.
    ''' </summary>
    ''' <param name="filePath">The file path.</param>
    ''' <param name="fileName">Name of the file.</param>
    ''' <param name="sheetName">Name of the sheet.</param><returns></returns>
    Private Function GetExcelSheet(ByVal filePath As String, ByVal fileName As String, ByVal sheetName As String) As DataSet
        '' Excel 2003
        Dim connectionString, sqlString As String
        Dim ds As New DataSet
        Dim xlConn As OleDbConnection
        Dim xlCmd As OleDbCommand
        Dim xlDA As OleDbDataAdapter
        Dim xlFile As FileInfo

        connectionString = String.Format("Provider=Microsoft.Jet.OLEDB.4.0;" &
                                         "Data Source={0}", filePath + fileName) &
                                         ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1;"""

        sqlString = "SELECT * FROM [" & sheetName & "$]"

        xlFile = New FileInfo(fileName)
        If (xlFile.Extension.ToLower() = ".xlsx") Then
            ' Excel 2007
            connectionString = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;" &
                                             "Data Source={0}", filePath + fileName) &
                                             ";Extended Properties=""Excel 12.0 Xml;HDR=YES;"""
        End If

        xlConn = New OleDbConnection(connectionString)

        Try
            xlConn.Open()
            xlCmd = New OleDbCommand()
            xlCmd.Connection = xlConn
            xlCmd.CommandText = sqlString

            xlDA = New OleDbDataAdapter()
            xlDA.SelectCommand = xlCmd

            xlDA.Fill(ds)
        Catch ex As Exception
            GetExcelSheet = Nothing
        Finally
            xlConn.Close()
        End Try

        GetExcelSheet = ds
    End Function  

    ''' <summary>
    ''' Closes the Excel workbook.
    ''' </summary>
    ''' <param name="wkbk">The Excel workbook.</param>
    Private Sub CloseWorkbook(ByRef wkbk As Microsoft.Office.Interop.Excel.Workbook)
        
            WkBk.Save()
            WkBk.Close()
            WkBk = Nothing

    End Sub

    ''' <summary>
    ''' Shuts down excel.
    ''' </summary>
    ''' <param name="app">The app.</param>
    Private Sub ShutDownExcel(ByRef app As Microsoft.Office.Interop.Excel.ApplicationClass)
        
        If Not app isnot Nothing then
            app.DisplayAlerts = True
            app.Quit()
            System.Runtime.InteropServices.Marshal.ReleaseComObject(app)
            app = Nothing
            GC.Collect
            GC.WaitForPendingFinalizers
            'has to be called twice: once to make a list of what is to be finalized, second time to finalize those objects on the list
            GC.Collect
            GC.WaitForPendingFinalizers
        End If

    End Sub



Thanks for taking the time to read this tutorial. I hope it was helpful and, as always, if something is incorrect or you have additional comments on the subject, leave a comment!

Is This A Good Question/Topic? 2
  • +

Replies To: Using Excel and XML data in VB.NET and SQL Server

#2 Beach_Coder  Icon User is offline

  • D.I.C Head
  • member icon

Reputation: 17
  • View blog
  • Posts: 123
  • Joined: 10-November 11

Posted 17 January 2012 - 11:41 AM

Well done. Very well done. And well timed. I have just begun a similar project, and your work here just saved me from a good deal of frustration.





et omnia dicta fortiora si dicta latina

This post has been edited by Beach_Coder: 17 January 2012 - 11:43 AM

Was This Post Helpful? 0
  • +
  • -

#3 Psyguy  Icon User is offline

  • D.I.C Regular
  • member icon

Reputation: 76
  • View blog
  • Posts: 326
  • Joined: 12-January 11

Posted 23 January 2012 - 08:42 PM

Always nice to hear when someone finds it useful!
Was This Post Helpful? 0
  • +
  • -

#4 Jisha B  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 1
  • Joined: 26-August 12

Posted 26 August 2012 - 11:35 PM

View PostPsyguy, on 23 January 2012 - 08:42 PM, said:

Always nice to hear when someone finds it useful!


This is very useful post. But I am facing one problem in uploading Excel. My Excel file has so many NULL values in
different columns with different datatypes like DateTime,INT,DECIMAL . after converting XML file also while I am executing my SP, m getting data type conversion error since from Excel NULL is going as 'NULL'(not as DB NULL). How to handle
this? Pls help me...
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1