

Monday, April 15, 2013

User Defined Functions SQLServer

What is a UDF? SQL Server provides many predefined functions that are built in to the T-SQL language. These supplied functions help extend the capabilities of T-SQL, providing the ability to perform string manipulation, mathematical calculations, data type conversion, and so on. But often we need something these functions do not provide. We could create a stored procedure to perform custom processing, but the problem is that we cannot use the result of a stored procedure in a WHERE clause or SELECT list; for this type of scenario we need a UDF.
Why use User-Defined Functions? The main benefit of a UDF is that we are not limited to the functions SQL Server provides. We can write our own functions to meet our specific needs or to simplify complex SQL code.

In this article we will learn about User-Defined Functions (UDFs) in SQL Server. All the examples in this article use the pubs database.
There are three types of UDFs in SQL Server:
  1. Scalar
  2. Inline Table-Valued
  3. Multi-statement Table-Valued
Let us go through each of these with examples:
1.  Scalar User-Defined Function
A scalar UDF accepts zero or more input parameters and returns a single value. A scalar user-defined function returns one of the scalar data types (int, char, varchar, etc.). The text, ntext, image, and timestamp data types are not supported. These are the type of user-defined functions that most developers are used to from other programming languages.
Example 1: Here we are creating a Scalar UDF AddTwoNumbers which accepts two input parameters @a and @b and returns output as the sum of the two input parameters.
CREATE FUNCTION AddTwoNumbers
(
    @a int,
    @b int
)
RETURNS int
AS
BEGIN
    RETURN @a + @b
END
Once the above function is created we can use this function as below:
PRINT dbo.AddTwoNumbers(10,20)
OR
SELECT dbo.AddTwoNumbers(30,20)
Note: Scalar UDFs must be called with at least a two-part name, which is why the two statements above use dbo.AddTwoNumbers.
Is the statement below correct? No, because it does not use the two-part naming convention. Try executing it and it will error out:
PRINT AddTwoNumbers(10,20)
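As noted at the start of the article, the advantage of a scalar UDF over a stored procedure is that it can be used directly in a SELECT list or WHERE clause. A minimal sketch against the pubs titles table (the money price column is implicitly converted to int here, so this is illustrative only):

```sql
-- Scalar UDF used inline; the dbo. prefix is still required
SELECT title_id, price, dbo.AddTwoNumbers(price, 5) AS price_plus_five
FROM titles
WHERE dbo.AddTwoNumbers(price, 0) > 20
```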
2.  Inline Table-Valued User-Defined Function
An inline table-valued function returns a variable of data type table whose value is derived from a single SELECT statement. Since the return value is derived from the SELECT statement, no BEGIN/END block is needed in the CREATE FUNCTION statement. There is also no need to specify the table variable name (or column definitions for the table variable), because the structure of the returned value is generated from the columns that compose the SELECT statement. Because the results are a function of the columns referenced in the SELECT, no duplicate column names are allowed and all derived columns must have an associated alias.
Example: In this example we create an inline table-valued function GetAuthorsByState, which accepts a state as the input parameter and returns the first name and last name of all the authors belonging to that state.
USE PUBS
GO
CREATE FUNCTION GetAuthorsByState
( @state char(2) )
RETURNS table
AS
RETURN (
SELECT au_fname, au_lname
FROM Authors
WHERE state=@state
)
GO
We can use the statement below to get all the authors in the state CA:
SELECT * FROM GetAuthorsByState('CA')
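Because an inline TVF behaves like a parameterized view, it can also be applied row by row with CROSS APPLY (available from SQL Server 2005 onward); a sketch:

```sql
-- Invoke the inline TVF once per distinct state value
SELECT s.state, f.au_fname, f.au_lname
FROM (SELECT DISTINCT state FROM Authors) AS s
CROSS APPLY dbo.GetAuthorsByState(s.state) AS f
```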
3. Multi-statement Table-Valued User-Defined Function

A multi-statement table-valued user-defined function returns a table and can contain one or more T-SQL statements. Within the CREATE FUNCTION command you must define the structure of the table being returned. After creating this type of user-defined function, we can use it in the FROM clause of a T-SQL statement, unlike a stored procedure, which can also return a record set but cannot appear in a FROM clause.
Example: In this example we create a multi-statement table-valued function GetAuthorsByState, which accepts a state as the input parameter and returns the author id and first name of all the authors belonging to that state. If there are no authors for the input state, the UDF returns a single record with an empty au_id value and a first name of 'No Authors Found'.
USE PUBS
GO
CREATE FUNCTION GetAuthorsByState
( @state char(2) )
RETURNS
@AuthorsByState table (
au_id Varchar(11),
au_fname Varchar(20)
)
AS
BEGIN
INSERT INTO @AuthorsByState
SELECT  au_id,
au_fname
FROM Authors
WHERE state = @state
IF @@ROWCOUNT = 0
BEGIN
INSERT INTO @AuthorsByState
VALUES ('', 'No Authors Found')
END
RETURN
END
GO
We can use the statements below to get all the authors in a given state:
SELECT * FROM GetAuthorsByState('CA')
SELECT * FROM GetAuthorsByState('XY')

Other More Examples of UDF
SQL Server's getdate() returns the current system date and time; it always includes both date and time components. Suppose we want just the date, with the time always set to midnight. One solution is to do the conversion like below:
select convert(datetime,CONVERT(date,getdate()))
But the problem is that every time we want the date with the time set to midnight, we have to repeat this conversion. The solution is to make a UDF for it.
create function getonlydate()
returns datetime
as
begin
return(select convert(datetime,convert(date,getdate())))
end
go
Now we can call this UDF in our SQL query.
select dbo.getonlydate()
Let us see how we can use this UDF in other SQL statements.
Let us create a table Order
CREATE TABLE Orders (
OrderID int IDENTITY (1, 1) NOT NULL Primary Key,
CustomerID nchar (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
EmployeeID int NULL ,
OrderDate datetime NULL default dbo.getonlydate(),
RequiredDate datetime NULL ,
ShippedDate datetime NULL
)
Let us INSERT values in this table using UDF function we created.
INSERT Orders(CustomerID, EmployeeID, RequiredDate)
VALUES ('BERGS', 3, dbo.getonlydate() + 7)
Let us UPDATE values in this table using UDF function we created.
UPDATE Orders set ShippedDate = dbo.getonlydate()
where OrderID=1
SELECT OrderDate,RequiredDate,ShippedDate
FROM orders
WHERE orderdate = dbo.getonlydate()
OrderDate                 RequiredDate              ShippedDate
-----------------------   -----------------------   -----------------------
2011-05-01 00:00:00.000   2011-05-08 00:00:00.000   2011-05-01 00:00:00.000

Types of User Defined Functions:
1) Scalar functions
2) Inline table valued function
3) Multistatement table valued functions.

A) Scalar Function:
1) They are like the standard built-in functions provided with SQL Server.
2) They return scalar values that can be used anywhere a constant expression can be used.
3) They typically take one or more arguments and return a value of a specified data type.
4) Every T-SQL function must return a result using the RETURN statement.
Example:
The following two functions are variations of a function that returns the average price for a specified type of book from the titles table:
CREATE FUNCTION AverageBookPrice (@booktype varchar(12) = '%')
RETURNS money
AS
BEGIN
DECLARE @Avg money
SELECT @Avg = AVG(price)
FROM titles
WHERE type  like @booktype
RETURN @Avg
END
GO
CREATE FUNCTION AverageBookPrice2 (@booktype varchar(12) = '%')
RETURNS money
AS
BEGIN
RETURN (SELECT AVG(PRICE)
FROM TITLES
WHERE TYPE LIKE @booktype)
END
Note: SQL Server doesn't allow aggregate functions in a WHERE clause unless they are contained in a subquery.
The AverageBookPrice() function lets you compare against the average price without having to use a subquery:
SELECT title_id, type, price FROM titles
WHERE price > dbo.AverageBookPrice('popular_comp')

title_id  type          price
------------------------------
PC1035    popular_comp  22.95
PS1372    psychology    21.59
You can return the value from a user-defined scalar function into a local variable in two ways. You can assign the result to a local variable by using the SET statement or an assignment select, or you can use the EXEC statement. The following commands are functionally equivalent:
declare @avg1 money,
@avg2 money,
@avg3 money
select @avg1 = dbo.AverageBookPrice('popular_comp')
set @avg2 = dbo.AverageBookPrice('popular_comp')
exec @avg3 = dbo.AverageBookPrice 'popular_comp'
select @avg1 as avg1, @avg2 as avg2, @avg3 as avg3
go
The result is below:

avg1    avg2    avg3
----------------------
21.475  21.475  21.475
B) Table Value Function:
1) A table-valued user-defined function returns a rowset instead of a single scalar value.
2) Can be invoked in the FROM clause of a SELECT statement, just as we would a table or view.
3) A table-valued function can almost be thought of as a view that accepts parameters, so the result set is determined dynamically.
4) A table valued function specifies the keyword TABLE in its RETURNS clause.
5) They are of two types.
1) Inline table valued function
A) An inline table-valued function specifies only the TABLE keyword in the RETURNS clause, without table definition information.
B) The code inside the function is a single RETURN statement that invokes a SELECT statement.
Example:
CREATE FUNCTION AveragePriceByType (@price money = 0.0)
RETURNS table
AS
RETURN (SELECT type,avg(isnull(price,0)) as avg_price
FROM titles
GROUP BY type
HAVING avg(isnull(price,0)) > @price )
select * from AveragePriceByType(15.0)

type        avg_price
----------------------
trad_cook   15.9633
2) Multi statement table valued function:
a) Multistatement table-valued functions differ from inline functions in two major ways
A) The RETURNS clause specifies a table variable and its definition.
B) The body of the function contains multiple statements, at least one of which populates the table variable with data values.
b) The scope of the table variable is limited to the function in which it is defined.
c) Within the function in which a table variable is defined, that table variable can be treated like a regular table. You can perform any SELECT, INSERT, UPDATE, or DELETE statement on the rows in a table variable, except for SELECT INTO.
The following example re-implements the inline table-valued function AveragePriceByType() as a multi-statement table-valued function called AveragePricebyType3():
CREATE FUNCTION   AveragePricebyType3 (@price money =0.0)
RETURNS @table table(type varchar(12) null,avg_price money null)
AS
BEGIN
INSERT @table
SELECT type,avg(isnull(price,0)) as avg_price
FROM titles
GROUP BY type
HAVING avg(isnull(price,0))> @price
RETURN
END
SELECT * FROM AveragePricebyType3(15.0)
This gives the same result:

type        avg_price
----------------------
trad_cook   15.9633
Big Question: Why use multi-statement table-valued functions instead of inline table-valued functions?
1) Generally, we use multi-statement table-valued functions when we need to perform further operations (for example, inserts, updates, or deletes) on the contents of the table variable before returning a result set.
2) We would also use them if we need to perform more complex logic or additional processing on the input parameters of the function before invoking the query to populate the table variable.
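As a sketch of point 1, here is a hypothetical variant of the earlier function that performs further operations on the table variable before returning it (the rounding and cleanup steps are illustrative, not from the original article):

```sql
CREATE FUNCTION AveragePriceByTypeRounded (@price money = 0.0)
RETURNS @table table (type varchar(12) null, avg_price money null)
AS
BEGIN
    INSERT @table
    SELECT type, avg(isnull(price, 0))
    FROM titles
    GROUP BY type
    HAVING avg(isnull(price, 0)) > @price

    -- Further operations on the table variable before returning:
    UPDATE @table SET avg_price = ROUND(avg_price, 2)
    DELETE FROM @table WHERE avg_price IS NULL

    RETURN
END
```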
Types of SQL statements allowed in a function include the following:

a) DECLARE statements to define variables and cursors that are local to the function.
b) Assignments of values to variables that are local to the function, using the SET command or an assignment select.
c) Cursor operations on local cursors that are declared, opened, closed, and de-allocated within the function. FETCH statements must assign values to local variables by using the INTO clause.
d) Control-of-flow statements such as IF, ELSE, WHILE, GOTO, and so on, excluding the TRY…CATCH statements.
e) UPDATE, INSERT, and DELETE statements that modify table variables defined within the function.
f) EXECUTE statements that call an extended stored procedure. (Any results returned by the extended stored procedure are discarded.)
Nesting of User Defined Function: User-defined functions can also call other user-defined functions, with a limit of 32 levels of nesting. Nesting of functions can help improve the modularity and reusability of function code.
CREATE FUNCTION dbo.getonlydate3()
RETURNS datetime
AS
BEGIN
    DECLARE @date datetime
    SET @date = dbo.striptime(getdate())
    RETURN @date
END
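The dbo.striptime() helper called above is not defined in this article; a minimal sketch of what it is assumed to look like (the same conversion used by getonlydate):

```sql
-- Assumed helper: strips the time portion from a given datetime
CREATE FUNCTION dbo.striptime (@dt datetime)
RETURNS datetime
AS
BEGIN
    RETURN convert(datetime, convert(date, @dt))
END
```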
How to get information about functions: You can get information by using the supplied system procedures and by querying the INFORMATION_SCHEMA.routines view. The following sections describe these methods.

exec sp_helptext getonlydate
Text
create function getonlydate()
returns datetime
as
begin
return(select convert(datetime,convert(date,getdate())))
end
In addition to sp_helptext, you can write queries against the INFORMATION_SCHEMA.routines view to display the source code for a function:
SELECT routine_definition
from INFORMATION_SCHEMA.routines
where routine_name = 'getonlydate'
and specific_schema = 'dbo'
and specific_catalog = 'bigpubs2008'
Conclusion: User-defined functions in SQL Server 2008 allow you to create reusable routines that can help make your SQL code more straightforward and efficient. Table-valued functions provide a way to create what are essentially parameterized views, and you can include them inline in your queries, just as you would a table or view.

The Uses of Coalesce in SQL Server

Let's start with the documented use of coalesce. According to MSDN, coalesce returns the first non-null expression among its arguments.
For example,
SELECT COALESCE(NULL, NULL, NULL, GETDATE()) 
will return the current date and time. It skips over the NULL values and returns the first non-null value.
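A slightly more practical sketch, using a hypothetical contacts table (all the table and column names here are illustrative):

```sql
-- Pick the first non-null contact method per person
CREATE TABLE #Contacts (name varchar(50), work_phone varchar(20), mobile_phone varchar(20), email varchar(50))
INSERT #Contacts VALUES ('Ann', NULL, '555-0101', NULL)
INSERT #Contacts VALUES ('Bob', NULL, NULL, 'bob@example.com')

SELECT name, COALESCE(work_phone, mobile_phone, email, 'no contact') AS best_contact
FROM #Contacts

DROP TABLE #Contacts
```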

Using Coalesce to Pivot
If you run the following statement against the AdventureWorks database
SELECT Name FROM HumanResources.Department WHERE GroupName = 'Executive General and Administration'
you will come up with a standard result set such as this.

If you want to pivot the data you could run the following command.
DECLARE @DepartmentName VARCHAR(1000)
SELECT @DepartmentName = COALESCE(@DepartmentName, '') + Name + ';'
FROM HumanResources.Department
WHERE GroupName = 'Executive General and Administration'
SELECT @DepartmentName AS DepartmentNames
and get the following result set.

Using Coalesce to Execute Multiple SQL Statements
Once you can pivot data using the coalesce statement, it is possible to run multiple SQL statements by pivoting the data and using a semicolon to separate the operations. Let's say you want to find the values for any column in the Person schema that has the column name "Name". If you execute the following script it will give you just that.
DECLARE @SQL VARCHAR(MAX)
CREATE TABLE #TMP (
    Clmn VARCHAR(500),
    Val VARCHAR(50))
SELECT @SQL = COALESCE(@SQL, '') + CAST(
    'INSERT INTO #TMP SELECT ''' + TABLE_SCHEMA + '.' + TABLE_NAME + '.' + COLUMN_NAME
    + ''' AS Clmn, Name FROM ' + TABLE_SCHEMA + '.[' + TABLE_NAME + '];' AS VARCHAR(MAX))
FROM INFORMATION_SCHEMA.COLUMNS
JOIN sysobjects B ON INFORMATION_SCHEMA.COLUMNS.TABLE_NAME = B.NAME
WHERE COLUMN_NAME = 'Name'
AND xtype = 'U'
AND TABLE_SCHEMA = 'Person'
PRINT @SQL
EXEC(@SQL)
SELECT * FROM #TMP
DROP TABLE #TMP
here is the result set.

My personal favorite is being able to kill all the transactions in a database using three lines of code. If you have ever tried to restore a database and could not obtain exclusive access, you know how useful this can be.
DECLARE @SQL VARCHAR(8000)
SELECT @SQL = COALESCE(@SQL, '') + 'Kill ' + CAST(spid AS VARCHAR(10)) + '; '
FROM sys.sysprocesses
WHERE DBID = DB_ID('AdventureWorks')
PRINT @SQL -- Replace the PRINT statement with EXEC(@SQL) to execute
will give you a result set such as the following.

Next Steps
  • Whenever I think I may need a cursor, I always try to find a solution using Coalesce first.
  • I am sure I have just scratched the surface of the many ways this function can be used. Go try it and see what you can come up with; a little innovative thinking can save several lines of code.

Sunday, January 27, 2013

Step by Step SQL Server Profiler in SQL Server 2005

Introduction

Microsoft SQL Server Profiler is a graphical user interface to SQL Trace for monitoring the T-SQL statements of the Database Engine. We can save trace results and reuse them at a later point in time.
  • We can do the following using SQL Server Profiler
    • Create a trace
    • Watch the trace results as the trace runs
    • Store the trace results in a table
    • Start, stop, pause, and modify the trace results as necessary
    • Replay the trace results
  • Use SQL Server Profiler to monitor only the events in which you are interested.
Menu Path: Start | All Programs | Microsoft SQL Server 2005 | Performance Tools | SQL Server Profiler.
The following screen will come:
Figure 1.0
Click on <Connect> Button. New Properties Screen will come:
Figure 1.1
It has two selection tabs:
  • General: It is used for general setting for Trace Database Engine.
  • Event: It is used to add or remove some selected event for monitor.
In General Section (as given in Figure 1.1), it is divided into four sections.
Section 1: In this section, you just specify the name of your trace. The trace provider name and server name are predefined, based upon your SQL Server, and are not editable.
Section 2: This is the template section. You can choose different types of templates based upon your requirements; the template is the configuration for the trace. By default, it is the "Standard (Default)" template. Other templates are TSQL, TSQL_Duration, TSQL_Replay, TSQL_SPs, etc. You can create your own custom templates by selecting different events and event classes; a template is saved with a ".tdf" extension.
Section 3: This section relates to saving your trace, either as a file (.trc) or as a table in a database. When you click the "Save to file" check box, a file save dialog opens and you can save the file (with a .trc extension).
If you check the "Save to Table", it will connect with your server and ask you to which database you want to save that trace table information.
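A saved .trc file can later be queried like a table with the built-in fn_trace_gettable function (the file path below is only an example):

```sql
-- Load a saved trace file into a result set
SELECT TextData, Duration, CPU, Reads, StartTime
FROM fn_trace_gettable('C:\Traces\MyTrace.trc', DEFAULT)
```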
Figure 1.2
Section 4: You can stop your trace at a particular time. Check the "Enable trace stop time" checkbox and give the time at which you want to stop the trace; SQL Server will automatically stop the trace at that time.
Now Move To "Event Section" Tab.
Now we need to know some definition with respect to SQL Server Profiler.

What is an Event?

An Event is an action or operation that is performed in your SQL Server 2005 Database Engine.
Some examples of Events are:
    • Transact-SQL SELECT, INSERT, UPDATE, and DELETE statements.
    • User login and logout
    • Execution of Stored procedures
    • Operation with cursor
SQL Server Profiler is able to trace all of these events, and all of them are categorized into particular event classes.

What is an Event Class?

Event class is a type of event that can be traced.
Some examples are:
  • SQL: BatchCompleted
  • SQL: Batch Starting
  • Audit Login
  • Audit Logout
  • Lock: Acquired
  • Lock: Released
Now you can select events from this screen:
Figure 1.3
In section 1, we can select the appropriate check boxes based upon our requirements; section 2 displays the details of the selected events and event classes. If you check the check box in section 3, the full list of events and columns will be shown in section 1.
Section 4 offers further customization. Just click on the "Column Filters" button; here you can specify conditions (such as Like or Not Like).
Figure 1.4
By clicking on the "Organize Columns" button, you can change the order of the selected columns.
Now Click on the "Run" Button, then Trace window will come:
Figure 1.5
Using these windows, you will get the detailed duration of each query and all the other event information you selected.
You can save this result for future use, or you can extract a particular query from the trace: just right-click, choose "Extract Event Data", and save it as a SQL script.

Replay in SQL Server Profiler

SQL Server Profiler has a replay facility, which provides the ability to save a trace and replay it later.
Replay is useful for troubleshooting an application. Trace replay supports debugging by using the Toggle Breakpoint and Run to Cursor options on the SQL Server Profiler Replay menu.
Anything changed in SQL Server Management Studio will be traced by SQL Profiler, so it can be used as a basic database performance check. We also have the SQL Server Performance Monitor to monitor system and server performance.

Saturday, December 22, 2012

Sample Example of RANKING Functions – ROW_NUMBER, RANK, DENSE_RANK, NTILE

SQL Server 2005 has a total of 4 ranking functions. Ranking functions return a ranking value for each row in a partition. All the ranking functions are non-deterministic.
ROW_NUMBER () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
RANK () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the rank of each row within the partition of a result set.
DENSE_RANK () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the rank of rows within the partition of a result set, without any gaps in the ranking.
NTILE (integer_expression) OVER ([<partition_by_clause>] <order_by_clause>)
Distributes the rows in an ordered partition into a specified number of groups.
All of the above definitions and syntax are taken from BOL, and it is difficult to explain these functions any better than BOL does. The following example, again from BOL, is excellent: it demonstrates the usage of all four functions together in one query.
USE AdventureWorks;
GO
SELECT c.FirstName, c.LastName
,ROW_NUMBER() OVER (
ORDER BY a.PostalCode) AS 'Row Number'
,RANK() OVER (
ORDER BY a.PostalCode) AS 'Rank'
,DENSE_RANK() OVER (
ORDER BY a.PostalCode) AS 'Dense Rank'
,NTILE(4) OVER (
ORDER BY a.PostalCode) AS 'Quartile'
,s.SalesYTD, a.PostalCode
FROM Sales.SalesPerson s
INNER JOIN Person.Contact c
ON s.SalesPersonID = c.ContactID
INNER JOIN Person.Address a
ON a.AddressID = c.ContactID
WHERE TerritoryID IS NOT NULL
AND
SalesYTD <> 0;

Resultset:

Query to Find Column From All Tables of Database

How many tables in the AdventureWorks database have a column name like 'EmployeeID'?
USE AdventureWorks
GO
SELECT t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM sys.tables AS t
INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
WHERE c.name LIKE '%EmployeeID%'
ORDER BY schema_name, table_name;
In the above query, replace EmployeeID with any other column name, or drop the WHERE clause to list every column of every table:
SELECT t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM sys.tables AS t
INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
ORDER BY schema_name, table_name;

Friday, December 21, 2012

Index Selection and the Query Optimizer

Index selection is one of the most important techniques used in query optimization. By using the right indexes, SQL Server can speed up your queries and dramatically improve the performance of your applications. In this article I will show you how SQL Server selects indexes, how you can use this knowledge to provide better indexes, and how you can verify your execution plans to make sure these indexes are correctly used.
This article also includes sections about the Database Engine Tuning Advisor and the Missing Indexes feature, which will show how you can use the Query Optimizer itself to provide index tuning recommendations. However, it is important to emphasize that, no matter what index recommendations these tools give, it is ultimately up to the database administrator or developer to do their own index analysis and finally decide which of these recommendations to implement. Also, since we’ll be covering these tools mainly from the point of view of the Query Optimizer, you should use Books Online to obtain more in-depth information regarding using these features.
Finally, the sys.dm_db_index_usage_stats DMV will be introduced as a tool to identify existing indexes which your queries may not be using. Indexes that are not being used will provide no benefit to your databases, but will use valuable disk space and slow your update operations, and so they should be considered for removal.
Note:
The example SQL queries in this article are all based on the AdventureWorks database, and all code has been tested for both the SQL Server 2008 and SQL Server 2008 R2 versions of these databases. Note that these sample databases are not included in your SQL Server installation by default, but can be downloaded from the CodePlex web site. You need to download the family of sample databases for your version, either SQL Server 2008 or SQL Server 2008 R2. During installation you may choose to install all the databases or at least the AdventureWorks and AdventureWorksDW (which is needed at other points in the book beyond this sample).
You should bear in mind that learning how to manage indexes and understand what makes them useful (or not) is a long road, and you’ll be taking just the first few practical steps here. The skills you gain from this article will put you in good stead, but I would urge you to do some further reading around the subject.

Introduction

As mentioned in Chapter 2 of the book, (Which discusses The Execution Engine), SQL Server can use indexes to perform seek and scan operations. Indexes can be used to speed up the execution of a query by quickly finding records without performing table scans; by delivering all the columns requested by the query without accessing the base table (i.e. covering the query, which I’ll return to in a moment), or by providing sorted order, like in queries with GROUP BY, DISTINCT or ORDER BY clauses.
Part of the Query Optimizer’s job is to determine if an index can be used to evaluate a predicate in a query. This is basically a comparison between an index key and a constant or variable. In addition, the Query Optimizer needs to determine if the index covers the query; that is, if the index contains all the columns required by the query (referred to as a “covering index”). It needs to confirm this because, as you’ll hopefully remember, a non-clustered index usually contains only a subset of the columns of the table.
SQL Server can also consider using more than one index, and joining them to cover all the columns required by the query (index intersection). If it's not possible to cover all of the columns required by the query, then the query optimizer may need to access the base table, which could be a clustered index or a heap, to obtain the remaining columns. This is called a bookmark lookup operation (which could be a Key Lookup or an RID Lookup, as explained in Chapter 2 of the book). However, since a bookmark lookup requires random I/O, which is a very expensive operation, using both an index seek and a bookmark lookup can only be effective for a relatively small number of records.
Also keep in mind that although one or more indexes could be used, this does not mean that they will ultimately be selected in an execution plan, as this is always a cost-based decision. So, after creating an index, make sure you verify that the index is, in fact, used in a plan (and, of course, that your query is performing better, which is probably the primary reason why you are defining an index!). An index that is not being used by any query will just take up valuable disk space, and may negatively impact the performance of update operations without providing any benefit. It is also possible that an index which was useful when it was originally created is no longer used by any query. This could be the result of changes in the database schema, the data, or even the query itself. To help you avoid this frustrating situation, the last section in this chapter will show you how to identify which indexes are no longer being used.
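As a sketch of that last point, indexes with no recorded usage in the current database can be listed by joining sys.indexes to sys.dm_db_index_usage_stats (what counts as "unused" is a judgment call, and the DMV's counters reset when the instance restarts):

```sql
-- Indexes with no seeks, scans, or lookups recorded since the last restart
SELECT OBJECT_NAME(i.object_id) AS table_name, i.name AS index_name
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id = i.object_id
      AND s.index_id = i.index_id
      AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND COALESCE(s.user_seeks + s.user_scans + s.user_lookups, 0) = 0
```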

The Mechanics of Index Selection

In a seek operation, SQL Server navigates through the B-tree index to quickly find the required records without the need for an index or table scan. This is similar to using an index at the end of a book to find a topic quickly, instead of reading the entire book. Once the first record has been found, SQL Server can then scan the index leaf level forward or backward to find additional records. Both equality and inequality operators can be used in a predicate, including =, <, >, <=, >=, <>, !=, !<, !>, BETWEEN, and IN. For example, the following predicates can be matched to an index seek operation if there is an index on the specified column, or a multi-column index with that column as a leading index key:
  • ProductID = 771
  • UnitPrice < 3.975
  • LastName = 'Allen'
  • LastName LIKE 'Brown%'
As an example, look at the next query, which uses an index seek operator and produces the plan in Figure 1-1.
SELECT ProductID, SalesOrderID, SalesOrderDetailID
FROM Sales.SalesOrderDetail
WHERE ProductID = 771
Listing 1-1
Figure 1 - Plan with Index Seek
Figure 1-2: Index Seek Operator Properties
The SalesOrderDetail table has a multi-column index with ProductID as the leading column. The Index Seek operator properties, which you can see in Figure 1-2, include the following seek predicate on the ProductID column, which shows that SQL Server was effectively able to use the index to seek on the ProductID column:
Seek Keys[1]: Prefix: [AdventureWorks].[Sales].[SalesOrderDetail].ProductID = Scalar Operator (CONVERT_IMPLICIT(int,[@1],0))
Listing 1-2
An index cannot be used to seek on some complex expressions, expressions using functions, or strings with a leading wildcard character, like in the following predicates:
  • ABS(ProductID) = 771
  • UnitPrice + 1 < 3.975
  • LastName LIKE '%Allen'
  • UPPER(LastName) = 'Allen'
Compare the following query to the previous example; by adding an ABS function to the predicate, SQL Server is no longer able to use an Index Seek operator, and instead chooses to do an Index Scan as shown on the plan on Figure 1-3.
SELECT ProductID, SalesOrderID, SalesOrderDetailID
FROM Sales.SalesOrderDetail
WHERE ABS(ProductID) = 771
Listing 1-3
Figure 1-3: Plan with an Index Scan
Figure 1-4: Index Scan Operator Properties
Note that in Figure 1-4, the following predicate is, however, still evaluated on the Index Scan operator:
abs([AdventureWorks].[Sales].[SalesOrderDetail].
[ProductID]) =CONVERT_IMPLICIT(int,[@1],0)
Listing 1-4
In the case of a multi-column index, SQL Server can only use the index to seek on the second column if there is an equality predicate on the first column. So SQL Server can use a multi-column index to seek on both columns in the following cases, supposing that a multi-column index exists on both columns in the order presented:
  • ProductID = 771 AND SalesOrderID > 34000
  • LastName = 'Smith' AND FirstName = 'Ian'
That being said, if there is no equality predicate on the first column, or if the predicate cannot be evaluated on the second column, as is the case in a complex expression, then SQL Server may only be able to use a multi-column index to seek on just the first column, as in the following examples:
  • ProductID = 771 AND ABS(SalesOrderID) = 34000
  • ProductID < 771 AND SalesOrderID = 34000
  • LastName > 'Smith' AND FirstName = 'Ian'
However, SQL Server is not able to use a multi-column index for an Index Seek in the following examples, as it is not even able to search on the first column:
  • ABS(ProductID) = 771 AND SalesOrderID = 34000
  • LastName LIKE '%Smith' AND FirstName = 'Ian'
Finally, take a look at the following query, and the Index Seek operator properties in Figure 1-5:
SELECT ProductID, SalesOrderID, SalesOrderDetailID
FROM Sales.SalesOrderDetail
WHERE ProductID = 771 AND ABS(SalesOrderID) = 45233
Listing 1-5.
Figure 1-5: Index Seek Operator Properties
The seek predicate is using only the ProductID column as shown here:
Seek Keys[1]: Prefix: [AdventureWorks].[Sales].[SalesOrderDetail].ProductID = Scalar Operator (CONVERT_IMPLICIT(int,[@1],0))
Listing 1-6.
An additional predicate on the SalesOrderID column is evaluated like any other scan predicate, as shown here:
abs([AdventureWorks].[Sales].[SalesOrderDetail]. [SalesOrderID])=[@2]
Listing 1-7.
So in summary, this shows that, as we expected, SQL Server was able to perform a seek operation on the ProductID column but, because of the use of the ABS function, was not able to do the same for SalesOrderID. The index was used to navigate directly to find the rows that satisfy the first predicate, but then had to continue scanning to validate the second predicate.
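When the data allows it, a sargable rewrite of the predicate restores the ability to seek on both columns. For example, if we can assume that SalesOrderID is never negative (which holds for the identity-based keys in AdventureWorks), the query in Listing 1-5 can be rewritten as:

```sql
-- Equivalent to ABS(SalesOrderID) = 45233 when SalesOrderID >= 0;
-- both predicates can now be used as seek predicates
SELECT ProductID, SalesOrderID, SalesOrderDetailID
FROM Sales.SalesOrderDetail
WHERE ProductID = 771 AND SalesOrderID = 45233
```

If negative values were possible, a predicate like SalesOrderID IN (45233, -45233) would preserve the original semantics while still allowing a seek.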

The Database Engine Tuning Advisor

Currently, all major commercial database vendors include a physical database design tool to help with the creation of indexes. However when these tools were first developed, there were just two main architectural approaches considered for how these tools should recommend indexes. The first approach was to build a stand-alone tool with its own cost model and design rules. The second approach was to build a tool that could use the query optimizer cost model.
A problem with building a stand-alone tool is the requirement for duplicating the cost model. On top of that, having a tool with its own cost model, even if it is better than the optimizer's cost model, may not be a good idea, because the optimizer still chooses its plan based on its own model.
The second approach, using the query optimizer to help in physical database design, has been proposed in the database research community as far back as 1988. Since it is the optimizer that chooses the indexes for an execution plan, it makes sense to use the optimizer itself to help find which missing indexes would benefit existing queries. In this scenario, the physical design tool would use the optimizer to evaluate the cost of queries given a set of candidate indexes. An additional benefit of this approach is that, as the optimizer cost model evolves, any tool using its cost model can automatically benefit from it.
SQL Server was the first commercial database product to include a physical design tool, in the shape of the Index Tuning Wizard which shipped with SQL Server 7.0, and which was later replaced by the Database Engine Tuning Advisor (DTA) in SQL Server 2005. Both tools use the query optimizer cost model approach and were created as part of the AutoAdmin project at Microsoft, the goal of which was to reduce the total cost of ownership (TCO) of databases by making them self-tuning and self-managing. In addition to indexes, the DTA can help with the creation of indexed views and table partitioning.
However, creating real indexes in a DTA tuning session is not feasible; its overhead could impact operational queries and degrade the performance of your database. So how does the DTA estimate the cost of using an index that does not yet exist? Actually, even during a regular query optimization, the Query Optimizer does not use actual indexes to estimate the cost of a query. The decision of whether to use an index or not depends only on some metadata and the statistical information regarding the columns of the index. Index data itself is not needed during query optimization, but of course will be required during query execution if the index is chosen.
So, to avoid creating indexes during a DTA session, SQL Server uses special indexes called hypothetical indexes, which were also used by the Index Tuning Wizard. As the name implies, hypothetical indexes are not real indexes; they only contain statistics, and can be created with the undocumented WITH STATISTICS_ONLY option of the CREATE INDEX statement. You may not be able to see these indexes during a DTA session because they are dropped automatically when they are no longer needed, but you can see the CREATE INDEX WITH STATISTICS_ONLY and DROP INDEX statements if you run a SQL Server Profiler session to see what the DTA is doing.
Let’s take a quick tour of some of these concepts. To get started, create a new table in the AdventureWorks database:
SELECT * 
INTO dbo.SalesOrderDetail FROM Sales.SalesOrderDetail
Listing 1-8.
Copy the following query and save it to a file:
SELECT * FROM dbo.SalesOrderDetail
WHERE ProductID = 897
Listing 1-9.
Open a new DTA session, and you can optionally run a SQL Server Profiler session if you want to inspect what the DTA is doing. On the Workload File option, select the file containing the SQL statement that you just created with Listing 1-9, and specify AdventureWorks as both the database to tune and the database for workload analysis. Click the Start Analysis button and, when the DTA analysis finishes, run this query to inspect the contents of the msdb..DTA_reports_query table:
SELECT * FROM msdb..DTA_reports_query
Listing 1-10.
Running that query shows the following output, edited for space:
StatementString                             CurrentCost RecommendedCost
------------------------------------------- ----------- ---------------
SELECT * FROM dbo.SalesOrderDetail WHERE... 1.2434      0.00328799
Listing 1-11
Notice that this returns information like the query that was tuned, as well as the current and recommended cost. The current cost, 1.2434, is easy to obtain by directly requesting an estimated execution plan for the query as shown in Figure 1-6 (as is discussed in Chapter 2 of the book).
Figure 1-6: Plan showing Total Cost
Since the DTA analysis was completed, the required hypothetical indexes were already dropped. To now obtain the indexes recommended by the DTA, click on the Recommendations tab and look at the Index Recommendations section, where you can find the code to create any recommended index by then clicking on the Definition column. In our example, it will show the following code:
CREATE CLUSTERED INDEX [_dta_index_SalesOrderDetail_c_5_
1915153868__K5] ON [dbo].[SalesOrderDetail]  (
    [ProductID] ASC
)WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
 DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
Listing 1-12.
In the next statement and for demonstration purposes only, I will go ahead and create the index recommended by the DTA but, instead of a regular index, I will create it as a hypothetical index by adding the WITH STATISTICS_ONLY clause:
CREATE CLUSTERED INDEX cix_ProductID ON
dbo.SalesOrderDetail(ProductID)WITH STATISTICS_ONLY
Listing 1-13.
You can validate that a hypothetical index was created by running the next query:
SELECT * FROM sys.indexes
WHERE object_id = object_id('dbo.SalesOrderDetail')
AND name = 'cix_ProductID'
Listing 1-14.
The output is shown below; note that the is_hypothetical field shows that this is, in fact, just a hypothetical index:
object_id  name          index_id type type_desc is_hypothetical
---------- ------------- -------- ---- --------- ---------------
1915153868 cix_ProductID 3        1    CLUSTERED 1
Listing 1-15.
Remove the hypothetical index by running this statement:
DROP INDEX dbo.SalesOrderDetail.cix_ProductID
Listing 1-16.
Finally, implement the DTA recommendation, this time as a regular clustered index:
CREATE CLUSTERED INDEX cix_ProductID ON dbo.SalesOrderDetail(ProductID)
Listing 1-17.
After implementing the recommendation and running the query again, the clustered index is in fact now being used by the Query Optimizer. This time, the plan shows a clustered index seek operator and an estimated cost of 0.0033652, which is very close to the recommended cost listed previously when querying the msdb..DTA_reports_query table.
Finally, drop the table you just created by running the following statement:
DROP TABLE dbo.SalesOrderDetail
Listing 1-18.

The Missing Indexes Feature

SQL Server does provide a second approach that can help you to find useful indexes for your existing queries. Although not as powerful as the DTA, this option, called the Missing Indexes feature, does not require the database administrator to decide when tuning is needed, to explicitly identify what workload represents the load to tune, or to run any tool. This is a lightweight feature which is always on and which, like the DTA, was introduced with SQL Server 2005. Let’s take a look at what it does.
During optimization, the Query Optimizer defines what the best indexes for a query are and, if these indexes don’t exist, it will make this index information available in the XML plan for a particular query (as well as the graphical plan, as of SQL Server 2008). It will also aggregate this information for queries optimized since the instance was started, and make it all available in the sys.dm_db_missing_index_details DMV and its related missing-index DMVs. Note that, just by displaying this information, the Query Optimizer is not only warning you that it might not be selecting an efficient plan; it is also showing you which indexes may help to improve the performance of your query. In addition, database administrators and developers should be aware of the limitations of this feature, as described in the Books Online entry ‘Limitations of the Missing Indexes Feature’.
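As a sketch of how this aggregated information can be queried, the following joins the missing-index DMVs (sys.dm_db_missing_index_details, sys.dm_db_missing_index_groups and sys.dm_db_missing_index_group_stats) to list the suggestions collected since the instance was started:

```sql
-- Aggregated missing-index suggestions since the instance was started,
-- roughly ordered by estimated overall benefit
SELECT d.statement AS table_name, d.equality_columns,
       d.inequality_columns, d.included_columns,
       s.avg_user_impact, s.user_seeks
FROM sys.dm_db_missing_index_details d
JOIN sys.dm_db_missing_index_groups g
    ON d.index_handle = g.index_handle
JOIN sys.dm_db_missing_index_group_stats s
    ON g.index_group_handle = s.group_handle
ORDER BY s.avg_user_impact * s.user_seeks DESC
```

Treat the output only as a set of candidates to evaluate, not as indexes to create blindly.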
So, with all that in mind, let’s take a quick look to see how this feature works. Create the dbo.SalesOrderDetail table on the AdventureWorks database by running the following statement:
SELECT *
INTO dbo.SalesOrderDetail
FROM sales.SalesOrderDetail
Listing 1-19.
Run this query and request a graphical or XML execution plan:
SELECT * FROM dbo.SalesOrderDetail
WHERE SalesOrderID = 43670 AND SalesOrderDetailID > 112
Listing 1-20.
This query could benefit from an index on the SalesOrderID and SalesOrderDetailID columns, but no missing indexes information is shown this time. One limitation of the Missing Indexes feature, which this example has revealed, is that it does not work with a trivial plan optimization. You can verify that this is a trivial plan by looking at the graphical plan properties, shown as Optimization Level TRIVIAL, or by looking at the XML plan, where the StatementOptmLevel is shown as TRIVIAL.
You can avoid the trivial plan optimization in several ways, as I’ll explain in Chapter 5 of the book (The Optimization Process) but, for now, you’ll just have to take it on faith. In our case, we’re just going to create a non-related index by running the following statement:
CREATE INDEX IX_ProductID ON dbo.SalesOrderDetail(ProductID)
Listing 1-21.
What is significant about this is that, although the index created will not be used by our previous query, the query no longer qualifies for a trivial plan. Run the query again, and this time the XML plan will contain the following entry.
<MissingIndexes>
  <MissingIndexGroup Impact="99.7137">
    <MissingIndex Database="[AdventureWorks]" Schema="[dbo]" Table="[SalesOrderDetail]">
      <ColumnGroup Usage="EQUALITY">
        <Column Name="[SalesOrderID]" ColumnId="1" />
      </ColumnGroup>
      <ColumnGroup Usage="INEQUALITY">
      <Column Name="[SalesOrderDetailID]" ColumnId="2"/>
      </ColumnGroup>
    </MissingIndex>
  </MissingIndexGroup>
</MissingIndexes>
Listing 1-22.
The MissingIndexes entry in the XML plan can show up to three groups - equality, inequality, and included - and the first two are shown in this example using the ColumnGroup attribute. The information contained in these groups can be used to create the missing index; the key of the index can be built by using the equality columns, followed by the inequality columns, and the included columns can be added using the INCLUDE clause of the CREATE INDEX statement. SQL Server 2008 Management Studio can build the CREATE INDEX statement for you and, in fact, if you look at the graphical plan, you can see a Missing Index warning at the top including a CREATE INDEX command, as shown in Figure 1-7.
Figure 1-7: Plan with a Missing Index Warning.
Notice the impact value of 99.7137 - Impact is a number between 0 and 100 which gives you an estimate of the average percentage benefit that the query could obtain if the proposed index were available.
You can right-click on the graphical plan and select Missing Index Details to see the CREATE INDEX command that can be used to create this desired index, as shown next:
/*
Missing Index Details from SQLQuery1.sql - The Query Processor estimates that implementing the following index could improve the query cost by 99.7137%.
*/

/*
USE [AdventureWorks]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[SalesOrderDetail] ([SalesOrderID], [SalesOrderDetailID])

GO
*/
Listing 1-23.
Create the recommended index, after you provide a name for it, by running the following statement:
CREATE NONCLUSTERED INDEX IX_SalesOrderID_SalesOrderDetailID
ON [dbo].[SalesOrderDetail]([SalesOrderID], [SalesOrderDetailID])
Listing 1-24.
If you run the query in Listing 1-20 again and look at the execution plan, this time you’ll see an index seek operator using the index you’ve just created, and both the Missing Index warning and the MissingIndexes element of the XML plan are gone, as shown in Figure 1-8.
Figure 1-8: Plan without the Missing Index Warning.
Finally, remove the dbo.SalesOrderDetail table you’ve just created by running the following statement:
DROP TABLE dbo.SalesOrderDetail
Listing 1-25.

Unused Indexes

I’ll end this article on indexes by introducing the functionality of the sys.dm_db_index_usage_stats DMV, which you can use to learn about the operations performed by your indexes, and which is especially helpful in discovering indexes that are used by no queries, or only minimally. As we’ve already discussed, indexes that are not being used provide no benefit to your databases, but use valuable disk space and slow your update operations, so they should be considered for removal.
The sys.dm_db_index_usage_stats DMV stores the number of seek, scan, lookup, and update operations performed by both user and system queries, including the last time each type of operation was performed. Keep in mind that this DMV, in addition to non-clustered indexes, will also include heaps, listed as index_id equal to 0, and clustered indexes, listed as index_id equal to 1. For the purposes of this section, you may want to just focus on non-clustered indexes, which include index_id values 2 or greater; since heaps and clustered indexes contain the table’s data, they may not even be candidates for removal in the first place.
By inspecting the user_seeks, user_scans and user_lookup values of your non-clustered indexes you can see how your indexes are being used, and you can inspect the user_updates values to see the amount of updates performed on the index. All of this information will help to give you a sense as to how useful an index actually is. Bear in mind that all I’ll be demonstrating is how to call up information from this DMV, and what sort of situations will trigger different updates to the information it returns. How you deploy the DMV, and how you react to the information it returns, is a task I leave to you.
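As a starting point, and only as a sketch, a query like the following lists the non-clustered indexes on user tables that show update activity but no seeks, scans, or lookups. Keep in mind that these statistics are reset when the instance restarts, so a recently started server can make every index look unused:

```sql
-- Non-clustered indexes with no read activity in the current database
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name AS index_name,
       ISNULL(s.user_updates, 0) AS user_updates
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
    ON s.object_id = i.object_id
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
WHERE i.index_id > 1  -- exclude heaps (0) and clustered indexes (1)
    AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
    AND ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0)
        + ISNULL(s.user_lookups, 0) = 0
```

Indexes returned here are candidates for further investigation, not automatic removal.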
Now for an example; run the following code to create a new table with a non-clustered index on it:
SELECT * INTO dbo.SalesOrderDetail
FROM Sales.SalesOrderDetail

CREATE NONCLUSTERED INDEX IX_ProductID ON
dbo.SalesOrderDetail(ProductID)
Listing 1-26.
If you want to keep track of the values for this example, follow these steps carefully as every query execution may change the index usage statistics. When you run the following query, it will initially contain only one record, which was created because of table access performed when the index on Listing 1-26 was created:
SELECT DB_NAME(database_id) as database_name,
OBJECT_NAME(s.object_id) as object_name, i.name, s.*
FROM sys.dm_db_index_usage_stats s join sys.indexes i
ON s.object_id = i.object_id AND s.index_id = i.index_id 
and s.object_id = object_id('dbo.SalesOrderDetail')
Listing 1-27.
However, the values that we will be inspecting in this exercise (user_seeks, user_scans, user_lookups, and user_updates) are all set to 0.
Now run the following query, let’s say, 3 times:
SELECT * FROM dbo.SalesOrderDetail
Listing 1-28.
This query is using a Table Scan operator, so, if you rerun the code in Listing 1-27, the DMV will show the value 3 on the user_scans column. Note that the column index_id is 0, denoting a heap, and the name of the table is also listed (as a heap is just a table with no clustered index).
Run the next query, which uses an Index Seek, twice. After the query is executed, a new record will be added for the non-clustered index, and the user_seeks counter will show a value of 2:
SELECT ProductID FROM dbo.SalesOrderDetail
WHERE ProductID = 773
Listing 1-29.
Now, run the following query 4 times, and it will use both Index Seek and RID Lookup operators. Since the user_seeks for the non-clustered index had a value of 2, it will be updated to 6, and the user_lookups value for the heap will be updated to 4:
SELECT * FROM dbo.SalesOrderDetail
WHERE ProductID = 773
Listing 1-30.
Finally, run the following query once:
UPDATE dbo.SalesOrderDetail
SET ProductID = 666
WHERE ProductID = 927
Listing 1-31.
Note that the UPDATE statement is doing an Index Seek and a Table Update, so user_seeks will be updated for the index, and user_updates will be updated once for both the non-clustered index and the heap. This is the final output of the query in Listing 1-27 (edited for space):
name         index_id user_seeks user_scans user_lookups user_updates
------------ -------- ---------- ---------- ------------ ------------
NULL         0        0          3          4            1
IX_ProductID 2        7          0          0            1
Listing 1-32.
Finally, drop the table you just created:
DROP TABLE dbo.SalesOrderDetail
Listing 1-33.

Conclusion
This article explained how you can define the keys of your indexes so that they are likely to be considered for seek operations, which can improve the performance of your queries by locating records more quickly. Predicates were analyzed in the contexts of both single and multi-column indexes, and we also covered how to verify an execution plan to validate that indexes were selected and properly used by SQL Server.
The Database Engine Tuning Advisor and the Missing Indexes feature, both introduced with SQL Server 2005, were presented to show how the Query Optimizer itself can be used to provide index tuning recommendations.
Finally, the sys.dm_db_index_usage_stats DMV was introduced, together with its ability to provide valuable information regarding non-clustered indexes. While we didn’t have time to discuss all the practicalities of using this DMV, we covered enough for you to be able to easily find non-clustered indexes that are not being used by your SQL Server instance. A lot of the advice in this chapter came in the form of demonstrations, so you should now be familiar with index creation, understand how to maximize the usefulness of indexes, and be able to identify indexes that are not as useful.

Friday, December 14, 2012

Read and Import Excel Sheet into SQL Server Database in ASP.Net

Here I am explaining how to import Excel sheet rows into a SQL Server database table using ADO.Net in an ASP.Net web application.
Concept
1. User uploads the Excel file.
2. Based on the extension, it is decided whether the file is in the Excel 97 – 2003 or the Excel 2007 format.
3. User selects whether the Excel sheet has a header row or not, using the radio buttons.
4. The Excel file is uploaded, and the sheets in the Excel workbook are read into a DropDownList.
5. User now has to select, from the dropdown, the sheet whose data he wants to import.
6. User has to enter the name of the table into which he wants the data to be imported.
7. User presses the Save button, the data is imported into the SQL Server database table, and the user is updated with the status.
Stored Procedures
For this article I have created two stored procedures: one to read the Excel 97 – 2003 format, and the other the Excel 2007 format. Though the Microsoft ACE driver can read both, I have still used Jet for the Excel 97 – 2003 formats.
                         
Excel 97 – 2003 Format
 
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE spx_ImportFromExcel03
    @SheetName varchar(20),
    @FilePath varchar(100),
    @HDR varchar(3),
    @TableName varchar(50)
AS
BEGIN
    DECLARE @SQL nvarchar(1000)
           
    IF OBJECT_ID (@TableName,'U') IS NOT NULL
      SET @SQL = 'INSERT INTO ' + @TableName + ' SELECT * FROM OPENDATASOURCE'
    ELSE
      SET @SQL = 'SELECT * INTO ' + @TableName + ' FROM OPENDATASOURCE'
 
    SET @SQL = @SQL + '(''Microsoft.Jet.OLEDB.4.0'',''Data Source='
    SET @SQL = @SQL + @FilePath + ';Extended Properties=''''Excel 8.0;HDR='
    SET @SQL = @SQL + @HDR + ''''''')...['
    SET @SQL = @SQL + @SheetName + ']'
    EXEC sp_executesql @SQL
END
GO
 
Excel 2007 Format
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE spx_ImportFromExcel07
   @SheetName varchar(20),
   @FilePath varchar(100),
   @HDR varchar(3),
   @TableName varchar(50)
AS
BEGIN
    DECLARE @SQL nvarchar(1000)
     
    IF OBJECT_ID (@TableName,'U') IS NOT NULL
      SET @SQL = 'INSERT INTO ' + @TableName + ' SELECT * FROM OPENDATASOURCE'
    ELSE
      SET @SQL = 'SELECT * INTO ' + @TableName + ' FROM OPENDATASOURCE'
 
    SET @SQL = @SQL + '(''Microsoft.ACE.OLEDB.12.0'',''Data Source='
    SET @SQL = @SQL + @FilePath + ';Extended Properties=''''Excel 12.0;HDR='
    SET @SQL = @SQL + @HDR + ''''''')...['
    SET @SQL = @SQL + @SheetName + ']'
    EXEC sp_executesql @SQL
END
GO
 
In the above stored procedures, I have used four input parameters
1.@SheetName - Name of the Excel Sheet to be read.
2.@FilePath - Path of the Excel File
3.@HDR - Indicates whether first row in the excel sheet will be considered as Header row or not.
4.@TableName - The name of the table into which the Excel Sheet data will be transferred; if the table is not present, it will be created.
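Once created, the procedures can be tested directly from a query window before wiring up the web page. The values below are placeholders only; note that sheet names returned by the OLE DB schema table (and therefore selected in the DropDownList later) carry a trailing $:

```sql
-- Hypothetical values; substitute a real sheet name, path and table name
EXEC spx_ImportFromExcel07
    @SheetName = 'Sheet1$',
    @FilePath  = 'C:\Uploads\Sales.xlsx',
    @HDR       = 'Yes',
    @TableName = 'SalesImport'
```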
 
When you run the above stored procedures for the first time, you might get the following error message.
Msg 15281, Level 16, State 1, Line 1
SQL Server blocked access to STATEMENT 'OpenRowset/OpenDatasource' of component 'Ad Hoc Distributed Queries' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ad Hoc Distributed Queries' by using sp_configure. For more information about enabling 'Ad Hoc Distributed Queries', see "Surface Area Configuration" in SQL Server Books Online.

In order to run the above stored procedures, you will need to enable the 'Ad Hoc Distributed Queries' option in your SQL Server in the following way.
Execute the statements below, one by one, in a SQL Server query window:
 
sp_configure 'show advanced options', 1
RECONFIGURE

sp_configure 'Ad Hoc Distributed Queries', 1
RECONFIGURE
 
Also, in order to use the Microsoft OLEDB ACE 12.0 driver, you will need to install the 2007 Office System Driver: Data Connectivity Components, using the link below.
2007 Office System Driver: Data Connectivity Components
I faced the following error while running the OLEDB ACE stored procedure:
Msg 7399, Level 16, State 1, Line 2
The OLE DB provider "Microsoft.Ace.OLEDB.12.0" for linked server "(null)" reported an error. Access denied.
Msg 7350, Level 16, State 2, Line 2
Cannot get the column information from OLE DB provider "Microsoft.Ace.OLEDB.12.0" for linked server "(null)".
 
Front End Design
Below is the HTML markup of the ASP.Net web page. There are two panels. The first contains an ASP.Net FileUpload control, an Upload button and a label to display the status, and is used to upload the Excel file. The second contains a label to display the uploaded Excel file name, a DropDownList containing the names of the sheets in the Excel workbook, an ASP.Net RadioButtonList to capture whether the sheet has a header row and, finally, two buttons: one to import the Excel sheet rows into the SQL Server database table, and one to cancel.
 
<asp:Panel ID="Panel1" runat="server">
    <asp:FileUpload ID="FileUpload1" runat="server" />
    <asp:Button ID="btnUpload" runat="server" Text="Upload"
             OnClick="btnUpload_Click" />
    <br />
    <asp:Label ID="lblMessage" runat="server" Text="" />
</asp:Panel>
<asp:Panel ID="Panel2" runat="server" Visible="false">
    <asp:Label ID="Label5" runat="server" Text="File Name" />
    <asp:Label ID="lblFileName" runat="server" Text="" />
    <br />
    <asp:Label ID="Label2" runat="server" Text="Select Sheet" />
    <asp:DropDownList ID="ddlSheets" runat="server"
                    AppendDataBoundItems="true">
    </asp:DropDownList>
    <br />
    <asp:Label ID="Label3" runat="server" Text="Enter Source Table Name" />
    <asp:TextBox ID="txtTable" runat="server"></asp:TextBox>
    <br />
    <asp:Label ID="Label1" runat="server" Text="Has Header Row?" />
    <br />
    <asp:RadioButtonList ID="rbHDR" runat="server">
        <asp:ListItem Text="Yes" Value="Yes" Selected="True">
        </asp:ListItem>
        <asp:ListItem Text="No" Value="No"></asp:ListItem>
    </asp:RadioButtonList>
    <br />
    <asp:Button ID="btnSave" runat="server" Text="Save"
          OnClick="btnSave_Click" />
    <asp:Button ID="btnCancel" runat="server" Text="Cancel"
          OnClick="btnCancel_Click" />
</asp:Panel>


Namespaces
You will need to import the following namespaces:
C#
using System.Data;
using System.Data.OleDb;
using System.IO;
using System.Configuration;
using System.Data.SqlClient;
 
VB.Net
Imports System.Data
Imports System.Data.OleDb
Imports System.IO
Imports System.Data.SqlClient
Imports System.Configuration
 
 
Web.Config Configurations
I have used an AppSettings key, FolderPath, to store the path of the folder where the uploaded Excel file will be stored.
There are three connection strings, as described below:
1. Excel03ConString - Connection string for the Excel 97 – 2003 formats
2. Excel07ConString - Connection string for the Excel 2007 format
3. conString - Connection string for the SQL Server 2005 Express database

<appSettings>
      <add key="FolderPath" value="Files/"/>
</appSettings>
 
<connectionStrings>
<add name="Excel03ConString" connectionString="Provider=Microsoft.Jet.OLEDB.4.0;
            Data Source={0};Extended Properties='Excel 8.0;HDR={1}'"/>
<add name="Excel07ConString" connectionString="Provider=Microsoft.ACE.OLEDB.12.0;
            Data Source={0};Extended Properties='Excel 12.0;HDR={1}'"/>
<add name="conString" connectionString="Data Source=.\SQLEXPRESS;
            database=ExcelImport;Integrated Security=true"/>
</connectionStrings>
 

Uploading the excel Workbook
The user first has to upload the Excel file, whose data he wants to transfer to the SQL Server database, onto the server using the ASP.Net FileUpload control and an Upload button. The code snippet for the Upload button is given below.
 
C#
protected void btnUpload_Click(object sender, EventArgs e)
{
    if (FileUpload1.HasFile)
    {
        string FileName = Path.GetFileName(FileUpload1.PostedFile.FileName);
        string Extension = Path.GetExtension(FileUpload1.PostedFile.FileName);
        string FolderPath = ConfigurationManager.AppSettings["FolderPath"];
        string FilePath = Server.MapPath(FolderPath + FileName);
        FileUpload1.SaveAs(FilePath);
        GetExcelSheets(FilePath, Extension, "Yes");
    }
}
 
 VB.Net
Protected Sub btnUpload_Click(ByVal sender As Object, ByVal e As System.EventArgs)
  If FileUpload1.HasFile Then
    Dim FileName As String = Path.GetFileName(FileUpload1.PostedFile.FileName)
    Dim Extension As String = Path.GetExtension(FileUpload1.PostedFile.FileName)
    Dim FolderPath As String = ConfigurationManager.AppSettings("FolderPath")
    Dim FilePath As String = Server.MapPath(FolderPath + FileName)
    FileUpload1.SaveAs(FilePath)
    GetExcelSheets(FilePath, Extension, "Yes")
  End If
End Sub
 
The above code snippet simply uploads the Excel Workbook into the path defined in the Web.Config key. The figure below displays the User Interface for uploading the Excel File.

User Interface: Uploading the Excel Workbook

You will notice the GetExcelSheets function being called on the click of the Upload button. As the name suggests, the function reads the names of all the sheets present in the Excel workbook and binds the result to the DropDownList. The complete function is given below.
          
C#
private void GetExcelSheets(string FilePath, string Extension, string isHDR)
{
    string conStr="";
    switch (Extension)
    {
        case ".xls": //Excel 97-03
            conStr = ConfigurationManager.ConnectionStrings["Excel03ConString"]
                     .ConnectionString;
            break;
        case ".xlsx": //Excel 07
            conStr = ConfigurationManager.ConnectionStrings["Excel07ConString"]
                     .ConnectionString;
            break;
    }
 
    //Get the Sheets in the Excel WorkBook
    conStr = String.Format(conStr, FilePath, isHDR);
    OleDbConnection connExcel = new OleDbConnection(conStr);
    OleDbCommand cmdExcel = new OleDbCommand();
    OleDbDataAdapter oda = new OleDbDataAdapter();
    cmdExcel.Connection = connExcel;
    connExcel.Open();
 
    //Bind the Sheets to DropDownList
    ddlSheets.Items.Clear(); 
    ddlSheets.Items.Add(new ListItem("--Select Sheet--", ""));    
    ddlSheets.DataSource=connExcel
             .GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
    ddlSheets.DataTextField = "TABLE_NAME";
    ddlSheets.DataValueField = "TABLE_NAME";
    ddlSheets.DataBind();
    connExcel.Close();
    txtTable.Text = "";
    lblFileName.Text = Path.GetFileName(FilePath);
    Panel2.Visible = true;
    Panel1.Visible = false;
}
 
   
VB.Net

Private Sub GetExcelSheets(ByVal FilePath As String, ByVal Extension As String, ByVal isHDR As String)
  Dim conStr As String = ""
  Select Case Extension
    Case ".xls"
     'Excel 97-03
     conStr = ConfigurationManager.ConnectionStrings("Excel03ConString") _
              .ConnectionString
     Exit Select
    Case ".xlsx"
     'Excel 07
     conStr = ConfigurationManager.ConnectionStrings("Excel07ConString") _
               .ConnectionString
     Exit Select
  End Select
 
  'Get the Sheets in the Excel WorkBook
  conStr = String.Format(conStr, FilePath, isHDR)
  Dim connExcel As New OleDbConnection(conStr)
  Dim cmdExcel As New OleDbCommand()
  Dim oda As New OleDbDataAdapter()
  cmdExcel.Connection = connExcel
  connExcel.Open()
 
  'Bind the Sheets to DropDownList
  ddlSheets.Items.Clear()
  ddlSheets.Items.Add(New ListItem("--Select Sheet--", ""))
  ddlSheets.DataSource = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid _
                           .Tables, Nothing)
  ddlSheets.DataTextField = "TABLE_NAME"
  ddlSheets.DataValueField = "TABLE_NAME"
  ddlSheets.DataBind()
  connExcel.Close()
  txtTable.Text = ""
  lblFileName.Text = Path.GetFileName(FilePath)
  Panel2.Visible = True
  Panel1.Visible = False
End Sub



Importing the rows from the excel sheet to the database table
Once the sheets are filled in the DropDownList, the user interface looks like the one below.

User Interface: Selection of Excel sheets and the destination database table

As you can see, the Excel file name is displayed along with all the sheets in the DropDownList. The RadioButtonList captures the information about the header row in the Excel sheet. Once the user presses the Save button, all the rows are read into the database table whose name the user has entered in the textbox; if the table is not present, it will be created.
The code snippet for the Save button is given below.
   
C#
protected void btnSave_Click(object sender, EventArgs e)
{
    string FileName = lblFileName.Text;
    string Extension = Path.GetExtension(FileName);
    string FolderPath = Server.MapPath (ConfigurationManager
                       .AppSettings["FolderPath"]);
    string CommandText = "";
    switch (Extension)
    {
        case ".xls": //Excel 97-03
            CommandText = "spx_ImportFromExcel03";
            break;
        case ".xlsx": //Excel 07
            CommandText = "spx_ImportFromExcel07";
            break;
    }
    //Read Excel Sheet using Stored Procedure
    //And import the data into Database Table
    String strConnString = ConfigurationManager
                 .ConnectionStrings["conString"].ConnectionString;
    SqlConnection con = new SqlConnection(strConnString);
    SqlCommand cmd = new SqlCommand();
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.CommandText = CommandText;
    cmd.Parameters.Add("@SheetName", SqlDbType.VarChar).Value =
                   ddlSheets.SelectedItem.Text;
    cmd.Parameters.Add("@FilePath", SqlDbType.VarChar).Value =
                   FolderPath + FileName;
    cmd.Parameters.Add("@HDR", SqlDbType.VarChar).Value =
                   rbHDR.SelectedItem.Text;
    cmd.Parameters.Add("@TableName", SqlDbType.VarChar).Value =
                   txtTable.Text;  
    cmd.Connection = con;
    try
    {
        con.Open();
        object count = cmd.ExecuteNonQuery();
        lblMessage.ForeColor = System.Drawing.Color.Green;
        lblMessage.Text = count.ToString() + " records inserted."; 
    }
    catch (Exception ex)
    {
        lblMessage.ForeColor = System.Drawing.Color.Red;    
        lblMessage.Text = ex.Message; 
    }
    finally
    {
        con.Close();
        con.Dispose();
        Panel1.Visible = true;
        Panel2.Visible = false;
    }
}
 
VB.Net
Protected Sub btnSave_Click(ByVal sender As Object, ByVal e As EventArgs)
  Dim FileName As String = lblFileName.Text
  Dim Extension As String = Path.GetExtension(FileName)
  Dim FolderPath As String = Server.MapPath( _
           ConfigurationManager.AppSettings("FolderPath"))
  Dim CommandText As String = ""
  Select Case Extension
      Case ".xls"
       'Excel 97-03
       CommandText = "spx_ImportFromExcel03"
       Exit Select
      Case ".xlsx"
       'Excel 07
       CommandText = "spx_ImportFromExcel07"
        Exit Select
  End Select
 
  'Read Excel Sheet using Stored Procedure
  'And import the data into Database Table
  Dim strConnString As String = ConfigurationManager _
     .ConnectionStrings("conString").ConnectionString
  Dim con As New SqlConnection(strConnString)
  Dim cmd As New SqlCommand()
  cmd.CommandType = CommandType.StoredProcedure
  cmd.CommandText = CommandText
  cmd.Parameters.Add("@SheetName", SqlDbType.VarChar).Value = _
          ddlSheets.SelectedItem.Text
  cmd.Parameters.Add("@FilePath", SqlDbType.VarChar).Value = _
          FolderPath + FileName
  cmd.Parameters.Add("@HDR", SqlDbType.VarChar).Value = _
          rbHDR.SelectedItem.Text
  cmd.Parameters.Add("@TableName", SqlDbType.VarChar).Value = _
          txtTable.Text
  cmd.Connection = con
  Try
    con.Open()
    Dim count As Object = cmd.ExecuteNonQuery()
    lblMessage.ForeColor = System.Drawing.Color.Green
    lblMessage.Text = count.ToString() & " records inserted."
  Catch ex As Exception
    lblMessage.ForeColor = System.Drawing.Color.Red
    lblMessage.Text = ex.Message
  Finally
    con.Close()
    con.Dispose()
    Panel1.Visible = True
    Panel2.Visible = False
  End Try
End Sub
 
The above code snippet simply calls the respective stored procedure based on the file extension, and the status is displayed to the user as shown in the figure below.

User Interface: Status notified to the user after completion of the Excel Sheet import
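The bodies of `spx_ImportFromExcel03` and `spx_ImportFromExcel07` are not shown above. As a hedged sketch only, such a procedure could be written with `OPENROWSET`, assuming Ad Hoc Distributed Queries are enabled on the server; `spx_ImportFromExcel07` would differ only in using the ACE provider (`Microsoft.ACE.OLEDB.12.0` with `Excel 12.0`) instead of Jet. This is a hypothetical reconstruction, not the article's actual procedure.

```sql
-- Hypothetical sketch; the original article does not show the procedure body.
CREATE PROCEDURE spx_ImportFromExcel03
      @SheetName VARCHAR(50),
      @FilePath  VARCHAR(200),
      @HDR       VARCHAR(5),
      @TableName VARCHAR(50)
AS
BEGIN
      DECLARE @SQL NVARCHAR(MAX)

      -- Create the destination table from the sheet if it does not exist,
      -- otherwise append the sheet's rows to the existing table.
      IF NOT EXISTS (SELECT * FROM sys.objects WHERE name = @TableName)
            SET @SQL = 'SELECT * INTO ' + QUOTENAME(@TableName)
                     + ' FROM OPENROWSET(''Microsoft.Jet.OLEDB.4.0'','
                     + ' ''Excel 8.0;HDR=' + @HDR + ';Database=' + @FilePath + ''','
                     + ' ''SELECT * FROM [' + @SheetName + ']'')'
      ELSE
            SET @SQL = 'INSERT INTO ' + QUOTENAME(@TableName)
                     + ' SELECT * FROM OPENROWSET(''Microsoft.Jet.OLEDB.4.0'','
                     + ' ''Excel 8.0;HDR=' + @HDR + ';Database=' + @FilePath + ''','
                     + ' ''SELECT * FROM [' + @SheetName + ']'')'

      EXEC sp_executesql @SQL
END
```

Note that because the procedure builds dynamic SQL from its parameters, `ExecuteNonQuery` in the page code returns the number of rows the final statement affected, which is what the page reports as "records inserted."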