Thursday 16 May 2013

SQL SERVER Interview Questions PART 1




What is a "trigger"?
Microsoft SQL Server includes support for a special type of stored procedure called a trigger. A trigger is a stored procedure that executes whenever an INSERT, UPDATE, or DELETE statement is executed against a table or a view. Triggers are created to enforce integrity rules in a database: you associate a trigger with a table so that it fires whenever the contents of the table change. In short, a trigger is a set of SQL statements that executes in response to a data modification event on a table. A trigger can also work around some restrictions of constraints. For instance:
·          A constraint cannot use pseudo columns in its criteria, whereas a trigger can.
·          A constraint cannot refer to the old and new values of a row, whereas a trigger can.

Besides table triggers, there are also schema and database triggers. These can be made to fire when new objects are created, when a user logs in, when the database shuts down, and so on. Table-level triggers can be classified into row-level and statement-level triggers, and those can be further broken down into before and after triggers.

·          In SQL Server 6.5 you could define only three triggers per table: one for INSERT, one for UPDATE, and one for DELETE.
·          From SQL Server 7.0 onwards this restriction is gone, and you can create multiple triggers per action. But in 7.0 there is no way to control the order in which the triggers fire.
·          In SQL Server 2000 you can specify which trigger fires first or last using sp_settriggerorder. Triggers cannot be invoked on demand; they fire only when an associated action (INSERT, UPDATE, or DELETE) happens on the table on which they are defined.

Triggers are generally used to implement business rules and auditing. Triggers can also be used to extend referential integrity checks, but wherever possible use constraints for this purpose instead, as constraints are much faster. Up to SQL Server 7.0, triggers fired only after the data modification operation happened, so in a way they are post triggers. SQL Server 2000 also lets you create INSTEAD OF triggers, which fire in place of the triggering statement.
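As a runnable sketch of the auditing idea, here is an AFTER UPDATE trigger. SQLite (driven from Python) is used so the example can actually execute; T-SQL trigger syntax differs slightly, and the accounts/audit_log tables are invented for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL);
CREATE TABLE audit_log (account_id INTEGER, old_balance REAL, new_balance REAL);

-- Fires after every UPDATE on accounts and records the change for auditing.
CREATE TRIGGER trg_audit_balance
AFTER UPDATE ON accounts
BEGIN
    INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;
""")
conn.execute("INSERT INTO accounts VALUES (1, 100.0)")
conn.execute("UPDATE accounts SET balance = 150.0 WHERE id = 1")
rows = conn.execute("SELECT * FROM audit_log").fetchall()
print(rows)  # [(1, 100.0, 150.0)] -- the trigger captured old and new values
```

Note that no one called the trigger explicitly; the UPDATE statement alone caused it to fire.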

 
What is "index covering" of a query?
A non-clustered index that includes (or covers) all the columns used in a query is called a covering index. When SQL Server can resolve a query entirely from a non-clustered index, it prefers to scan the index rather than the table, because the index typically occupies fewer data pages. If your query uses only columns included in the index, SQL Server can scan that index alone to produce the desired output.
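The effect is easy to see in a query plan. The sketch below uses SQLite via Python (which literally reports "COVERING INDEX" in its plan output; SQL Server shows the same idea as an index seek with no key lookup), and the orders table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
# The index holds every column the query touches, so the table is never read.
conn.execute("CREATE INDEX ix_cover ON orders (customer_id, total)")
(row,) = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT customer_id, total FROM orders WHERE customer_id = 7").fetchall()
plan = row[3]  # the human-readable plan text
print(plan)    # e.g. "SEARCH orders USING COVERING INDEX ix_cover (customer_id=?)"
```

If the query also selected a column that is not in the index, the plan would fall back to touching the table for each matching row.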

What types of join algorithms can you have?
You can dynamically relate different tables by applying what's known as a join. Technically, a join is the operation used when selecting data to create these relationships at retrieval time. What that means to you is that a join determines which records are selected and which aren't. In this article, we'll introduce you to several types of joins supported by SQL and show you how to correctly apply them to get the data you need.

What that means is that a join is conditional—similar to a WHERE clause or criteria expression—in that the join specifies which records (rows) are selected in both tables. Certainly, the type of join significantly impacts which records are retrieved or acted upon. For the most part, a join can be specified in a FROM or WHERE clause, so be careful where you specify them. Because the data engine executes the clauses in a specific sequence, placement can affect the results.

Generally, the data engine fulfills the FROM clause before applying any other clauses. Adding the join here eliminates records before they're evaluated by any other clauses. As a rule, you'll find this is the recommended method.

Joins are used in queries to explain how different tables are related. Joins also let you select data from one table depending on data from another table. The types of joins are INNER JOIN, OUTER JOIN, and CROSS JOIN. OUTER JOINs are further classified as LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN.
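A runnable sketch of the main join types, using SQLite via Python (SQL Server additionally supports RIGHT and FULL OUTER JOIN; the dept/emp sample data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
CREATE TABLE emp  (emp_name TEXT, dept_id INTEGER);
INSERT INTO dept VALUES (1, 'Sales'), (2, 'HR');
INSERT INTO emp  VALUES ('Ann', 1), ('Bob', NULL);
""")
inner = conn.execute(
    "SELECT e.emp_name, d.dept_name FROM emp e "
    "JOIN dept d ON e.dept_id = d.dept_id").fetchall()
left = conn.execute(
    "SELECT e.emp_name, d.dept_name FROM emp e "
    "LEFT JOIN dept d ON e.dept_id = d.dept_id").fetchall()
cross = conn.execute("SELECT COUNT(*) FROM emp CROSS JOIN dept").fetchone()[0]

print(inner)  # [('Ann', 'Sales')]                    -- only matching rows
print(left)   # [('Ann', 'Sales'), ('Bob', None)]     -- unmatched left rows kept
print(cross)  # 4                                     -- every combination (2 x 2)
```

Bob has no department, so he disappears from the inner join but survives the left outer join with NULLs filled in.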

What is a SQL View?
A view can be thought of as either a virtual table or a stored query. The data accessible through a view is not stored in the database as a distinct object. What is stored in the database is a SELECT statement. The result set of the SELECT statement forms the virtual table returned by the view. A user can use this virtual table by referencing the view name in Transact-SQL statements the same way a table is referenced. A view is used to do any or all of these functions:

·          Restrict a user to specific rows in a table. For example, allow an employee to see only the rows recording his or her work in a labor-tracking table.
·          Restrict a user to specific columns. For example, allow employees who do not work in payroll to see the name, office, work phone, and department columns in an employee table, but do not allow them to see any columns with salary information or personal information.
·          Join columns from multiple tables so that they look like a single table.
·          Aggregate information instead of supplying details. For example, present the sum of a column, or the maximum or minimum value from a column.

Views are created by defining the SELECT statement that retrieves the data to be presented by the view. The data tables referenced by the SELECT statement are known as the base tables for the view. In this example, titleview in the pubs database is a view that selects data from three base tables to present a virtual table of commonly needed data:

CREATE VIEW titleview
AS
SELECT title, au_ord, au_lname, price, ytd_sales, pub_id
FROM authors AS a
     JOIN titleauthor AS ta ON (a.au_id = ta.au_id)
     JOIN titles AS t ON (t.title_id = ta.title_id)

You can then reference titleview in statements in the same way you would reference a table:

SELECT *
FROM titleview

A view can reference another view. For example, titleview presents information that is useful for managers, but a company typically discloses year-to-date figures only in quarterly or annual financial statements. A view can be built that selects all the titleview columns except au_ord and ytd_sales. This new view can be used by customers to get lists of available books without seeing the financial information:

CREATE VIEW Cust_titleview
AS
SELECT title, au_lname, price, pub_id
FROM titleview

Views in all versions of SQL Server are updatable (can be the target of UPDATE, DELETE, or INSERT statements), as long as the modification affects only one of the base tables referenced by the view, for example:

-- Increase the prices for publisher '0736' by 10%.
UPDATE titleview
SET price = price * 1.10
WHERE pub_id = '0736'
GO

SQL Server 2000 supports more complex types of INSERT, UPDATE, and DELETE statements that reference views. INSTEAD OF triggers can be defined on a view to specify the individual updates that must be performed against the base tables to support the INSERT, UPDATE, or DELETE statement. Also, partitioned views support INSERT, UPDATE, and DELETE statements that modify multiple member tables referenced by the view.
Indexed views are a SQL Server 2000 feature that greatly improves the performance of complex views of the type usually found in data warehouses or other decision support systems.

Views are called virtual tables because the result set of a view is not usually saved in the database. The result set for a view is dynamically incorporated into the logic of the statement and the result set is built dynamically at run time.

What is a Primary Key?
A primary key is a column (or combination of columns) that uniquely identifies each row in a table. No two rows in the table can have the same primary key value, and the primary key cannot contain NULL. Typical primary key fields are an employee ID, a part number, or a customer number. Typically, you specify which column contains the primary key when you create the table.

Define candidate key, alternate key and composite key.
A candidate key is one that can identify each row of a table uniquely. Generally one of the candidate keys becomes the primary key of the table; if the table has more than one candidate key, the rest are called alternate keys. A key formed by combining two or more columns is called a composite key.

What’s the difference between a primary key and a unique key?
Both primary key and unique key enforce uniqueness of the column on which they are defined, but by default a primary key creates a clustered index on the column, whereas a unique key creates a non-clustered index. Another difference is that a primary key doesn't allow NULLs, but a unique key allows exactly one NULL.
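The uniqueness half of this can be sketched with SQLite via Python. Be aware the NULL semantics differ from SQL Server: SQLite allows multiple NULLs in a UNIQUE column, while SQL Server allows exactly one, and the clustered/non-clustered distinction does not exist in SQLite at all. The employee table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,   -- must be unique; identifies each row
    email  TEXT UNIQUE            -- also unique, but NULL is accepted
)""")
conn.execute("INSERT INTO employee VALUES (1, 'a@x.com')")

duplicate_pk_rejected = False
try:
    conn.execute("INSERT INTO employee VALUES (1, 'b@x.com')")  # same emp_id
except sqlite3.IntegrityError:
    duplicate_pk_rejected = True  # the duplicate key value was refused

conn.execute("INSERT INTO employee VALUES (2, NULL)")  # NULL passes UNIQUE
print(duplicate_pk_rejected)  # True
```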

What is a "constraint"?
A constraint allows you to apply simple integrity checks to a table. There are five kinds of constraints in all, PRIMARY KEY, UNIQUE, FOREIGN KEY, CHECK, and DEFAULT, each performing a specific type of action:
·          PRIMARY KEY/UNIQUE - enforces uniqueness of a particular table column.
·          DEFAULT - specifies a default value for a column in case an insert operation does not provide one.
·          FOREIGN KEY - validates that every value in a column exists in a column of another table.
·          CHECK - checks that every value stored in a column satisfies a specified condition.
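All of these constraint types can be exercised in one small sketch. SQLite via Python is used here because its DDL for these constraints is essentially the same as T-SQL's (though SQLite enforces foreign keys only after a PRAGMA); the publisher/book tables are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE publisher (
    pub_id INTEGER PRIMARY KEY                      -- PRIMARY KEY constraint
);
CREATE TABLE book (
    title  TEXT UNIQUE,                             -- UNIQUE constraint
    price  REAL CHECK (price >= 0),                 -- CHECK constraint
    status TEXT DEFAULT 'draft',                    -- DEFAULT constraint
    pub_id INTEGER REFERENCES publisher(pub_id)     -- FOREIGN KEY constraint
);
INSERT INTO publisher VALUES (1);
INSERT INTO book (title, price, pub_id) VALUES ('SQL Basics', 9.99, 1);
""")

failures = []
for bad in ["INSERT INTO book (title, price, pub_id) VALUES ('Neg', -5, 1)",     # violates CHECK
            "INSERT INTO book (title, price, pub_id) VALUES ('Orphan', 1, 99)"]: # violates FK
    try:
        conn.execute(bad)
    except sqlite3.IntegrityError:
        failures.append(bad.split("'")[1])  # remember which title was rejected

status = conn.execute("SELECT status FROM book").fetchone()[0]
print(failures, status)  # ['Neg', 'Orphan'] draft
```

The DEFAULT constraint shows up in the last line: the successful insert never supplied a status, yet the row carries 'draft'.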

What action do you have to perform before retrieving data from the next result set of a stored procedure?
Move the cursor down one row from its current position. A result set cursor is initially positioned before the first row, so before you can read the first row you need to move the cursor down by one row.

With certain database systems, a stored procedure can return multiple result sets, multiple update counts, or some combination of both. Also, if you are providing a user with the ability to enter any SQL statement, you don't know if you are going to get a ResultSet or an update count back from each statement, without analyzing the contents. The Statement.execute() method helps in these cases. Method Statement.execute() returns a Boolean to tell you the type of response:

·          true indicates the next result is a ResultSet. Use Statement.getResultSet to get the ResultSet.
·          false indicates the next result is an update count, or that there are no more results. Use Statement.getUpdateCount to get the update count; an update count of -1 means there are no more results (otherwise it is 0 or positive).

After processing each response, you use Statement.getMoreResults to check for more results, again returning a Boolean. The following demonstrates the processing of multiple result sets:

boolean result = stmt.execute(" ... ");
int updateCount = stmt.getUpdateCount();
while (result || (updateCount != -1)) {
  if (result) {
    ResultSet r = stmt.getResultSet();
    // process result set
  } else if (updateCount != -1) {
    // process update count
  }
  result = stmt.getMoreResults();
  updateCount = stmt.getUpdateCount();
}

Briefly explain about Database.
A collection of information organized in such a way that a computer program can quickly select desired pieces of data. You can think of a database as an electronic filing system. Traditional databases are organized by fields, records, and files. A field is a single piece of information; a record is one complete set of fields; and a file is a collection of records. For example, a telephone book is analogous to a file. It contains a list of records, each of which consists of three fields: name, address, and telephone number.

An alternative concept in database design is known as Hypertext. In a Hypertext database, any object, whether it be a piece of text, a picture, or a film, can be linked to any other object. Hypertext databases are particularly useful for organizing large amounts of disparate information, but they are not designed for numerical analysis.
To access information from a database, you need a database management system (DBMS). This is a collection of programs that enables you to enter, organize, and select data in a database.

Briefly explain about DBMS.
A collection of programs that enables you to store, modify, and extract information from a database. There are many different types of DBMSs, ranging from small systems that run on personal computers to huge systems that run on mainframes. The following are examples of database applications:
·          computerized library systems
·          automated teller machines
·          flight reservation systems
·          computerized parts inventory systems

From a technical standpoint, DBMSs can differ widely. The terms relational, network, flat, and hierarchical all refer to the way a DBMS organizes information internally. The internal organization can affect how quickly and flexibly you can extract information.

Requests for information from a database are made in the form of a query, which is a stylized question. For example, the query
                        SELECT ALL WHERE NAME = "SMITH" AND AGE > 35
requests all records in which the NAME field is SMITH and the AGE field is greater than 35. The set of rules for constructing queries is known as a query language. Different DBMSs support different query languages, although there is a semi-standardized query language called SQL (structured query language). Sophisticated languages for managing database systems are called fourth-generation languages, or 4GLs for short.

The information from a database can be presented in a variety of formats. Most DBMSs include a report writer program that enables you to output data in the form of a report. Many DBMSs also include a graphics component that enables you to output information in the form of graphs and charts.

Briefly explain about RDBMS.
Short for relational database management system, and pronounced as separate letters, a type of database management system (DBMS) that stores data in the form of related tables. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database. As a result, the same database can be viewed in many different ways.

An important feature of relational systems is that a single database can be spread across several tables. This differs from flat-file databases, in which each database is self-contained in a single table. Almost all full-scale database systems are RDBMSs. Small database systems, however, use other designs that provide less flexibility in posing queries.

What is normalization? Explain different levels of normalization?
In a relational database design, the process of organizing data to minimize redundancy is termed as Normalization. Normalization usually involves dividing a database into two or more tables and defining relationships between the tables. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.

Normalization can be viewed as a series of steps (i.e., levels) designed, one after another, to deal with ways in which tables can be "too complicated for their own good". The purpose of normalization is to reduce the chances for anomalies to occur in a database. The definitions of the various levels of normalization illustrate complications to be eliminated in order to reduce the chances of anomalies. At all levels and in every case of a table with a complication, the resolution of the problem turns out to be the establishment of two or more simpler tables which, as a group, contain the same information as the original table but which, because of their simpler individual structures, lack the complication.

There are three main normal forms, each with increasing levels of normalization:

·          First Normal Form (1NF): Eliminate repeating groups. Any subgroups of data that appear within a record should be split into separate tables, and each field should contain a single piece of information. For example, in an employee list, each record would contain only one birthdate field.

·          Second Normal Form (2NF): Eliminate partial dependencies. A partial dependency exists when a non-key field depends on only part of a composite primary key rather than on the whole key; every non-key field must be a fact about the entire primary key.

·          Third Normal Form (3NF): Eliminate all transitive (i.e. hidden) dependencies. In other words, every column that isn't part of the key must depend on the key for its informational value, and no duplicate information is permitted. So, for example, if two tables both require a birthdate field, the birthdate information would be separated into its own table, and the other two tables would access it via a foreign key into the birthdate table. Any change to a birthdate is then automatically reflected in all tables that link to the birthdate table.

There are additional normalization levels, such as Boyce-Codd Normal Form (BCNF), fourth normal form (4NF), and fifth normal form (5NF). While normalization makes databases more efficient to maintain, it can also make them more complex, because data is separated into so many different tables.

·          Boyce-Codd Normal Form (BCNF): A table is in BCNF if it is in 3NF and if every determinant is a candidate key.
·          4th Normal Form (4NF): A table is in 4NF if it is in BCNF and if it has no multi-valued dependencies.
·          5th Normal Form (5NF): A table is in 5NF, also called "Projection-Join Normal Form" (PJNF), if it is in 4NF and if every join dependency in the table is a consequence of the candidate keys of the table.
·          Domain-Key Normal Form (DKNF): A table is in DKNF if every constraint on the table is a logical consequence of the definition of keys and domains.
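One normalization step can be shown end to end: removing a transitive dependency (the 3NF case above). The sketch uses SQLite via Python, and the order/customer schema is invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Unnormalized: the customer's city repeats on every order. cust_city
-- depends on cust_id, not on the key order_id -- a transitive dependency.
CREATE TABLE orders_flat (order_id INTEGER, cust_id INTEGER, cust_city TEXT);
INSERT INTO orders_flat VALUES (1, 10, 'Pune'), (2, 10, 'Pune'), (3, 20, 'Delhi');

-- Normalized: each fact is stored once, related through cust_id.
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_city TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
INSERT INTO customers SELECT DISTINCT cust_id, cust_city FROM orders_flat;
INSERT INTO orders SELECT order_id, cust_id FROM orders_flat;
""")
# A join over the simpler tables yields exactly the original rows.
rebuilt = conn.execute("""
    SELECT o.order_id, o.cust_id, c.cust_city
    FROM orders o JOIN customers c ON o.cust_id = c.cust_id
    ORDER BY o.order_id""").fetchall()
original = conn.execute("SELECT * FROM orders_flat ORDER BY order_id").fetchall()
print(rebuilt == original)  # True -- no information was lost by the split
```

After the split, changing a customer's city is a one-row update in customers instead of an update to every order row.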

What is denormalization and when would you go for it?
As the name indicates, denormalization is the reverse process of normalization. It is the controlled introduction of redundancy into the database design. It helps improve query performance, as the number of joins can be reduced.

What is an index? What are the types of indexes? How many clustered indexes can be created on a table? I created a separate index on each column of a table. What are the advantages and disadvantages of this approach?
Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve the data quicker. Indexes are of two types: clustered and non-clustered. When you create a clustered index on a table, all the rows in the table are stored in the order of the clustered index key, so there can be only one clustered index per table. Non-clustered indexes have their own storage, separate from the table data. Both kinds are stored as B-tree structures, with the leaf-level nodes of a non-clustered index holding the index key and a row locator. The row locator is either the RID or the clustered index key, depending on the absence or presence of a clustered index on the table.

If you create an index on each column of a table, it improves query performance, as the query optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same time, data modification operations (INSERT, UPDATE, DELETE) become slower, as every time data changes in the table all the indexes need to be updated. Another disadvantage is that indexes need disk space: the more indexes you have, the more disk space is used.
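How an index changes the access path is visible directly in a query plan. This sketch uses SQLite's EXPLAIN QUERY PLAN via Python (SQL Server shows the analogous change as a table scan becoming an index seek in its graphical plan); the emp table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empid INTEGER, empname TEXT)")
query = "SELECT empname FROM emp WHERE empid = 42"

# Without an index the only option is to scan every row.
before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]
conn.execute("CREATE INDEX ix_empid ON emp (empid)")
# With the index, the optimizer seeks directly to the matching key.
after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]

print(before)  # e.g. "SCAN emp"
print(after)   # e.g. "SEARCH emp USING INDEX ix_empid (empid=?)"
```

The cost of that faster read path is exactly the write overhead described above: every INSERT, UPDATE, or DELETE now has to maintain ix_empid too.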

What is RAID and what are different types of RAID configurations?
Note: RAID is available only on Microsoft Windows NT 4.0 and Microsoft Windows 2000.

RAID (redundant array of independent disks) is a disk system that comprises multiple disk drives (an array) to provide higher performance, reliability, storage capacity, and lower cost. Fault-tolerant arrays are categorized in six RAID levels, 0 through 5. Each level uses a different algorithm to implement fault tolerance.
Although RAID is not a part of Microsoft® SQL Server™ 2000, its implementation can directly affect the way SQL Server performs. RAID levels 0, 1, and 5 are typically used with SQL Server.

A hardware disk array improves I/O performance because I/O functions, such as striping and mirroring, are handled efficiently in firmware. Conversely, an operating system–based RAID offers lower cost but consumes processor cycles. When cost is a consideration and redundancy and high performance are required, Microsoft Windows® NT® stripe sets with parity or Windows 2000 RAID-5 volumes are a good solution.

Data striping (RAID 0) is the RAID configuration with the highest performance, but if one disk fails, all the data on the stripe set becomes inaccessible. A common installation technique for relational database management systems is to configure the database on a RAID 0 drive and then place the transaction log on a mirrored drive (RAID 1). You can get the best disk I/O performance for the database and maintain data recoverability (assuming you perform regular database backups) through a mirrored transaction log.

If data must be quickly recoverable, consider mirroring the transaction log and placing the database on a RAID 5 disk. RAID 5 provides redundancy of all data on the array, allowing a single disk to fail and be replaced in most cases without system downtime. RAID 5 offers lower performance than RAID 0 or RAID 1 but higher reliability and faster recovery.

What are the steps you will take to improve performance of a poor performing query?
There could be a lot of reasons behind the poor performance of a query.

But some general issues to talk about would be:
No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization, excess usage of cursors and temporary tables.

Some of the tools/ways that help you troubleshooting performance problems are:
SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance monitor, Graphical execution plan in Query Analyzer.

Write down the general syntax for a SELECT statement covering all the options.
SELECT select_list
[ INTO new_table ]
FROM table_source
[ WHERE search_condition ]
[ GROUP BY group_by_expression ]
[ HAVING search_condition ]
[ ORDER BY order_expression [ ASC | DESC ] ]
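A runnable sketch exercising those clauses in order, using SQLite via Python (the sales table is invented). WHERE filters rows before grouping, HAVING filters the groups after aggregation, and ORDER BY sorts the final result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount INTEGER);
INSERT INTO sales VALUES
    ('East', 100), ('East', 200), ('West', 50), ('West', 40), ('North', 5);
""")
rows = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 10          -- row filter: drops the North sale of 5
    GROUP BY region
    HAVING SUM(amount) > 60    -- group filter: runs after aggregation
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('East', 300), ('West', 90)]
```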

What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?
Extended stored procedures allow you to create your own external routines in a programming language such as C. The extended stored procedures appear to users as normal stored procedures and are executed in the same way. Parameters can be passed to extended stored procedures, and they can return results and return status.

Extended stored procedures are dynamic-link libraries (DLLs) that SQL Server can dynamically load and execute. Extended stored procedures run directly in the address space of SQL Server and are programmed using the SQL Server Open Data Services API. An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored procedures using the EXEC statement.

Yes, you can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using sp_OACreate stored procedure.

What is a self join? Explain it with an example.
Self-joins are useful in SQL SELECT statements and are often used to identify duplicate entries, or entries linked in other ways, such as customers who share the same telephone number or patients who share the same address. A self-join joins a table (or view) to itself: fields in the table are compared against fields in a second, virtual copy of the same table.

A self join is just like any other join, except that two instances of the same table are joined in the query. Here is an example: an Employees table contains rows for normal employees as well as managers, so to find the manager of each employee you need a self join.

CREATE TABLE emp 
(
empid int,
mgrid int,
empname char(10)
)
INSERT emp SELECT 1,2,'Vyas'
INSERT emp SELECT 2,3,'Mohan'
INSERT emp SELECT 3,NULL,'Shobha'
INSERT emp SELECT 4,2,'Shridhar'
INSERT emp SELECT 5,2,'Sourabh'

SELECT t1.empname [Employee], t2.empname [Manager]
FROM emp t1, emp t2
WHERE t1.mgrid = t2.empid
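The same example, made runnable with SQLite via Python (same data, but using explicit JOIN syntax instead of the older comma-separated form):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empid INTEGER, mgrid INTEGER, empname TEXT);
INSERT INTO emp VALUES (1, 2, 'Vyas'), (2, 3, 'Mohan'), (3, NULL, 'Shobha'),
                       (4, 2, 'Shridhar'), (5, 2, 'Sourabh');
""")
# t1 plays the employee role, t2 the manager role -- same table, two aliases.
rows = conn.execute("""
    SELECT t1.empname AS employee, t2.empname AS manager
    FROM emp t1 JOIN emp t2 ON t1.mgrid = t2.empid
    ORDER BY t1.empid""").fetchall()
print(rows)
# [('Vyas', 'Mohan'), ('Mohan', 'Shobha'), ('Shridhar', 'Mohan'), ('Sourabh', 'Mohan')]
```

Shobha has a NULL mgrid, so she appears only as a manager, never as an employee; a LEFT JOIN would keep her row with a NULL manager.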

Under the ADO Command Object, what collection is responsible for input to stored procedures?
The Parameters collection.

What are data types?
A computer works with values. It manipulates values and produces some kind of result.   Depending on the program, it may be limited as to what types of values it can work with. The types of values a program can work with are known as its data types.

What are the data types in VBScript?
VBScript has only one data type called a Variant. A Variant is a special kind of data type that can contain different kinds of information, depending on how it is used. Because Variant is the only data type in VBScript, it is also the data type returned by all functions in VBScript.

At its simplest, a Variant can contain either numeric or string information. A Variant behaves as a number when you use it in a numeric context and as a string when you use it in a string context. That is, if you are working with data that looks like numbers, VBScript assumes that it is numbers and does what is most appropriate for numbers. Similarly, if you're working with data that can only be string data, VBScript treats it as string data. You can always make numbers behave as strings by enclosing them in quotation marks (" ").

The following are the subtypes of data that a Variant can contain:
Empty, Null, Boolean, Byte, Integer, Currency, Long, Single, Double, Date (Time), String, Object, Error

What are data types in JavaScript?
JavaScript provides three simple data types, each of which allows you to manipulate a different type of data value.
·          Number - used to perform arithmetic operations such as addition, subtraction, multiplication, and division. Any whole number or decimal number that does not appear between quotation marks is considered a Number.
·          String - used to handle text. This data type is simply a series of characters inside quotation marks.
·          Boolean - has only 2 possible values: true or false. This data type allows you to perform logical operations to test whether a value is true or false.

JavaScript also provides two trivial data types for special situations.
·          Null - indicated by the JavaScript keyword null, which represents a condition where no value exists.
·          Undefined - signified by the JavaScript keyword undefined, indicates a situation where the value has not been defined previously in the code.

JavaScript also has complex data types.
·          Object - represents a collection of data and functions. More information on this data type is available in the Object-Oriented JavaScript article.  Technically, all complex JavaScript data types, including String, Function, and Array are objects.
·          Array - stores a collection of data.  See JavaScript Array Basics.
·          Function - a piece of executable program code that is written once, but can be called multiple times from different parts of your JavaScript.

How do you use multiple record sets ( rs.NextRecordSet )?
SQL Server 7.0/2000 with ActiveX Data Objects gives you the ability to use multiple records generated from a single SQL request. For example in an Active Server Page (ASP):

<%
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")

' This is usually an Application-level variable that contains the database
' connection string and is initialized in the global.asa file
cn.Open Application("cn_string")

If cn.State > 0 Then ' The connection succeeded

  ' First build a query that will return multiple recordsets
  strSQL = "SELECT * FROM Employees;"
  strSQL = strSQL & "SELECT * FROM Orders"
  Set rs = cn.Execute(strSQL)

  Do While Not rs Is Nothing
    ' Process each recordset
    Set rs = rs.NextRecordset
  Loop

  cn.Close
End If
Set cn = Nothing
%>

This is somewhat useful; however, its full power can be seen when it is combined with SQL Server stored procedures. In a SQL Server stored procedure it is simple to return multiple recordsets with one call that could build an entire web page. This ability solves several problems and enhances the system by:
1.        Shifting some of the processing back to SQL Server.
2.        Using SQL Server's ability to create an execution plan for all of the SQL contained within the stored procedure, which will therefore execute much quicker than the equivalent SQL/ASP approach.

The first thing to do is create the stored procedure. This stored procedure is oversimplified in that it only builds a simple combo box for a web page:

Create Procedure dbo.sp_BuildMyPage
As
SELECT '<SELECT id=emp name=emp>'
SELECT '<OPTION>' + EmpName + '</OPTION>' FROM Employees
SELECT '</SELECT>'
Go

Then we need to build the Web Page:

<%
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open Application("cn_string")
If cn.State > 0 Then ' The connection succeeded
  strSQL = "sp_BuildMyPage"
  Set rs = cn.Execute(strSQL)
  Do While Not rs Is Nothing
    Response.Write rs.GetString(2, -1)
    Set rs = rs.NextRecordset
  Loop
  cn.Close
End If
Set cn = Nothing
%>

That's it. All the code for the page is stored in a compiled format in SQL Server with a pre-determined execution plan, and the ASP only needs to call the stored procedure, pass it any parameters that are needed, and then handle the resulting recordsets that are returned. Since each recordset in the stored procedure returns only one column with multiple rows, the ADO GetString method simply concatenates all the rows together and emits them with one call per recordset, improving performance significantly over processing one row at a time, as would usually be done when building a combo box.

Define a transaction. What are acid properties of a transaction?
A transaction is a unit of work that must either complete in its entirety or have no effect at all. Transactions are characterized by the four ACID properties:

·          Atomicity ensures that all updates completed under a specific transaction are committed (and made durable) or that all are aborted and rolled back to their previous state.

·          Consistency means that a transaction is a correct transformation of the system state, preserving the state invariants.

·          Isolation protects concurrent transactions from reading partial and uncommitted results, which might create inconsistencies in the application state. Resource managers use transaction-based synchronization protocols to isolate the uncommitted work of active transactions.

·          Durability means that committed updates to managed resources (such as database records) survive failures, including communication failures, process failures, and server-system failures. Transactional logging even allows you to recover the durable state after disk media failures.
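Atomicity in particular is easy to demonstrate: either both halves of a transfer take effect, or neither does. This sketch uses SQLite via Python (SQL Server wraps the same idea in BEGIN TRAN / COMMIT / ROLLBACK); the account table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

try:
    # Transfer 150: the credit succeeds, but the debit would drive alice
    # below zero and violates the CHECK constraint mid-transaction...
    conn.execute("UPDATE account SET balance = balance + 150 WHERE name = 'bob'")
    conn.execute("UPDATE account SET balance = balance - 150 WHERE name = 'alice'")
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()  # ...so the already-applied credit to bob is undone too

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'alice': 100, 'bob': 0} -- unchanged; the transfer was atomic
```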

What is the name of the SQL language used in SQL Server stored procedures? (T-SQL)
Transact-SQL (T-SQL)

What is the purpose of having stored procedures in a database? (T-SQL)
By packaging your SQL statements in stored procedures (SPs), you get faster database performance. An SP needs to be parsed, compiled, and optimized by SQL Server only once, rather than every time it is executed. Also, multiple SQL statements can be packaged in an SP and executed as a group.

Why might you create a stored procedure with the 'with recompile' option? (T-SQL)
As a database is changed by such actions as adding indexes or changing data in indexed columns, the original query plans used to access its tables should be optimized again by recompiling them. This optimization happens automatically the first time a stored procedure is run after Microsoft® SQL Server™ 2000 is restarted. It also occurs if an underlying table used by the stored procedure changes. But if a new index is added from which the stored procedure might benefit, optimization does not automatically happen (until the next time the stored procedure is run after SQL Server is restarted).

SQL Server provides three ways to recompile a stored procedure:

·          The sp_recompile system stored procedure forces a recompile of a stored procedure the next time it is run.

·          Creating a stored procedure that specifies the WITH RECOMPILE option in its definition indicates that SQL Server does not cache a plan for this stored procedure; the stored procedure is recompiled each time it is executed. Use the WITH RECOMPILE option when stored procedures take parameters whose values differ widely between executions of the stored procedure, resulting in different execution plans to be created each time. Use of this option is uncommon, and causes the stored procedure to execute more slowly because the stored procedure must be recompiled each time it is executed.

·          You can force the stored procedure to be recompiled by specifying the WITH RECOMPILE option when you execute the stored procedure. Use this option only if the parameter you are supplying is atypical or if the data has significantly changed since the stored procedure was created.
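The three recompile mechanisms listed above can be sketched as follows (the procedure and table names are illustrative):

```sql
-- Definition-time: never cache a plan for this procedure.
CREATE PROCEDURE usp_OrdersByDateRange
    @From DATETIME, @To DATETIME
WITH RECOMPILE
AS
    SELECT * FROM Orders WHERE OrderDate BETWEEN @From AND @To;
GO

-- Execution-time: force a one-off recompile for an atypical parameter.
EXEC usp_OrdersByDateRange '1996-01-01', '1996-12-31' WITH RECOMPILE;

-- Mark an existing procedure for recompilation on its next run.
EXEC sp_recompile 'usp_OrdersByDateRange';
```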

What is a cursor? (T-SQL)
Cursors allow row-by-row processing of the ResultSet.

Types of cursors: Static, Dynamic, Forward-only, Keyset-driven. See books online for more information.

Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip, whereas a normal SELECT query makes only one roundtrip, however large the ResultSet is. Cursors are also costly because they require more resources and temporary storage (resulting in more IO operations). Further, there are restrictions on the SELECT statements that can be used with some types of cursors.

Most of the time, set-based operations can be used instead of cursors. Here is an example:

If you have to give a flat hike to your employees using the following criteria:

Salary between 30000 and 40000 -- 5000 hike
Salary between 40000 and 55000 -- 7000 hike
Salary between 55000 and 65000 -- 9000 hike

In this situation many developers tend to use a cursor, determine each employee's salary and update it according to the above formula. But the same can be achieved with multiple UPDATE statements, or combined into a single UPDATE statement as shown below:

UPDATE tbl_emp SET salary =
CASE WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000
     WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000
     WHEN salary BETWEEN 55000 AND 65000 THEN salary + 9000
     ELSE salary  -- without this branch, salaries outside the ranges would be set to NULL
END

Another situation in which developers tend to use cursors: You need to call a stored procedure when a column in a particular row meets certain condition. You don't have to use cursors for this. This can be achieved using WHILE loop, as long as there is a unique key to identify each row.
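Such a WHILE loop can be sketched as follows, assuming a hypothetical tbl_emp table with a unique emp_id key and a hypothetical usp_ProcessEmployee procedure:

```sql
DECLARE @emp_id INT;
SELECT @emp_id = MIN(emp_id) FROM tbl_emp;

WHILE @emp_id IS NOT NULL
BEGIN
    -- Call the procedure only for rows meeting the condition.
    IF EXISTS (SELECT 1 FROM tbl_emp WHERE emp_id = @emp_id AND salary > 65000)
        EXEC usp_ProcessEmployee @emp_id;

    -- Advance to the next key value.
    SELECT @emp_id = MIN(emp_id) FROM tbl_emp WHERE emp_id > @emp_id;
END
```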

What can you do to prevent a query from locking data during a simple select statement?
There are two ways to turn off all locking for a specific query or transaction. If you are writing a simple query that provides the data for a report, the easiest way to turn off locking is to use the NOLOCK optimizer hint as part of the query. If you are running larger transactions that may include multiple queries, then you may consider setting the isolation level for the transaction to READ UNCOMMITTED. If you do this, you will have to turn it on, and then off, from within your transaction.

Let's say that you have a SELECT statement that needs to return some rows, but as it runs, it encounters a row it needs that has been locked by another transaction and is still locked. In most cases, the SELECT statement will have to wait until the blocking lock is released, which could take a second, minutes, or who knows how long. Let's also assume that you don't really care whether the locked row is returned, and that you are more concerned with performance than completeness. Or perhaps your goal is to prevent two users from working on the same rows at the same time, so you don't want any currently locked rows to be viewed by your SELECT statement.

If the above is true, consider using the READPAST locking hint. When this hint is added to a SELECT query (it can only be used in a SELECT statement), any row held by a row-level lock is simply skipped; only row-level locks are skipped, other lock types are still honored. The query continues until it is done, preventing a lock from stalling it and providing quick results. For this hint to work, the connection must be at the READ COMMITTED isolation level.

While using this lock will be rare, it can come in handy in unique situations, helping to boost the performance of your application.
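The two hints discussed above can be sketched as follows (the table name is illustrative):

```sql
-- Report query: read without taking shared locks (may return dirty data).
SELECT OrderID, Total
FROM Orders WITH (NOLOCK);

-- Queue-style query: silently skip rows locked at the row level.
-- Requires the READ COMMITTED isolation level.
SELECT TOP 1 OrderID
FROM Orders WITH (READPAST);
```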

Write a T-SQL query that returns all the fields and the first 100 records in the Customers table.
SELECT TOP 100 * FROM Customers
In Transact-SQL, the SELECT keyword can optionally be followed by ALL, DISTINCT, or TOP; DISTINCT and TOP limit the records returned. (DISTINCTROW is a Microsoft Access keyword and is not valid in SQL Server.)

What is a clustered index?  How many are permitted on a table?  What is the penalty of using one?
In Microsoft SQL Server databases you can create a clustered index. In a clustered index, the physical order of the rows in the table is the same as the logical (indexed) order of the index key values. A table can contain only one clustered index. Update and Delete operations are often accelerated by clustered indexes because these operations require large amounts of data to be read. Creating or modifying a clustered index can be time-consuming, because it is during these operations that the table's rows are reorganized on disk.

What is the difference between GROUP BY and DISTINCT?
GROUP BY divides a table into groups. Groups can consist of column names or results or computed columns. GROUP BY was added to SQL because aggregate functions (like SUM) return the aggregate of all column values every time they are called, and without the GROUP BY function it was impossible to find the sum for each individual group of column values.
This "Sales" table:

Company     Amount
---------   ------
W3Schools   5500
IBM         4500
W3Schools   7100

And this SQL: SELECT Company, SUM(Amount) FROM Sales

would, in database systems that accept it, return a meaningless result that repeats the grand total on every row:

Company     SUM(Amount)
---------   -----------
W3Schools   17100
IBM         17100
W3Schools   17100

In SQL Server the statement is simply rejected, because the Company column is returned but is not part of an aggregate. A GROUP BY clause solves this problem:

SELECT Company, SUM(Amount) FROM Sales
GROUP BY Company

Returns this result:

Company     SUM(Amount)
---------   -----------
W3Schools   12600
IBM         4500

DISTINCT: The SELECT keyword allows us to grab all information from a column (or columns) of a table, which necessarily means there may be redundancies. What if we only want each distinct element? This is easy to accomplish in SQL: all we need to do is add DISTINCT after SELECT.

Example: select companies from the Orders table. A simple table of purchase orders:

Company       OrderNumber
-----------   -----------
Sega          3412
DeveloperIQ   2312
Trio          4678
DeveloperIQ   6798

This SQL statement: SELECT Company FROM Orders

Will return this result:

Company
-----------
Sega
DeveloperIQ
Trio
DeveloperIQ

Note that the company DeveloperIQ is listed twice in the result. Sometimes we don't want that.

Example: Select Distinct Companies from Orders
This SQL statement: SELECT DISTINCT Company FROM Orders

Will return this result:

Company
-----------
Sega
DeveloperIQ
Trio

In SQL query, what is @@ERROR used for?
It is used to track errors in stored procedures and triggers. If @@ERROR = 0, the last statement executed without error; a non-zero value is the error number raised by the last statement.

When an error is encountered within a stored procedure, the best you can do (assuming it’s a non-fatal error) is halt the sequential processing of the code and either branch to another code segment in the procedure or return processing to the calling application. Notice that the previous sentence is specific to non-fatal errors. There are two types of errors in SQL Server: fatal and non-fatal. Fatal errors cause a procedure to abort processing and terminate the connection with the client application. Non-fatal errors do not abort processing a procedure or affect the connection with the client application. When a non-fatal error occurs within a procedure, processing continues on the line of code that follows the one that caused the error.

You might be wondering what actions cause fatal errors. Unfortunately, the actions that cause a fatal error are not well documented. Each error has an associated severity level, a value between 0 and 25. Errors with a severity level of 20 or above are all fatal, but below that value there is no well-defined rule as to which errors are fatal. In truth, though, worrying about which errors are fatal is a bit useless, because there is no code you can implement that will allow you to handle them gracefully. Of course, you can use proactive coding to make sure fatal errors do not occur. For example, if your application allows users to type in the name of the table on which a query is based, you can verify its existence before referencing it with dynamic SQL.

@@ERROR
The @@ERROR system function is used to implement error handling code. It contains the error ID produced by the last SQL statement executed during a client’s connection. When a statement executes successfully, @@ERROR contains 0. To determine if a statement executes successfully, an IF statement is used to check the value of the function immediately after the target statement executes. It is imperative that @@ERROR be checked immediately after the target statement, because its value is reset when the next statement executes successfully.
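The check described above can be sketched as follows (the table name is illustrative):

```sql
UPDATE Orders SET Total = Total * 1.1 WHERE OrderID = 42;

-- @@ERROR must be tested immediately: the next statement
-- that executes successfully resets it to 0.
IF @@ERROR <> 0
BEGIN
    PRINT 'The UPDATE failed; branching to error-handling code.';
    RETURN;
END
```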

RAISERROR
The RAISERROR statement is used to produce an ad hoc error message or to retrieve a custom message that is stored in the sysmessages table. You can use this statement with the error handling code presented in the previous section to implement custom error messages in your applications.

Adding a Permanent Custom Message
If you have a message that is going to be used frequently, it is more efficient to add it to the sysmessages table and reference it by its unique ID. The system stored procedure sp_addmessage adds an error message to sysmessages.
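Both uses of RAISERROR can be sketched as follows (the message text and ID are illustrative; custom message IDs must be 50001 or higher):

```sql
-- Ad hoc message: severity 16, state 1, with a substitution argument.
RAISERROR ('Customer %d was not found.', 16, 1, 42);

-- Permanent message: add it once, then raise it by ID.
EXEC sp_addmessage @msgnum = 50001, @severity = 16,
     @msgtext = 'Customer %d was not found.';
RAISERROR (50001, 16, 1, 42);
```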

What is SQL Server isolation level? Name the 4 SQL Server isolation levels.
Isolation Level
The property of a transaction that controls the degree to which data is isolated for use by one process and guarded against interference from other processes. Setting the isolation level defines the default locking behavior for all SELECT statements in your SQL Server session.

SET TRANSACTION ISOLATION LEVEL
Controls the default transaction locking behavior for all Microsoft SQL Server SELECT statements issued by a connection.

This example sets the TRANSACTION ISOLATION LEVEL for the session. For each Transact-SQL statement that follows, SQL Server holds all of the shared locks until the end of the transaction.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
GO
BEGIN TRANSACTION
SELECT * FROM publishers
SELECT * FROM authors
...
COMMIT TRANSACTION

READ COMMITTED
Specifies that shared locks are held while the data is being read to avoid dirty reads, but the data can be changed before the end of the transaction, resulting in nonrepeatable reads or phantom data. This option is the SQL Server default.

READ UNCOMMITTED
Implements dirty read, or isolation level 0 locking, which means that no shared locks are issued and no exclusive locks are honored. When this option is set, it is possible to read uncommitted or dirty data; values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the four isolation levels.

REPEATABLE READ
Locks are placed on all data that is used in a query, preventing other users from updating the data, but new phantom rows can be inserted into the data set by another user and are included in later reads in the current transaction. Because concurrency is lower than the default isolation level, use this option only when necessary.

SERIALIZABLE
Places a range lock on the data set, preventing other users from updating or inserting rows into the data set until the transaction is complete. This is the most restrictive of the four isolation levels. Because concurrency is lower, use this option only when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.


Dirty Reads: Reads that contain uncommitted data. For example, transaction1 changes a row. Transaction2 reads the changed row before transaction1 commits the change. If transaction1 rolls back the change, transaction2 has read a row that never logically existed.
Phantom: the insertion of a new row, or the deletion of an existing row, by one task within a range of rows previously read by another task that has not yet committed its transaction. The task with the uncommitted transaction cannot repeat its original read because the number of rows in the range has changed. If a connection sets its transaction isolation level to SERIALIZABLE, SQL Server uses key-range locking to prevent phantoms.
Nonrepeatable Reads: When a transaction reads the same row more than one time, and between the two (or more) reads, a separate transaction modifies that row. Because the row was modified between reads within the same transaction, each read produces different values, which introduces inconsistency.

What is the difference between UNION and UNION ALL?  Which is faster?
The UNION ALL command is used to combine the results of two queries. The difference between UNION ALL and UNION is that, while UNION drops duplicate rows and returns only distinct values, UNION ALL returns all rows. UNION ALL is faster than UNION, because UNION must do the extra work of sorting the combined result and removing duplicates.
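A minimal sketch, assuming two hypothetical tables that each have a Name column:

```sql
-- UNION: duplicate rows across the two results are removed.
SELECT Name FROM RetailCustomers
UNION
SELECT Name FROM OnlineCustomers;

-- UNION ALL: every row from both results is kept (faster).
SELECT Name FROM RetailCustomers
UNION ALL
SELECT Name FROM OnlineCustomers;
```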

Is there a better way to get the record count on a table than using SELECT COUNT (*)?  Provide some alternatives that may provide a faster response on a large table:
SELECT COUNT(*) scans the table (or its smallest index), which can be slow on a large table. Two common alternatives: if the table has an IDENTITY column and rows are never deleted, SELECT MAX on that column approximates the count without a scan; more generally, the row count that SQL Server already maintains can be read from the sysindexes system table, which returns nearly instantly (though the value may be slightly out of date).
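One faster alternative is to read the row count that SQL Server maintains in the sysindexes system table; a sketch (the table name is illustrative, and the count may lag slightly behind the true value):

```sql
-- indid 0 = heap, indid 1 = clustered index; either row
-- carries the maintained row count for the table.
SELECT rows
FROM sysindexes
WHERE id = OBJECT_ID('Orders') AND indid < 2;
```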

What does DTS mean?  What is it used for?
DTS is a feature of Microsoft SQL Server. It provides the functionality to import, export and transform data between Microsoft SQL Server and any OLE DB, Open Database Connectivity (ODBC), or text file format. For example, importing catalog data from the Catalog database to the Data Warehouse is a DTS task. After you import data into the Data Warehouse, business managers can analyze the data.

You create a DTS solution as one or more packages. Each package may contain an organized set of tasks that define work to be performed, transformations on data and objects, workflow constraints that define task execution, and connections to data sources and destinations. DTS packages also provide services, such as logging package execution details, controlling transactions, and handling global variables.

These tools are available for creating and executing DTS packages:

·          The Import/Export Wizard is for building relatively simple DTS packages, and supports data migration and simple transformations.
·          The DTS Designer graphically implements the DTS object model, allowing you to create DTS packages with a wide range of functionality.
·          DTSRun is a command-prompt utility used to execute existing DTS packages.
·          DTSRunUI is a graphical interface to DTSRun, which also allows the passing of global   variables and the generation of command lines.
·          SQLAgent is not a DTS application; however, it is used by DTS to schedule package execution.

Using the DTS object model, you also can create and run packages programmatically, build custom tasks, and build custom transformations.

What are advantages and disadvantages of Microsoft-provided data provider classes in ADO.NET?
The SQL Server .NET data provider is high-speed and robust, but works only against SQL Server, which requires a license purchased from Microsoft. The OLE DB .NET provider is universal for accessing other sources, like Oracle, DB2, Microsoft Access and Informix, but it is a .NET layer on top of the OLE DB layer, so it is not the fastest thing in the world. The ODBC .NET provider is a compatibility layer provided for legacy ODBC drivers.

What’s the role of the DataReader class in ADO.NET connections?
It returns a read-only, forward-only stream of rows from the data source when the command is executed; unlike a DataSet, it does not hold the results in memory.

What is the wildcard character in SQL? Let’s say you want to query database with LIKE for all employees whose name starts with La.
The wildcard character is %, the proper query with LIKE would involve ‘La%’.

To broaden the selections of a structured query language (SQL) statement, two wildcard characters, the percent sign (%) and the underscore (_), can be used. The percent sign is analogous to the asterisk (*) wildcard character used with MS-DOS. The percent sign allows for the substitution of one or more characters in a field. The underscore is similar to the MS-DOS wildcard question mark (?) character. The underscore allows for the substitution of a single character in an expression. The following SELECT statement lists all the records that have an "X" or a "Y" as the last character in the STATE field. The output will include records from New York and Texas.

SELECT * FROM clients WHERE state LIKE '_Y' OR state LIKE '_X'
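The LIKE query from the question above can be sketched as follows (the table and column names are illustrative):

```sql
-- All employees whose name starts with 'La' (e.g. Larson, Lane).
SELECT * FROM employees WHERE name LIKE 'La%';
```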

What connections does Microsoft SQL Server support? Which one is trusted and which one is untrusted?
Windows Authentication (via Active Directory) and SQL Server authentication (via Microsoft SQL Server username and passwords). Windows Authentication is trusted because the username and password are checked with the Active Directory; the SQL Server authentication is untrusted, since SQL Server is the only verifier participating in the transaction.

Why would you use untrusted verification?
An untrusted user is one who attempts a remote login to the local server. Web Services might use it, as well as non-Windows applications.

What’s the data provider name to connect to Access database?
Microsoft.Jet.OLEDB.4.0

What is a pre-requisite for connection pooling?
Connections can be pooled together only when they use exactly the same connection string, where every parameter is the same, including the security settings; each distinct connection string gets its own pool.

Connection Pooling: A performance optimization based on using collections of pre-allocated resources, such as objects or database connections. Pooling results in more efficient resource allocation.

Pooling connections can significantly enhance the performance and scalability of your application. The .NET Framework Data Provider for SQL Server provides connection pooling automatically for your ADO.NET client application. You can also supply several connection string modifiers to control connection pooling behavior.
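The connection string modifiers mentioned above are supplied in the connection string itself; a sketch (the server, database, and values are illustrative):

```
Server=myServer;Database=myDb;Integrated Security=SSPI;
Min Pool Size=5;Max Pool Size=100;Connection Lifetime=300;Pooling=true;
```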

Pool Creation and Assignment
When a connection is opened, a connection pool is created based on an exact matching algorithm that associates the pool with the connection string in the connection. Each connection pool is associated with a distinct connection string. When a new connection is opened, if the connection string is not an exact match to an existing pool, a new pool is created. Once created, connection pools are not destroyed until the active process ends. Maintenance of inactive or empty pools involves minimal system overhead.

Connection Addition
A connection pool is created for each unique connection string. When a pool is created, multiple connection objects are created and added to the pool so that the minimum pool size requirement is satisfied. Connections are added to the pool as needed, up to the maximum pool size.

When a SqlConnection object is requested, it is obtained from the pool if a usable connection is available. To be usable, the connection must currently be unused, have a matching transaction context or not be associated with any transaction context, and have a valid link to the server.

If the maximum pool size has been reached and no usable connection is available, the request is queued. The Connection Pooler satisfies these requests by reallocating connections as they are released back into the pool. Connections are released back into the pool when you call Close or Dispose on the Connection.

NOTE: It is recommended that you always close the Connection when you are finished using it in order for the connection to be returned to the pool. This can be done using either the Close or Dispose methods of the Connection object. Connections that are not explicitly closed might not be added or returned to the pool. For example, a connection that has gone out of scope but that has not been explicitly closed will only be returned to the connection pool if the maximum pool size has been reached and the connection is still valid.

Connection Removal
The connection pooler will remove a connection from the pool if the connection lifetime has expired, or if the pooler detects that the connection with the server has been severed. Note that this can be detected only after attempting to communicate with the server. If a connection is found that is no longer connected to the server, it is marked as invalid. The connection pooler periodically scans connection pools looking for objects that have been released to the pool and are marked as invalid. These connections are then permanently removed. If a connection exists to a server that has disappeared, it is possible for this connection to be drawn from the pool even if the connection pooler has not detected the severed connection and marked it as invalid. When this occurs, an exception is generated. However, you must still close the connection in order to release it back into the pool.




Happy Programming!!
Don’t forget to leave your feedback and comments below!
If you have any query mail me to Sujeet.bhujbal@gmail.com     
Regards
Sujeet Bhujbal
--------------------------------------------------------------------------------
----------------------------------------------------------------------------------


