SQL Server Interview Questions Part 3

Hi friends,

Its time of interview :) :) so that i am posting new article on SQL SERVER Interview

Questions

1.What is a "trigger"?

Microsoft SQL Server includes support for a special type of stored procedure called a trigger. A trigger is a stored procedure that executes whenever an update, delete or insert statement is executed against a table or a view. Triggers are created in order to enforce integrity rules in a database. In other words, you can associate a trigger with a table in such a way that it fires whenever a change is made to the contents of the table. Basically, trigger is a set of SQL statements that execute in response to a data modification/retrieval event on a table. A trigger is a solution to the restrictions of a constraint. For instance :

· A database column cannot carry PSEUDO columns as criteria where a trigger can.

· A database constraint cannot refer old and new values for a row where a trigger can.

Also, other than table triggers, there are also schema and database triggers. These can be made to fire when new objects are created, when a user logs in, when the database shutdown etc. Table level triggers can be classified into row and statement level triggers and those can be further broken down into before and after triggers.

· In SQL Server 6.5 you could define only 3 triggers per table, one for INSERT, one for UPDATE and one for DELETE.

· From SQL Server 7.0 onwards, this restriction is gone, and you could create multiple triggers per each action. But in 7.0 there’s no way to control the order in which the triggers fire.

· In SQL Server 2000 you could specify which trigger fires first or fires last using sp_settriggerorder. Triggers can’t be invoked on demand. They get triggered only when an associated action (INSERT, UPDATE, and DELETE) happens on the table on which they are defined.

Triggers are generally used to implement business rules, auditing. Triggers can also be used to extend the referential integrity checks, but wherever possible, use constraints for this purpose, instead of triggers, as constraints are much faster. Till SQL Server 7.0, triggers fire only after the data modification operation happens. So in a way, they are called post triggers. But in SQL Server 2000 you could create pre triggers also.

What is "index covering" of a query?

A non-clustered index that includes (or covers) all columns used in a query is called a covering index. When SQL server can use a non-clustered index to resolve the query, it will prefer to scan the index rather than the table, which typically takes fewer data pages. If your query uses only columns included in the index, then SQL server may scan this index to produce the desired output.

What types of join algorithms can you have?

You can dynamically relate different tables by applying what's known as a join. Technically, a join is the operation used when selecting data to create these relationships at retrieval time. What that means to you is that a join determines which records are selected and which aren't. In this article, we'll introduce you to several types of joins supported by SQL and show you how to correctly apply them to get the data you need.

What that means is that a join is conditional—similar to a WHERE clause or criteria expression—in that the join specifies which records (rows) are selected in both tables. Certainly, the type of join significantly impacts which records are retrieved or acted upon. For the most part, a join can be specified in a FROM or WHERE clause, so be careful where you specify them. Because the data engine executes the clauses in a specific sequence, placement can affect the results.

Generally, the data engine fulfills the FROM clause before applying any other clauses. Adding the join here eliminates records before they're evaluated by any other clauses. As a rule, you'll find this is the recommended method.

Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table. Types of joins: INNER JOIN, OUTER JOIN, and CROSS JOIN. OUTER JOIN’s are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.

What is a SQL View?

A view can be thought of as either a virtual table or a stored query. The data accessible through a view is not stored in the database as a distinct object. What is stored in the database is a SELECT statement. The result set of the SELECT statement forms the virtual table returned by the view. A user can use this virtual table by referencing the view name in Transact-SQL statements the same way a table is referenced. A view is used to do any or all of these functions:

· Restrict a user to specific rows in a table. For example, allow an employee to see only the rows recording his or her work in a labor-tracking table.

· Restrict a user to specific columns. For example, allow employees who do not work in payroll to see the name, office, work phone, and department columns in an employee table, but do not allow them to see any columns with salary information or personal information.

· Join columns from multiple tables so that they look like a single table.

· Aggregate information instead of supplying details. For example, present the sum of a column, or the maximum or minimum value from a column.

Views are created by defining the SELECT statement that retrieves the data to be presented by the view. The data tables referenced by the SELECT statement are known as the base tables for the view. In this example, titleview in the pubs database is a view that selects data from three base tables to present a virtual table of commonly needed data:

CREATE VIEW titleview

AS

SELECT title, au_ord, au_lname, price, ytd_sales,
  pub_id

FROM authors AS a

     JOIN
  titleauthor AS ta ON (a.au_id = ta.au_id)

     JOIN
  titles AS t ON (t.title_id = ta.title_id)

You can then reference titleview in statements in the same way you would reference a table:

SELECT *

FROM titleview

A view can reference another view. For example, titleview presents information that is useful for managers, but a company typically discloses year-to-date figures only in quarterly or annual financial statements. A view can be built that selects all the titleview columns except au_ord and ytd_sales. This new view can be used by customers to get lists of available books without seeing the financial information:

CREATE VIEW Cust_titleview

AS

SELECT title, au_lname, price, pub_id

FROM titleview

Views in all versions of SQL Server are updatable (can be the target of UPDATE, DELETE, or INSERT statements), as long as the modification affects only one of the base tables referenced by the view, for example:

-- Increase the prices for publisher '0736' by 10%.

UPDATE titleview

SET price = price * 1.10

WHERE pub_id = '0736'

GO

SQL Server 2000 supports more complex types of INSERT, UPDATE, and DELETE statements that reference views. INSTEAD OF triggers can be defined on a view to specify the individual updates that must be performed against the base tables to support the INSERT, UPDATE, or DELETE statement. Also, partitioned views support INSERT, UPDATE, and DELETE statements that modify multiple member tables referenced by the view.

Indexed views are a SQL Server 2000 feature that greatly improves the performance of complex views of the type usually found in data warehouses or other decision support systems.

Views are called virtual tables because the result set of a view is not usually saved in the database. The result set for a view is dynamically incorporated into the logic of the statement and the result set is built dynamically at run time.

What is a Primary Key?

Primary Key is a unique column in the table that identifies each row. There cannot be more than 1 row in the table with the same primary key. The primary key contains a unique identifier to maintain each record's unique identity. Primary keys field can include an employee ID, part number, or customer number. Typically, you specify which column contains the primary key when you create a database table.

Define candidate key, alternate key and composite key.

A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes the primary key of the table. If the table has more than one candidate key, one of them will become the primary key, and the rest are called alternate keys. A key formed by combining at least two or more columns is called composite key.

What’s the difference between a primary key and a unique key?

Both primary key and unique key enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, whereas unique key creates a non-clustered index by default. Another difference is that, primary key doesn’t allow NULL, but unique key allows one NULL only.

What is a "constraint"?

A constraint allows you to apply simple referential integrity checks to a table. There are four primary types of constraints that are currently supported by SQL Server:

· PRIMARY/UNIQUE - enforces uniqueness of a particular table column.

· DEFAULT - specifies a default value for a column in case an insert operation does not provide one.

· FOREIGN KEY - validates that every value in a column exists in a column of another table.

· CHECK - checks that every value stored in a column is in some specified list

Each type of constraint performs a specific type of action. There are five kinds of constraints in all, Primary key, Foreign key, Unique, Check and Default.

What action do you have to perform before retrieving data from the next result set of a

 stored procedure?

Move the cursor down one row from its current position. A Result Set cursor is initially positioned before the first row. Before you can get to the first row, you would need to Move the cursor down by one row.

With certain database systems, a stored procedure can return multiple result sets, multiple update counts, or some combination of both. Also, if you are providing a user with the ability to enter any SQL statement, you don't know if you are going to get a ResultSet or an update count back from each statement, without analyzing the contents. The Statement.execute() method helps in these cases. Method Statement.execute() returns a Boolean to tell you the type of response:

· true indicates next result is a ResultSet. Use Statement.getResultSet to get the ResultSet

· false indicates next result is an update count. Use Statement.getUpdateCount to get the update count

· false also indicates no more results. Update count is -1 when no more results (usually 0 or positive)

After processing each response, you use Statement.getMoreResults to check for more results, again returning a Boolean. The following demonstrates the processing of multiple result sets:

boolean result = stmt.execute(" ... ");

int updateCount = stmt.getUpdateCount();

while (result || (updateCount != -1)) {

if(result) {

ResultSet r = stmt.getResultSet();

// process result set

} else if(updateCount != -1) {

// process update count

}

result = stmt.getMoreResults();

updateCount = stmt.getUpdateCount(); }

Briefly explain about Database.

A collection of information organized in such a way that a computer program can quickly select desired pieces of data. You can think of a database as an electronic filing system. Traditional databases are organized by fields, records, and files. A field is a single piece of information; a record is one complete set of fields; and a file is a collection of records. For example, a telephone book is analogous to a file. It contains a list of records, each of which consists of three fields: name, address, and telephone number.

An alternative concept in database design is known as Hypertext. In a Hypertext database, any object, whether it be a piece of text, a picture, or a film, can be linked to any other object. Hypertext databases are particularly useful for organizing large amounts of disparate information, but they are not designed for numerical analysis.

To access information from a database, you need a database management system (DBMS). This is a collection of programs that enables you to enter, organize, and select data in a database.

Briefly explain about DBMS.

A collection of programs that enables you to store, modify, and extract information from a database. There are many different types of DBMSs, ranging from small systems that run on personal computers to huge systems that run on mainframes. The following are examples of database applications:

· computerized library systems

· automated teller machines

· flight reservation systems

· computerized parts inventory systems

From a technical standpoint, DBMSs can differ widely. The terms relational, network, flat, and hierarchical all refer to the way a DBMS organizes information internally. The internal organization can affect how quickly and flexibly you can extract information.

Requests for information from a database are made in the form of a query, which is a stylized question. For example, the query

SELECT ALL WHERE NAME = "SMITH" AND AGE > 35

requests all records in which the NAME field is SMITH and the AGE field is greater than 35. The set of rules for constructing queries is known as a query language. Different DBMSs support different query languages, although there is a semi-standardized query language called SQL (structured query language). Sophisticated languages for managing database systems are called fourth-generation languages, or 4GLs for short.

The information from a database can be presented in a variety of formats. Most DBMSs include a report writer program that enables you to output data in the form of a report. Many DBMSs also include a graphics component that enables you to output information in the form of graphs and charts.

Briefly explain about RDBMS.

Short for relational database management system and pronounced as separate letters, a type of database management system (DBMS) that stores data in the form of related tables. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database. As a result, the same database can be viewed in many different ways.

An important feature of relational systems is that a single database can be spread across several tables. This differs from flat-file databases, in which each database is self-contained in a single table. Almost all full-scale database systems are RDBMS's. Small database systems, however, use other designs that provide less flexibility in posing queries.

What is normalization? Explain different levels of normalization?

In a relational database design, the process of organizing data to minimize redundancy is termed as Normalization. Normalization usually involves dividing a database into two or more tables and defining relationships between the tables. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.

Normalization can be viewed as a series of steps (i.e., levels) designed, one after another, to deal with ways in which tables can be "too complicated for their own good". The purpose of normalization is to reduce the chances for anomalies to occur in a database. The definitions of the various levels of normalization illustrate complications to be eliminated in order to reduce the chances of anomalies. At all levels and in every case of a table with a complication, the resolution of the problem turns out to be the establishment of two or more simpler tables which, as a group, contain the same information as the original table but which, because of their simpler individual structures, lack the complication.

There are three main normal forms, each with increasing levels of normalization:

· First Normal Form (1NF): We must eliminate repeating groups. That means any subgroups of data that appear within the record should be spilt into separate tables. Each field in a table contains different information. For example, in an employee list, each table would contain only one birthdate field.

· Second Normal Form (2NF): By eliminating partial dependencies. Partial dependencies means that data does not depend on the primary key of the table to uniquely identify it. Each field in a table that is not a determiner of the contents of another field must itself be a function of the other fields in the table.

· Third Normal Form (3NF): Eliminate all transitive (i.e. hidden) dependencies. In other words every column that isn’t part of the key must depend on the key for its information al value. No duplicate information is permitted. So, for example, if two tables both require a birthdate field, the birthdate information would be separated into a separate table, and the two other tables would then access the birthdate information via an index field in the birthdate table. Any change to a birthdate would automatically be reflecting in all tables that link to the birthdate table.

There are additional normalization levels, such as Boyce Codd Normal Form (BCNF), fourth normal form (4NF) and fifth normal form (5NF). While normalization makes databases more efficient to maintain, they can also make them more complex because data is separated into so many different tables.

· Boyce-Codd Normal Form (BCNF): A table is in BCNF if it is in 3NF and if every determinant is a candidate key.

· 4th Normal Form (4NF): A table is in 4NF if it is in BCNF and if it has no multi-valued dependencies.

· 5th Normal Form (5NF): A table is in 5NF, also called "Projection-Join Normal Form" (PJNF), if it is in 4NF and if every join dependency in the table is a consequence of the candidate keys of the table.

· Domain-Key Normal Form (DKNF): A table is in DKNF if every constraint on the table is a logical consequence of the definition of keys and domains.

What is denormalization and when would you go for it?

As the name indicates, denormalization is the reverse process of normalization. It's the controlled introduction of redundancy in to the database design. It helps improve the query performance as the number of joins could be reduced.

What is an index? What are the types of indexes? How many clustered indexes can be

created on a table? I created a separate index on each column of a table.

What are the advantages and disadvantages of this approach?

Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve the data quicker. Indexes are of two types. Clustered indexes and non-clustered indexes. When you create a 11index on a table, all the rows in the table are stored in the order of the clustered index key. So, there can be only one clustered index per table. Non-clustered indexes have their own storage separate from the table data storage. Non-clustered indexes are stored as B-tree structures (so do clustered indexes), with the leaf level nodes having the index key and its row locater. The row located could be the RID or the Clustered index key, depending up on the absence or presence of clustered index on the table.

If you create an index on each column of a table, it improves the query performance, as the query optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same time, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, as every time data changes in the table, all the indexes need to be updated. Another disadvantage is that, indexes need disk space, the more indexes you have, more disk space is used.

What is RAID and what are different types of RAID configurations?

Note: RAID is available only on Microsoft Windows NT 4.0 and Microsoft Windows 2000.

RAID (redundant array of independent disks) is a disk system that comprises multiple disk drives (an array) to provide higher performance, reliability, storage capacity, and lower cost. Fault-tolerant arrays are categorized in six RAID levels, 0 through 5. Each level uses a different algorithm to implement fault tolerance.

Although RAID is not a part of Microsoft® SQL Server™ 2000, its implementation can directly affect the way SQL Server performs. RAID levels 0, 1, and 5 are typically used with SQL Server.

A hardware disk array improves I/O performance because I/O functions, such as striping and mirroring, are handled efficiently in firmware. Conversely, an operating system–based RAID offers lower cost but consumes processor cycles. When cost is a consideration and redundancy and high performance are required, Microsoft Windows® NT® stripe sets with parity or Windows 2000 RAID-5 volumes are a good solution.

Data striping (RAID 0) is the RAID configuration with the highest performance, but if one disk fails, all the data on the stripe set becomes inaccessible. A common installation technique for relational database management systems is to configure the database on a RAID 0 drive and then place the transaction log on a mirrored drive (RAID 1). You can get the best disk I/O performance for the database and maintain data recoverability (assuming you perform regular database backups) through a mirrored transaction log.

If data must be quickly recoverable, consider mirroring the transaction log and placing the database on a RAID 5 disk. RAID 5 provides redundancy of all data on the array, allowing a single disk to fail and be replaced in most cases without system downtime. RAID 5 offers lower performance than RAID 0 or RAID 1 but higher reliability and faster recovery.

What are the steps you will take to improve performance of a poor performing query?

There could be a lot of reasons behind the poor performance of a query.

But some general issues to talk about would be:

No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization, excess usage of cursors and temporary tables.

Some of the tools/ways that help you troubleshooting performance problems are:

SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance monitor, Graphical execution plan in Query Analyzer.

Write down the general syntax for a SELECT statement covering all the options.

SELECT select_list

[INTO new_table_]

FROM table_source

[WHERE search_condition]

[GROUP BY group_by_expression]

[HAVING search_condition]

[ORDER BY order_expression [ASC | DESC] ]

What is a self join? Explain it with an example.

Self-joins are useful in SQL select statements and are often used to identify duplicate entries or entries linked in other ways such as customers who share the same telephone number or patients who share the same address. A self-join is an internal table (or view) join. The self-join is between fields in a table and fields within a virtual copy of the table.

Self join is just like any other join, except that two instances of the same table will be joined in the query. Here is an example: Employees table which contains rows for normal employees as well as managers. So, to find out the managers of all the employees, you need a self join.

CREATE TABLE emp

(

empid int,

mgrid int,

empname char(10)

)

INSERT emp SELECT 1,2,'Vyas'

INSERT emp SELECT 2,3,'Mohan'

INSERT emp SELECT 3,NULL,'Shobha'

INSERT emp SELECT 4,2,'Shridhar'

INSERT emp SELECT 5,2,'Sourabh'

SELECT t1.empname [Employee], t2.empname [Manager]

FROM emp t1, emp t2

WHERE t1.mgrid = t2.empid

What is the difference between GROUP BY and DISTINCT?

GROUP BY divides a table into groups. Groups can consist of column names or results or computed columns. GROUP BY was added to SQL because aggregate functions (like SUM) return the aggregate of all column values every time they are called, and without the GROUP BY function it was impossible to find the sum for each individual group of column values.

This "Sales" Table:

Company	Amount
W3Schools	5500
IBM	4500
W3Schools	7100

And This SQL: SELECT Company, SUM (Amount) FROM Sales

Returns this result:

Company	SUM(Amount)
W3Schools	17100
IBM	17100
W3Schools	17100

The above code is invalid because the column returned is not part of an aggregate. A GROUP BY clause will solve this problem:

SELECT Company, SUM (Amount) FROM Sales

GROUP BY Company

Returns this result:

Company	SUM(Amount)
W3Schools	12600
IBM	4500

DISTINCT: The SELECT keyword allows us to grab all information from a column (or columns) on a table. This, of course, necessarily means that there will be redundancies. What if we only want to select each DISTINCT element? This is easy to accomplish in SQL. All we need to do is to add DISTINCT after SELECT.

Example: Select Companies from Order Table: Simple Table of Purchase Orders:

Company	OrderNumber
Sega	3412
DeveloperIQ	2312
Trio	4678
DeveloperIQ	6798

This SQL statement: SELECT Company FROM Orders

Will return this result:

Company

Sega

DeveloperIQ

Trio

DeveloperIQ

Note that the company DeveloperIQ is listed twice in the result. Sometimes we don't want that.

Example: Select Distinct Companies from Orders

This SQL statement: SELECT DISTINCT Company FROM Orders

Will return this result:

Company

Sega

DeveloperIQ

Trio

In SQL query, what is @@ERROR used for?

It is used to track the error in Stored Procedure and Triggers. For example if @@ERROR = NULL, it means there is no error in Stored Procedure and Triggers.

When an error is encountered within a stored procedure, the best you can do (assuming it’s a non-fatal error) is halt the sequential processing of the code and either branch to another code segment in the procedure or return processing to the calling application. Notice that the previous sentence is specific to non-fatal errors. There are two types of errors in SQL Server: fatal and non-fatal. Fatal errors cause a procedure to abort processing and terminate the connection with the client application. Non-fatal errors do not abort processing a procedure or affect the connection with the client application. When a non-fatal error occurs within a procedure, processing continues on the line of code that follows the one that caused the error.

You might be wondering what actions cause fatal errors. Unfortunately, the actions that cause a fatal error are not well documented. Each error has an associated severity level that is a value between 0–25. The errors with a severity level of 20 or above are all fatal, but once you get below this value there is no well-defined rule as to which errors are fatal. In truth, though, worrying about which errors are fatal is a bit useless because there is no code you can implement that will allow you to handle them gracefully. Of course, you can use pro-active coding to make sure fatal-errors do not occur. For example, if your application allows users to type in the name of the table on which a query is based you can verify its existence before referencing it with dynamic SQL.

@@ERROR

The @@ERROR system function is used to implement error handling code. It contains the error ID produced by the last SQL statement executed during a client’s connection. When a statement executes successfully, @@ERROR contains 0. To determine if a statement executes successfully, an IF statement is used to check the value of the function immediately after the target statement executes. It is imperative that @@ERROR be checked immediately after the target statement, because its value is reset when the next statement executes successfully.

RAISERROR

The RAISERROR statement is used to produce an ad hoc error message or to retrieve a custom message that is stored in the sysmessages table. You can use this statement with the error handling code presented in the previous section to implement custom error messages in your applications.

Adding a Permanent Custom Message

If you have a message that is going to be used frequently, it is more efficient to add it to the sysmessages table and reference it by its unique ID. The system stored procedure sp_addmessages adds an error message to sysmessages.

If you have any doubt please reply to this post else mail me on sujeet.bhujbal@gmail.com

Regards

Sujeet

Pages - Menu

Tuesday, 8 May 2012

SQL Server Interview Questions Part 3

No comments:

Post a Comment