MICROSOFT LEARNING PRODUCT
6232B
Volume 1
Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

The names of manufacturers, products, or URLs are provided for informational purposes only, and Microsoft makes no representations or warranties, either expressed, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links may be provided to third party sites. Such sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any linked site or any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by Microsoft of the site or the products contained therein.

© 2011 Microsoft Corporation. All rights reserved.

Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.
MICROSOFT LICENSE TERMS
OFFICIAL MICROSOFT LEARNING PRODUCTS - TRAINER EDITION
Pre-Release and Final Release Versions
These license terms are an agreement between Microsoft Corporation and you. Please read them. They apply to the Licensed Content named above, which includes the media on which you received it, if any. The terms also apply to any Microsoft updates, supplements, Internet-based services, and support services
for this Licensed Content, unless other terms accompany those items. If so, those terms apply. By using the Licensed Content, you accept these terms. If you do not accept them, do not use the Licensed Content. If you comply with these license terms, you have the rights below.
1. DEFINITIONS.

a. Academic Materials means the printed or electronic documentation such as manuals, workbooks, white papers, press releases, datasheets, and FAQs which may be included in the Licensed Content.

b. Authorized Learning Center(s) means a Microsoft Certified Partner for Learning Solutions location, an IT Academy location, or such other entity as Microsoft may designate from time to time.

c. Authorized Training Session(s) means those training sessions authorized by Microsoft and conducted at or through Authorized Learning Centers by a Trainer providing training to Students solely on Official Microsoft Learning Products (formerly known as Microsoft Official Curriculum or MOC) and Microsoft Dynamics Learning Products (formerly known as Microsoft Business Solutions Courseware). Each Authorized Training Session will provide training on the subject matter of one (1) Course.

d. Course means one of the courses using Licensed Content offered by an Authorized Learning Center during an Authorized Training Session, each of which provides training on a particular Microsoft technology subject matter.

e. Device(s) means a single computer, device, workstation, terminal, or other digital electronic or analog device.

f. Licensed Content means the materials accompanying these license terms. The Licensed Content may include, but is not limited to, the following elements: (i) Trainer Content, (ii) Student Content, (iii) classroom setup guide, and (iv) Software. There are different and separate components of the Licensed Content for each Course.

g. Software means the Virtual Machines and Virtual Hard Disks, or other software applications that may be included with the Licensed Content.

h. Student(s) means a student duly enrolled for an Authorized Training Session at your location.

i. Student Content means the learning materials accompanying these license terms that are for use by Students and Trainers during an Authorized Training Session. Student Content may include labs, simulations, and courseware files for a Course.

j. Trainer(s) means a) a person who is duly certified by Microsoft as a Microsoft Certified Trainer and b) such other individual as authorized in writing by Microsoft and has been engaged by an Authorized Learning Center to teach or instruct an Authorized Training Session to Students on its behalf.

k. Trainer Content means the materials accompanying these license terms that are for use by Trainers and Students, as applicable, solely during an Authorized Training Session. Trainer Content may include Virtual Machines, Virtual Hard Disks, Microsoft PowerPoint files, instructor notes, and demonstration guides and script files for a Course.

l. Virtual Hard Disks means Microsoft Software that is comprised of virtualized hard disks (such as a base virtual hard disk or differencing disks) for a Virtual Machine that can be loaded onto a single computer or other device in order to allow end-users to run multiple operating systems concurrently. For the purposes of these license terms, Virtual Hard Disks will be considered Trainer Content.

m. Virtual Machine means a virtualized computing experience, created and accessed using Microsoft Virtual PC or Microsoft Virtual Server software, that consists of a virtualized hardware environment, one or more Virtual Hard Disks, and a configuration file setting the parameters of the virtualized hardware environment (e.g., RAM). For the purposes of these license terms, Virtual Machines will be considered Trainer Content.

n. you means the Authorized Learning Center or Trainer, as applicable, that has agreed to these license terms.
2. OVERVIEW.
a. Licensed Content. The Licensed Content includes Software, Academic Materials (online and electronic), Trainer Content, Student Content, classroom setup guide, and associated media.

b. License Model. The Licensed Content is licensed on a per copy per Authorized Learning Center location or per Trainer basis.
3. INSTALLATION AND USE RIGHTS. a. Authorized Learning Centers and Trainers: For each Authorized Training Session, you may:
i. either install individual copies of the relevant Licensed Content on classroom Devices only for use by Students enrolled in and the Trainer delivering the Authorized Training Session, provided that the number of copies in use does not exceed the number of Students enrolled in and the Trainer delivering the Authorized Training Session, OR
ii. install one copy of the relevant Licensed Content on a network server only for access by classroom Devices and only for use by Students enrolled in and the Trainer delivering the Authorized Training Session, provided that the number of Devices accessing the Licensed Content on such server does not exceed the number of Students enrolled in and the Trainer delivering the Authorized Training Session;

iii. and allow the Students enrolled in and the Trainer delivering the Authorized Training Session to use the Licensed Content that you install in accordance with (i) or (ii) above during such Authorized Training Session in accordance with these license terms.

i. Separation of Components. The components of the Licensed Content are licensed as a single unit. You may not separate the components and install them on different Devices.
ii. Third Party Programs. The Licensed Content may contain third party programs. These license terms will apply to the use of those third party programs, unless other terms accompany those programs.
b. Trainers:
i. Trainers may Use the Licensed Content that you install or that is installed by an Authorized Learning Center on a classroom Device to deliver an Authorized Training Session.
ii. Trainers may also Use a copy of the Licensed Content as follows:
A. Licensed Device. The licensed Device is the Device on which you Use the Licensed Content. You may install and Use one copy of the Licensed Content on the licensed Device solely for your own personal training Use and for preparation of an Authorized Training Session.

B. Portable Device. You may install another copy on a portable device solely for your own personal training Use and for preparation of an Authorized Training Session.

4. PRE-RELEASE VERSIONS. If this is a pre-release (beta) version, in addition to the other provisions in this agreement, these terms also apply:

a. Pre-Release Licensed Content. This Licensed Content is a pre-release version. It may not contain the same information and/or work the way a final version of the Licensed Content will. We may change it for the final, commercial version. We also may not release a commercial version. You will clearly and conspicuously inform any Students who participate in each Authorized Training Session of the foregoing; and, that you or Microsoft are under no obligation to provide them with any further content, including but not limited to the final released version of the Licensed Content for the Course.

b. Feedback. If you agree to give feedback about the Licensed Content to Microsoft, you give to Microsoft, without charge, the right to use, share and commercialize your feedback in any way and for any purpose. You also give to third parties, without charge, any patent rights needed for their products, technologies and services to use or interface with any specific parts of a Microsoft software, Licensed Content, or service that includes the feedback. You will not give feedback that is subject to a license that requires Microsoft to license its software or documentation to third parties because we include your feedback in them. These rights survive this agreement.

c. Confidential Information.
The Licensed Content, including any viewer, user interface, features and documentation that may be included with the Licensed Content, is confidential and proprietary to Microsoft and its suppliers.
i. Use. For five years after installation of the Licensed Content or its commercial release, whichever is first, you may not disclose confidential information to third parties. You may disclose confidential information only to your employees and consultants who need to know the information. You must have written agreements with them that protect the confidential information at least as much as this agreement.

ii. Survival. Your duty to protect confidential information survives this agreement.

iii. Exclusions. You may disclose confidential information in response to a judicial or governmental order. You must first give written notice to Microsoft to allow it to seek a protective order or otherwise protect the information. Confidential information does not include information that: becomes publicly known through no wrongful act; you received from a third party who did not breach confidentiality obligations to Microsoft or its suppliers; or you developed independently.

d. Term. The term of this agreement for pre-release versions is (i) the date which Microsoft informs you is the end date for using the beta version, or (ii) the commercial release of the final release version of the Licensed Content, whichever is first (beta term).

e. Use. You will cease using all copies of the beta version upon expiration or termination of the beta term, and will destroy all copies of same in the possession or under your control and/or in the possession or under the control of any Trainers who have received copies of the pre-released version.

f. Copies. Microsoft will inform Authorized Learning Centers if they may make copies of the beta version (in either print and/or CD version) and distribute such copies to Students and/or Trainers. If Microsoft allows such distribution, you will follow any additional terms that Microsoft provides to you for such copies and distribution.
ii. Virtual Hard Disks. The Licensed Content may contain versions of Microsoft Windows XP, Microsoft Windows Vista, Windows Server 2003, Windows Server 2008, and Windows 2000 Advanced Server and/or other Microsoft products which are provided in Virtual Hard Disks.

A. If the Virtual Hard Disks and the labs are launched through the Microsoft Learning Lab Launcher, then these terms apply:

Time-Sensitive Software. If the Software is not reset, it will stop running based upon the time indicated on the install of the Virtual Machines (between 30 and 500 days after you install it). You will not receive notice before it stops running. You may not be able to access data used or information saved with the Virtual Machines when it stops running and may be forced to reset these Virtual Machines to their original state. You must remove the Software from the Devices at the end of each Authorized Training Session and reinstall and launch it prior to the beginning of the next Authorized Training Session.

B. If the Virtual Hard Disks require a product key to launch, then these terms apply:

Microsoft will deactivate the operating system associated with each Virtual Hard Disk. Before installing any Virtual Hard Disks on classroom Devices for use during an Authorized Training Session, you will obtain from Microsoft a product key for the operating system software for the Virtual Hard Disks and will activate such Software with Microsoft using such product key.

C. These terms apply to all Virtual Machines and Virtual Hard Disks. You may only use the Virtual Machines and Virtual Hard Disks if you comply with the terms and conditions of this agreement and the following security requirements:

- You may not install Virtual Machines and Virtual Hard Disks on portable Devices or Devices that are accessible to other networks.
- You must remove Virtual Machines and Virtual Hard Disks from all classroom Devices at the end of each Authorized Training Session, except those held at Microsoft Certified Partners for Learning Solutions locations.
- You must remove the differencing drive portions of the Virtual Hard Disks from all classroom Devices at the end of each Authorized Training Session at Microsoft Certified Partners for Learning Solutions locations.
- You will ensure that the Virtual Machines and Virtual Hard Disks are not copied or downloaded from Devices on which you installed them.
- You will strictly comply with all Microsoft instructions relating to installation, use, activation and deactivation, and security of Virtual Machines and Virtual Hard Disks.
- You may not modify the Virtual Machines and Virtual Hard Disks or any contents thereof.
- You may not reproduce or redistribute the Virtual Machines or Virtual Hard Disks.
iii. Classroom Setup Guide. You will ensure that any Licensed Content installed for use during an Authorized Training Session is installed in accordance with the classroom set-up guide for the Course.

iv. Media Elements and Templates. You may allow Trainers and Students to use images, clip art, animations, sounds, music, shapes, video clips and templates provided with the Licensed Content solely in an Authorized Training Session. If Trainers have their own copy of the Licensed Content, they may use Media Elements for their personal training use.

v. Evaluation Software. Any Software that is included in the Student Content designated as Evaluation Software may be used by Students solely for their personal training outside of the Authorized Training Session.
b. Trainers Only:
i. Use of PowerPoint Slide Deck Templates. The Trainer Content may include Microsoft PowerPoint slide decks. Trainers may use, copy and modify the PowerPoint slide decks only for providing an Authorized Training Session. If you elect to exercise the foregoing, you will agree or ensure the Trainer agrees: (a) that modification of the slide decks will not constitute creation of obscene or scandalous works, as defined by federal law at the time the work is created; and (b) to comply with all other terms and conditions of this agreement.
ii. Use of Instructional Components in Trainer Content. For each Authorized Training Session, Trainers may customize and reproduce, in accordance with the MCT Agreement, those portions of the Licensed Content that are logically associated with instruction of the Authorized Training Session. If you elect to exercise the foregoing rights, you agree or ensure the Trainer agrees: (a) that any of these customizations or reproductions will only be used for providing an Authorized Training Session and (b) to comply with all other terms and conditions of this agreement.

iii. Academic Materials. If the Licensed Content contains Academic Materials, you may copy and use the Academic Materials. You may not make any modifications to the Academic Materials and you may not print any book (either electronic or print version) in its entirety. If you reproduce any Academic Materials, you agree that:

- The use of the Academic Materials will be only for your personal reference or training use;
- You will not republish or post the Academic Materials on any network computer or broadcast in any media; and
- You will include the Academic Materials' original copyright notice, or a copyright notice to Microsoft's benefit in the format provided below.

Form of Notice:
© 2010. Reprinted for personal reference use only with permission by Microsoft Corporation. All rights reserved.
Microsoft, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the US and/or other countries. Other product and company names mentioned herein may be the trademarks of their respective owners.
6. INTERNET-BASED SERVICES. Microsoft may provide Internet-based services with the Licensed Content. It may change or cancel them at any time. You may not use these services in any way that could harm them or impair anyone else's use of them. You may not use the services to try to gain unauthorized access to any service, data, account or network by any means.

7. SCOPE OF LICENSE. The Licensed Content is licensed, not sold. This agreement only gives you some rights to use the Licensed Content. Microsoft reserves all other rights. Unless applicable law gives you more rights despite this limitation, you may use the Licensed Content only as expressly permitted in this agreement. In doing so, you must comply with any technical limitations in the Licensed Content that only allow you to use it in certain ways. You may not:

- install more copies of the Licensed Content on classroom Devices than the number of Students and the Trainer in the Authorized Training Session;
- allow more classroom Devices to access the server than the number of Students enrolled in and the Trainer delivering the Authorized Training Session if the Licensed Content is installed on a network server;
- copy or reproduce the Licensed Content to any server or location for further reproduction or distribution;
- disclose the results of any benchmark tests of the Licensed Content to any third party without Microsoft's prior written approval;
- work around any technical limitations in the Licensed Content;
- reverse engineer, decompile or disassemble the Licensed Content, except and only to the extent that applicable law expressly permits, despite this limitation;
- make more copies of the Licensed Content than specified in this agreement or allowed by applicable law, despite this limitation;
- publish the Licensed Content for others to copy;
- transfer the Licensed Content, in whole or in part, to a third party;
- access or use any Licensed Content for which you (i) are not providing a Course and/or (ii) have not been authorized by Microsoft to access and use;
- rent, lease or lend the Licensed Content; or
- use the Licensed Content for commercial hosting services or general business purposes.

Rights to access the server software that may be included with the Licensed Content, including the Virtual Hard Disks, does not give you any right to implement Microsoft patents or other Microsoft intellectual property in software or devices that may access the server.
8. EXPORT RESTRICTIONS. The Licensed Content is subject to United States export laws and regulations. You must comply with all domestic and international export laws and regulations that apply to the Licensed Content. These laws include restrictions on destinations, end users and end use. For additional information, see www.microsoft.com/exporting.

9. NOT FOR RESALE SOFTWARE/LICENSED CONTENT. You may not sell software or Licensed Content marked as "NFR" or "Not for Resale."

10. ACADEMIC EDITION. You must be a "Qualified Educational User" to use Licensed Content marked as "Academic Edition" or "AE." If you do not know whether you are a Qualified Educational User, visit www.microsoft.com/education or contact the Microsoft affiliate serving your country.

11. TERMINATION. Without prejudice to any other rights, Microsoft may terminate this agreement if you fail to comply with the terms and conditions of these license terms. In the event your status as an Authorized Learning Center or Trainer a) expires, b) is voluntarily terminated by you, and/or c) is terminated by Microsoft, this agreement shall automatically terminate. Upon any termination of this agreement, you must destroy all copies of the Licensed Content and all of its component parts.

12. ENTIRE AGREEMENT. This agreement, and the terms for supplements, updates, Internet-based services and support services that you use, are the entire agreement for the Licensed Content and support services.

13. APPLICABLE LAW.

a. United States. If you acquired the Licensed Content in the United States, Washington state law governs the interpretation of this agreement and applies to claims for breach of it, regardless of conflict of laws principles. The laws of the state where you live govern all other claims, including claims under state consumer protection laws, unfair competition laws, and in tort.

b. Outside the United States. If you acquired the Licensed Content in any other country, the laws of that country apply.

14. LEGAL EFFECT.
This agreement describes certain legal rights. You may have other rights under the laws of your country. You may also have rights with respect to the party from whom you acquired the Licensed Content. This agreement does not change your rights under the laws of your country if the laws of your country do not permit it to do so.
15. DISCLAIMER OF WARRANTY. The Licensed Content is licensed "as-is." You bear the risk of using it. Microsoft gives no express warranties, guarantees or conditions. You may have additional consumer rights under your local laws which this agreement cannot change. To the extent permitted under your local laws, Microsoft excludes the implied warranties of merchantability, fitness for a particular purpose and noninfringement.

16. LIMITATION ON AND EXCLUSION OF REMEDIES AND DAMAGES. YOU CAN RECOVER FROM MICROSOFT AND ITS SUPPLIERS ONLY DIRECT DAMAGES UP TO U.S. $5.00. YOU CANNOT RECOVER ANY OTHER DAMAGES, INCLUDING CONSEQUENTIAL, LOST PROFITS, SPECIAL, INDIRECT OR INCIDENTAL DAMAGES.
This limitation applies to anything related to the Licensed Content, software, services, content (including code) on third party Internet sites, or third party programs; and claims for breach of contract, breach of warranty, guarantee or condition, strict liability, negligence, or other tort to the extent permitted by applicable law.
It also applies even if Microsoft knew or should have known about the possibility of the damages. The above limitation or exclusion may not apply to you because your country may not allow the exclusion or limitation of incidental, consequential or other damages.

Please note: As this Licensed Content is distributed in Quebec, Canada, some of the clauses in this agreement are provided below in French.

Remarque : Ce contenu sous licence étant distribué au Québec, Canada, certaines des clauses dans ce contrat sont fournies ci-dessous en français.

EXONÉRATION DE GARANTIE. Le contenu sous licence visé par une licence est offert « tel quel ». Toute utilisation de ce contenu sous licence est à votre seul risque et péril. Microsoft n'accorde aucune autre garantie expresse. Vous pouvez bénéficier de droits additionnels en vertu du droit local sur la protection des consommateurs, que ce contrat ne peut modifier. Là où elles sont permises par le droit local, les garanties implicites de qualité marchande, d'adéquation à un usage particulier et d'absence de contrefaçon sont exclues.

LIMITATION DES DOMMAGES-INTÉRÊTS ET EXCLUSION DE RESPONSABILITÉ POUR LES DOMMAGES. Vous pouvez obtenir de Microsoft et de ses fournisseurs une indemnisation en cas de dommages directs uniquement à hauteur de 5,00 $ US. Vous ne pouvez prétendre à aucune indemnisation pour les autres dommages, y compris les dommages spéciaux, indirects ou accessoires et pertes de bénéfices.

Cette limitation concerne :
- tout ce qui est relié au contenu sous licence, aux services ou au contenu (y compris le code) figurant sur des sites Internet tiers ou dans des programmes tiers ; et
- les réclamations au titre de violation de contrat ou de garantie, ou au titre de responsabilité stricte, de négligence ou d'une autre faute dans la limite autorisée par la loi en vigueur.

Elle s'applique également, même si Microsoft connaissait ou devrait connaître l'éventualité d'un tel dommage. Si votre pays n'autorise pas l'exclusion ou la limitation de responsabilité pour les dommages indirects, accessoires ou de quelque nature que ce soit, il se peut que la limitation ou l'exclusion ci-dessus ne s'appliquera pas à votre égard.

EFFET JURIDIQUE. Le présent contrat décrit certains droits juridiques. Vous pourriez avoir d'autres droits prévus par les lois de votre pays. Le présent contrat ne modifie pas les droits que vous confèrent les lois de votre pays si celles-ci ne le permettent pas.
Acknowledgements
Microsoft Learning would like to acknowledge and thank the following for their contribution towards developing this title. Their effort at various stages in the development has ensured that you have a good classroom experience.
Contents
Module 1: Introduction to SQL Server 2008 R2 and its Toolset
Lesson 1: Introduction to the SQL Server Platform  1-3
Lesson 2: Working with SQL Server Tools  1-14
Lesson 3: Configuring SQL Server Services  1-28
Lab 1: Introduction to SQL Server and its Toolset  1-36
Course Description
This five-day instructor-led course is intended for Microsoft SQL Server database developers who are responsible for implementing a database on SQL Server 2008 R2. In this course, students learn the skills and best practices for using SQL Server 2008 R2 product features and tools to implement a database server.
Audience
This course is intended for IT professionals who want to become skilled in using SQL Server 2008 R2 product features and technologies for implementing a database. To be successful in this course, students should have knowledge of basic relational database concepts and of writing T-SQL queries.
Student Prerequisites
This course requires that you meet the following prerequisites:

- Working knowledge of Transact-SQL (ability to write Transact-SQL queries)
- Working knowledge of relational databases (database design skills)
- Core Windows Server skills
- Completion of Course 2778: Writing Queries Using Microsoft SQL Server 2008 Transact-SQL
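As a rough calibration of the Transact-SQL prerequisite, incoming students should be comfortable writing queries like the following sketch. The dbo.Customer and dbo.SalesOrder tables and their columns are hypothetical, invented for illustration only, and are not part of the course files.

```sql
-- Hypothetical schema, for illustration only: count orders per German customer.
SELECT c.CustomerName,
       COUNT(o.OrderID) AS OrderCount
FROM dbo.Customer AS c
LEFT JOIN dbo.SalesOrder AS o
    ON o.CustomerID = c.CustomerID
WHERE c.Country = N'Germany'
GROUP BY c.CustomerName
ORDER BY OrderCount DESC;
```

If joins, grouping, and filtering at this level are unfamiliar, Course 2778 should be completed first.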
Course Objectives
After completing this course, students will be able to:

- Understand the product, its components, and basic configuration
- Work with the data types supported by SQL Server
- Design and implement tables and work with schemas
- Design and implement views and partitioned views
- Describe the concept of an index and determine the appropriate data type for indexes and composite index structures
- Identify the appropriate table structures and implement clustered indexes and heaps
- Describe and capture execution plans
- Design and implement non-clustered indexes, covering indexes, and included columns
- Design and implement stored procedures
- Implement table types, table-valued parameters, and the MERGE statement
- Describe transactions, transaction isolation levels, and application design patterns for highly concurrent applications
- Design and implement T-SQL error handling and structured exception handling
- Design and implement scalar and table-valued functions
- Design and implement constraints
- Design and implement triggers
- Describe and implement target use cases of SQL CLR integration
- Describe and implement XML data and schema in SQL Server
- Use FOR XML and XPath queries
- Describe and use spatial data types in SQL Server
- Implement and query full-text indexes
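Several of these objectives can be glimpsed in one short sketch that combines the MERGE statement with structured TRY...CATCH error handling, both of which are covered in depth later in the course. The dbo.Customer and dbo.CustomerStage tables are hypothetical, invented for illustration, and are not taken from the course labs.

```sql
-- Illustrative only: dbo.Customer and dbo.CustomerStage are hypothetical tables.
BEGIN TRY
    BEGIN TRANSACTION;

    -- Synchronize the target table from a staging table in one statement.
    MERGE dbo.Customer AS tgt
    USING dbo.CustomerStage AS src
        ON tgt.CustomerID = src.CustomerID
    WHEN MATCHED THEN
        UPDATE SET tgt.CustomerName = src.CustomerName
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerID, CustomerName)
        VALUES (src.CustomerID, src.CustomerName);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Roll back any open transaction, then re-raise the error to the caller.
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;

    DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);
END CATCH;
```

RAISERROR is used rather than THROW because THROW was introduced after SQL Server 2008 R2, the version this course targets.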
Course Outline
This section provides an outline of the course:

Module 1, Introduction to SQL Server 2008 R2 and its Toolset, introduces you to the entire SQL Server platform and its major tools. This module also covers editions, versions, basics of network listeners, and concepts of services and service accounts.

Module 2, Working with Data Types, describes the data types supported by SQL Server and how to work with them.

Module 3, Designing and Implementing Tables, describes the design and implementation of tables.

Module 4, Designing and Implementing Views, describes the design and implementation of views.

Module 5, Planning for SQL Server 2008 R2 Indexing, describes the concept of an index and discusses selectivity, density, and statistics. This module also covers appropriate data type choices and choices around composite index structures.

Module 6, Implementing Table Structures in SQL Server 2008 R2, covers clustered indexes and heaps.

Module 7, Reading SQL Server 2008 R2 Execution Plans, introduces the concept of reading execution plans.

Module 8, Improving Performance through Nonclustered Indexes, covers non-clustered indexes, covering indexes, and included columns.

Module 9, Designing and Implementing Stored Procedures, describes the design and implementation of stored procedures.

Module 10, Merging Data and Passing Tables, covers table types, table-valued parameters, and the MERGE statement as used in stored procedures.

Module 11, Creating Highly Concurrent SQL Server 2008 R2 Applications, covers transactions, isolation levels, and designing for concurrency.

Module 12, Handling Errors in T-SQL Code, describes structured exception handling and gives solid examples of its use within the design of stored procedures.

Module 13, Designing and Implementing User-Defined Functions, describes the design and implementation of functions, both scalar and table-valued.

Module 14, Ensuring Data Integrity through Constraints, describes the design and implementation of constraints.

Module 15, Responding to Data Manipulation via Triggers, describes the design and implementation of triggers.

Module 16, Implementing Managed Code in SQL Server 2008 R2, describes the implementation of and target use cases for SQL CLR integration.
Module 17, Storing XML Data in SQL Server 2008 R2, covers the XML data type, schema collections, typed and untyped columns, and appropriate use cases for XML in SQL Server.

Module 18, Querying XML Data in SQL Server 2008 R2, covers the basics of FOR XML and XPath queries.

Module 19, Working with SQL Server 2008 R2 Spatial Data, describes spatial data and how this data can be implemented within SQL Server.

Module 20, Working with Full-Text Indexes and Queries, covers full-text indexes and queries.
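The FOR XML topic from Module 18 can be previewed with a minimal sketch; the dbo.Product table and its columns are hypothetical, invented for illustration rather than drawn from the course files.

```sql
-- Hypothetical table, for illustration only: FOR XML PATH shapes rows into XML.
SELECT p.ProductID   AS '@id',   -- '@' prefix maps the column to an attribute
       p.ProductName AS 'Name'   -- plain alias maps the column to a child element
FROM dbo.Product AS p
FOR XML PATH('Product'), ROOT('Products');
```

A row such as (1, Widget) would be shaped roughly as `<Products><Product id="1"><Name>Widget</Name></Product></Products>`, with each row becoming one `<Product>` element under the `<Products>` root.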
Course Materials
The following materials are included with your kit:
Course Handbook: A succinct classroom learning guide that provides all the critical technical information in a crisp, tightly focused format, which is just right for an effective in-class learning experience.
Lessons: Guide you through the learning objectives and provide the key points that are critical to the success of the in-class learning experience.
Labs: Provide a real-world, hands-on platform for you to apply the knowledge and skills learned in the module.
Module Reviews and Takeaways: Provide improved on-the-job reference material to boost knowledge and skills retention.
Lab Answer Keys: Provide step-by-step lab solution guidance at your fingertips when it's needed.
Course Companion Content on the http://www.microsoft.com/learning/companionmoc/ site: Searchable, easy-to-navigate digital content with integrated premium online resources designed to supplement the Course Handbook.
Modules: Include companion content, such as questions and answers, detailed demo steps, and additional reading links, for each lesson. Additionally, they include Lab Review questions and answers and Module Reviews and Takeaways sections, which contain the review questions and answers, best practices, common issues and troubleshooting tips with answers, and real-world issues and scenarios with answers.
Resources: Include well-categorized additional resources that give you immediate access to the most up-to-date premium content on TechNet, MSDN, and Microsoft Press.
Student Course files on the http://www.microsoft.com/learning/companionmoc/ site: Include Allfiles.exe, a self-extracting executable file that contains all the files required for the labs and demonstrations.
Course evaluation: At the end of the course, you will have the opportunity to complete an online evaluation to provide feedback on the course, training facility, and instructor. To provide additional comments or feedback on the course, send e-mail to support@mscourseware.com. To inquire about the Microsoft Certification Program, send e-mail to mcphelp@microsoft.com.
Software Configuration
The following software is installed on each VM: SQL Server 2008 R2 (on the SQL Server VMs)
Course Files
There are files associated with the labs in this course. The lab files are located in the folder D:\6232B_Labs on the student computers.
Classroom Setup
Each classroom computer will have the same virtual machine configured in the same way.
Module 1
Introduction to SQL Server 2008 R2 and its Toolset
Contents:
Lesson 1: Introduction to the SQL Server Platform
Lesson 2: Working with SQL Server Tools
Lesson 3: Configuring SQL Server Services
Lab 1: Introduction to SQL Server and its Toolset
Module Overview
Before beginning to work with SQL Server in either a development or an administration role, it is important to understand the overall SQL Server platform. In particular, it is useful to understand that SQL Server is not just a database engine but a complete platform for managing enterprise data. Along with a strong platform, SQL Server provides a series of tools that make the product easy to manage and a good target for application development. Individual components of SQL Server can operate within separate security contexts. Correctly configuring SQL Server services is important in enterprises that operate under a policy of least privilege.
Objectives
After completing this module, you will be able to:
Describe the SQL Server platform
Work with SQL Server tools
Configure SQL Server services
Lesson 1: Introduction to the SQL Server Platform
SQL Server is a platform for developing business applications that are data focused. Rather than being a single monolithic application, SQL Server is structured as a series of components. It is important to understand the use of each of the components. More than a single copy of SQL Server can be installed on a server. Each of these copies is called an instance and can be separately configured and managed. SQL Server is shipped in a variety of editions, each with a different set of capabilities. It is important to understand the target business cases for each of the SQL Server editions and how SQL Server has evolved through a series of improving versions over many years. It is a stable and robust platform.
Objectives
After completing this lesson, you will be able to:
Describe the overall SQL Server platform
Explain the role of each of the components that make up the SQL Server platform
Describe the functionality provided by SQL Server instances
Explain the available SQL Server editions
Explain how SQL Server has evolved through a series of versions
Key Points
SQL Server is an integrated and enterprise-ready platform for data management that offers a low total cost of ownership. Question: Which other database platforms have you worked with?
Enterprise Ready
While SQL Server is much more than a relational database management system, it provides a very secure, robust, and stable relational database management system. SQL Server is used to manage organizational data and to provide analysis and insights into that data. The database engine is one of the highest-performing database engines available and regularly features at the top of industry performance benchmarks. You can review industry benchmarks and scores at www.tpc.org.
High Availability
Impressive performance is necessary, but not at the cost of availability. Organizations need constant access to their data, and many enterprises now require 24 hour x 7 day availability. The SQL Server platform was designed with the highest levels of availability in mind, and each release of the product has added more capabilities to minimize any potential downtime.
Security
Uppermost in the minds of enterprise managers is the need to secure organizational data. Security cannot be retrofitted after an application or product has been created. SQL Server has been built from the ground up with the highest levels of security as a goal.
Scalability
Organizations have a need for data management capabilities for systems of all sizes. SQL Server scales from the smallest needs to the largest via a series of editions with increasing capabilities.
Cost of Ownership
Many competing database management systems are expensive both to purchase and to maintain. SQL Server offers a very low total cost of ownership. SQL Server tooling (both management and development) builds on existing Windows knowledge, and most users become familiar with the tools quite quickly. The productivity achieved when working with the tools is enhanced by the high degree of integration between them; for example, many of the SQL Server tools have links to launch and preconfigure other SQL Server tools.
Key Points
SQL Server is a very good relational database engine but as a data platform, it offers much more than a relational database engine.
Key Points
It is sometimes useful to install more than a single copy of a SQL Server component on a single server. Many SQL Server components can be installed more than once as separate instances.
Question: Why might you need to separate databases by service level agreement? Different versions of SQL Server can often be installed side-by-side using multiple instances. This can assist when testing upgrade scenarios or performing upgrades.
Additional instances of SQL Server require an instance name in addition to the server name and are known as "named" instances. Not all components of SQL Server can be installed in more than one instance. In particular, SQL Server Integration Services is installed once per server. There is no need to install the SQL Server tools and utilities more than once: a single installation of the tools can manage and configure all instances.
Key Points
SQL Server is available in a wide variety of editions, with different price points and different levels of capability.
Edition: Business Use Case
Parallel Data Warehouse: Uses massively parallel processing (MPP) to execute queries against vast amounts of data quickly. Parallel Data Warehouse systems are sold as a complete "appliance" rather than via standard software licenses.
Datacenter: Provides the highest levels of scalability for mission-critical applications.
Enterprise: Provides the highest levels of reliability for demanding workloads.
Standard: Delivers a reliable, complete data management and Business Intelligence (BI) platform.
Express: Is a free edition for lightweight web and small server-based applications.
Compact: Is a free edition for standalone and occasionally connected mobile applications, optimized for a very small memory footprint.
Developer: Allows building, testing, and demonstrating all SQL Server functionality.
Workgroup: Runs branch applications with secure remote synchronization and management capabilities.
Web: Provides a secure, cost-effective, and scalable platform for public web sites and applications.
SQL Azure: Allows building and extending SQL Server applications to a cloud-based platform.
Question: What would be a good business case example for using a cloud-based service?
Key Points
SQL Server is a platform with a rich history of innovation achieved while maintaining strong levels of stability. SQL Server has been available for many years, yet it is rapidly evolving new capabilities and features.
Early Versions
The earliest versions (1.0 and 1.1) were based on the OS/2 operating system. Versions 4.2 and later moved to the Windows operating system, initially on the Windows NT operating system.
Later Versions
Version 7.0 saw a significant rewrite of the product. Substantial advances were made in reducing the administration workload for the product. OLAP Services (which later became Analysis Services) was introduced.
SQL Server 2000 featured support for multiple instances and collations. It also introduced support for data mining. SQL Server Reporting Services was introduced after the product release as an add-on enhancement to the product, along with support for 64-bit processors.
SQL Server 2005 provided another significant rewrite of many aspects of the product. Among its enhancements:
Support for non-relational data stored and queried as XML.
SQL Server Management Studio, released to replace several previous administrative tools.
SQL Server Integration Services, which replaced a former tool known as Data Transformation Services (DTS).
Support for objects created using the Common Language Runtime (CLR).
Substantial enhancements to the T-SQL language, including structured exception handling.
Dynamic Management Views and Functions, introduced to enable detailed health monitoring, performance tuning, and troubleshooting.
Substantial high availability improvements, including the introduction of database mirroring.
Support for column encryption.
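The structured exception handling added to T-SQL in SQL Server 2005 takes the form of TRY...CATCH blocks. A minimal sketch (the failing statement is purely illustrative):

```sql
BEGIN TRY
    -- A statement that raises an error at run time
    SELECT 1 / 0 AS Result;
END TRY
BEGIN CATCH
    -- The CATCH block can inspect details of the error that occurred
    SELECT ERROR_NUMBER()  AS ErrorNumber,
           ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```

Module 12 of this course covers this construct in depth.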
SQL Server 2008 also provided many enhancements:
The "SQL Server Always On" technologies were introduced to reduce potential downtime.
Filestream support improved the handling of structured and semi-structured data.
Spatial data types were introduced.
Database compression and encryption technologies were added.
Specialized date- and time-related data types were introduced, including support for time zones within datetime data.
Full-text indexing was integrated directly within the database engine. (Previously, full-text indexing was based on interfaces to operating-system-level services.)
A policy-based management framework was introduced to assist with a move to more declarative-based management practices, rather than reactive practices.
A PowerShell provider for SQL Server was introduced.
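The specialized date- and time-related types referred to above include date, time, datetime2, and datetimeoffset; the last of these stores a time zone offset alongside the date and time. A brief illustration (the values are arbitrary):

```sql
DECLARE @d   date              = '2008-08-01';
DECLARE @t   time(3)           = '14:30:00.123';
DECLARE @dto datetimeoffset(0) = '2008-08-01 14:30:00 +10:00';

-- SWITCHOFFSET presents the same instant expressed in a different time zone
SELECT @d AS DateOnly,
       @t AS TimeOnly,
       SWITCHOFFSET(@dto, '-05:00') AS SameInstantElsewhere;
```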
Upcoming Versions
The next version of SQL Server has been announced. It will enable the efficient delivery of mission-critical solutions through a highly scalable and available platform, will add productivity tools and features for developers, and will extend the reach of business intelligence tooling to end users. Question: Which versions of SQL Server have you worked with?
Lesson 2: Working with SQL Server Tools
Working effectively with SQL Server requires familiarity with the tools that are used in conjunction with SQL Server. Before any tool can connect to SQL Server, it needs to make a network connection to the server. In this lesson, you will see how these connections are made, then look at the tools that are most commonly used when working with SQL Server.
Objectives
After completing this lesson, you will be able to:
Connect from clients and applications
Describe the roles of the software layers involved in connections
Use SQL Server Management Studio
Use Business Intelligence Development Studio
Use Books Online
Key Points
Client applications connect to endpoints. A variety of communication protocols are available for making connections. Also, users need to be identified before they are permitted to use the server.
Connectivity
The protocol that client applications use when connecting to the SQL Server relational database engine is known as Tabular Data Stream (TDS). It defines how requests are issued and how results are returned. Other components of SQL Server use alternate protocols. For example, clients of SQL Server Analysis Services communicate via the XML for Analysis (XML/A) protocol. However, in this course, you are primarily concerned with the relational database engine. TDS is a high-level protocol that is transported by lower-level protocols. It is most commonly transported by the TCP/IP protocol, the Named Pipes protocol, or implemented over a shared memory connection. SQL Server 2008 R2 does support connections over the Virtual Interface Adapter (VIA) protocol, but the use of this protocol with SQL Server is now deprecated, and it should not be used for new implementations.
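As an illustrative (not course-supplied) way to see which network library is carrying TDS for the current session, the sys.dm_exec_connections dynamic management view exposes the transport in use:

```sql
-- net_transport reports Shared memory, TCP, or Named pipe for each connection;
-- local_tcp_port is NULL for non-TCP transports
SELECT session_id, net_transport, auth_scheme, local_tcp_port
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;
```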
Authentication
For the majority of applications and organizations, data must be held securely, and access to the data is based on the identity of the user attempting to access it. The process of verifying the identity of a user (or, more formally, of any principal) is known as authentication. SQL Server supports two forms of authentication. It can store the login details for users directly within its own system databases; these logins are known as SQL Logins. Alternatively, it can be configured to trust a Windows authenticator (such as Active Directory). In that case, a Windows user can be granted access to the server, either directly or via Windows group memberships. When a connection is made, the user is connected to a specific database, known as their "default" database.
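The two forms of authentication correspond to two forms of login. A hedged sketch of the T-SQL involved (the login names, password, and default database below are illustrative only):

```sql
-- SQL Login: credentials stored in SQL Server's own system databases
CREATE LOGIN SalesApp
    WITH PASSWORD = 'Pa$$w0rd',
         DEFAULT_DATABASE = AdventureWorks2008R2;

-- Windows login: SQL Server trusts the Windows authenticator (e.g. Active Directory)
CREATE LOGIN [ADVENTUREWORKS\HR_Users] FROM WINDOWS
    WITH DEFAULT_DATABASE = AdventureWorks2008R2;
```

The DEFAULT_DATABASE option sets the "default" database the user is connected to, as described above.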
Key Points
Connections to SQL Server are made through a series of software layers. It is important to understand how each of these layers interacts. This knowledge will assist you when you need to perform configuration or troubleshooting.
Client Libraries
Client applications use programming libraries to simplify their access to databases such as SQL Server. Open Database Connectivity (ODBC) is a commonly used library. It operates as a translation layer that shields the application from some details of the underlying database engine. By changing the ODBC configuration, an application can be altered to work with a different database engine without the need for application changes. OLEDB originally stood for Object Linking and Embedding for Databases; however, that meaning is now not very relevant. OLEDB is a library that does not translate commands. When an application sends a SQL command, OLEDB passes it to the database server without modification. SQL Server Native Client (SNAC) is a software layer that encapsulates commands issued by libraries such as OLEDB and ODBC into commands that can be understood by SQL Server, and encapsulates results returned by SQL Server ready for consumption by these libraries. This primarily involves wrapping the commands and results in the TDS protocol.
Network Libraries
SQL Server exposes endpoints that client applications can connect to. The endpoint is used to pass commands and data to/from the database engine. SNAC connects to these endpoints via network libraries such as TCP/IP, Named Pipes, or VIA. Please note that the use of the VIA protocol with SQL Server is now deprecated. For client applications that are
executing on the same computer as the SQL Server service, a special "shared memory" network connection is also available.
Key Points
SQL Server Management Studio (SSMS) is the primary tool supplied by Microsoft for interacting with SQL Server services.
Key Points
In this demonstration you will see how to work with SQL Server Management Studio.
Demonstration Setup
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. In Hyper-V Manager, in the Virtual Machines pane, right-click 623XB-MIA-SQL1 and click Revert.
3. If you are prompted to confirm that you want to revert, click Revert.
4. If you do not already have a Virtual Machine Connection window, right-click 623XB-MIA-SQL1 and click Connect.
Demonstration Steps
1. In the Virtual Machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
2. In the Connect to Server window, ensure that Server Type is set to Database Engine.
3. In the Server name text box, type (local).
4. In the Authentication drop-down list, select Windows Authentication, and click Connect.
5. From the View menu, click Object Explorer.
6. In Object Explorer, expand Databases, expand AdventureWorks2008R2, and expand Tables. Review the database objects.
7. Right-click the AdventureWorks2008R2 database and choose New Query.
8. Type the query shown in the snippet below.
9. Note the use of Intellisense while entering it, and then click Execute on the toolbar. Note how the results are returned.
10. From the File menu, click Save SQLQuery1.sql. Note: this saves the query to a file.
11. In the Results tab, right-click the cell for ProductID 1 (first row, first cell) and click Save Results As. In the File name textbox, type Demonstration2AResults and click Save. Note: this saves the query results to a file.
12. From the Query menu, click Display Estimated Execution Plan. Note that SSMS is capable of more than simply executing queries.
13. From the Tools menu, click Options.
14. In the Options pane, expand Query Results, expand SQL Server, and expand General. Review the available configuration options and click Cancel.
15. From the File menu, click Close. In the Microsoft SQL Server Management Studio window, click No.
16. From the File menu, click Open, and click Project/Solution.
17. In the Open Project window, open the project D:\6232B_Labs\6232B_02_PRJ\6232B_02_PRJ.ssmssln.
18. From the View menu, click Solution Explorer. Note the contents of Solution Explorer. SQL Server projects have been supplied for each module of the course and contain demonstration steps and suggested lab solutions, along with any required setup/shutdown code for the module.
19. In Solution Explorer, click the X to close it.
20. In Object Explorer, from the Connect toolbar icon, note the other SQL Server components that connections can be made to: Database Engine, Analysis Services, Integration Services, Reporting Services, and SQL Server Compact.
21. From the File menu, click New, and click Database Engine Query to open a new connection.
22. In the Connect to Database Engine window, type (local) in the Server name text box.
23. In the Authentication drop-down list, select Windows Authentication, and click Connect.
24. In the Available Databases drop-down list, click the tempdb database. Note: this will change the database that the query is executed against.
25. Right-click in the query window, click Connection, and then click Change Connection. Note: this will reconnect the query to another instance of SQL Server.
26. From the View menu, click Registered Servers.
27. In the Registered Servers window, expand Database Engine, right-click Local Server Groups, and click New Server Group.
28. In the New Server Group Properties window, type Dev Servers in the Group name textbox and click OK.
29. Right-click Dev Servers and click New Server Registration.
30. In the New Server Registration window, in the Server name drop-down list, type (local) and click Save.
31. Right-click Dev Servers and click New Server Registration.
32. In the New Server Registration window, in the Server name drop-down list, type .\MKTG and click Save.
33. In the Registered Servers window, right-click the Dev Servers group and choose New Query.
34. Type the query as shown in the snippet below and click the Execute toolbar icon.
SELECT @@version;
Key Points
The SQL Server platform comprises a number of components. Projects for several of the Business Intelligence related components are created and modified using Business Intelligence Development Studio (BIDS).
Key Points
In this demonstration you will see how to work with SQL Server Business Intelligence Development Studio.
Demonstration Steps
1. If Demonstration 2A was not performed: revert the 623XB-MIA-SQL1 virtual machine using Hyper-V Manager on the host system, and connect to the Virtual Machine.
2. In the Virtual Machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Business Intelligence Development Studio (BIDS).
3. From the File menu, expand New, and click Project. Note the available project templates. (If other languages are installed, note how they are still present as well.)
4. In the Templates pane, click Report Server Project, and click OK.
5. In Solution Explorer, right-click Reports and click Add New Report.
6. In the Report Wizard window, click Next.
7. In the Select the Data Source window, click Edit.
8. In the Connection Properties window, type (local) for the Server name, in the Connect to a database drop-down list select AdventureWorks2008R2, and click OK.
9. In the Select the Data Source window, click Next.
10. In the Design the Query window, in the Query string textbox, type the query as shown in the snippet below and click Next.
11. In the Design the Table window, click Details four times, and click Finish >>|.
12. In the Completing the Wizard window, click Finish.
13. In the Report1.rdl [Design] tab, click Preview and note the report that is rendered.
14. Click the Design tab, then from the File menu click Exit. Note: do not save the changes.
Question: Can you suggest a situation where the ability to schedule the execution of a report would be useful?
Books Online
Key Points
Books Online (BOL) is the primary reference for SQL Server. It can be installed offline (for use when disconnected from the Internet) and can also be used online directly from the Microsoft MSDN web site (via an Internet connection).
Books Online
BOL should be regarded as the primary technical reference for SQL Server. A common mistake when installing BOL locally on a SQL Server installation is to neglect to update BOL regularly. To avoid excess download file sizes, BOL is not included in SQL Server service pack and cumulative update packages. BOL is regularly updated and a regular check should be made for updates. For most T-SQL commands, many users will find the examples supplied easier to follow than the formal syntax definition. Note that when viewing the reference page for a statement, the formal syntax is shown at the top of the page and the examples are usually at the bottom of the page. BOL is available for all supported versions of SQL Server. It is important to make sure you are working with the pages designed for the version of SQL Server that you are working with. Many pages in BOL provide links to related pages from other versions of the product.
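For example, the BOL reference page for the SUBSTRING function shows the formal syntax, SUBSTRING(expression, start, length), at the top, with worked examples at the bottom. A quick sketch of the kind of example you will find there (the literal string below is ours, not from BOL):

```sql
-- SUBSTRING(expression, start, length): the start position is 1-based
SELECT SUBSTRING('SQL Server 2008 R2', 5, 6) AS Extracted;  -- returns 'Server'
```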
Key Points
In this demonstration you will see how to work with SQL Server Books Online.
Demonstration Steps
1. If Demonstration 2A was not performed: revert the 623XB-MIA-SQL1 virtual machine using Hyper-V Manager on the host system, and connect to the Virtual Machine.
2. In the Virtual Machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click Documentation and Tutorials, and click SQL Server Books Online.
3. In the Contents window, click SQL Server 2008 R2 Books Online. Note the basic navigation options available within BOL.
4. In the Virtual Machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
5. In the Connect to Server window, ensure that Server Type is set to Database Engine.
6. In the Server name text box, type (local).
7. In the Authentication drop-down list, select Windows Authentication, and click Connect.
8. From the File menu, click New, and click Query with Current Connection.
9. In the SQLQuery1.sql tab, type the query as shown in the snippet below and click the Execute toolbar icon.
10. Click the name of the function SUBSTRING, then hit the F1 key to open the BOL topic for SUBSTRING.
11. In the Online Help Settings window, ensure that the Use local Help as primary source option button is selected, and click OK. Note the content of the page and scroll to the bottom to see the examples.
12. From the File menu, click Exit.
13. In the host system, open Internet Explorer, browse to the SQL Server Books Online page at http://msdn.microsoft.com/en-us/library/ms130214.aspx, and note the available online options.
Lesson 3: Configuring SQL Server Services
Each SQL Server service can be configured individually. The ability to provide individual configuration for services assists organizations that aim to minimize the permissions assigned to service accounts, as part of a policy of least privilege execution. SQL Server Configuration Manager is used to configure services, including the accounts that the services operate under, and the network libraries used by the SQL Server services. SQL Server also ships with a variety of tools and utilities. It is important to know what each of these tools and utilities is used for.
Objectives
After completing this lesson, you will be able to:
Use SQL Server Configuration Manager
Configure SQL Server services
Configure network ports and listeners
Create server aliases
Use other SQL Server tools
Key Points
SQL Server Configuration Manager (SSCM) is used to configure SQL Server services, to configure the network libraries exposed by SQL Server services, and to configure how client connections are made to SQL Server.
Question: Why would a server system need to have a client configuration node?
Key Points
SQL Server Configuration Manager can be used to configure the individual services that are provided by SQL Server. Many components provided by SQL Server are implemented as operating system services. The components of SQL Server that you choose during installation determine which of the SQL Server services are installed.
Instances
Many SQL Server components are instance-aware and can be installed more than once on a single server. When SSCM lists each service, it shows the associated instance of SQL Server in parentheses after the name of the service. In the example shown in the slide, there are two instances of the database engine installed. PARTNER is the name of a named instance of the database engine. MSSQLSERVER is the default name allocated to the default instance of the SQL Server database engine.
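From a query window, you can confirm which instance a connection is using. This small snippet is illustrative (not part of the course materials); SERVERPROPERTY('InstanceName') returns NULL when connected to a default instance:

```sql
SELECT @@SERVERNAME AS ServerAndInstance,           -- e.g. SERVERNAME\PARTNER for a named instance
       SERVERPROPERTY('MachineName')  AS MachineName,
       SERVERPROPERTY('InstanceName') AS InstanceName;  -- NULL for the default instance
```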
Key Points
SQL Server Configuration Manager can be used to configure both server and client protocols and ports.
Client Configurations
Every computer that has SNAC installed needs the ability to configure how that library will access SQL Server services. SNAC is installed on the server as well as on client systems. When SSMS is installed on the server, it uses the SNAC library to make connections to the SQL Server services that are on the same system. The client configuration nodes within SSCM can be used to configure how those connections are made. Note that two sets of client configurations are provided. One set is used for 32-bit applications; the other set is used for 64-bit applications. SSMS is a 32-bit application, even when SQL Server is installed as a 64-bit application.
Key Points
Connecting to a SQL Server service can involve multiple settings such as server address, protocol, and port. To make this easier for client applications and to provide a level of redirection, aliases can be created for servers.
Aliases
Hard-coding connection details for a specific server, protocol, and port within an application is not desirable, as these might need to change over time. Instead, a server alias can be created and associated with a server, protocol, and (if required) port. Client applications can then connect to the alias without being concerned with how those connections are made. Each client system that utilizes SNAC (including the server itself) can have one or more aliases configured. Aliases for 32-bit applications are configured independently of the aliases for 64-bit applications. In the example shown in the slide, the alias "Marketing" has been created for the local server ".", using the named pipes protocol ("np") and a named-pipe address based on the name of the computer (or the value "." for the local computer). The client then only needs to connect to the name "Marketing".
Key Points
SQL Server provides a rich set of tools and utilities to make working with the product easier. The most commonly used tools are listed in the following table:
Tool: SQL Server Profiler
Purpose: Trace activity from client applications to SQL Server. Supports both the database engine and Analysis Services.
Tool: Database Engine Tuning Advisor
Purpose: Design indexes and statistics to improve database performance, based on analysis of trace workloads.
Tool: Master Data Services Configuration Manager
Purpose: Configure and manage SQL Server Master Data Services.
Tool: Reporting Services Configuration Manager
Purpose: Configure SQL Server Reporting Services.
Tool: SQL Server Error and Usage Reporting
Purpose: Configure the level of automated reporting back to the SQL Server product team about errors that occur and on usage of different aspects of the product.
Tool: PowerShell Provider
Purpose: Allow configuring and querying SQL Server using PowerShell.
Tool: SQL Server Management Objects (SMO)
Purpose: Provide a detailed .NET-based library for working with management aspects of SQL Server directly from application code.
Key Points
In this demonstration you will see how SQL Server Profiler can capture traces of statements executed.
Demonstration Steps
1. If Demonstration 2A was not performed: revert the 623XB-MIA-SQL1 virtual machine using Hyper-V Manager on the host system, and connect to the Virtual Machine.
2. In the Virtual Machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
3. In the Connect to Server window, ensure that Server Type is set to Database Engine.
4. In the Server name text box, type (local).
5. In the Authentication drop-down list, select Windows Authentication, and click Connect.
6. From the Tools menu, click SQL Server Profiler.
7. In the Connect to Server window, ensure that Server Type is set to Database Engine.
8. In the Server name text box, type (local).
9. In the Authentication drop-down list, select Windows Authentication, and click Connect.
10. In the Trace Properties window, click Run. Note: this will start a new trace with the default options.
11. Switch to SSMS and click the New Query toolbar icon.
12. In the Query window, type the query as shown in the snippet below, and click the Execute toolbar icon.
USE AdventureWorks2008R2
GO
SELECT * FROM Person.Person
ORDER BY FirstName;
GO
13. Switch to SQL Server Profiler. Note the statement trace occurring in SQL Server Profiler.
14. From the File menu, click Stop Trace.
15. In the Results grid, click individual statements to see the detail shown in the lower pane.
Question: What could you use captured trace files for?
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, right-click 623XB-MIA-DC and click Start.
4. Right-click 623XB-MIA-DC and click Connect.
5. In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, then close the Virtual Machine Connection window.
6. In Hyper-V Manager, in the Virtual Machines list, right-click 623XB-MIA-SQL1 and click Start.
7. Right-click 623XB-MIA-SQL1 and click Connect.
8. In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
9. On the Action menu, click the Ctrl+Alt+Delete menu item.
10. Click Switch User, then click Other User.
11. Log on using the following credentials: User name: AdventureWorks\Administrator; Password: Pa$$w0rd
12. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode. 13. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
Lab Scenario
AdventureWorks is a global manufacturer, wholesaler and retailer of cycle products. The owners of the company have decided to start a new direct marketing arm of the company. It has been created as a new company named Proseware, Inc. Even though it has been set up as a separate company, it will receive
some IT-related services from the existing AdventureWorks company and will be provided with a subset of the corporate AdventureWorks data. The existing AdventureWorks company SQL Server platform has been moved to a new server that is capable of supporting both the existing workload and the workload from the new company. In this lab, you are ensuring that the additional instance of SQL Server has been configured appropriately and making a number of additional required configuration changes.
Task 1: Check that Database Engine and Reporting Services have been installed for the MKTG instance
Open SQL Server Configuration Manager. Check the installed list of services for the MKTG instance and ensure that the database engine and Reporting Services have been installed for the MKTG instance.
Task 2: Note the services that are installed for the default instance and that Integration Services is not installed on a per instance basis
Note the list of services that are installed for the default instance. Note that Integration Services has no instance name shown as it is not installed on a per-instance basis.
Task 3: Ensure that all required services including SQL Server Agent are started and set to autostart for both instances
- Ensure that all the MKTG services are started and set to autostart. (Ignore the Full Text Filter Daemon at this time.)
- Ensure that all the services for the default instance are set to autostart. (Ignore the Full Text Filter Daemon at this time.)

Results: After this exercise, you have checked that the required SQL Server services are installed, started, and configured to autostart.
Task 1: Change the service account for the MKTG database engine
Change the service account for the MKTG database engine service to AdventureWorks\PWService using the properties page for the service.
Task 2: Change the service account for the MKTG SQL Server Agent
Change the service account for the MKTG SQL Server Agent service to AdventureWorks\PWService using the properties page for the service and then restart the service.

Results: After this exercise, you have configured the service accounts for the MKTG instance.
Task 1: Enable the named pipes protocol for the default instance
Enable the named pipes protocol for the default database engine instance using the Protocols window.
Task 2: Enable the named pipes protocol for the MKTG instance
Enable the named pipes protocol for the MKTG database engine instance using the Protocols window.
Task 5: Use SQL Server Management Studio to connect to both aliases to ensure they work as expected
- Open SQL Server Management Studio.
- Connect to the Proseware alias.
- In Object Explorer, connect also to the AdventureWorks alias.

Results: After this exercise, you should have created and tested aliases for both database engine instances.
Challenge Exercise 5: Ensure SQL Browser is Disabled and Configure a Fixed TCP/IP Port (Only if time permits)
Scenario
Client applications will need to connect to the MKTG database engine instance via the TCP/IP protocol. As their connections will need to traverse a firewall, the port used for connections cannot be configured as a dynamic port. The port number must not change. Corporate policy at AdventureWorks is that named instances should be accessed via fixed TCP ports and the SQLBrowser service should be disabled. In this exercise, you will make configuration changes to comply with these requirements. A firewall exception has already been created for port 51550, for use with the MKTG database engine instance.

The main tasks for this exercise are as follows:
1. Configure the TCP port for the MKTG database engine instance to 51550.
2. Disable the SQLBrowser service.
Task 1: Configure the TCP port for the MKTG database engine instance to 51550
- Using the property page for the TCP/IP server protocol, configure the use of the fixed port 51550. (Make sure that you clear the dynamic port setting.)
- Restart the MKTG database engine instance.
- Ensure that the MKTG database engine instance has been restarted successfully.
Review Questions
1. What is the difference between a SQL Server version and an edition?
2. What is the purpose of the Business Intelligence Development Studio?
3. Does Visual Studio need to be installed before BIDS?
Best Practices
1. Ensure that Developer Edition licenses are not used in production environments.
2. Develop using the least privileges possible, to avoid accidentally building applications that will not run for standard users.
3. If using an offline version of Books Online, ensure that it is kept up to date.
Module 2
Working with Data Types
Contents:
Lesson 1: Using Data Types 2-3
Lesson 2: Working with Character Data 2-19
Lesson 3: Converting Data Types 2-27
Lesson 4: Specialized Data Types 2-34
Lab 2: Working with Data Types 2-40
Module Overview
One of the most important decisions that will be taken when designing a database is the data types to be associated with the columns of every table in the database. The data type of a column determines the type and range of values that can be stored in the column. Other objects in SQL Server such as variables and parameters also use these same data types. A very common design error is to use inappropriate data types. As an example, while you can store a date in a string column, doing so is rarely a good idea. In this module, you will see the range of data types that are available within SQL Server and receive advice on where each should be used.
Objectives
After completing this module, you will be able to:
- Work with data types
- Work with character data
- Convert between data types
- Use specialized data types
Lesson 1
Using Data Types
The most basic types of data that get stored in database systems are numbers, dates, and strings. There are a range of data types that can be used for each of these. In this lesson, you will see the available range of data types that can be used for numeric and date-related data. You will also see how to determine if a data type should be nullable or not. In the next lesson, you will see how to work with string data types.
Objectives
After completing this lesson, you will be able to:
- Understand the role of data types
- Use exact numeric data types
- Use approximate numeric data types
- Work with IDENTITY columns
- Use date and time data types
- Work with unique identifiers
- Decide on appropriate nullability of data
Key Points
Data types determine what can be stored in locations within SQL Server such as columns, variables, and parameters. For example, a tinyint column can only store values from 0 to 255. They also determine the types of values that can be returned from expressions.
Constraining Values
Data types are a form of constraint that is placed on the values that can be stored in a location. For example, if you choose a numeric data type, you will not be able to store text in the location. As well as constraining the types of values that can be stored, data types also constrain the range of values that can be stored. For example, if you choose a smallint data type, you can only store values between -32768 and 32767.
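The range constraint can be seen directly in T-SQL. This is an illustrative sketch (the variable name is arbitrary):

```sql
DECLARE @SmallValue smallint;

SET @SmallValue = 32767;  -- succeeds: the value is within the smallint range
SET @SmallValue = 32768;  -- fails with an arithmetic overflow error: out of range for smallint
```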
Query Optimization
When SQL Server knows that the value in a column is an integer, it may be able to come up with an entirely different query plan from one where it knows the location is holding text values. The data type also determines which sorts of operations are permitted on that data and how those operations work.
Self-Documenting Nature
Choosing an appropriate data type provides a level of self-documentation. If all values were stored in the sql_variant (which is a data type that can store any type of value) or xml data types, it's likely that you would need to store documentation about what sort of values can be stored in the sql_variant locations.
Data Types
There are three basic sets of data types:

- System data types: SQL Server provides a large number of built-in (or intrinsic) data types. Examples of these would be integer, varchar, and date.
- Alias data types: Users can also define data types that provide alternate names for the system data types and potentially further constrain them. These are known as alias data types. An example of an alias data type would be to define the name PhoneNumber as being equivalent to nvarchar(16). Alias data types can help provide consistency of data type usage across applications and databases.
- User-defined data types: With managed code via SQL Server CLR integration, entirely new data types can be created. There are two categories of these CLR types: system CLR data types (such as the geometry and geography spatial data types) and user-defined CLR data types that allow users to create their own data types. Managed code is discussed later in Module 16.

Question: Why would it be faster to compare two integer variables that are holding the values 3240 and 19704 than two varchar(10) variables that are holding the values "3240" and "19704"?
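The PhoneNumber alias type mentioned in the text could be created as follows. This is a sketch; the schema and table names are illustrative:

```sql
-- Define an alias data type equivalent to nvarchar(16)
CREATE TYPE dbo.PhoneNumber FROM nvarchar(16) NOT NULL;
GO

-- The alias type can then be used wherever a system data type could be used
CREATE TABLE dbo.Contact
(
    ContactID int NOT NULL,
    Phone dbo.PhoneNumber NOT NULL
);
```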
Key Points
Numeric data types can be exact or approximate. Exact data types are the most common data type used in business applications.
This is often the wrong number of decimal places for many monetary applications and the data type is not a standard data type. In general, use decimal for monetary values.
rather than testing for a value being equal to 1, as this will provide more reliable code. Another aspect that surprises new users is that bit, along with other data types, is also nullable. That means that a bit location can be in three states: NULL, 0, or 1. Question: What would be a suitable data type for storing the value of a check box that can be 0 for unchecked, 1 for checked, or -1 for disabled?
Key Points
It is common to require a series of numbers to be automatically provided for an integer column. The IDENTITY property on a database column indicates that the value for the column will not be provided by an INSERT statement but should be automatically provided by SQL Server.
IDENTITY
IDENTITY is a property typically associated with integer or bigint columns that provides automated generation of values during insert operations. You may be familiar with auto-numbering systems or sequences in other database engines. While not identical to these, IDENTITY columns can be used to replace the functionality from those other database engines. When specifying the IDENTITY property, you specify a seed and an increment. The seed is the starting value. The increment is how much the value goes up by each time. Both seed and increment default to a value of 1 if they are not specified. Although explicit inserts are not normally allowed to columns with an IDENTITY property, it is possible to explicitly insert values. The ability to insert into an IDENTITY column can be enabled temporarily using a connection option: SET IDENTITY_INSERT tablename ON allows the user to insert values into the column with the IDENTITY property instead of having them auto-generated. Having the IDENTITY property on a column does not in itself ensure that the column is unique. Unless there is also a unique constraint on the column, there is no guarantee that values in a column with the IDENTITY property will be unique.
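A sketch of the seed, increment, and IDENTITY_INSERT behavior described above (the table is hypothetical):

```sql
CREATE TABLE dbo.Customer
(
    CustomerID int IDENTITY(1,1) NOT NULL,  -- seed 1, increment 1
    CustomerName nvarchar(50) NOT NULL
);

-- The IDENTITY column is not listed; SQL Server supplies the value (1, then 2, ...)
INSERT dbo.Customer (CustomerName) VALUES (N'Contoso');

-- Temporarily allow an explicit value to be inserted into the IDENTITY column
SET IDENTITY_INSERT dbo.Customer ON;
INSERT dbo.Customer (CustomerID, CustomerName) VALUES (100, N'Fabrikam');
SET IDENTITY_INSERT dbo.Customer OFF;
```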
For example, if you insert a row into a customer table, the customer might be assigned a new identity value. However, if a trigger on the customer table caused an entry to be written into an audit logging table when inserts are performed, the @@IDENTITY variable would return the identity value from the logging table, rather than the one from the customer table. To deal effectively with this, the SCOPE_IDENTITY() function was introduced. It provides the last identity value within the current scope only. In the previous example, it would return the identity value from the customer table. Another complexity relates to multi-row inserts. These were introduced in SQL Server 2008. In this situation, you may want to retrieve the IDENTITY column value for more than one row at a time. Typically, this would be implemented by the use of the OUTPUT clause on the INSERT statement.
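The difference between the options described above can be sketched as follows (table and column names are hypothetical):

```sql
INSERT dbo.Customer (CustomerName) VALUES (N'Northwind');

SELECT SCOPE_IDENTITY();  -- identity value generated in the current scope
SELECT @@IDENTITY;        -- could instead return an identity generated by a trigger

-- For multi-row inserts, the OUTPUT clause returns every generated identity value
INSERT dbo.Customer (CustomerName)
OUTPUT inserted.CustomerID, inserted.CustomerName
VALUES (N'Adventure Works'), (N'Proseware');
```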
Key Points
SQL Server provides two approximate numeric data types. They are used more commonly in scientific applications than in business applications. A very common design error made by new developers is to use the float or real data types for storing business values such as monetary values.
Common Errors
A very common error for new developers is to use approximate numeric data types to store values that need to be stored exactly. This causes rounding and processing errors. A telltale "code smell" that often identifies the work of new developers is columns of numbers that do not exactly add up to the displayed totals. It is common for small rounding errors to creep into calculations; for example, a total that is out by 1 cent in dollar- or euro-based currencies. The inappropriate use of numeric data types can cause processing errors. Look at the following code and decide how many times the PRINT statement would be executed:
DECLARE @Counter float;
SET @Counter = 0;
WHILE (@Counter <> 1.0)
BEGIN
    SET @Counter += 0.1;
    PRINT @Counter;
END;
It might surprise you that this query would never stop running and would need to be cancelled. After cancelling the query, if you look at the output you would see the following:
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.2
1.3
1.4
1.5
1.6
1.7
What has happened? The problem is that the value 0.1 cannot be stored exactly in a float or real data type. Consider how you would write the answer to 1 / 3 in decimal. The answer isn't 0.3; it's 0.3333 recurring. There is no way in decimal to write 1 / 3 as an exact decimal fraction. You have to eventually settle for an approximate value. The same problem occurs in binary fractions; it just occurs at different values. 0.1 in decimal is a non-terminating fraction in binary, so it ends up being stored as the equivalent of 0.099999 recurring. So, when you put the system in a loop adding 0.1 each time, the value never exactly equals 1.0, which can be stored precisely.
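By contrast, the same loop terminates as expected when the counter uses an exact numeric data type, because decimal stores 0.1 precisely:

```sql
DECLARE @Counter decimal(3,1);
SET @Counter = 0;
WHILE (@Counter <> 1.0)
BEGIN
    SET @Counter += 0.1;
    PRINT @Counter;   -- prints 0.1 through 1.0, then the loop ends
END;
```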
Key Points
SQL Server supports a rich set of data types for working with date- and time-related values. It is important to be very careful when working with string literal representations of these values and with their precision (or accuracy). SQL Server also provides a large number of functions for working with dates and times.
Another problem with the datetime data type is that the way it converts strings to dates is based on language format settings. A value in the form 'YYYYMMDD' will always be converted to the correct date, but a value in the form 'YYYY-MM-DD' might end up being interpreted as 'YYYY-DD-MM' depending on the settings for the session. It is important to understand that this behaviour does not occur with the new date (and datetime2) data types, which always interpret 'YYYY-MM-DD' the same way. As a result, the same string could be interpreted as two different dates by the date (and datetime2) data types and the datetime data type.
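A sketch of this difference, using SET LANGUAGE to change the session's date interpretation:

```sql
SET LANGUAGE british;  -- dmy date ordering for the session
SELECT CAST('2010-05-04' AS datetime) AS DatetimeValue, -- interpreted as 5th April 2010
       CAST('2010-05-04' AS date)     AS DateValue;     -- always 4th May 2010

SET LANGUAGE us_english; -- mdy date ordering
SELECT CAST('2010-05-04' AS datetime) AS DatetimeValue; -- interpreted as 4th May 2010
```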
Timezones
The datetimeoffset data type is a combination of a datetime2 data type and a timezone offset. Note that the data type is not timezone-aware; it is simply capable of storing and retrieving timezone offset values. Note that the timezone offset values extend for more than a full day (range of -14:00 to +14:00). A range of system functions has been provided for working with timezone values, as well as with all the date- and time-related data types. Question: Why is the specification of a date range from the year 0000 to the year 9999 based on the Gregorian Calendar not entirely meaningful?
Unique Identifiers
Key Points
Globally unique identifiers (GUIDs) have become common in application development. They are used to provide a mechanism where any process can generate a number at will and know that it will not clash with a number generated by any other process.
GUIDs
Numbering systems have traditionally depended on a central source for the next value in a sequence to make sure that no two processes use the same value. GUIDs were introduced to avoid the need for anyone to function as the "number allocator". Any process, on any system, can generate a value and know that, to a very, very high degree of probability, it will not clash with a value generated by any other process at any time. This is achieved by using very, very large values. When discussing the bigint data type earlier, you learned that 64-bit bigint values were really large. GUIDs are 128-bit values. The magnitude of a 128-bit value is well beyond our capabilities of comprehension.
The usefulness of NEWSEQUENTIALID() is also quite limited as the main reason for using GUIDs is to allow other layers of code to generate the values and know they can just insert them into a database without clashes. If you need to request a value from the database via NEWSEQUENTIALID(), you usually would have been better to use an IDENTITY column instead. A very common development error is to store GUIDs in string values rather than in uniqueidentifier columns. Note: uniqueidentifier columns are also commonly used by replication systems. Replication is an advanced topic beyond the scope of this course. Question: The slide mentions that a common error is to store GUIDs as strings. What would be wrong with this?
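A sketch of a uniqueidentifier column (the table is hypothetical); note that the GUID can be generated either by the database or by any other layer of code:

```sql
CREATE TABLE dbo.Document
(
    DocumentID uniqueidentifier NOT NULL DEFAULT NEWID(),
    Title nvarchar(100) NOT NULL
);

-- GUID generated by the database via the default
INSERT dbo.Document (Title) VALUES (N'Price List');

-- GUID generated elsewhere (for example, by an application tier) and inserted directly
INSERT dbo.Document (DocumentID, Title)
VALUES ('6F9619FF-8B86-D011-B42D-00C04FC964FF', N'Catalog');
```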
Key Points
Nullability determines if a value must be present or not. Assigning inappropriate nullability of columns is another very common design error.
NULL
NULL is a state that a column is in, rather than a type of value that is stored in a column. You do not say that a value equals NULL; you say that a value is NULL. This is why in T-SQL, you do not check whether a value is NULL with the equality operator. You do not write code that says:
WHERE Color = NULL;
Common Errors
New developers will often confuse NULL values with zero, blank (or space), zero-length strings, etc. This is exacerbated by other database engines that treat NULL and zero-length strings or zeroes as identical. NULL indicates the absence of a value. Careful consideration must be given to the nullability of a column. As well as specifying a data type for a column, you specify whether or not a value needs to be present. (Often this is referred to as whether or not a column is mandatory.) Look at the NULL and NOT NULL declarations on the slide and decide why each decision might have been made.
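The correct test for NULL can be sketched against a Production.Product-style table with a nullable Color column (the table is assumed, not defined here):

```sql
-- Correct: tests whether Color is in the NULL state
SELECT Name, Color
FROM Production.Product
WHERE Color IS NULL;

-- Incorrect: under default ANSI_NULLS behavior this predicate is never true,
-- so the query returns no rows at all
SELECT Name, Color
FROM Production.Product
WHERE Color = NULL;
```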
Key Points
In this demonstration, you will see how to:
- Work with IDENTITY values
- Work with NULL
- Insert GUIDs into a table
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
3. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
4. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_02_PRJ\6232B_02_PRJ.ssmssln and click Open.
5. Open and execute the 00 Setup.sql script file from within Solution Explorer.
6. Open the 11 Demonstration 1A.sql script file.
7. Follow the instructions contained within the comments of the script file.
Lesson 2
Working with Character Data
In the last lesson, you saw that the most basic types of data that get stored in database systems today are numbers, dates, and strings. There are a range of data types that can be used for each of these. You also looked at the available range of data types that can be used for numeric and date-related data. In this lesson, you will now look at the other very common category of data: the string-related data types. Another common class of design and implementation errors relates to collations. Collations define how string data is sorted. In this lesson, you will also see how collations are defined and used.
Objectives
After completing this lesson, you will be able to:
- Explain the role of Unicode encoding
- Use character data types
- Work with collations
Understanding Unicode
Key Points
Traditionally, most computer systems stored one character per byte. This only allowed for 256 different character values, which is not enough to store characters from many languages.
They can then enter the number beside the character to select the intended word. It might not seem important to an English-speaking person but given that the first option means "horse", the second option is like a question mark, and the third option means "mother", there is definitely a need to select the correct option!
Character Groups
An alternate way to enter the characters is via radical groupings. Please note the third character in the screenshot above. The left-hand part of that character, 女, means "woman". Rather than entering English-like characters (that could be quite unfamiliar to the writers), select a group of characters based on what is known as a radical. If you select the woman radical, you see this list:
Please note that the character representing "mother" is the first character on the second line. For this sort of keyboard entry to work, the characters must be in appropriate groups, not just stored as one large sea of characters. An additional complexity is that the radicals themselves are also in groups. You can see in the screenshot that the woman radical was part of the third group of radicals.
Unicode
In the 1980s, work was done to determine how many bytes are required to be able to hold all characters from all languages but also store them in their correct groupings. The answer was three bytes. You can imagine that three was not an ideal number for computing and at the time users were mostly working with 2 byte (that is, 16 bit) computer systems. Unicode introduced a two-byte character set that attempts to fit the values from the three bytes into two bytes. Inevitably then, trade-offs had to occur. Unicode allows any combination of characters that are drawn from any combination of languages to exist in a single document. There are multiple encodings for Unicode, including UTF-7, UTF-8, UTF-16, and UTF-32. (UTF stands for Unicode Transformation Format.) SQL Server currently implements double-byte characters for its Unicode implementation. For string literal values, an N prefix on a string allows the entry of double-byte characters into the string rather than just single-byte characters. (N stands for "National" in "National Character Set".) When working with character strings, the LEN function returns the number of characters (Unicode or not) whereas DATALENGTH returns the number of bytes. Question: Do you recognize either of the phrases on the slide?
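A sketch of the N prefix and the LEN/DATALENGTH distinction:

```sql
DECLARE @Narrow varchar(20) = 'Hello',    -- single-byte characters
        @Wide   nvarchar(20) = N'Hello';  -- N prefix: double-byte (Unicode) characters

SELECT LEN(@Narrow)        AS NarrowChars, -- 5 characters
       DATALENGTH(@Narrow) AS NarrowBytes, -- 5 bytes
       LEN(@Wide)          AS WideChars,   -- 5 characters
       DATALENGTH(@Wide)   AS WideBytes;   -- 10 bytes (two bytes per character)
```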
Key Points
SQL Server provides a range of string data types for storing characters. They differ by length and by character encoding.
Note the trailing spaces. The char and nchar data types are not very useful for data that varies in length but are ideal for short strings that are always the same length, for example, state codes in the U.S.A.
of great benefit when the length of the strings being stored varies and it also avoids the need to trim the right-hand-side of the string in most applications. The varchar and nvarchar data types are limited to 8000 and 4000 characters, respectively. This is roughly what fits in a data page in a SQL Server database.
Understanding Collations
Key Points
Collations in SQL Server are used to control the code page that is used to store non-Unicode data and the rules that govern how SQL Server sorts and compares character values.
Code Pages
It was mentioned earlier that computer systems traditionally stored one byte per character. This allowed for 256 possible values, with a range from 0 to 255. The values from 0 to 31 were reserved for "control characters" such as backspace (character 8) and tab (character 9). Character 32 was allocated for a space and so on, up to the Delete character, which was assigned the value 127. For values above 127 though, standards were initially not very clear. It was common to store characters such as line drawing characters or European characters with accents or graves in these codes. In fact, a number of computer systems only used 7 bits to store characters instead of 8 bits. (As an example, the DEC10 system from Digital Equipment Corporation stored 5 characters of 7 bits each per 36-bit computer "word". It used the final bit as a parity check bit.) Problems did arise when different vendors used the upper characters for different purposes. In the 1970s, it was not uncommon to type a character on your screen and see a different character when that document was printed, as the screen and the printer were using different characters for the values above 127. A number of standard character sets that described what should be in the upper code values did appear. The MSDOS operating system categorized these as "code pages". What a code page really defines is which characters are used for the values from 128 to 255. Both the operating systems and SQL Server support a range of code pages.
The elements of this are:

- SQL: The actual string "SQL".
- SortRules: A string identifying the alphabet or language whose rules are applied when dictionary sorting is specified.
- Pref: An optional string that indicates an uppercase preference.
- CodePage: One to four digits that define the code page used by the collation. For curious historic reasons, CP1 specifies code page 1252, but for all others the number indicates the code page; for example, CP850 specifies code page 850.
- ComparisonStyle: Either BIN for binary or a combination of case and accent sensitivity. CI is case-insensitive, CS is case-sensitive. AI is accent-insensitive, AS is accent-sensitive.
As an example, the collation SQL_Latin1_General_Pref_CP850_CI_AS indicates that it is a SQL collation, Latin1_General is the alphabet being used, there is a preference for upper case, the code page is 850, and sorting is performed case-insensitive and accent-sensitive. Windows collations have similar naming but with fewer fields. For example, the Windows collation Latin1_General_CI_AS refers to Latin1_General as the alphabet being used, case-insensitive and accent-sensitive.
Collation Issues
The main issues with collations occur when you try to compare values that are stored with different collations. It is possible to set default collations for servers, databases, and even columns. When comparing values from different collations, you need to then specify which collation (which could be yet another collation) will be used for the comparison. Another use of this is shown in the example in the slide. In this case, you are forcing the query to perform a case-sensitive comparison between the string '%ball%' and the value in the column. If the column contained 'Ball', it would not then match. Question: What are the code page and sensitivity values for the collation SQL_Scandinavian_Cp850_CI_AS?
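A sketch of the case-sensitive comparison described above, assuming a Production.Product-style table:

```sql
-- Force a case-sensitive, accent-sensitive comparison regardless of the
-- column's default collation; a row named 'Ball' would not match
SELECT p.Name
FROM Production.Product AS p
WHERE p.Name COLLATE Latin1_General_CS_AS LIKE '%ball%';
```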
Key Points
In this demonstration, you will see:
- How to work with Unicode and non-Unicode data
- How to work with collations
Demonstration Steps
1. If Demonstration 1A was not performed: Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_02_PRJ\6232B_02_PRJ.ssmssln and click Open. Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lesson 3
Converting Data Types
Now that you have learned about the most common data types, you need to consider that data is not always already in an appropriate data type. For example, you may have received data from another system and you may need to convert the data from one data type to another. You can control how this is done or you can try to let SQL Server do the conversions implicitly. There are a number of issues that can arise when making conversions between data types. You will learn about these issues in this lesson.
Objectives
After completing this lesson, you will be able to:
- Use the CAST function
- Use the CONVERT function
- Allow implicit data conversion to occur
- Describe some common issues that arise during conversion
Using CAST
Key Points
Available data is not always in the data type that it is needed in. For example, you may need to return a number as a string value. This requires converting the data from one data type to another. The CAST function is used to convert data. CAST is based on the SQL standards.
CAST
You can use the CAST function to explicitly convert data from one type to another. Look at the expression:
CAST(ListPrice AS varchar(12))
This expression takes the ListPrice column (likely to be a decimal value) and casts it as a string value. Note that you are not exhibiting control over how the decimal value will be formatted as a string, only that it will be converted to a string. An error is returned if the cast is not possible or is not supported. Question: Give an example of a situation where you would need to cast a number as a string.
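One such situation is concatenating a numeric column into a string, sketched here against a Production.Product-style table (the table is assumed, not defined here):

```sql
-- ListPrice is numeric, so it must be cast before it can be concatenated
SELECT Name + N' (list price: ' + CAST(ListPrice AS varchar(12)) + N')' AS Description
FROM Production.Product;
```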
Using CONVERT
Key Points
CAST performs basic type casting but does not allow control over how the type cast will be performed. CONVERT is a SQL Server extension to the SQL language that is more powerful than CAST.
CONVERT
While CAST is a good option wherever it can be used as it is a SQL standard option, at times more control is needed on how a conversion is carried out than what CAST allows for. CONVERT allows you to specify the target data type and the source data element but also allows you to specify a style for the conversion. For example, the expression:
CONVERT(varchar(8),SYSDATETIME(),112)
would return the current date formatted as YYYYMMDD. Style 112 specifies the format YYYYMMDD. Note that for date-related styles, removing 100 from the value will give you the equivalent style without the century. So the expression:
CONVERT(varchar(6),SYSDATETIME(),12)
would return the current date formatted as YYMMDD. Note: The style value is often assumed to just relate to character-based output but it can also be used for determining how an incoming string is parsed.
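A sketch of both directions: formatting output with a style, and using a style to parse an incoming string:

```sql
SELECT CONVERT(varchar(8), SYSDATETIME(), 112) AS ISODate,   -- YYYYMMDD
       CONVERT(varchar(6), SYSDATETIME(), 12)  AS ShortDate; -- YYMMDD

-- The style also determines how an incoming string is parsed:
-- style 103 is dd/mm/yyyy, so this is 4th May 2010
SELECT CONVERT(datetime, '04/05/2010', 103) AS ParsedDate;
```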
Key Points
When data isn't explicitly converted between types, implicit data conversion is attempted by SQL Server automatically. The conversion is based on data type preference.
SQL Server has no need to convert the values here as they are both int values and division is defined for int values. Question: Look at the slide examples. Suggest where implicit conversions are happening and from which data types to which other data types.
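The slide examples are not reproduced here, but a few illustrative cases of implicit conversion are:

```sql
SELECT 5 / 2;          -- 2: both operands are int, so no conversion and integer division
SELECT 5 / 2.0;        -- 2.500000: the int operand is implicitly converted to decimal
SELECT 1 + '2';        -- 3: the string '2' is implicitly converted to int

-- int has higher precedence than varchar, so SQL Server tries to convert
-- 'Result: ' to int and raises a conversion error
SELECT 'Result: ' + 1;
```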
Key Points
Data type conversion errors are commonplace. It is important to be aware of common situations that give rise to such errors.
Example Issues
Issue: Inappropriate values for the target data type
Comment: If the target data type is integer, you are not going to be able to convert most text strings to it.

Issue: Value is out of range for the target data type
Comment: Each data type has a range of values that can be stored. For example, you cannot store the value 2340280923 in an int location.

Issue: Value is truncated while being converted (sometimes silently)
Comment: As an example, consider the expression CONVERT(varchar(6),SYSDATETIME(),112). Style 112 suggests returning an 8 character string. When you convert it to a 6 character string, it is silently truncated.

Issue: Value is rounded while being converted (sometimes silently)
Comment: As an example, the datetime value '20051231 23:59:59.999' is silently rounded to the value '20060101 00:00:00.000'.

Issue: Value is changed while being converted (sometimes silently)
Comment: As an example, execute the code "SELECT 5ee" and note the output.

Issue: Assumptions are made about internal formats
Comment: Even though you might know the internal binary format of a data type (such as datetime), it is very dangerous to write code that depends on that knowledge. The internal representation could change over time.

Issue: Some datetime conversions are nondeterministic and depend on language settings
Comment: The string "2010-05-04" could be interpreted as 4th May 2010 or 5th April 2010 depending upon the language settings, when working with the datetime data type.

Issue: Some parsing issues are hard to understand
Comment: The SELECT example given for 5ee is also a good example of this.
Note that the worst of these issues tend to occur during implicit type conversions. For this and other reasons, implicit type conversions should be avoided. Attempt to control how conversions occur by making explicit conversions.
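The truncation and rounding issues above can be reproduced directly; this sketch uses the same expressions quoted in the issue list:

```sql
-- Style 112 produces an 8-character string (YYYYMMDD), but the
-- varchar(6) target silently truncates it to YYYYMM.
SELECT CONVERT(varchar(6), SYSDATETIME(), 112);

-- datetime is only accurate to about 1/300 of a second, so .999 is
-- silently rounded up into the next day.
SELECT CAST('20051231 23:59:59.999' AS datetime);
```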
Key Points
In this demonstration you will see:
- How to convert date data types explicitly
- How language settings can affect date conversions
- How data can be truncated during data type conversion
- Issues that can arise with implicit conversion
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   - In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_02_PRJ\6232B_02_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lesson 4
You have now covered the common SQL Server data types, but SQL Server also includes a number of more specialized data types that it is useful to be aware of. These data types are not used as commonly as those described earlier in the module, but they fill important roles in development.
Objectives
After completing this lesson, you will be able to:
- Work with the timestamp and rowversion data types
- Work with alias data types
- Describe other SQL Server data types
Key Points
The rowversion data type assists in creating systems that are based on optimistic concurrency. The timestamp data type has been deprecated and replaced by rowversion.
Internal Storage
rowversion holds a counter that increments across all changes in the entire database. The current rowversion value for a database can be returned from the system variable @@DBTS.
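A minimal sketch of optimistic concurrency with rowversion (the table, column names, and the sample rowversion value are invented for the example):

```sql
-- The rowversion column is populated and updated automatically;
-- it cannot be assigned a value directly.
CREATE TABLE dbo.Document
(
    DocumentID int NOT NULL PRIMARY KEY,
    Title nvarchar(100) NOT NULL,
    RowVer rowversion NOT NULL
);

-- Any insert or update in the database increments the counter.
SELECT @@DBTS AS CurrentDatabaseRowversion;

-- Optimistic concurrency check: the UPDATE only succeeds if the row
-- has not been changed since the RowVer value was originally read.
UPDATE dbo.Document
SET Title = N'Revised title'
WHERE DocumentID = 1
  AND RowVer = 0x00000000000007D1;  -- value read earlier (hypothetical)
```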
Key Points
Alias data types are names given to subtypes of existing system built-in (or intrinsic) types. The use of alias types can help promote consistency in database designs.
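For example, a sketch (the type name, base type, and table are invented for illustration) showing how an alias type is created and then used in a table definition:

```sql
-- Alias type: every column declared as dbo.EmailAddress gets
-- the same underlying definition, promoting consistency.
CREATE TYPE dbo.EmailAddress FROM nvarchar(254) NOT NULL;

CREATE TABLE dbo.Subscriber
(
    SubscriberID int NOT NULL PRIMARY KEY,
    Email dbo.EmailAddress
);
```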
Key Points
SQL Server also offers a number of more specialized data types. These are shown in the table on the slide; they are important but less commonly used than the data types described earlier.
Key Points
In this demonstration you will see how to:
- Use the rowversion data type
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   - In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_02_PRJ\6232B_02_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 41 Demonstration 4A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   - Right-click 623XB-MIA-DC and click Start.
   - Right-click 623XB-MIA-DC and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   - Right-click 623XB-MIA-SQL and click Start.
   - Right-click 623XB-MIA-SQL and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   - On the Action menu, click the Ctrl-Alt-Delete menu item.
   - Click Switch User, and then click Other User.
   - Log on using the following credentials: User name: AdventureWorks\Administrator, Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_02_PRJ\6232B_02_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
A new developer has sought your assistance in deciding which data types to use for three new tables she is designing. She presents you with a list of organizational data requirements for each table. You need to decide on appropriate data types for each item. You also need to export some data from your existing system; during the export, some of the columns need to be converted to alternate data types. If you have time, there is another issue that your manager would like you to address. She is concerned about a lack of consistency in the use of data types across the organization. At present, she is concerned about email addresses and phone numbers. You need to review the existing data types being used in the MarketDev database for this and create new data types that can be used in applications, to avoid this inconsistency.
Supporting Documentation
Table 1: PhoneCampaign

Description:
- Which campaign this relates to
- The prospect that was contacted
- When contact was first attempted with the prospect
- Comments related to the contact that was made, if it was made
- When contact was actually made with the prospect
- Outcome of the contact: sale, later follow-up, or no interest
- Value of any sale made (up to two decimal places)

Table 2: Opportunity

Description:
- Name of the opportunity
- Which prospect this opportunity relates to
- Stage the sale is at: Lead, Qualification, Proposal Development, Contract Negotiations, Complete, Lost
- Date that the opportunity was raised
- Probability of success
- Rating: Cold, Warm, Hot
- Estimated closing date
- Estimated revenue
- Delivery address

Table 3: SpecialOrder

Description:
- Which prospect this order is for
- External supplier of the item
- Description of the item
- Quantity required (some quantities are whole numbers, some are fractional with up to three decimal places)
- Date of order
- Promised delivery date
- Actual delivery date
- Special requirements (any comments related to the special order)
- Quoted price per unit (up to two decimal places)
Query Requirement 1: A list of products from the Marketing.Product table that are no longer sold, that is they have a SellEndDate. The output should show ProductID, ProductName, and SellEndDate formatted as a string based on the following format: YYYYMMDD. The output should appear similar to:
Query Requirement 2: A list of products from the Marketing.Product table that have demographic information. The output should show ProductID, ProductName, and Demographics formatted as nvarchar(1000) instead of XML. The output should appear similar to:
Task 2: Review the first query requirement and write a SELECT statement to meet the requirement
- Review the supporting documentation for details of the first query requirement.
- Write a SELECT statement that returns the required data. The output should look similar to the supplied sample.
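One possible solution sketch for the first query requirement, assuming the Marketing.Product column names given in the supporting documentation:

```sql
-- Products no longer sold have a non-NULL SellEndDate;
-- style 112 formats the date as YYYYMMDD.
SELECT ProductID,
       ProductName,
       CONVERT(varchar(8), SellEndDate, 112) AS SellEndDate
FROM Marketing.Product
WHERE SellEndDate IS NOT NULL;
```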
Task 3: Review the second query requirement and write a SELECT statement to meet the requirement
- Review the supporting documentation for details of the second query requirement.
- Write a SELECT statement that returns the required data. The output should look similar to the supplied sample.

Results: After this exercise, you should have created two new SELECT statements as per the design requirements.
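A possible sketch for the second query requirement, again assuming the column names from the supporting documentation:

```sql
-- CAST converts the xml Demographics column to nvarchar(1000);
-- products without demographic information are filtered out.
SELECT ProductID,
       ProductName,
       CAST(Demographics AS nvarchar(1000)) AS Demographics
FROM Marketing.Product
WHERE Demographics IS NOT NULL;
```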
Challenge Exercise 3: Designing and Creating Alias Data Types (Only if time permits)
Scenario
In this exercise, your manager is concerned about a lack of consistency in the use of data types across the organization. At present, she is concerned about email addresses and phone numbers. You need to review the existing data types being used in the MarketDev database for this and create new data types that can be used in applications, to avoid this inconsistency.

The main tasks for this exercise are as follows:
1. Investigate the storage of phone numbers and email addresses.
2. Create a data type to be used to store phone numbers.
3. Create a data type to be used to store email addresses.
Review Questions
1. What is the uniqueidentifier data type commonly used for?
2. What are common errors that can occur during data type conversion?
3. What date is present in a datetime data type if a value is assigned to it that only contains a time?
Best Practices
1. Always choose an appropriate data type for columns and variables rather than using generic data types such as string or xml, except where they are necessary.
2. When defining columns, always specify the nullability rather than leaving it to the system default settings.
3. Avoid the use of any of the deprecated data types.
4. In the majority of situations, do not store currency values in approximate numeric data types such as real or float.
5. Use the Unicode-based data types where there is any chance of needing to store non-English characters.
6. Use the sysname data type in administrative scripts involving database objects rather than nvarchar(128).
Module 3
Designing and Implementing Tables
Contents:
Lesson 1: Designing Tables
Lesson 2: Working with Schemas
Lesson 3: Creating and Altering Tables
Lab 3: Designing and Implementing Tables
Module Overview
In relational database management systems (RDBMS), user and system data is stored in tables. Each table comprises a set of rows that describe entities and a set of columns that hold the attributes of an entity. For example, a Customer table would have columns such as CustomerName and CreditLimit and a row for each customer. In SQL Server, tables are contained within schemas that are very similar in concept to folders that contain files in the operating system. Designing tables is often one of the most important roles undertaken by a database developer because incorrect table design leads to the inability to query the data efficiently. Once an appropriate design has been created, it is then important to know how to correctly implement the design.
Objectives
After completing this module, you will be able to:
- Design tables
- Work with schemas
- Create and alter tables
Lesson 1
Designing Tables
The most important aspect of designing tables involves determining what data each column will hold. As all organizational data is held within database tables, it is critical to store the data with an appropriate structure. The best practices for table and column design are often represented by a set of rules known as "normalization" rules. In this lesson, you will learn the most important aspects of normalized table design along with the appropriate use of primary and foreign keys. In addition, you will learn to work with the system tables that are supplied when SQL Server is installed.
Objectives
After completing this lesson, you will be able to:
- Describe what a table is
- Normalize data
- Describe common normalization forms
- Explain the role of primary keys
- Explain the role of foreign keys
- Work with system tables
What is a Table?
Key Points
Relational databases store data about entities in tables that are defined by columns and rows. Rows represent entities and columns define the attributes of the entities. Tables have no predefined order and can be used as a security boundary.
Tables
Relational database management systems are not the only type of database system available but they are the most commonly deployed type of database management system at present. In formal relational database management system terminology, tables are referred to as "relations". Tables store data about entities such as customers, suppliers, orders, products, and sales. Each row of a table represents the details of a single entity, for example a single customer, supplier, order, product, or sale. Columns define the information that is being held about each entity. For example, a Product table might have columns such as ProductID, Size, Name, and UnitWeight. Each of these columns is defined with a specific data type. For example, the UnitWeight of a product might be allocated a decimal(18,3) data type.
Naming Conventions
Strong disagreement exists in the industry over naming conventions for tables. The use of prefixes (such as tblCustomer or tblProduct) is widely discouraged. Prefixes were widely used in higher-level programming languages before the advent of strong typing (that is the use of strict data types) but are now rare. The main reason for this is that names should represent the entities, not how they are stored. For example, during a maintenance operation, it might become necessary to replace a table with a view or vice-versa. This could lead to views named tblProduct or tblCustomer, when trying to avoid breaking existing code.
Another area of strong disagreement relates to whether table names should be singular or plural. For example, should a table that holds the details of a customer be called Customer or Customers? Proponents of plural naming argue that the table holds the details of many customers whereas proponents of singular naming argue that it is common to expose these tables via object models in higher-level languages and that the use of plural names complicates this process. SQL Server system tables (and views) have plural names. The argument is not likely to ever be resolved either way and is not a SQL language-specific problem. For example, an array of customers in a higher-level language could sensibly be called "Customers" yet referring to a single customer via "Customers[49]" seems awkward. The most important aspect of naming conventions is that you should adopt a naming convention that you can work with and apply it consistently.
Security
Tables can be used as security boundaries in that users can be assigned permissions at the table level. Note also though that SQL Server supports the assignment of permissions at the column level as well as at the table level. Row-level security is not available for tables but can be implemented via a combination of views, stored procedures, and/or triggers.
Row Order
Tables are containers for rows but they do not define any order for the rows that they contain. When selecting rows from a table, a user should specify the order that the rows should be returned in, but only if the output order matters. SQL Server may have to expend additional sorting effort to return rows in a given order and it is important that this effort is only expended when necessary.
Normalizing Data
Key Points
Normalization is a systematic process that is used to improve the design of databases.
Normalization
Edgar F. Codd (August 23, 1923 – April 18, 2003) was a British scientist who is widely regarded as having invented the relational model. This model underpins the development of relational database management systems. Codd introduced the concept of normalization and helped the concept evolve over many years, through a series of "normal forms". Codd introduced 1st normal form in 1970, followed by 2nd normal form, and then 3rd normal form in 1971. Since that time, higher forms of normalization have been introduced by theorists, but most database designs today are considered to be "normalized" if they are in 3rd normal form.
Intentional Denormalization
Not all databases should be normalized. It is common to intentionally denormalize databases for performance reasons or for ease of end-user analysis. As an example, dimensional models that are widely used in data warehouses (such as the data warehouses commonly used with SQL Server Analysis Services) are intentionally designed to not be normalized. Tables might also be denormalized to avoid the need for time-consuming calculations or to minimize physical database design constraints such as locking.
Key Points
In general, normalizing a database design leads to an improved design. Most common table design errors in database systems can be avoided by applying normalization rules.
Normalization
Normalization is used to:
- free the database of modification anomalies
- minimize redesign when the structure of the database needs to be changed
- ensure the data model is intuitive to users
- avoid any bias towards particular forms of querying
While there is disagreement on the interpretation of these rules, general agreement exists on most common symptoms of violating the rules.
Key Points
In this demonstration you will see common normalization errors
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_03_PRJ\6232B_03_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.
Primary Keys
Key Points
A primary key is a form of constraint applied to a table. A candidate key is used to identify a column or set of columns that can be used to uniquely identify a row. A primary key is chosen from any potential candidate keys.
Primary Key
A primary key must be unique and cannot be NULL. Primary keys are a form of constraint. (Constraints are discussed later in this course). Consider a table that holds an EmployeeID and a NationalIDNumber, along with the employee's name and personal details. The EmployeeID and the NationalIDNumber are likely to both be possible candidate keys. In this case, the EmployeeID column would be the primary key. It may be necessary to combine multiple columns into a key before it can be used to uniquely identify a row. In formal database terminology, no candidate key is more important than any other candidate key. When tables are correctly normalized though, they will usually have only a single candidate key that could be used as a primary key. However, this is not always the case. Ideally, keys used as primary keys should not change over time.
Question: What is an advantage of using a natural key? Question: What is a disadvantage of using a natural key? Question: What might be an appropriate primary key for the Owner table mentioned in the previous demonstration?
Foreign Keys
Key Points
A Foreign Key is used to establish references or relationships between tables.
Foreign Keys
It is common to need to hold the value of the primary key (or another unique key) from one table as a column in another table. For example, a CustomerOrders table might include a CustomerID column. A foreign key reference is used to ensure that any CustomerID entered into the CustomerOrders table does in fact exist in the Customers table. In SQL Server, the reference is only checked if the column holding the foreign key value is not NULL.
Self-Referencing Tables
A table can hold a foreign key reference to itself. For example, an Employees table might contain a ManagerID column. An employee's manager is also an employee. A foreign key reference can be made from the ManagerID column of the Employees table to the EmployeeID column in the same table.
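The self-reference described above can be sketched as follows (the table and column names are those used in the text; the exact definition is illustrative):

```sql
CREATE TABLE dbo.Employees
(
    EmployeeID int NOT NULL PRIMARY KEY,
    EmployeeName nvarchar(100) NOT NULL,
    ManagerID int NULL,
    -- Self-referencing foreign key: every manager must also
    -- exist as an employee in this same table. A NULL ManagerID
    -- (for example, the most senior employee) is not checked.
    CONSTRAINT FK_Employees_Manager
        FOREIGN KEY (ManagerID) REFERENCES dbo.Employees (EmployeeID)
);
```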
Reference Checking
Referenced keys cannot be updated or deleted unless options that cascade the changes to related tables are used. For example, you cannot change the ID for a customer when there are orders in a CustomerOrders table that reference that customer's ID. However, at the time you define the foreign key constraint, you can specify that changes to the referenced value are permitted and will cascade. This means that the ID for the customer would be changed in both the Customers table and in the CustomerOrders table. Tables might also include multiple foreign key references. For example, an Orders table might refer to a Customers table and a Products table.
Terminology
Foreign keys are referred to as being used to "enforce referential integrity". Foreign keys are a form of constraint and will be covered in more detail in a later module. The ANSI SQL 2003 definition refers to self-referencing tables as having "recursive foreign keys". Question: What would be an example of multiple foreign keys in a table referencing the same table?
Key Points
System Tables are the tables that are provided directly by the SQL Server database engine. They should not be directly modified.
Lesson 2
Working with Schemas
SQL Server 2005 introduced a change to how schemas are used. Since that version, schemas are used as containers for objects such as tables, views, and stored procedures. Schemas can be particularly helpful in providing a level of organization and structure when large numbers of objects are present in a database. Security permissions can also be assigned at the schema level rather than individually on the objects contained within the schemas. Doing this can greatly simplify the design of system security requirements.
Objectives
After completing this lesson, you will be able to:
- Describe the role of a schema
- Describe the role of object name resolution
- Create schemas
What is a Schema?
Key Points
Schemas are used to contain objects and to provide a security boundary for the assignment of permissions.
Schemas
In SQL Server, schemas are essentially used as containers for objects, somewhat like a folder is used to hold files at the operating system level. Since their introduction in SQL Server 2005, schemas can be used to contain objects such as tables, stored procedures, functions, types, views, etc. Schemas form a part of the multi-part naming convention for objects. In SQL Server, an object is formally referred to by a name of the form: Server.Database.Schema.Object
Security Boundary
Schemas can be used to simplify the assignment of permissions. An example of applying permissions at the schema level would be to assign the EXECUTE permission on a schema to a user. The user could then execute all stored procedures within the schema. This simplifies the granting of permissions as there is no need to set up individual permissions on each stored procedure.
When a database is upgraded from an earlier version of SQL Server, the upgrade process will automatically create a schema with the same name as each existing object owner, so that applications that use multi-part names will continue to work.
Key Points
It is important to use at least two-part names when referring to objects in SQL Server code such as stored procedures, functions, and views.
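For example, consider a query that uses only a single-part name (the column list matches the two-part example later in this topic):

```sql
-- Single-part name: SQL Server must work out which schema's
-- Product table is intended, based on name resolution rules.
SELECT ProductID, Name, Size FROM Product;
```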
More than one Product table could exist in separate schemas. When single part names are used, SQL Server must then determine which Product table is being referred to. Most users have default schemas assigned but not all users have these. Default schemas are not assigned to Windows groups or to users based on certificates but they do apply to users created from standard Windows and SQL Server logins. Users without default schemas are considered to have the dbo schema as their default schema. When locating an object, SQL Server will first check the user's default schema. If the object is not found, SQL Server will then check the dbo schema to try to locate the object. It is important to include schema names when referring to objects instead of depending upon schema name resolution, such as in this modified version of the previous statement:
SELECT ProductID, Name, Size FROM Production.Product;
Apart from rare situations, using multi-part names leads to more reliable code that does not depend upon default schema settings.
Creating Schemas
Key Points
Schemas are created with the CREATE SCHEMA command. This command can also include the definition of objects to be created within the schema at the time the schema is created.
CREATE SCHEMA
Schemas have both names and owners. In the first example shown in the slide, a schema named Reporting is being created. It is owned by the user Terry. While both schemas and the objects contained in the schemas have owners and the owners do not have to be the same, having different owners for schemas and the objects contained within them can lead to complex security issues.
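The first slide example described above can be sketched as follows (the schema name Reporting and the owner Terry come from the text; the second schema and its table are an invented illustration of including an object at creation time):

```sql
-- Create the Reporting schema, owned by the user Terry.
CREATE SCHEMA Reporting AUTHORIZATION Terry;
GO

-- A schema can also be created together with objects it will
-- contain. The schema and table here are hypothetical examples.
CREATE SCHEMA KnowledgeBase AUTHORIZATION Terry
    CREATE TABLE Article
    (
        ArticleID int NOT NULL PRIMARY KEY,
        Title nvarchar(100) NOT NULL
    );
GO
```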
Key Points
In this demonstration you will see how to:
- Create a schema
- Create a schema with an included object
- Drop a schema
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   - In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_03_PRJ\6232B_03_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lesson 3
Creating and Altering Tables
Now that you understand the core concepts surrounding the design of tables, this lesson introduces you to the T-SQL syntax used when defining, modifying, or dropping tables. Temporary tables are a special form of tables that can be used to hold temporary result sets. Computed columns are used to create columns where the value held in the column is automatically calculated, either from expressions involving other columns from the table or from the execution of functions.
Objectives
After completing this lesson, you will be able to:
- Create tables
- Drop tables
- Alter tables
- Use temporary tables
- Work with computed columns
Creating Tables
Key Points
Tables are created with the CREATE TABLE statement. This statement is also used to define the columns associated with the table and to identify constraints such as primary and foreign keys.
CREATE TABLE
When creating tables with the CREATE TABLE statement, make sure that you supply both a schema name and a table name. If the schema name is not specified, the table will be created in the default schema of the user executing the statement. This could lead to the creation of scripts that are not robust in that they could generate different schema designs when executed by different users.
Nullability
You should specify NULL or NOT NULL for each column in the table. SQL Server has defaults for this and they can be changed via the ANSI_NULL_DEFAULT setting. Scripts should always be designed to be as reliable as possible and specifying nullability in DDL scripts helps improve script reliability.
Primary Key
A primary key constraint can be specified beside the name of a column if only a single column is included in the key, or after the list of columns. It must be included after the list of columns when more than one column is included in the key as shown in the following example where the SalesID is only unique for each SalesRegisterID:
CREATE TABLE PetStore.SalesReceipt
(
    SalesRegisterID int NOT NULL,
    SalesID int NOT NULL,
    CustomerID int NOT NULL,
    SalesAmount decimal(18,2) NOT NULL,
    PRIMARY KEY (SalesRegisterID, SalesID)
);
Primary keys are constraints and are more fully described along with other constraints in a later module. Question: In the example shown, could the OwnerName column have been used as the primary key instead of a surrogate key?
Dropping Tables
Key Points
The DROP TABLE statement is used to drop tables from a database. If a table is referenced by a foreign key constraint, it cannot be dropped.
DROP TABLE
When dropping a table, all permissions, constraints, indexes, and triggers that are related to the table are also dropped. Code that references the table (such as code contained within stored procedures, functions, and views) is not dropped. This can lead to "orphaned" code that refers to non-existent objects. SQL Server 2008 introduced a set of dependency views that can be used to locate code that references non-existent objects. The details of both referenced and referencing entities are available from the sys.sql_expression_dependencies view. Referenced and referencing entities are also available separately from the sys.dm_sql_referenced_entities and sys.dm_sql_referencing_entities dynamic management views. (Views are discussed later in the course). Question: Why would a reference to a table stop it from being dropped?
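Before dropping a table, the dependency views mentioned above can be used to look for code that references it; a sketch (the table name reuses the PetStore.Owner example from this lesson):

```sql
-- Find code (stored procedures, functions, views) that
-- references the PetStore.Owner table before dropping it.
SELECT referencing_schema_name, referencing_entity_name
FROM sys.dm_sql_referencing_entities(N'PetStore.Owner', N'OBJECT');

-- Permissions, constraints, indexes, and triggers on the table
-- are dropped along with it; referencing code is not.
DROP TABLE PetStore.Owner;
```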
Altering Tables
Key Points
Altering a table is useful as permissions on the table are retained along with the data in the table. If you drop and recreate the table, both the permissions on the table and the data in the table are lost. If the table is referenced by a foreign key, it cannot be dropped. However, it can be altered.
ALTER TABLE
Tables are modified using the ALTER TABLE statement. This statement can be used to add or drop columns and constraints, or to enable or disable constraints and triggers. (Constraints and triggers are discussed in later modules). Note that the syntax for adding and dropping columns is inconsistent. The word COLUMN is required for DROP but is not used for ADD; it is not even permitted as an optional keyword for ADD. If the word COLUMN is omitted in a DROP, SQL Server assumes that a constraint is being dropped. In the slide example, the PreferredName column is added to the PetStore.Owner table. Later, the PreferredName column is dropped from the PetStore.Owner table. Note the difference in syntax regarding the word COLUMN.
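The slide example described above can be sketched as follows (the column's data type is an assumption, as it is not given in the text):

```sql
-- ADD does not use the word COLUMN.
ALTER TABLE PetStore.Owner ADD PreferredName nvarchar(30) NULL;

-- DROP requires the word COLUMN; without it, SQL Server
-- assumes a constraint is being dropped.
ALTER TABLE PetStore.Owner DROP COLUMN PreferredName;
```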
Key Points
In this demonstration you will see how to:
- Create tables
- Alter tables
- Drop tables
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   - In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_03_PRJ\6232B_03_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: Why should you ensure that you specify the nullability of a column when designing a table?
Temporary Tables
Key Points
Temporary tables are used to hold temporary result sets within a user's session. They are deleted automatically when they go out of scope, which typically occurs when the code in which they were created completes or aborts.
Temporary Tables
Temporary tables are very similar to other tables, except that they are visible only to the session that created them, and only within the same scope (and sub-scopes). They are automatically deleted when the session ends or when they go out of scope. Although temporary tables are deleted when they go out of scope, they should be explicitly deleted as soon as they are no longer required. Temporary tables are often created in code using the SELECT INTO statement. A table is created as a temporary table if its name has a pound (#) prefix. A global temporary table is created if the name has a double-pound (##) prefix. Global temporary tables are visible to all users and are not commonly used.
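A short sketch of both creation styles, reusing the PetStore schema from earlier examples; the PetStore.Pet table and all column names are assumptions:

```sql
-- Local temporary table (# prefix), visible only to the creating session.
CREATE TABLE #OwnerTotals
(
    OwnerID int NOT NULL,
    TotalSpent decimal(18,2) NOT NULL
);

-- Temporary tables are also commonly created with SELECT INTO.
SELECT OwnerID, COUNT(*) AS PetCount
INTO #PetCounts
FROM PetStore.Pet
GROUP BY OwnerID;

-- Delete explicitly when no longer required, rather than waiting
-- for the tables to go out of scope.
DROP TABLE #PetCounts;
DROP TABLE #OwnerTotals;
```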
The overuse of temporary tables is a common T-SQL coding error that often leads to performance and resource issues. Extensive use of temporary tables is often an indicator of poor coding techniques, often due to a lack of set-based logic design.
Key Points
In this demonstration you will see how to:
Work with temporary tables
Demonstration Steps
1. If Demonstration 1A was not performed:
   Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_03_PRJ\6232B_03_PRJ.ssmssln and click Open.
   Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 32 Demonstration 3B.sql script file.
3. Follow the instructions contained within the comments of the script file.
Computed Columns
Key Points
Computed columns are columns that are derived from other columns or from the result of executing functions.
Computed Columns
Computed columns were introduced in SQL Server 2000. In the example shown in the slide, the YearOfBirth column is calculated by executing the DATEPART function to extract the year from the DateOfBirth column in the same table. In the example, you can also see the word PERSISTED added to the definition of the computed column. Persisted computed columns were introduced in SQL Server 2005.

A non-persisted computed column is calculated every time a SELECT operation occurs on the column. A persisted computed column is calculated when the data in the row is inserted or updated; the data in the column is then selected like the data in any other column.

The core difference between persisted and non-persisted computed columns relates to when the computational performance impact is exerted. Non-persisted computed columns work best for data that is modified regularly but selected rarely. Persisted computed columns work best for data that is modified rarely but selected regularly. In most business systems, data is read much more regularly than it is updated. For this reason, most computed columns would perform best as persisted computed columns.
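The slide's computed column might be defined as follows; the table name and the other column definitions are assumptions:

```sql
CREATE TABLE PetStore.OwnerDetails
(
    OwnerID int NOT NULL,
    DateOfBirth date NOT NULL,
    -- Calculated when the row is inserted or updated and then stored,
    -- rather than being recalculated on every SELECT.
    YearOfBirth AS DATEPART(year, DateOfBirth) PERSISTED
);
```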
Key Points
In this demonstration you will see how to work with computed columns.
Demonstration Steps
1. If Demonstration 1A was not performed:
   Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_03_PRJ\6232B_03_PRJ.ssmssln and click Open.
   Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 33 Demonstration 3C.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   Right-click 623XB-MIA-DC and click Start.
   Right-click 623XB-MIA-DC and click Connect.
   In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   Right-click 623XB-MIA-SQL and click Start.
   Right-click 623XB-MIA-SQL and click Connect.
   In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click on the Revert toolbar icon. If you are prompted to confirm that you want to revert, click Revert.
6. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   On the Action menu, click the Ctrl-Alt-Delete menu item.
   Click Switch User, and then click Other User.
   Log on using the following credentials:
   i. User name: AdventureWorks\Administrator
   ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_03_PRJ\6232B_03_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
A business analyst from your organization has provided you with a first-pass at a schema design for some new tables being added to the MarketDev database. You need to provide an improved schema design based on good design practices and an appropriate level of normalization. The business analyst was also confused about when data should be nullable. You need to decide about nullability for each column in your improved design. The new tables need to be isolated in their own schema. You need to create the required schema DirectMarketing. The owner of the schema should be dbo. When the schema has been created, if you have available time, you need to create the tables that have been designed.
Name
Comments

Table2: TVAdvertisement
Name                           Data Type
TV_Station                     nvarchar(15)
City                           nvarchar(25)
CostPerAdvertisement           float
TotalCostOfAllAdvertisements   float
NumberOfAdvertisements         varchar(4)
Date_Of_Advertisement_1        varchar(12)
Time_Of_Advertisement_1        int
Date_Of_Advertisement_2        varchar(12)
Time_Of_Advertisement_2        int
Date_Of_Advertisement_3        varchar(12)
Time_Of_Advertisement_3        int
Date_Of_Advertisement_4        varchar(12)
Time_Of_Advertisement_4        int
Date_Of_Advertisement_5        varchar(12)
Time_Of_Advertisement_5        int

Table3: Campaign_Response
Name                     Data Type
ResponseOccurredWhen     datetime
RelevantProspect         int
RespondedHow             varchar(8) (phone, email, fax, letter)
ChargeFromReferrer       float
RevenueFromResponse      float
ResponseProfit
Review Questions
1. What is a primary key?
2. What is a foreign key?
3. What is meant by the term "referential integrity"?
Best Practices
1. All tables should have primary keys.
2. Foreign keys should be declared within the database in almost all circumstances. Often developers will suggest that the application will ensure referential integrity. Experience shows that this is a poor option. Databases are often accessed by multiple applications. Bugs are also easy to miss when they first start to occur.
Module 4
Designing and Implementing Views
Contents:
Lesson 1: Introduction to Views 4-3
Lesson 2: Creating and Managing Views 4-13
Lesson 3: Performance Considerations for Views 4-22
Lab 4: Designing and Implementing Views 4-27
Module Overview
Views are a type of virtual table because the result set of a view is not usually saved in the database. Views can simplify the design of database applications by abstracting the complexity of the underlying objects. Views can also provide a layer of security. Users can be given permission to access a view without permission to access the objects that the view is constructed on.
Objectives
After completing this lesson, you will be able to:
Explain the role of views in database development
Implement views
Describe the performance related impacts of views
Lesson 1
Introduction to Views
In this lesson, you will gain an understanding of views and how they are used. You will also investigate the system views that are supplied by the SQL Server engine. A view is effectively a named SELECT query. Unlike ordinary tables (base tables) in a relational database, a view is not part of the physical schema: it is a dynamic, virtual table computed or collated from data in the database. Effective use of views in database system design helps improve performance and manageability. In this module you will learn about views, the different types of views, and how to use them.
Objectives
After completing this lesson, you will be able to:
Describe views
Describe the different types of view provided by SQL Server
Explain the advantages offered by views
Work with system views
Work with dynamic management views
What is a View?
Key Points
A view can be thought of as a named virtual table that is defined through a SELECT statement. To an application, a view behaves very similarly to a table. Question: Have you ever used views in designing Microsoft SQL Server database systems? If so, why did you use them?
Views
The data accessible through a view is not stored in the database as a distinct object, except in the case of indexed views. (Indexed views are described later in this module). What is stored in the database is the SELECT statement. The data tables referenced by the SELECT statement are known as the base tables for the view. As well as being based on tables, views can reference other views. Queries against views are written the same way that queries are written against tables.
For example, users might only be permitted to see data for their own region or state. A view could be created that limits the rows returned to those for a particular state or region. Question: Why would you limit which columns are returned by a view?
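Such a row-limiting view might be sketched as follows, assuming a hypothetical Sales.Customer table with a State column:

```sql
CREATE VIEW Sales.WACustomers
AS
SELECT CustomerID, CustomerName, State
FROM Sales.Customer
WHERE State = 'WA';
GO

-- The view is then queried exactly like a table.
SELECT CustomerID, CustomerName
FROM Sales.WACustomers;
```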
Types of Views
Key Points
There are four basic types of view: standard views, system views (including dynamic management views), indexed views and partitioned views (including distributed partitioned views).
Standard Views
Standard views combine data from one or more base tables (or views) into a new virtual table. From the base tables (or views), particular columns and rows can be returned. Any computations, such as joins or aggregations, are done during query execution for each query referencing the view, if the view is not indexed.
System Views
System views are provided with SQL Server and show details of the system catalog or aspects of the state of SQL Server. Dynamic Management Views (DMVs) were introduced in SQL Server 2005 and have been enhanced in every version since. DMVs provide dynamic information about the state of SQL Server, such as information about the current sessions or the queries those sessions are executing.
Indexed Views
Indexed views materialize the view through the creation of a clustered index on the view. This is usually done to improve query performance. Complex joins or lengthy aggregations can be avoided at execution time by pre-calculating the results. Indexed views are discussed in Module 6.
Partitioned Views
Partitioned views form a union of data from multiple tables into a single view. Distributed partitioned views are formed when the tables that are being combined by a union operation are located on separate instances of SQL Server. Note: Indexed views and partitioned views are described later in this module. Question: What advantages would you assume that views would provide?
Advantages of Views
Key Points
Views are generally used to focus, simplify, and customize the perception each user has of the database.
Advantages of Views
Views can provide a layer of abstraction in database development. They allow a user to focus on a subset of data that is relevant to them, or that they are permitted to work with. Users do not need to deal with the complex queries that might be involved within the view; they are able to query the view as they would query a table.

Views can be used as security mechanisms by allowing users to access data through the view, without granting the users permissions to directly access the underlying base tables of the view.

Many external applications are unable to execute stored procedures or T-SQL code but can select data from tables or views. Creating a view allows isolating the data that is needed for these export functions.

Views can be used to provide a backward-compatible interface to emulate a table that previously existed but whose schema has changed. For example, if a Customer table has been split into two tables, CustomerGeneral and CustomerCredit, a Customer view could be created over the two new tables to allow existing applications to query the data without requiring the applications to be altered.

Reporting applications often need to execute complex queries to retrieve the report data. Rather than embedding this logic in the reporting application, a view could be created to supply the data required by the reporting application, in a much simpler format.

Question: If tables can be replaced by views (and vice-versa) during maintenance, what does that suggest to you about the naming of views and tables?
Key Points
SQL Server provides information about its configuration via a series of system views. These views also provide metadata describing both the objects you create in the database and those provided by SQL Server.
System Views
Catalog views are primarily used to retrieve metadata about tables and other objects in databases. While it would be possible to retrieve much of this information directly from system tables, the use of catalog views is the supported mechanism for doing this.

Earlier versions of SQL Server provided a set of virtual tables that were exposed as system views. For backward compatibility, a set of "compatibility" views have been provided. These views, however, are deprecated and should not be used for new development work.

The International Organization for Standardization (ISO) has standards for the SQL language. As each database engine vendor uses different methods of storing and accessing metadata, a standard mechanism was designed. This interface is provided by the views in the INFORMATION_SCHEMA schema. The most commonly used INFORMATION_SCHEMA views are shown in the following table:

Common INFORMATION_SCHEMA Views
INFORMATION_SCHEMA.CHECK_CONSTRAINTS
INFORMATION_SCHEMA.COLUMNS
INFORMATION_SCHEMA.PARAMETERS
INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
INFORMATION_SCHEMA.ROUTINE_COLUMNS
INFORMATION_SCHEMA.ROUTINES
INFORMATION_SCHEMA.TABLE_CONSTRAINTS
INFORMATION_SCHEMA.TABLE_PRIVILEGES
INFORMATION_SCHEMA.TABLES
INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
INFORMATION_SCHEMA.VIEW_TABLE_USAGE
INFORMATION_SCHEMA.VIEWS

Question: Give an example of why you would want to interrogate a catalog view.
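The ISO-standard views are queried like any other views. A sketch (the Sales.Customer table name is only illustrative):

```sql
-- All base tables in the current database.
SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE';

-- Column metadata for one table.
SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'Sales'
  AND TABLE_NAME = 'Customer';
```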
Key Points
Dynamic Management Views are commonly just called DMVs. These views provide a relational method for querying the internal state of a SQL Server instance.
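As a sketch, the following query joins two commonly used DMVs to show current user sessions and any requests they are executing:

```sql
SELECT s.session_id,
       s.login_name,
       r.command,
       r.status
FROM sys.dm_exec_sessions AS s
LEFT JOIN sys.dm_exec_requests AS r
    ON r.session_id = s.session_id
WHERE s.is_user_process = 1;
```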
Key Points
In this demonstration you will see how to:
Query system views
Query dynamic management views
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_04_PRJ\6232B_04_PRJ.ssmssln and click Open. Open and execute the 00 Setup.sql script file from within Solution Explorer.
4. Open the 11 Demonstration 1A.sql script file.
5. Follow the instructions contained within the comments of the script file.
Question: When are the values returned by most dynamic management views reset?
Lesson 2
Creating and Managing Views

In the previous lesson, you learned about the role of views. In this lesson you will learn how to create, drop, and alter views. You will also learn how views and the objects that they are based on have owners, and how this can impact the use of views. You will see how to find information about existing views and how to obfuscate the definitions of views.
Objectives
After completing this lesson, you will be able to:
Create views
Drop views
Alter views
Explain the concept of ownership chaining and how it applies to views
List the available sources of information about views
Work with updatable views
Obfuscate view definitions
Creating Views
Key Points
To create a view you must be granted permission to do so by the database owner. Creating a view involves associating a name with a SELECT statement.
CREATE VIEW
If you specify the option WITH SCHEMABINDING, the underlying tables cannot be changed in a way that would affect the view definition. If you later decide to index the view, the WITH SCHEMABINDING option must be used. Views can be based on other views (instead of base tables) up to 32 levels of nesting. Care should be exercised in nesting views deeply as it can become difficult to understand the complexity of the underlying code and it can be difficult to troubleshoot performance problems related to the views. Views have no natural output order. Queries that access the views should specify the order for the returned rows. The ORDER BY clause can be used in a view but only to satisfy the needs of a clause such as the TOP clause. Expressions that are returned as columns need to be aliased. It is also common to define column aliases in the SELECT statement within the view definition but a column list can also be provided after the name of the view. You can see this in the following code example:
CREATE VIEW HumanResources.EmployeeList (EmployeeID, FamilyName, GivenName) AS SELECT EmployeeID, LastName, FirstName FROM HumanResources.Employee;
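The WITH SCHEMABINDING option described above can be added as shown in this sketch, a schema-bound variant of the view above (renamed to avoid a clash); note that schema-bound views must reference objects using two-part names:

```sql
CREATE VIEW HumanResources.EmployeeListBound (EmployeeID, FamilyName, GivenName)
WITH SCHEMABINDING
AS
SELECT EmployeeID, LastName, FirstName
FROM HumanResources.Employee;
```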
Question: Why is the ORDER BY clause ever permitted in a view definition if it doesn't impact the output order of the rows?
Dropping Views
Key Points
Dropping a view removes the definition of the view and all permissions associated with the view.
DROP VIEW
Even if a view is recreated with exactly the same name as a view that has been dropped, permissions formerly associated with the view are removed. It is important to record why views are created and to then drop them if they are no longer required for the purpose they were created. Retaining view definitions that are not in use adds to the work required when reorganizing the structure of databases. If a view was created with the SCHEMABINDING option, the view will need to be dropped before changes can be made to the structure of the underlying tables. The DROP VIEW statement supports the dropping of multiple views via a comma-delimited list as shown in the following code sample:
DROP VIEW Sales.WASales, Sales.CTSales, Sales.CASales;
Altering Views
Key Points
After a view is defined, you can modify its definition without dropping and re-creating the view.
ALTER VIEW
The ALTER VIEW statement modifies a previously created view. (This includes indexed views that are discussed in the next lesson). The main advantage of using ALTER VIEW is that any permissions associated with the view are retained. Altering a view also involves less code than dropping and recreating a view.
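For example, the EmployeeList view created earlier could be redefined while keeping its permissions; the MiddleName source column is an assumption:

```sql
-- Redefine the view; permissions granted on it are retained.
ALTER VIEW HumanResources.EmployeeList (EmployeeID, FamilyName, GivenName, SecondName)
AS
SELECT EmployeeID, LastName, FirstName, MiddleName
FROM HumanResources.Employee;
```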
Key Points
When querying a view, there needs to be an unbroken chain of ownership from the view to the underlying tables unless the user executing the query also has permissions on the underlying table or tables.
Ownership Chaining
One of the key reasons for using views is to provide a layer of security abstraction so that access is given to views and not to the underlying table or tables. For this mechanism to function correctly, an unbroken ownership chain must exist. In the example shown in the slide, a user John has no access to a table that is owned by Nupur. If Nupur creates a view or stored procedure that accesses the table and gives John permission to the view, John can then access the view and, through it, the data in the underlying table. However, if Nupur creates a view or stored procedure that accesses a table owned by Tim and grants John access to the view or stored procedure, John would not be able to use the view or stored procedure, even if Nupur has access to Tim's table, because of the broken ownership chain. Two options could be used to correct this situation: Tim could own the view or stored procedure instead of Nupur. John could be granted permission to the underlying table. (This is often undesirable).
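The working case from the slide can be sketched with hypothetical object names; because Nupur owns both the view and the underlying table, a single grant on the view is enough:

```sql
-- Nupur owns both the view and the table it selects from,
-- so the ownership chain is unbroken.
GRANT SELECT ON Nupur.CustomerView TO John;

-- No grant on the underlying Nupur.Customer table is required;
-- John can read the data only through the view.
```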
Key Points
Views are queried the same way that ordinary tables are queried. However, you may also want to find information about how a view is defined or about its properties.
Updatable Views
Key Points
It is possible to update data in the base tables by updating a view.
Updatable Views
Updates that are performed on views cannot affect columns from more than one base table. (To work around this restriction, you can create INSTEAD OF triggers. These triggers are discussed in Module 15). While views can contain aggregated values from the base tables, these columns cannot be updated nor can columns involved in grouping operations such as GROUP BY, HAVING or DISTINCT. It is possible to modify a row in a view in such a way that the row would no longer belong to the view. For example, you could have a view that selected rows where the State column contained the value WA. You could then update the row and set the State column to the value CA. To avoid the chance of this, you can specify the WITH CHECK OPTION clause when defining the view. It will check during data modifications that any row modified would still be returned by the same view. Data that is modified in a base table via a view still needs to meet the restrictions on those columns such as nullability, constraints, defaults, etc. as if the base table was modified directly. This can be particularly challenging if all the columns in the base table are not present in the view. For example, an INSERT on the view would fail if the base table it was based upon required mandatory columns that were not exposed in the view and that did not have DEFAULT values.
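The WA example above can be sketched as follows, assuming a hypothetical Sales.Customer table:

```sql
CREATE VIEW Sales.CheckedWACustomers
AS
SELECT CustomerID, CustomerName, State
FROM Sales.Customer
WHERE State = 'WA'
WITH CHECK OPTION;
GO

-- Fails: the modified row would no longer be returned by the view.
UPDATE Sales.CheckedWACustomers
SET State = 'CA'
WHERE CustomerID = 1;
```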
Key Points
Database developers often want to protect the definitions of their database objects. The WITH ENCRYPTION clause can be included when defining or altering a view.
WITH ENCRYPTION
The WITH ENCRYPTION clause provides limited obfuscation of the definition of a view. It is important to keep copies of the source code for views. This is even more important when the view is created with the WITH ENCRYPTION option. Encrypted code (including the code definitions of views) makes it harder to perform problem diagnosis and query tracing and tuning. The encryption provided is not very strong. Many third-party utilities exist that can decrypt the source, so you should not depend on this to protect your intellectual property if doing so is critical to you. Question: Do you think you might be deploying encrypted views in your organization?
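The clause is applied at definition time, as in this sketch (object names are illustrative):

```sql
-- The view's definition is obfuscated in the catalog; keep the
-- source under version control, since the obfuscation is weak
-- and reversible by third-party tools.
CREATE VIEW Sales.ObfuscatedCustomerList
WITH ENCRYPTION
AS
SELECT CustomerID, CustomerName
FROM Sales.Customer;
```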
Key Points
In this demonstration you will see how to:
Create a view
Query a view
Query the definition of a view
Use the WITH ENCRYPTION option
Drop a view
Generate a script for an existing view
Demonstration Steps
1. If Demonstration 1A was not performed:
   Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   In the Connect to Server window, type Proseware and click Connect.
   From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_04_PRJ\6232B_04_PRJ.ssmssln and click Open.
   Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lesson 3
Performance Considerations for Views

Now that you understand why views are important and know how to create them, it is important to understand the potential performance impacts of using views. In this lesson, you will see how views are incorporated directly into the execution plans of queries that they are used in. You will see the effect and potential disadvantages of nesting views and see how performance can be improved in some situations. Finally, you will see how the data from multiple tables can be combined into a single view, even if those tables are on different servers.
Objectives
After completing this lesson, you will be able to:
Explain the dynamic resolution process for views
List the most important considerations when working with nested views
Create indexed views
Describe the purpose of partitioned views
Key Points
Standard views are expanded and incorporated into the queries that they are referenced in. The objects that they reference are resolved at execution time.
Key Points
While views can reference other views, careful consideration needs to be given when doing this.
Nested Views
Views can be nested up to 32 levels. Layers of abstraction are often regarded as desirable when designing code in any programming language. Views can help provide this. The biggest concern with nested views is that it is very easy to create code that is very difficult for the query optimizer to work with, without realizing that this is occurring. Nested views can make it much harder to troubleshoot performance problems and more difficult to understand where complexity is arising in code.
Partitioned Views
Key Points
Partitioned views allow the data in a large table to be split into smaller member tables. The data is partitioned between the member tables based on ranges of data values in one of the columns.
Partitioned Views
Data ranges for each member table in a partitioned view are defined in a CHECK constraint specified on the partitioning column. A UNION ALL statement is used to combine selects of all the member tables into a single result set. In a local partitioned view, all participating tables and the view reside on the same instance of SQL Server. In most cases, table partitioning should be used instead of local partitioned views. In a distributed partitioned view, at least one of the participating tables resides on a different (remote) server. Distributed partitioned views can be used to implement a federation of database servers. Good planning and testing is crucial as major performance problems can arise if the design of the partitioned views is not appropriate.
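A minimal local partitioned view might look like the following sketch; the member tables and the OrderYear partitioning column are hypothetical:

```sql
-- Each member table carries a CHECK constraint on the partitioning column.
CREATE TABLE Sales.Orders2010
(
    OrderID int NOT NULL,
    OrderYear int NOT NULL CHECK (OrderYear = 2010),
    CONSTRAINT PK_Orders2010 PRIMARY KEY (OrderID, OrderYear)
);

CREATE TABLE Sales.Orders2011
(
    OrderID int NOT NULL,
    OrderYear int NOT NULL CHECK (OrderYear = 2011),
    CONSTRAINT PK_Orders2011 PRIMARY KEY (OrderID, OrderYear)
);
GO

-- UNION ALL combines the member tables into a single partitioned view.
CREATE VIEW Sales.AllOrders
AS
SELECT OrderID, OrderYear FROM Sales.Orders2010
UNION ALL
SELECT OrderID, OrderYear FROM Sales.Orders2011;
```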
Key Points
In this demonstration you will see how:
Views are eliminated in query plans
Views are expanded and integrated into the outer query before being optimized
Demonstration Steps
1. If Demonstration 1A was not performed:
   Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio.
   In the Connect to Server window, type Proseware and click Connect.
   From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_04_PRJ\6232B_04_PRJ.ssmssln and click Open.
   Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   Right-click 623XB-MIA-DC and click Start.
   Right-click 623XB-MIA-DC and click Connect.
   In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   Right-click 623XB-MIA-SQL and click Start.
   Right-click 623XB-MIA-SQL and click Connect.
   In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click on the Revert toolbar icon. If you are prompted to confirm that you want to revert, click Revert.
6. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   On the Action menu, click the Ctrl-Alt-Delete menu item.
   Click Switch User, and then click Other User.
   Log on using the following credentials:
   i. User name: AdventureWorks\Administrator
   ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_04_PRJ\6232B_04_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
A new web-based stock promotion system is being rolled out. Your manager is very concerned about providing access from the web-based system directly to the tables in your database. She has requested you to design some views that the web-based system could connect to instead. Details of organizational contacts are held in a number of tables. The relationship management system being used by the account management team needs to be able to gain access to these contacts. However, they need a single view that comprises all contacts. You need to design, implement and test the required view. Finally, if you have time, a request has been received from the new Marketing team that the catalog description of the product models should be added to the AvailableModels view. They would appreciate you modifying the view to provide this additional column.
Supporting Documentation
View1: OnlineProducts
ViewColumn      SourceColumn
ProductID       ProductID
ProductName     ProductName
ProductNumber   ProductNumber
Color           Color (note N/A should be returned when NULL)
Availability    Based on DaysToManufacture (0 = Instock, 1 = Overnight, 2 = Fast, Other Values = Call)
Note: Based on table Marketing.Product. Rows should only appear if the product has begun to be sold and is still being sold. (Derive this from SellStartDate and SellEndDate.)

View2: AvailableModels

ViewColumn      SourceColumn
ProductID       ProductID
ProductName     ProductName
ProductModelID  ProductModelID
ProductModel    ProductModel
Based on tables Marketing.Product and Marketing.ProductModel. Rows should only appear if the product has at least one model, has begun to be sold and is still being sold. (Derive this from SellStartDate and SellEndDate.)

View3: Contacts

ViewColumn   Source Column in Prospect   Source Column in Salesperson
ContactID    ProspectID                  SalespersonID
FirstName    FirstName                   FirstName
MiddleName   MiddleName                  MiddleName
LastName     LastName                    LastName
ContactRole  PROSPECT                    SALESPERSON
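As a sketch of how the Contacts specification above might be implemented (the table and column names are taken from the supporting documentation, but the Marketing schema for the Salesperson table and the literal role values are assumptions), a UNION ALL view could look like this:

```sql
CREATE VIEW Marketing.Contacts
AS
SELECT p.ProspectID AS ContactID,
       p.FirstName,
       p.MiddleName,
       p.LastName,
       N'PROSPECT' AS ContactRole       -- fixed label per the specification
FROM Marketing.Prospect AS p
UNION ALL
SELECT s.SalespersonID,
       s.FirstName,
       s.MiddleName,
       s.LastName,
       N'SALESPERSON'
FROM Marketing.Salesperson AS s;
```

UNION ALL is used rather than UNION because the two source tables hold distinct kinds of contacts, so there is no need to pay for duplicate elimination.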
Review Questions
1. How does SQL Server store the view in the database?
2. What is a Standard non-indexed view?
3. What is an unbroken ownership chain?
Best Practices
1. Use views to focus data for users.
2. Avoid nesting many layers within views.
3. Avoid ownership chain problems within views.
4. Ensure consistent connection SET options when intending to index views.
Module 5
Planning for SQL Server 2008 R2 Indexing
Contents:
Lesson 1: Core Indexing Concepts
Lesson 2: Data Types and Indexes
Lesson 3: Single Column and Composite Indexes
Lab 5: Planning for SQL Server Indexing
Module Overview
An index is a collection of pages associated with a table. Indexes are used to improve the performance of queries or enforce uniqueness. Before learning to implement indexes, it is important to understand how they work, how effective different data types are when used within indexes, and how indexes can be constructed from multiple columns.
Objectives
After completing this module, you will be able to:
- Explain core indexing concepts
- Describe the effectiveness of each data type commonly used in indexes
- Plan for single column and composite indexes
Lesson 1: Core Indexing Concepts
While it is possible for SQL Server to read all the pages in a table when calculating the results of a query, doing so is often highly inefficient. Indexes can be used to point to the location of required data and to minimize the need for scanning entire tables. In this lesson, you will learn how indexes are structured and learn the key measures associated with the design of indexes. Finally, you will see how indexes can become fragmented over time.
Objectives
After completing this lesson, you will be able to:
- Describe how SQL Server accesses data
- Describe the need for indexes
- Explain the concept of B-Tree index structures
- Explain the concepts of index selectivity, density and depth
- Explain why index fragmentation occurs
Key Points
SQL Server can access data in a table by reading all the pages of the table (known as a table scan) or by using index pages to locate the required rows.
Indexes
Whenever SQL Server needs to access data in a table, it makes a decision about whether to read all the pages of the table or whether there are one or more indexes on the table that would reduce the amount of effort required to locate the required rows. Queries can always be resolved by reading the underlying table data. Indexes are not required, but accessing data by reading large numbers of pages is usually considerably slower than methods that use appropriate indexes.

On occasion, SQL Server will create its own temporary indexes to improve query performance. However, doing so is up to the optimizer and beyond the control of the database administrator or programmer, so these temporary indexes will not be discussed in this module. Temporary indexes are only used to improve a query plan when no suitable index already exists.

In this module, you will consider standard indexes created on tables. SQL Server also includes other types of index:
- Integrated full-text search is a special type of index that provides flexible searching of text.
- Spatial indexes are used with the GEOMETRY and GEOGRAPHY data types.
- Primary and secondary XML indexes assist when querying XML data.
Each of these other index types is discussed in later modules in this course. Question: When might a table scan be more efficient than using an index?
Key Points
Indexes are not described in ANSI SQL definitions. Indexes are considered to be an implementation detail. SQL Server uses indexes for improving the performance of queries and for implementing certain constraints.
A Useful Analogy
At this point, it is useful to consider an analogy that might be easier to relate to. Consider a physical library. Most libraries store books in a given order, which is basically an alphabetical order within a set of defined categories. Note that even when you store the books in alphabetical order, there are various ways that this could be done. The order of the books could be based on the name of the book or the name of the author. Whichever option is chosen makes one form of access easy and does not help other methods of access. For example, if books were stored in book name order, how would you locate books written by a particular author? Indexes assist with this type of problem. Question: Which different ways might you want to locate books in a physical library?
Index Structures
Key Points
Tree structures are well known for providing rapid search capabilities for large numbers of entries in a list.
Index Structures
Indexes in database systems are often based on tree structures. Binary trees are simple structures where, at each level, a decision is made to navigate left or right. This style of tree can quickly become unbalanced and less useful. SQL Server indexes are instead based on a form of self-balancing tree. Whereas binary trees have at most two children per node, SQL Server indexes can have a large number of children per node. This improves the efficiency of the indexes and avoids the need for excessive depth within an index. Depth is defined as the number of levels from the top node (called the root node) to the bottom nodes (called leaf nodes).
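The depth of an existing index can be inspected directly. A minimal sketch using the sys.dm_db_index_physical_stats dynamic management function (the table name Marketing.Prospect and the assumption that index_id 1 is its clustered index are illustrative):

```sql
-- Inspect the depth and per-level page counts of a clustered index (index_id = 1).
-- 'DETAILED' mode reads all pages, so use it on small tables or test systems only.
SELECT index_depth,      -- number of levels from root to leaf
       index_level,      -- 0 = leaf level; higher numbers are closer to the root
       page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(),                           -- current database
         OBJECT_ID(N'Marketing.Prospect'),  -- assumed table name
         1,                                 -- index_id 1 = the clustered index
         NULL,                              -- all partitions
         'DETAILED');
```

Even for tables with millions of rows, index_depth is typically only 3 or 4, which illustrates how flat these structures are.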
Key Points
When designing indexes, three core concepts are important: selectivity, density and index depth.
Selectivity is a measure of how well a predicate narrows down the rows in a table; a highly selective lookup, such as finding the books of a single author in the library's author index, returns only a small proportion of the rows. Now imagine doing the same for a range of authors, such as one third of all the authors. You quickly reach a point where it would be quicker to just scan the whole library and ignore the author index, rather than running backwards and forwards between the index and the bookcases.

Density is a measure of the lack of uniqueness of the data in a table. A dense table is one that has a high number of duplicates.

Index depth is a measure of the number of levels from the root node to the leaf nodes. Users often imagine that SQL Server indexes are quite deep, but the reality is quite different. The large number of children that each node in the index can have produces a very flat index structure. Indexes with only 3 or 4 levels are very common.
Index Fragmentation
Key Points
Index fragmentation is the inefficient use of pages within an index. Fragmentation occurs over time as data is modified.
Index Fragmentation
For operations that read data, indexes perform best when each page of the index is as full as possible. While indexes may initially start full (or relatively full), modifications to the data in the indexes can cause the need to split index pages. From our physical library analogy, imagine a fully populated library with full bookcases. What occurs when a new book needs to be added? If the book is added to the end of the library, the process is easy but if the book needs to be added in the middle of a full bookcase, there is a need to readjust the bookcase.
Detecting Fragmentation
SQL Server provides a measure of fragmentation in the sys.dm_db_index_physical_stats dynamic management view. The avg_fragmentation_in_percent column shows the percentage of fragmentation.
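For example, fragmentation for all indexes on one table can be listed as follows (a sketch; the table name is an assumption, and 'LIMITED' mode is used to keep the scan cheap):

```sql
-- List fragmentation for every index on the assumed table Marketing.Prospect.
SELECT i.name AS index_name,
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'Marketing.Prospect'), NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

Fragmentation percentages are only meaningful for indexes with a reasonable number of pages; very small indexes often report high values that can safely be ignored.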
SQL Server Management Studio also provides details of index fragmentation on the Fragmentation page of the Properties window for each index (for example, for indexes in the AdventureWorks2008R2 database).
Key Points
In this demonstration you will see how to identify fragmented indexes.
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_05_PRJ\6232B_05_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.
Question: How might solid state disk drives change concerns around fragmentation?
Lesson 2: Data Types and Indexes
Not all data types work equally well as components of indexes. In this lesson, you will learn how effective a number of common data types are, when used within indexes. This will assist you in choosing data types when designing indexes.
Objectives
After completing this lesson, you will be able to:
- Describe the effectiveness of numeric data when used in indexes
- Describe the effectiveness of character data when used in indexes
- Describe the effectiveness of date-related data when used in indexes
- Describe the effectiveness of GUID data when used in indexes
- Describe the effectiveness of BIT data when used in indexes
- Explain how computed columns can be indexed
Key Points
Numeric data types tend to produce highly-efficient indexes. Exact numeric types are the most efficient.
Key Points
While it might seem natural to base indexes on character data, indexes constructed on character data tend to be less efficient than those constructed on numeric data. However, character-based indexes are both common and useful.
Key Points
Date-related data types make good keys within indexes.
Key Points
While GUID values can be relatively efficient in indexes, operations on those indexes can lead to fragmentation problems and inefficiency.
Key Points
BIT columns are highly efficient in indexes. There is a common misconception that they are not useful but many valid scenarios exist for the use of BIT data type within indexes.
Key Points
Indexing a computed column can be highly efficient. It can also assist with improving the performance of poorly designed databases.
Note that SQL Server's query optimizer may ignore the index on a computed column, even if the requirements shown are met. The requirement for determinism and precision means that for a given set of input values, the same output values would always be returned. For example, the function SYSDATETIME() returns the current date and time whenever it is called. Its output would not be considered deterministic. You may want to create an index on a computed column when the results are queried or reported often. For example, a retail store may want to report on sales by day of the week (Sunday, Monday, Tuesday, etc.). You can create a computed column that determines the day of the week based on the date of the sale and then index that computed column. SQL Server 2005 introduced the ability to persist computed columns. Rather than calculating the value every time a SELECT operation is performed, the value can be calculated and stored whenever an INSERT
or UPDATE occurs. This is useful for data that is not updated frequently but is selected frequently. Indexes can be created on the persisted computed column.
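The day-of-week reporting example can be sketched as follows. The table dbo.Sale and its columns are hypothetical. Note that DATENAME and DATEPART with the weekday datepart are nondeterministic (they depend on the session language and DATEFIRST settings), so a deterministic expression is used so that the column can be persisted and indexed:

```sql
-- Hypothetical sales table for illustration.
CREATE TABLE dbo.Sale
(
    SaleID   int IDENTITY(1,1) PRIMARY KEY,
    SaleDate date NOT NULL,
    Amount   decimal(18,2) NOT NULL
);
GO

-- Deterministic day-of-week: days since Monday 1900-01-01, modulo 7.
-- 0 = Monday ... 6 = Sunday. PERSISTED stores the value at INSERT/UPDATE time
-- instead of recalculating it for every SELECT.
ALTER TABLE dbo.Sale
    ADD SaleDayOfWeek AS
        (DATEDIFF(day, CONVERT(date, '19000101', 112), SaleDate) % 7) PERSISTED;
GO

-- The persisted computed column can now be indexed for day-of-week reporting.
CREATE INDEX IX_Sale_SaleDayOfWeek ON dbo.Sale (SaleDayOfWeek);
```

CONVERT with an explicit style (112 here) keeps the string-to-date conversion deterministic, which is what makes the computed column eligible for persistence and indexing.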
Physical Analogy
From our physical library analogy, a persisted computed column for a book could be imagined as a label that is placed on the book that records the number of pages in the book. Nothing about the book itself changes when the label is placed on it but you now don't have to pick the book up and count the number of pages in it, if you need to make a decision based on the number of pages in the book. An index could then be created based on the value on the label similarly to how an index could be created on the name of the author. Question: If a column in a database mostly held character values but occasionally (30 rows out of 50,000 rows in the table) holds a number, how could you quickly locate a row with a specific numeric value?
Lesson 3: Single Column and Composite Indexes
The indexes discussed so far have been based on data from single columns. Indexes can also be based on the data from multiple columns. Indexes can also be constructed in ascending or descending order. This lesson investigates these concepts and the effects they have on index design along with details of how SQL Server maintains statistics on the data contained within indexes.
Objectives
After completing this lesson, you will be able to:
- Describe the differences between single column and composite indexes
- Describe the differences between ascending and descending indexes
- Explain how SQL Server keeps statistics on indexes
Key Points
Indexes can be constructed on multiple columns rather than on single columns. Multi-column indexes are known as composite indexes.
In our physical library analogy, consider a query that requires the location of books by a publisher within a specific release year. While a publisher index would be useful for finding all the books released by the publisher, it would not help to narrow down the search to those books within the release year. Separate indexes on publisher and release year would not be as useful as a single index that contains both publisher and release year, which could be very selective.

Similarly, consider locating books by a specific author on a particular topic. An index by topic alone would be of limited value: once the correct topic was located, all the books on that topic would have to be searched to determine whether they were by the specified author. The best option would be an author index that also included details of each book's topic. In that case, a scan of the index pages for the author would be all that is required to work out which books need to be accessed.

In the absence of any other design criteria, you should typically index the most selective column first when constructing composite indexes.

Question: Why might an index on customer then order date be more or less effective than an index on order date then customer?
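Translating the guidance above into T-SQL, a composite index simply lists its key columns in priority order (the table and column names here are hypothetical):

```sql
-- Assuming CustomerID is more selective than OrderDate for the target queries,
-- it is placed first in the composite key.
CREATE INDEX IX_Orders_CustomerID_OrderDate
    ON Sales.Orders (CustomerID, OrderDate);

-- This index supports:        WHERE CustomerID = @c
-- and:                        WHERE CustomerID = @c AND OrderDate >= @d
-- but does NOT efficiently support: WHERE OrderDate >= @d on its own,
-- because OrderDate is not the leading key column.
```

The leading column determines which predicates can seek into the index, which is why column order matters even though both columns are present.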
Key Points
Each component of an index can be created in an ascending or descending order. For single column indexes, ascending and descending indexes are equally useful. For composite indexes, specifying the order of individual columns within the index might be useful.
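For example, mixing directions within a composite index can allow SQL Server to avoid a sort for an ORDER BY that mixes directions (hypothetical names again):

```sql
-- Each key column can specify ASC (the default) or DESC independently.
CREATE INDEX IX_Orders_OrderDateDesc_CustomerID
    ON Sales.Orders (OrderDate DESC, CustomerID ASC);

-- This key order and direction can satisfy, without a sort operator,
-- queries such as:
--   SELECT ... FROM Sales.Orders ORDER BY OrderDate DESC, CustomerID ASC;
```

For a single-column index the direction rarely matters, because the index can be scanned in either direction.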
Index Statistics
Key Points
SQL Server keeps statistics on indexes to assist when making decisions about how to access the data in a table.
Index Statistics
Earlier in the module, you saw that SQL Server needs to make decisions about how to access the data in a table. For each table that is referenced in a query, SQL Server might decide to read the data pages or it might decide to use an index. It is important to realize, though, that SQL Server must make this decision before it begins to execute a query. This means that it needs to have information that will assist it in making this determination. For each index, SQL Server keeps statistics that tell it how the data is distributed.
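The statistics held for an index can be examined with DBCC SHOW_STATISTICS and refreshed with UPDATE STATISTICS. A sketch (the index name IX_Prospect_FirstName is a hypothetical placeholder):

```sql
-- Show the header, density vector and histogram for one statistics object.
DBCC SHOW_STATISTICS (N'Marketing.Prospect', N'IX_Prospect_FirstName');

-- Rebuild the statistics from every row rather than from a sample.
UPDATE STATISTICS Marketing.Prospect IX_Prospect_FirstName WITH FULLSCAN;

-- Check when each statistics object on the table was last updated.
SELECT name,
       STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID(N'Marketing.Prospect');
```

The histogram rows returned by DBCC SHOW_STATISTICS are the "steps" that the optimizer uses to estimate how many rows a predicate will match.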
Physical Analogy
When discussing the physical library analogy earlier, it was mentioned that if you were looking up the books for an author, using an index that is ordered by author could be useful. However, if you were locating books for a range of authors, there would be a point at which scanning the entire library would be quicker than running backwards and forwards from the index to the shelves of books. The key issue here is that you need to know, before executing the query, how selective (and therefore useful) the indexes would be. The statistics held on indexes provide this knowledge. Question: Before starting to perform your lookup in a physical library, how would you know which way was quicker?
Key Points
In this demonstration you will see how to work with index statistics.
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_05_PRJ\6232B_05_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: Why would you not always choose to use FULLSCAN for statistics?
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   - Right-click 623XB-MIA-DC and click Start.
   - Right-click 623XB-MIA-DC and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   - Right-click 623XB-MIA-SQL and click Start.
   - Right-click 623XB-MIA-SQL and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   - On the Action menu, click the Ctrl-Alt-Delete menu item.
   - Click Switch User, and then click Other User. Log on using the following credentials:
     i. User name: AdventureWorks\Administrator
     ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_05_PRJ\6232B_05_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
You have been asked to explain the concept of index statistics and selectivity to a new developer. You will explore the statistics available on an existing index and determine how selective some sample queries would be.

One of the company developers has provided you with a list of the most important queries that will be executed by the new marketing management system. Depending upon how much time you have available, you need to determine the best column orders for indexes to support each query. Complete as many as possible within the allocated time. In later modules, you will consider how these indexes would be implemented. Each query is to be considered in isolation in this exercise.
Supporting Documentation
Query 1:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE ProspectID = 12553;
Query 2:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Arif%';
Query 3:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Alejandro%';
Query 4:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName >= 'S' ORDER BY LastName, FirstName;
Query 5:
SELECT LanguageID, COUNT(1) FROM Marketing.ProductDescription GROUP BY LanguageID;
2. Review the results. Have any autostats been generated?
3. Create manual statistics on the Color column. Call the statistics Product_Color_Stats. Use a full scan of the data when creating the statistics.
4. Re-execute the command from task 1 to see the change.
5. Using the DBCC SHOW_STATISTICS command, review the created Product_Color_Stats statistics.
6. Answer the following questions related to the Product_Color_Stats statistics:
   a. How many rows were sampled?
   b. How many steps were created?
   c. What was the average key length?
   d. How many Black products are there?
7. Execute the following command to check how accurate the generated statistics are:
8. Calculate the selectivity of each of the three queries shown:
   a. SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'A%';
   b. SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Alejandro%';
   c. SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Arif%';
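One way to calculate such a selectivity figure is to compare matching rows with the total row count. A sketch for the 'A%' predicate (Marketing.Prospect is taken from the queries above):

```sql
-- Selectivity here = matching rows / total rows; smaller values are more selective.
SELECT COUNT(CASE WHEN FirstName LIKE 'A%' THEN 1 END) AS matching_rows,
       COUNT(*)                                        AS total_rows,
       CAST(COUNT(CASE WHEN FirstName LIKE 'A%' THEN 1 END) AS float)
           / COUNT(*)                                  AS selectivity
FROM Marketing.Prospect;
```

Repeating the query with the 'Alejandro%' and 'Arif%' predicates shows how a longer prefix typically matches fewer rows and is therefore more selective.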
Query 2:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Alejandro%';
Query 3:
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Arif%';
Results: After this exercise, you will have assessed the selectivity of each of the queries.
Challenge Exercise 2: Design column orders for indexes (Only if time permits)
Scenario
One of the company developers has provided you with a list of the most important queries that will be executed by the new marketing management system. You need to determine the best column orders for indexes to support each query. In later modules, you will consider how these indexes would be implemented. Each query is to be considered in isolation in this exercise.

The main tasks for this exercise are as follows:
1. Determine which columns should be part of an index for Query 1 and the best order for the columns to support the query.
2. Determine which columns should be part of an index for Query 2 and the best order for the columns to support the query.
3. Determine which columns should be part of an index for Query 3 and the best order for the columns to support the query.
4. Determine which columns should be part of an index for Query 4 and the best order for the columns to support the query.
5. Determine which columns should be part of an index for Query 5 and the best order for the columns to support the query.
Review Questions
1. Do tables need indexes?
2. Why do constraints use indexes?
Best Practices
1. Design indexes to maximize selectivity, which leads to lower I/O.
2. In the absence of other requirements, aim to have the most selective columns first in composite indexes.
Module 6
Implementing Table Structures in SQL Server 2008 R2
Contents:
Lesson 1: SQL Server Table Structures
Lesson 2: Working with Clustered Indexes
Lesson 3: Designing Effective Clustered Indexes
Lab 6: Implementing Table Structures in SQL Server
Module Overview
One of the most important decisions that needs to be made when designing tables in SQL Server databases relates to the structure of the table. Regardless of whether or not other indexes are used to locate rows, the table itself can be structured like an index or left without such a structure. In this module, you will learn how to choose an appropriate table structure. For situations where you decide to have a specific structure in place, you will learn how to create an effective structure.
Objectives
After completing this module, you will be able to:
- Explain how tables can be structured in SQL Server databases
- Work with clustered indexes
- Design effective clustered indexes
Lesson 1: SQL Server Table Structures
There are two ways that SQL Server tables can be structured: rows can be added in any order, or rows can be kept in a defined order. In this lesson, you will investigate both options and gain an understanding of how common data modification operations are impacted by each option. Finally, you will see how unique clustered indexes are structured differently from non-unique clustered indexes.
Objectives
After completing this lesson, you will be able to:
- Describe how tables can be organized as heaps
- Explain how common operations are performed on heaps
- Detail the issues that can arise with forwarding pointers
- Describe how tables can be organized with clustered indexes
- Explain how common operations are performed on tables with clustered indexes
- Describe how unique clustered indexes are structured differently from non-unique clustered indexes
What is a Heap?
Key Points
A heap is a table that has no enforced order for either the pages within the table or for the data rows within each page.
Heaps
The simplest table structure available in SQL Server is a heap. Data rows are added to the first available location within the table's pages that has sufficient space. If no space is available, additional pages are added to the table and the rows placed in those pages. Even though no index structure exists for a heap, SQL Server tracks the available pages using an entry in an internal structure called an Index Allocation Map (IAM). Heaps are allocated index id zero in this map.
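A table created without a clustered index is a heap, which can be confirmed in sys.indexes, where a heap appears with index_id = 0 (the demo table name below is hypothetical):

```sql
-- No clustered index (and no PRIMARY KEY that would create one), so this is a heap.
CREATE TABLE dbo.HeapDemo
(
    HeapDemoID int NOT NULL,
    Payload    nvarchar(100) NULL
);
GO

-- A heap is reported with index_id = 0 and type_desc = 'HEAP'.
SELECT index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.HeapDemo');
```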
Physical Analogy
In the physical library analogy, a heap would be represented by structuring your library so that every book is just placed in any available space found that is large enough. Without any other assistance, finding a book would involve scanning one bookcase after another. Question: Why might modifying a row cause it to need to move between pages?
Operations on Heaps
Key Points
The most common operations performed on tables are INSERT, UPDATE, DELETE and SELECT operations. It is important to understand how each of these operations is affected by structuring a table as a heap.
Physical Analogy
In the library analogy, an INSERT would be executed by locating any gap large enough to hold the book and placing it there. If no space that is large enough is available, a new bookcase would be allocated and the book placed into it. This would continue unless a limit existed on the number of bookcases that the library could contain. A DELETE operation could be imagined as scanning the bookcases until the book is found, removing the book and throwing it away. More precisely, it would be like placing a tag on the book to say that it is to be thrown out the next time the library is cleaned up or space on the bookcase is needed. An UPDATE operation would be represented by replacing a book with a (potentially) different copy of the same book. If the replacement book was the same (or smaller) size as the original book, it could be placed directly back in the same location as the original book. However, if the replacement book was larger, the original book would be removed and placed into another location. The new location for the book could be in the same bookcase or in another bookcase. Question: What would be involved in finding a book in a library structured as a heap? (This would simulate a SELECT operation).
Forwarding Pointers
Key Points
When other indexes point to rows in a heap, data modification operations cause forwarding pointers to be inserted into the heap. This can cause performance issues over time.
Physical Analogy
Now imagine that the physical library was organized as a heap where books were stored in no particular order. Further imagine that three additional indexes were created in the library, to make it easier to find books by author, ISBN, and release date. As there was no order to the books on the bookcases, when an entry was found in the ISBN index, the entry would refer to the physical location of the book. The entry would include an address like "Bookcase 12 - Shelf 5 - Book 3". That is, there would need to be a specific address for each book. An update to the book that caused it to need to be moved to a different location would be problematic. One option for resolving this would be to locate all index entries for the book and update them with the new physical location. An alternate option would be to leave a note in the location where the book used to be that points to where the book has been moved to. This is what a forwarding pointer is in SQL Server. It allows rows to be updated and moved without the need to update other indexes that point to them. A further challenge arises if the book needs to be moved again. There are two ways that this could be handled: either yet another note could be left pointing to the new location, or the original note could be modified to point to the new location. Either way, the original indexes would not need to be updated. SQL Server deals with this by updating the original forwarding pointer. This way, performance does not continue to degrade by having to follow a chain of forwarding pointers.
Note that while options to rebuild indexes have been available in prior versions, the option to rebuild a table was not available. This command can also be used to change the compression settings for a table. (Page and Row Compression are an advanced topic beyond the scope of this course).
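The command referred to above is ALTER TABLE ... REBUILD, introduced in SQL Server 2008. A sketch against a hypothetical heap:

```sql
-- Rebuild the heap: reclaims space and removes forwarding pointers.
ALTER TABLE dbo.HeapDemo REBUILD;

-- The same command can change the compression setting while rebuilding.
ALTER TABLE dbo.HeapDemo REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Rebuilding a heap is the standard remedy once the forwarded_record_count reported by sys.dm_db_index_physical_stats becomes significant.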
Key Points
Rather than storing the data rows of a table as a heap, tables can be designed with an internal logical ordering. This is known as a clustered index.
Clustered Index
A table with a clustered index has a predefined order for rows within a page and for pages within the table. The order is based on a key made up of one or more columns. The key is commonly called a clustering key. Because the rows of a table can only be in a single order, there can be only a single clustered index on a table. An Index Allocation Map entry is used to point to a clustered index. Clustered indexes are always index id = 1. There is a common misconception that pages in a clustered index are "physically stored in order". While this is possible in rare situations, it is not commonly the case. If it was true, fragmentation of clustered indexes would not exist. SQL Server tries to align physical and logical order while creating an index but disorder can arise as data is modified. Index and data pages are linked within a logical hierarchy and also double-linked across all pages at the same level of the hierarchy to assist when scanning across an index.
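Creating a clustered index converts a heap into the ordered structure described above; afterwards, sys.indexes reports index_id = 1 for the table (the names below are hypothetical):

```sql
-- A table can have only one clustered index, because the rows themselves
-- are kept in clustering-key order.
CREATE CLUSTERED INDEX CIX_HeapDemo_HeapDemoID
    ON dbo.HeapDemo (HeapDemoID);

-- The table now appears with index_id = 1 and type_desc = 'CLUSTERED'.
SELECT index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.HeapDemo');
```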
Physical Analogy
In the library analogy, a clustered index is similar to storing all books in a specific order. An example of this would be to store books in ISBN (International Standard Book Number) order. Clearly, the library can only be in a single order.
Key Points
Earlier you saw how common operations were carried out on tables structured as heaps. It is important to understand how each of those operations is affected by structuring a table with a clustered index.
Physical Analogy
In a library that is ordered in ISBN order, an INSERT operation requires a new book to be placed in exactly the correct logical ISBN order. If there is space somewhere on the bookcase that is in the required position, the book can be placed into the correct location and all other books in the bookcase moved to accommodate the new book. If there is not sufficient space, the bookcase needs to be split. Note that a new bookcase would be physically placed at the end of the library but would be logically inserted into the list of bookcases. INSERT operations would be straightforward if the books were being added in ISBN order. New books could always be added to the end of the library and new bookcases added as required. In this case, no splitting is required. When an UPDATE operation is performed, if the replacement book is the same size or smaller and the ISBN has not changed, the book can just be replaced in the same place. If the replacement book is larger and the ISBN has not changed, and there is spare space within the bookcase, all other books in the bookcase can be slid along to allow the larger book to be replaced in the same spot. If there was insufficient space in the bookcase to accommodate the larger book, the bookcase would need to be split. If the ISBN of the replacement book was different to the original book, the original book would need to be removed and the replacement book treated like the insertion of a new book. A DELETE operation would involve the book being removed from the bookcase. (Again, more precisely, it would be flagged as free space but simply left in place for later removal.) When a SELECT is performed, if the ISBN is known, the required book can be quickly located by efficiently searching the library. If a range of ISBNs was requested, the books would be located by finding the first
6-10
book and continuing to collect books in order until a book is encountered that is out of range or until the end of the library is reached. Question: What sort of queries would now perform better in this library?
Key Points
SQL Server must be able to uniquely identify any row in a table. Clustered indexes can be created as unique or non-unique.
Physical Analogy
In the library analogy, a unique index is like a rule that says that no more than a single copy of any book can ever be stored. If an insert of a new book is attempted and another book is found to have the same ISBN (assuming that the ISBN was the clustering key), the insertion of the new book would be refused. It is important to understand that the comparison is made only on the clustering key. The book would be rejected for having the same ISBN, even if other properties of the book are different. A non-unique clustered index is similar to having a rule that allows more than a single book with the same ISBN. The issue is that it is likely to be desirable to track each copy of the book separately. The uniqueifier that is added by SQL Server would be like a "Copy Number" being added to books that can be duplicated. The uniqueifier is not visible to users.
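The distinction can be sketched in T-SQL. This is a minimal illustration, not from the course labs; the table and index names (dbo.Book, CX_Book_ISBN) are invented for the example:

```sql
-- Illustrative table for the library analogy.
CREATE TABLE dbo.Book
(
    ISBN  nvarchar(20)  NOT NULL,
    Title nvarchar(100) NOT NULL
);

-- Unique clustered index: inserting a second row with the same ISBN fails,
-- even if every other column differs -- the comparison uses the key alone.
CREATE UNIQUE CLUSTERED INDEX CX_Book_ISBN ON dbo.Book (ISBN);

-- A non-unique clustered index instead allows duplicate ISBNs; SQL Server
-- silently appends a hidden uniqueifier ("copy number") to duplicate key
-- values so every row remains uniquely identifiable:
-- CREATE CLUSTERED INDEX CX_Book_ISBN ON dbo.Book (ISBN);
```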
Key Points
In this demonstration you will see how to:
- Create a table as a heap
- Check the fragmentation and forwarding pointers for a heap
- Rebuild a heap
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_06_PRJ\6232B_06_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.
Lesson 2
If a decision has been made to structure a table with a clustered index, it is important to be familiar with how the indexes are created, altered or dropped. In this lesson, you will see how to perform these actions, understand how SQL Server performs them automatically in some situations and see how to incorporate free space within indexes to improve insert performance.
Objectives
After completing this lesson, you will be able to:
- Create clustered indexes
- Drop a clustered index
- Alter a clustered index
- Incorporate free space in indexes
Key Points
Clustered indexes can be created either directly using the CREATE INDEX command or automatically in some situations where a PRIMARY KEY constraint is specified on the table.
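Both paths can be sketched against the lab's Relationship.PhoneLog design (the index and constraint names CX_MediaOutlet and PK_PhoneLog are illustrative, not taken from the lab scripts):

```sql
-- Directly, with CREATE INDEX:
CREATE CLUSTERED INDEX CX_MediaOutlet
ON Relationship.MediaOutlet (MediaOutletID);

-- Or automatically: a PRIMARY KEY constraint creates a unique clustered
-- index by default when the table does not already have a clustered index.
CREATE TABLE Relationship.PhoneLog
(
    PhoneLogID          int          NOT NULL
        CONSTRAINT PK_PhoneLog PRIMARY KEY,   -- CLUSTERED by default
    SalespersonID       int          NOT NULL,
    CalledPhoneNumber   nvarchar(16) NOT NULL,
    CallDurationSeconds int          NOT NULL
);
```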
Question: What else would be added to your table if you added a non-unique clustered index to it?
Key Points
The method used to drop a clustered index depends upon the way the clustered index was created.
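As a sketch of the two cases (using illustrative names — CX_MediaOutlet for an index created directly, PK_PhoneLog for one created by a constraint):

```sql
-- Index created directly with CREATE INDEX: drop it with DROP INDEX.
DROP INDEX CX_MediaOutlet ON Relationship.MediaOutlet;

-- Index created automatically by a PRIMARY KEY constraint: DROP INDEX is
-- not permitted on an index that enforces a constraint; drop the
-- constraint instead, which removes the index with it.
ALTER TABLE Relationship.PhoneLog DROP CONSTRAINT PK_PhoneLog;
```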
Key Points
Minor modifications to indexes are permitted through the ALTER INDEX statement but it cannot be used to modify the structure of the index, including the columns that make up the key.
Disabling Indexes
While the ALTER INDEX statement includes a DISABLE option that can be applied to any index, this option is of limited use with clustered indexes. Once a clustered index is disabled, no access to the data in the table is then permitted until it is rebuilt.
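The permitted modifications can be sketched as follows (the index name CX_MediaOutlet is illustrative):

```sql
-- Maintenance operations are allowed; changing key columns is not.
ALTER INDEX CX_MediaOutlet ON Relationship.MediaOutlet REORGANIZE;
ALTER INDEX CX_MediaOutlet ON Relationship.MediaOutlet REBUILD;

-- Disabling a clustered index blocks all access to the table's data...
ALTER INDEX CX_MediaOutlet ON Relationship.MediaOutlet DISABLE;

-- ...until the index is rebuilt:
ALTER INDEX CX_MediaOutlet ON Relationship.MediaOutlet REBUILD;
```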
Key Points
The FILLFACTOR and PAD_INDEX options are used to provide free space within index pages. This can improve INSERT and UPDATE performance in some situations, but often to the detriment of SELECT operations.
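For example, the options can be combined like this (the index name is illustrative):

```sql
-- FILLFACTOR = 70 leaves 30% free space in each leaf-level page when the
-- index is created or rebuilt; PAD_INDEX = ON applies the same fill factor
-- to the intermediate-level pages as well.
CREATE CLUSTERED INDEX CX_PhoneLog_CalledPhoneNumber
ON Relationship.PhoneLog (CalledPhoneNumber)
WITH (FILLFACTOR = 70, PAD_INDEX = ON);
```

Note that the free space applies only at creation or rebuild time; pages fill up again as rows are inserted.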
Key Points
In this demonstration you will see how to:
- Create a table with a clustered index
- Detect fragmentation in a clustered index
- Correct fragmentation in a clustered index
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_06_PRJ\6232B_06_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: Why was the performance of the UPDATE statement against this table much faster than the one against the heap?
Lesson 3
When creating clustered indexes on tables, it is important to understand the characteristics of good clustering keys. Some data types work better for clustering keys than others. In this lesson, you will see how to design good clustering keys and also see how clustered indexes can be created on views.
Objectives
After completing this lesson, you will be able to:
- Describe characteristics of good clustering keys
- Explain which data types are most appropriate for use in clustering keys
- Create indexed views
- Explain considerations that must be made when working with indexed views
Key Points
Many different types of data can be used for clustering a table. While not every situation is identical, there is a set of characteristics that generally create the best clustering keys. Keys should be short, static, increasing and unique.
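An int IDENTITY column is a common way to obtain all four properties. The sketch below is an illustrative variant of the lab's Relationship.ActivityLog design (the ActivityLogID column is invented for the example):

```sql
-- ActivityLogID is short (4 bytes), static (never updated), ever-increasing
-- (new rows go to the logical end of the index, minimizing page splits),
-- and unique -- all four desirable clustering key characteristics.
CREATE TABLE Relationship.ActivityLog
(
    ActivityLogID int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    ActivityTime  datetimeoffset    NOT NULL,
    SessionID     int               NOT NULL,
    Duration      int               NOT NULL,
    ActivityType  int               NOT NULL
);
```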
Key Points
Similar to the way that some data types are generally better as components of indexes than other data types, some data types are more appropriate for use as clustering keys than others.
Key Points
Clustered indexes can be created over views. A view with a clustered index is called an "indexed view". Indexed views are the closest SQL Server equivalent to "materialized views" in other databases. Indexed views can have a profound (positive) impact on the performance of queries in particular circumstances.
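A minimal sketch of creating an indexed view over the lab's Relationship.PrintMediaPlacement table (the view and index names are invented, and the example assumes PlacementCost is declared NOT NULL, since indexed views do not allow SUM over a nullable expression):

```sql
-- The view must be created WITH SCHEMABINDING and must use two-part
-- object names; a GROUP BY view must also include COUNT_BIG(*).
CREATE VIEW Relationship.PlacementsByOutlet
WITH SCHEMABINDING
AS
SELECT MediaOutletID,
       COUNT_BIG(*)       AS PlacementCount,
       SUM(PlacementCost) AS TotalCost
FROM Relationship.PrintMediaPlacement
GROUP BY MediaOutletID;
GO

-- The first index on the view must be a unique clustered index;
-- this is what materializes the view's result set.
CREATE UNIQUE CLUSTERED INDEX CX_PlacementsByOutlet
ON Relationship.PlacementsByOutlet (MediaOutletID);
```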
Key Points
The use of indexed views is governed by a set of considerations that must be met for the views to be utilized. Premium editions of SQL Server take more complete advantage of indexed views.
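In premium editions, the optimizer can match queries to an indexed view even when the query does not mention the view. In other editions, the view must be referenced directly with the NOEXPAND table hint for its index to be used. A sketch, assuming an indexed view named Relationship.PlacementsByOutlet exists:

```sql
-- NOEXPAND tells the optimizer not to expand the view definition,
-- forcing it to read from the view's clustered index instead.
SELECT MediaOutletID, TotalCost
FROM Relationship.PlacementsByOutlet WITH (NOEXPAND)
WHERE MediaOutletID = 42;
```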
Key Points
In this demonstration you will see how to:
- Obtain details of indexes created on views
- See if an indexed view has been used in an estimated execution plan
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_06_PRJ\6232B_06_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: How could you ensure that an indexed view is selected when working with Standard Edition of SQL Server?
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:

1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   - Right-click 623XB-MIA-DC and click Start.
   - Right-click 623XB-MIA-DC and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   - Right-click 623XB-MIA-SQL and click Start.
   - Right-click 623XB-MIA-SQL and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click the Revert toolbar icon. If you are prompted to confirm that you want to revert, click Revert.
6. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   - On the Action menu, click the Ctrl-Alt-Delete menu item.
   - Click Switch User, and then click Other User.
   - Log on using the following credentials:
     i. User name: AdventureWorks\Administrator
     ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. From the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_06_PRJ\6232B_06_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
One of the most important decisions when designing a table is to choose an appropriate table structure. In this lab, you will choose an appropriate structure for some new tables required for the relationship management system.
Supporting Documentation
Table 1: Relationship.ActivityLog

    Name           Data Type        Constraint
    ActivityTime   datetimeoffset
    SessionID      int
    Duration       int
    ActivityType   int

Table 2: Relationship.PhoneLog

    Name                 Data Type        Constraint
    PhoneLogID           int              Primary Key
    SalespersonID        int
    CalledPhoneNumber    nvarchar(16)
    CallDurationSeconds  int
Table 3: Relationship.MediaOutlet

    Name             Data Type        Constraint
    MediaOutletID    int
    MediaOutletName  nvarchar(40)
    PrimaryContact   nvarchar(50)
    City             nvarchar(50)
Table 4: Relationship.PrintMediaPlacement

    Name                   Data Type        Constraint
    PrintMediaPlacementID  int              Primary Key
    MediaOutletID          int
    PlacementDate          datetime
    PublicationDate        datetime
    RelatedProductID       int
    PlacementCost          decimal(18,2)

Table 5:

    Name           Data Type          Constraint
    ApplicationID  int                IDENTITY(1,1)
    ApplicantName  nvarchar(150)
    EmailAddress   nvarchar(100)
    ReferenceID    uniqueidentifier
    Comments       nvarchar(500)
Challenge Exercise 3: Comparing the Performance of Clustered Indexes vs. Heaps (Only if time permits)
Scenario
A company developer has approached you to decide whether a new table should have a clustered index or not. Insert performance of the table is critical. You will consider the design, create a number of alternatives and compare the performance of each against a set of test workloads.

The main tasks for this exercise are as follows:

1. Review the design for Table 5.
2. Create a table based on the design with no clustered index. Call the table Relationship.Table_Heap.
3. Create a table based on the design with a clustered index on the ApplicationID column. Call the table Relationship.Table_ApplicationID.
4. Create a table based on the design with a clustered index on the EmailAddress column. Call the table Relationship.Table_EmailAddress.
5. Create a table based on the design with a clustered index on the ReferenceID column. Call the table Relationship.Table_ReferenceID.
6. Load and execute the workload script. (Note: this may take some minutes to complete. You can check its progress by viewing the Messages tab; a message is printed as each of the four sections completes. While the script is running, review the contents of the script and estimate the difference in timings you expect to see in the results.)
7. Compare the performance of each table structure.
Review Questions
1. What is the main problem with uniqueidentifiers used as primary keys?
2. Where are newly inserted rows placed when a table is structured as a heap?
Best Practices
1. Unless specific circumstances arise, most tables should have a clustered index.
2. The clustered index may or may not be placed on the table's primary key.
3. When using GUID primary keys in the logical data model, consider avoiding their use throughout the physical implementation of the data model.
Module 7
Reading SQL Server 2008 R2 Execution Plans
Contents:
Lesson 1: Execution Plan Core Concepts
Lesson 2: Common Execution Plan Elements
Lesson 3: Working with Execution Plans
Lab 7: Reading SQL Server Execution Plans
Module Overview
In earlier modules, you have seen that one of the most important decisions that SQL Server makes when executing a query is how to access the data in any of the tables involved in the query. SQL Server can read the underlying table (which might be structured as a heap or with a clustered index), but it might also choose to use another index. In the next module, you will see how to design additional indexes, but before learning this, it is important to know how to determine the outcomes of the decisions that SQL Server makes. Execution plans show how each step of a query is to be executed. In this module, you will learn how to read and interpret execution plans.
Objectives
After completing this lesson, you will be able to:
1. Explain the core concepts related to the use of execution plans
2. Describe the role of the most common execution plan elements
3. Work with execution plans
Lesson 1
The first steps in working with SQL Server execution plans are to understand why they are so important and to understand the phases that SQL Server passes through when executing a query. Armed with that information, you can learn what an execution plan is, what the different types of execution plans are, and how execution plans relate to execution contexts. Execution plans can be retrieved in a variety of formats. It is also important to understand the differences between each of these formats and to know when to use each format.
Objectives
After completing this lesson, you will be able to:
1. Explain why execution plans matter
2. Describe the phases that SQL Server passes through while executing a query
3. Explain what execution plans are
4. Describe the difference between actual and estimated execution plans
5. Describe execution contexts
6. Make effective use of the different execution plan formats
Key Points
Rather than trying to guess how a query is to be performed or how it was performed, execution plans allow precise answers to be obtained. Execution plans are also commonly referred to as query plans.
These are common questions, and SQL Server provides tools to help answer them. Execution plans show how SQL Server intends to execute a query or how it did execute a query. The ability to interpret these execution plans allows you to answer such questions. Many users capture execution plans and then try to resolve the worst-performing aspects of a query. The best use of execution plans, however, is in verifying that the plan you expected to be used was the plan that was actually used. This means that you need to already have an idea of how you expect SQL Server to execute your queries. You will see more information on doing this in the next module.
Key Points
SQL Server executes queries in a series of phases. A key outcome of one of the phases is an execution plan. Once compiled, a plan may be cached for later use.
T-SQL Parsing
The first phase when executing queries is to check that the statements supplied in the batch follow the rules of the language. Each statement is checked to find any syntax errors. Object names within the statements are located. Question: What is the difference between a statement and a batch?
While at first glance, it might seem that mapping the Product table to its underlying object ID would be easy, consider that SQL Server supports more than a single object with the same name in a database, through the use of schemas. For example, note that each of the objects in the following code could be completely different in structure and that the names relate to entirely different objects:
SELECT * FROM Production.Product;
SELECT * FROM Sales.Product;
SELECT * FROM Marketing.Product;
SQL Server needs to apply a set of rules to relate the table name "Product" to the intended object.
Query Optimization
Once the object IDs have been resolved, SQL Server needs to decide how to execute the overall batch. Based on the available statistics, SQL Server will make decisions about how to access the data contained in each of the tables that are part of each query. SQL Server does not always find the best possible plan. It weighs up the cost of a plan, based on its estimate of the cost of resources required to execute the plan. The aim is to find a satisfactory plan in a reasonable period of time. The more complex a SQL batch is, the longer it could take SQL Server to evaluate all the possible plans that could be used to execute the batch. Finding the best plan might take longer than executing a less optimal plan. There is no need to consider alternate plans for DDL (Data Definition Language) statements, such as CREATE, ALTER or DROP. Many simple queries also have trivial plans that are quickly identified. Question: Can you think of a type of query that might lead to a trivial plan?
Plan Caching
If the plan is considered sufficiently useful, it may be stored in the Plan Cache. On later executions of the batch, SQL Server will attempt to reuse execution plans from the Plan Cache. This is not always possible and, for certain types of query, not always desirable.
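Plan reuse can be observed directly. A sketch using the plan cache DMVs (these DMV names are part of SQL Server; the query itself is an illustrative example, not a course script):

```sql
-- Each cached plan carries a use count; a rising usecounts value for the
-- same batch text indicates the plan is being reused rather than recompiled.
SELECT cp.usecounts,
       cp.objtype,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;
```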
Key Points
An execution plan is a map that details either how SQL Server would execute a query or how SQL Server did execute a query. SQL Server uses a cost-based optimizer.
Execution Plans
Execution plans show the overall method that SQL Server is using to satisfy the requirements of the query. As part of the plan, SQL Server decides the types of operations to be performed and the order in which they will be performed. Many operations relate to the choice SQL Server makes about how to access data in a table and whether or not available indexes will be used. These decisions are based on the statistics that are available to SQL Server at the time. SQL Server uses a cost-based optimizer, and each element of the query plan is assigned a cost in relation to the total cost of the batch. SSMS also calculates a relationship between the costs of each statement, which is useful where a batch contains more than one statement. The costs that are either estimated or calculated as part of the plan can be interpreted within the plan. The cost of individual elements can be compared across statements in a single batch, but comparisons should not be made between the costs of elements in different batches. Costs can only be used to determine whether an operation is cheaper or more expensive than another operation. Costs cannot be used to estimate execution time. Question: What resources do you imagine the cost would be based upon?
Key Points
SQL Server can record the plan it used for executing a query. Before it executes a query though, it needs to create an initial plan.
Execution plans include row counts in each data path. For estimated execution plans, these are based on estimates from the available statistics. For actual execution plans, both the estimated and actual row counts are shown.
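The two plan types can be requested in T-SQL as well as from SSMS. A minimal sketch (Production.Product is used here only as a sample table, as earlier in the lesson):

```sql
-- Estimated plan: the statement is compiled but NOT executed;
-- the XML plan is returned instead of the result set.
SET SHOWPLAN_XML ON;
GO
SELECT * FROM Production.Product;
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan: the statement runs, and the returned plan includes
-- actual row counts alongside the estimates.
SET STATISTICS XML ON;
GO
SELECT * FROM Production.Product;
GO
SET STATISTICS XML OFF;
GO
```

Note that SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators.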
Key Points
Execution plans are reentrant. This means that more than one user can be executing exactly the same execution plan at one time. Each user needs separate data related to their individual execution of the plan. This data is held in an object known as an Execution Context.
Execution Context
Execution plans detail the steps that SQL Server would take (or did take) when executing a batch of statements. When multiple users are executing the plan concurrently, there needs to be a structure that holds data related to their individual executions of the plan. Execution contexts are cached for reuse in a very similar way to the caching that occurs with execution plans. When a user executes a plan, SQL Server retrieves an execution context from cache if there is one available. To maximize performance and minimize memory requirements, execution contexts are not fully completed when they are created. Branches of the code are "fleshed out" when the code needs to move to the branch. This means that if a procedure includes a set of procedural logic statements (like the IF statement), the execution context retrieved from cache may have gone in a different logical direction and not yet have all the details required. From a caching reuse point of view, it is useful to avoid too much procedural logic in stored procedures. You should favor set-based logic instead.
Key Points
There are three formats for execution plans. Text-based plans are now deprecated; XML-based plans should be used instead. Graphical plans are rendered from XML-based plans for ease of use.

Text-based execution plans were superseded by XML-based plans in SQL Server 2005 and are now deprecated. They should not be used in new development work.
Plan Portability
SQL Server provided a graphical rendering of execution plans to make reading text based plans easier. One challenge with this however, was that it was very difficult to send a copy of a plan to another user for review. XML plans can be saved as a .sqlplan filetype and are entirely portable. Graphical plans can be rendered from XML plans, including plans that have been received from other users. Note that graphical plans include only a subset of the information that is available from an XML plan. While it is not easy to read XML plans directly, further information can be obtained by reading the contents of the XML plan. XML plans are also ideal for programmatic access for users creating tools and utilities, as XML is relatively easy to consume programmatically in an application.
Question: What impact does having SSMS associated with the .sqlplan filetype have?
Key Points
In this demonstration you will see how to:
- Show an estimated execution plan
- Compare execution costs between two queries in a batch
- Show an actual execution plan
- Save an execution plan
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_07_PRJ\6232B_07_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.
Question: How do you explain that such different queries return the same plan?
Lesson 2
Now that the role of execution plans is understood, along with the format of the plans, it is important to learn to interpret the plans. Execution plans can contain a large number of different types of elements. Certain elements however, appear regularly in execution plans. In this lesson, you will learn to interpret execution plans and learn about the most common execution plan elements.
Objectives
After completing this lesson, you will be able to:
- Describe the execution plan elements for table and clustered index scans and seeks
- Describe the execution plan elements for nested loops and lookups
- Describe the execution plan elements for merge and hash joins
- Describe the execution plan elements for aggregations
- Describe the execution plan elements for filter and sort
- Describe the execution plan elements for data modification
Key Points
Three execution plan elements relate to reading data from a table. The particular element used depends upon the structure of the table: heap or clustered index and whether the clustered index (if present) is useful in resolving the query.
SQL Server does not need to read the entire table and can use the index to quickly locate the correct customer. This is referred to as a clustered index seek.
Key Points
Nested Loops are one of the most commonly encountered operations. They are used to implement join operations and are commonly associated with RID or Key Lookup elements.
For each row in the upper input, a lookup is performed against the bottom input. The difference between a RID Lookup and a Key Lookup is whether the table has a clustered index. RID Lookups apply to heaps. Key Lookups apply to tables with clustered indexes. In some earlier documentation, a Key Lookup was also referred to as a Bookmark Lookup. The Key Lookup operator was introduced in SQL Server 2005 Service Pack 2. Note also that in earlier versions of SQL Server 2005, the Bookmark Lookup was shown as a Clustered Index Seek operator with a LOOKUP keyword associated with it.
In the physical library analogy, a lookup is similar to reading through an author index and for each book found in the index, going to collect it from the bookcases. Lookups are often expensive operations as they need to be executed once for every row of the top input source. Note that in the execution plan shown, more than half the cost of the query is accounted for by the Key Lookup operator. In the next module, you will see options for minimizing this cost in some situations. Nested Loop is the preferred choice whenever the number of rows in the top input source is small when compared with the number of rows in the bottom input source.
Key Points
Merge Joins and Hash Matches are other forms of join operations. Merge Joins are more efficient than Hash Matches but require sorted inputs.
Merge Joins
Apart from Nested Loop operations in which each row of one table is used to lookup rows from another table, it is common to need to join tables where simple lookups are not possible. Imagine two piles of paper sitting on the floor of your office. One pile of paper holds details of all your customers, one customer to a sheet. The other pile of paper holds details of customer orders, one order per sheet. If you needed to merge the two piles of paper together so that each customer's sheet was adjacent to his/her orders, how would you perform the merge? The answer depends upon the order of the sheets. If the customer sheets were in customer ID order and the customer order sheets were also in customer ID order, merging the two piles would be easy. The process involved is similar to what occurs with a Merge Join operator. It can only be used when the inputs are already in the same order. Merge Joins can be used to implement a variety of join types such as left outer joins, left semi joins, left anti semi joins, right outer joins, right semi joins, right anti semi joins and unions.
Hash Matches
Now imagine how you would merge the piles of customers and customer orders if the customers were in customer ID order but the customer orders were ordered by customer order number. The same problem would occur if the customer sheets were in postal code order. These situations are similar to the problem encountered by Hash Match operations. There is no easy way to merge the piles.
Hash Matches use a relatively "brute force" method of joining. One input is broken into a set of "hash buckets" based on an algorithm; the other input is then processed using the same algorithm. In the analogy with the piles of paper, the algorithm could be to take the first digit of the customer ID; with this algorithm, ten buckets would be created. Although it may not always be possible to avoid Hash Matches in query plans, their presence is often an indication of a lack of appropriate indexing on the underlying tables.
Aggregations
Key Points
There are two types of Aggregate operator: Stream Aggregate and Hash Match Aggregate. Stream Aggregate operations are very efficient.
Aggregations
Imagine being asked to count how many orders are present for each customer based on a list of customer orders. How would you perform this operation? Similar to the discussion on Merge Joins and Hash Matches, the answer depends on the order that the customer orders are being held in. If the customer orders are already in customer ID order, then performing the count (or other aggregation) is very easy. This is the equivalent of a Stream Aggregate operation. However, if the aggregate being calculated is based on a different attribute of the customer orders than the attribute they are sorted by, performing the calculations is much more complex. One option would be to first sort all the customer orders by customer ID, then to count all the customer orders for each customer ID. Another alternative is to process the input via a hashing algorithm like the one used for Hash Match operations. This is what SQL Server does when using a Hash Match Aggregate operation. The presence of these operations in a query plan is often (but not always) an indication of a lack of appropriate indexing on the underlying table.
Key Points
Filter operations implement WHERE clause predicates or HAVING clause predicates. Sort operations sort input data.
Data Modification
Key Points
INSERT, UPDATE and DELETE operations are used to present the outcome of underlying T-SQL data modification statements. T-SQL MERGE statements can be implemented by combinations of INSERT, UPDATE and DELETE operations.
Data Modification
The purpose of these operations will usually be self-evident, but what might not be obvious is the potential cost of these operations or the complexity that can be involved with them. A T-SQL INSERT, UPDATE or DELETE statement might involve much more than the related execution plan operation. Question: Can you think of an example where an INSERT statement in T-SQL would need to perform more than an INSERT operation in an execution plan?
Key Points
In this demonstration you will see queries that demonstrate the most common execution plan elements.
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_07_PRJ\6232B_07_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lesson 3
Now that you understand the importance of execution plans and are familiar with common elements contained within the plans, consideration needs to be given to the different ways that the plans can be captured. In this lesson, you will see a variety of ways to capture plans and explore the criteria by which SQL Server decides whether or not to reuse plans. When working with execution plans, SQL Server exposes a number of dynamic management views (DMVs) that can be used to explore query plan reuse. You will also see how they are used.
Objectives
After completing this lesson, you will be able to:
- Implement methods for capturing plans
- Explain how SQL Server decides whether or not to reuse existing plans when re-executing queries
- Use execution plan related DMVs
Key Points
Earlier in this module you saw how to capture execution plans using SQL Server Management Studio. Other options exist for capturing plans.
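One such option is to pull plans straight from the plan cache using DMVs. A sketch (the DMV names are part of SQL Server; the query shape is an illustrative example):

```sql
-- Return the batch text and the XML plan for each cached batch.
-- In SSMS, clicking a value in the query_plan column opens the
-- graphical rendering of that plan.
SELECT st.text,
       qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp;
```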
Key Points
In this demonstration you will see how to use Activity Monitor to view recent expensive queries.
Demonstration Steps
1. If Demonstration 1A was not performed: Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_07_PRJ\6232B_07_PRJ.ssmssln and click Open. Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: What could cause an expensive query to be removed from the Activity Monitor window?
Re-Executing Queries
Key Points
SQL Server attempts to reuse execution plans where possible. While this is often desirable, the reuse of existing plans can be counterproductive to performance.
Re-Executing Queries
Reusing query plans avoids the overhead of compiling and optimizing the queries. Some queries, however, perform poorly when executed with a plan that was generated for a different set of parameters. For example, consider a query with FromCustomerID and ToCustomerID parameters. If the value of the FromCustomerID parameter was the same as the value of the ToCustomerID parameter, an index seek based on the CustomerID might be highly selective. However, a later execution of that query where a large number of customers were requested would not be selective. This means that SQL Server would perform better if it reconsidered how to execute the query, and thus generate a new plan. You will see a further discussion on this "parameter sniffing" issue in later modules.
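Sketched as a parameterized stored procedure (the table and procedure names are assumptions; the parameter names come from the example above):

```sql
CREATE PROCEDURE Sales.GetCustomerOrders
    @FromCustomerID int,
    @ToCustomerID   int
AS
BEGIN
    -- The plan is compiled and cached for the first parameter values supplied.
    -- When @FromCustomerID = @ToCustomerID, an index seek on CustomerID is
    -- highly selective; the same cached plan can perform poorly on a later
    -- execution that requests a large range of customers.
    SELECT o.OrderID, o.CustomerID, o.OrderDate
    FROM Sales.Orders AS o
    WHERE o.CustomerID BETWEEN @FromCustomerID AND @ToCustomerID;
END;
```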
SQL Server assigns a cost to each plan that is cached, to estimate its "value". The value is a measure of how expensive the execution plan was to generate. When memory resources become tight, SQL Server will need to decide which plans are the most useful to keep. The decision to evict a plan from memory is based on this cost value and on whether or not the plan has been reused recently.
Options are available to force compilation behavior of code but they should be used sparingly and where necessary. You will see a further discussion on this issue in a later module.
Key Points
Dynamic Management Views provide insight into the internal operations of SQL Server. Several of these views are useful when investigating execution plans. Most DMV values are reset whenever the server is restarted. Some are reset more often.

View                                        Description
sys.dm_exec_connections                     One row per user connection to the server
sys.dm_exec_sessions                        One row per session, including system and user sessions
sys.dm_exec_query_stats                     Query statistics
sys.dm_exec_requests                        Associated with a session and providing one row per currently executing request
sys.dm_exec_sql_text()                      Provides the ability to find the T-SQL code being executed for a request
sys.dm_exec_query_plan()                    Provides the ability to find the execution plan associated with a request
sys.dm_exec_cached_plans                    Details of cached query plans
sys.dm_exec_cached_plan_dependent_objects   Details of dependent objects for those plans
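As a sketch of how these views combine, the following query lists the most CPU-expensive cached queries along with their text and plans:

```sql
SELECT TOP(10)
       qs.execution_count,
       qs.total_worker_time,         -- cumulative CPU time for the cached plan
       st.text       AS query_text,  -- from sys.dm_exec_sql_text()
       qp.query_plan                 -- from sys.dm_exec_query_plan()
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;
```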
Key Points
In this demonstration you will see how to view cached execution plans.
Demonstration Steps
1. If Demonstration 1A was not performed: Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_07_PRJ\6232B_07_PRJ.ssmssln and click Open. Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 32 Demonstration 3B.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: No matter how quickly you execute the command to check the cache after you clear it, you would not see it empty. Why?
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   a. Right-click 623XB-MIA-DC and click Start.
   b. Right-click 623XB-MIA-DC and click Connect.
   c. In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   a. Right-click 623XB-MIA-SQL and click Start.
   b. Right-click 623XB-MIA-SQL and click Connect.
   c. In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on: On the Action menu, click the Ctrl-Alt-Delete menu item.
Click Switch User, and then click Other User. Log on using the following credentials:
i. User name: AdventureWorks\Administrator
ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. On the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_07_PRJ\6232B_07_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
You have been learning about the design of indexes. To take this learning further, you need a way to view how these indexes are used. In the first exercise, you will learn to view both estimated and actual execution plans. Execution plans can contain many types of elements. In the second exercise, you will learn to identify the most common plan elements and see how statements lead to these elements being used. You regularly find yourself trying to decide between different ways of structuring SQL queries. You are concerned that you aren't always choosing the highest-performing options. If time permits, you will learn to use execution plans to compare the cost of statements in multi-statement batches.
Task 3: View the estimated execution plan for script 7.2 using SHOWPLAN_XML
Execute script 7.2 in SQL Server Management Studio. Click on the returned XML and view the execution plan. Right-click in the whitespace in the plan. Choose Show Execution Plan XML. Briefly review the XML. Close the XML window and the execution plan window.
Task 7: Review the execution plans currently cached in memory using script 7.5
Execute script 7.5 to view the plans currently cached in memory.
Results: After this exercise, you will have reviewed various actual and estimated query plans.
Task 10: Explain the actual execution plan from script 7.14
Execute script 7.14. Note the difference in this plan from the plan for script 7.12. Results: After this exercise, you will have analyzed the most common plan elements returned from queries.
Review Questions
1. What is the difference between a graphical execution plan and an XML execution plan?
2. Give an example of why a T-SQL DELETE statement could have a complex execution plan.
Best Practices
1. Avoid capturing execution plans for large numbers of statements when using SQL Profiler.
2. If you need to capture plans using Profiler, make sure the trace is filtered to reduce the number of events being captured.
Module 8
Improving Performance through Nonclustered Indexes
Contents:
Lesson 1: Designing Effective Nonclustered Indexes
Lesson 2: Implementing Nonclustered Indexes
Lesson 3: Using the Database Engine Tuning Advisor
Lab 8: Improving Performance through Nonclustered Indexes
Module Overview
The biggest improvements in database query performance on most systems come from appropriate use of indexing. In previous modules, you saw how to structure tables for efficiency, including the option of creating a clustered index on the table. In this module, you will see how nonclustered indexes have the potential to significantly enhance the performance of your applications and learn to use a tool that can help you design these indexes appropriately.
Objectives
After completing this module, you will be able to:
Design effective nonclustered indexes
Implement nonclustered indexes
Use the Database Engine Tuning Advisor to design indexes
Lesson 1
Before you start to implement nonclustered indexes, you need to design them appropriately. In this lesson, you will learn how SQL Server structures nonclustered indexes and how they can provide performance improvements for your applications. You will also see how to find information about the indexes that have been created.
Objectives
After completing this lesson, you will be able to:
Describe the concept of nonclustered indexes
Explain how SQL Server structures nonclustered indexes when the underlying table is organized as a heap
Explain how SQL Server structures nonclustered indexes when the underlying table is organized with a clustered index
Obtain information about indexes that have been created
Key Points
You have seen how tables can be structured as heaps or have clustered indexes. Additional indexes can be created on the tables to provide alternate ways to rapidly locate required data. These additional indexes are called nonclustered indexes.
Nonclustered Indexes
A table can have up to 999 nonclustered indexes. These indexes are assigned index IDs greater than 1. Nonclustered indexes can be defined on a table regardless of whether the table uses a clustered index or a heap, and are used to improve the performance of important queries. Whenever updates are made to key columns of the nonclustered index or to clustering keys on the base table, the nonclustered indexes need to be updated as well. This impacts the data modification performance of the system. Each additional index that is added to a table increases the work that SQL Server might need to perform when modifying the data rows in the table. Care must be taken to balance the number of indexes created against the overhead that they introduce.
Ongoing Review
An application's data access patterns may change over time, particularly in enterprises where ongoing development work is being performed on the applications. This means that nonclustered indexes that are created at one point in time may need to be altered or even dropped at a later point in time, to continue to achieve high performance levels.
Physical Analogy
Continuing our library analogy, nonclustered indexes are indexes that point back to the bookcases. They provide alternate ways to look up the information in the library. For example, they might allow access by author, by release date, or by publisher. They can also be composite indexes, where you could look up books by release date within the entries for each author.
Key Points
Nonclustered indexes have the same B-tree structure as clustered indexes, but in the nonclustered index, the data and the index are stored separately. When the underlying table is structured as a heap, the leaf level of a nonclustered index holds Row ID pointers instead of data. By default, no data apart from the keys is stored at the leaf level.
Physical Analogy
Based on the library analogy, a nonclustered index over a heap is like an author index pointing to books that have been stored in no particular order within the bookcases. Once an author is found in the index, the entry in the index for each book would have an address like "Bookcase 4, Shelf 3, Book 12". Note that it would be a pointer to the exact location of the book. Question: What is an upside of having the indexes point directly to RowIDs? Question: What is the downside of having multiple indexes pointing to data pages via RowID?
Key Points
You have seen that the base table could be structured with a clustered index instead of a heap. While SQL Server could have been designed so that nonclustered indexes still pointed to Row IDs, it was not designed that way. Instead, the leaf levels of a nonclustered index contain the clustering keys for the base table.
Physical Analogy
In the library analogy, a nonclustered index over a clustered index is like having an author index built over a library where the books are all stored in ISBN order. When the required author is found in the author index, the entry in the index provides details of the ISBNs for the required books. These ISBNs are then used to locate the books within the bookcases. If the bookcases need to be rearranged (for example due to other rows being modified), no changes need to be made to the author index as it is only providing keys that are used for locating the books, rather than the physical location of the books. Question: What is the downside of holding clustering keys in the leaf nodes of a nonclustered index instead of RowIDs? Question: What is the upside of holding clustering keys in the leaf nodes of a nonclustered index instead of RowIDs?
Key Points
You might require information about existing indexes before you create, modify, or remove an index. SQL Server 2008 provides many ways to obtain information about indexes.
View                Notes
sys.index_columns   One row per column that is part of an index or heap.
sys.stats           Statistics associated with a table, including statistic name and whether it was created automatically or by a user.
sys.stats_columns   Column ID associated with the statistic.
System Functions
SQL Server provides a set of functions that provide information about the structure of indexes. Some of the more useful functions are shown in the following table:

Function            Notes
INDEXKEY_PROPERTY   Index column position within the index and column sort order (ASC or DESC).
INDEXPROPERTY       Index type, number of levels, and current setting of index options that are stored in metadata.
INDEX_COL           Name of the key column of the specified index.
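For example, the following sketch combines sys.indexes with two of these functions; the Marketing.WebLog table name is borrowed from the lab scenario and any table name could be substituted:

```sql
SELECT i.name AS index_name,
       INDEXPROPERTY(i.object_id, i.name, 'IndexDepth') AS index_depth,
       INDEX_COL('Marketing.WebLog', i.index_id, 1)     AS first_key_column
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('Marketing.WebLog');
```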
In the next demonstration, you will see examples of many of these methods for obtaining information on indexes.
Key Points
In this demonstration you will see several ways to view information about indexes.
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_08_PRJ\6232B_08_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.
Question: What would be another way to find information about the physical structure of indexes?
Lesson 2
Now that you have learned how nonclustered indexes are structured, it is important to learn how nonclustered indexes are implemented. In earlier modules, you have seen how Lookup operations used with Nested Loop execution plan elements can be very expensive. In this lesson, you will see options for alleviating these costs. You will also see how to alter or drop nonclustered indexes and how filtered indexes can reduce the overhead associated with some nonclustered indexes.
Objectives
After completing this lesson, you will be able to:
Create nonclustered indexes
Describe the performance impact of Lookup operations as part of Nested Loops in execution plans
Use the INCLUDE clause to create covering indexes
Drop or alter nonclustered indexes
Use filtered indexes
Key Points
Nonclustered indexes are created with the CREATE INDEX statement. By default, the CREATE INDEX statement creates nonclustered indexes rather than clustered indexes when you do not specify which type of index you require. Wherever possible, the clustered index (if the table needs one) should be created prior to the nonclustered indexes. Otherwise SQL Server needs to rebuild all nonclustered indexes while creating the clustered index.
In composite indexes, the ordering of key columns is important; in the absence of any other requirements, the most selective column should be specified first. Each column that makes up the key can be specified as ASC (ascending) or DESC (descending). Ascending is the default order.
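A minimal sketch (illustrative table and column names) of a composite nonclustered index with the more selective column first and a descending second key:

```sql
-- NONCLUSTERED is the default when no index type is specified,
-- but stating it makes the intent explicit.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
ON Sales.Orders (CustomerID ASC, OrderDate DESC);
```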
Key Points
Nonclustered indexes can be very useful when you need to find specific data based on the key columns of the index. However, for each entry found, SQL Server then needs to use the values from the leaf level of the index (either clustering keys or RowIDs) to look up the data rows in the base table. This lookup process can be very expensive.
INCLUDE Clause
Key Points
In earlier versions of SQL Server (prior to 2005), it was common for DBAs or developers to create indexes with a large number of columns, to attempt to "cover" important queries. Covering a query avoids the need for lookup operations and can greatly increase the performance of queries. The INCLUDE clause was introduced to make the creation of covering indexes easier.
INCLUDE Clause
Adding columns to the key of an index adds a great deal of overhead to the index structure. For example, in the library analogy, if an index was constructed on PublisherID, ReleaseDate, and Title, the index would internally be sorted by Title for no benefit. A further issue is the limitation of 16 columns and 900 bytes for an index key, as this limits the ability to add columns to index keys when trying to cover queries. SQL Server 2005 introduced the ability to include one or more columns (up to 1023 nonkey columns) only at the leaf level of the index. The index structure in other levels is unaffected by these included columns. They are included only to help with covering queries. If more than one column is listed in an INCLUDE clause, the order of the columns within the clause is not relevant. Question: For an index to cover a single table query, which columns would need to be present in the index?
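Using the library-analogy columns from the example above, a covering index might be sketched as:

```sql
-- PublisherID and ReleaseDate form the key; Title is stored only at the
-- leaf level, so upper index levels stay narrow while queries selecting
-- all three columns are still covered. (dbo.Book is an illustrative name.)
CREATE NONCLUSTERED INDEX IX_Book_PublisherID_ReleaseDate
ON dbo.Book (PublisherID, ReleaseDate)
INCLUDE (Title);
```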
Performance Impacts
Covering indexes can have a very positive performance impact on the queries that they are designed to support. However, while it would be possible to create an index to cover most queries, doing so could be counterproductive. Each index that is added to a table can negatively impact the performance of data modifications on the table. For this reason, it is important to decide which queries are most important and to aim to cover only those queries.
Key Points
Only indexes created via CREATE INDEX can be dropped via DROP INDEX. If an index has been created by SQL Server to support a PRIMARY KEY or UNIQUE constraint, then that index needs to be dropped by dropping the constraint instead.
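A sketch of the two cases (object names are illustrative):

```sql
-- An index created with CREATE INDEX is removed with DROP INDEX:
DROP INDEX IX_Book_PublisherID_ReleaseDate ON dbo.Book;

-- An index created by SQL Server to support a constraint is removed
-- by dropping the constraint instead:
ALTER TABLE dbo.Book DROP CONSTRAINT PK_Book;
```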
Filtered Indexes
Key Points
By default, SQL Server includes an entry for every row in a table at the leaf level of each index. This is not always desirable. Filtered indexes only include rows that match a WHERE predicate that is specified when the index is created.
Filtered Indexes
For the example in the slide, consider a large table of transactions with one column that indicates if the transaction is finalized or not. Often only a very small number of rows will be unfinalized. An index on the finalized transactions would be pointless as it would never be sufficiently selective to be helpful. However, an index on the unfinalized transactions could be highly selective and very useful. Standard indexes created in this situation would contain an entry at the leaf level for every transaction row, even though most entries in the index would never be used. Filtered indexes only include entries for rows that match the WHERE predicate. Note that only very simple logic is permitted in the WHERE clause predicate for filtered indexes. For example, you cannot use the clause to compare two columns and you cannot reference a computed column, even if it is persisted. Question: What is the downside of having an entry at the leaf level for every transaction row, whether finalized or not?
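The transactions example might be sketched as follows (table and column names are assumptions):

```sql
-- Only unfinalized transactions receive leaf-level entries, so the index
-- stays small and remains highly selective for the queries that need it.
CREATE NONCLUSTERED INDEX IX_Transactions_Unfinalized
ON dbo.Transactions (TransactionID)
WHERE IsFinalized = 0;
```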
Key Points
In this demonstration you will see how to:
Create covering indexes
View included columns in indexes
Demonstration Steps
1. If Demonstration 1A was not performed: Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_08_PRJ\6232B_08_PRJ.ssmssln and click Open. Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: If included columns only apply to nonclustered indexes, why do you imagine that the columns in the clustered primary key also showed as included?
Lesson 3
Designing useful indexes is considered by many people to be more of an art than a science. While there is some truth to this statement, a number of tools are available to assist with learning to create useful indexes. In this lesson, you will learn how to capture activity against SQL Server using SQL Server Profiler and then how to analyze that activity using the Database Engine Tuning Advisor.
Objectives
After completing this lesson, you will be able to:
Capture traces of activity using SQL Server Profiler
Use the Database Engine Tuning Advisor to analyze trace results
Key Points
SQL Server Profiler is an important tool when tuning the performance of SQL Server queries. It captures the activity from client applications to SQL Server and stores it in a trace. These traces can then be analyzed.
SQL Trace
SQL Server Profiler is a graphical tool and it is important to realize that it can have significant performance impacts on the server being traced, depending upon the options chosen. SQL Trace is a library of system stored procedures that can be used for tracing when minimizing the performance impacts of the tracing is necessary.
The Extended Events system that was introduced in SQL Server 2008 also provides capabilities for tracing SQL Server activity and resources. Both SQL Trace and Extended Events are outside the scope of this course. Question: Where would the ability to replay a trace be useful?
Key Points
In this demonstration you will see how to use SQL Server Profiler.
Demonstration Steps
1. If Demonstration 1A was not performed: Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_08_PRJ\6232B_08_PRJ.ssmssln and click Open. Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: When so many statements were executed, why was there only one entry in the trace?
Key Points
The Database Engine Tuning Advisor utility analyzes the performance effects of workloads run against one or more databases. Typically these workloads are obtained from traces captured by SQL Server Profiler. After analyzing the effects of a workload on your databases, Database Engine Tuning Advisor provides recommendations for improving the performance of your system.
Workloads
A workload is a set of Transact-SQL statements that executes against databases that you want to tune. The workload source can be a file containing Transact-SQL statements, a trace file generated by SQL Profiler, or a table of trace information, again generated by SQL Profiler. SQL Server Management Studio also has the ability to launch Database Engine Tuning Advisor to analyze an individual statement.
Recommendations
The recommendations that can be produced include suggested changes to the database such as new indexes, indexes that should be dropped, and depending on the tuning options you set, partitioning recommendations. The recommendations that are produced are provided as a set of Transact-SQL statements that would implement the suggested changes. You can view the Transact-SQL and save it for later review and application, or you can choose to implement the recommended changes immediately.
Be careful of applying changes to a database without detailed consideration, especially in production environments. Also, ensure that any analysis that you perform is based on appropriately sized workloads so that recommendations are not made based on partial information. Question: Why is it important to tune an entire workload rather than individual queries?
Key Points
In this demonstration you will see how to use Database Engine Tuning Advisor.
Demonstration Steps
1. If Demonstration 1A was not performed: Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_08_PRJ\6232B_08_PRJ.ssmssln and click Open. Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 32 Demonstration 3B.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   a. Right-click 623XB-MIA-DC and click Start.
   b. Right-click 623XB-MIA-DC and click Connect.
   c. In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   a. Right-click 623XB-MIA-SQL and click Start.
   b. Right-click 623XB-MIA-SQL and click Connect.
   c. In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   a. On the Action menu, click the Ctrl-Alt-Delete menu item.
   b. Click Switch User, and then click Other User.
   c. Log on using the following credentials:
      i. User name: AdventureWorks\Administrator
      ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. On the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_08_PRJ\6232B_08_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
The marketing system includes a query that is constantly executed and is performing too slowly. It retrieves 5000 web log entries beyond a given starting time. Previously, a nonclustered index was created on the SessionStart column. When 100 web log entries were being retrieved at a time, the index was being used. The developer is puzzled that changing the request to 5000 entries at a time has caused SQL Server to ignore the index he built. You need to investigate the query and suggest the best nonclustered index to support the query. You will then test your suggestion. After you created the new index, the developer noticed the cost of the sort operation and tried to create another index that would eliminate the sort. You need to explain to him why SQL Server has decided not to use this index. Later, you will learn to set up a basic query tuning trace in SQL Server Profiler and use the captured trace in Database Engine Tuning Advisor. If time permits, you will design a required nonclustered index.
Supporting Documentation
Query 1: Query to test
DECLARE @StartTime datetime2 = '2010-08-30 16:27';

SELECT TOP(5000) wl.SessionID, wl.ServerID, wl.UserName
FROM Marketing.WebLog AS wl
WHERE wl.SessionStart >= @StartTime
ORDER BY wl.SessionStart, wl.ServerID;
Task 3: Test the design and explain why the index was not used
Enable Include Actual Execution Plan. Execute the query. Review the Execution Plan and explain why the index was not used. Results: After this exercise, you will have understood why some indexes are not appropriate in some scenarios.
Task 3: Design a more appropriate index by following the Missing Index suggestion
Review and implement the Missing Index that SQL Server has suggested. Test to ensure that the new index is being used.
Task 4: Create a better index that removes the sort operation. If you create another index, confirm that SQL Server selects it
Create a new index that will remove the Sort operation. Test to ensure that the new index is being used. Results: After this exercise, you should have created a better index that will remove the sort operation.
Task 1: Open SQL Server Profiler and configure and start a trace
Open SQL Server Profiler. Configure it to use the following:
a. Template: Tuning
b. Save To File: should be selected, with any file name provided for a file on the desktop
c. Enable file rollover: Not selected
d. Maximum File Size: 500MB
e. Filter: DatabaseName LIKE MarketDev
Start the SQL Server Profiler Trace. Disable AutoScroll from the Window Menu.
Review Questions
1. What is a covering index?
2. Can a clustered index be a covering index?
Best Practices
1. Never apply Database Engine Tuning Advisor recommendations without further reviewing what is being suggested.
2. Record details of why and when you create any indexes. DBAs are hesitant to ever remove indexes without this knowledge.
3. When DETA suggests new statistics, this should be taken as a hint to investigate the indexing structure of the table.
4. If using an offline version of Books Online, ensure it is kept up to date.
Module 9
Designing and Implementing Stored Procedures
Contents:
Lesson 1: Introduction to Stored Procedures 9-3
Lesson 2: Working With Stored Procedures 9-11
Lesson 3: Implementing Parameterized Stored Procedures 9-23
Lesson 4: Controlling Execution Context 9-33
Lab 9: Designing and Implementing Stored Procedures 9-39
Module Overview
Stored procedures allow the creation of T-SQL logic that will be stored and executed at the server. This logic might enforce business rules or data consistency. You will see the potential advantages of the use of stored procedures in this module along with guidelines on creating them.
Objectives
After completing this module, you will be able to:
- Describe the role of stored procedures and the potential benefits of using them
- Work with stored procedures
- Implement parameterized stored procedures
- Control the execution context of a stored procedure
Lesson 1
Introduction to Stored Procedures
SQL Server provides a number of stored procedures and users can also create stored procedures. In this lesson, you will see the role of stored procedures and the potential benefits of using them. System stored procedures provide a large amount of pre-built functionality that you can take advantage of when building applications. It is also important to realize when designing stored procedures that not all T-SQL statements are permitted within stored procedures.
Objectives
After completing this lesson, you will be able to:
- Describe the role of stored procedures
- Identify the potential benefits of using stored procedures
- Work with system stored procedures
- Identify statements that are not permitted within the body of a stored procedure declaration
Key Points
A stored procedure is a named collection of Transact-SQL statements that is stored on the server within the database itself. Stored procedures are a method of encapsulating repetitive tasks; they support user-declared variables, conditional execution, and other powerful programming features.
Stored Procedures
Stored procedures are similar to procedures, methods and functions in high-level languages. They can have input and output parameters and a return value. As a side effect of executing the stored procedure, rows of data can also be returned from the stored procedure. In fact, multiple rowsets can be returned from a single stored procedure. Stored procedures can be created in either T-SQL or in managed .NET code and are executed by the EXECUTE T-SQL statement. The creation of stored procedures in managed code will be discussed in Module 16. Question: Why might it be useful to return multiple rowsets from a stored procedure?
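As an illustrative sketch (the procedure name and the summary column are hypothetical; Marketing.Product is one of this course's sample tables), a procedure returns multiple rowsets simply by containing multiple SELECT statements:

```sql
-- Hypothetical procedure that returns two rowsets from a single call
CREATE PROCEDURE Marketing.GetProductSummary
AS
BEGIN
    -- First rowset: the product rows themselves
    SELECT ProductID, ProductName, Color
    FROM Marketing.Product;

    -- Second rowset: a one-row summary of the same data
    SELECT COUNT(*) AS ProductCount
    FROM Marketing.Product;
END;
GO

EXEC Marketing.GetProductSummary; -- the client receives both rowsets, in order
```

Returning a detail rowset and a related summary rowset from one call saves a network roundtrip compared with issuing two separate queries.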
Key Points
The use of stored procedures offers a number of benefits over issuing T-SQL code directly from an application.
Security Boundary
Stored procedures can be part of a scheme that helps increase application security. They can be treated as a security boundary. Users can be given permission to execute a stored procedure without being given permission to access the objects that the stored procedure accesses. For example, you can give a user (or set of users via a role) permission to execute a stored procedure that updates a table without granting the user any permissions at all directly on the table.
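For example (the procedure, table, and role names below are hypothetical), permission is granted on the procedure only:

```sql
-- Members of the role can update the table ONLY through the procedure:
GRANT EXECUTE ON OBJECT::Sales.UpdateProductPrice TO SalesClerks;

-- Deliberately NOT issued -- the role gets no direct table permission:
-- GRANT UPDATE ON Sales.Product TO SalesClerks;
```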
Modular Programming
Code reuse is important. Stored procedures help by allowing logic to be created once and then called many times from many applications. Maintenance is also easier: if a change is needed, in many cases you only need to change the procedure, without needing to change any application code. Changing a stored procedure can avoid the need to change the data access logic in a whole group of applications.
Delayed Binding
It is possible to create a stored procedure that accesses (or references) a database object that does not yet exist. This can be helpful in simplifying the order that database objects need to be created in. This is referred to as deferred name resolution.
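A minimal sketch of deferred name resolution (all names hypothetical) — the CREATE succeeds even though the referenced table does not exist yet:

```sql
-- Succeeds at creation time: the table name is only resolved at execution time
CREATE PROCEDURE dbo.ListArchivedOrders
AS
BEGIN
    SELECT * FROM dbo.OrderArchive;  -- dbo.OrderArchive does not exist yet
END;
GO
-- Executing the procedure before creating dbo.OrderArchive raises a
-- run-time error; once the table exists, the procedure works normally.
```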
Performance
Sending the name of a stored procedure to be executed, rather than hundreds or thousands of lines of executable T-SQL code, can offer a significant reduction in the level of network traffic.
Before T-SQL code is executed, it needs to be compiled. When a stored procedure is compiled, in many cases SQL Server will attempt to retain (and reuse) the query plan that it previously generated, to avoid the cost of recompiling the code. While reuse of execution plans for ad-hoc T-SQL code issued by applications is possible, SQL Server favors the reuse of stored procedure execution plans. Query plans for ad-hoc T-SQL statements are amongst the first items removed from memory when memory pressure occurs. The rules governing the reuse of query plans for ad-hoc T-SQL are largely based on matching the text of the queries exactly. Any difference at all (e.g., whitespace or casing) will cause a different query plan to be used, unless the difference is only a value that SQL Server decides must be the equivalent of a parameter. Stored procedures have a much higher chance of achieving query plan reuse.

Question: Stored procedures can be created in any order. What could cause the tables that are referenced by the stored procedures to need to be created in a specific order?
Key Points
SQL Server is shipped with a large amount of pre-built functionality shipped within system stored procedures and system extended stored procedures.
Key Points
Not all T-SQL statements are permitted within stored procedure declarations. The table on the slide shows the statements that cannot be used.
Note that stored procedures can access objects in other databases, but those objects need to be referred to by name, not by attempting to change the database context to another database; i.e., the USE statement cannot be used within the body of a stored procedure in the way that it can be used in a T-SQL script.
Demonstration 1A: Working with System Stored Procedures and Extended Stored Procedures
Key Points
In this demonstration you will see:
1. How to execute system stored procedures
2. How to execute system extended stored procedures
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_09_PRJ\6232B_09_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.
Question: What does the mismatch of prefixes in system stored procedure and system extended stored procedure names suggest?
Lesson 2
Working With Stored Procedures
Now that you have an understanding of why stored procedures are important, you need to gain an understanding of the practicalities involved in working with stored procedures.
Objectives
After completing this lesson, you will be able to:
- Create a stored procedure
- Execute stored procedures
- Alter a stored procedure
- Drop a stored procedure
- Identify stored procedure dependencies
- Explain guidelines for creating stored procedures
- Obfuscate stored procedure definitions
Key Points
The T-SQL CREATE PROCEDURE statement is used to create new procedures.
CREATE PROC
CREATE PROCEDURE is commonly abbreviated to CREATE PROC. A procedure cannot be replaced by using the CREATE PROC statement. It needs to be explicitly altered using an ALTER PROC statement, or dropped and then recreated.

The CREATE PROC statement must be the only statement in the T-SQL batch. All statements from the keyword AS until the end of the script, or until the end of the batch (using a batch separator such as GO), become part of the body of the stored procedure.

Creating a stored procedure requires both the CREATE PROCEDURE permission in the current database and the ALTER permission on the schema in which the procedure is being created.

It is important to keep connection settings such as QUOTED_IDENTIFIER and ANSI_NULLS consistent when working with stored procedures. The settings associated with the stored procedure are taken from the settings in the session where it is created.

Stored procedures are always created in the current database, with the single exception of stored procedures created with a # prefix in their name. The # prefix on a name indicates a temporary object. As such, it is created in the tempdb database and removed automatically when the session that created it ends.
Note: Even though wrapping the body of a stored procedure with a BEGIN…END block is not required, doing so is considered a good practice.
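A minimal sketch combining these points (the procedure name is hypothetical; Marketing.WebLog appears earlier in this course): the CREATE PROC statement sits alone in its batch, terminated by GO, with the body wrapped in BEGIN…END:

```sql
CREATE PROCEDURE Marketing.GetWebLogUsers
AS
BEGIN
    SELECT SessionID, ServerID, UserName
    FROM Marketing.WebLog;
END;
GO  -- batch separator: everything after AS up to here is the procedure body
```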
Key Points
The T-SQL EXECUTE statement is used to execute stored procedures. The EXECUTE statement is commonly abbreviated as EXEC.
EXECUTE Statement
The EXECUTE statement is mostly used to execute stored procedures, but can also be used to execute other commands such as dynamic SQL statements. As mentioned in the first lesson, system stored procedures that reside in the master database can be executed without having to explicitly refer to that database. That does not apply to other stored procedures.
Having SQL Server perform unnecessary steps to locate a stored procedure reduces performance for no reason.
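For this reason, schema-qualify procedure names when executing them (the procedure name below is hypothetical):

```sql
EXEC dbo.GetOrderTotals;  -- preferred: two-part, schema-qualified name
-- EXEC GetOrderTotals;   -- also works, but makes SQL Server search the
--                        -- caller's default schema before trying dbo
```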
Key Points
The T-SQL ALTER PROCEDURE statement is used to replace an existing procedure. ALTER PROCEDURE is often abbreviated to ALTER PROC.
ALTER PROC
The main reason for using the ALTER PROC statement is to retain any existing permissions on the procedure while it is being changed. Users may have been granted permission to execute the procedure. If you drop the procedure and recreate it, those permissions that had been granted to the users would be removed when the procedure was dropped.
Procedure Type
Note that the type of procedure cannot be changed. For example, a T-SQL procedure cannot be changed to a managed code procedure via an ALTER PROCEDURE statement or vice-versa.
Connection Settings
The connection settings such as QUOTED_IDENTIFIER and ANSI_NULLS that will be associated with the modified stored procedure will be those taken from the session that makes the change, not from the original stored procedure so it is important to keep these consistent when making changes.
Complete Replacement
Note that when you alter a stored procedure, you need to supply again any options (such as WITH ENCRYPTION) that were supplied while creating the procedure. None of these options are retained and they are replaced by whatever options are supplied in the ALTER PROC statement.
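A sketch of this complete-replacement behavior (the procedure name is hypothetical): if the procedure was originally created WITH ENCRYPTION, the option must be restated or it is silently dropped:

```sql
ALTER PROCEDURE dbo.GetSecretLogic
WITH ENCRYPTION   -- must be repeated here; ALTER PROC replaces all options
AS
BEGIN
    SELECT 1 AS PlaceholderResult;
END;
GO
-- Existing GRANT EXECUTE permissions on dbo.GetSecretLogic are retained.
```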
Key Points
Dropping a stored procedure is straightforward. The DROP PROCEDURE statement is used to drop a stored procedure and is commonly abbreviated as DROP PROC.
Permissions
Dropping a procedure requires either ALTER permission on the schema that the procedure is part of or CONTROL permission on the procedure itself.
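For example (the procedure name is hypothetical):

```sql
DROP PROCEDURE dbo.GetOrderTotals;
-- Requires ALTER permission on the dbo schema,
-- or CONTROL permission on the procedure itself.
```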
Key Points
It is a good idea before dropping a stored procedure to check for any other objects that are dependent upon the stored procedure.
sp_depends
Earlier versions of SQL Server used the sp_depends system stored procedure to return details of dependencies between objects. It was known to report incomplete information due to issues with deferred name resolution.
sys.sql_expression_dependencies
Use of the sys.sql_expression_dependencies view replaces the previous use of the sp_depends system stored procedure. sys.sql_expression_dependencies provides one row per name dependency on user-defined entities in the current database. sys.dm_sql_referenced_entities and sys.dm_sql_referencing_entities provide more targeted views over the data provided by sys.sql_expression_dependencies. You will see an example of these dependency views being used in the next demonstration.
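A sketch of checking for referencing objects before a drop (the procedure name is hypothetical):

```sql
-- Which objects reference the procedure we are about to drop?
SELECT OBJECT_SCHEMA_NAME(referencing_id) AS referencing_schema,
       OBJECT_NAME(referencing_id)        AS referencing_object
FROM sys.sql_expression_dependencies
WHERE referenced_entity_name = N'GetOrderTotals';
```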
Key Points
There are a number of important guidelines that should be considered when creating stored procedures.
Key Points
SQL Server provides an option to obfuscate the definition of stored procedures via the WITH ENCRYPTION option. Caution needs to be exercised when using it, as it makes working with the application more difficult and likely does not achieve the aims it is targeted at.
WITH ENCRYPTION
As was mentioned in an earlier module dealing with views, it is very important to understand that while SQL Server provides an option (WITH ENCRYPTION) to obfuscate the definition of your stored procedures, the encryption is not particularly strong. It is known to be relatively easy to defeat, as the encryption keys are stored in known locations within the encrypted text, and there are a number of third-party tools that are capable of reversing the encryption. Original copies of the source need to be kept regardless of the fact that decryption might be possible; do not depend upon it. Encrypted code is also much harder to work with when diagnosing and tuning performance issues.

Question: Why might you want to obfuscate the definition of a stored procedure?
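A minimal sketch (names hypothetical) showing the effect of the option:

```sql
CREATE PROCEDURE dbo.GetSensitiveData
WITH ENCRYPTION
AS
BEGIN
    SELECT 1 AS PlaceholderResult;
END;
GO

-- The stored definition is now obfuscated; this query returns NULL:
SELECT definition
FROM sys.sql_modules
WHERE object_id = OBJECT_ID(N'dbo.GetSensitiveData');
```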
Key Points
In this demonstration you will see:
- How to create a stored procedure
- How to execute a stored procedure
- How to create a stored procedure that returns multiple rowsets
- How to alter a stored procedure
- How to view the list of stored procedures
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_09_PRJ\6232B_09_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: How could the GetBlueProductsAndModels stored procedure be made more useful?
Lesson 3
Implementing Parameterized Stored Procedures
The stored procedures that you have seen earlier in this module have not involved parameters. They have produced their output without needing any input from the user and they have not returned any values back apart from the rows that they have returned. Stored procedures are more flexible when you include parameters as part of the procedure definition because you can create more generic application logic. Stored procedures can use both input and output parameters and return values. While the reuse of query execution plans is in general desirable, there are situations where this reuse is detrimental. You will see situations where this can occur and consider options for workarounds to avoid the detrimental outcomes.
Objectives
After completing this lesson, you will be able to:
- Parameterize stored procedures
- Work with input parameters
- Work with output parameters
- Explain the issues surrounding parameter sniffing and performance, and the potential workarounds
Key Points
Parameterized stored procedures allow for a much higher level of code reuse. They involve three major components: input parameters, output parameters, and return values.
Input Parameters
Parameters are used to exchange data between stored procedures and functions and the application or tool that called the stored procedure or function. They allow the caller to pass a data value to the stored procedure or function. To define a stored procedure that accepts input parameters, you declare one or more variables as parameters in the CREATE PROCEDURE statement. You will see an example of this in the next topic.
Output Parameters
Output parameters allow the stored procedure to pass a data value or a cursor variable back to the caller. A key difference between stored procedures and user-defined functions is that user-defined functions cannot specify output parameters. (User-defined functions are discussed in a later module.) To use an output parameter within Transact-SQL, you must specify the OUTPUT keyword in both the CREATE PROCEDURE and the EXECUTE statements.
Return Value
Every stored procedure returns an integer return code to the caller. If the stored procedure does not explicitly set a value for the return code, the return code is 0 if no error occurs, otherwise a negative value is returned. Return values are commonly used to return a status result or an error code from a procedure and are sent by the T-SQL RETURN statement.
While it is possible to send a business-logic related value via a RETURN statement, in general, you should use OUTPUT parameters to output values rather than the RETURN value.
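A sketch of this convention (the procedure and table names are hypothetical): RETURN carries only a status code, while data travels through rowsets or OUTPUT parameters:

```sql
CREATE PROCEDURE Sales.CheckOrderExists @SalesOrderID int
AS
BEGIN
    IF NOT EXISTS (SELECT 1 FROM Sales.OrderHeader
                   WHERE SalesOrderID = @SalesOrderID)
        RETURN 1;  -- status code: order not found
    RETURN 0;      -- status code: success
END;
GO

DECLARE @Status int;
EXEC @Status = Sales.CheckOrderExists @SalesOrderID = 43659;
SELECT @Status AS StatusCode;
```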
Key Points
Stored procedures can accept input parameters, similar to the way that parameters are passed to functions or methods or subroutines in higher-level languages.
Input Parameters
Stored procedure parameters must have an @ prefix and must have a data type specified. The data type will be checked when a call is made. There are two ways to call a stored procedure with input parameters: one is to pass the parameter values in order; the other is to pass them by name. Once you start passing parameters by name within an EXEC call, all parameters that follow must also be passed by name.
Default Values
Provide default values for a parameter where appropriate. If a default is defined, a user can execute the stored procedure without specifying a value for that parameter. Look at the beginning of the procedure declaration from the example on the slide:
CREATE PROCEDURE Sales.OrdersByDueDateAndStatus @DueDate datetime, @Status tinyint = 5 AS
Two parameters have been defined (@DueDate and @Status). The @DueDate parameter has no default value and must be supplied when the procedure is executed. The @Status parameter has a default value of 5. If a value for the parameter is not supplied when the stored procedure is executed, then a value of 5 will be used.
Validating parameters early avoids doing substantial work in the procedure and then having to undo all that work.
Look at the first option:

EXEC Sales.OrdersByDueDateAndStatus '20050713', 5;

This execution supplies a value for both @DueDate and @Status. Note that the names of the parameters have not been mentioned; SQL Server knows which parameter is which by its position in the parameter list. Now look at the second option:
EXEC Sales.OrdersByDueDateAndStatus '20050713';
In this case, a value for the @DueDate parameter has been supplied but no value for the @Status parameter has been supplied. In this case, the procedure will be executed with the @Status value set at a default value of 5. Finally, look at the third option:
EXEC Sales.OrdersByDueDateAndStatus @DueDate = '20050713', @Status = 5;
In this case, the stored procedure is being called with both parameters but they are being identified by name. Note that you could have also achieved the same outcome with the parameters specified in the reverse order as they are identified by name:
EXEC Sales.OrdersByDueDateAndStatus @Status = 5, @DueDate = '20050713';
Key Points
Output parameters are declared and used very similarly to input parameters, but they have a few special requirements.
Look at the beginning of the procedure declaration from the example on the slide:
CREATE PROC Sales.GetOrderCountByDueDate @DueDate datetime, @OrderCount int OUTPUT AS
In this case, the @DueDate parameter is an input parameter and the @OrderCount parameter has been specified as an output parameter. Note that in SQL Server there is no true equivalent of a .NET output parameter; SQL Server output parameters are really input/output parameters. Now look at how the procedure is called:
DECLARE @DueDate datetime = '20050713'; DECLARE @OrderCount int; EXEC Sales.GetOrderCountByDueDate @DueDate, @OrderCount OUTPUT; SELECT @OrderCount;
First, variables to hold the parameter values have been declared. In this case, a variable to hold a due date has been declared, along with another to hold the order count.
In the EXEC call, note that the @OrderCount parameter is followed by the OUTPUT keyword. If you do not specify the OUTPUT keyword in the EXEC statement, the stored procedure will still execute as normal, including preparing a value to return in the output parameter; however, the value would simply not be copied back into the @OrderCount variable. This is a common bug when working with output parameters. Finally, you would then use the returned value in the business logic that follows the EXEC call.

Question: Why might you use output parameters in conjunction with IDENTITY columns?
Key Points
In general, the reuse of query plans when a stored procedure is re-executed is a good thing. Sometimes however, a stored procedure would benefit from an entirely different query execution plan for different parameter values.
Parameter Sniffing
It has been mentioned that SQL Server attempts to reuse query execution plans from one execution of a stored procedure to the next. While this is mostly helpful, imagine a procedure that takes a range of names as parameters. If you ask for the rows from A to A, you might need a very different query plan than when you ask for A to Z. SQL Server provides a variety of ways to deal with this problem, often called a "parameter sniffing" problem. Note that this only applies to parameters and not to variables within the batch. While the code for these looks very similar, variable values are not "sniffed" at all, and this can lead to poor execution plans regardless of the values used.
WITH RECOMPILE
You can add a WITH RECOMPILE option when declaring a stored procedure. This causes the procedure to be recompiled every time it is executed.
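A sketch (the procedure and table names are hypothetical) — each execution compiles a fresh plan suited to the actual parameter values:

```sql
CREATE PROCEDURE Sales.GetCustomersByNameRange
    @FromName nvarchar(50),
    @ToName   nvarchar(50)
WITH RECOMPILE  -- no plan is cached; every call pays the compilation cost
AS
BEGIN
    SELECT CustomerID, CustomerName
    FROM Sales.Customer
    WHERE CustomerName BETWEEN @FromName AND @ToName;
END;
GO
```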
OPTIMIZE FOR
There is a query hint OPTION (OPTIMIZE FOR) that allows you to specify the value of a parameter that should be assumed when compiling the procedure, regardless of the actual value of the parameter. Question: How would you determine the value to assign in an OPTIMIZE FOR hint?
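A sketch (the procedure name, parameter data type, and assumed value are illustrative): whatever value is actually passed at run time, the plan is compiled as if @Color were N'Black':

```sql
CREATE PROCEDURE Marketing.GetProductsByColorHint @Color nvarchar(15)
AS
BEGIN
    SELECT ProductID, ProductName
    FROM Marketing.Product
    WHERE Color = @Color
    OPTION (OPTIMIZE FOR (@Color = N'Black'));  -- assumed "typical" value
END;
GO
```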
Key Points
In this demonstration you will see:
- How to create a stored procedure with parameters
- How to alter a stored procedure with parameters to correct a common stored procedure bug
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_09_PRJ\6232B_09_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lesson 4
Controlling Execution Context
Stored procedures normally execute in the security context of the user calling the procedure. As long as a chain of ownership extends from the stored procedure to the objects that are referenced, the user can execute the procedure without the need for permissions on the underlying objects. Ownership chaining issues with stored procedures are identical to those that were discussed for views in Module 4. Sometimes, however, more precise control over the security context that the procedure is executing in is required.
Objectives
After completing this lesson, you will be able to:
- Control execution context
- Work with the EXECUTE AS clause
- View execution context
Key Points
The security context that a stored procedure executes in is referred to as its execution context. This context is used to establish the identity against which permissions to execute statements or perform actions are checked.
Execution Contexts
An execution context is represented by a login token and a user token. The tokens identify the primary and secondary principals against which permissions are checked and the source used to authenticate the token. A login connecting to an instance of SQL Server has one login token and one or more user tokens, depending on the number of databases to which the account has access.
Login Token: A login token is valid across the instance of SQL Server. It contains the primary and secondary identities against which server-level permissions and any database-level permissions associated with these identities are checked. The primary identity is the login itself. The secondary identity includes permissions inherited from roles and groups.

User Token: A user token is valid only for a specific database. It contains the primary and secondary identities against which database-level permissions are checked. The primary identity is the database user itself. The secondary identity includes permissions inherited from database roles. User tokens do not contain server-role memberships and do not honor the server-level permissions granted to the identities in the token, including those that are granted to the server-level public role.
Key Points
The EXECUTE AS clause sets the execution context of a stored procedure. It is useful when you need to override the default security context.
Explicit Impersonation
SQL Server supports the ability to impersonate another principal either explicitly by using the stand-alone EXECUTE AS statement, or implicitly by using the EXECUTE AS clause on modules. The stand-alone EXECUTE AS statement can be used to impersonate server-level principals, or logins, by using the EXECUTE AS LOGIN statement. The stand-alone EXECUTE AS statement can also be used to impersonate database level principals, or users, by using the EXECUTE AS USER statement. To execute as another user, you must first have IMPERSONATE permission on that user. Any login in the sysadmin role has IMPERSONATE permission on all users.
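A minimal sketch of explicit impersonation (the user name is hypothetical):

```sql
EXECUTE AS USER = 'SecureUser';  -- requires IMPERSONATE permission
SELECT USER_NAME();              -- now reports SecureUser
REVERT;                          -- switch back to the original context
SELECT USER_NAME();              -- reports the original user again
```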
Implicit Impersonation
Implicit impersonations are performed through the WITH EXECUTE AS clause on modules and are used to impersonate the specified user or login at the database or server level. This impersonation depends on whether the module is a database-level module, such as a stored procedure or function, or a server-level module, such as a server-level trigger. When impersonating a principal by using the EXECUTE AS LOGIN statement, or within a server-scoped module by using the EXECUTE AS clause, the scope of the impersonation is server-wide. This means that after the context switch, any resource within the server that the impersonated login has permissions on can be accessed. However, when impersonating a principal by using the EXECUTE AS USER statement, or within a databasescoped module by using the EXECUTE AS clause, the scope of impersonation is restricted to the database by default. This means that references to objects outside the scope of the database will return an error.
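A sketch of implicit impersonation on a database-level module (the procedure and table names are hypothetical): every execution of the procedure runs in the owner's context, so callers need only EXECUTE permission:

```sql
CREATE PROCEDURE dbo.GetAuditSummary
WITH EXECUTE AS OWNER
AS
BEGIN
    SELECT COUNT(*) AS AuditRowCount
    FROM dbo.AuditLog;  -- callers need no permission on this table
END;
GO
```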
Key Points
You may wish to programmatically query the current security context details. These details are provided by the sys.login_token and sys.user_token system views.
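For example:

```sql
-- Inspect the security context of the current connection
SELECT principal_id, name, type, usage FROM sys.login_token;
SELECT principal_id, name, type, usage FROM sys.user_token;
```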
Key Points
In this demonstration you will see:
- How to view details of execution context
- How to change execution context for a session
- How to use the WITH EXECUTE AS clause in a stored procedure
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_09_PRJ\6232B_09_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 41 Demonstration 4A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   - Right-click 623XB-MIA-DC and click Start.
   - Right-click 623XB-MIA-DC and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   - Right-click 623XB-MIA-SQL and click Start.
   - Right-click 623XB-MIA-SQL and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click on the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   - On the Action menu, click the Ctrl-Alt-Delete menu item.
   - Click Switch User, and then click Other User.
   i. User name: AdventureWorks\Administrator
   ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_09_PRJ\6232B_09_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
You need to create a set of stored procedures to support a new reporting application. The procedures will be created within a new Reports schema.
Supporting Documentation
Stored Procedure: Reports.GetProductColors
Input Parameters: None
Output Parameters: None
Output Columns: Color (from Marketing.Product)
Output Order: Color
Notes: Colors should not be returned more than once in the output. NULL values should not be returned.

Stored Procedure: Reports.GetProductsAndModels
Input Parameters: None
Output Parameters: None
Output Columns: ProductID, ProductName, ProductNumber, SellStartDate, SellEndDate and Color (from Marketing.Product), ProductModelID (from Marketing.ProductModel), EnglishDescription, FrenchDescription, ChineseDescription
Output Order: ProductID, ProductModelID
Notes: For descriptions, return the Description column from the Marketing.ProductDescription table for the appropriate language. The LanguageID for English is 'en', for French is 'fr' and for Chinese is 'zh-cht'. If no specific language description is available, return the invariant language description if it is present. The LanguageID for the invariant language is a blank string ''. Where neither the specific language nor the invariant language descriptions exist, return the ProductName instead.

Stored Procedure: Reports.GetProductsByColor
Input Parameters: @Color (same datatype as the Color column in the Marketing.Product table)
Output Parameters: None
Output Columns: ProductID, ProductName, ListPrice (returned as a column named Price), Color, Size and SizeUnitMeasureCode (returned as a column named UnitOfMeasure) (from Marketing.Product)
Output Order: ProductName
Notes: The procedure should return products that have no Color if the parameter is NULL.
Challenge Exercise 3: Alter the execution context of stored procedures (Only if time permits)
Scenario
In this exercise, you will alter the stored procedures to use a different execution context. The main tasks for this exercise are as follows:
1. Alter the Reports.GetProductColors stored procedure to execute as OWNER.
2. Alter the Reports.GetProductsAndModels stored procedure to execute as OWNER.
3. Alter the Reports.GetProductsByColor stored procedure to execute as OWNER.
Review Questions
1. What does the WITH RECOMPILE option do when used with a CREATE PROC statement?
2. What does the WITH RECOMPILE option do when used with an EXECUTE statement?
Best Practices
1. Use the EXECUTE AS clause to override the execution context of stored procedures that use dynamic SQL, rather than granting permissions on the underlying tables to users.
2. Design procedures to perform individual tasks. Avoid designing procedures that perform a large number of tasks, unless those tasks are performed by executing other stored procedures.
3. Keep consistent ownership of stored procedures, views, tables, and other objects within databases.
Module 10
Merging Data and Passing Tables
Contents:
Lesson 1: Using the MERGE Statement 10-3
Lesson 2: Implementing Table Types 10-14
Lesson 3: Using TABLE Types As Parameters 10-22
Lab 10: Passing Tables and Merging Data 10-26
Module Overview
Each time a client application makes a call to a SQL Server system, considerable delay is encountered at the network layer. The basic delay is unrelated to the amount of data being passed. It relates to the latency of the network. For this reason, it is important to minimize the number of times that a client needs to call a server for a given amount of data that must be passed between them. Each call is termed a "roundtrip". In this module you will review the techniques that provide the ability to process sets of data rather than individual rows. You will then see how these techniques can be used in combination with TABLE parameter types to minimize the number of required stored procedure calls in typical applications.
Objectives
After completing this module, you will be able to:
- Use the MERGE statement
- Implement table types
- Use TABLE types as parameters
Lesson 1
Using the MERGE Statement
A very common requirement when coding in T-SQL is the need to update a row if it exists but to insert the row if it does not already exist. SQL Server 2008 introduced the MERGE statement that provides this ability plus the ability to process entire sets of data rather than processing row by row or in several separate set-based statements. This leads to much more efficient execution and simplifies the required coding. In this lesson, you will investigate the use of the MERGE statement and the use of the most common options associated with the statement.
Objectives
After completing this lesson, you will be able to:
- Explain the role of the MERGE statement
- Describe how to use the WHEN MATCHED clause
- Describe how to use the WHEN NOT MATCHED BY TARGET clause
- Describe how to use the WHEN NOT MATCHED BY SOURCE clause
- Explain the role of the OUTPUT clause and $action
- Describe MERGE determinism and performance
MERGE Statement
Key Points
The MERGE statement is most commonly used to insert data that does not already exist but to update the data if it does exist. It can operate on entire sets of data rather than just on single rows and can perform alternate actions such as deletes.
MERGE
It is a common requirement to need to update data if it already exists but to insert it if it does not already exist. Some other database engines (not SQL Server) provide an UPSERT statement for this purpose. The MERGE statement provided by SQL Server is a more capable replacement for such statements in other database engines and is based on the ANSI SQL standard together with some Microsoft extensions to the standard. A typical situation where the need for the MERGE statement arises is in the population of data warehouses from data in source transactional systems. For example, consider a data warehouse holding details of a customer. When a customer row is received from the transactional system, it needs to be inserted into the data warehouse. When later updates to the customer are made, the data warehouse would then need to be updated.
Atomicity
Where statements in other languages typically operate on single rows, the MERGE statement in SQL Server can operate on entire sets of data in a single statement execution. It is important to realize that the MERGE statement functions as an atomic operation in that all inserts, updates or deletes occur or none occur.
The source table provides the rows that need to be matched to the rows in the target table. You can think of the source table as the incoming data. It is specified in a USING clause. The source table does not have to be an actual table but can be other types of expressions that return a table, such as:
- A view
- A sub-select (or derived table) with an alias
- A common table expression (CTE)
- A VALUES clause with an alias
The source and target are matched together as the result of an ON clause. This can involve one or more columns from both tables.
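For example, a VALUES clause can serve directly as the source. The following sketch uses hypothetical employee values; the table and column names follow the Employee example used later in this lesson:

```sql
MERGE INTO dbo.Employee AS e
USING (VALUES (101, 'Terry Adams', 'Active'),
              (102, 'Dan Park', 'OnLeave'))
      AS eu (EmployeeID, FullName, EmploymentStatus)  -- alias is required
ON e.EmployeeID = eu.EmployeeID
WHEN MATCHED THEN
    UPDATE SET e.FullName = eu.FullName,
               e.EmploymentStatus = eu.EmploymentStatus
WHEN NOT MATCHED THEN
    INSERT (EmployeeID, FullName, EmploymentStatus)
    VALUES (eu.EmployeeID, eu.FullName, eu.EmploymentStatus);
```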
WHEN MATCHED
Key Points
The WHEN MATCHED clause defines the action to be taken when a row in the source is matched to a row in the target.
WHEN MATCHED
The ON clause is used to match source rows to target rows. The WHEN MATCHED clause specifies the action that needs to occur when a source row matches a target row. In most cases, this will involve an UPDATE statement but it could alternately involve a DELETE statement. In the example shown in the slide, rows in the EmployeeUpdate table are being matched to rows in the Employee table based upon the EmployeeID. When a source row matches a target row, the FullName and EmploymentStatus columns in the target table are updated with the values of those columns in the source. Note that only the target table can be updated. If an attempt is made to modify any other table, a syntax error is returned.
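The slide example described above can be sketched as follows, using the table and column names given in the text:

```sql
MERGE INTO dbo.Employee AS e
USING dbo.EmployeeUpdate AS eu
ON e.EmployeeID = eu.EmployeeID
WHEN MATCHED THEN
    -- Only the target table (Employee) may be modified, so no
    -- table name appears in the UPDATE action itself.
    UPDATE SET e.FullName = eu.FullName,
               e.EmploymentStatus = eu.EmploymentStatus;
```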
Multiple Clauses
It is also possible to include two WHEN MATCHED clauses such as shown in the following code block:
WHEN MATCHED AND s.Quantity > 0 ...
WHEN MATCHED ...
No more than two WHEN MATCHED clauses can be present. When two clauses are used, the first clause must have an AND condition. If the source row matches the target and also satisfies the AND condition, then the action specified in the first WHEN MATCHED clause is performed. Otherwise, if the source row matches the target but does not satisfy the AND condition, the condition in the second WHEN MATCHED clause is evaluated instead. When two WHEN MATCHED clauses are present, one action must specify an UPDATE and the other action must specify a DELETE.

Question: What is different about the UPDATE statement in the example shown, compared to a normal UPDATE statement?
Key Points
The WHEN NOT MATCHED BY TARGET clause specifies the action that needs to be taken when a row in the source cannot be matched to a row in the target.
Syntax
The words BY TARGET are optional and are often omitted. The clause is then just written as WHEN NOT MATCHED. Note again that no table name is included in the action statement (INSERT statement) as modifications may only be made to the target table. The WHEN NOT MATCHED BY TARGET clause is part of the ANSI SQL standard.
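Continuing the Employee example used in this lesson, the WHEN NOT MATCHED BY TARGET clause typically performs the insert. A sketch:

```sql
MERGE INTO dbo.Employee AS e
USING dbo.EmployeeUpdate AS eu
ON e.EmployeeID = eu.EmployeeID
WHEN NOT MATCHED BY TARGET THEN
    -- Insert source rows that have no matching row in the target.
    -- Note: no table name in the INSERT action; only the target
    -- table can be modified.
    INSERT (EmployeeID, FullName, EmploymentStatus)
    VALUES (eu.EmployeeID, eu.FullName, eu.EmploymentStatus);
```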
Key Points
The WHEN NOT MATCHED BY SOURCE statement is used to specify an action to be taken for rows in the target that were not matched by rows from the source.
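For example, target rows that no longer appear in the source can be deleted. A sketch using the same Employee tables as the earlier examples:

```sql
MERGE INTO dbo.Employee AS e
USING dbo.EmployeeUpdate AS eu
ON e.EmployeeID = eu.EmployeeID
WHEN MATCHED THEN
    UPDATE SET e.FullName = eu.FullName,
               e.EmploymentStatus = eu.EmploymentStatus
WHEN NOT MATCHED BY SOURCE THEN
    -- Remove target rows that have no corresponding source row.
    DELETE;
```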
Key Points
The OUTPUT clause was added in SQL Server 2005 and allows the return of a set of rows when performing data modifications. In 2005, this applied to INSERT, DELETE and UPDATE. In SQL Server 2008 and later, this clause can also be used with the MERGE statement.
OUTPUT Clause
The OUTPUT clause was a useful addition to the INSERT, UPDATE and DELETE statements in SQL Server 2005. For example, consider the following code:
DELETE FROM HumanResources.Employee
OUTPUT deleted.BusinessEntityID, deleted.NationalIDNumber
WHERE ModifiedDate < DATEADD(YEAR, -10, SYSDATETIME());
In this example, employees are deleted when their rows have not been modified within the last ten years. As part of this modification, a set of rows is returned that provides details of the BusinessEntityID and NationalIDNumber for each row deleted. As well as returning rows to the client application, the OUTPUT clause can include an INTO sub-clause that causes the rows to be inserted into another existing table instead. Consider the following example:
DELETE FROM HumanResources.Employee
OUTPUT deleted.BusinessEntityID, deleted.NationalIDNumber
    INTO Audit.EmployeeDelete
WHERE ModifiedDate < DATEADD(YEAR, -10, SYSDATETIME());
In this example, details of the employees being deleted are inserted into the Audit.EmployeeDelete table instead of being returned to the client.
Composable SQL
In SQL Server 2008 and later, it is now possible to consume the rowset returned by the OUTPUT clause more directly. The rowset cannot be used as a general purpose table source but can be used as a table source for an INSERT SELECT statement. Consider the following example:
INSERT INTO Audit.EmployeeDelete
SELECT Mods.EmployeeID
FROM (MERGE INTO dbo.Employee AS e
      USING dbo.EmployeeUpdate AS eu
      ON e.EmployeeID = eu.EmployeeID
      WHEN MATCHED THEN
          UPDATE SET e.FullName = eu.FullName,
                     e.EmploymentStatus = eu.EmploymentStatus
      WHEN NOT MATCHED THEN
          INSERT (EmployeeID, FullName, EmploymentStatus)
          VALUES (eu.EmployeeID, eu.FullName, eu.EmploymentStatus)
      OUTPUT $action AS Action, deleted.EmployeeID) AS Mods
WHERE Mods.Action = 'DELETE';
In this example, the OUTPUT clause is being used with the MERGE statement. A row would be returned for each row either updated or deleted. However, you wish to only audit the deletion. You can treat the MERGE statement with an OUTPUT clause as a table source for an INSERT SELECT statement. The enclosed statement must be given an alias. In this case, the alias "Mods" has been assigned. The power of being able to SELECT from a MERGE statement is that you can then apply a WHERE clause. In this example, only the DELETE actions have been selected. Note that from SQL Server 2008 onwards, this level of query composability also applies to the OUTPUT clause when used in standard T-SQL INSERT, UPDATE and DELETE statements. Question: How could the OUTPUT clause be useful in a DELETE statement?
Key Points
The actions performed by a MERGE statement are not identical to those that would be performed by separate INSERT, UPDATE or DELETE statements.
Determinism
When an UPDATE statement is executed with a join, if more than one source row matches a target row, no error is thrown. This is not permitted for an UPDATE action performed within a MERGE statement. Each source row must match only a single target row or none at all. If more than a single source row matches a target row, an error occurs and all actions performed by the MERGE statement are rolled back.
Performance of MERGE
The MERGE statement will often outperform code constructed from separate INSERT, UPDATE and DELETE statements and conditional logic. In particular, the MERGE statement only ever makes a single pass through the data.
Key Points
In this demonstration you will see:
- How to use the MERGE statement
- How to use the OUTPUT clause with the MERGE statement
- How to perform optional updates with MERGE
- How to use MERGE as a composable query
- How to use the VALUES clause as a MERGE source
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_10_PRJ\6232B_10_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.

Question: What is meant by the term "composable query"?
Lesson 2
Implementing Table Types
It was mentioned earlier that reducing the number of calls between client applications and SQL Server is important. The aim is to minimize the amount of time lost through network delays and latency. SQL Server 2000 introduced the TABLE data type for use as variables. SQL Server 2008 introduced the ability to declare these as permanent (or temporary) data types and to use them as parameters. The use of TABLE valued parameters can significantly reduce the number of round trips needed between client application code and SQL Server.
Objectives
After completing this lesson, you will be able to:
- Explain the need to reduce round-trip overhead
- Describe previous options for passing lists as parameters
- Explain the role of the TABLE type
- Populate TABLE types with row constructors
Key Points
In many applications, the time taken for commands to be sent to the server and for responses to be received can be substantial. It can often be longer than the time to execute the SQL command at the server. It is desirable then to minimize the number of times this happens in an operation.
This means that performing a single action results in six separate round trips to the server.
Transaction Duration
A golden rule when designing systems to maximize concurrency is to never hold a transaction open for longer than required. In the previous example, the transaction is being held open for the time to insert the order header and detail lines. While the transaction needs to be held open for this time, its duration is being artificially increased by the time taken to make all the round trips to the server. This is not desirable. Question: How could the number of round trips being made to the server be reduced?
Key Points
One method for reducing the number of round trips from a client application to SQL Server is to pass lists of values in each trip.
Previous Options
Prior to SQL Server 2008, the available options for passing lists of values within a single procedure call were very limited. The most commonly used method was to use delimited lists. Mostly these were implemented as comma-delimited lists, but other delimiters (e.g., pipe) were used. When delimited lists were used, the entire set of values was sent as a string. There are issues with doing this:
- No control could be exerted over the data type being passed. A string could be passed to code expecting a number.
- The structure was loose. For example, one string might contain five columns and another might contain six.
- Custom string parsing logic needed to be written.
Passing XML
Another option for passing lists of values was to use XML. This became more common with SQL Server 2005 when the XML data type was introduced. Prior to SQL Server 2005, developers would also sometimes pass XML values as strings but they then needed to write very complex parsing logic. With the introduction of the XML data type, the parsing became easier but not trivial. Processing the received XML is also non-trivial and some processing methods (such as OPENXML) had memory implications, while others (such as the nodes() method) had query optimization implications. In modules 17 and 18 you will learn more about the use of XML in SQL Server.
Key Points
In this demonstration you will see:
- How to query a table-valued function that performs list parsing
- How the function allows for different delimiters
- How to use the function in a join
- How common errors can occur with delimited lists
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_10_PRJ\6232B_10_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: What are the basic problems with using delimited lists for parameters?
Key Points
SQL Server 2008 introduced the ability to create user TABLE data types and record them in the system catalog. These types are very useful as they can be used for both variables and for parameters.
TABLE Type
In SQL Server 2000, it was possible to declare a variable of type TABLE. You needed to define the schema of the table when declaring the variable such as in the following code:
DECLARE @BalanceList TABLE (CustomerID int, CurrentBalance decimal(18,2));
In this example, a variable called @BalanceList is defined as being a table. The schema of the table, and the variable itself, last only for the duration of the batch in which the variable is defined. SQL Server 2008 introduced the ability to create user-defined table data type definitions. You can create table data types that can be used both for the data type of variables and for the data type of parameters. In the example shown on the slide, CustomerBalance is declared as a new data type. You can declare a variable as being of the CustomerBalance data type as follows:
CREATE TYPE dbo.CustomerBalance AS TABLE
    (CustomerID int, CurrentBalance decimal(18,2));
GO

DECLARE @BalanceList dbo.CustomerBalance;
A key advantage of table data types is that you can pass complex structures in a table more easily than you could with alternatives such as comma-delimited lists. You can have multiple rows, each with two or more columns, and you can be sure of the data types that will be stored. This is useful even when only declaring variables: a table type can be created once and then used for variables throughout the database application, which leads to fewer potential inconsistencies. Note that there is no ALTER TYPE statement that can be used to modify a TABLE type definition. Types must be dropped and then recreated when they need to be altered.
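Because there is no ALTER TYPE, changing a table type means dropping and recreating it. A sketch, in which the added CreditLimit column is hypothetical; note that any procedures or functions referencing the type must be dropped or altered first, or the DROP TYPE will fail:

```sql
DROP TYPE dbo.CustomerBalance;
GO
CREATE TYPE dbo.CustomerBalance AS TABLE
    (CustomerID int,
     CurrentBalance decimal(18,2),
     CreditLimit decimal(18,2));  -- hypothetical new column
GO
```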
Key Points
SQL Server 2008 also introduced the concept of row constructors and multi-row INSERT statements. These are useful when working with TABLE data types as well as when working with database tables.
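As a sketch of how row constructors combine with TABLE data types, the following inserts several rows into a table variable in a single statement. The sample values are hypothetical; dbo.CustomerBalance is the type created in the previous topic:

```sql
DECLARE @BalanceList dbo.CustomerBalance;

-- A single INSERT with a row constructor adds multiple rows at once.
INSERT INTO @BalanceList (CustomerID, CurrentBalance)
VALUES (1, 250.00),
       (2, 0.00),
       (3, -35.50);
```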
Key Points
In this demonstration you will see:
- How to work with row constructors
- How to declare a table data type
- How to work with variables with user-defined table data types

Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_10_PRJ\6232B_10_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 22 Demonstration 2B.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: Can other users make use of the table type that you create?
Lesson 3
Using TABLE Types As Parameters
In the previous lesson you saw how to declare TABLE data types and how to declare variables of those types. Another potential use for TABLE data types is in the declaration of parameters, particularly for stored procedures, but also for user-defined functions. In this lesson, you will see how to use TABLE input parameters for stored procedures and how this solves the round-trip problems identified in Lesson 1. You will also see how to call a stored procedure when passing a table-valued parameter.
Objectives
After completing this lesson, you will be able to:
- Describe the use of TABLE input parameters for stored procedures
- Use row constructors to populate parameters to be passed to stored procedures
Key Points
As well as being used for declaring variables, user-defined table data types can be used as parameter data types for stored procedures and functions. (User-defined functions will be discussed in Module 13).
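A minimal sketch of a procedure that takes a table-valued parameter follows. The procedure name and the dbo.Customer table are hypothetical; note that table-valued parameters must be declared READONLY:

```sql
CREATE PROCEDURE dbo.UpdateCustomerBalances
    @BalanceList dbo.CustomerBalance READONLY  -- TVPs must be READONLY
AS
BEGIN
    -- The parameter is queried like any other table source.
    UPDATE c
    SET c.CurrentBalance = b.CurrentBalance
    FROM dbo.Customer AS c
    INNER JOIN @BalanceList AS b
        ON b.CustomerID = c.CustomerID;
END;
GO
```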
Key Points
In the previous topic, you saw how to declare a stored procedure that uses a table-valued parameter. The final step in using such a procedure is to then pass a table parameter in the EXEC statement.
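A sketch of such a call follows. The procedure name and the sample values are hypothetical; the variable must be declared with the same table type as the procedure's parameter:

```sql
DECLARE @BalanceList dbo.CustomerBalance;

INSERT INTO @BalanceList (CustomerID, CurrentBalance)
VALUES (1, 100.00),
       (2, 200.00);

-- A single round trip passes all the rows to the procedure.
EXEC dbo.UpdateCustomerBalances @BalanceList;
```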
Key Points
In this demonstration you will see:
1. How traditional stored procedure calls often involve multiple round trips to the server
2. How to declare a table data type
3. How to use the table data type to avoid round trips
4. How to view catalog information about the table data types by querying the sys.types and sys.table_types system views
Demonstration Steps
1. If Demonstration 1A was not performed:
   - Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   - In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   - From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_10_PRJ\6232B_10_PRJ.ssmssln and click Open.
   - Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Follow the instructions contained within the comments of the script file.
Question: What is the purpose of the SCOPE_IDENTITY() function shown in the demonstration?
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   - Right-click 623XB-MIA-DC and click Start.
   - Right-click 623XB-MIA-DC and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   - Right-click 623XB-MIA-SQL and click Start.
   - Right-click 623XB-MIA-SQL and click Connect.
   - In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click on the Revert toolbar icon. If you are prompted to confirm that you want to revert, click Revert.
6. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   - On the Action menu, click the Ctrl-Alt-Delete menu item.
   - Click Switch User, and then click Other User.
   - Log on using the following credentials:
i. User name: AdventureWorks\Administrator
ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. On the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_10_PRJ\6232B_10_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
In earlier versions of SQL Server, passing lists of values to stored procedures was a challenge. SQL Server 2008 introduced the table type and table-valued parameters. In this lab, you will create a replacement stored procedure Reports.GetProductsByColorList_Test that uses a table-valued parameter to replace an existing stored procedure Reports.GetProductsByColorList that was based on passing a comma-delimited list of values. If time permits, you will then create a new procedure that processes complete rows of data and performs updates using the MERGE statement.
Supporting Documentation
Procedure Required: Marketing.SalespersonMerge

Requirements:
- Input Parameters: Table of Salesperson details, including SalespersonID, FirstName, MiddleName, LastName, BadgeNumber, EmailAlias, SalesTerritoryID. The parameter should be named SalespersonDetails.
- Output Parameters: None
- Output Columns: For each row, return one column called Action that contains INSERT or UPDATE and another column with the SalespersonID.
- Notes: The SalespersonID must be provided. If it matches an existing salesperson, that row should be updated. Only update columns that are provided. Any SalesTerritoryID that is provided must be valid as it is defined as a foreign key to the Marketing.SalesTerritory table.
Challenge Exercise 3: Use a Table Type with MERGE (Only if time permits)
Scenario
In this exercise, you will create a new stored procedure that takes a table-valued parameter and uses the MERGE statement to update a table in the marketing system. The procedure should allow for the creation or update of salespeople held in the Marketing.Salesperson table. The main tasks for this exercise are as follows:
1. Create a new table type.
2. Create a replacement procedure.
3. Test the replacement procedure.
Review Questions
1. What is the difference between WHEN NOT MATCHED BY SOURCE and WHEN NOT MATCHED BY TARGET in a MERGE statement?
2. What is a key advantage of the MERGE statement in terms of performance?
Best Practices
1. Use multi-row inserts when the rows being inserted are related in some way, for example, the detail rows of an invoice.
2. Consider making multiple-entity procedures instead of single-entity procedures to help minimize round-trip behavior and to reduce locking. For example, very minor changes are required to construct a stored procedure that can insert multiple sales orders compared to a stored procedure that can insert a single sales order.
Module 11
Creating Highly Concurrent SQL Server 2008 R2 Applications
Contents:
Lesson 1: Introduction to Transactions 11-3
Lesson 2: Introduction to Locks 11-17
Lesson 3: Management of Locking 11-28
Lesson 4: Transaction Isolation Levels 11-38
Lab 11: Creating Highly Concurrent SQL Server Applications 11-44
Module Overview
It is the responsibility of an enterprise database system to provide mechanisms ensuring the physical integrity of each transaction. A transaction is a sequence of operations performed as a single logical unit of work, and Microsoft SQL Server provides locking facilities that preserve transaction isolation. In this module, you will learn how to manage transactions and locks. Database systems struggle with the need to balance consistency and concurrency. There is often a direct trade-off between these two aims. The challenge is to use the lowest concurrency impact possible while still maintaining sufficient consistency. This module explains how to use transactions and the SQL Server locking mechanisms to meet the performance and data integrity requirements of your applications. Another aim of database management systems is to provide each user with the illusion wherever possible that they are the only user on the system. Transaction isolation levels are critical to minimizing the impact of one user on another. In this module you will also investigate transaction isolation levels.
Objectives
After completing this module, you will be able to:
- Describe the role of transactions
- Explain the role of locks
- Manage locking
- Work with transaction isolation levels
Lesson 1
Introduction to Transactions
A core capability provided by relational database management systems such as SQL Server is to provide the ability to group a set of changes that need to be made and to ensure that the entire set of changes occurs or that none occur. Transactions are how this requirement is met within SQL Server.
Objectives
After completing this lesson, you will be able to:
- Explain what transactions are
- Describe autocommit transactions
- Describe explicit transactions
- Describe implicit transactions
- Explain the role of transaction recovery
- Detail considerations for using transactions
Key Points
A transaction is a sequence of steps that performs a logical unit of work. Transactions must exhibit four properties that are collectively known as ACID.
Atomicity
Atomicity requires that either all the steps in the transaction succeed or none of them are performed.
Consistency
Consistency ensures that when the transaction is complete, data must be in a consistent state. A consistent state is one where the data is consistent with the business rules related to the data. Inconsistent data violates one or more business rules. In SQL Server, the database has to be in a consistent state after each statement within a transaction and not only at the end of the transaction.
Isolation
Isolation defines that changes made by a transaction must be isolated from other concurrent transactions.
Durability
Durability defines that when the transaction is complete, the changes must be made permanent in the database and must survive even system failures. Transactions ensure that multiple data modifications are processed as a unit. For example, a banking transaction might credit one account and debit another. Both steps must be completed together or not at all. SQL Server supports transaction processing to manage multiple transactions.
Transaction Log
Every transaction is recorded in a transaction log to maintain database consistency and to aid in transaction recovery. When changes are made to data in SQL Server, the database pages that have been modified are written to the transaction log on the disk first and later written to the database. If any part of the transaction fails, all of the changes made so far are rolled back to leave the database in its original state. This system ensures that updates are complete and recoverable. Transactions use locking to prevent other users from changing or reading data in a transaction that has not completed. Locking is required in online transaction processing (OLTP) for multi-user systems.

Question: Can you think of database operations in your organization where database transactions are especially critical?
Key Points
Autocommit mode is the default transaction management mode of the SQL Server Database Engine. Every Transact-SQL statement is committed or rolled back when it completes. If a statement completes successfully, it is committed; if it encounters any error, it is rolled back.
Autocommit Mode
A connection to an instance of the Database Engine operates in autocommit mode whenever this default mode has not been overridden by either explicit or implicit transactions. Autocommit mode is also the default mode for ADO, OLE DB, ODBC, and DB-Library. A connection to an instance of the Database Engine operates in autocommit mode until a BEGIN TRANSACTION statement starts an explicit transaction, or implicit transaction mode is set on. When the explicit transaction is committed or rolled back, or when implicit transaction mode is turned off, the connection returns to autocommit mode. If a run-time statement error (such as a constraint violation) occurs in a batch, the default behavior in the Database Engine is to roll back only the statement that generated the error. You can change this behavior using the SET XACT_ABORT statement. After SET XACT_ABORT ON is executed, any run-time statement error causes the batch to quit and if you are currently in a transaction, that would cause an automatic rollback of the current transaction.
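As a sketch of the difference XACT_ABORT makes, consider the following (dbo.Demo is a hypothetical table with a primary key on its only column):

```sql
SET XACT_ABORT ON;
BEGIN TRANSACTION;
    INSERT INTO dbo.Demo VALUES (1);
    INSERT INTO dbo.Demo VALUES (1);  -- Run-time error (PK violation):
                                      -- with XACT_ABORT ON, the batch quits
                                      -- and the whole transaction rolls back.
COMMIT TRANSACTION;  -- Never reached when the error occurs.
```

With XACT_ABORT OFF (the default), only the failing INSERT would be rolled back and the first row would still be committed.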
Compile Errors
Compile errors, such as syntax errors, are not affected by SET XACT_ABORT. When working in autocommit mode, compile time errors can cause more than one Transact-SQL statement to fail. In this mode, a batch of statements is compiled as a unit and if a compile error is found, nothing in the batch is compiled or executed. The following example shows how this might happen.
USE AdventureWorks2008;
GO
CREATE TABLE NewTable (Id INT PRIMARY KEY, Info CHAR(3));
GO
INSERT INTO NewTable VALUES (1, 'aaa');
INSERT INTO NewTable VALUES (2, 'bbb');
INSERT INTO NewTable VALUSE (3, 'ccc'); -- Syntax error.
GO
SELECT * FROM NewTable; -- Returns no rows.
GO
Because there is a typographical error in the third INSERT statement, the batch cannot compile, and so none of the three INSERT statements run. XACT_ABORT can convert statement-terminating errors into batch-terminating errors; it will be covered in more detail in the T-SQL Error Handling module. Question: When might autocommit mode not be appropriate in a database application?
Explicit Transactions
Key Points
An explicit transaction is one in which you explicitly define both the start and end of the transaction. You can use explicit transactions to define your own units of business logic. For example, in a bank transfer function, you might enclose the withdrawal of funds from one account and the deposit of those funds in another account within one logical unit of work. DB-Library applications and Transact-SQL scripts use the BEGIN TRANSACTION, COMMIT TRANSACTION, COMMIT WORK, ROLLBACK TRANSACTION, or ROLLBACK WORK Transact-SQL statements to define explicit transactions.
Starting a Transaction
You start a transaction by using the BEGIN TRANSACTION statement. You can specify a name for the transaction, and you can use the WITH MARK option to specify a description for the transaction to be marked in the transaction log. This transaction log mark can be used when restoring a database to indicate the point to which you want to restore. The BEGIN TRANSACTION statement is often abbreviated to BEGIN TRAN. XACT_ABORT has an effect on explicit transactions as well as on implicit transactions. By default, only the statement in error is rolled back; the batch continues to run and commits the other statements in the transaction. Therefore, error handling must be implemented (which will be discussed in a later module), or XACT_ABORT can be turned on to abort the batch and roll back the whole transaction in case of an error.
Committing a Transaction
You can commit the work contained in a transaction by issuing the COMMIT TRANSACTION statement. Use this to end a transaction if no errors have occurred and you want the contents of the transaction to be committed to the database.
The COMMIT TRANSACTION statement is often abbreviated to just the word COMMIT. To assist users from other database platforms that are migrating to SQL Server, the statement COMMIT WORK can also be used.
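For example, the bank transfer described earlier might be sketched as follows (the dbo.Accounts table, its columns, and the account IDs are illustrative only):

```sql
BEGIN TRANSACTION FundsTransfer WITH MARK 'Transfer funds between accounts';

-- Withdrawal from one account and deposit into another form one logical unit of work.
UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1; -- withdrawal
UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2; -- deposit

COMMIT TRANSACTION; -- both changes succeed or fail as one unit of work
```

The WITH MARK clause records the named mark in the transaction log, so a later restore could stop at this point.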
Saving a Transaction
By using savepoints, you can roll back a transaction to a named point in the transaction, instead of the beginning of the transaction. You create a savepoint by issuing the SAVE TRANSACTION statement and specifying the name of the savepoint. You can then use the ROLLBACK TRANSACTION statement and specify the savepoint name to roll the changes back to that point. Use savepoints when an error is unlikely to occur and the cost of checking the data before the error occurs is much higher than testing for the error after the data modifications have been submitted. For example, if you do not expect stock levels to be too low to fulfill an order, you could create a trigger that raises an error when stock levels fall below zero on a stock table. In your ordering code, you can create a savepoint, submit the order, and then check for a negative stock level error from the trigger. If that error is raised, you can roll back the transaction to before the savepoint and notify the customer accordingly. Question: When might you want to use a savepoint?
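A minimal sketch of the ordering scenario described above (the table, column, and savepoint names are hypothetical, and the error check is simplified to @@ERROR):

```sql
BEGIN TRANSACTION;

SAVE TRANSACTION BeforeOrder; -- named savepoint

-- A trigger on the stock table might raise an error here if stock falls below zero.
INSERT INTO dbo.OrderDetails (OrderId, ProductId, Quantity)
VALUES (1, 100, 5);

IF @@ERROR <> 0
    ROLLBACK TRANSACTION BeforeOrder; -- undo back to the savepoint only

COMMIT TRANSACTION; -- commits whatever work remains in the transaction
```

Note that rolling back to a savepoint does not end the transaction; the final COMMIT (or a full ROLLBACK) is still required.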
Implicit Transactions
Key Points
When a connection is operating in implicit transaction mode, the database engine automatically starts a new transaction after the current transaction is committed or rolled back. You do nothing to delineate the start of a transaction; you only commit or roll back each transaction. Implicit transaction mode generates a continuous chain of transactions.
Implicit Transactions
In most cases, it is best to work in autocommit mode and define transactions explicitly using the BEGIN TRANSACTION statement. However, for applications that were originally developed on systems other than SQL Server, implicit transaction mode can be useful. Implicit transaction mode automatically starts a transaction when you issue certain statements, and the transaction then continues until you issue a commit statement or a rollback statement.
By default, implicit transaction mode is off and the database works in autocommit mode.
In implicit transaction mode, the first execution of any of the following statements automatically starts a new transaction:
CREATE
DELETE
DROP
FETCH
GRANT
INSERT
OPEN
REVOKE
SELECT
TRUNCATE TABLE
UPDATE
Nested transactions (where a transaction is started within another transaction) are not allowed in implicit transaction mode. If the connection is already in a transaction, these statements do not start a new transaction.
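The chain of implicit transactions can be sketched as follows (dbo.TestTable is hypothetical):

```sql
SET IMPLICIT_TRANSACTIONS ON;

INSERT INTO dbo.TestTable VALUES (1); -- automatically starts transaction 1
COMMIT;                               -- ends transaction 1

SELECT * FROM dbo.TestTable;          -- automatically starts transaction 2
ROLLBACK;                             -- ends transaction 2

SET IMPLICIT_TRANSACTIONS OFF;        -- return the connection to autocommit mode
```

Forgetting the COMMIT or ROLLBACK in this mode leaves a transaction open, which is a common source of long-held locks.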
Transaction Recovery
Key Points
SQL Server automatically guarantees that all committed transactions are reflected in the database in the event of a failure. It uses the transaction log and checkpoints to do this.
Checkpoints
As each Transact-SQL statement is executed, it is recorded to the transaction log on disk before it is written to the database and before the user is notified that the transaction was committed successfully. SQL Server performs checkpoints at defined intervals. Checkpoints are marked in the transaction log to identify which transactions have already been applied to the database. When a new checkpoint occurs, all data pages in memory that have been modified since the last checkpoint are written to the database.
Transaction Recovery
If any errors occur during a transaction, the instance of SQL Server uses the information in the log file to roll back the transaction. This rollback does not affect the work of any other users working in the database at the same time. Usually, the error is returned to the application, and if the error indicates a possible problem with the transaction, the application issues a ROLLBACK statement. Some errors, such as a 1205 deadlock error, roll back a transaction automatically. If anything stops the communication between the client and an instance of SQL Server while a transaction is active, the instance rolls back the transaction automatically when notified of the stoppage by the network or operating system. This could happen if the client application terminates, if the client computer is shut down or restarted, or if the client network connection is broken. In all of these error conditions, any outstanding transaction is rolled back to protect the integrity of the database.
When an instance of SQL Server is restarted, it runs a recovery process on each database to roll forward committed transactions and roll back any incomplete transactions. The recovery process uses the last checkpoint in the log as a starting marker: all transactions committed before the checkpoint were already written to the database, and all transactions that started before the checkpoint but were still active need to be rolled back, as their changes were already written to the data files. In the slide example:
Transaction 1 is committed before the checkpoint, so it is reflected in the database.
Transactions 2 and 4 were committed after the checkpoint, so they must be reconstructed from the log (rolled forward).
Transactions 3 and 5 were not committed, so SQL Server rolls them back.
Question: A server crash occurs while two transactions are running. Transaction A is an autocommit transaction that has been written to the transaction log, but not written to the disk. Transaction B is an explicit transaction that has not been committed, though a checkpoint was written while Transaction B was running. What will happen to each transaction when the server is recovered?
Key Points
There are a number of general considerations that need to be kept in mind when working with transactions.
Key Points
In this demonstration you will see:
How transactions work
How blocking affects other users
Note that blocking is discussed further in the next lesson.
Demonstration Steps
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_11_PRJ\6232B_11_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file and the 12 Demonstration 1A 2nd Window.sql script file.
6. Follow the instructions contained within the comments of the script file.
Lesson 2
Introduction to Locks
SQL Server (and many other database engines) makes extensive use of locks when ensuring isolation between users and consistency of transactions. It is important to have an understanding of how locking works and to understand how locking differs from blocking.
Objectives
After completing this lesson, you will be able to:
Detail different methods of concurrency control
Explain what locks are
Differentiate between locking and blocking
Describe what concurrency problems are prevented by locking
Detail SQL Server's lockable resources
Describe the types of locks that are available
Explain lock compatibility
Key Points
Concurrency control is a system of controls that manages multiple people trying to access the same data at the same time, so that they do not adversely affect each other.
Concurrency Control
SQL Server supports a wide range of optimistic and pessimistic concurrency control mechanisms. Users choose the type of concurrency control by setting the transaction isolation level for a connection. Concurrency control ensures that modifications made by one person do not adversely affect modifications that others make. There are two types of concurrency control: pessimistic and optimistic.
Key Points
Locking is a mechanism used by the Database Engine to synchronize access by multiple users to the same piece of data at the same time.
Locking Behavior
Before a transaction acquires a dependency on the current state of a piece of data, such as by reading or modifying the data, it must protect itself from the effects of another transaction modifying the same data. The transaction does this by requesting a lock on the piece of data. Read locks allow others to read but not write data. Write locks stop others from reading or writing. In SQL Server, these locks are implemented via different locking modes, such as shared or exclusive. The lock mode defines the level of dependency the transaction has on the data. No transaction can be granted a lock that would conflict with the mode of a lock already granted on that data to another transaction. If a transaction requests a lock mode that conflicts with a lock that has already been granted on the same data, the instance of the Database Engine will pause the requesting transaction until the first lock is released. When a transaction modifies a piece of data, it holds the lock protecting the modification until the end of the transaction. How long a transaction holds the locks acquired to protect read operations depends on the transaction isolation level setting. All locks held by a transaction are released when the transaction completes (either commits or rolls back). Question: If a doctor's office uses a database application to manage patient records, how might locks play a role in that application?
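The doctor's office scenario in the question can be sketched as two sessions (dbo.Patients and its columns are hypothetical):

```sql
-- Session 1: acquires an exclusive lock on the row and holds it
BEGIN TRANSACTION;
UPDATE dbo.Patients SET Phone = '555-0100' WHERE PatientId = 1;
-- (transaction deliberately left open)

-- Session 2: requests a conflicting shared lock and is paused until
-- session 1 commits or rolls back
SELECT * FROM dbo.Patients WHERE PatientId = 1;
```

Run in two separate query windows, the second statement waits rather than returning a half-updated record.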
Key Points
Locking and blocking are not the same thing, although the two terms are often confused with each other.
Locking
Locking is the action of taking and holding locks that is used to implement concurrency control.
Blocking
Blocking is what happens when one process needs to wait for a resource that another process has locked. Blocking is a normal occurrence for systems that use locking; it is only excessive blocking that is a problem. Question: What symptoms do you imagine that "excessive" blocking might relate to?
Key Points
Users modifying data can affect other users who are reading or modifying the same data at the same time. These users are said to be accessing the data concurrently. If a data storage system has no concurrency control, users could see the side effects listed on the slide.
Question: Has your organization experienced concurrency problems with database applications? If so, what behavior did you see?
Lockable Resources
Key Points
For optimal performance, the number of locks that SQL Server maintains must be balanced with the amount of data that each lock holds. To minimize the cost of locking, SQL Server automatically locks resources at a level that is appropriate to the task. Question: If a database needs to lock several rows of data at once, what resources might be locked?
Types of Locks
Key Points
SQL Server locks resources using different lock modes that determine how the resources can be accessed by concurrent transactions. SQL Server has two main types of locks: basic locks and locks for special situations.
Basic Locks
In general, read operations acquire shared locks, and write operations acquire exclusive locks.
Shared locks. SQL Server typically uses shared (read) locks for operations that neither change nor update data. If SQL Server has applied a shared lock to a resource, a second transaction also can acquire a shared lock, even though the first transaction has not completed. Consider the following facts about shared locks:
They are used for read-only operations; data cannot be modified.
SQL Server releases shared locks on a record as soon as the next record is read.
A shared lock will exist until all rows that satisfy the query have been returned to the client.
Exclusive locks. SQL Server uses exclusive (write) locks for the INSERT, UPDATE, and DELETE data modification statements. Consider the following facts about exclusive locks:
Only one transaction can acquire an exclusive lock on a resource.
A transaction cannot acquire a shared lock on a resource that has an exclusive lock.
A transaction cannot acquire an exclusive lock on a resource until all shared locks are released.
Intent locks. SQL Server uses intent locks internally to minimize locking conflicts. Intent locks establish a locking hierarchy so that other transactions cannot acquire locks at more inclusive levels. For example, if a transaction has an exclusive row-level lock on a specific customer record, the intent lock prevents another transaction from acquiring an exclusive lock at the table level. Intent locks include intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).
Update locks. SQL Server uses update locks when it will modify a page at a later point. Before it modifies the page, SQL Server promotes the update page lock to an exclusive page lock to prevent locking conflicts. Consider the following facts about update locks. Update locks are:
Acquired during the initial portion of an update operation, when the pages are first being read.
Compatible with shared locks.
Schema locks. SQL Server uses these to ensure that a table or index is not dropped, or its schema modified, when it is referenced by another session. SQL Server provides two types of schema locks:
Schema stability (Sch-S), which ensures that a resource is not dropped.
Schema modification (Sch-M), which ensures that other sessions do not reference a resource that is under modification.
Bulk update locks. SQL Server uses these to enable processes to bulk copy data concurrently into the same table, while preventing other processes that are not bulk copying data from accessing the table. SQL Server uses bulk update locks when either the TABLOCK hint or the table lock on bulk load option is used.
Question: What happens if a query tries to read data from a row that is currently locked by an exclusive (X) lock?
Lock Compatibility
Key Points
Some locks are compatible with other locks, and some locks are not. For example, two users can both hold shared locks on the same data at the same time, but only one update lock can be issued on a piece of data at any one time.
Lock Compatibility
Locks have a compatibility matrix that shows which locks are compatible with other locks that are established on the same resource. The locks shown in the slide are the most common forms. The locks in the following table are listed in order from the least restrictive (shared) to the most restrictive (exclusive).

Requested lock                        Existing granted lock
                                      IS    S     U     IX    SIX   X
Intent shared (IS)                    Yes   Yes   Yes   Yes   Yes   No
Shared (S)                            Yes   Yes   Yes   No    No    No
Update (U)                            Yes   Yes   No    No    No    No
Intent exclusive (IX)                 Yes   No    No    Yes   No    No
Shared with intent exclusive (SIX)    Yes   No    No    No    No    No
Exclusive (X)                         No    No    No    No    No    No
The schema modification lock (Sch-M) is incompatible with all locks. The schema stability lock (Sch-S) is compatible with all locks except the schema modification lock (Sch-M).
Lesson 3
Management of Locking
Locking behavior in SQL Server mostly operates without any need for management or application intervention. However, it may be desirable in some situations to exert control over locking behavior.
Objectives
After completing this lesson, you will be able to:
Explain locking timeout
Describe lock escalation
Explain what deadlocks are
Describe locking-related table hints
Describe methods to view locking information
Locking Timeout
Key Points
Applications might need to wait some time for locks held by other applications to be released. A key decision is how long an application should wait for a lock to be released.
Locking Timeout
The length of time that it is reasonable to wait for a lock to be released depends entirely on the design requirements of the application. By default, SQL Server will wait forever for a lock. LOCK_TIMEOUT is a session-level setting that determines the number of milliseconds to wait for any lock to be released before rolling back the statement (note: not necessarily the transaction) and returning an error. The default value of -1 indicates that SQL Server should wait forever. Setting a lock timeout at the session level is not common, as most applications implement query timeouts instead. The READPAST table hint tells SQL Server to skip any rows that it cannot read because they are locked. It is rarely used. Question: Can you think of any situations where READPAST might be useful?
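Both settings can be sketched as follows (dbo.Orders and its columns are hypothetical; error 1222 is the lock request time-out error SQL Server returns when the timeout expires):

```sql
SET LOCK_TIMEOUT 5000; -- wait at most 5000 milliseconds for any lock

UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = 1;
-- If a needed lock cannot be acquired within 5 seconds, error 1222 is
-- returned and the statement (not necessarily the transaction) is rolled back.

SET LOCK_TIMEOUT -1; -- restore the default: wait forever

-- READPAST: skip rows that are locked rather than waiting for them
SELECT * FROM dbo.Orders WITH (READPAST);
```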
Lock Escalation
Key Points
Lock escalation is the process of converting many fine-grain locks into fewer coarse-grain locks, reducing system overhead while increasing the probability of concurrency contention.
The database engine might do both row and page locking for the same statement to minimize the number of locks and reduce the likelihood that lock escalation will be necessary. For example, the database engine could place page locks on a nonclustered index (if enough contiguous keys in the index node are selected to satisfy the query) and row locks on the data. To escalate locks, the database engine attempts to change the intent lock on the table to the corresponding full lock, for example, changing an intent exclusive (IX) lock to an exclusive (X) lock, or an intent shared (IS) lock to a shared (S) lock. If the lock escalation attempt succeeds and the full table lock is acquired, then all heap or B-tree, page (PAGE), or row-level (RID) locks held by the transaction on the heap or index are released. If the full lock cannot be acquired, no lock escalation happens at that time and the database engine will continue to acquire row, key, or page locks. Partitioned tables can have locks escalated to the partition level before escalating to the table level. Partition-level lock escalation can be set on a per-table basis and allows data in other partitions to remain available, so that the entire table is not locked. Question: Why do you imagine that SQL Server might find escalating locks worthwhile?
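The per-table escalation setting can be sketched with the LOCK_ESCALATION option of ALTER TABLE, available from SQL Server 2008 onward (dbo.Sales is a hypothetical table):

```sql
-- Default: escalate to the table level
ALTER TABLE dbo.Sales SET (LOCK_ESCALATION = TABLE);

-- On a partitioned table, allow escalation to the partition level
ALTER TABLE dbo.Sales SET (LOCK_ESCALATION = AUTO);

-- Discourage escalation on this table in most circumstances
ALTER TABLE dbo.Sales SET (LOCK_ESCALATION = DISABLE);
```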
Key Points
Deadlocks occur when demands for resources cannot be resolved by waiting for locks to be released, no matter how long the processes involved wait.
Deadlocks
The simplest example of a deadlock occurs when two transactions hold locks on separate objects and each transaction requests a lock on the other transaction's object. For example:
Transaction A holds a shared lock on row 1.
Transaction B holds a shared lock on row 2.
Transaction A requests an exclusive lock on row 2, but it cannot be granted until Transaction B releases its shared lock.
Transaction B requests an exclusive lock on row 1, but it cannot be granted until Transaction A releases its shared lock.
Each transaction must wait for the other to release the lock. A deadlock can also occur when several long-running transactions execute concurrently in the same database. A deadlock also can occur as a result of the order in which the optimizer processes a complex query, such as a join, in which case you cannot necessarily control the order of processing.
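A deadlock of this kind can be sketched as two sessions run side by side (dbo.TableA and dbo.TableB are hypothetical):

```sql
-- Session 1
BEGIN TRANSACTION;
UPDATE dbo.TableA SET Col1 = 1 WHERE Id = 1; -- locks a row in TableA

-- Session 2
BEGIN TRANSACTION;
UPDATE dbo.TableB SET Col1 = 1 WHERE Id = 2; -- locks a row in TableB

-- Session 1 (now blocks, waiting on session 2's lock)
UPDATE dbo.TableB SET Col1 = 2 WHERE Id = 2;

-- Session 2 (completes the cycle; SQL Server detects the deadlock and
-- rolls back one of the sessions with error 1205)
UPDATE dbo.TableA SET Col1 = 2 WHERE Id = 1;
```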
When SQL Server detects a deadlock, it chooses one of the transactions as the deadlock victim and then:
Notifies the deadlock victim's application (with message number 1205)
Cancels the deadlock victim's current request
Allows other transactions to continue
Error 1205 is one of the errors that applications should specifically check for. If error 1205 is found, the application should attempt the transaction again. Error 1205 is also a good example of why database engine errors should not be passed directly to end users. The message returned for error 1205 sounds highly emotive, and emotive words can cause emotive reactions. People don't like to see that they have been "chosen as a deadlock victim". Question: Have you experienced deadlocking problems in your current environment? If so, how did you determine that deadlocks were a problem, and how was it resolved?
Key Points
The available locking hints are listed here for completeness. While a wide range of locking hints are available, it is important to realize that they should rarely be used and only with extreme caution. Question: Why would you ever take an exclusive table-lock?
Key Points
Typically, you use SQL Server Management Studio to display a report of active locks. You can use SQL Server Profiler to obtain information on a specific set of transactions. You can also use Reliability and Performance Monitor to display SQL Server locking histories.
You can use SQL Server Profiler to analyze and resolve server resource issues, monitor login attempts and connections, and correct deadlock problems. Use the Locks event category to capture locking information in a trace.
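Lock information can also be queried directly. For example, the sys.dm_tran_locks dynamic management view returns one row per active lock request:

```sql
SELECT request_session_id,
       resource_type,        -- e.g. OBJECT, PAGE, KEY, RID
       resource_database_id,
       request_mode,         -- e.g. S, X, IS, IX, U
       request_status        -- GRANT, WAIT, or CONVERT
FROM sys.dm_tran_locks
ORDER BY request_session_id;
```

Rows with a request_status of WAIT identify sessions that are currently blocked.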
Key Points
In this demonstration you will see how to:
View lock information using Activity Monitor
Use dynamic management views to view lock information
Demonstration Steps
1. If Demonstration 1A was not performed:
   Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_11_PRJ\6232B_11_PRJ.ssmssln and click Open.
   Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Open the 32 Demonstration 3A 2nd Window.sql script file.
4. Open the 33 Demonstration 3A 3rd Window.sql script file.
5. Follow the instructions contained within the comments of the script file.
Lesson 4
Transaction Isolation Levels
The final important concept that needs to be understood when working with concurrency in SQL Server is transaction isolation levels. It was mentioned earlier that a role of any database engine is to give each user the best possible illusion that they are the only user on the system. This is not totally possible, but appropriate use of transaction isolation levels can help in this regard.
Objectives
After completing this lesson, you will be able to:
Describe SQL Server transaction isolation levels
Explain the role of the read committed snapshot database option
Detail isolation-related table hints
Key Points
An isolation level protects a transaction from the effects of other concurrent transactions. Use the transaction isolation level to set the isolation level for all transactions during a session. When you set the isolation level, you specify the default locking behavior for all statements in your session.
READ UNCOMMITTED is the lowest isolation level, and only ensures that corrupt data is not read. This is equivalent to a NOLOCK table hint. With READ UNCOMMITTED, you are sacrificing consistency in favor of high concurrency. NOLOCK was commonly used in reporting applications before the SNAPSHOT isolation level became available. It is not safe to use NOLOCK on queries that are accessing data currently being changed; the ability to read uncommitted data can lead to many inconsistencies.
READ COMMITTED acquires short-lived shared locks before reading data; these locks are released as soon as the processing is complete. This is the SQL Server default. Dirty reads cannot occur, but non-repeatable reads can occur.
REPEATABLE READ retains locks on every row it touches until the end of the transaction. Even rows that do not qualify for the query result remain locked. These locks ensure that the rows touched by the query cannot be updated or deleted by a concurrent session until the current transaction completes (whether it is committed or rolled back).
SNAPSHOT tries to avoid one process blocking another when only one is performing updates. As an example, if a transaction attempts to modify a row on which another transaction holds a shared lock, SQL Server creates a copy of the row in a row version store (actually held in tempdb) and allows the update to proceed. The transaction with the shared lock will read the row from the version store instead of the table row. A problem can arise if the application then attempts to update its version of the row: if the modification to the original table row is committed, this second modification will fail with a concurrency violation. Note that before the SNAPSHOT isolation level can be used, it must be enabled via a database option.
SERIALIZABLE ensures consistency by assuming that two transactions might try to update the same data, and uses locks to ensure that they do not, at a cost of reduced concurrency: one transaction must wait for the other to complete, and two transactions can deadlock. Transactions are completely isolated from one another.
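Setting the isolation level for a session can be sketched as follows (dbo.Stock is a hypothetical table):

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION;
-- Shared locks acquired here are held until the transaction ends, so a
-- concurrent session cannot update or delete these rows in the meantime.
SELECT * FROM dbo.Stock WHERE ProductId = 1;
COMMIT TRANSACTION;

SET TRANSACTION ISOLATION LEVEL READ COMMITTED; -- restore the default
```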
Key Points
Not every application can use snapshot isolation level as it needs to be specified when beginning a transaction. Often this requires a change to the application code. Many existing reporting applications cause excessive blocking. Prior to SQL Server 2005, this was commonly dealt with via NOLOCK hints.
It is important to note, though, that the read committed snapshot database option does not achieve exactly the same outcome as the snapshot isolation level set at the session level. With the read committed snapshot database option, snapshots are only present for the duration of each statement, not for the duration of the transaction.
Database Configuration
Before either snapshot isolation level or the read committed snapshot database option can be used, the database needs to be configured to allow snapshot isolation.
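The configuration can be sketched as follows (SomeDatabase is a hypothetical database name; note that changing READ_COMMITTED_SNAPSHOT requires that no other connections are active in the database):

```sql
-- Enable use of the SNAPSHOT isolation level in the database
ALTER DATABASE SomeDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Switch the default READ COMMITTED behavior to statement-level snapshots
ALTER DATABASE SomeDatabase SET READ_COMMITTED_SNAPSHOT ON;
```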
Key Points
Similar to the way table hints can be applied to control locking, the isolation-level table hints can be applied to override the default transaction isolation level. They are listed here for completeness.
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   Right-click 623XB-MIA-DC and click Start.
   Right-click 623XB-MIA-DC and click Connect.
   In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   Right-click 623XB-MIA-SQL and click Start.
   Right-click 623XB-MIA-SQL and click Connect.
   In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click on the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   On the Action menu, click the Ctrl-Alt-Delete menu item.
   Click Switch User, and then click Other User.
   Log on using the following credentials:
   i. User name: AdventureWorks\Administrator
   ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. In the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_11_PRJ\6232B_11_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
In this lab, you will perform basic investigation of a deadlock situation. You are trying to determine an appropriate transaction isolation level for a new application. If you have time, you will investigate the trade-off between concurrency and consistency.
Execute the code step by step, following the step-by-step instructions and making sure to highlight and execute just the required code blocks in each script window.
Results: After this exercise, you will have seen how transaction isolation levels work.
Review Questions
1. Why is snapshot isolation level helpful?
2. What is the difference between a shared lock and an exclusive lock?
3. Why would you use read committed snapshot rather than snapshot isolation level?
Best Practices
1. Always use the lowest transaction isolation level possible to avoid blocking and to avoid the chance of deadlocks.
2. Many Microsoft-supplied components default to the Serializable transaction isolation level but do not need to be run at that level. Common examples are Component Services and BizTalk adapters.
3. Before spending too much time investigating blocking issues, make sure that all the queries involved are executing quickly. This usually involves making sure that appropriate indexes are in place. Often when query performance issues are resolved, blocking issues disappear.
Module 12
Handling Errors in T-SQL Code
Contents:
Lesson 1: Understanding T-SQL Error Handling 12-3
Lesson 2: Implementing T-SQL Error Handling 12-13
Lesson 3: Implementing Structured Exception Handling 12-23
Lab 12: Handling Errors in T-SQL Code 12-31
12-2
Module Overview
When creating applications for SQL Server using the T-SQL language, appropriate handling of errors is critically important. A large number of myths surround how error handling in T-SQL works. In this module, you will explore T-SQL error handling, looking at how it has traditionally been implemented and how structured exception handling can be used.
Objectives
After completing this module, you will be able to:
• Design T-SQL error handling
• Implement T-SQL error handling
• Implement structured exception handling
12-3
Lesson 1: Understanding T-SQL Error Handling
Before delving into the code that deals with error handling in T-SQL, it is important to understand the nature of errors: where they can occur while T-SQL is being executed, the data that errors return, and the severities that errors can exhibit.
Objectives
After completing this lesson, you will be able to:
• Explain where T-SQL errors occur
• Describe types of errors
• Explain what values are returned by an error
• Describe different levels of error severities
12-4
Key Points
T-SQL statements go through multiple phases during their execution. Errors can occur at each phase. Some errors could potentially be handled by the database engine. Other errors will need to be passed back to the calling application.
Syntax Check
In the first phase of execution, the syntax of a statement is checked. Errors occur at this phase if the statements do not conform to the rules of the language. During the syntax checking phase, the objects referred to may not actually exist, yet no errors would be returned. For example, imagine the execution of a statement where the word "Customer" was misspelled:
SELECT * FROM Custommer;
There is nothing incorrect about this from a syntax point of view. The rules of the T-SQL language have been followed. During the syntax checking phase, no error would be returned.
Object Resolution
In the second phase of execution, the objects referenced by name in the T-SQL statements are resolved to underlying object IDs. Errors occur at this phase if the objects do not exist. Single-part names are resolved to specific objects at this point. To avoid ambiguity, multi-part names should be used for objects except in rare circumstances. In the example above, SQL Server would first look for a table named "Custommer" in the default schema of the user executing the code. If no such table exists, SQL Server would then look for the table "dbo.Custommer"; that is, it would next look in the dbo schema. If no such table existed in the dbo schema, an error would then be returned.
12-5
Statement Execution
In the third phase of execution, the statement is executed. At this phase, runtime errors can occur. For example, the user may not have permission to SELECT from the table specified or an INSERT statement might fail because a constraint was going to be violated. You could also have more basic errors occurring at this point such as an attempt to divide by a zero value. Some errors can be handled in the database engine but other errors will need to be handled by client applications. Client applications always need to be written with error handling in mind. Question: Can you suggest a reason why you might want to catch errors in a client application rather than allowing the errors to be seen by the end users?
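As a minimal sketch of a runtime error, the following statement passes both the syntax check and object resolution, yet still fails in the execution phase:

```sql
-- Syntactically valid, references no missing objects, but fails at run time:
SELECT 1 / 0;
-- Msg 8134, Level 16, State 1: Divide by zero error encountered.
```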
12-6
Types of Errors
Key Points
A number of different categories of error can occur. They differ mostly in the scope of the termination that occurs when the error is not handled.
Syntax Errors
Syntax errors occur when the rules of the language are not followed. For example, consider the following statement:
SELECT TOP(10) FROM Production.Product;
If you try to execute the statement, you receive the following message:
Msg 156, Level 15, State 1, Line 1 Incorrect syntax near the keyword 'FROM'.
Note that the syntax of the entire batch of statements being executed is checked before the execution of any statement within the batch is attempted. Syntax errors are batch terminating errors.
The same issue would occur if the schema for the object was not specified and the object did not exist in the user's default schema or in the dbo schema. Note that if a syntax error occurs, no attempt at object name resolution will be made.
12-7
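The batch that produced the output below is not included in this extract. Based on the error message and the note that follows, it was presumably similar to this sketch, run against the AdventureWorks2008R2 sample database:

```sql
-- Fails because ProductID 1 is referenced by Production.BillOfMaterials.
DELETE Production.Product WHERE ProductID = 1;
PRINT 'Hello';  -- still executes; the error is statement-terminating, not batch-terminating
```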
When this batch is executed, the following is returned: Msg 547, Level 16, State 0, Line 3
The DELETE statement conflicted with the REFERENCE constraint "FK_BillOfMaterials_Product_ComponentID". The conflict occurred in database "AdventureWorks2008R2", table "Production.BillOfMaterials", column 'ComponentID'. The statement has been terminated. Hello
Note that the PRINT statement was still executed even though the DELETE statement failed because of a constraint violation.
12-8
What's in an Error?
Key Points
An error is itself an object and has properties as shown in the table.
What's in an Error
It might not be immediately obvious that a SQL Server error (sometimes called an exception) is itself an object. Errors return a number of useful properties. Error numbers are helpful when trying to locate information about the specific error, particularly when searching online for information about the error. You can view the list of system-supplied error messages by querying the sys.messages catalog view:
SELECT * FROM sys.messages ORDER BY message_id, language_id;
Note that there are multiple messages with the same message_id. Error messages are localizable and can be returned in a number of languages. A language_id of 1033 indicates the English version of the message. Severity indicates how serious the error is; it is described further in the next topic.
12-9
State is defined by the author of the code that raised the error. For example, if you were writing a stored procedure that could raise an error for a missing customer, and there were five places in the code where this message could occur, you could assign a different state to each of the places where the message was raised. This would help later when troubleshooting the error. Procedure name is the name of the stored procedure that the error occurred in, and Line Number is the location within that procedure. In practice, line numbers are not very helpful and not always applicable. Question: Why is it useful to be able to localize error messages?
12-10
Error Severity
Key Points
The severity of an error indicates the type of problem encountered by SQL Server. Low severity values are informational messages and do not indicate true errors. Error severities occur in ranges:
Values from 0 to 10
Values from 0 to 9 are purely informational messages. When queries that raise these are executed in SQL Server Management Studio, the information is returned but no error status information is provided. For example, consider the following code executed against the AdventureWorks2008R2 database:
SELECT COUNT(Color) FROM Production.Product;
When executed, it returns a count as expected. However, if you look on the Messages tab in SQL Server Management Studio, you will see the following:
Warning: Null value is eliminated by an aggregate or other SET operation. (1 row(s) affected)
Note that no error really occurred but SQL Server is warning you that it ignored NULL values when counting the rows. Note that no status information is returned. Severity 10 is the top of the informational messages.
Values from 11 to 16
Values from 11 to 16 are considered errors that the user can correct. Typically they are used for errors where SQL Server assumes that the statement being executed was in error. Here are a few examples of these errors:
12-11
Error Severity — Example
Severity 11: indicates that an object does not exist
Severity 13: indicates a transaction deadlock
Severity 14: indicates errors such as permission denied
Severity 15: indicates syntax errors
Severity 17: indicates that SQL Server has run out of resources (memory, disk space, locks, etc.)
Values from 17 to 19
Values from 17 to 19 are considered serious software errors that the user cannot correct.
Values above 19
Values above 19 tend to be very serious errors that normally involve errors with either the hardware or SQL Server itself. It is common to ensure that all errors above 19 are logged and alerts generated on them.
12-12
Key Points
In this demonstration, you will:
• See how different types of errors are returned from T-SQL statements
• See the types of messages that are related to severe errors
• Query the sys.messages view and note which errors are logged automatically
Demonstration Setup
1. Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
2. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
3. From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_12_PRJ\6232B_12_PRJ.ssmssln and click Open.
4. Open and execute the 00 Setup.sql script file from within Solution Explorer.
5. Open the 11 Demonstration 1A.sql script file. Follow the instructions contained within the comments of the script file.
12-13
Lesson 2: Implementing T-SQL Error Handling
Now that you understand the nature of errors, it is time to consider how they can be handled or reported in T-SQL. The T-SQL language offers a variety of error handling capabilities. It is important to understand these and how they relate to transactions. This lesson covers basic T-SQL error handling, including how you can raise errors intentionally and how you can set up alerts to fire when errors occur. In the next lesson, you will see how to implement a more advanced form of error handling known as structured exception handling.
Objectives
After completing this lesson, you will be able to:
• Raise errors
• Use the @@ERROR system variable
• Explain the role of errors and transactions
• Explain transaction nesting errors
• Raise custom errors
• Create alerts that fire when errors occur
12-14
Raising Errors
Key Points
Both PRINT and RAISERROR can be used to return information or warning messages to applications. RAISERROR allows applications to raise an error that could then be caught by the calling process.
RAISERROR
The ability to raise errors in T-SQL makes error handling in the application easier, because a raised error is sent to the application like any other system error. RAISERROR is used to:
• Help troubleshoot T-SQL code
• Check the values of data
• Return messages that contain variable text
Note that using a PRINT statement is similar to raising an error of severity 10, as shown in the sample on the slide.
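The slide sample is not reproduced in this extract; as a hedged sketch, RAISERROR substitutes arguments into the message text, and severity 10 behaves informationally, much like PRINT (the customer number 42 is invented for illustration):

```sql
RAISERROR (N'Processing customer %d.', 10, 1, 42);    -- informational, similar to PRINT
RAISERROR (N'Customer %d was not found.', 16, 1, 42); -- raised as an error the caller can catch
```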
12-15
Using @@Error
Key Points
Most traditional error handling code in SQL Server applications has been created using @@ERROR. Note that structured exception handling was introduced in SQL Server 2005 and provides a strong alternative to using @@ERROR. It will be discussed in the next lesson. A large amount of existing SQL Server error handling code is based on @@ERROR so it is important to understand how to work with it.
@@ERROR
@@ERROR is a system variable that holds the error number of the last error that has occurred. One significant challenge with @@ERROR is that the value it holds is quickly reset as each additional statement is executed. For example, consider the following code:
RAISERROR(N'Message', 16, 1); IF @@ERROR <> 0 PRINT 'Error=' + CAST(@@ERROR AS VARCHAR(8)); GO
You might expect that when it is executed, it would return the error number in a printed string. However, when the code is executed, it returns:
Msg 50000, Level 16, State 1, Line 1 Message Error=0
Note that the error was raised but that the message printed was "Error=0". You can see in the first line of the output that the error was actually 50000 as expected with a message passed to RAISERROR. This is because the IF statement that follows the RAISERROR statement was executed successfully and caused the @@ERROR value to be reset.
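The usual workaround, sketched below, is to capture @@ERROR into a variable immediately after the statement of interest, before any other statement can reset it:

```sql
DECLARE @ErrorValue int;
RAISERROR(N'Message', 16, 1);
SET @ErrorValue = @@ERROR;   -- capture immediately, before @@ERROR is reset
IF @ErrorValue <> 0
    PRINT 'Error=' + CAST(@ErrorValue AS VARCHAR(8));  -- now prints Error=50000
```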
12-16
12-17
Key Points
Many new developers are surprised to find that when a statement fails, even inside a transaction, SQL Server does not automatically roll the transaction back; only the failed statement itself is rolled back. The SET XACT_ABORT statement can be used to control this behavior.
12-18
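The batch referred to in the note below is not shown in this extract; it was presumably the same batch as the SET XACT_ABORT ON example that follows, but without that setting:

```sql
BEGIN TRAN;
DELETE Production.Product WHERE ProductID = 1;  -- fails, but terminates only this statement
PRINT 'Hello';
COMMIT;
PRINT 'Hello again';
```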
Note that both PRINT statements still execute even though the DELETE statement failed.
SET XACT_ABORT ON
The SET XACT_ABORT ON statement is used to tell SQL Server that statement terminating errors should become batch terminating errors. Now consider the same code with SET XACT_ABORT ON present:
SET XACT_ABORT ON; BEGIN TRAN; DELETE Production.Product WHERE ProductID = 1; PRINT 'Hello'; COMMIT; PRINT 'Hello again';
Note that when the DELETE statement failed, the entire batch was terminated, including the transaction that had begun. The transaction would have been rolled back.
12-19
Key Points
Any ROLLBACK causes all levels of transactions to be rolled back, not just the current nesting level.
Nesting Levels
You can determine the current transaction nesting level by querying the @@TRANCOUNT system variable. Another rule to be aware of is that SQL Server requires the transaction nesting level of a stored procedure to be the same on entry to the stored procedure and on exit from it. If the transaction nesting level differs, error 266 is raised. This error is commonly seen when users attempt to roll back a nested transaction from within a stored procedure.
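These rules can be sketched briefly as follows:

```sql
BEGIN TRAN;            -- @@TRANCOUNT = 1
    BEGIN TRAN;        -- @@TRANCOUNT = 2 (nested)
    ROLLBACK;          -- rolls back ALL levels, not just the inner one
SELECT @@TRANCOUNT;    -- returns 0
```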
12-20
Key Points
Rather than raising system errors, SQL Server allows users to define custom error messages that have meaning to their applications. The error numbers supplied must be 50000 or above and the user adding them must be a member of the sysadmin or serveradmin fixed server roles.
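As a hedged sketch (the message number 50123 and its text are invented for illustration):

```sql
-- Requires membership in the sysadmin or serveradmin fixed server role.
EXEC sp_addmessage @msgnum = 50123, @severity = 16,
     @msgtext = N'Customer %d could not be found.';

RAISERROR (50123, 16, 1, 42);  -- raises: Customer 42 could not be found.

EXEC sp_dropmessage @msgnum = 50123;  -- remove the message when no longer needed
```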
12-21
Key Points
For certain categories of errors, administrators might wish to be notified as soon as these errors occur. This can even apply to user-defined error messages. For example, you may wish to raise an alert whenever a customer is deleted. More commonly, alerting is used to bring high-severity errors (such as severity 19 or above) to the attention of administrators.
Raising Alerts
Alerts can be created for specific error messages. The alerting service works by registering itself as a callback service with the event logging service. This means that alerts only work on errors that are logged. There are two ways to make a message raise an alert: use the WITH LOG option when raising the error, or alter the message to be logged by executing sp_altermessage. Modifying system errors via sp_altermessage is only possible from SQL Server 2005 SP3 or SQL Server 2008 SP1 onwards. Question: Can you suggest an example of an error that would require immediate attention from an administrator?
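Both approaches can be sketched as follows (message 50123 is a hypothetical user-defined message assumed to exist already):

```sql
-- Option 1: log this particular occurrence of an error explicitly.
RAISERROR (N'Customer deleted.', 16, 1) WITH LOG;

-- Option 2: alter a user-defined message so that every occurrence is logged.
EXEC sp_altermessage @message_id = 50123, @parameter = 'WITH_LOG',
     @parameter_value = 'true';
```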
12-22
Key Points
In this demonstration you will see:
• How to raise errors
• How severity affects errors
• How to add a custom error message
• How to raise a custom error message
• That custom error messages are instance-wide
• How to use @@ERROR
• That system error messages cannot be raised
Demonstration Steps
1. If Demonstration 1A was not performed:
   • Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   • In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   • From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_12_PRJ\6232B_12_PRJ.ssmssln and click Open.
   • Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 21 Demonstration 2A.sql script file.
3. Open the 22 Demonstration 2A 2nd Window.sql script file.
4. Follow the instructions contained within the comments of the script file.
12-23
Lesson 3: Implementing Structured Exception Handling
Now that you have an understanding of the nature of errors and of basic error handling in T-SQL, it is time to look at a more advanced form of error handling. Structured exception handling was introduced in SQL Server 2005. You will see how to use it and evaluate its benefits and limitations.
Objectives
After completing this lesson, you will be able to:
• Explain TRY CATCH block programming
• Describe the role of error handling functions
• Describe catchable vs non-catchable errors
• Explain how TRY CATCH relates to transactions
• Explain how errors in managed code are surfaced
12-24
Key Points
Structured exception handling has been part of high level languages for some time. SQL Server 2005 introduced structured exception handling to the T-SQL language.
Current Limitations
High level languages often offer a try/catch/finally construct. There is no equivalent FINALLY block in T-SQL. There is currently no mechanism for rethrowing errors, and only user-defined errors (error numbers 50000 and above) can be raised manually. This means that you cannot raise a system error within a CATCH block. Question: In what situation might it have been useful to be able to raise a system error?
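A minimal TRY...CATCH sketch:

```sql
BEGIN TRY
    SELECT 1 / 0;   -- runtime error: divide by zero
END TRY
BEGIN CATCH
    -- There is no FINALLY block; cleanup code goes here or after the CATCH block.
    PRINT 'Caught error ' + CAST(ERROR_NUMBER() AS VARCHAR(8))
        + ': ' + ERROR_MESSAGE();
END CATCH;
```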
12-25
Key Points
CATCH blocks make the error related information available throughout the duration of the CATCH block, including in sub-scopes such as stored procedures run from within the CATCH block.
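The error handling functions can be sketched as follows; each returns NULL when called outside the scope of a CATCH block:

```sql
BEGIN TRY
    RAISERROR (N'Something failed.', 16, 2);
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()    AS ErrorNumber,    -- 50000 for ad hoc RAISERROR
           ERROR_SEVERITY()  AS ErrorSeverity,  -- 16
           ERROR_STATE()     AS ErrorState,     -- 2
           ERROR_PROCEDURE() AS ErrorProcedure, -- NULL (not inside a procedure)
           ERROR_LINE()      AS ErrorLine,
           ERROR_MESSAGE()   AS ErrorMessage;
END CATCH;
```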
12-26
Key Points
It is important to realize that while TRY...CATCH blocks allow you to catch a much wider range of errors than you could catch with @@ERROR, you cannot catch all types of errors.
Question: Given the earlier discussion on the phases of execution of T-SQL statements, how could a syntax error occur once a batch has already started executing?
12-27
Key Points
If a transaction is active at the time an error occurs, the statement that caused the error is rolled back. If XACT_ABORT is ON and the error occurs within a TRY block, execution jumps to the CATCH block instead of the entire batch being terminated as it usually would be.
XACT_STATE()
Look at the code in the slide example. It is important to consider that when the CATCH block is entered, the transaction may or may not have actually started. In this example, @@TRANCOUNT is being used to determine if there is a transaction in progress and to roll back if there is one. Another option is to use the XACT_STATE() function which provides more detailed information in this situation. The XACT_STATE() function can be used to determine the state of the transaction:
• A value of 1 indicates that there is an active transaction.
• A value of 0 indicates that there is no active transaction.
12-28
• A value of -1 indicates that there is a current transaction but that it is doomed. The only action permitted within the transaction is to roll it back.
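A hedged sketch combining TRY...CATCH with XACT_STATE(), assuming the AdventureWorks2008R2 sample database:

```sql
BEGIN TRY
    BEGIN TRAN;
    DELETE Production.Product WHERE ProductID = 1;  -- fails: FK violation
    COMMIT;
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0   -- 1 = active, -1 = doomed; either way, roll back
        ROLLBACK;
    PRINT ERROR_MESSAGE();
END CATCH;
```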
12-29
Key Points
SQL CLR Integration allows for the execution of managed code within SQL Server. High level .NET languages such as C# and VB have detailed exception handling available to them. Errors can be caught using standard .NET try/catch/finally blocks.
12-30
Key Points
In this demonstration you will see how to use structured exception handling to retry deadlock errors.
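The retry pattern the demonstration applies can be sketched as follows; dbo.UpdateAccount is a hypothetical procedure that may be chosen as a deadlock victim (error 1205):

```sql
DECLARE @Retries int = 3;
WHILE @Retries > 0
BEGIN
    BEGIN TRY
        EXEC dbo.UpdateAccount;   -- hypothetical work that can deadlock
        SET @Retries = 0;         -- success: exit the loop
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 1205 AND @Retries > 1
            SET @Retries = @Retries - 1;  -- deadlock victim: try again
        ELSE
        BEGIN
            SET @Retries = 0;             -- other error, or retries exhausted
            PRINT ERROR_MESSAGE();
        END
    END CATCH;
END;
```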
Demonstration Steps
1. If Demonstration 1A was not performed:
   • Revert the 623XB-MIA-SQL virtual machine using Hyper-V Manager on the host system.
   • In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio. In the Connect to Server window, type Proseware in the Server name text box and click Connect.
   • From the File menu, click Open, click Project/Solution, navigate to D:\6232B_Labs\6232B_12_PRJ\6232B_12_PRJ.ssmssln and click Open.
   • Open and execute the 00 Setup.sql script file from within Solution Explorer.
2. Open the 31 Demonstration 3A.sql script file.
3. Open the 32 Demonstration 3A 2nd Window.sql script file.
4. Follow the instructions contained within the comments of the script file.
12-31
Lab Setup
For this lab, you will use the available virtual machine environment. Before you begin the lab, you must complete the following steps:
1. On the host computer, click Start, point to Administrative Tools, and then click Hyper-V Manager.
2. Maximize the Hyper-V Manager window.
3. In the Virtual Machines list, if the virtual machine 623XB-MIA-DC is not started:
   • Right-click 623XB-MIA-DC and click Start.
   • Right-click 623XB-MIA-DC and click Connect.
   • In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears, and then close the Virtual Machine Connection window.
4. In the Virtual Machines list, if the virtual machine 623XB-MIA-SQL is not started:
   • Right-click 623XB-MIA-SQL and click Start.
   • Right-click 623XB-MIA-SQL and click Connect.
   • In the Virtual Machine Connection window, wait until the Press CTRL+ALT+DELETE to log on message appears.
5. In the Virtual Machine Connection window, click on the Revert toolbar icon.
6. If you are prompted to confirm that you want to revert, click Revert. Wait for the revert action to complete.
7. In the Virtual Machine Connection window, if the user is not already logged on:
   • On the Action menu, click the Ctrl-Alt-Delete menu item.
   • Click Switch User, and then click Other User.
   • Log on using the following credentials:
12-32
i. User name: AdventureWorks\Administrator
ii. Password: Pa$$w0rd
8. From the View menu, in the Virtual Machine Connection window, click Full Screen Mode.
9. If the Server Manager window appears, check the Do not show me this console at logon check box and close the Server Manager window.
10. In the virtual machine, click Start, click All Programs, click Microsoft SQL Server 2008 R2, and click SQL Server Management Studio.
11. In the Connect to Server window, type Proseware in the Server name text box.
12. In the Authentication drop-down list box, select Windows Authentication and click Connect.
13. On the File menu, click Open, and click Project/Solution.
14. In the Open Project window, open the project D:\6232B_Labs\6232B_12_PRJ\6232B_12_PRJ.ssmssln.
15. In Solution Explorer, double-click the query 00-Setup.sql. When the query window opens, click Execute on the toolbar.
Lab Scenario
In this lab, a company developer asks you for assistance with some code he is modifying. The code was written some time back and uses simple T-SQL error handling. He has heard that structured exception handling is more powerful and wishes to use it instead. If time permits, you will also design and implement changes to a stored procedure to provide for automated retry on deadlock errors.
12-33
Exercise 1: Replace @@ERROR based error handling with structured exception handling
Scenario
In this exercise, you need to modify the developer's code to use structured exception handling. The main tasks for this exercise are as follows:
1. Review the existing code.
2. Rewrite the stored procedure to use structured exception handling.
3. Test the procedure.
12-34
Challenge Exercise 2: Add deadlock retry logic to the stored procedure (Only if time permits)
Scenario
In this exercise, the operations team has mentioned that the same stored procedure also seems to routinely fail with deadlock errors. To assist them, make further modifications to your new procedure to add automatic retry code for deadlock errors. The main tasks for this exercise are as follows:
1. Modify the code to retry on deadlock.
2. Test the stored procedure.
12-35
Review Questions
1. What is the purpose of the SET XACT_ABORT ON statement?
2. Why should retry logic be applied to deadlock handling?
3. Give an example of an error that retries would not be useful for.
Best Practices
When designing client-side database access code, do not assume that database operations will always occur without error. Instead of a pattern like:
a) Start a transaction.
b) Do some work.
c) Commit the transaction.
Consider instead a pattern like:
a) Reset the retry count.
b) While the transaction is not committed and the retry count is not exhausted, attempt to perform the work and commit the transaction.
c) If an error occurs and it is an error that retries could apply to, retry step b). Otherwise, return the error to the calling code.
12-36