Está en la página 1de 12

1

MIS 3500 DBMS * Instructor: Bob Travica


Lab on Advanced Querying and Other Uses of SQL
Updated 2015

The purpose of this lab is to complete the topic of querying and provide you with some methods
that could be useful for developing your term project application. Specifically, you will explore these
topics:
-

Writing OUTER JOINS


Writing complex queries that use other queries and checking query accuracy
Writing Crosstab Query
Writing SQL statements for modifying data (UPDATE Query)
Writing SQL statements for deleting data (DELETE Query)
Writing SQL statements for inserting data into another table (INSERT Query)

To go through this lab you need to use the Sallys Pet Store 2007 software. You can
download it here. This database is different from the 2010 version we used in other occasions in
the Animal line of products (there are animal orders, sales, etc.); take a look the schema before
you proceed to querying.

1. Outer Joins
INNER JOIN displays just the rows matched between two tables, that is, the rows that have the
same values in columns for the primary key and foreign key. In contrast, OUTER JOIN displays all
the rows from one table and the matched rows from another table. You can think of INNER JOIN as
symmetric and OUTER JON as asymmetric. A useful consequence of OUTER JOIN is that it can
uncover rows that do not match on the value of the primary and the foreign key.
Consider the question: What animals have not been sold? (Note that the table Animal stores simply
the data on the animals inventory without deleting data from it when an animal is sold; thus the
table serves and archival purpose.)
The question on sold animals can be answered by using the NOT IN clause (we had the example in
the class using the 2010 PetStore that didnt have table SaleAnimal):
SELECT Animal.AnimalID FROM Animal WHERE Animal.AnimalID NOT IN (SELECT
SaleAnimal.AnimalID FROM SaleAnimal);
Output: 8 rows of column AnimalID, values: 2, 4, 12, 19, 25, 34, 88, 181.

2
To simplify the query by avoiding the query-subquery format, you can use the outer join to answer
this question:
SELECT Animal.AnimalID, SaleAnimal.AnimalID
FROM Animal LEFT JOIN SaleAnimal ON Animal.AnimalID=SaleAnimal.AnimalID
ORDER BY SaleAnimal.AnimalID, Animal.AnimalID Asc;
The left join outputs all the rows from the left table (the one named first) and the matching rows
from the other table. The unmatched values of the primary key (AnimalID) from left the table will
be associated with null values of the foreign key (AnimalID) in the other table (blank cells or
somehow else designated). Note that these rows and blank FK cells do not really exist in table
SaleAnimal; they are just created in RAM for the sake of outputting results of the LEFT JOIN query.
Check the table SaleAnimal.
The following modification of the query above will give you just the unmatched rows, those in
which the value of AnimalID is null:
SELECT Animal.AnimalID As [Unsold Animals]
FROM Animal LEFT JOIN SaleAnimal ON Animal.AnimalID=SaleAnimal.AnimalID
WHERE SaleAnimal.AnimalID IS null;
The left join will output all rows from the table cited before the command LEFT JOIN and all the
matched rows from two tables.
Since there is LEFT JOIN, there must be RIGHT JOIN as well. Think of the question: What
customers have not made any purchase. To answer this question, you need to retrieve the tables
Sale and Customer. Again, you can use the NOT IN option. Or you can take the perspective of the
Sale table and ask which Customer IDs exist in the Customer table but do not appear in the Sale
table. A right join comes handy in this case:
SELECT Sale.CustomerID, SaleDate, Customer.CustomerID
FROM Sale RIGHT JOIN Customer ON Customer.CustomerID=Sale.CustomerID
ORDER BY Sale.CustomerID, Customer.CustomerID, SaleDate Asc ;
The right join will output all the rows from the table cited after the command RIGHT JOIN and all
the matched rows.
Note that SaleDate is included just as an additional detail proving that a sale did or did not occur.
Also, notice that the same output could be obtained by simply reversing the order of tables in the
FROM line and using LEFT JOIN instead. RIGHT JOIN is provided for convenience, as it is equivalent
to LEFT JOIN when order of citing tables in the query is reversed.

2. Complex Queries and Checks


Complex queries are those that involve a number of processing steps needed to arrive at an
output. These queries do not need necessarily to involve a large number of tables.
Consider this standard business question: What is the total value of animal orders, including
shipping costs, placed in February?
If you try to write a single query for answering this question, it may not be that easy to get a
correct result. In particular, due to the 1:M relationship between AnimalOrder and
AnimalOrderItem, shipping costs could be calculated wrongly for each animal appearing on an
order rather than for the entire order (check the Pet Store schema). More on this problem in
queries 4-6 below.
Fortunately, you can always simplify query writing by breaking a question into pieces, and
addressing these by separate queries. At the end, you either automatically or manually put
together the results of these individual, simpler quires. Therefore, to answer the question above,
you can write two simpler queriesone to calculate shipping costs, and the other to get the value
of animals ordered. Follow the procedure below and try it out.
1. Write the query:
SELECT SUM(AnimalOrder.ShippingCost) AS [Total Shipping Costs]
FROM AnimalOrder
WHERE Month(AnimalOrder.OrderDate)=2;
Output = $894.92. Save this query as QueryTotalShipping.
2. Write the query:
SELECT SUM(AnimalOrderItem.Cost) AS [Total Animal Orders]
FROM AnimalOrder
INNER JOIN AnimalOrderItem ON AnimalOrderItem.OrderID=AnimalOrder.OrderID
WHERE Month(AnimalOrder.OrderDate)=2 ;
Output = $5,475.33. Save this query as QueryTotalAnimalsPurchased.
3. Therefore, you get the total of animal orders and shipping costs by adding outputs from steps 1
and 2: $894.92 + $5,475.33= $6,370.25. This total could be retrieved by the third query:
SELECT SUM([QueryTotalShipping].[Total Shipping Costs] + [QueryTotalAnimalsPurchased].[Total
Animal Orders]) AS [Animals & Shipping Total]
FROM QueryTotalShipping, QueryTotalAnimalsPurchased;
Output: $6,370.25.
Save this querey as QueryTotalAnimal-Purchases-Shipping.

Note: The FROM line resembles the older SQL syntax, and there is no PK-FK relationship.
But if you try to write the query in one step without exercising caution, the following queries give
larger amounts:
4. Now, lets see what happens if we vary the SQL statement. Try this:
SELECT SUM(AnimalOrderItem.Cost + AnimalOrder.ShippingCost) AS [Sum Animal Orders]
FROM AnimalOrder
INNER JOIN AnimalOrderItem ON AnimalOrderItem.OrderID=AnimalOrder.OrderID WHERE
Month(AnimalOrder.OrderDate)=2;
You will get: $8,867.99.
5. Same with:
SELECT sum(AnimalOrderItem.Cost)+sum(AnimalOrder.ShippingCost) AS TotalCosts
FROM AnimalOrderItem INNER JOIN AnimalOrder ON
AnimalOrderItem.OrderID=AnimalOrder.OrderID
WHERE Month(AnimalOrder.OrderDate)=2;
6. And with this one; here you can see the inflated shipping costs:
SELECT SUM(AnimalOrderItem.Cost) as TotalValueOfAnimalOrders,
SUM(AnimalOrder.ShippingCost) AS TotalShippingCosts
FROM AnimalOrderItem INNER JOIN AnimalOrder ON
AnimalOrderItem.OrderID=AnimalOrder.OrderID
WHERE Month(AnimalOrder.OrderDate)=2;
In queries 4-6, shipping costs are inflated as they are associated with each animal that appears on
an order, rather than with an order as a whole.

2.1 Exercises
Here are some query exercises. If you need help, check Solutions at the bottom of this file.
1. Find total sales per customer for any (chosen) month. Hint: a month can be specified by a
month function - Month(Sale.SaleDate). Think of the parameter query. Also, pay attention to the
GROUP BY line or to the associated error messages: it must list the attributes used for grouping
records.
2. Which merchandise items with more than 100 units on hand have not been sold? Hint: Use an
OUTER JOIN to answer the question.
Output: 2 items

5
3. Which animal orders contained more than one animal?
Output: 22 items

3. Self-Referencing a Table in a Query


Sometimes even a query of a single table can be tricky. Consider the common business question:
Who manages whom? Or specifically in the Pert Store business, which employees are managed by
manager Bill Gibson?
The problem with this query is that there are no separate tables on employees and managers. This
would be waste of space because managers are also employees that have all the employee data.
Each employee has the attribute ManagerID, which takes values of the key EmployeeID
(ManagerID in a particular record = EmployeeID in some other record). Since every employee has
some manager, with the exception of the top manager, the column ManagerID will be populated in
all but 1 record (the employee who is a top executive).
The goal, therefore, is to match each employee with their manager, which on the level of attributes
involves associating an EmployeeID with a ManagerID (another employee). The value of ManagerID
functionally depends on the value of EmployeeID. So, the employee-manager relationship is
designed as unary class diagram, the diagram with one class. This diagram translates into one
table called Employee. To answer who manages whom, you need to self-reference the table, or
build a reflexive join. In other words, you join the table with itself. This is done by calling this table
once by its real name and then by a made-up name (an alias). The aliased table is a copy of this
table that the system makes dynamically (in the main memory) and joins it to table Employee.
Consider this query:
SELECT Employee.EmployeeID, Employee.LastName, Employee.FirstName,
Employee.ManagerID
FROM Employee AS Manager
INNER JOIN Employee ON Manager.EmployeeID=Employee.ManagerID
WHERE (Manager.LastName="Gibson") AND (Manager.FirstName="Bill");
Output:
EmployeeID LastName FirstName
5 James

Leisha

6 Eaton

Anissa

7 Farris

Dustin

9 O'Connor

Jessica

10 Shields

Howard

Analysis: Focus on the table Employee. We can figure out from a brief inspection of the Employee
table the underlying business rule: each manager can manage many employees, and each

6
employee has only one manager. Therefore, Manager and Employee are in the 1:M relationship.
This implies that the same ManagerID repeats many times in table Employee where it is matched
with different values of EmployeeID. We want to pick out employee rows where name attributes are
Bill Gibson, but we do not know what his EmployeeID is.
The figure below shows how the system simulates a transient (temporary) table Manager that is in
1:M relationship with the existing table Employee. Assuming that Gibson has no manager, you can
see that Gibson is the manager of the other two employees as Gibsons ID is associated with them.
We need table Manager to find Gibson on the name attributes. The tricky part is to figure the
names of matching columns (PK, FK). Since EmployeeID remains PK of the new table Manager, the
tables key fully named is Manager.EmployeeID. In table Employee, the FK is ManagerID, and so its
full name is Employee.ManagerID.
Manager

Employee

EmployeeID LastName FirstName

EmployeeID LastName ManagerID

5 Gibson

Bill

5 Gibson

6 Obama

6 Obama

7 Putin

7 Putin

4. Crosstab Query
Crosstab queries are useful for representing multidimensional data. For example, you may want to see total
sales per animal category by each employee. This can be very useful for management purposes. You can get
this output via a crosstab query. The crosstab query you will create in this exercise can be useful for building
reports atop of it (will be addressed in the lab on application development).
To design a crosstab query, you first need to write a SELECT query. Then you use this query as the input for
the crosstab query. Essentially, the crosstab query regroups records, cross-tabulates them, and performs
calculations. Your SELECT query should list total sales per animal category and employee, as follows.
SELECT Animal.Category, Employee.LastName, Sum(SaleAnimal.SalePrice) AS [Value Sold]
FROM Employee, Sale, SaleAnimal, Animal
WHERE Employee.EmployeeID=Sale.EmployeeID AND Sale.SaleID=SaleAnimal.SaleID AND
SaleAnimal.AnimalID=Animal.AnimalID
GROUP BY Animal.Category, Employee.LastName;
Output, partial (43 rows total):
Category LastName Value Sold
Bird

Carpenter

$379.88

Bird

Eaton

$153.07

Bird

Farris

$202.03

Bird

Gibson

$326.21

7
Category LastName Value Sold
Bird

Hopkins

$310.82

Save this query under the title QueryCategoryEmployeeValue. Now you can build a crosstab query from this
query by using this procedure:
1.
2.
3.
4.
5.

In the main menu at the top, click Create


In the Create menu, click Query Wizard
In the New Query dialog box, click Crosstab Query Wizard and the OK button
In the Crosstab Query Wizard dialog box, check the Queries label in the View section
In the list of queries available, find your QueryCategoryEmployeeValue; click it and then click the Next
button.
6. On the next screen, select the field Last Name for the row headings; click Next button
7. On the next screen, select Category for the column headings; click Next
8. On the next screen, click Value Sold (provided that you named this way the sum of the
SalePrice field in the QueryCategoryEmployeeValue). Then, select Sum function and
click Next.
10. On the next screen, accept the choice for naming the crosstab query it should be
QueryCategoryEmployeeValue_Crosstab and click the Finish button
Your output may contain rows like these below (the figure is shortened; total is 10 rows):
QueryCategoryEmployeeValue_Crosstab
LastName Total Of Value Sold

Bird

Cat

Dog

Fish Mammal Reptile Spider

Farris

$4,203.71 $202.03

$689.23 $2,966.01

Gibson

$4,105.19 $326.21 $1,086.08 $2,131.52 $39.60 $207.81

Reasoner

$4,060.15 $625.94

$920.15 $2,219.68 $61.20

$346.44
$19.80 $172.88

$313.97
$40.50

5. Writing SQL statements for modifying data (UPDATE Query)


Many tasks in developing and managing database systems can be performed by using SQL queries.
This is the CRUD set of operations you already know, just a bit differently named (Create=INSERT,
Read=SELECT; and the terms UPDATE and DELETE are as in CRUD). The queries performing
other than retrieval tasks are sometimes called Action Queries. As the retrieval query is named
after the command SELECT, action queries are named after their key command words; for
example, UPDATE query.
The UPDATE query changes values of attributes in already existing records. To practice it, let us
first add a record to the table Animal in the Sallys Pet Store system. Open the table and start
typing in the blank row at the bottom the values shown below:

8
Animal
AnimalID Name Category Breed DateBorn Gender Registered Color ListPrice Photo
202 Fify

Dog

Mastiff

2/2/2007 Male

ACK

Brown

$100.00

Save the table. The new record with AnimalID 202 appears in the table (or some other number that
comes after the last number of AnimalID in your version of Pet Store). Next, assume that you want
to change the Registered attribute (it is mistyped) and Color (say, Gold is a more precise
description). Instead of going directly to records and retyping these modified data, an UPDATE
query can do the job:
UPDATE Animal
SET Color = "GOLD"
WHERE AnimalID=202;
If your Animal table is opened on the screen, you may need to click in the row you just updated to
see the change. Save the table.
To practice this type of query more, make a copy of table Animal, and then work with it. Write an
UPDATE query to set the Color to Gold for all animals and to increase the List Price for 5%. If you
get stuck, check the solutions at the bottom of this file. While goldening all the animals is just for
fun, you can easily see a business value of the massive adjustment of prices that the other part of
this query does.
If you get stuck, check Solutions. After running a query, you may need (again) to click in the table
and/or to close and re-open it in order to see the changes.

6. Writing SQL statements for deleting data (DELETE Query)


The following DELETE query will delete a record you want from table Animal:
DELETE
FROM Animal
WHERE AnimalID=[ ];
Note that rows can be deleted directly in a table: like with Excel, right-click the first, unnamed
column in a row, and select Delete Record in the pop-up menu.
As with the UPDATE query, real power of the DELETE query surfaces with processing many rows.
For example, the query below will delete all the rows on spiders in your table Copy of Animal. But
first, lets find quickly what spiders are listed in the catalogue. You can use the filtering function for
this. Filtering is like retrieving data but without using SQL or QBE statements; also, you work
directly with rows in tables (or query outputs).
Try this:
1. Open table Copy of Animal

9
2. Find the first occurrence of the word spider; for example, right-click the
heading of the Category column, and in the popup menu select Find; then type
spider
3. In Home/Sort & Filter menu, click the selection icon (upper right)
4. In the popup menu, click Equals Spider
The output should be a set or rows that contains spiders (3 rows only, with AnimalID values of 197,
183, and 28). This is a dynamic table and its results appear only on the screen (they are neither
automatically stored nor they affect the original data).
Now that you know what the spider rows are, you can go ahead and delete these:
DELETE
FROM [Copy of Animal]
WHERE Category="spider";
Check the table for the above three values of AnimalID to be sure that they are really erased.
Another important use of the DELETE query is to automate the task of erasing the sold items from
an inventory table.

7. Writing SQL statements for copying data to another table (APPEND Query)
Assume you want to archive the data on animals sold before deleting the data from the table
Animal. One way of doing this is by having a table on animals which has the same structure as the
table Animal and that you can use for storing archival data. Lets call it Animals Sold. In fact, this
whole task breaks down to copying specific rows from Animal to Animals Sold, and then deleting
these rows from Animal.
The APPEND query can do the job of copying data from Animal to Animals Sold. This query is also
called INSERT query after the name of the start command word in the SQL statement.
To make a copy of the structure of the table Animal, do this:
1. Right-click the label Animal in the Tables view of the Access screen
2. Right-click anywhere in the Tables window, and select Paste in the pop-up menu. An input form
entitled Paste Table As appears.
4. In this input form, select the option Structure Only
5. In the same form, name the table as Animals Sold
Now that you have the space for storing the data on animals sold, use the query below, which
assumes that all dogs are sold:

10
INSERT INTO [Animals Sold] (AnimalID, Name, Category, Breed, DateBorn, Gender, Registered,
Color, ListPrice, Photo)
SELECT AnimalID, Name, Category, Breed, DateBorn, Gender, Registered, Color, ListPrice, Photo
FROM Animal
WHERE Category="dog";
To finish the data archiving task, write a query for deleting dog records from the table Copy of
Animal (work with this table to preserve the original table Animal).
To practice this more, insert rows for black cats into the table Animals Sold. Then, delete these
rows from the table Copy of Animal. In the process, first find out what black cats exist in the
Animal table by using the filtering function; this time use Filter By Form option that resembles QBE.
There should be 9 rows on black cats.

11
Solutions
1. Find total sales per customer for any (chosen) month
SELECT Sale.CustomerID, Customer.LastName, Sale.SaleDate,
Sum([SaleItem.SalePrice]*[SaleItem.Quantity]) AS [Month Total]
FROM Customer INNER JOIN
(Sale INNER JOIN SaleItem
ON Sale.SaleID = SaleItem.SaleID)
ON Customer.CustomerID = Sale.CustomerID
WHERE Month(Sale.SaleDate)=[Enter month as number]
GROUP BY Sale.CustomerID, Customer.LastName, Sale.SaleDate ;

2. Which merchandise items with more than 100 units on hand have not been sold? Use an OUTER
JOIN to answer the question.
SELECT Merchandise.ItemID, Merchandise.Description, Merchandise.QuantityOnHand,
SaleItem.ItemID
FROM Merchandise
LEFT JOIN SaleItem ON Merchandise.ItemID = SaleItem.ItemID
WHERE Merchandise.QuantityOnHand>100 AND SaleItem.ItemID Is Null;

3. Which animal sale orders contained more than one animal?


SELECT SaleAnimal.SaleID, COUNT(SaleAnimal.AnimalID) As [Animals/Order]
FROM SaleAnimal
GROUP BY SaleAnimal.SaleID
HAVING COUNT(SaleAnimal.AnimalID) > 1
ORDER BY SaleAnimal.SaleID, COUNT(SaleAnimal.AnimalID);
Output: 22 items
Check: Remember a sale ID from the output above and then run this query:
SELECT SaleAnimal.SaleID, SaleAnimal.AnimalID
FROM SaleAnimal
WHERE SaleID=[];
When prompted to enter a SaleID, enter the one you memorized.

4. Which merchandise items have an average sale price more than 50 percent higher
than their average purchase cost?
Write a query QueryAverageMerchandiseCost:

12

SELECT OrderItem.ItemID, Avg(OrderItem.Cost) AS AvgOfCost


FROM OrderItem
GROUP BY OrderItem.ItemID;
Write a query QueryAverageSalePrice:
SELECT SaleItem.ItemID, Avg(SaleItem.SalePrice) AS AvgOfSalePrice
FROM SaleItem
GROUP BY SaleItem.ItemID;
The final query combines the above two:
SELECT Merchandise.ItemID, Merchandise.Description, QueryAverageMerchandiseCost.AvgOfCost,
QueryAverageSalePrice.AvgOfSalePrice
FROM (QueryAverageMerchandiseCost INNER JOIN Merchandise ON
QueryAverageMerchandiseCost.ItemID = Merchandise.ItemID) INNER JOIN QueryAverageSalePrice
ON Merchandise.ItemID = QueryAverageSalePrice.ItemID
WHERE AvgOfSalePrice >1.5*AvgOfCost;
Output: 2 items.

5. The UPDATE query to change the color to gold and to raise the list price for 5% in all records:
UPDATE [Copy of Animal]
SET Color = "Gold", ListPrice=ListPrice+(ListPrice*0.05);
The End