
CHAPTER 9

DATABASE MANAGEMENT SYSTEMS


LECTURE NOTES

Learning Objectives:
To understand the operational problems inherent in the flat file approach to data management that gave rise to the
database concept.
To understand the relationships among the defining elements of the database environment.
To understand the anomalies caused by unnormalized databases and the need for data normalization.
To be familiar with the operational characteristics of the relational database model.
To be familiar with the stages of database design, including entity identification, data modeling, constructing the physical
database, and preparing user views.
To be familiar with the operational features of distributed databases and recognize the issues which need to be
considered in database design.

Purpose of Chapter 9:
The purpose of this chapter is to explain how the strategies, techniques, hardware, and software of database models are
different from flat-file environments.
The first section describes flat-file environments and how the database model resolves many of the flat file environment
issues.
The primary elements of the database environment are the users, the database management system (DBMS), the
database administrator (DBA), and the physical database.
Relational database design is discussed in-depth including data modeling, deriving relational tables from entity
relationship diagrams, creating user views, and normalizing data structures.
Finally, distributed database issues are discussed and centralized, partitioned and replicated database systems are
compared.

Overview of Flat File vs. Database Approach


Figures 9-1 (flat file) and 9-2 (database) compare the two approaches to data management.

Flat file approach characteristics include:


Data files that are owned by, or used only by, specific application programs.
Data files that are specifically created for the utilization of one user group for one purpose.
Different user groups that use the same data have created their own representations of the data in their own data files,
leading to redundancies, inconsistencies, and incompatibilities as well as task-data dependency.
Data redundancy creates problems in data storage (multiple collection and multiple storing of the same data), data
updating (redundant and possibly inconsistent) and currency of information (delays and errors in updating may
result in loss of currency).
Task-data dependency refers to the inability to obtain additional information as users' needs change. For any
change in data elements, the corresponding programs must also be altered, which can be costly.

Database approach pools the data in one central location, the database, for use by all users who are connected with the
system. Because all the data is shared and is entered in the system only once, the problem of data redundancy is eliminated.
Updating can be performed with just a single procedure and information is kept perpetually current for all users.

Traditional flat file problems are solved with the Database concept:
No data redundancy: data is pooled across users and tasks
Single time data update: stored once, updated once
Current values in data fields
Task-data independence: users are constrained only by the limitations of the data available to the firm and the
legitimacy of their need to access it

Controlling Access to the Database is more difficult than controlling flat file systems because all users need the database.
Typically, control is accomplished through field (attribute) level authorizations to read, to write, or to read and write.

The Database Management System (DBMS) is the operating software that lies between the data and the user applications.
The DBMS holds all logical data structures and controls access to the database.

Three Conceptual Models for Database Architecture are hierarchical, network, and relational database models.
Hierarchical and Network models are termed navigational or structured models because all relationships must be defined a
priori. These models are discussed in the appendix to the chapter. Relational database models allow users to create new
and unique paths through the database, enabling them to solve a wider range of business problems.

Elements of the Database Approach

The elements of the database approach to data management are:


the users,
the database management system,
the database administrator, and
the physical database.
Figures 9-3 and 9-4 illustrate the components of the database concept. Figure 9-4 illustrates the database management
system.

Users

Data may be accessed by programs or directly by users through ad-hoc (not pre-planned) queries.

Database Management System (DBMS)

The DBMS is the software that provides a controlled environment to efficiently manage and secure the data resource. Typical
features provided by a DBMS include:
Program development
Backup and recovery
Database usage reporting
Database access

The primary components of a DBMS include:


The Data Definition Language (DDL) defines the physical database, including the names and relationships of all data
elements, records, and files. There are three views involved in this definition:
Internal view: physical record arrangement.
Conceptual view (schema): the logical, abstract connections between the attributes, records, and files. Users'
programs are written against these conceptual definitions.
User view (subschema): defines how a particular user sees the database. Basically, it identifies the
attributes of each record type that a user has authorization to access, and the type of access afforded
(read, write, or both). A brief SQL sketch after this list of components illustrates the schema and subschema definitions.
The Data Manipulation Language (DML) is the programming language that the software uses to retrieve, process,
and store data. The data manipulation language matches the user request to the stored user view and conceptual
views, determines the data structure parameters, and then passes them to the operating system, which performs the
actual data retrieval.
Typical process: a user makes a request for data. The DBMS translates the request into the data manipulation
language, verifies that the user is authorized to view the data requested, retrieves the data structures from the
internal view (file organization and file access method), and passes the request to the operating system, which
retrieves the data. The retrieved data is stored in a memory buffer, the user manipulates the data as needed, and
then the steps are reversed to store the revised data in the physical database.

The Query Language permits end users to access data in the database directly without the need for a conventional
program. The most popular is IBM's SQL (pronounced "sequel"). Figure 9-5 illustrates the powerful SELECT
command within SQL.
The Data Dictionary describes every data element in the database, creating a common view for all users of the system.
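To make the schema (conceptual view) and subschema (user view) distinction concrete, the following is a minimal SQL sketch. The table, view, and user names (Employee, PayrollView, clerk01) are illustrative assumptions rather than items from the chapter figures, and the exact GRANT syntax varies by DBMS product.

    -- Conceptual view (schema): the logical definition of a record type.
    CREATE TABLE Employee (
        employee_number   INTEGER       PRIMARY KEY,   -- unique key attribute
        employee_name     VARCHAR(40)   NOT NULL,
        department        VARCHAR(20),
        hourly_rate       DECIMAL(8,2),
        year_to_date_pay  DECIMAL(10,2)
    );

    -- User view (subschema): only the attributes this user group may see.
    CREATE VIEW PayrollView AS
        SELECT employee_number, employee_name, hourly_rate, year_to_date_pay
        FROM Employee;

    -- Access authority assigned by the DBA: read-only access to the view.
    GRANT SELECT ON PayrollView TO clerk01;

In practice the DBA issues the GRANT statements, while the data dictionary records the definitions created by the CREATE statements.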

The Database Administrator

The database administrator (DBA) occupies a position that does not exist in the traditional flat-file environment. The DBA's job is to manage
the database resource, a job complicated by the fact that multiple users share the common database. Therefore the
administrator must utilize organization and coordination skills to develop rules and guidelines that will protect the integrity of
the database.

Table 9-1 reviews the functions of the Database Administrator: database planning, design, implementation, operation,
maintenance, growth, and change.

One of the database administrator's responsibilities is to evaluate requests to make use of the database. The administrator
also gives access authority to users via the data definition language software module [DBMS block of Figure 9-3] and
creates the data dictionary, a collection of descriptions of every data element in the database.

Organizational Interactions of the DBA


Figures 9-3 and 9-6 illustrate the position of the database administrator with respect to the users and the database
management system. Internal control is enhanced by the separation of duties between access control and systems
development. The DBA authorizes views and access privileges.

The Physical Database

The physical database is the lowest level of the database: the actual physical repository of data on magnetic disks. This is illustrated at the
far right side of Figure 9-3. Various data structures are employed, usually in combination, to process the physical database.
Relational databases are based on the indexed sequential file structure. Figure 9-7 illustrates the index in conjunction with the
sequential ordering. Multiple indexes may be created (termed inverted lists) to assist user retrieval.

The Relational Database Model


The relational model represents data in two-dimensional tables, called relations.
Within each table:
1. Attributes or fields are represented in the columns, and
2. Records are represented in the rows.

Tables are relational if they support the relational algebra functions of restrict (row selection), project (column selection),
and join (building a new table from two tables by pairing rows that share a common attribute value). Figure 9-8
illustrates a relational table, and Figure 9-9 illustrates these algebraic functions; a brief SQL sketch of the three operations follows.
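The following is a minimal SQL sketch of the three operations; the Customer and SalesInvoice tables and their columns are assumed for illustration and do not come from the chapter figures.

    -- Restrict: select only the rows that satisfy a condition.
    SELECT * FROM Customer WHERE region = 'NORTH';

    -- Project: select only certain columns.
    SELECT customer_number, customer_name FROM Customer;

    -- Join: build a new table by pairing rows from two tables
    -- that share a common attribute value (customer_number).
    SELECT c.customer_name, s.invoice_number, s.invoice_amount
    FROM Customer c
    JOIN SalesInvoice s ON s.customer_number = c.customer_number;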

Relational Database Concepts

Entity: something about which the organization wants to capture data. Can be physical or conceptual. Entities are nouns,
resources, events, or agents involved in the business process. They are represented by rectangles on an ER diagram. Figure
9-10 illustrates the concept of entities in an entity relationship diagram.

Data model is the blueprint for ultimately creating the physical database.

Entity relationship diagram is the graphic representation of the data model.

Entity, Occurrence, and Attributes

The database designer identifies the key entities in the system of interest and defines the data elements that are important for
capture and maintenance, keeping in mind the responsibilities and goals of the users of the system to be designed and the
organizational rules of doing business. These findings are documented in a data modeling graphic such as the entity
relationship (ER) diagram:

Entities are nouns, resources, events, or agents involved in the business process.
Occurrence describes the number of instances or records that pertain to a specific entity (rows).
Attributes are data that describe the characteristics or properties of entities (columns). They are represented by labeled
circles attached to the entities. Many times these are not shown on the basic ER diagram because they clutter the picture. To
see the attributes, a user just has to place their cursor over the entity of interest, and a window shows a list of the attributes
for that entity.
Relationships between entities are verbs of action and are depicted by diamond symbols.

Associations and Cardinality


Associations are the relationships between tables, the nature of which is typically described with verbs such as
shipped, requests, or receives.
Cardinality is the degree of association between two entities.

There are three primary table associations (Figure 9-11 illustrates many different associations):
One-to-Zero or One (1:0,1) associations. For every record occurrence in one table, there is zero or one occurrence in
the other table. Example: employee and year-to-date earnings files. Example 1 in Figure 9-11.
One-to-One (1:1) One record is associated with only one record in the other entity. Example 2 in Figure 9-11.
One-to-Zero or Many (1:0,M) The minimum number of records matched is zero and the maximum is many. Example 3 in
Figure 9-11.
One-to-Many (1:M) For every record occurrence in one table, there are many occurrences in the other table. Example 4
in Figure 9-11.

Many-to-Many (M:M). Examples 4 and 5 in Figure 9-11. Example: inventory and vendors. One inventory item might
be supplied by many vendors. One vendor may supply many different inventory items.

Physical Database Tables


Rows (records or occurrences or tuples) and Columns (attributes) form the tables.

Tables should have four characteristics:


1. The value of at least one column in each row must be unique. This will be the primary key.
2. The table must conform to the rules of normalization: free of repeating groups, partial dependencies, and transitive
dependencies.
3. All attribute values in any column must be of the same class.
4. Each column within a table must be uniquely named; however, different tables may have the same name for a column.

Linkages Between Relational Tables are created by embedding a field (one column) from one table in another table's
records. Figure 9-12 illustrates linkages between relational tables.

Foreign Key: an embedded key that is the primary key of (refers to) a different entity's table.

Composite or concatenated keys are keys made up of two combined fields or attributes.

User Views are the set of data that a particular user sees: the forms, the screens, the tables, or the reports. These views can
be prepared by a word processor, a graphics package, or pencil and paper.

The Data Normalization Process

Performing normalization requires a good understanding of the user information needs and the organizations business rules.
The first step is to design the user views (output reports, documents, input screens). Figure 9-13 contains the user view of the
inventory status report for a purchasing agent.

The Importance of Data Normalization is clearly illustrated by the fact that, without normalization, users may not have access to the information
they need. Most corporations normalize a database to the 3rd normal form (higher normal forms exist but
are rarely used in practice). Figure 9-14 shows an unnormalized database table for inventory.

Mistakes/errors that occur when normalization is missing include:


Update anomalies result from redundancy in the unnormalized table. An example is a supplier who provides more than
one type of inventory. If the supplier's address is in each row of the inventory file, and the supplier's address
changes, then the address must be changed as many times as it occurs in the database (which could be
thousands of rows). Figure 9-14 provides the unnormalized table for inventory.

Insertion anomalies result from not having a separate table for an entity. For example, there is no vendor table in the
inventory example; vendor data appears as attributes within each inventory part-number record. Therefore, it would be
impossible to add a vendor the organization might want to work with unless an inventory item had already been purchased from that vendor.
Deletion anomalies result from unintentional database deletions. For example, if an inventory item is deleted, and if that
inventory item was the only item purchased from that supplier, then deleting the inventory item automatically deletes
the supplier from your database. A user may go along without knowing that critical information has been deleted.

Data Normalization Rules for 3rd Normal Form

The tables in Figures 9-14 and 9-15 illustrate the normalization process. Have the students identify the problems: two entities
(inventory and suppliers) are combined in one table, and name, address, and telephone number are not dependent upon part number.

The focus of this normalization is to remove repeating groups, partial dependencies, and transitive dependencies.
Splitting up unnormalized complex tables into simpler, smaller tables so that:
All non-key attributes are dependent on the primary key.
All non-key attributes are independent of the other non-key attributes.

Figure 9-15 illustrates the three tables that result from normalizing the table presented in Figure 9-14 to 3rd normal
form. Note that an intersection table (termed the link file) has been created to connect suppliers and inventory parts. This
eliminates an M:M relationship that was in the data. The link table has a combined (composite) primary key made up of the keys of both related tables. The
update anomaly is resolved because the data about each supplier exists in only one location (the supplier table). The insertion
anomaly is resolved because new vendors can be added to the supplier table even if they are not currently supplying the
organization with any inventory items. The deletion anomaly is eliminated: anything can be deleted without causing
unintentional deletions of other items. A brief SQL sketch of the resulting structure follows.
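The following is a minimal SQL sketch of this structure, assuming simplified attribute names (the actual attributes appear in Figure 9-15).

    CREATE TABLE Supplier (
        supplier_number   INTEGER      PRIMARY KEY,
        supplier_name     VARCHAR(40),
        supplier_address  VARCHAR(60),   -- stored once, so no update anomaly
        telephone         VARCHAR(20)
    );

    CREATE TABLE Inventory (
        part_number       INTEGER      PRIMARY KEY,
        description       VARCHAR(40),
        quantity_on_hand  INTEGER,
        reorder_point     INTEGER
    );

    -- Intersection (link) table that resolves the M:M association.
    CREATE TABLE SupplierInventoryLink (
        supplier_number   INTEGER REFERENCES Supplier (supplier_number),
        part_number       INTEGER REFERENCES Inventory (part_number),
        PRIMARY KEY (supplier_number, part_number)   -- composite primary key
    );

A new supplier can now be inserted into Supplier with no matching rows in the link table, and deleting an inventory item removes only its link rows, not the supplier.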

Linking Normalized Tables

The proper linking of normalized tables depends upon the business rules of the situation at hand as well as the
cardinality of the association. The chapter's example, which applies two different business rules to the relationship between vendor and
inventory item, illustrates this point.

Keys in 1:1 associations: either (or both) primary keys may be embedded as foreign keys in the related table. If the association is
really 1:0,1, place the primary key of the mandatory (1) side table into the optional (0,1) side table as a foreign key.

Keys in 1:M associations: The primary key of the 1 table is embedded as a foreign key in the M table.

Keys in M:M associations: A link table is needed, with a combined (composite) key consisting of the two primary keys from
the two related tables. This avoids embedding foreign keys directly in either of the two base tables.
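For contrast with the M:M link table sketched earlier, a 1:M association needs no link table: the primary key of the "1" side is simply embedded in the "M" side. The purchase order attributes below are assumed for illustration.

    -- One supplier receives many purchase orders (1:0,M), so the supplier's
    -- primary key is embedded as a foreign key in the purchase order table.
    CREATE TABLE PurchaseOrder (
        po_number        INTEGER  PRIMARY KEY,
        order_date       DATE,
        supplier_number  INTEGER  REFERENCES Supplier (supplier_number)  -- foreign key
    );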

Example: It may help students to review Figure 9-15 to first understand the relationships and tables.
Then have them assume a 1:3 relationship between vendor and 3 inventory parts as the first business rule to consider. Figure
9-16 illustrates the changes that would be needed in the structure to accommodate this type of relationship, properly embedding
the primary key of the "one" (vendor) side as a foreign key on the "many" (inventory) side. Figure 9-17 illustrates the opposite placement of the foreign
key, a clumsy approach that works only because we have limited the inventory part numbers to 3.

Next, assume a 1:M relationship between vendor and inventory part. This is the situation where you must embed the
primary key of the "one" side as a foreign key on the "many" side, never the reverse. Again, Figure 9-16 would still work, but the
foreign key reversal illustrated in Figure 9-17 would not work.

Finally, assume an M:M relationship between vendor and inventory part. Now an intersection record is required because both
sides of the relationship involve many. Figure 9-15 has been set up to support this link table.

Accountants need to understand normalization to know if a database has been normalized because:
Update anomalies result in conflicting and obsolete database values.
Insertion anomalies result in unrecorded transactions and incomplete audit trails.
Deletion anomalies result in loss of accounting records and destruction of audit trails.

Designing Relational Databases

First, a company decides that it needs a new information system. The next step is to develop the user requirements for
the new system. After that, there are three phases of database design: conceptual database design, logical database design,
and physical database design. This section of the systems development process is the conceptual database design (assumes
that the user needs have already been captured as systems requirements).

The six primary phases of database design:


1. Identify entities.
2. Construct a data model showing entity associations.
3. Add primary keys and attributes to the model.
4. Normalize the data model and add foreign keys.
5. Construct the physical database.
6. Prepare the user views.

Identify entities

The primary entities of the organization must be identified, and a data model constructed to reflect their relationships.
Business rules and information needs of all users must be analyzed. The analyst identifies and documents the key operational
features of the system needed. The next step is to identify the underlying entities involved in those operations.

The example in the chapter reviews a purchasing agent identifying the need for more inventory, selecting a vendor, preparing
a purchase order (Figure 9-18a), sending it to the vendor, and receiving the inventory shipped (Figure 9-18b receiving
report). The entities in this story are: purchasing agent, receiving clerk, inventory, suppliers, inventory status report, purchase
order and receiving report.

To pass the valid entity test (for an entity to be included in the data modeling), two conditions must be met:
An entity must consist of two or more occurrences
An entity must contribute at least one attribute that is not provided by other entities.

The entities that do not pass the two conditions, and hence, will not be modeled: the purchasing agent, the receiving clerk,
and the inventory status report.
Inventory, supplier, purchase order, and receiving report pass both criteria and will be included in the model.

Construct a data model showing entity associations


Figure 9-19 illustrates the associations for this example
Purchase order and Inventory entities have a (0,M : M)
Inventory and Suppliers (M : M)
Supplier and Purchase order (1 : 0,M)
Purchase order and Receiving Report (1 : 1)
Receiving Report and Inventory (0,M : M)

Add primary keys and attributes to the model

The analyst should select a primary key for each entity that logically defines the non-key attributes and uniquely identifies
each occurrence (row) in the entity. Purchase order number, Supplier number, etc. Figure 9-20 illustrates the four entities
with the primary keys assigned.

Every attribute in an entity should appear directly or indirectly (calculated) in at least one user view. Figure 9-21 illustrates
the attributes assigned to each entity in Figure 9-20.

Normalize the data model and add foreign keys

Figure 9-22 presents the normalized tables for this example. All of the attributes in each final table are uniquely and totally
dependent upon and explained by the primary key (creating 3rd normal form).

The normalization steps created a new table called the Receiving Report Item Detail, a new table called the Purchase Order
Item Detail, and a new table called the Inventory-Supplier Link. These new tables removed all M:M relationships (the inventory-
supplier link), removed all repeating groups (within purchase orders and receiving reports), and removed all transitive
dependencies (no more redundant data storage).

In the purchase order table, part-num, description, order-quantity, and unit price are repeating group data: when a
purchase order contains more than one item, multiple values will exist for part number, description, quantity ordered,
and price, and one row cannot hold multiple values.
In the receiving report table, part-num, quantity received, and condition code are also repeating groups that must be
removed.
In the purchase requisition, purchase order, and receiving report tables, there are fields or attributes that contain data
that is not dependent upon their primary keys. Rather, they have what is called transitive dependency on another
non-primary key field. Therefore, all such fields must be removed and made into a table of their own.

Figure 9-23 illustrates the normalized tables for this example. The primary and foreign keys linking the tables are
represented by dotted lines.
Each record in the Receiving Report Item Detail table represents an individual item on a receiving report. The key to this
table is a composite key of both the Receiving Report number and the Inventory part number. Both of these
attributes are needed to uniquely identify the remaining attributes in this table, the quantity received and the
condition. The Receiving Report number attribute provides the link to the Receiving Report table, and the Part
Number attribute provides the link to the Inventory table in order to allow the updating of the quantity on hand field
from the Quantity Received field of the Item Detail record.
The Purchase Order Item Detail table uses a composite primary key of both the Purchase Order number and the Part
Number. Again, this uniquely identifies the remaining field in this table, the order quantity. The PO number links
this table to the PO table, and the Part number links this table to the Inventory table, where the Description and the
unit cost data are stored.

Construct the physical database

This phase converts the conceptual user views into underlying base tables. This may take months for companies to
accomplish, as data needs to be entered, or transferred from existing storage via software programs (that may have to be
written).

Prepare the User Views

The lower portion of Figure 9-23 represents a physical user view that could be created through the query feature of the
database. Note that the fields represented on a screen or document for a specific user may come from a variety of tables.
These fields have been cross-referenced with numbers from 1-12 in Figure 9-23 to help students understand the relationship
between logical storage and user views.
These same logical fields (attributes) could be combined in different ways to make many different user views.

Forms and Reports are pre-prepared user views. Ad-hoc reports (not pre-planned, real-time research) can be prepared
through database queries. SQL (Structured Query Language) is the standard query language used by relational databases. Some of the most common SQL
operators are SELECT, FROM, and WHERE (a sample query follows the list below).
SELECT identifies the attributes (columns) to include in the report.
FROM identifies which tables to go to for those columns.
WHERE specifies which rows (occurrences or records) to pick for the report, usually by a value or range of values in a
particular attribute.
AND, OR, NOT are operators that are used to further specify the report.
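The following sample query combines these operators; the table and column names reuse the illustrative inventory/supplier sketch from earlier in these notes rather than the chapter figures.

    -- Report every inventory item that has fallen below its reorder point,
    -- together with the suppliers who can provide it.
    SELECT i.part_number, i.description, i.quantity_on_hand, s.supplier_name
    FROM Inventory i, SupplierInventoryLink l, Supplier s
    WHERE i.part_number = l.part_number
    AND l.supplier_number = s.supplier_number
    AND i.quantity_on_hand < i.reorder_point;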

Headers, trailers, summation fields, averages, and graphic representations can all be added to complete the reports.

Databases in a Distributed Environment

Companies need to decide where the organization's database will be located to support a distributed data processing
(DDP) system. The two choices are a centralized database or a distributed database. Distributed databases fall into two
categories: partitioned and replicated databases. Each of these options is discussed in the chapter.

Centralized Databases

Figure 9-24 illustrates a centralized database approach. The central site performs the function of file manager for the remote
users.

Data currency can be a problem when two users are accessing the same subsidiary ledger and both are updating it, because
account balances pass through a state of temporary inconsistency where the values are incorrectly stated, which can result in
incorrect data processing. The chapter includes an accounts receivable example to make this concept more tangible for the
students.

The solution is to lock out all subsequent updaters until the first updater has finished its transaction, as sketched below.
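Most relational DBMSs implement this lock-out with explicit transactions and row-level locks. The sketch below is illustrative only: the AccountsReceivable table, customer number, and amount are assumed, and SELECT ... FOR UPDATE, while widely supported, varies in syntax across products.

    BEGIN TRANSACTION;

    -- Lock this customer's accounts receivable record so that any second
    -- updater must wait until the transaction commits.
    SELECT balance
    FROM AccountsReceivable
    WHERE customer_number = 4321
    FOR UPDATE;

    -- Post the cash receipt against the locked balance.
    UPDATE AccountsReceivable
    SET balance = balance - 150.00
    WHERE customer_number = 4321;

    COMMIT;   -- releases the lock; waiting updaters now see the current balance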

Distributed Databases

There are two options for this architecture: partitioned and replicated.

Partitioned databases take the centralized database and distribute it according to its primary users, thereby dramatically
reducing I/O traffic, improving response time, increasing primary users' control over their data, and reducing the
potential of losing all the corporate data if a disaster were to occur. There are still lock-out needs for secondary users. Figure
9-25 illustrates partitioned databases.

The deadlock phenomenon: multiple sites lock each other out due to the timing of their update requests and then remain
in a permanent wait state, resulting in incomplete transactions and corruption of the database. Figure 9-26 illustrates the
deadlock situation.
Deadlocks are resolved by sacrificing one or more transactions. Which transaction to terminate depends upon:
The amount of resources currently invested (number of updates already performed).
The stage of completion (let the ones closer to being done finish first).
The number of deadlocks associated with the transaction (the more, the greater the chance of being the sacrificed entry).

Replicated databases involve duplicating the database at each site for each user.

Figure 9-27 illustrates a replicated database. These systems work best when up-to-the-minute currency is not needed and
many users must perform many queries to carry out their responsibilities. Then, either in near real time or periodically
(e.g., each evening), the databases are updated to reflect the changes made by the other users (a rather complicated updating
process). Currency is maintained through time-stamping of transactions and updating based on the time stamps.

A common method used for concurrency control is to serialize transactions by time-stamping. This involves labeling each
transaction with two criteria. First, transactions are grouped into type classes (based on potential conflicts). Second, each
transaction is time-stamped. When transactions are received at a site, they are evaluated for conflicts. If a conflict exists, the
transactions are entered into a serialization schedule, and an algorithm is used to schedule the updates to the databases
according to their class and time-stamp. This method allows multiple concurrent transactions to be processed without error
as if they were serial events.

Distributed Databases and the Accountant

The questions that an accountant should consider with respect to an organization's database processing include:
Should the business use centralized or distributed processing?
If distribution is desired, should the database be replicated or partitioned?
If the partitioned distributed database is desired, how should the segments be allocated?
The answers to these questions affect the integrity of the database and the quality of the audit trails.

Summary

The database approach overcomes many of the weaknesses of the flat file approach to data storage and processing. The
components of the database approach were reviewed.
The database design and development process was explained, with the relational database model reviewed in detail,
emphasizing its advantages and the importance of normalization.
Distributed data processing was reviewed and the most common architectures, centralized, distributed with partition, and
distributed with replication, were explained.

Appendix

The Hierarchical Database Model

Figure 9-29 illustrates a hierarchical database structure, also termed the tree structure and a navigational database.
Figure 9-30 illustrates a portion of a hierarchical database.

Each relationship involves a set of parent and child files. Child files with the same parent are called siblings. A parent may
have one or more children. No child can have more than one parent. Parents have pointer fields that navigate the computer to
the related children. The top-most parent is termed the root segment, and the bottom or most-detailed child file is termed
the leaf segment.

Queries in this database structure are limited by the location of the desired data. The only way to access data at lower levels
in the tree is from the root and via the pointers down the navigational path. You cannot move across the structure, only up
and down.

Limitations of this structure include that each "leaf" or "child" can have only one "root" or "parent", while any parent can
have one or more child records. This creates constraints within the system since not all possible data combinations may be
permitted.

Figure 9-32 shows the data structures and linkage between files for the database pictured in Figure 9-30. This example
utilizes a hierarchical indexed direct access method. The customer root file is organized as an index file with only summary
information. Lower-level files use pointers in a linked-list arrangement. This organization allows for efficient updates to the
root file and direct queries to the child files.

The sales invoice file is a randomly organized file arranged as a series of linked lists, each of which belongs to one customer. Each
record ends with a pointer to the next record. A query follows the link path until the desired record is found.

The Network Database Model.

Figure 9-31 illustrates a simple network structure.

The network model is another navigational model, similar to the hierarchical model except that a child record may have
multiple parents, permitting more flexibility within the system by allowing more than one path to get to a record. For
example, a business may track the data according to salesperson or according to customer. This provides much greater flexibility
than the hierarchical model.

Figure 9-33 illustrates the data structures that could be utilized for the relationships depicted in Figure 9-31. There are many
more linked records in this model because each child is allowed to have many parents.

The Data Normalization Process

This section has been provided as additional support in the understanding of the process required to reach normalization in
the 3rd normal form. Figure 9-34 depicts an unnormalized table of student enrollment data for a university registrar.

The normalization process attempts to reduce a complex table into a series of linked simple tables that meet two conditions:
All non-key attributes in the table are dependent upon the primary key (the primary key defines them).
All non-key attributes are independent of the other non-key attributes (there are no tables within a table).

Figure 9-35 graphically explains the steps needed to reach the third normal form. Figure 9-36 illustrates the relationship
between the candidate key and other attributes.
Student number has a 1:1 relationship with student and major
Student number has a 1:M relationship with course and all the course attributes

The following three steps must be taken:


All repeating groups of data must be removed. Figure 9-37 illustrates the removal of repeating groups in this example.
Repeating groups are multiple data values at the intersection of rows and columns. Removal of repeating groups
puts the tables in first normal form. In this example, this would require a table for students, and a table for courses,
as the attributes of each course are dependent on that course. This still leaves a great amount of data redundancy,
which may lead to update, insertion, and deletion anomalies.
All partial dependencies must be removed. Figure 9-38 illustrates the partial dependency in this situation and Figure 9-
39 depicts the solution for its removal. Partial dependencies occur in tables that have a composite primary key
where some of the remaining attributes are not dependent on BOTH of the attributes in the composite
key. Removal of partial dependencies puts the tables in second normal form. In this example, this requires
separating the student's grade and the course attributes into two tables.
All transitive dependencies must be removed. Transitive dependencies are non-key attributes that are dependent upon
other non-key attributes. Figure 9-40 and 9-41 illustrate the transitive relationships and their removal.
o Update anomaly: the instructor attributes are defined not by the course that the instructor teaches, but by the
instructor. These need to be separated into a separate table, or there will be an update anomaly.
o Insertion anomaly: The registrar cannot add information about an instructor unless they are teaching a course.
Instructors need to be separated from courses.
o Deletion anomaly: if the instructor and course were in the same table, and the course were canceled, that might
also erase the instructor! Separating the course from the instructor into two tables eliminates this risk.
Figure 9-42 illustrates the resulting four tables and their links. This set is in 3rd normal form. The tables are Student,
Student-Grade, Course, and Instructor. The Student-Grade table has a composite key of Student Number and Course and
can be accessed by either attribute or by both. A brief SQL sketch of this structure follows.
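The following is a minimal SQL sketch of those four tables with assumed attribute names, showing the composite key in the Student-Grade table and the foreign-key links.

    CREATE TABLE Student (
        student_number  INTEGER      PRIMARY KEY,
        student_name    VARCHAR(40),
        major           VARCHAR(20)
    );

    CREATE TABLE Instructor (
        instructor_id   INTEGER      PRIMARY KEY,
        instructor_name VARCHAR(40),
        office          VARCHAR(20)
    );

    CREATE TABLE Course (
        course_number   VARCHAR(10)  PRIMARY KEY,
        course_title    VARCHAR(40),
        instructor_id   INTEGER REFERENCES Instructor (instructor_id)
    );

    -- One row per student per course: composite primary key.
    CREATE TABLE StudentGrade (
        student_number  INTEGER      REFERENCES Student (student_number),
        course_number   VARCHAR(10)  REFERENCES Course (course_number),
        grade           CHAR(2),
        PRIMARY KEY (student_number, course_number)
    );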