2- Atomicity: Transactions cannot be divided or split. Transactions either succeed or fail as a single unit.
3- Consistency: Databases only store data that is consistent with the rules defined for storing data. For instance, if an inventory table isn't allowed to have
negative quantities, then no rows will be allowed to have negative quantities.
4- Isolation: Transactions are independent of the outside world. No outside change can cause an unexpected change within our transaction. When making
a sale, no other sales can change the quantity in inventory until this sale has been completed.
5- Durability: Once a database transaction commits, its changes are stored permanently and survive a later system failure.
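The inventory rule above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module and two hypothetical tables, inventory and sales, that do not come from these notes. A CHECK constraint stands in for the "no negative quantities" rule, and the failed sale is rolled back as a single unit.

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    " item TEXT PRIMARY KEY,"
    " quantity INTEGER NOT NULL CHECK (quantity >= 0))"  # consistency rule
)
conn.execute("CREATE TABLE sales (item TEXT, units INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('widget', 5)")
conn.commit()


def sell(conn, item, units):
    """Record a sale and decrement stock as one atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO sales VALUES (?, ?)", (item, units))
            conn.execute(
                "UPDATE inventory SET quantity = quantity - ? WHERE item = ?",
                (units, item),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative quantity.
        return False


ok_sale = sell(conn, "widget", 3)    # succeeds, 2 widgets left
bad_sale = sell(conn, "widget", 10)  # would go negative, so it is rolled back

remaining = conn.execute(
    "SELECT quantity FROM inventory WHERE item = 'widget'"
).fetchone()[0]
sales_rows = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(ok_sale, bad_sale, remaining, sales_rows)  # True False 2 1
```

The second sale leaves no row in sales even though its INSERT ran before the failing UPDATE, which is the "succeed or fail as a single unit" behaviour described under Atomicity.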
6- Transaction Isolation Levels
7- The first is called read uncommitted. It is the fastest, but least safe level. Transactions run with this isolation level are not guaranteed to occur
independently of other transactions. This means that one sale transaction is allowed to "dirty read" the results of other incomplete sale transactions. (A
"dirty read" is when the database gives you data written by another transaction that has not yet committed.) This isolation level is only used in certain
circumstances where you don't necessarily care if the data is incorrect.
8- The next transaction level, called read committed, is slightly safer than read uncommitted. Dirty reads are not possible under this isolation level. However,
if you run a query at the beginning of your transaction and run the same query at the end of the transaction, the results may differ if another
transaction commits a change in the middle of your transaction.
9- Repeatable read is the default transaction isolation level in some databases, such as MySQL's InnoDB engine. In this isolation level, you cannot get dirty
reads, and rows you have already read will not appear changed by other transactions during your transaction.
10- The highest level of safety, and the slowest transaction level, is serializable. Under this isolation level, transactions behave as if they were executed
one after another, in sequence. This isolation level can cause performance issues, since your transaction might have to wait for someone else's transaction to complete.
11- SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
12- Transactions are usually used in applications or in stored procedures. Your application will begin a transaction, then execute a query.
13- Combining Tables with a Join
A JOIN clause is used to combine rows from two or more tables, based on a related column between them.
(INNER) JOIN: Returns records that have matching values in both tables
LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table
RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table
FULL (OUTER) JOIN: Returns all records when there is a match in either the left or the right table
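The difference between the join types is easiest to see on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and their rows are made up for illustration. Only INNER and LEFT joins are shown, because older SQLite versions do not support RIGHT and FULL joins.

```python
import sqlite3

# Hypothetical tables: customer 3 ('Cy') has no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute(
    "SELECT c.name, o.total FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id "
    "ORDER BY c.name, o.total"
).fetchall()

# LEFT JOIN: every customer; order columns are NULL when nothing matches.
left_rows = conn.execute(
    "SELECT c.name, o.total FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id "
    "ORDER BY c.name, o.total"
).fetchall()

print(inner_rows)  # [('Ada', 25.0), ('Ada', 40.0), ('Bob', 15.0)]
print(left_rows)   # same rows plus ('Cy', None)
```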
b.) Buffer Manager - A program module responsible for fetching data from disk storage into main memory and deciding which data should be kept in
cache memory.
c.) Transaction Manager - A program module ensuring that the database remains in a consistent state even after system failures, and that concurrent
transactions execute without conflicting.
d.) File Manager - A program module that manages space allocation on disk and the data structures used to represent information on disk.
List some advantages of DBMS.
DBMS is useful in:
1. Controlling redundancy
2. Restricting unauthorised access
3. Providing backup & recovery
4. Providing multiple user interfaces
5. Enforcing integrity constraints
Explain the various types of normalization.
When we create a database, we include all the required columns in it, but we often see that this leaves a lot of redundant data. To get rid of this
redundant data, the table is split, and this process is called normalization.
1. First Normal Form (1NF) - A relation is in 1NF when all of its underlying domains contain atomic values only. After 1NF, it is still possible for
the table to contain some redundant data.
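As a rough illustration of atomic values, the sketch below takes made-up rows whose phone column holds a comma-separated list and rewrites them so each row carries exactly one phone number. The row layout and the sample data are assumptions, not part of these notes.

```python
# Hypothetical unnormalized rows: (customer_id, name, phones as one string).
raw_rows = [
    (1, "Ada", "555-0100, 555-0101"),
    (2, "Bob", "555-0199"),
]

# 1NF: one atomic phone value per row instead of a list packed into a column.
atomic_rows = [
    (customer_id, name, phone.strip())
    for customer_id, name, phones in raw_rows
    for phone in phones.split(",")
]

for row in atomic_rows:
    print(row)
# (1, 'Ada', '555-0100')
# (1, 'Ada', '555-0101')
# (2, 'Bob', '555-0199')
```

Note that the name 'Ada' now appears twice. This is the kind of redundancy that can remain after 1NF.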
Extension is the set of tuples present in a table at any instance, so it changes over time, while Intension is a constant value that provides the name and
structure of the table and the constraints on it.
Data independence implies that a modification in the schema definition at one level does not affect the schema definition at the next higher level.
a.) Physical Data Independence: Any change in physical level does not affect the logical level.
b.) Logical Data Independence: Changes made at logical level do not affect the view level.
Explain a. Entity b. Entity Type c. Entity Set
a. Entity - A thing that has an independent existence is called an entity.
b. Entity type - Collection of entities with same attributes.
c. Entity set - Collection of entities of a particular type in the database.
Explain a.) Partial key b.) Alternate key c.) Artificial key d.) Compound key e.) Natural key
a.) Partial Key: Also referred to as a Discriminator, a partial key is a set of attributes that can uniquely identify weak entities related to the same owner entity.
b.) Alternate Key: Any Candidate Key other than the one chosen as the Primary Key.
c.) Artificial Key: When there is no obvious key available, either stand-alone or compound, we create a new key called an artificial key. This
is done by assigning a unique number to each record or occurrence.
d.) Compound Key: When it is not possible for a single data element to uniquely identify the occurrences in a construct, a combination of multiple
elements is used to create an identifier which is unique. This unique identifier is known as a Compound Key.
e.) Natural Key: When a data element that is already stored in a construct is used as the primary key, it is called a natural key.
Explain Phantom Deadlock.
- During the process of distributed deadlock detection, a delay sometimes occurs while propagating local information, and this leads deadlock
detection algorithms to report deadlocks that do not actually occur.
- These deadlocks are referred to as phantom deadlocks, as they do not really exist.
- Phantom deadlocks lead to unnecessary aborts.
Explain the following frequently used terms in Database.
1. Field - An area within a record that is reserved for a specific piece of data.
2. Record - Collection of values / fields of a specific entity. E.g. Customer, Account etc.
3. Table - Collection of records of a specific type. E.g. Customer Table, Phone numbers table etc.
What happens when Shared and Exclusive locks are applied on data item?
- If a shared lock is applied on a data item, other transactions cannot write to it. They can only read the item.
- If an exclusive lock is applied on a data item, other transactions can neither read nor write the data item.
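These two rules amount to a small compatibility table. The sketch below is a simplified model, not how any particular database implements its lock manager. The function name can_grant and the mode labels "S" (shared) and "X" (exclusive) are chosen here for illustration.

```python
# (lock already held by another transaction, lock being requested) -> allowed?
COMPATIBLE = {
    ("S", "S"): True,   # many readers may share an item
    ("S", "X"): False,  # a writer must wait for readers
    ("X", "S"): False,  # a reader must wait for the writer
    ("X", "X"): False,  # only one writer at a time
}


def can_grant(requested, held_by_others):
    """Return True if `requested` is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)


print(can_grant("S", ["S", "S"]))  # True: reading alongside other readers
print(can_grant("X", ["S"]))       # False: cannot write while others read
print(can_grant("S", ["X"]))       # False: cannot read an exclusively locked item
print(can_grant("X", []))          # True: nobody else holds a lock
```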
List the properties of a transaction.
Properties of a database transaction are recognized by the acronym - ACID.
1. Atomicity
2. Consistency
3. Isolation
4. Durability
What is a view?
- A view is a virtual table.
- A table contains real data while a view just contains queries that are capable of dynamically retrieving the data when used.
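The "virtual table" idea can be seen in a short sketch using Python's built-in sqlite3 module. The employees table and the eng_staff view are hypothetical names. The view stores only its query, so a row added to the table afterwards shows up the next time the view is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL);
    INSERT INTO employees VALUES (1, 'Ada', 'eng', 120.0), (2, 'Bob', 'ops', 90.0);
    -- The view stores this SELECT, not a copy of the rows; salary is not exposed.
    CREATE VIEW eng_staff AS SELECT id, name FROM employees WHERE dept = 'eng';
""")

before = conn.execute("SELECT name FROM eng_staff ORDER BY id").fetchall()

# Change the underlying table; the view is not touched.
conn.execute("INSERT INTO employees VALUES (3, 'Cy', 'eng', 110.0)")

after = conn.execute("SELECT name FROM eng_staff ORDER BY id").fetchall()

print(before)  # [('Ada',)]
print(after)   # [('Ada',), ('Cy',)]
```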
What is a materialized view?
- Materialized views are disk based views.
- Materialized views get updated periodically based on the interval specified in the query.
- Materialized views can be indexed.
What are the advantages of views in a database?
Advantages of views in a database are:
1. A view does not store its data at a separate physical location. This saves resources while still producing the output.
2. Since data insertion, update and deletion are restricted or not possible through many views, a view puts a restriction on access.
What are the disadvantages of views in a database?
Disadvantages of views in a database are:
- A clustered index reorders the way records in the table are physically stored.
- There can be only one clustered index per table.
- It makes data retrieval faster.
Non-clustered Index
- A non-clustered index does not change the way the records are stored but creates a completely separate object alongside the table that points back to the rows.
- Because the table rows are not physically reordered, insert and update commands can work faster than with a clustered index.
What are B-trees?
- These are data structures in the form of trees that store sorted data.
- They allow searches, insertions, sequential access and deletions to be carried out in logarithmic time.
What is Table Scan and Index Scan?
- Table Scan - Iterating over the table rows
- Index Scan - Iterating over the index items
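One way to see which access path a query takes is to ask the database for its plan. The sketch below assumes SQLite through Python's sqlite3 module, with a hypothetical customer table and index name. SQLite labels a full pass over the table as SCAN and an index lookup as SEARCH ... USING INDEX. The exact wording of the plan differs between SQLite versions, so no output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customer (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return SQLite's query plan as one string (last column holds the detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT name FROM customer WHERE email = ?"
args = ("user42@example.com",)

# No index on email yet: every table row has to be visited.
plan_without_index = plan(conn, query, args)

conn.execute("CREATE INDEX idx_customer_email ON customer (email)")

# With the index, SQLite can walk the index instead of the whole table.
plan_with_index = plan(conn, query, args)

print(plan_without_index)
print(plan_with_index)
```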
Explain Database partitioning. What is its importance?
Database partitioning refers to dividing the large database into small logical units. It helps in improving the management, availability & performance
of the system.
Fact tables are central tables in data warehousing. They contain the aggregate values that are used in business processes. Dimension tables describe the attributes
of the fact table.
Fact Tables: These tables track a specific activity. They sit at the center of a star schema.
For example :
Changes to email messages while they are being processed by the contact center are tracked by the fact table requestEvent. The movement of email messages through
the contact center is tracked by the fact table queueEvent.
Dimension Table: These tables are the points of the star and contain the entity attributes. For example, the customer dimension holds one record for each customer
who interacted with the contact center, with all the information known about that customer.
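A star-schema query typically joins a fact table to its dimensions and aggregates the facts. The following is a minimal sketch with Python's sqlite3 module. The table names customer_dim and request_event_fact, their columns, and the rows are invented for illustration and are not the actual contact center schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: one row per customer, holding descriptive attributes.
    CREATE TABLE customer_dim (
        customer_key INTEGER PRIMARY KEY,
        name TEXT,
        region TEXT
    );
    -- Fact: one row per tracked event, pointing at the dimension.
    CREATE TABLE request_event_fact (
        event_id INTEGER PRIMARY KEY,
        customer_key INTEGER REFERENCES customer_dim (customer_key),
        handling_seconds INTEGER
    );
    INSERT INTO customer_dim VALUES (1, 'Ada', 'north'), (2, 'Bob', 'south');
    INSERT INTO request_event_fact VALUES (100, 1, 30), (101, 1, 50), (102, 2, 20);
""")

# Aggregate the facts, grouped by a dimension attribute.
per_region = conn.execute(
    "SELECT d.region, COUNT(*), SUM(f.handling_seconds) "
    "FROM request_event_fact f "
    "JOIN customer_dim d ON d.customer_key = f.customer_key "
    "GROUP BY d.region ORDER BY d.region"
).fetchall()

print(per_region)  # [('north', 2, 80), ('south', 1, 20)]
```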
1) Extraction: In this phase, data is extracted from the source and loaded into a structure of the data warehouse.
2) Transformation: After extraction, a cleaning process takes place so that the data can be analyzed better.
- ETL is an important component in data warehousing architecture. The data from operational applications is copied into the data warehouse staging area, and from
the staging area into the data warehouse. Ultimately the data from the data warehouse is placed into a set of conformed data marts, where it can be
accessed.
- ETL software extracts data, transforms the values of inconsistent data, cleanses bad data, filters data and loads the data into the destination database.
If one ETL job fails, the remaining ETL jobs must respond appropriately.
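The extract, transform and load steps can be sketched end to end in a few lines. This is a toy example, not a real ETL tool. The source rows, the customer_email table and the function names are assumptions made for illustration.

```python
import sqlite3

# Hypothetical source data with inconsistent and bad values.
source_rows = [
    {"id": "1", "email": " ADA@Example.com "},
    {"id": "2", "email": ""},                  # bad row: no email
    {"id": "3", "email": "bob@example.com"},
]


def extract():
    """Extract: read rows from the operational source."""
    return list(source_rows)


def transform(rows):
    """Transform: normalize inconsistent values and filter out bad rows."""
    cleaned = []
    for row in rows:
        email = row["email"].strip().lower()
        if not email:
            continue  # drop rows that cannot be repaired
        cleaned.append((int(row["id"]), email))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the destination in one transaction."""
    with conn:
        conn.executemany("INSERT INTO customer_email VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_email (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
load(conn, transform(extract()))

loaded = conn.execute("SELECT id, email FROM customer_email ORDER BY id").fetchall()
print(loaded)  # [(1, 'ada@example.com'), (3, 'bob@example.com')]
```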
Data mining is a process of analyzing current data and summarizing the information in a more useful manner. It is useful in analyzing how well a business is doing
and also helps in forecasting business status.
- Pattern extraction from data is a process called data mining. Because the volume of data doubles over a period of time, transforming that data into information
is the prime motivation for data mining. It is used in profiling practices such as marketing, surveillance, fraud detection and scientific discovery.
- It is important to be aware that using non-representative samples of data may produce results that are not indicative of the domain. Data mining will also not
find patterns that are present in the domain but absent from the sample being mined. Verification and validation of the patterns on other samples of data is an
important process in data mining.
- An index can be thought of as the index of a book, which is used for fast retrieval of information.
- An index uses one or more column index keys and pointers to the records to locate a record.
- An index is used to speed up query performance.
- The kinds of indexes are clustered and non-clustered.
- Both kinds exist as B-tree structures.
An index is a way to order the records in a database according to the field values. It provides fast access to particular information. Indexes are created
on the columns that are queried frequently.
- Indexes improve query performance but slow down data modification operations.
- Indexes consume disk space.
An index of a database is a data object or structure which is used to improve the speed of operations on a specific table. An index is built on a subset of the
columns of a table. The values in that subset are stored in sorted order, so the database server can quickly find records based on the data in the index.
1) Clustered index
2) Non-clustered
Clustered index
Non-clustered
- A non-clustered index is an index in which the logical order doesn't match the physical order of the stored data on disk.
- The leaf level of a non-clustered index contains index keys that point to the table records.
- There can be one or more non-clustered indexes in a table.
Types of indexes.
1. Clustered: A clustered index sorts and stores the data rows of a table / view based on the order of the clustered index key. The clustered index is implemented
as a B-tree index structure.
2. Nonclustered: A nonclustered index can be built on top of a clustered index. Each index row in the nonclustered index has a nonclustered key value and a row
locator. The locator points to the data row in the clustered index that has the key value.
3. Unique: A unique index ensures that the index key contains no duplicate values, and therefore every row is unique with respect to the key.
4. Full-text: It supports efficient searching of words in string data. This type of index is available only in certain database managers.
5. Spatial: It makes it possible to perform operations on spatial objects efficiently. To use it, the column should be of geometry type.
6. Filtered: A nonclustered index optimized for queries that select from a well-defined subset of data. A filter predicate selects the portion of rows in the
table to be indexed.
An ER diagram is a conceptual and abstract representation of data and entities. ER modeling is a data modeling process that produces a conceptual schema of a system,
often an RDBMS, and its requirements in a top-down fashion. ER diagrams are derived by using this model.