
Birla Institute of Technology & Science, Pilani

Work-Integrated Learning Programmes Division


Second Semester 2010-2011

Mid-Semester Test (EC-1 Regular)
Solution

Course No. : SS ZG515
Course Title : DATA WAREHOUSING
Nature of Exam : Closed Book
Weightage : 40%
Duration : 2 Hours
Date of Exam : 05/02/2011 (AN)
1. How are the top-down and bottom-up approaches for building a data warehouse
different? Discuss the merits and disadvantages of each approach.
[7]
Ans :
The top down approach

Bill Inmon saw a need to transfer data from diverse OLTP systems into a
centralized place where the data could be used for analysis. He insisted that
data should be organized into subject-oriented, integrated, non-volatile and
time-variant structures. The data should be accessible at detailed atomic levels
by drilling down or at summarized levels by rolling up. The data marts are
treated as subsets of the data warehouse. Each data mart is built for an
individual department and is optimized for the analysis needs of the particular
department for which it is created.

The Bottom-Up Approach

Ralph Kimball designed the data warehouse with the data marts connected to
it through a bus structure.
The bus structure contains all the common elements used by the data
marts, such as conformed dimensions and measures, defined for the enterprise
as a whole. He felt that by using these conformed elements, users can query
all data marts together. This architecture makes the data warehouse more of a
virtual entity than a physical one. All data marts could be located on one
server or on different servers across the enterprise, while the
data warehouse would be a virtual entity, nothing more than the sum total
of all the data marts.
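
The idea of querying several data marts together through conformed dimensions can be sketched as follows. This is only an illustrative sketch: the product dimension, the two marts, and every name and value in it are hypothetical and do not come from the exam paper.

```python
# Hypothetical conformed dimension shared by both data marts.
product_dim = {
    1: {"name": "Soap", "category": "Personal Care"},
    2: {"name": "Shampoo", "category": "Personal Care"},
    3: {"name": "Rice", "category": "Grocery"},
}

# Two separate data marts; each fact row carries the same product key.
sales_mart = [
    {"product_key": 1, "sales_amount": 120.0},
    {"product_key": 2, "sales_amount": 80.0},
    {"product_key": 3, "sales_amount": 200.0},
]
inventory_mart = [
    {"product_key": 1, "quantity_on_hand": 30},
    {"product_key": 2, "quantity_on_hand": 10},
    {"product_key": 3, "quantity_on_hand": 50},
]


def total_by_category(fact_rows, measure):
    """Sum one measure per category using the conformed product dimension."""
    totals = {}
    for row in fact_rows:
        category = product_dim[row["product_key"]]["category"]
        totals[category] = totals.get(category, 0) + row[measure]
    return totals


def drill_across():
    """Query each mart separately, then align the results on category."""
    sales = total_by_category(sales_mart, "sales_amount")
    stock = total_by_category(inventory_mart, "quantity_on_hand")
    return {
        category: {
            "sales_amount": sales.get(category, 0),
            "quantity_on_hand": stock.get(category, 0),
        }
        for category in sorted(set(sales) | set(stock))
    }


report = drill_across()
print(report)
```

Because both marts use the same product keys and category values, their results line up without any reconciliation step; with non-conformed dimensions this alignment would not be reliable.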

Advantages of Top-Down
a. A truly corporate effort, an enterprise view of data
b. Inherently architected, not a union of disparate DMs
c. Single, central storage of data about the content
d. Central rules and control
e. May be developed fast using iterative approach
Disadvantages of Top-Down
f. Takes longer to build even with iterative method
g. High exposure/risk to failure
h. Needs high level of cross functional skills
i. High outlay without proof of concept
j. Difficult to sell this approach to senior management and sponsors
Advantages of Bottom-Up Approach
k. Faster and easier implementation of manageable pieces
l. Favorable ROI and proof of concept
m. Less risk of failure
n. Inherently incremental; can schedule important DMs first
o. Allows project team to learn and grow
Disadvantages of Bottom-Up Approach
p. Each DM has its own narrow view of data
q. Redundant data pervades every DM
r. Difficult to integrate if the overall requirements are not considered in the
beginning

2. Why is it difficult to capture the effectiveness of the promotion dimension? Why
is the promotion dimension not suitable for the inventory data mart?
[6]
Ans :
Promotion conditions include temporary price reductions (TPRs), end-aisle displays,
newspaper ads, and coupons; combinations of these are common. Promotions are
judged on the basis of: [2]
Lift and baseline sales
Time shifting
Cannibalization
Growing the market
Each of these compares actual sales with sales that were never directly observed
(an estimated baseline, other periods, or other products), so it is difficult to
capture the effect of a promotion. [4]
The attributes of the promotion dimension describe typical promotions (discount,
brokerage, joint venture, etc.). These are business dimensions, so there is no need
to include them in the inventory data mart. [2]
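
A small worked sketch of these measures may help. The formulas below are a simplifying assumption made for illustration (lift taken as actual sales minus an estimated baseline, then reduced by time-shifted and cannibalized sales), and all figures are hypothetical.

```python
def promotion_lift(actual_sales, baseline_sales):
    """Lift: sales during the promotion above the estimated baseline."""
    return actual_sales - baseline_sales


def net_promotion_effect(lift, time_shifted_sales, cannibalized_sales):
    """Lift that remains after removing sales merely moved in time
    or taken from other products."""
    return lift - time_shifted_sales - cannibalized_sales


# Hypothetical figures for one promoted product.
actual = 1500        # units sold during the promotion
baseline = 1000      # estimated units that would have sold anyway
time_shifted = 150   # units customers would have bought later
cannibalized = 200   # units lost by competing products in the store

lift = promotion_lift(actual, baseline)
net = net_promotion_effect(lift, time_shifted, cannibalized)
print(lift, net)
```

Only `actual` is read directly from the sales fact table. The baseline, time-shifted, and cannibalized figures are estimates, which is one way to see why promotion effectiveness is hard to capture.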

3. Explain the importance of grain in a fact table. How are fact tables sparse? Why
do we not store GMROI in the inventory fact table? [8]
Ans :
Grain is the level of detail of the measurements or metrics in the fact table. [2]
Sparse fact table: for example, in a sales fact table, if there is a holiday or no orders are
received or processed, then the corresponding rows are empty (null). [4]
GMROI (Gross Margin Return On Inventory) is not additive and, therefore, is not
stored in the enhanced fact table. GMROI is calculated from its constituent columns. [2]
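
The non-additivity of GMROI can be checked with a short sketch. It assumes a simplified definition (gross margin divided by average inventory at cost); the column names and figures are hypothetical.

```python
# Hypothetical inventory fact rows holding only additive constituent columns.
fact_rows = [
    {"product": "A", "gross_margin": 100.0, "avg_inventory_cost": 50.0},
    {"product": "B", "gross_margin": 30.0, "avg_inventory_cost": 60.0},
]


def gmroi(gross_margin, avg_inventory_cost):
    """Simplified GMROI: gross margin per unit of inventory cost."""
    return gross_margin / avg_inventory_cost


# Wrong: adding up per-row ratios as if GMROI were an additive fact.
sum_of_ratios = sum(
    gmroi(r["gross_margin"], r["avg_inventory_cost"]) for r in fact_rows
)

# Right: add the constituent columns first, then compute the ratio.
total_margin = sum(r["gross_margin"] for r in fact_rows)
total_inventory = sum(r["avg_inventory_cost"] for r in fact_rows)
gmroi_total = gmroi(total_margin, total_inventory)

print(sum_of_ratios, gmroi_total)
```

The two results differ, so a stored GMROI column could not be summed across products or periods; keeping the constituent columns lets the ratio be computed at any level of aggregation.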

4. Why is it important to transform the data before it is loaded into the data
warehouse? List any five types of transformations that are performed. Give an
example for each. [6]
Ans :

Source systems are very diverse and disparate.
Many systems are old legacy systems running on obsolete databases.
Historical changes in data values are not preserved in source operational systems.
Source operational systems lack consistency.
Data changes according to new business conditions. [3]

Major transformation processes that are performed: [3]
1) Format revisions
2) Decoding of fields
3) Character set conversion
4) Key restructuring
5) De-duplication
6) Merging of information
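
Five of the transformations listed above can be illustrated with a minimal sketch. All function names, code tables, and sample values are hypothetical and chosen only for illustration; `cp037` is the name Python uses for one EBCDIC code page.

```python
import itertools


def revise_format(date_text):
    """Format revision: 'DD/MM/YYYY' text becomes ISO 'YYYY-MM-DD'."""
    day, month, year = date_text.split("/")
    return f"{year}-{month}-{day}"


# Hypothetical code table used by one source system.
GENDER_CODES = {"1": "Male", "2": "Female"}


def decode_field(code):
    """Decoding of fields: replace a cryptic code with a readable value."""
    return GENDER_CODES.get(code, "Unknown")


def convert_charset(raw_bytes, source_encoding="cp037"):
    """Character set conversion: EBCDIC bytes become ordinary text."""
    return raw_bytes.decode(source_encoding)


_surrogate_keys = {}
_next_key = itertools.count(1)


def restructure_key(production_key):
    """Key restructuring: map a key with built-in meaning to a generic key."""
    if production_key not in _surrogate_keys:
        _surrogate_keys[production_key] = next(_next_key)
    return _surrogate_keys[production_key]


def deduplicate(customer_names):
    """De-duplication: keep the first of names that differ only in
    letter case or spacing."""
    seen = set()
    unique = []
    for name in customer_names:
        normalized = " ".join(name.split()).lower()
        if normalized not in seen:
            seen.add(normalized)
            unique.append(name)
    return unique


print(revise_format("05/02/2011"))
print(decode_field("2"))
print(convert_charset("DATA".encode("cp037")))
print(restructure_key("W01-4711"))
print(deduplicate(["Asha Rao", "asha  rao", "R. Mehta"]))
```

Real transformation routines are usually driven by metadata and mapping tables rather than hard-coded rules; the hard-coded versions here only show what each type of transformation does to a value.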

5. Explain the difference between destructive merge and constructive merge for
applying data to the data warehouse repository. When do you use these modes?
[6]
Ans :
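Both are modes of the data loading component.
Destructive merge:
Incoming data is applied to the target data. If the primary key of an incoming
record matches an existing record, the matching record is updated; an incoming
record without a match is simply added to the target.
Constructive merge:
If the primary key of an incoming record matches an existing record, the existing
record is kept and the incoming record is added and marked as superseding the old one. [4]
These modes are used in the data loading and append process. Destructive merge
suits data for which only the latest values are needed; constructive merge suits
data for which the earlier values must be retained. [2]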
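
The two modes can be contrasted with a minimal sketch. It assumes the target is held in memory as a dictionary keyed by primary key; the record layout and the `current` flag are hypothetical.

```python
def destructive_merge(target, incoming):
    """Overwrite records whose key matches; add records with no match."""
    for key, record in incoming.items():
        target[key] = record
    return target


def constructive_merge(target, incoming):
    """Keep old versions; add the incoming record marked as superseding."""
    for key, record in incoming.items():
        versions = target.setdefault(key, [])
        for version in versions:
            version["current"] = False  # the old record is superseded, not lost
        versions.append({"data": record, "current": True})
    return target


incoming = {101: {"city": "Pune"}, 102: {"city": "Delhi"}}

# Destructive merge: the old value for key 101 is lost.
destructive_target = {101: {"city": "Pilani"}}
destructive_merge(destructive_target, incoming)

# Constructive merge: the old value for key 101 is kept as history.
constructive_target = {101: [{"data": {"city": "Pilani"}, "current": True}]}
constructive_merge(constructive_target, incoming)

print(destructive_target)
print(constructive_target)
```

After the destructive merge, key 101 holds only the new city. After the constructive merge, it holds both versions, with only the newest flagged as current.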

6. Discuss the architectural components of the data warehouse. In which
architectural component does OLAP fit in? What is the function of OLAP?
[7]
The data warehouse architectural components are as follows:
Data Content
Complex Analysis and Quick Response
Metadata-driven
At the Data Source - The data staging architectural component governs the
transformation, cleansing, and integration of data.



Data Source Layer
This represents the different data sources that feed data into the data warehouse. The data
source can be of any format: a plain text file, a relational database, another type of database,
an Excel file, and so on can all act as a data source.
Many different types of data can be a data source:
Operations -- such as sales data, HR data, product data, inventory data, marketing data,
systems data.
Web server logs with user browsing data.
Internal market research data.
Third-party data, such as census data, demographics data, or survey data.
All these data sources together form the Data Source Layer.
Data Extraction Layer
Data gets pulled from the data source into the data warehouse system. There is likely
some minimal data cleansing, but there is unlikely to be any major data transformation.
Staging Area
This is where data sits prior to being scrubbed and transformed into a data warehouse /
data mart. Having one common area makes it easier for subsequent data processing /
integration.
ETL Layer
This is where data gains its "intelligence", as logic is applied to transform the data from a
transactional nature to an analytical nature. This layer is also where data cleansing
happens.
Data Storage Layer
This is where the transformed and cleansed data sit. Based on scope and functionality, 3
types of entities can be found here: data warehouse, data mart, and operational data store
(ODS). In any given system, you may have just one of the three, two of the three, or all
three types.
Data Logic Layer
This is where business rules are stored. Business rules stored here do not affect the
underlying data transformation rules, but do affect what the report looks like.
Data Presentation Layer
This refers to the information that reaches the users. This can be in the form of a tabular /
graphical report in a browser, an emailed report that gets automatically generated and
sent every day, or an alert that warns users of exceptions, among others.
Metadata Layer
This is where information about the data stored in the data warehouse system is stored. A
logical data model would be an example of something that's in the metadata layer.
System Operations Layer
This layer includes information on how the data warehouse system operates, such as ETL
job status, system performance, and user access history.
OLAP is used by decision support system (DSS) tools that apply multidimensional data
analysis techniques. Its functions and characteristics include:
Support for a DSS data store
Data extraction and integration filter
Specialized presentation interface
Multi Dimensional Data analysis
Advanced Database Support
Easy to use end user interfaces
Support Client/Server Architecture
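
Multidimensional data analysis, as listed above, can be illustrated with a minimal in-memory sketch. The dimensions, members, and sales values are hypothetical, and a real OLAP server would rely on specialized storage and pre-aggregation rather than a plain dictionary.

```python
from collections import defaultdict

# Hypothetical cube: each cell is addressed by (year, region, product).
DIMENSIONS = ("year", "region", "product")
cells = {
    (2010, "North", "Soap"): 100,
    (2010, "South", "Soap"): 150,
    (2011, "North", "Soap"): 120,
    (2011, "North", "Rice"): 200,
}


def roll_up(cube, keep):
    """Aggregate sales over every dimension not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for coordinates, value in cube.items():
        totals[tuple(coordinates[p] for p in positions)] += value
    return dict(totals)


def slice_cube(cube, dimension, member):
    """Keep only the cells where one dimension equals the given member."""
    position = DIMENSIONS.index(dimension)
    return {c: v for c, v in cube.items() if c[position] == member}


by_year = roll_up(cells, ("year",))
by_year_region = roll_up(cells, ("year", "region"))  # drill down one level
north_only = slice_cube(cells, "region", "North")

print(by_year)
print(by_year_region)
print(north_only)
```

Moving from `by_year` to `by_year_region` is a drill-down, and the reverse is a roll-up; `slice_cube` fixes one dimension so that the remaining ones can be analyzed.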
