AGGREGATE - Applies aggregate functions to record sets to produce new output records
from aggregated values.
AUDIT - Adds package- and task-level metadata - such as machine name, execution instance,
package name, package ID, etc.
CHARACTER MAP - Performs string data changes at the SQL Server level, such as changing
data from lower case to upper case.
CONDITIONAL SPLIT - Separates the available input into separate output pipelines based on
Boolean expressions configured for each output.
COPY COLUMN - Adds a copy of a column to the output; we can later transform the copy,
keeping the original for auditing.
DATA CONVERSION - Converts column data types from one type to another. It provides
explicit column conversion.
DATA MINING QUERY - Used to perform data mining queries against Analysis Services and
manage prediction graphs and controls.
DERIVED COLUMN - Creates a new (computed) column from given expressions.
EXPORT COLUMN - Used to export an image-specific column from the database to a flat file.
FUZZY GROUPING - Used for data cleansing by finding rows that are likely duplicates.
FUZZY LOOKUP - Used for pattern matching and ranking based on fuzzy logic.
IMPORT COLUMN - Reads data from flat files into an image-specific column in the database.
LOOKUP - Performs a lookup (search) of a given reference data set against a data
source. It is used for exact matches only.
MERGE - Merges two sorted data sets into a single data set in a single data flow.
MERGE JOIN - Merges two data sets into a single data set using a join condition.
MULTICAST - Sends a copy of the supplied data source to multiple destinations.
ROW COUNT - Stores the resulting row count from the data flow / transformation into a
variable.
ROW SAMPLING - Captures sample data from the data flow, specified as a number of rows
or a percentage of the total rows.
UNION ALL - Merges multiple data sets into a single data set.
PIVOT - Used for normalization of data sources to reduce anomalies by converting rows into
columns.
UNPIVOT - Used for denormalizing the data structure by converting columns into rows, for
example when building data warehouses.
Q: How to log SSIS Executions?
SSIS includes logging features that write log entries when run-time events occur and can also
write custom messages. This is not enabled by default. Integration Services supports a diverse
set of log providers, and gives you the ability to create custom log providers. The Integration
Services log providers can write log entries to text files, SQL Server Profiler, SQL Server,
Windows Event Log, or XML files. Logs are associated with packages and are configured at the
package level. Each task or container in a package can log information to any package log.
The tasks and containers in a package can be enabled for logging even if the package itself is
not.
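When the SQL Server log provider is used, the log entries land in a table in the target database, which can be queried directly. A minimal sketch (dbo.sysssislog is the SSIS 2008 table name; SSIS 2005 used dbo.sysdtslog90):

```sql
-- Inspect errors and warnings from recent package executions
SELECT event, source, starttime, message
FROM dbo.sysssislog
WHERE event IN ('OnError', 'OnWarning')
ORDER BY starttime DESC;
```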
Q: How do you deploy SSIS packages?
Building an SSIS project produces a deployment manifest file. We need to run the manifest file
and decide whether to deploy onto the file system or onto SQL Server [msdb]. SQL Server
deployment is faster and more secure than file system deployment. Alternatively, we
can also import the package from SSMS, from the file system or SQL Server.
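Deployment can also be scripted with the dtutil command-line utility; a hedged sketch (the package path and name are illustrative):

```
rem Copy a file-system package into the msdb database of the local server
dtutil /FILE "C:\Packages\LoadSales.dtsx" /COPY SQL;LoadSales
```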
Q: What are variables and what is variable scope ?
Variables store values that an SSIS package and its containers, tasks, and event handlers can
use at run time. The scripts in the Script task and the Script component can also use
variables. The precedence constraints that sequence tasks and containers into a workflow can
use variables when their constraint definitions include expressions. Integration Services
supports two types of variables: user-defined variables and system variables. User-defined
variables are defined by package developers, and system variables are defined by Integration
Services. You can create as many user-defined variables as a package requires, but you
cannot create additional system variables.
Q: Can you name five of the Perfmon counters for SSIS and the value they provide?
SQLServer:SSIS Service
SSIS Package Instances
SQLServer:SSIS Pipeline
BLOB bytes read
BLOB bytes written
BLOB files in use
Buffer memory
Buffers in use
Buffers spooled
Flat buffer memory
Flat buffers in use
Private buffer memory
Private buffers in use
Rows read
Rows written
Q: How do I find the bottom 10 customers with the lowest sales in 2003 that were not null?
A: Simply using BottomCount will return customers with null sales. You will have to combine it
with NonEmpty or Filter.
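One way to write this against the Adventure Works cube (a sketch; the measure name and the 2003 year member are assumptions about the cube's design):

```mdx
SELECT {[Measures].[Internet Sales Amount]} ON 0,
BottomCount(
    NonEmpty([Customer].[Customer].[Customer].Members,
             [Measures].[Internet Sales Amount]),
    10,
    [Measures].[Internet Sales Amount]) ON 1
FROM [Adventure Works]
WHERE [Date].[Calendar].[Calendar Year].&[2003];
```

NonEmpty removes the customers with null sales in 2003 before BottomCount picks the lowest ten.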
By default Analysis Services returns members in an order specified during attribute design.
The attribute properties that define ordering are "OrderBy" and "OrderByAttribute". Let's say
we want to see order counts for each year. In Adventure Works, the MDX query would be:
SELECT {[Measures].[Reseller Order Quantity]} ON 0
, [Date].[Calendar].[Calendar Year].Members ON 1
FROM [Adventure Works];
Today I will share with you a list of the most frequently asked SSIS interview
questions, with answers.
These SSIS 2008 and SSIS 2012 interview questions will help you prepare better for your
interviews.
6. What is a Task?
A task is very much like a method in a programming language, representing or
carrying out an individual unit of work. There are broadly two categories of
tasks in SSIS: Control Flow tasks and Database Maintenance tasks. All Control
Flow tasks are operational in nature, except Data Flow tasks. Although there
are around 30 control flow tasks which you can use in your package, you can
also develop your own custom tasks with your choice of .NET programming
language.
9. What is a Transformation?
A transformation simply means bringing the data into a desired format. For
example, you are pulling data from the source and want to ensure only distinct
records are written to the destination, so duplicates are removed. Another
example: if you have master/reference data and want to pull only related data
from the source, you need some sort of lookup. There are around 30
transformation tasks available, and this can be extended further with
custom-built tasks if needed.
Question: How have you learnt SSRS (on the job, articles, books, conferences)?
Comment: The thing is that most people who read good books usually have an advantage over
those who haven't, because they know what they know and they know what they don't know
(but they know it exists and is available). Blogs/articles vary in quality, so reading
best-practice articles is a big plus; conferences can also be a plus.
Question: What languages have you used to query data for SSRS Reports?
Comment: Most answers will probably be SQL (T-SQL). But T-SQL is not the only query language.
If someone has built reports based on cubes before, then they will say MDX. You can also
query data from other sources (not recommended) or use data mining expressions (DMX =
advanced).
Question: What types of graphs do you normally use and what effects do you apply to them?
Comment: Good graphs are bar, line, scatter, and bullet graphs. Bad graphs are pie charts,
area graphs, and gauges (apart from the bullet graph, which is classified as a gauge in SSRS).
Effects should be limited to a minimum. Developers should avoid 3D effects, "glass" effects,
shadows, etc.
SSRS Advanced questions
Question: Have you used custom assemblies in SSRS? If yes, give an example.
Comment: Custom assemblies allow you to re-use code in reports; they are not very common.
Re-usability is good, but building dependencies is not so good: one small mistake might break
all reports using the assembly, so it should be used with care.
Question: What is the formatting code (format property) for a number with 2 decimal places?
Comment: N2. Attention to details and good memory is always welcome.
SSRS interview "narrowed" questions
In this section I will give you a fairly long list of short and narrowed questions:
Question: What does rdl stand for?
Answer: Report Definition Language
Question: How to deploy an SSRS Report?
Answer: Configure project properties (for multiple environments) and deploy from BIDS, upload
manually, or use rs.exe for command-line deployment.
Question: What is Report Manager?
Answer: A web-based tool that allows users to access and run reports.
Question: What is Report Builder?
Answer: Report Builder is a self-service tool for end users.
Question: What permission do you need to give to users to enable them to use Report Builder?
Answer: The "Report Builder" role and "system user". Report Builder should also be enabled in
the report server properties.
Question: What do you need to restore report server database on another machine?
Answer: SSRS Encryption key
Question: Can you create subscription using windows authentication?
Answer: No.
Question: What caching options do you have for reports on report server?
Answer: Do not cache, expire cache after x minutes, or expire on a schedule.
Question: How to find slow running reports?
Answer: Check the ExecutionLog view in the ReportServer database.
Question: How can you make calendar available in report parameter?
Answer: Set parameter data type to date.
Question: How to pass multi-valued parameter to stored procedure in dataset?
Answer: Use the Join function in SSRS and a split function in T-SQL.
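A hedged sketch of both halves (the parameter name CustomerIDs and the function name dbo.SplitList are illustrative; on SQL Server 2016+ the built-in STRING_SPLIT can replace the custom function):

```sql
-- SSRS side: pass the multi-value parameter to the dataset as one string:
--   =Join(Parameters!CustomerIDs.Value, ",")

-- T-SQL side: split the string back into rows
CREATE FUNCTION dbo.SplitList (@List nvarchar(max), @Delim nchar(1))
RETURNS @Items TABLE (Item nvarchar(4000))
AS
BEGIN
    DECLARE @Pos int = CHARINDEX(@Delim, @List);
    WHILE @Pos > 0
    BEGIN
        INSERT @Items VALUES (LEFT(@List, @Pos - 1));
        SET @List = SUBSTRING(@List, @Pos + 1, LEN(@List));
        SET @Pos = CHARINDEX(@Delim, @List);
    END;
    INSERT @Items VALUES (@List);
    RETURN;
END;
GO

-- Used in the stored procedure's WHERE clause:
-- WHERE CustomerID IN (SELECT Item FROM dbo.SplitList(@CustomerIDs, ','))
```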
Question: Which functions are used to pass parameter to MDX query?
Answer: StrToMember and StrToSet
Question: How to create "dependent" parameters "Make, Model, Year"?
Answer: They need to be in the correct order, and each previous parameter should filter the
next parameter's dataset. For instance, the Model dataset should be filtered using the Make
parameter.
Question: How to create "alternate row colour"?
Answer: use RowNumber and Mod function OR visit our tutorial.
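The alternate-row-colour technique can be sketched as a BackgroundColor expression on the data row (the colour names are illustrative):

```vb
=IIf(RowNumber(Nothing) Mod 2 = 0, "WhiteSmoke", "White")
```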
Question: How to create percentile function?
Answer: Custom code is required. Visit our tutorial for more details.
Question: How to create median function?
Answer: Custom code is required. Visit our tutorial for more details.
Question: How to make conditional sum in SSRS?
Answer: Use IIF inside SUM: if the condition is true, sum the field, else nothing. Visit our tutorial for more details.
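Such a conditional sum is written as a report expression; a minimal sketch (the field names Status and SalesAmount are illustrative):

```vb
=Sum(IIf(Fields!Status.Value = "Shipped", Fields!SalesAmount.Value, 0))
```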
Question: How to find a value in another dataset based on current dataset field (SSRS 2008 R2)?
Answer: Use lookup function.
Question: How to change parameter value inside the report?
Answer: Set action. "Jump to itself" and pass different value for the parameter.
Question: How to identify current user in SSRS Report?
Answer: User!UserID
-----------------------------------------------------------------------------------------------------
Q1. What is SQL Server Reporting Services (SSRS)?
SQL Server Reporting Services is a server-based reporting platform that you can use to create
and manage tabular, matrix, graphical, and free-form reports that contain data from relational
and multidimensional data sources. The reports that you create can be viewed and managed
over a World Wide Web-based connection.
-Admin
Q3. What are the three stages of Enterprise Reporting Life Cycle ?
a. Authoring
b. Management
c. Access and Delivery
Q6. Which programming language can be used to code embedded functions in SSRS?
Visual Basic .NET Code.
1. Report definition: The blueprint for a report before the report is processed or rendered. A
report definition contains information about the query and layout for the report.
2. Report snapshot: A report that contains data captured at a specific point in time. A report
snapshot is actually a report definition that contains a dataset instead of query instructions.
3. Rendered report: A fully processed report that contains both data and layout information, in
a format suitable for viewing (such as HTML).
4. Parameterized report: A published report that accepts input values through parameters.
5. Shared data source: A predefined, standalone item that contains data source connection
information.
7. Report-specific data source: Data source information that is defined within a report
definition.
8. Report model: A semantic description of business data, used for ad hoc reports created in
Report Builder.
9. Linked report: A report that derives its definition through a link to another report.
10. Report server administrator: This term is used in the documentation to describe a user
with elevated privileges who can access all settings and content of a report server. If you are
using the default roles, a report server administrator is typically a user who is assigned to both
the Content Manager role and the System Administrator role. Local administrators can have
elevated permission even if role assignments are not defined for them.
11. Folder hierarchy: A bounded namespace that uniquely identifies all reports, folders, report
models, shared data source items, and resources that are stored in and managed by a report
server.
12. Report Server: Describes the Report Server component, which provides data and report
processing, and report delivery. The Report Server component includes several subcomponents
that perform specific functions.
13. Report Manager: Describes the Web application tool used to access and manage the
contents of a report server database.
14. Report Builder: Report authoring tool used to create ad hoc reports.
15. Report Designer: Report creation tool included with Reporting Services.
16. Model Designer: Report model creation tool used to build models for ad hoc reporting.
17. Report Server Command Prompt Utilities: Command line utilities that you can use to
administer a report server.
a) RsConfig.exe, b) RsKeymgmt.exe, c) Rs.exe
Q8. What are the Command Line Utilities available in Reporting Services?
Rsconfig Utility (Rsconfig.exe): encrypts and stores connection and account values in the
RSReportServer.config file. Encrypted values include report server database connection
information and account values used for unattended report processing
RsKeymgmt Utility: Extracts, restores, creates, and deletes the symmetric key used to
protect sensitive report server data against unauthorized access
RS Utility: this utility is mainly used to automate report server deployment and
administration tasks. It processes scripts you provide in an input file.
-Development
Q. What is the difference between Tabular and Matrix reports?
OR What are the different styles of reports?
Tabular report: A tabular report is the most basic type of report. Each column corresponds to
a column selected from the database.
Matrix report: A matrix (cross-tab) report aggregates data into a grid in which both the row
and column groups are determined by the data, with aggregates at the intersections.
1. You want to include an image in a report. How do you display the Image
Properties dialog box?
When you drag an image item from the Toolbox window to the Report Designer, the
Image Properties dialog box automatically opens.
3. What are data regions?
Data regions are report items that display repeated rows of summarized information from
datasets.
4. You want to generate a report that is formatted as a chart. Can you use the
Report Wizard to create such a report?
No, the Report Wizard lets you create only tabular and matrix reports. You must create
the chart report directly by using the Report Designer.
5. You want to use BIDS to deploy a report to a different server than the one you
chose in the Report Wizard. How can you change the server URL?
You can right-click the project in Solution Explorer and then change the Target-Server
URL property.
7. Can you use a stored procedure to provide data to an SSRS report?
Yes, you
can use a stored procedure to provide data to an SSRS report by configuring the
dataset to use a stored procedure command type. However, your stored procedure
should return only a single result set. If it returns multiple result sets, only the first
one is used for the report dataset.
8. You want to use a perspective in an MDX query. How do you select the perspective?
Use the Cube Selector in the MDX Query Designer to select a perspective.
9. Can you use data mining models in SSRS?
Yes, you can use the DMX Designer
to create data mining queries for SSRS reports. However, do not forget to flatten the
result set returned by the DMX query.
10. You want your report to display a hyperlink that will take users to your
intranet. How do you configure such a hyperlink?
Create a text box item, set the action
to Go To URL, and then configure the URL.
11. You want a report to display Sales by Category, SubCategory, and Product. You
want users to see only summarized information initially but to be able to display the
details as necessary. How would you create the report?
Group the Sales information by
Category, SubCategory, and Product. Hide the SubCategory group and set the visibility
to toggle based on the Category item. Hide the Product category group and set the
visibility to toggle based on the SubCategory item.
12. You want to create an Excel interactive report from SSRS. In SSRS, can you
create the same interactive experience in Excel that you would have on the Web?
No, you cannot create the same experience with SSRS. You can, however, use Excel to
create such an experience.
13. What is the main difference between a Matrix report item and a Table report
item?
The main difference between a Matrix and a Table report item is in the initial
template. Actually, both report items are just templates for the Tablix data region.
14. When you do not use report caching, is it better to use parameters to filter
information in the query or to use filters in the dataset?
From a performance
perspective, it is better to use parameters because they let SSRS pull filtered data
from the data source. In contrast, when you use filters, the queries retrieve all data
and then filter the information in an additional step.
15. How do you configure a running aggregate in SSRS?
You can use the
RunningValue function to configure a running aggregate.
18. You want your users to select a parameter from a list of values in a list box.
How should you configure the parameter?
You should create a data source that contains the possible values and then bind the
data source to the parameter.
19. What is the main benefit of using embedded code in a report?
The main benefit
of using embedded code in a report is that the code you write at the report level can
be reused in any expression in the report.
20. What programming language would you use to create embedded functions in
SSRS?
An SSRS report supports only Visual Basic .NET embedded code.
---------------------------------------------------------------------------------------------------------
SSIS and SSRS Interview questions with Answers
1) What is the control flow?
ANS: In SSIS a workflow is called a control-flow. A control-flow links together our modular
data-flows as a series of operations in order to achieve a desired result.
A control flow consists of one or more tasks and containers that execute when the package
runs. To control order or define the conditions for running the next task or container in the
package control flow, you use precedence constraints to connect the tasks and containers in
a package. A subset of tasks and containers can also be grouped and run repeatedly as a
unit within the package control flow.
SQL Server 2005 Integration Services (SSIS) provides three different types of control flow
elements: containers that provide structures in packages, tasks that provide functionality, and
precedence constraints that connect the executables, containers, and tasks into an ordered
control flow.
SQL Server 2005 Integration Services (SSIS) provides three different types of data flow
components: sources, transformations, and destinations. Sources extract data
from data stores such as tables and views in relational databases, files, and Analysis
Services databases. Transformations modify, summarize, and clean data. Destinations load
data into data stores or create in-memory datasets.
For example, a data conversion fails because a column contains a string instead of a
number, an insertion into a database column fails because the data is a date and the column
has a numeric data type, or an expression fails to evaluate because a column value is zero,
resulting in a mathematical operation that is not valid.
-Data conversion errors, which occur if a conversion results in the loss of significant digits,
the loss of insignificant digits, or the truncation of strings. Data conversion errors also
occur if the requested conversion is not supported.
-Expression evaluation errors, which occur if expressions that are evaluated at run time
perform invalid operations or become syntactically incorrect because of missing or incorrect
data values.
-Lookup errors, which occur if a lookup operation fails to locate a match in the lookup table.
Many data flow components support error outputs, which let you control how the component
handles row-level errors in both incoming and outgoing data. You specify how the
component behaves when truncation or an error occurs by setting options on individual
columns in the input or output.
For example, you can specify that the component should fail if customer name data is
truncated, but ignore errors on another column that contains less important data.
Integration Services supports a diverse set of log providers, and gives you the ability to
create custom log providers. The Integration Services log providers can write log entries
to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files.
Logs are associated with packages and are configured at the package level. Each task or
container in a package can log information to any package log. The tasks and containers in a
package can be enabled for logging even if the package itself is not.
Integration Services supports two types of variables: user-defined variables and system
variables. User-defined variables are defined by package developers, and system variables
are defined by Integration Services. You can create as many user-defined variables as a
package requires, but you cannot create additional system variables.
Scope : A variable is created within the scope of a package or within the scope of a
container, task, or event handler in the package. Because the package container is at the top
of the container hierarchy, variables with package scope function like global variables and
can be used by all containers in the package. Similarly, variables defined within the scope of
a container such as a For Loop container can be used by all tasks or containers within the
For Loop container.
6) True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT
command against the relational engine. It commits all of the data to the database.
ANS: False. SSIS provides a Checkpoint capability which allows a package to restart at the
point of failure.
7) True or False: SSIS has a default means to log all records updated, deleted or inserted on
a per table basis.
ANS: False, but a custom solution can be built to meet these needs.
9) How do you eliminate quotes from being uploaded from a flat file to SQL Server?
ANS: In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the
Text qualifier field then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server
Integration Services
FailPackageOnFailure: property needs to be set to True for enabling the task in the
checkpoint.
Checkpoint mechanism uses a Text File to mark the point of package failure.
These checkpoint files are automatically created at a given location upon the package
failure and automatically deleted once the package ends up with success.
10. How to execute an SSIS package from a stored procedure?
Using the xp_cmdshell command to call dtexec.
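A minimal T-SQL sketch (the package path is illustrative; note that enabling xp_cmdshell has security implications and requires sysadmin rights):

```sql
-- Enable xp_cmdshell via sp_configure
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1; RECONFIGURE;

-- Run a file-system package with dtexec
EXEC master..xp_cmdshell 'dtexec /F "C:\Packages\LoadSales.dtsx"';
```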
11. Parallel processing in SSIS
To support parallel execution of different tasks in the package, SSIS uses two
properties:
1. MaxConcurrentExecutables: defines how many tasks can run simultaneously, by
specifying the maximum number of SSIS threads that can execute in parallel per
package. The default is -1, which equates to the number of logical processors
plus 2.
2. EngineThreads: is property of each DataFlow task. This property defines how
many threads the data flow engine can create and run in parallel. The EngineThreads
property applies equally to both the source threads that the data flow engine creates
for sources and the worker threads that the engine creates for transformations and
destinations. Therefore, setting EngineThreads to 10 means that the engine can
create up to ten source threads and up to ten worker threads.
Drag and drop an 'Execute SQL Task'. Open the Execute SQL Task Editor and, in the
'Parameter Mapping' section, select the system variables as follows:
Create a table in the SQL Server database with columns: PackageID, PackageName,
TaskID, TaskName, ErrorCode, ErrorDescription.
When the package fails during execution, the error information is inserted into the
table.
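A sketch of the audit table described above (the table name SSISErrorLog is illustrative; the Execute SQL Task that fills it would typically sit in the package's OnError event handler):

```sql
CREATE TABLE dbo.SSISErrorLog (
    PackageID        uniqueidentifier,
    PackageName      nvarchar(200),
    TaskID           uniqueidentifier,
    TaskName         nvarchar(200),
    ErrorCode        int,
    ErrorDescription nvarchar(max)
);
```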
The Merge transformation combines two sorted datasets into a single dataset. The
rows from each dataset are inserted into the output based on values in their key
columns.
The Merge transformation is similar to the Union All transformation. Use the Union
All transformation instead of the Merge transformation when the inputs are not
sorted, when the combined output does not need to be sorted, or when you have more
than two inputs.
The Multicast transformation generates exact copies of the source data: each
recipient will have the same number of records as the source. The Conditional
Split transformation, in contrast, divides the source data based on defined
conditions; any rows that match none of the defined conditions are put on the
default output.
The Bulk Insert task is used to copy a large volume of data from a text file to a
SQL Server destination.
19. Incremental Load in SSIS
Using Slowly Changing Dimension
Using Lookup and Cache Transformation
20. How to migrate Sql server 2005 Package to 2008 version
1. In BIDS, by right click on the "SSIS Packages" folder of an SSIS project and
selecting "Upgrade All Packages".
2. Running "ssisupgrade.exe" from the command line (default physical location
C:\Program Files\Microsoft SQL Server\100\DTS\Bin folder).
3. If you open an SSIS 2005 project in BIDS 2008, it will automatically launch the
SSIS package upgrade wizard.
21. Difference between Synchronous and Asynchronous Transformation
Synchronous transformations process the input rows and pass them on to the data
flow one row at a time, re-using the input buffer.
A transformation is asynchronous when it creates a new output buffer: the output
buffer and output rows are not in sync with the input buffer.
22. What are Row Transformations, Partially Blocking Transformations, and Fully
Blocking Transformations? Give examples.
In a Row Transformation, each value is manipulated individually, and the buffers
can be re-used by other components, such as:
OLE DB data sources and OLE DB data destinations,
other Row transformations within the package, and other partially blocking
transformations within the package.
Examples of Row Transformations: Copy Column, Audit, Character Map.
Partially Blocking Transformations:
These can re-use the buffer space allocated for a preceding Row transformation
and also get new buffer space allocated exclusively for the transformation.
Examples: Merge, Merge Join, Union All, Pivot, Unpivot.
Fully Blocking Transformations:
These make use of their own reserved buffers and do not share buffer space with
other transformations or connection managers.
Examples: Sort, Aggregate, Cache Transformation.
SSIS Container types:
Foreach Loop Container - Runs a Control Flow repeatedly using an enumerator.
Purpose: to repeat tasks for each element in a collection, for example
retrieving files from a folder, running T-SQL statements that reside in
multiple files, or running a command for multiple objects.
For Loop Container - Runs a Control Flow repeatedly by checking a conditional
expression (same as a for loop in a programming language). Purpose: to repeat
tasks until a specified expression evaluates to false. For example, a package
can send a different e-mail message seven times, one time for every day of the
week.
Sequence Container - Groups tasks as well as containers into Control Flows that
are subsets of the package Control Flow. Purpose: to group tasks and containers
that must succeed or fail as a unit. For example, a package can group tasks
that delete and add rows in a database table, and then commit or roll back all
the tasks when one fails.
Success - Workflow will proceed when the preceding container executes successfully.
Indicated in the control flow by a solid green line.
Failure - Workflow will proceed when the preceding container's execution results in
a failure. Indicated in the control flow by a solid red line.
Completion - Workflow will proceed when the preceding container's execution
completes, regardless of success or failure. Indicated in the control flow by a
solid blue line.
Expression/Constraint with Logical AND - Workflow will proceed when the specified
expression and constraint evaluate to true. Indicated in the control flow by a
solid color line along with a small fx icon next to it. The color of the line
depends on the logical constraint chosen (e.g. success=green, completion=blue).
Expression/Constraint with Logical OR - Workflow will proceed when either the
specified expression or the logical constraint (success/failure/completion)
evaluates to true. Indicated in the control flow by a dotted color line along
with a small fx icon next to it. The color of the line depends on the logical
constraint chosen (e.g. success=green, completion=blue).
Keep Identity By default this setting is unchecked which means the destination
table (if it has an identity column) will create identity values on its own. If you check
this setting, the dataflow engine will ensure that the source identity values are
preserved and same value is inserted into the destination table.
Keep Nulls - Again, by default this setting is unchecked, which means the default
value will be inserted into the destination table (if a default constraint is
defined on the target column) when a NULL value is coming from the source for that
particular column. If you check this option, the default constraint on the
destination table's column will be ignored and the NULL from the source column
will be preserved and inserted into the destination.
Table Lock By default this setting is checked and the recommendation is to let it be
checked unless the same table is being used by some other process at same time. It
specifies a table lock will be acquired on the destination table instead of acquiring
multiple row level locks, which could turn into lock escalation problems.
Check Constraints Again by default this setting is checked and recommendation is
to un-check it if you are sure that the incoming data is not going to violate
constraints of the destination table. This setting specifies that the dataflow pipeline
engine will validate the incoming data against the constraints of target table. If you
un-check this option it will improve the performance of the data load.
#5 - Effect of Rows Per Batch and Maximum Insert Commit Size Settings:
Rows per batch:
The default value for this setting is -1 which specifies all incoming rows will be
treated as a single batch. You can change this default behavior and break all
incoming rows into multiple batches. The allowed value is only positive integer which
specifies the maximum number of rows in a batch.
Maximum insert commit size:
The default value for this setting is '2147483647' (the largest value for a 4-byte
integer type), which specifies that all incoming rows will be committed once on
successful completion. You can specify a positive value for this setting to
indicate that a commit will be done after that number of records. Changing the
default value for this setting
will put overhead on the dataflow engine to commit several times. Yes that is true,
but at the same time it will release the pressure on the transaction log and tempdb
to grow specifically during high volume data transfers.
The above two settings are very important to understand in order to manage tempdb
and transaction log growth. For example, if you leave 'Max insert commit size' at its
default, the transaction log and tempdb will keep growing during the extraction
process, and if you are transferring a high volume of data, tempdb will soon run out
of space and, as a result, your extraction will fail. So it is recommended to set these
values to optimum values for your environment.
The number of buffers created depends on how many rows fit into a buffer, and how
many rows fit into a buffer depends on a few other factors. The first consideration is
the estimated row size, which is the sum of the maximum sizes of all the columns in
the incoming records. The second consideration is the DefaultBufferMaxSize property
of the data flow task, which specifies the default maximum size of a buffer. Its
default value is 10 MB, and its upper and lower boundaries are constrained by two
internal SSIS properties, MaxBufferSize (100 MB) and MinBufferSize (64 KB): a
buffer can be as small as 64 KB and as large as 100 MB. The third factor is
DefaultBufferMaxRows, another property of the data flow task, which specifies the
default number of rows in a buffer; its default value is 10000.
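The interplay of these three factors can be sketched as follows (a simplified model for illustration only; the real engine also accounts for per-buffer overhead):

```python
def rows_per_buffer(estimated_row_size: int,
                    default_buffer_max_size: int = 10 * 2**20,   # 10 MB
                    default_buffer_max_rows: int = 10_000,
                    min_buffer_size: int = 64 * 2**10,           # 64 KB
                    max_buffer_size: int = 100 * 2**20) -> int:
    """Approximate how many rows SSIS fits into one pipeline buffer.

    The engine targets DefaultBufferMaxRows rows, but shrinks that count
    if those rows would exceed DefaultBufferMaxSize (itself clamped
    between the internal MinBufferSize and MaxBufferSize limits).
    """
    buffer_size = max(min_buffer_size,
                      min(default_buffer_max_size, max_buffer_size))
    return min(default_buffer_max_rows, buffer_size // estimated_row_size)

# A 2 KB row: 10 MB / 2 KB = 5120 rows, below the 10,000-row cap
print(rows_per_buffer(2048))  # -> 5120
# A 100-byte row: the 10,000-row cap applies first
print(rows_per_buffer(100))   # -> 10000
```

This is why trimming unused columns (a smaller estimated row size) directly increases the rows per buffer.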
If the estimated size of that many rows exceeds DefaultBufferMaxSize, the engine reduces the number of rows per buffer.
For better buffer performance you can do two things.
First you can remove unwanted columns from the source and set data type in each
column appropriately, especially if your source is flat file. This will enable you to
accommodate as many rows as possible in the buffer.
Second, if your system has sufficient memory available, you can tune these
properties to have a small number of large buffers, which could improve
performance. Beware: if you change the values of these properties to a point where
page spooling (see Best Practices #8) begins, performance is adversely impacted. So
before setting values for these properties, first test thoroughly in your environment
and then set the values appropriately.
Let's consider a scenario where the first component of the package creates an object
i.e. a temporary table, which is being referenced by the second component of the
package. During package validation, the first component has not yet executed, so no
object has been created causing a package validation failure when validating the
second component. SSIS will throw a validation exception and will not start the
package execution. So how will you get this package running in this common
scenario? Set the DelayValidation property to True on the affected task (or its
parent container), which postpones validation until the component actually runs,
after the first component has created the object.
SSIS provides a set of performance counters. Among them, the following are
helpful when you tune or debug your package:
Buffers in use
Flat buffers in use
Private buffers in use
Buffers spooled
Rows read
Rows written
Buffers in use, Flat buffers in use and Private buffers in use are useful for
discovering leaks. During package execution, these counters fluctuate. But once the
package finishes execution, their values should return to what they were before the
execution; otherwise, buffers have been leaked.
Buffers spooled has an initial value of 0. When it goes above 0, it indicates that the
engine has started swapping buffers to disk. In a case like this, set the Data Flow
Task properties BLOBTempStoragePath and BufferTempStoragePath appropriately for
maximal I/O bandwidth.
Buffers Spooled: The number of buffers currently written to the disk. If the data flow
engine runs low on physical memory, buffers not currently used are written to disk
and then reloaded when needed.
Rows read and Rows written show how many rows the entire Data Flow has
processed.
12. FastParse property
Fast Parse option in SSIS can be used for very fast loading of flat file data. It will
speed up parsing of integer, date and time types if the conversion does not have to
be locale-sensitive. This option is set on a per-column basis using the Advanced
Editor for the flat file source.
13. The Checkpoint feature helps in restarting packages
1. A data flow consists of the sources and destinations that extract and load data,
the transformations that modify and extend data, and the paths that link sources,
transformations, and destinations. The Data Flow task is the executable within the
SSIS package that creates, orders, and runs the data flow. Data Sources,
Transformations, and Data Destinations are the three important categories in the
Data Flow.
2. Data flows move data, but the Data Flow task is also a task in the control flow;
as such, its success or failure affects how your control flow operates.
3. Data is moved and manipulated through transformations.
4. Data is passed between each component in the data flow.
DTEXECUI provides a graphical user interface that can be used to specify the various
options to be set when executing an SSIS package. You can launch DTEXECUI by
double-clicking on an SSIS package file (.dtsx). You can also launch DTEXECUI from
a Command Prompt and then specify the package to execute.
2. Using the DTEXEC.EXE command-line utility, one can execute an SSIS package
that is stored in the File System, SQL Server or the SSIS Package Store. The syntax
to execute an SSIS package stored in the File System is shown below.
DTEXEC.EXE /F "C:\BulkInsert\BulkInsertTask.dtsx"
3. Test the SSIS package execution by running the package from BIDS:
- In Solution Explorer, right-click the SSIS project folder that contains the package
you want to run and then click Properties.
- In the SSIS Property Pages dialog box, select the Build option under the
Configuration Properties node and, in the right-side panel, provide the folder
location where you want the SSIS package to be deployed in the OutputPath
property. Click OK to save the changes in the property page.
- Right-click the package within Solution Explorer and select the Execute Package
option from the context menu.
The first step in setting up the proxy is to create a credential (alternatively you could
use an existing credential). Navigate to Security then Credentials in SSMS Object
Explorer and right-click to create a new credential.
Navigate to SQL Server Agent then Proxies in SSMS Object Explorer and right-click to
create a new proxy.
The SSIS Package Store is nothing but a combination of SQL Server and File System
deployment, as you can see when you connect to SSIS through SSMS: it looks like a
store that categorizes its contents (packages) according to the taste of its manager
(which is you, the package developer). So don't mistake it for something different
from the two types of package deployment.
48. How to provide security to packages?
We can provide security to packages in 2 ways
1. Package encryption
2. Password protection
At run time, the FTP task connects to a server by using an FTP connection
manager. The FTP connection manager includes the server settings, the credentials
for accessing the FTP server, and options such as the time-out and the number of
retries for connecting to the server.
The FTP connection manager supports only anonymous authentication and basic
authentication. It does not support Windows Authentication.
Predefined FTP Operations:
Send Files, Receive File,
Create Local directory, Remove Local Directory,
Create Remote Directory, Remove Remote Directory
Delete Local Files, Delete Remote File
Custom Log Entries available on FTP Task:
FTPConnectingToServer
FTPOperation
51. New features in SSIS 2012
1. GUI Improvements:
- Sort packages by name
- Package visualization
- Zoom
- Data flow source/destination wizard
- Grouping in data flow
2. CDC (Change Data Capture) Task and Components:
- CDC is nothing but incremental load: it loads only the rows that have changed
since the last load.
- CDC needs to keep track of which changes have already been processed; the CDC
task does this by storing LSNs in a tracking table.
- The CDC source component reads from the CDC table function, based on the LSN
range it gets from the CDC task.
- The CDC splitter transformation splits records into new rows, updated rows and
deleted rows.
3. Flat File Connection Manager Changes:
- The Flat File connection manager now supports parsing files with embedded
qualifiers. The connection manager also, by default, always checks for row
delimiters to enable the correct parsing of files with rows that are missing column
fields. The Flat File Source now supports a varying number of columns and
embedded qualifiers.
REPLACENULL: You can use this function to replace NULL values in the first argument
with the expression specified in the second argument. This is equivalent to ISNULL in
T-SQL: REPLACENULL(expression, expression)
TOKEN: This function allows you to return a substring by using delimiters to separate
a string into tokens and then specifying which occurrence to
return: TOKEN(character_expression, delimiter_string, occurrence)
TOKENCOUNT: This function uses delimiters to separate a string into tokens and then
returns the count of tokens found within the
string: TOKENCOUNT(character_expression, delimiter_string)
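To make the TOKEN/TOKENCOUNT semantics concrete, here is a rough Python approximation (assuming, as documented for SSIS 2012, that each character of delimiter_string acts as a delimiter and empty tokens are skipped):

```python
def _tokens(text: str, delimiters: str):
    """Split text on any character in delimiters, skipping empty tokens,
    approximating the SSIS 2012 TOKEN/TOKENCOUNT behaviour."""
    tokens, current = [], ""
    for ch in text:
        if ch in delimiters:
            if current:
                tokens.append(current)
            current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens

def token(text: str, delimiters: str, occurrence: int) -> str:
    """TOKEN(character_expression, delimiter_string, occurrence); 1-based,
    returns an empty string when the occurrence is out of range."""
    toks = _tokens(text, delimiters)
    return toks[occurrence - 1] if 1 <= occurrence <= len(toks) else ""

def tokencount(text: str, delimiters: str) -> int:
    """TOKENCOUNT(character_expression, delimiter_string)."""
    return len(_tokens(text, delimiters))

print(token("a little white dog", " ", 4))    # -> dog
print(tokencount("a little white dog", " "))  # -> 4
```

Note how adjacent delimiters are collapsed, so "a,b,,c" with delimiter "," counts three tokens, not four.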
6. Easy Column Remapping in Data Flow (Mapping Data Flow Columns):
- When modifying a data flow, column remapping is sometimes needed.
- SSIS 2012 maps columns by name instead of by ID.
- It also has an improved remapping dialog.
9. ODBC Source and Destination:
- ODBC was not natively supported in 2008; SSIS 2008 could access ODBC only via
ADO.NET.
- SSIS 2012 has a native ODBC source and destination.
10. Reduced Memory Usage by the Merge and Merge Join Transformations The old
SSIS Merge and Merge Join transformations, although helpful, used a lot of system
resources and could be a memory hog. In 2012 these tasks are much more robust
and reliable. Most importantly, they will not consume excessive memory when the
multiple inputs produce data at uneven rates.
11. Undo/Redo: One thing that annoyed users of SSIS before 2012 was the lack of
Undo and Redo support; once you performed an operation, you couldn't undo it.
SSIS 2012 adds undo/redo support.
Script task vs. Script component:
Control Flow / Data Flow: The Script task is configured on the Control Flow tab of
the designer and runs outside the data flow of the package. The Script component is
configured on the Data Flow page of the designer and represents a source,
transformation, or destination in the Data Flow task.
Purpose: A Script task can accomplish almost any general-purpose task. With the
Script component you must specify whether you want to create a source,
transformation, or destination.
Raising Results: The Script task uses both the TaskResult property and the optional
ExecutionValue property of the Dts object to notify the runtime of its results. The
Script component runs as a part of the Data Flow task and does not report results
using either of these properties.
Raising Events: The Script task uses the Events property of the Dts object to raise
events. For example:
Dts.Events.FireError(0, "Event Snippet", ex.Message & ControlChars.CrLf & ex.StackTrace)
The Script component raises errors, warnings, and informational messages using the
methods of the IDTSComponentMetaData100 interface returned by the
ComponentMetaData property. For example:
Dim myMetaData As IDTSComponentMetaData100
myMetaData = Me.ComponentMetaData
myMetaData.FireError(...)
Execution: A Script task runs custom code at some point in the package workflow;
unless you put it in a loop container or an event handler, it only runs once. A Script
component also runs once, but typically it runs its main processing routine once for
each row of data in the data flow.
Editor: The Script Task Editor has three pages: General, Script, and Expressions.
Only the ReadOnlyVariables, ReadWriteVariables, and ScriptLanguage properties
directly affect the code that you can write. The Script Transformation Editor has up
to four pages: Input Columns, Inputs and Outputs, Script, and Connection Managers.
The metadata and properties that you configure on each of these pages determine
the members of the base classes that are autogenerated for your use in coding.
Interaction with the Package: In the code written for a Script task, you use the Dts
property (a member of the ScriptMain class) to access other features of the package.
In Script component code, you use typed accessor properties to access certain
package features such as variables and connection managers; the PreExecute
method can access only read-only variables, while the PostExecute method can
access both read-only and read/write variables.
Using Variables: The Script task uses the Variables property of the Dts object to
access variables that are available through the task's ReadOnlyVariables and
ReadWriteVariables properties. For example:
string myVar;
myVar = Dts.Variables["MyStringVariable"].Value.ToString();
The Script component uses typed accessor properties of the autogenerated base
class, created from the component's ReadOnlyVariables and ReadWriteVariables
properties. For example:
string myVar;
myVar = this.Variables.MyStringVariable;
Using Connections: The Script task uses the Connections property of the Dts object
to access connection managers defined in the package. For example:
string myFlatFileConnection;
myFlatFileConnection = (Dts.Connections["Test Flat File Connection"].AcquireConnection(Dts.Transaction) as String);
The Script component uses typed accessor properties of the autogenerated base
class, created from the list of connection managers entered by the user on the
Connection Managers page of the editor. For example:
IDTSConnectionManager100 connMgr;
connMgr = this.Connections.MyADONETConnection;
3. The Bulk Insert task uses the T-SQL BULK INSERT statement for speed when
loading large amounts of data.
58. Which services are installed during SQL Server installation?
SSIS
SSAS
SSRS
SQL Server (MSSQLSERVER)
SQL Server Agent Service
SQL Server Browser
SQL Full-Text
In this case, you want to pass variables dynamically, using an available value from
the source dataset. You can think of it like this:
http://servername/reportserver?%2fpathto
%2freport&rs:Command=Render&ProductCode=Fields!ProductCode.Value
The exact syntax in the "Jump to URL" (Fx) expression window will be:
="javascript:void(window.open('http://servername/reportserver?%2fpathto
%2freport&rs:Command=Render&ProductCode="+Fields!ProductCode.Value+"'))"
STEP2:
In the Pie Chart, select Series Properties and select the Fill option from left side.
Now write following expression in the Color expression:
=code.GetColor(Fields!Year.Value)
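The GetColor function referenced above is custom report code (Report Properties, Code tab). A commonly used sketch, assuming the palette names and the Hashtable approach are your own choices, looks like this:

```vb
' Hypothetical custom code: assigns a consistent colour per grouping
' value, handing out palette entries in first-seen order.
Private colorPalette As String() = {"Green", "Blue", "Red", "Orange", "Gold"}
Private count As Integer = 0
Private mapping As New System.Collections.Hashtable()

Public Function GetColor(ByVal groupingValue As String) As String
    If mapping.ContainsKey(groupingValue) Then
        Return CStr(mapping(groupingValue))
    End If
    Dim c As String = colorPalette(count Mod colorPalette.Length)
    count = count + 1
    mapping.Add(groupingValue, c)
    Return c
End Function
```

Because the mapping is remembered for the duration of report processing, the chart and any legend built from the same expression stay in sync.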
Now apply this function to the style property of an element on the report.
=code.StyleElement("TABLE_HEADER_TEXT")
If you want to apply dynamic styles to a report, create tables in SQL Server and
insert the style information into those tables.
Create a Dataset, specify the Stored Procedure.
example: =Fields!TABLE_HEADER_TEXT.Value
where TABLE_HEADER_TEXT is a value in the table.
Parameters are applied at the database level. The data will be fetched based on
parameters at the database level using a WHERE condition in the query.
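For example, a dataset query might pass a report parameter straight into the WHERE clause (the table and column names here are hypothetical):

```sql
-- @Year is a report parameter mapped to the dataset parameter
SELECT SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader
WHERE YEAR(OrderDate) = @Year;
```

Filtering in the query like this means only the matching rows ever leave the database, unlike a report-level filter, which fetches everything and filters afterwards.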
1. The total time to generate a report (RDL) can be divided into 3 elements:
Time to retrieve the data (TimeDataRetrieval).
Time to process the report (TimeProcessing)
Time to render the report (TimeRendering)
Total time = (TimeDataRetrieval) + (TimeProcessing) + (TimeRendering)
These 3 performance components are logged every time a deployed report is
executed. This information can be found in the ExecutionLogStorage table in the
ReportServer database.
2. Use the SQL Profiler to see which queries are executed when the report is
generated. Sometimes you will see more queries being executed than you expected.
Every dataset in the report will be executed. A lot of times new datasets are added
during building of reports. Check if all datasets are still being used. For instance,
datasets for available parameter values. Remove all datasets which are not used
anymore.
3. Sometimes a dataset contains more columns than used in the Tablix\list. Use only
required columns in the Dataset.
4. The ORDER BY in the dataset may differ from the ORDER BY in the Tablix\list. You
need to decide where the data will be sorted: it can be done within SQL Server with
an ORDER BY clause, or by the reporting server engine. It is not useful to do it in
both. If an index is available, use the ORDER BY in your dataset.
5. Use the SQL Profiler to measure the performance of all datasets (Reads, CPU and
Duration). Use the SQL Server Management Studio (SSMS) to analyze the execution
plan of every dataset.
6. Avoid datasets with large result sets, say more than 1,000 records. Often data is
grouped in the report without a drill-down option; in that scenario, do the GROUP BY
in your dataset instead. This saves a lot of data transfer from SQL Server and spares
the reporting server engine from grouping the result set.
7. Rendering of the report can take a while if the result set is very big. Consider
critically whether such a big result set is necessary. If details are needed in only 5%
of the situations, create another report to display the details. This avoids retrieving
all the details in the other 95% of the situations.
12. I have 'State' column in report, display the States in bold, whose State name
starts with letter 'A' (eg: Andhra pradesh, Assam should be in bold)
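One way to answer this (a sketch: the field name State and the exact textbox are assumptions about the report layout) is to set the FontWeight property of the State textbox to an expression:

```
=IIF(Left(Fields!State.Value, 1) = "A", "Bold", "Normal")
```

The expression is evaluated per row, so only the rows whose state name starts with "A" are rendered in bold.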
Shared datasets use only shared data sources, not embedded data sources.
To create a shared dataset, you must use an application that creates a shared
dataset definition file (.rsd). You can use one of the following applications to create a
shared dataset:
1. Report Builder: Use shared dataset design mode and save the shared dataset to a
report server or SharePoint site.
2. Report Designer in BIDS: Create shared datasets under the Shared Dataset folder
in Solution Explorer. To publish a shared dataset, deploy it to a report server or
SharePoint site.
Upload a shared dataset definition (.rsd) file. You can upload a file to the report
server or SharePoint site. On a SharePoint site, an uploaded file is not validated
against the schema until the shared dataset is cached or used in a report.
The shared dataset definition includes a query, dataset parameters including default
values, data options such as case sensitivity, and dataset filters.
18. How do you display partial text in bold format in a textbox in a report? (e.g.
FirstName LastName, where "FirstName" should be in bold font and "LastName"
in normal font.)
Use a Placeholder: placeholders let you apply different formatting to different runs
of text within the same textbox.
To avoid extra blank pages during export, the size of the body should be less than or
equal to the page size minus the margins.
Set the width of the body to 26.7 cm (29.7 - 1.5 - 1.5).
Set the height of the body to 18 cm (21 - 1.5 - 1.5).
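The arithmetic generalizes to any page size and margin set (a trivial helper, for illustration; units are centimetres as in the example above):

```python
def max_body_size(page_w, page_h, margin_l, margin_r, margin_t, margin_b):
    """Largest report body that avoids extra blank pages on export:
    the body must fit inside the page minus the margins."""
    return (round(page_w - margin_l - margin_r, 2),
            round(page_h - margin_t - margin_b, 2))

# A4 landscape (29.7 x 21 cm) with 1.5 cm margins all round
print(max_body_size(29.7, 21.0, 1.5, 1.5, 1.5, 1.5))  # -> (26.7, 18.0)
```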
3. There are 2 options for deploying the reports that you create with Report Builder
3.0:
1. Report Manager
2. SharePoint document library
The first time a user clicks the link for a report configured to cache, the report
execution process is similar to the on-demand process. The intermediate format is
cached and stored in ReportServerTempDB Database until the cache expiry time.
If a user requests a different set of parameter values for a cached report, the
report processor treats the request as a new report executing on demand, but flags
it as a second cached instance.
Report snapshot contains the Query and Layout information retrieved at specific
point of time. It executes the query and produces the intermediate format. The
intermediate format of the report has no expiration time like a cached instance, and
is stored in ReportServer Database.
27. Subscription. Different types of Subscriptions?
Subscriptions are used to deliver the reports to either File Share or Email in response
to Report Level or Server Level Schedule.
There are 2 types of subscriptions:
1. Standard Subscription: Static properties are set for Report Delivery.
2. Data Driven Subscription: Dynamic Runtime properties are set for Subscriptions
Ad hoc reports: Ad hoc reporting allows end users to design and create reports on
their own, provided with the data models.
3 components: Report Builder, Report Model and Model Designer.
Use the Model Designer tool to design report models, and then use Report Builder
to generate reports from those models.
Report Builder
- Windows Winform application for End users to build ad-hoc reports with the help of
Report models.
32. Explain the Report Model Steps.
1. Create the report model project
select "Report Model Project" in the Templates list
A report model project contains the definition of the data source (.ds file), the
definition of a data source view (.dsv file), and the report model (.smdl file).
2. Define a data source for the report model
3. Define a data source view for the report model
A data source view is a logical data model based on one or more data sources.
SQL Reporting Services generates the report model from the data source view.
4. Define a report model
5. Publish a report model to report server.
The <Query> element of RDL contains the query or command and is used by the
Report Server to connect to the data sources of the report.
The <Query> element is optional in an RDLC file. It is ignored by the Report Viewer
control, because the control does not perform any data processing in local
processing mode but uses data that the host application supplies.
You can provide control to the user by adding Interactive Sort buttons to toggle
between ascending and descending order for rows in a table or for rows and columns
in a matrix. The most common use of interactive sort is to add a sort button to every
column header. The user can then choose which column to sort by.
36. What is Report Builder
Windows Winform application for End users to build ad-hoc reports with the help of
Report models.
37. Difference between Table report and Matrix Report
A Table Report can have fixed number of columns and dynamic rows.
A Matrix Report has dynamic rows and dynamic columns.
38. When to use Table, Matrix and List
1. Use a Table to display detail data, organize the data in row groups, or both.
2. Use a matrix to display aggregated data summaries, grouped in rows and
columns, similar to a PivotTable or crosstab. The number of rows and columns for
groups is determined by the number of unique values for each row and column
groups.
3. Use a list to create a free-form layout. You are not limited to a grid layout, but can
place fields freely inside the list. You can use a list to design a form for displaying
many dataset fields or as a container to display multiple data regions side by side for
grouped data. For example, you can define a group for a list; add a table, chart, and
image; and display values in table and graphic form for each group value
42. How to combine datasets in SSRS (one dataset gets data from Oracle and the
other dataset from SQL Server)?
Using the Lookup function, we can combine 2 datasets in SSRS.
In the following example, assume that a table is bound to a dataset that includes a
field for the product identifier ProductID. A separate dataset called "Product" contains
the corresponding product identifier ID and the product name Name.
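The expression placed in the table cell would look like the following (field names taken from the example above):

```
=Lookup(Fields!ProductID.Value, Fields!ID.Value, Fields!Name.Value, "Product")
```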
In the above expression, Lookup compares the value of ProductID to ID in each row
of the dataset called "Product" and, when a match is found, returns the value of the
Name field for that row.
The configuration settings of Report Manager and the Report Server Web service are
stored in a single configuration file (rsreportserver.config).
Report Manager is the web-based application included with Reporting Services that
handles all aspects of managing reports (deploying datasources and reports, caching
a report, subscriptions, snapshot).
44. Steps to repeat Table Headers in SSRS 2008?
1. Select the table
2. At the bottom of the screen, select a dropdown arrow beside column
groups. Enable "Advanced Mode" by clicking on it.
3. Under Row Groups, select the static row and choose Properties / press F4.
4. Set the following attributes for the static (header) row:
Set RepeatOnNewPage = True to repeat the headers.
Set KeepWithGroup = After.
Set FixedData = True to keep the headers visible while scrolling.
45. How to add assemblies in SSRS
45. Report Extensions?
46. parent grouping, child grouping in SSRS
Open the data source dialog in report designer and select the "Use Single
Transaction when processing the queries" check box. Once selected, datasets that
use the same data source are no longer executed in parallel. They are also executed
as a transaction, i.e. if any of the queries fails to execute, the entire transaction is
rolled back.
In MOLAP, the structure of the aggregations along with the data values is stored in
multidimensional format; it takes more space but needs less time for data analysis
compared to ROLAP.
MOLAP offers faster query response and processing times, but has high latency and
requires an average amount of storage space. This storage mode leads to
duplication of data, as the detail data is present in both the relational and the
multidimensional storage.
3. Types of Dimensions
Regular: A dimension whose type has not been set to a special dimension type.
Time: A dimension whose attributes represent time periods, such as years,
semesters, quarters, months, and days.
Organization: A dimension whose attributes represent organizational information,
such as employees or subsidiaries.
Geography: A dimension whose attributes represent geographic information, such
as cities or postal codes.
BillOfMaterials: A dimension whose attributes represent inventory or manufacturing
information, such as parts lists for products.
Accounts: A dimension whose attributes represent a chart of accounts for financial
reporting purposes.
Customers: A dimension whose attributes represent customer or contact
information.
Products: A dimension whose attributes represent product information.
Scenario: A dimension whose attributes represent planning or strategic analysis
information.
Quantitative: A dimension whose attributes represent quantitative information.
Utility: A dimension whose attributes represent miscellaneous information.
Currency: This type of dimension contains currency data and metadata.
Rates: A dimension whose attributes represent currency rate information.
Channel: A dimension whose attributes represent channel information.
Promotion: A dimension whose attributes represent marketing promotion
information.
4. Types of Measures
Fully Additive Facts: These are facts which can be added across all the associated
dimensions. For example, sales amount is a fact which can be summed across
different dimensions like customer, geography, date, product, and so on.
Semi-Additive Facts: These are facts which can be added across only a few dimensions
rather than all dimensions. For example, bank balance is a fact which can be
summed across the customer dimension (i.e. the total balance of all the customers in
a bank at the end of a particular quarter). However, the same fact cannot be added
across the date dimension (i.e. the total balance at the end of quarter 1 is $X million
and $Y million at the end of quarter 2, so at the end of quarter 2, the total balance is
only $Y million and not $X+$Y).
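A tiny sketch of this semi-additive behaviour (the balances are hypothetical figures):

```python
# Quarter-end balances keyed by (customer, quarter)
balances = {
    ("C1", "Q1"): 100, ("C2", "Q1"): 200,
    ("C1", "Q2"): 150, ("C2", "Q2"): 250,
}

# Additive across the customer dimension: total bank balance at end of Q2
total_q2 = sum(v for (cust, q), v in balances.items() if q == "Q2")
print(total_q2)  # -> 400

# NOT additive across the date dimension: summing C1's quarter-end
# balances gives 250, which is meaningless as a "balance" -- the correct
# end-of-Q2 balance for C1 is simply the latest value, 150.
wrong = sum(v for (cust, q), v in balances.items() if cust == "C1")
print(wrong)     # -> 250
```

This is why OLAP engines offer aggregation functions like LastNonEmpty for semi-additive measures instead of plain SUM.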
Non-Additive Facts: These are facts which cannot be added across any of the
dimensions in the cube. For example, profit margin is a fact which cannot be added
across any of the dimensions. For example, if product P1 has a 10% profit and
product P2 has a 10% profit then your net profit is still 10% and not 20%. We
cannot add profit margins across product dimensions. Similarly, if your profit margin
is 10% on Day1 and 10% on Day2, then your net Profit Margin at the end of Day2 is
still 10% and not 20%.
Derived Facts: Derived facts are the facts which are calculated from one or more
base facts, often by applying additional criteria. Often these are not stored in the
cube and are calculated on the fly at the time of accessing them. For example, profit
margin.
Factless Facts: A factless fact table is one which only has references (Foreign Keys)
to the dimensions and it does not contain any measures. These types of fact tables
are often used to capture events (valid transactions without a net change in a
measure value). For example, a balance enquiry at an automated teller machine
(ATM). Though there is no change in the account balance, this transaction is still
important for analysis purposes.
Textual Facts: Textual facts refer to the textual data present in the fact table, which
is not measurable (non-additive), but is important for analysis purposes. For
example, codes (i.e. product codes), flags (i.e. status flag), etc.
5. Types of relationships between dimensions and measuregroups.
No relationship: The dimension and measure group are not related.
Regular: The dimension table is joined directly to the fact table.
Referenced: The dimension table is joined to an intermediate table which, in turn, is
joined to the fact table.
Many to many: The dimension table is joined to an intermediate fact table, and the
intermediate fact table is joined, in turn, to an intermediate dimension table to
which the fact table is joined.
Data mining:The target dimension is based on a mining model built from the source
dimension. The source dimension must also be included in the cube.
Fact table: The dimension table is the fact table.
6. Proactive caching
Proactive caching can be configured to refresh the cache (MOLAP cache) either on a
pre-defined schedule or in response to an event (change in the data) from the
underlying relational database. Proactive caching settings also determine whether
the data is queried from the underlying relational database (ROLAP) or is read from
the outdated MOLAP cache, while the MOLAP cache is rebuilt.
Proactive caching helps minimize latency and achieve high performance.
It enables a cube to reflect the most recent data present in the underlying database
by automatically refreshing the cube based on the predefined settings.
Lazy aggregations:
When we reprocess an SSAS cube, it brings new/changed relational data into the
cube by reprocessing dimensions and measures. Partition indexes and aggregations
might be dropped due to changes in related dimension data, so they need to be
rebuilt, which can take considerable time.
If you want to bring the cube online sooner, without waiting for the partition
indexes and aggregations to be rebuilt, the lazy processing option can be chosen.
Lazy processing brings the SSAS cube online as soon as the dimensions and
measures are processed; the partition indexes and aggregations are then built later
as a background job.
When you build a cube, and you add dimensions to that cube, you create cube
dimensions: cube dimensions are instances of a database dimension inside a cube.
A database dimension can be used in multiple cubes, and multiple cube dimensions
can be based on a single database dimension
The Database dimension has only Name and ID properties, whereas a Cube
dimension has several more properties.
A database dimension is created once, whereas a cube dimension is a reference to a
database dimension.
A database dimension exists only once, whereas multiple cube dimensions can be
created from it using the role-playing dimensions concept.
11. Importance of CALCULATE keyword in MDX script, data pass and limiting cube
space
Select this option to store, in the MOLAP structure, the attribute member in the
intermediate dimension that links the attribute in the reference dimension to the
fact table. This improves query performance, but increases the processing time
and storage space.
If the option is not selected, only the relationship between the fact records and the
intermediate dimension is stored in the cube. This means that Analysis Services has
to derive the aggregated values for the members of the referenced dimension when
a query is executed, resulting in slower query performance.
13. Partition processing and Aggregation Usage Wizard
16. Role playing Dimensions, Junk Dimensions, Conformed Dimensions, SCD and
other types of dimensions
Role-Playing Dimension:
A role-playing dimension is a dimension which is connected to the same fact table
multiple times using different foreign keys.
e.g.: Consider a Time dimension which is joined to the same fact table (say
FactSales) multiple times, each time using a different foreign key in the fact
table, such as Order Date, Due Date, Ship Date, Delivery Date, etc.
Steps:
In Cube Designer, click the Dimension Usage tab.
Either click the 'Add Cube Dimension' button, or right-click anywhere on the work
surface and then click Add Cube Dimension.
In the Add Cube Dimension dialog box, select the dimension that you want to add,
and then click OK.
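Once the role-playing cube dimensions exist, each role can be used independently in a query. A sketch against the Adventure Works sample cube, where [Date] and [Ship Date] are both cube dimensions based on the Date database dimension:

```mdx
SELECT
    [Measures].[Internet Sales Amount] ON COLUMNS,
    [Ship Date].[Calendar Year].MEMBERS ON ROWS  -- role: ship date
FROM [Adventure Works]
WHERE [Date].[Calendar Year].&[2003];            -- role: order date
```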
A Conformed Dimension is a Dimension which connects to multiple Fact Tables across
one or more Data Marts (cubes). Conformed Dimensions are exactly the same
structure, attributes, values (dimension members), meaning and definition.
Example: A Date Dimension has exactly the same set of attributes, same members
and same meaning irrespective of which Fact Table it is connected to
Linked Dimensions can be used when the exact same dimension is needed across
multiple Cubes within an Organization, like a Time Dimension, Geography
Dimension, etc.
Degenerate Dimensions are commonly used when the Fact Table contains/represents
Transactional data like Order Details, etc. and each Order has an Order Number
associated with it, which forms the unique value in the Degenerate Dimension.
One of the common scenarios is when a Fact Table contains a lot of Attributes which
are like indicators, flags, etc. Using Junk Dimensions, such Attributes can be
removed/cleaned up from a Fact Table.
SCD: The Slowly Changing Dimension (SCD) concept is basically about how the data
modifications are absorbed and maintained in a Dimension Table.
The new (modified) record and the old record(s) are identified using some kind of a
flag like say IsActive, IsDeleted etc. or using Start and End Date fields to indicate the
validity of the record.
17. Parent Child Hierarchy, NamingTemplate property, MemberWithLeafLevelData
property
18. How will you keep measure in cube without showing it to user?
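One approach, sketched here: set the measure's Visible property to False in the cube designer, or, for a calculated member, declare it with VISIBLE = 0 in the MDX script. The member remains queryable and usable in other calculations but is hidden from client tools. The measure names below follow the Adventure Works cube:

```mdx
CREATE MEMBER CURRENTCUBE.[Measures].[Internal Margin] AS
    [Measures].[Sales Amount] - [Measures].[Total Product Cost],
    VISIBLE = 0;  -- hidden from client tools, still usable in calculations
```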
Now, if you pass the value [Date].[Calendar Year].&[2002] to the parameter P1,
the query runs as if it contained:
where [Date].[Calendar Year].&[2002]
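In reporting tools such as SSRS, such a parameter is typically spliced into the slicer with STRTOMEMBER. A sketch (the parameter name @CalendarYear is hypothetical):

```mdx
SELECT [Measures].[Internet Sales Amount] ON COLUMNS
FROM [Adventure Works]
WHERE STRTOMEMBER(@CalendarYear, CONSTRAINED);
```

The CONSTRAINED flag restricts the string to qualified member names, which guards against MDX injection through the parameter.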
23. CASE (CASE, WHEN, THEN, ELSE, END) statement, IF THEN END IF, IS keyword,
HAVING clause
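MDX supports a searched CASE expression inside a calculated member. A sketch against the Adventure Works cube (the band thresholds are arbitrary):

```mdx
WITH MEMBER [Measures].[Sales Band] AS
    CASE
        WHEN [Measures].[Internet Sales Amount] > 1000000 THEN "High"
        WHEN [Measures].[Internet Sales Amount] > 100000  THEN "Medium"
        ELSE "Low"
    END
SELECT { [Measures].[Internet Sales Amount],
         [Measures].[Sales Band] } ON COLUMNS,
       [Product].[Category].[Category].MEMBERS ON ROWS
FROM [Adventure Works];
```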
30. What do you understand by rigid and flexible relationship? Which one is better
from performance perspective?
Rigid: Attribute Relationship should be set to Rigid when the relationship between
those attributes is not going to change over time. For example,
relationship between a Month and a Date is Rigid since a particular Date always
belongs to a particular Month like 1st Feb 2012 always belongs to Feb
Month of 2012. Try to set the relationship to Rigid wherever possible.
Flexible: Attribute Relationship should be set to Flexible when the relationship
between those attributes is going to change over time. For example, relationship
between an Employee and a Manager is Flexible since a particular Employee might
work under one manager during this year (time period) and under a different
manager during next year (another time period).
31. In which scenario, you would like to go for materializing dimension?
Reference dimensions let you create a relationship between a measure group and a
dimension using an intermediate dimension to act as a bridge between
them.
32. In dimension usage tab, how many types of joins are possible to form
relationship between measure group and dimension?
38. How will you write back to dimension using excel or any other client tool?
39. What do you understand by dynamic named set (SSAS 2008)? How is it different
from a static named set?
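A static set is evaluated once, when the MDX script runs; a dynamic set (SSAS 2008 onwards) is re-evaluated in the context of each query, so it respects the query's WHERE clause. A sketch against the Adventure Works cube:

```mdx
-- Static: membership fixed when the MDX script is evaluated
CREATE SET CURRENTCUBE.[Top 10 Customers Static] AS
    TOPCOUNT([Customer].[Customer].[Customer].MEMBERS, 10,
             [Measures].[Internet Sales Amount]);

-- Dynamic: re-evaluated per query, honoring the query's slicer
CREATE DYNAMIC SET CURRENTCUBE.[Top 10 Customers] AS
    TOPCOUNT([Customer].[Customer].[Customer].MEMBERS, 10,
             [Measures].[Internet Sales Amount]);
```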
43. What are the different storage mode options in SQL Server Analysis Services,
and in which scenarios are they useful?
44. How will you implement data security for given scenario in analysis service data?
"I have 4 cubes and 20 dimensions. I need to give access to the CEO, operations
managers, sales managers and employees.
1) CEO can see all the data of all 4 cubes.
2) Operation Managers can see only data related to their cube. There are four
operation managers.
3) Employees can see only certain dimension and measure groups data. (200
Employees) "
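Requirements like these map to SSAS roles with dimension data security: each role carries an allowed (or denied) member set written as an MDX set expression. A sketch of an AllowedSet for a hypothetical operations-manager role restricted to one territory (attribute and member names follow Adventure Works, but the role design is an assumption):

```mdx
-- AllowedSet expression on the [Sales Territory Country] attribute
{ [Sales Territory].[Sales Territory Country].&[United States] }
```

The CEO role would leave the allowed sets unrestricted, while the employee roles would additionally hide measure groups and dimensions the role should not see.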
1. BIDS
In BIDS, from the Build menu select the Build option (or right-click the project
in Solution Explorer).
The build process creates four XML files in the bin subfolder of the project folder:
.asdatabase - is the main object definition file
.configsettings
.deploymentoptions
.deploymenttargets
2. Deploy
Deployment via BIDS will overwrite the destination database management settings,
so it is not recommended for production deployment.
46. What are the options available to incrementally load relational data into SSAS
cube?
Use Slowly Changing Dimension
47. Why will you use aggregation at remote server?
52. What are KPIs? How will you create KPIs in SSAS?
53. What are the main feature differences in SSAS 2005 and SSAS 2008 from
developer point of view?
MDX
1. Explain the structure of MDX query?
10.What is the difference between NON EMPTY keyword and NONEMPTY() function?
11. Functions used commonly in MDX, like Filter, Descendants, Ascendants and others
12. Difference between NON EMPTY keyword and function, NON_EMPTY_BEHAVIOR,
ParallelPeriod, AUTOEXISTS
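A sketch of FILTER combined with DESCENDANTS against the Adventure Works cube (the 5,000,000 threshold is arbitrary):

```mdx
SELECT [Measures].[Internet Sales Amount] ON COLUMNS,
       FILTER(
           DESCENDANTS([Product].[Product Categories].[All Products],
                       [Product].[Product Categories].[Subcategory]),
           [Measures].[Internet Sales Amount] > 5000000
       ) ON ROWS
FROM [Adventure Works];
```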
16. Write MDX for retrieving top 3 customers based on internet sales amount?
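A sketch against the Adventure Works sample cube:

```mdx
SELECT [Measures].[Internet Sales Amount] ON COLUMNS,
       TOPCOUNT([Customer].[Customer].[Customer].MEMBERS, 3,
                [Measures].[Internet Sales Amount]) ON ROWS
FROM [Adventure Works];
```

TOPCOUNT sorts the set descending by the supplied numeric expression and returns the first 3 members.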
17. Write MDX to find current month's start and end date?
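One approach, sketched with OPENINGPERIOD / CLOSINGPERIOD over the Date level of the Adventure Works calendar hierarchy:

```mdx
WITH
    MEMBER [Measures].[Month Start] AS
        OPENINGPERIOD([Date].[Calendar].[Date],
                      [Date].[Calendar].CURRENTMEMBER).MEMBER_CAPTION
    MEMBER [Measures].[Month End] AS
        CLOSINGPERIOD([Date].[Calendar].[Date],
                      [Date].[Calendar].CURRENTMEMBER).MEMBER_CAPTION
SELECT { [Measures].[Month Start], [Measures].[Month End] } ON COLUMNS,
       [Date].[Calendar].[Month].MEMBERS ON ROWS
FROM [Adventure Works];
```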
18. Write MDX to compare current month's revenue with last year same month
revenue?
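A sketch using PARALLELPERIOD to shift each month back 12 months along the Adventure Works calendar hierarchy:

```mdx
WITH MEMBER [Measures].[Same Month Last Year] AS
    ( [Measures].[Internet Sales Amount],
      PARALLELPERIOD([Date].[Calendar].[Month], 12,
                     [Date].[Calendar].CURRENTMEMBER) )
SELECT { [Measures].[Internet Sales Amount],
         [Measures].[Same Month Last Year] } ON COLUMNS,
       [Date].[Calendar].[Month].MEMBERS ON ROWS
FROM [Adventure Works];
```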
19. Write MDX to find MTD(month to date), QTD(quarter to date) and YTD(year to
date) internet sales amount for top 5 products?
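A sketch using PERIODSTODATE at the year, quarter and month levels of the Adventure Works calendar hierarchy; the slicer member key in the WHERE clause is a hypothetical example, since key formats are cube-specific:

```mdx
WITH
    MEMBER [Measures].[YTD Sales] AS
        AGGREGATE(PERIODSTODATE([Date].[Calendar].[Calendar Year],
                                [Date].[Calendar].CURRENTMEMBER),
                  [Measures].[Internet Sales Amount])
    MEMBER [Measures].[QTD Sales] AS
        AGGREGATE(PERIODSTODATE([Date].[Calendar].[Calendar Quarter],
                                [Date].[Calendar].CURRENTMEMBER),
                  [Measures].[Internet Sales Amount])
    MEMBER [Measures].[MTD Sales] AS
        AGGREGATE(PERIODSTODATE([Date].[Calendar].[Month],
                                [Date].[Calendar].CURRENTMEMBER),
                  [Measures].[Internet Sales Amount])
SELECT { [Measures].[MTD Sales], [Measures].[QTD Sales],
         [Measures].[YTD Sales] } ON COLUMNS,
       TOPCOUNT([Product].[Product].[Product].MEMBERS, 5,
                [Measures].[Internet Sales Amount]) ON ROWS
FROM [Adventure Works]
WHERE [Date].[Calendar].[Date].&[20030615];  -- hypothetical "as of" date key
```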
21. Write MDX to rank all the product category based on calendar year 2005 internet
sales amount?
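A sketch combining ORDER and RANK, sliced by calendar year 2005 (this key format appears elsewhere in these notes):

```mdx
WITH
    SET [Ordered Categories] AS
        ORDER([Product].[Category].[Category].MEMBERS,
              [Measures].[Internet Sales Amount], BDESC)
    MEMBER [Measures].[Category Rank] AS
        RANK([Product].[Category].CURRENTMEMBER, [Ordered Categories])
SELECT { [Measures].[Internet Sales Amount],
         [Measures].[Category Rank] } ON COLUMNS,
       [Ordered Categories] ON ROWS
FROM [Adventure Works]
WHERE [Date].[Calendar Year].&[2005];
```

RANK is evaluated against the pre-ordered set so that position 1 is the best-selling category; BDESC sorts without regard to hierarchy.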
22. Write MDX to extract nth position tuple from specific set?
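Sets are zero-based in MDX, so the nth tuple can be pulled out with .ITEM(n-1). A sketch:

```mdx
SELECT [Measures].[Internet Sales Amount] ON COLUMNS,
       { [Product].[Category].[Category].MEMBERS }.ITEM(2) ON ROWS
FROM [Adventure Works];
-- .ITEM(2) returns the third tuple of the set (indexing starts at 0)
```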
24. What are the performance considerations for improving MDX queries?
26. Which one is better from performance point of view...NON Empty keyword or
NONEMPTY function?
27. How will you find performance bottleneck in any given MDX?