
Informatica PowerCenter

PowerCenter provides an environment that allows you to load data into a centralized location,

such as a datamart, data warehouse, or operational data store (ODS). You can extract data from

multiple sources, transform the data according to business logic you build in the client

application, and load the transformed data into file and relational targets. PowerCenter provides

the following integrated components:

 PowerCenter repository. The PowerCenter repository is at the center of the PowerCenter

suite. You create a set of metadata tables within the repository database that the

PowerCenter applications and tools access. The PowerCenter Client and Server access the

repository to save and retrieve metadata.

 PowerCenter Repository Server. The PowerCenter Repository Server manages

connections to the repository from client applications. It inserts, updates, and fetches

objects from the repository database tables. It also maintains object consistency.

 PowerCenter Client. Use the PowerCenter Client to manage users, define sources and

targets, build mappings and mapplets with the transformation logic, and create workflows

to run the mapping logic. The PowerCenter Client has the following client applications:

Repository Manager, Repository Server Administration Console, Designer, Workflow

Manager, and Workflow Monitor.

 PowerCenter Server. The PowerCenter Server extracts the source data, performs the data

transformation, and loads the transformed data into the targets.

Sources

PowerCenter accesses the following sources:

 Relational. Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata.

 File. Fixed and delimited flat file, COBOL file, and XML.
 Application. You can purchase additional PowerConnect products to access business

sources, such as PeopleSoft, SAP R/3, Siebel, IBM MQSeries, and TIBCO.

 Mainframe. You can purchase PowerConnect for Mainframe for faster access to IBM DB2

on MVS.

 Other. Microsoft Excel and Access.

Note: The Designer imports relational sources, such as Microsoft Excel, Microsoft Access, and

Teradata using ODBC and native drivers.

For more information about sources, see “Working with Sources” in the Designer Guide.

Targets

PowerCenter can load data into the following targets:

 Relational. Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server, and

Teradata.

 File. Fixed and delimited flat file and XML.

 Application. You can purchase additional PowerConnect products to load data into SAP

BW. You can also load data into IBM MQSeries message queues and TIBCO.

 Other. Microsoft Access.

You can load data into targets using ODBC or native drivers, FTP, or external loaders.

For more information about targets, see “Working with Targets” in the Designer Guide.

PowerCenter Repository

The PowerCenter repository resides on a relational database. The repository database tables

contain the instructions required to extract, transform, and load data. PowerCenter Client

applications access the repository database tables through the Repository Server.

You add metadata to the repository tables when you perform tasks in the PowerCenter Client

application, such as creating users, analyzing sources, developing mappings or mapplets, or

creating workflows. The PowerCenter Server reads metadata created in the Client application
when you run a workflow. The PowerCenter Server also creates metadata, such as start and finish

times of a session or session status.

You can develop global and local repositories to share metadata:

 Global repository. The global repository is the hub of the domain. Use the global

repository to store common objects that multiple developers can use through shortcuts.

These objects may include operational or Application source definitions, reusable

transformations, mapplets, and mappings.

 Local repositories. A local repository is any repository within a domain that is not the global repository.

Use local repositories for development. From a local repository, you can create shortcuts

to objects in shared folders in the global repository. These objects typically include source

definitions, common dimensions and lookups, and enterprise standard transformations.

You can also create copies of objects in non-shared folders.

 Version control. A versioned repository can store multiple copies, or versions, of an

object. Each version is a separate object with unique properties. PowerCenter version

control features allow you to efficiently develop, test, and deploy metadata into

production.

You can connect to a repository, back up, delete, or restore repositories using pmrep, a

command line program. For more information on pmrep, see “Using pmrep”.

Repository Server

The Repository Server manages repository connection requests from client

applications. For each repository database registered with the Repository Server, it configures

and manages a Repository Agent process. The Repository Server also monitors the status of

running Repository Agents, and sends repository object notification messages to client

applications.
The Repository Agent is a separate, multi-threaded process that retrieves, inserts, and

updates metadata in the repository database tables. The Repository Agent ensures the

consistency of metadata in the repository by employing object locking.

PowerCenter Client

The PowerCenter Client consists of the following applications that you use to manage the

repository, design mappings and mapplets, and create sessions to load the data:

 Repository Server Administration Console. Use the Repository Server Administration

Console to administer the Repository Servers and repositories.

 Repository Manager. Use the Repository Manager to administer the metadata repository.

You can create repository users and groups, assign privileges and permissions, and

manage folders and locks.

 Designer. Use the Designer to create mappings that contain transformation instructions

for the PowerCenter Server. Before you can create mappings, you must add source and

target definitions to the repository. The Designer has five tools that you use to analyze

sources, design target schemas, and build source-to-target mappings:

o Source Analyzer. Import or create source definitions.

o Warehouse Designer. Import or create target definitions.

o Transformation Developer. Develop reusable transformations to use in mappings.

o Mapplet Designer. Create sets of transformations to use in mappings.

o Mapping Designer. Create mappings that the PowerCenter Server uses to extract,

transform, and load data.

 Workflow Manager. Use the Workflow Manager to create, schedule, and run workflows. A

workflow is a set of instructions that describes how and when to run tasks related to
extracting, transforming, and loading data. The PowerCenter Server runs workflow tasks

according to the links connecting the tasks. You can run a task by placing it in a workflow.

 Workflow Monitor. Use the Workflow Monitor to monitor scheduled and running

workflows for each PowerCenter Server. You can choose a Gantt Chart or Task view. You

can also access details about those workflow runs.

Install the client tools on a Microsoft Windows machine. For more information about installation

requirements, see Minimum System Requirements.

PowerCenter Server

The PowerCenter Server reads mapping and session information from the repository. It extracts data from the mapping sources and stores the data in memory while it applies the transformation rules that you configure in the mapping. The PowerCenter Server loads the transformed data into the mapping targets.

The PowerCenter Server can achieve high performance using symmetric multi-processing systems. The PowerCenter Server can start and run multiple workflows concurrently. It can also concurrently process partitions within a single session. When you create multiple partitions within a session, the PowerCenter Server creates multiple database connections to a single source and extracts a separate range of data for each connection, according to the properties you configure.
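
The partitioning behavior described above can be pictured with a small sketch. This is not PowerCenter code: the helper name `split_key_range` and the idea of partitioning on a numeric key are assumptions made for illustration. Each returned range stands for the slice of source data that one database connection would extract.

```python
def split_key_range(low, high, partitions):
    """Split the inclusive key range [low, high] into contiguous sub-ranges.

    Hypothetical helper: models one key range per partition (and therefore
    per database connection). It is not part of any PowerCenter API.
    """
    if partitions < 1 or high < low:
        raise ValueError("need at least one partition and low <= high")
    total = high - low + 1
    base, extra = divmod(total, partitions)
    ranges = []
    start = low
    for i in range(partitions):
        # The first `extra` partitions take one additional key each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more partitions requested than keys available
        ranges.append((start, start + size - 1))
        start += size
    return ranges


four_way = split_key_range(1, 100, 4)
```

In a real session the ranges come from the partition properties you configure, not from an even split like this one.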

Database Connections

The Repository Server maintains a pool of reusable database connections for serving client

applications. The server generates a Repository Agent process for each database. The Repository

Agent creates new database connections only if all the current connections are in use.
For example, if 10 clients send requests to the Repository Agent one at a time, the agent requires

only one connection. It reuses the same database connection for all the requests. If the 10 clients

send requests simultaneously, the Repository Agent opens 10 connections. You can set the

maximum number of open connections using the DatabasePoolSize parameter in the repository

configuration file.
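
The pooling rule in the example above (reuse an idle connection, open a new one only when every connection is busy, never exceed a maximum) can be sketched as a toy pool. The class name `ConnectionPool` and its methods are hypothetical. The `max_size` argument only plays the role of DatabasePoolSize, and raising an error when the limit is reached is an assumption, since the text does not say what the Repository Agent does in that case.

```python
import threading


class ConnectionPool:
    """Toy pool: reuse idle connections, open new ones only when all are busy."""

    def __init__(self, max_size):
        self.max_size = max_size   # plays the role of DatabasePoolSize
        self.idle = []             # connections ready for reuse
        self.opened = 0            # total connections opened so far
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if self.idle:
                return self.idle.pop()
            if self.opened >= self.max_size:
                # Assumption: the real agent's behavior at the limit is not documented here.
                raise RuntimeError("connection pool exhausted")
            self.opened += 1
            return f"conn-{self.opened}"  # stand-in for a real database connection

    def release(self, conn):
        with self._lock:
            self.idle.append(conn)


# Ten requests, one at a time: the same connection is reused.
sequential = ConnectionPool(max_size=10)
for _ in range(10):
    conn = sequential.acquire()
    sequential.release(conn)

# Ten requests held at the same time: ten connections are opened.
simultaneous = ConnectionPool(max_size=10)
held = [simultaneous.acquire() for _ in range(10)]
```

After this runs, `sequential.opened` is 1 and `simultaneous.opened` is 10, matching the two cases in the text.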

For a session, a reader object holds the connection for as long as it needs to read the data from

the source tables. A writer object holds a connection for as long as it needs to write data to the

target tables.

The PowerCenter Server maintains a database connection pool for stored procedure or lookup databases in a workflow. By default, the PowerCenter Server allows an unlimited number of connections to lookup or stored procedure databases. You can optionally set the MaxLookupSPDBConnections parameter to limit connections when you configure the PowerCenter service. If a database user does not have permission for the number of connections a session requires, the session fails.

For pre-session, post-session, and load stored procedures, consecutive stored procedures reuse

a connection if they have identical connection attributes. Otherwise, the connection for one

stored procedure closes and a new connection begins for the next stored procedure.
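
The reuse rule for consecutive stored procedures can be sketched as follows. The function name, the procedure names, and the representation of connection attributes as tuples are all hypothetical. The sketch only counts how many connections would be opened under the rule as stated.

```python
def count_connections(procedures):
    """Count connections opened for a sequence of stored procedures.

    Each item is a (name, connection_attributes) pair. Consecutive items
    with identical attributes share a connection. Otherwise the current
    connection closes and a new one opens.
    """
    opened = 0
    current = None  # attributes of the connection that is currently open
    for _name, attrs in procedures:
        if attrs != current:
            opened += 1
            current = attrs
    return opened


procs = [
    ("pre_load_a", ("oracle", "dw_user")),
    ("pre_load_b", ("oracle", "dw_user")),    # reuses the first connection
    ("post_load", ("oracle", "audit_user")),  # different attributes: new connection
    ("cleanup", ("oracle", "dw_user")),       # not consecutive with the first two
]
connections_opened = count_connections(procs)
```

Here three connections are opened. The last procedure matches the first two, but it cannot reuse their connection because a different procedure ran in between.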

PowerCenter Metadata Reporter

PowerCenter provides PowerCenter Metadata Reporter, a web-based application that allows you

to run reports against PowerCenter repository metadata. It gives you insight into your repository,

which enhances your ability to analyze and manage your repository efficiently.

The Metadata Reporter provides a number of reports, including reports on transformations,

mapplets, mappings, sources, targets, sessions, worklets, and workflows.


Using the Repository Server Administration Console

Use the Repository Server Administration Console to administer your Repository Servers and

repositories. A Repository Server can manage multiple repositories. You use the Repository

Server Administration Console to create and administer the repository through the Repository

Server.

You can use the Administration Console to perform the following tasks:

 Add, edit, and remove repository configurations.

 Export and import repository configurations.

 Create a repository.

 Promote a local repository to a global repository.

 Copy a repository.

 Delete a repository from the database.

 Back up and restore a repository.

 Start, stop, enable, and disable repositories.

 Send repository notification messages.

 Register and unregister a repository.

 Propagate domain connection information for a repository.

 View repository connections and locks.

 Close repository connections.

 Register and remove repository plug-ins.

 Upgrade a repository.

Repository Objects

You create repository objects using the Repository Manager, Designer, and Workflow Manager

client tools. You can view the following objects in the Navigator window of the Repository

Manager:
 Source definitions. Definitions of database objects (tables, views, synonyms) or files that

provide source data.

 Target definitions. Definitions of database objects or files that contain the target data.

 Multi-dimensional metadata. Target definitions that are configured as cubes and

dimensions.

 Mappings. A set of source and target definitions along with transformations containing

business logic that you build into the transformation. These are the instructions that the

PowerCenter Server uses to transform and move data.

 Reusable transformations. Transformations that you can use in multiple mappings.

 Mapplets. A set of transformations that you can use in multiple mappings.

 Sessions and workflows. Sessions and workflows store information about how and when

the PowerCenter Server moves data. A workflow is a set of instructions that describes how

and when to run tasks related to extracting, transforming, and loading data. A session is a

type of task that you can put in a workflow. Each session corresponds to a single

mapping.

The Design Process

The goal of the design process is to create mappings that depict the flow of data between

sources and targets, including changes made to the data before it reaches the targets. However,

before you can create a mapping, you must first create or import source and target definitions.

You might also want to create reusable objects, such as reusable transformations or mapplets.

For a list of objects you create in the Design process, see Repository Objects.

Perform the following design tasks in the Designer:


1. Import source definitions. Use the Source Analyzer to connect to the sources and import

the source definitions.

2. Create or import target definitions. Use the Warehouse Designer to define relational, flat

file, or XML targets to receive data from sources. You can import target definitions from a

relational database or a flat file, or you can manually create a target definition.

3. Create the target tables. If you add a target definition to the repository that does not exist

in a relational database, you need to create target tables in your target database. You do

this by generating and executing the necessary SQL code within the Warehouse Designer.

4. Design mappings. Once you have source and target definitions in the repository, you can

create mappings in the Mapping Designer. A mapping is a set of source and target

definitions linked by transformation objects that define the rules for data transformation.

A transformation is an object that performs a specific function in a mapping, such as

looking up data or performing aggregation.

5. Create mapping objects. Optionally, you can create reusable objects for use in multiple

mappings. Use the Transformation Developer to create reusable transformations. Use the

Mapplet Designer to create mapplets. A mapplet is a set of transformations that may

contain sources and transformations.

6. Debug mappings. Use the Mapping Designer to debug a valid mapping to gain

troubleshooting information about data and error conditions.

7. Import and export repository objects. You can import and export repository objects, such

as sources, targets, transformations, mapplets, and mappings to archive or share

metadata.
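
To make the idea of a mapping in step 4 concrete, the sketch below models one in plain Python: source rows flow through a lookup step and an aggregation step before reaching the target. Nothing here is PowerCenter syntax. The data, field names, and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical source rows and lookup data (stand-ins for source definitions).
orders = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 80.0},
    {"customer_id": 1, "amount": 30.0},
]
customer_names = {1: "Acme", 2: "Globex"}


def lookup_customer(rows, names):
    """Lookup step: add the customer name to each row."""
    for row in rows:
        yield {**row, "customer": names.get(row["customer_id"], "UNKNOWN")}


def aggregate_by_customer(rows):
    """Aggregation step: total amount per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


# The "mapping": source -> lookup -> aggregation -> target.
target = aggregate_by_customer(lookup_customer(orders, customer_names))
```

In the Designer the same chain would be drawn as linked transformation objects rather than written as function calls.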

Workflow Manager

The Workflow Manager consists of three tools to help you develop a workflow:
 Task Developer. Create tasks you want to accomplish in the workflow in the Task

Developer.

 Workflow Designer. Create a workflow by connecting tasks with links in the Workflow

Designer. You can also create tasks in the Workflow Designer as you develop the

workflow.

 Worklet Designer. Create a worklet in the Worklet Designer. A worklet is an object that

groups a set of tasks. A worklet is similar to a workflow, but without scheduling

information. You can nest multiple worklets inside a workflow.
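
Connecting tasks with links amounts to defining a dependency graph. A minimal sketch of that idea, using Python's standard `graphlib` module (Python 3.9 or later), derives one valid run order from the links. The task names are hypothetical, and the sketch ignores anything a real link might carry beyond "run after".

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks linked into it.
links = {
    "s_load_customers": {"start"},
    "s_load_orders": {"start"},
    "s_build_summary": {"s_load_customers", "s_load_orders"},
    "send_email": {"s_build_summary"},
}

# One order in which every task runs only after the tasks linked into it.
run_order = list(TopologicalSorter(links).static_order())
```

The two load sessions have no link between them, so either may come first. Only their position relative to `start` and `s_build_summary` is fixed.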

Before you create a workflow, you must configure the following connection information:

 PowerCenter Server connection. Register the PowerCenter Server with the repository

before you can start it or create a session to run against it.

 Database connections. Create connections to source and target systems.

 Other connections. If you want to use external loaders or FTP, you configure these

connections in the Workflow Manager.

Workflow Monitor

After you create a workflow, you run the workflow in the Workflow Manager and monitor it in the

Workflow Monitor. The Workflow Monitor is a tool that displays details about workflow runs in

two views, Gantt Chart view and Task view. You can monitor workflows in online and offline

modes.

The Workflow Monitor consists of the following windows:

 Navigator window. Displays monitored repositories, servers, and repository objects.

 Output window. Displays messages from the PowerCenter Server.

 Time window. Displays progress of workflow runs.

 Gantt Chart view. Displays details about workflow runs in chronological format.

 Task view. Displays details about workflow runs in a report format.


Getting Started

Before you can begin using PowerCenter, you must create the environment and perform the

following administration tasks to allow access to the repository and the PowerCenter Server:

1. Configure the sources. If you extract data from relational sources, ask the database

administrator to create user profiles with read access. These user profiles allow you to

import source definitions into the repository and access the sources at runtime.

If you extract data from file sources, the files must be accessible to the PowerCenter Server and

Client machines.

2. Configure the targets. Ask the database administrator to create user profiles with read

and write access. These user profiles allow you to import target definitions into the

repository and write to the targets at runtime.

If the target database does not exist, create it using the database administration tools included

with your RDBMS. After you create the target database, you can use the Designer to design and

create target tables.

For flat file targets, you need a target directory large enough to process the resulting files.

3. Choose globalization settings and data movement modes. The data movement mode

you use depends on whether you want the PowerCenter Server to process single-byte

data or multibyte character data. You select code pages for the repository, PowerCenter

Client and PowerCenter Server.

4. Create repository database. Create a database for the repository. Users accessing the

repository database need full rights in that database. If you upgrade the repository to a

new version, you need database rights to drop or modify these tables.

5. Install the PowerCenter Client. Install the client software on a machine that accesses the

sources, targets, and repository databases, as well as the PowerCenter Server.


6. Install and configure the Repository Server. Install and configure the Repository Server on

a machine that accesses the repository database, the PowerCenter Client, and the

PowerCenter Server.

7. Install and configure the PowerCenter Server. Install the PowerCenter Server on a

Windows or UNIX system that accesses the sources, targets, and the repository database.

8. Configure connectivity. Configure network, native, and ODBC connectivity. Create ODBC

data sources to connect the PowerCenter Clients to the sources and targets. You must

also have network connections between all databases and PowerCenter Servers.

9. Create the repository. After you configure connectivity between source, target, and

repository databases, you can create the metadata repository. Connect to the Repository

Server from within the Repository Server Administration Console to create the metadata

repository. The Repository Server connects to the repository database and runs the SQL

to create the repository tables. All the objects you create with PowerCenter are stored as

metadata in the repository.

10. Create repository users and groups. Create groups and user profiles, then assign

privileges and permissions that determine tasks that users can perform.

11. Register the PowerCenter Server. Before you can start the PowerCenter Server, you must

register the PowerCenter Server so the Workflow Manager can direct the PowerCenter

Server to the repository.

Changing Data Movement Modes

You can change the PowerCenter Server data movement mode in the PowerCenter Server configuration parameters. After you change the data movement mode, the PowerCenter Server runs in the new data movement mode the next time you start the PowerCenter Server. When the data movement mode changes, the PowerCenter Server handles character data differently. To avoid creating data inconsistencies in your target tables, the PowerCenter Server performs additional checks for sessions that reuse session caches and files.

Table 2-1 describes how the PowerCenter Server handles session files and caches after you change the data movement mode:

Table 2-1. Session and File Cache Handling After Data Movement Mode Change

Session File or Cache | Time of Creation or Use | PowerCenter Server Behavior After Data Movement Mode Change

Session Log File (*.log) | Each session. | No change in behavior. Creates a new session log for each session using the PowerCenter Server code page.

Workflow Log | Each workflow. | No change in behavior. Creates a new workflow log file for each workflow using the PowerCenter Server code page.

Reject File (*.bad) | Each session. | No change in behavior. Appends rejected data to the existing reject file using the PowerCenter Server code page.

Output File (*.out) | Sessions writing to flat file. | No change in behavior for delimited flat files. Creates a new output file for each session using the target code page.

Indicator File (*.in) | Sessions writing to flat file. | No change in behavior. Creates a new indicator file for each session.

Incremental Aggregation Files (*.idx, *.dat) | Sessions with Incremental Aggregation enabled. | When files are removed or deleted, the PowerCenter Server creates new files. When files are not removed or deleted, the PowerCenter Server fails the session with the following error message: TE_7038 Aggregate Error: ServerMode: [server data movement mode] and CachedMode: [data movement mode that created the files] mismatch. You should also remove or delete files created using a different code page.

Unnamed Persistent Lookup Files (*.idx, *.dat) | Sessions with a Lookup transformation configured for a named persistent lookup cache. | Rebuilds the persistent lookup cache.

Named Persistent Lookup Files (*.idx, *.dat) | Sessions with a Lookup transformation configured for a persistent lookup cache. | Fails the session.
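
The incremental aggregation row of Table 2-1 describes a small decision rule, sketched below. The function and exception names are hypothetical, and the mode names passed in the usage lines ("ASCII" and "Unicode") are assumed labels for the two data movement modes. Only the TE_7038 message text is taken from the table.

```python
class SessionFailed(Exception):
    """Raised when the modelled session cannot reuse its cache files."""


def check_aggregation_files(server_mode, cached_mode):
    """Model the incremental aggregation row of Table 2-1.

    cached_mode is the data movement mode that created the existing
    *.idx and *.dat files, or None when the files were removed or deleted.
    """
    if cached_mode is None:
        return "create new files"
    if cached_mode != server_mode:
        raise SessionFailed(
            f"TE_7038 Aggregate Error: ServerMode: [{server_mode}] and "
            f"CachedMode: [{cached_mode}] mismatch."
        )
    return "reuse existing files"


after_cleanup = check_aggregation_files("Unicode", None)
unchanged_mode = check_aggregation_files("ASCII", "ASCII")
```

Calling `check_aggregation_files("Unicode", "ASCII")` raises `SessionFailed`, which corresponds to the case where the old files were left in place after the mode change.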

Code Page Overview

A code page contains the encoding to specify characters in a set of one or more

languages. An encoding is the assignment of a number to a character in the character set. You
use code pages to identify data that might be in different languages. For example, if you are

importing Japanese data into a mapping, you must select a Japanese code page for the source

data.

When you choose a code page, the program or application for which you set the

code page refers to a specific set of data that describes the characters the application recognizes.

This influences the way that application stores, receives, and sends character data.
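
A short illustration of "an encoding is the assignment of a number to a character", using Python's built-in codecs. The codec names ("latin-1", "utf-8") are Python's, chosen for illustration, and are not necessarily the code page names PowerCenter presents.

```python
text = "é"

# The same character is stored as different byte values under different code pages.
latin1_bytes = text.encode("latin-1")   # one byte: 0xE9
utf8_bytes = text.encode("utf-8")       # two bytes: 0xC3 0xA9

# Reading bytes with the wrong code page yields the wrong characters.
misread = utf8_bytes.decode("latin-1")

# A Western European code page has no numbers assigned to Japanese characters.
try:
    "日本".encode("latin-1")
    japanese_fits_latin1 = True
except UnicodeEncodeError:
    japanese_fits_latin1 = False
```

The last step is why Japanese source data needs a code page that actually contains Japanese characters.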

Table 2-2. Code Page Compatibility

Component Code Page | Code Page Compatibility

Source (including relational, flat file, and XML file) | Subset of target. Subset of PowerCenter Server.

Target (including relational, XML files, and flat files) | Superset of source. Superset of PowerCenter Server. PowerCenter Server creates external loader data and control files using the target flat file code page.

Lookup and Stored Procedures | Compatible with PowerCenter Server and repository.

PowerCenter Server | Superset of source. Subset of target. Identical to PowerCenter Server operating system and machine hosting pmcmd. Compatible with repository and PowerCenter Client. Compatible with database connection code page used by Lookup and Stored Procedure transformations.

Repository Server | Compatible with repository. Compatible with PowerCenter Client and PowerCenter Server.

Global Repository | Compatible with local repository. Can also be a subset of local repository. Compatible with PowerCenter Client and Server.

Local Repository | Compatible with global repository. Can also be a superset of global repository. Compatible with PowerCenter Client and Server.

Standalone Repository | Compatible with PowerCenter Client and Server.

PowerCenter Client | Compatible with PowerCenter Server and repository.

Machine hosting pmcmd | Identical to PowerCenter Server.
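
The subset and superset relationships in Table 2-2 can be tested mechanically for simple cases. The sketch below assumes that "A is a subset of B" means every character defined in A can also be encoded in B. The helper `is_subset` is hypothetical, it only handles a single-byte code page on the A side, and the codec names are Python's rather than PowerCenter's.

```python
def is_subset(code_page_a, code_page_b):
    """Return True if every character defined in single-byte code page A
    can also be encoded in code page B."""
    for value in range(256):
        try:
            char = bytes([value]).decode(code_page_a)
        except UnicodeDecodeError:
            continue  # this byte value is not defined in code page A
        try:
            char.encode(code_page_b)
        except UnicodeEncodeError:
            return False  # A contains a character that B cannot represent
    return True


ascii_in_latin1 = is_subset("ascii", "latin-1")
latin1_in_ascii = is_subset("latin-1", "ascii")
cp1252_in_utf8 = is_subset("cp1252", "utf-8")
```

Under this definition 7-bit ASCII is a subset of Latin-1 but not the reverse, which is the kind of direction the source and target rows of the table care about.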

PowerCenter Server Variable Directories

The installation program creates the following directories under the installation directory to store

session files and caches associated with each PowerCenter Server:

 BadFiles

 Cache

 ExtProc

 LkpFiles

 SessLogs

 SrcFiles

 Temp

 TgtFiles

 WorkflowLogs
All workflows use these directories by default.

Server Variables

You can define server variables for each PowerCenter Server you register. Server variables define

the path and directories for session and workflow output files and caches. You can also use

server variables to define workflow properties, such as the number of workflow logs to archive.

The installation process creates default directories in the location where you install the

PowerCenter Server. By default, the PowerCenter Server writes output files in these directories

when you run a workflow. To use these directories as the default location for the session and

workflow output files, you must configure the server variable $PMRootDir to define the path to

the directories.

Sessions and workflows are configured to use server directories by default. You can override the

default by entering different directories in the session or workflow properties.

For example, you might have a PowerCenter Server running all workflows in a repository. If you

define the server variable for workflow logs directory as c:\pmserver\workflowlog, the

PowerCenter Server saves the workflow log for each workflow in c:\pmserver\workflowlog by

default.

If you change the default server directories, make sure the designated directories exist before

running a workflow. If the PowerCenter Server cannot resolve a directory during the workflow, it

cannot run the workflow.

By using server variables instead of hard-coding directories and parameters, you simplify the

process of changing the PowerCenter Server that runs a workflow. If each workflow in a

development folder uses server variables, then when you copy the folder to a production

repository, the production server can run the workflow as configured. When the production

server runs the workflow, it uses the directories configured for its server variables. If you instead

configure workflows with hard-coded directories, the workflows fail when those directories do not exist

on the production server.
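
The portability argument above can be illustrated with a small resolver. The variable names and their default values follow Table 11-1. The `resolve` function, the root directory paths, and the session log file name are hypothetical, and this is only a model of the substitution idea, not how the PowerCenter Server implements it.

```python
# Defaults as listed in Table 11-1 (a subset, for brevity).
DEFAULTS = {
    "$PMSessionLogDir": "$PMRootDir/SessLogs",
    "$PMBadFileDir": "$PMRootDir/BadFiles",
    "$PMCacheDir": "$PMRootDir/Cache",
    "$PMWorkflowLogDir": "$PMRootDir/WorkflowLogs",
}


def resolve(path, variables):
    """Expand server variables in path until none remain."""
    for _ in range(10):  # small bound guards against circular definitions
        expanded = path
        # Replace longer names first so one name cannot clobber a longer one.
        for name in sorted(variables, key=len, reverse=True):
            expanded = expanded.replace(name, variables[name])
        if expanded == path:
            return path
        path = expanded
    raise ValueError("server variables appear to be circular")


# Hypothetical root directories for two servers.
dev = {**DEFAULTS, "$PMRootDir": "/opt/pmserver_dev"}
prod = {**DEFAULTS, "$PMRootDir": "/data/pmserver"}

dev_log = resolve("$PMSessionLogDir/s_load_orders.log", dev)
prod_log = resolve("$PMSessionLogDir/s_load_orders.log", prod)
```

The same session setting resolves to a different physical path on each server, so the workflow itself needs no change when the folder is copied.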

Table 11-1 lists the server variables you configure when you register a PowerCenter Server:

Table 11-1. Server Variables

Server Variable | Required/Optional | Description

$PMRootDir | Required | A root directory to be used by any or all other server variables. Informatica recommends you use the PowerCenter Server installation directory as the root directory.

$PMSessionLogDir | Required | Default directory for session logs. Defaults to $PMRootDir/SessLogs.

$PMBadFileDir | Required | Default directory for reject files. Defaults to $PMRootDir/BadFiles.

$PMCacheDir | Required | Default directory for the lookup cache, index and data caches, and index and data files. To avoid performance problems, always use a drive local to the PowerCenter Server for the cache directory. Do not use a mapped or mounted drive for cache files. Defaults to $PMRootDir/Cache.

$PMTargetFileDir | Required | Default directory for target files. Defaults to $PMRootDir/TgtFiles.

$PMSourceFileDir | Required | Default directory for source files. Defaults to $PMRootDir/SrcFiles.

$PMExtProcDir | Required | Default directory for external procedures. Defaults to $PMRootDir/ExtProc.

$PMTempDir | Required | Default directory for temporary files. Defaults to $PMRootDir/Temp.

$PMSuccessEmailUser | Optional | Email address to receive post-session email when the session completes successfully. Use to address post-session email.

$PMFailureEmailUser | Optional | Email address to receive post-session email when the session fails. Use to address post-session email. Default is an empty string. For details, see “Sending Emails” in the Workflow Administration Guide.

$PMSessionLogCount | Optional | Number of session logs the PowerCenter Server archives for the session. Defaults to 0. Use to archive session logs. For details, see “Log Files” in the Workflow Administration Guide.

$PMSessionErrorThreshold | Optional | Number of non-fatal errors the PowerCenter Server allows before failing the session. Non-fatal errors include reader, writer, and DTM errors. If you want to stop the session on errors, enter the number of non-fatal errors you want to allow before stopping the session. The PowerCenter Server maintains an independent error count for each source, target, and transformation. Use to configure the Stop On option in the session properties. Defaults to 0. If you use the default setting, non-fatal errors do not cause the session to stop.

$PMWorkflowLogDir | Required | Default directory for workflow logs. Defaults to $PMRootDir/WorkflowLogs.

$PMWorkflowLogCount | Optional | Number of workflow logs the PowerCenter Server archives for the workflow. Use to archive workflow logs. Defaults to 0.

$PMLookupFileDir | Optional | Default directory for lookup files. Defaults to $PMRootDir/LkpFiles.
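
The $PMSessionErrorThreshold description combines two rules: a threshold of 0 never stops the session, and error counts are kept independently per source, target, and transformation. A minimal sketch of that logic follows. The function name and component names are hypothetical, and stopping exactly when a count reaches the threshold (rather than when it exceeds it) is an assumption.

```python
from collections import Counter


def first_component_to_stop(errors, threshold):
    """Return the component whose own error count reaches threshold first.

    errors is a sequence of component names, one entry per non-fatal error.
    Returns None if the session would not stop on errors.
    """
    if threshold == 0:
        return None  # default setting: non-fatal errors never stop the session
    counts = Counter()
    for component in errors:
        counts[component] += 1  # independent count per component
        if counts[component] >= threshold:
            return component
    return None


errors = ["reader", "writer", "reader", "writer", "reader"]
stopped_by = first_component_to_stop(errors, threshold=3)
never_stops = first_component_to_stop(errors, threshold=0)
```

Five errors occur in total, but the session only stops once a single component (here the reader) accumulates three of its own.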
