What is Teradata?
Teradata is a relational database management system (RDBMS).
That is:
An open system, running on a UNIX MP-RAS or Windows 2000 server platform.
It acts as a database server to client applications.
Teradata Advantages
Mature Optimizer - complex queries, many joins per query, ad-hoc processing.
Model the Business - 3NF, robust view processing, star schema support.
Lowest TCO (total cost of ownership) - ease of setup and maintenance, robust
parallel utilities.
High Availability - no single point of failure; scalable, parallel data loading
utilities.
Teradata History
TERADATA COMPONENTS
PE (Parsing Engine) Architecture
BYNET Features:
The AMP is called the heart of Teradata, and every AMP has its
own virtual disk (VDISK).
FAULT TOLERANCE:
Fallback:
A fallback table keeps a duplicate copy of each row of a primary table. Each
fallback row is stored on an AMP different from the one to which the primary
row hashes. This reduces the likelihood of data loss, since losing a row
requires the simultaneous loss of both AMPs or their associated disk storage.
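As a sketch, fallback protection is requested in the table DDL (the table and
column names here are illustrative, not from the source):

```sql
-- Hypothetical table; FALLBACK requests a duplicate copy of every row
-- on a different AMP in the same cluster.
CREATE TABLE retail.employee_fb, FALLBACK
(
  empno   INTEGER,
  name    VARCHAR(30),
  salary  DECIMAL(10,2)
)
UNIQUE PRIMARY INDEX (empno);
```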
AMP Clusters:
Cliques
RAID 1 (Mirroring)
Each physical disk in the array has an exact copy in the same array.
The array controller can read from either disk and write to both.
When one disk of the pair fails, there is no change in performance.
Mirroring reduces available disk space by 50%.
The array controller reconstructs failed disks quickly.
In short: good performance with disk failures, at a higher cost in terms of
disk space.
RAID 5 (Parity)
TEMPORARY TABLES:
Global Temporary Tables:
Global temporary tables are materialized only for the duration of the
SQL session in which they are used.
The contents of these tables are private to the session, and the system
automatically drops the materialized table at the end of that session.
However, the system saves the global temporary table definition
permanently in the data dictionary. In addition, global temporary tables
allow the database administrator to define a template in the schema,
which a user can reference for their exclusive use during a session.
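As a sketch, a global temporary table definition (the table and column names
are illustrative):

```sql
-- The definition is stored permanently in the data dictionary;
-- each session gets its own private instance, dropped at session end.
CREATE GLOBAL TEMPORARY TABLE retail.gt_sales
(
  item_id INTEGER,
  qty     INTEGER
)
ON COMMIT PRESERVE ROWS;
```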
Volatile Temporary Tables:
A volatile temporary table resides in memory but does not survive across
a system restart.
If a user needs a temporary table for a single use only, they
should define a volatile temporary table.
Using volatile temporary tables improves performance even more than
using global temporary tables, since the system does not store the
definitions of volatile temporary tables in the data dictionary. Moreover,
users require no privilege to access volatile temporary tables.
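As a sketch, a volatile table definition (names are illustrative; note that a
volatile table is created in the user's own space, with no database qualifier):

```sql
-- No data dictionary entry; visible only to the creating session
-- and discarded at logoff or system restart.
CREATE VOLATILE TABLE vt_sales
(
  item_id INTEGER,
  qty     INTEGER
)
ON COMMIT PRESERVE ROWS;
```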
FASTEXPORT
.LOGTABLE loginfotabl1;
.END EXPORT;
The END EXPORT command signifies the end of an export task and
initiates processing by the Teradata Database.
The EXPORT command provides the client system destination and file format
specifications for the export data retrieved from the Teradata Database and,
optionally, generates a MultiLoad script file that you can use to reload the
export data.
BLOCKSIZE - the maximum block size that should be used when returning data
to the client. The default block size is 64K bytes, which is the maximum
supported by Teradata Database V2R3.0 and later.
Note: The FastExport utility does not support field mode. To export field
mode data, use the appropriate format clauses in your SELECT statements
to enable the Teradata Database to convert response data to character form.
SCRIPT:-
.LOGTABLE ismail.fexp_logtable;
.LOGON 127.0.0.1/dbc, dbc;
.BEGIN EXPORT
SESSIONS 6;
.export outfile "C:\Documents and Settings\Anay\Desktop\test11.txt"
mode record
format text;
select
cast
(
coalesce(trim(empno),'')||'|'||
coalesce(trim(name),'')||'|'||
coalesce(trim(phone),'')||'|'||
coalesce(trim(salary),'') as char(200))
from retail.employee;
.end export;
.logoff;
.quit;
.logtable retail.chandrika_log;
.logon 127.0.0.1/dbc,dbc;
.begin export
sessions 3;
.export outfile "C:\Documents and Settings\Anay\Desktop\test1.txt"
mode record
format text;
select
cast
(
coalesce(trim(l_orderkey),'')||'|'||
coalesce(trim(l_partkey),'')||'|'||
coalesce(trim(l_suppkey),'')||'|'||
coalesce(trim(l_comment),'') as char(100))
from retail.item sample 100;
.end export;
.logoff;
.quit;
FASTLOAD
1). FastLoad is a batch-mode, command-driven utility that uses multiple
sessions to quickly transfer large amounts of data from a client-based
application (flat files) to the Teradata Database.
2). FastLoad can load data into only one empty table.
3). The target table should not contain secondary indexes, join indexes, or
transient journals. By avoiding SI, JI, and TJ overhead, FastLoad is the
fastest of the load utilities.
4). It can perform only INSERT operations.
5). Even if the target table is a NUPI multiset table, FastLoad will not allow
duplicate records. It is fully automatically restartable and checkpoint
configurable.
6). We can run up to 15 FastLoad jobs concurrently.
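The points above can be sketched as a minimal FastLoad job (the host, table,
column, and file names are illustrative, not from the source):

```sql
sessions 4;
logon 127.0.0.1/dbc,dbc;
/* describe the pipe-delimited input record layout */
set record vartext "|";
define empno (varchar(10)),
       name  (varchar(30))
file = C:\data\emp.txt;
/* begin loading the single empty target table; two error tables are required */
begin loading retail.employee2
   errorfiles retail.emp_err1, retail.emp_err2
   checkpoint 50000;
/* FastLoad supports only INSERT */
insert into retail.employee2 values (:empno, :name);
/* END LOADING triggers Phase 2 (sort and write) */
end loading;
logoff;
```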
Describe the two phases of FastLoad.
Phase 1
FastLoad uses one SQL session to define AMP steps.
The PE sends data blocks to the AMPs, which store blocks of unsorted data
records.
AMPs hash each record and redistribute it to the AMP responsible for
the hash value.
At the end of Phase 1, each AMP has the rows it should have, but the
rows are not in row hash sequence.
Phase 2
Triggered by the END LOADING command, each AMP sorts its rows into row hash
sequence and writes them to the target table on disk.
Infrastructure of FASTLOAD
1. LOGTABLE
In FastLoad, the log table is created by default in sysadmin.
SYNTAX: select * from sysadmin.fastlog;
2. ERROR TABLES - 2
Error table 1: conversion errors, constraint errors, and down-AMP errors.
Error table 2: unique violation errors.
Example1 (end of a FastLoad script):
,:in_Lname
,:in_Fname
,:in_SocSec );
END LOADING;
LOGOFF;
(END LOADING starts Phase 2; if it is omitted, the utility will pause.)
Converting the Data
Checkpoints:
a. Checkpoint entries are posted to the restart table (log table) at regular
intervals during FastLoad data transfer (checkpoint information
is stored in the log table).
b. If processing stops while the FastLoad job is running, you can restart
from the most recent checkpoint.
Ex: You have 1,000,000 records and declared a checkpoint of 50,000. Each
time 50,000 records complete successfully, FastLoad posts an entry to the
restart table. If your FastLoad job stops at record 160,000, then when you
restart the same FastLoad job it starts from record 150,001.
1). To see error records (they are stored in the error tables), we first
need to issue .end loading;
2). In FastLoad we can load multiple sources (up to 5 sources) into a single
target table.
MULTILOAD
Preliminary Phase
Start Multiload sessions
Create temporary tables.
Apply utility locks (disallow DDLs (create, drop, alter)) on the tables.
DML Transaction Phase
Send DML requests, USING clause and Match-Tags to server.
Acquisition Phase
The client sends data buffers to the server.
AMPs receive the data blocks and redistribute (by row hash) each data
record to its target AMP.
Data rows are accumulated in the work tables of the corresponding target
tables.
At the end of the acquisition phase, MultiLoad puts write locks on the
target tables.
Application Phase
Changes accumulated in the work tables are applied to the target tables.
Clean-up Phase
Release locks
Logoff sessions.
Target Table
1) Each IMPORT task can access up to 5 tables in the Teradata Database.
2) To perform an import task, you must have the appropriate access
permissions (INSERT, DELETE, and UPDATE) on each target table.
3) Target tables can be empty (but need not be empty).
4) Target tables can have NUSIs, but cannot have unique secondary indexes.
5) Each IMPORT task command need not access the same target table.
.LOGTABLE RETAIL.MLOAD_LOG1;
.logon 127.0.0.1/dbc, dbc;
.BEGIN IMPORT MLOAD TABLES RETAIL.ITEM3
ERRORTABLES RETAIL.MLOADERR1
RETAIL.MLOADUV
WORKTABLES RETAIL.WT
SESSIONS 8
ERRLIMIT 20
CHECKPOINT 400; /* up to this line: initialization phase */
MULTILOAD DELETE:
.LOGTABLE RETAIL.M_log;
.logon 127.0.0.1/dbc,dbc;
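The DELETE task body is truncated above; a minimal sketch of how such a task
typically continues (the table name and condition are illustrative, reusing
RETAIL.ITEM3 from the earlier script):

```sql
/* A MultiLoad DELETE task scans data blocks directly and deletes
   matching rows without using the transient journal. */
.BEGIN DELETE MLOAD TABLES RETAIL.ITEM3;
DELETE FROM RETAIL.ITEM3 WHERE l_orderkey > 100000;
.END MLOAD;
.LOGOFF;
```

Note that a MultiLoad DELETE task may not use an equality condition on the
unique primary index.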
T-pump
Teradata parallel data pump provides an excellent utility for low volume batch
maintenance of large TD databases.
T-Pump is a data loading utility that helps you maintain (UPDATE, DELETE,
INSERT, and automatic upserts) the data in your TD RDBMS.
Allows near real-time updates from transactional systems into the warehouse.
Performs INSERT, UPDATE, and DELETE operations, or a combination, from the
same source.
Up to 63 DML statements can be included for one IMPORT task.
Alternative to MultiLoad for low-volume batch maintenance of large tables.
Allows target tables to: have secondary indexes, join indexes, hash indexes,
and Referential Integrity; be MULTISET or SET; be populated or empty.
Allows conditional processing (via APPLY in the .IMPORT statement).
Supports automatic restarts; uses Support Environment.
No session limit: use as many sessions as necessary.
No limit to the number of concurrent instances.
Uses row-hash locks, allowing concurrent updates on the same table.
Can always be stopped and locks dropped with no ill effect.
Designed for highest possible throughput. The user can specify how many
updates occur minute by minute; this can be changed as the job runs.
Pack: If we specify pack 500 then 500 records are put in a single macro for insertion.
NOTE: Max value for pack is 600.
Robust: Whenever a macro completes, TPump notes a robust (restart) point in
the log table. If TPump restarts after a failure, it resumes from the most
recent robust point, or from the closest general checkpoint.
ROBUST ON/OFF;
Serialize: When we specify SERIALIZE ON, TPump ensures that records with the
same key from the flat file are processed in order and placed in the same
macro.
SERIALIZE ON/OFF;
TPump always takes its checkpoint value in minutes, not more than 60 minutes.
TPump requires 1 log table and 1 error table.
SAMPLE SCRIPT:
.LOGTABLE RETAIL.TPUMP_LOG;
.LOGON 192.168.1.6/DBC, DBC;
.BEGIN LOAD
ERRORTABLE RETAIL.ERRTABLE
PACK 100
RATE 100
ROBUST ON
SERIALIZE ON
SESSIONS 6
TENACITY 4
CHECKPOINT 50;
.LAYOUT LAY1;
.FIELD l_orderkey * VARCHAR(10) KEY;
.FIELD l_partkey * VARCHAR(44);
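The sample script ends mid-layout; a hedged sketch of how such a TPump script
is typically completed (the DML label, file path, and column list are
illustrative, not from the source):

```sql
/* name the DML to apply to each input record */
.DML LABEL INS1;
INSERT INTO RETAIL.ITEM3 (l_orderkey, l_partkey)
VALUES (:l_orderkey, :l_partkey);
/* read the input file through the layout and apply the labeled DML */
.IMPORT INFILE "C:\data\item.txt"
   FORMAT VARTEXT "|"
   LAYOUT LAY1
   APPLY INS1;
.END LOAD;
.LOGOFF;
```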
jobscript.btq :
.LOGON tpd1/user1,passwd1
.IMPORT DATA FILE = datafil3.dat
.QUIET ON
.REPEAT *
USING in_CustNo (INTEGER)
    , in_SocSec (INTEGER)
UPDATE Customer
SET Social_Security = :in_SocSec
WHERE Customer_Number = :in_CustNo;
.QUIT;