Está en la página 1de 21

Outline

n n n n n n n

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Distributed Transaction Management
Data server approach Parallel architectures Parallel DBMS techniques Parallel execution models

o o o o

Parallel Database Systems Distributed Object DBMS Database Interoperability Concluding Remarks
1998 M. Tamer zsu & Patrick Valduriez Page 13.1

Distributed DBMS

The Database Problem


n

Large volume of data use disk and large main memory I/O bottleneck (or memory access bottleneck)
Speed(disk) << speed(RAM) << speed(microprocessor)

Predictions
(Micro-) processor speed growth : 50 % per year DRAM capacity growth : 4 every three years Disk throughput : 2 in the last ten years

Conclusion : the I/O bottleneck worsens

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.2

Page 1

The Solution
n

Increase the I/O bandwidth


Data partitioning Parallel data access

Origins (1980's): database machines


Hardware-oriented bad cost-performance failure Notable exception : ICL's CAFS Intelligent Search Processor

1990's: same solution but using standard hardware components integrated in a multiprocessor
Software-oriented Standard essential to exploit continuing technology

improvements

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.3

Multiprocessor Objectives
n n

High-performance with better cost-performance than mainframe or vector supercomputer Use many nodes, each with good costperformance, communicating through network
Good cost via high-volume components Good performance via bandwidth

Trends
Microprocessor and memory (DRAM): off-the-shelf Network (multiprocessor edge): custom

The real chalenge is to parallelize applications to run with good load balancing

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.4

Page 2

Data Server Architecture


Client

client interface

Application server

query parsing data server interface

communication channel Data application server interface database functions server

database
Distributed DBMS 1998 M. Tamer zsu & Patrick Valduriez Page 13.5

Objectives of Data Servers


Avoid the shortcomings of the traditional DBMS approach
Centralization of data and application management General-purpose OS (not DB-oriented)

By separating the functions between


Application server (or host computer) Data server (or database computer or back-end computer)

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.6

Page 3

Data Server Approach: Assessment


n

Advantages
Integrated data control by the server (black box) Increased performance by dedicated system Can better exploit parallelism Fits well in distributed environments

Potential problems
Communication overhead between application and data

server
u

High-level interface

High cost with mainframe servers

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.7

Parallel Data Processing


n

Three ways of exploiting high-performance multiprocessor systems:


Automatically detect parallelism in sequential programs

(e.g., Fortran, OPS5)

Augment an existing language with parallel constructs

(e.g., C*, Fortran90) Offer a new language in which parallelism can be expressed or automatically inferred n

Critique
Hard to develop parallelizing compilers, limited resulting

speed-up

Enables the programmer to express parallel computations

but too low-level

Can combine the advantages of both (1) and (2)

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.8

Page 4

Data-based Parallelism
n Inter-operation p operations of the same query in parallel

op.3 op.1 op.2

n Intra-operation the same operation in parallel on different data partitions

op. R
Distributed DBMS

op. R1

op. R2

op. R2

op. R4
Page 13.9

1998 M. Tamer zsu & Patrick Valduriez

Parallel DBMS
n

Loose definition: a DBMS implemented on a tighly coupled multiprocessor Alternative extremes


Straighforward porting of relational DBMS (the software

vendor edge)

New hardware/software combination (the computer

manufacturer edge)

Naturally extends to distributed databases with one server per site

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.10

Page 5

Parallel DBMS - Objectives


n

Much better cost / performance than mainframe solution High-performance through parallelism
High throughput with inter-query parallelism Low response time with intra-operation parallelism

High availability and reliability by exploiting data replication Extensibility with the ideal goals
Linear speed-up Linear scale-up

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.11

Linear Speed-up
Linear increase in performance for a constant DB size and proportional increase of the system components (processor, memory, disk)

ideal new perf. old perf.

components
Distributed DBMS 1998 M. Tamer zsu & Patrick Valduriez Page 13.12

Page 6

Linear Scale-up
Sustained performance for a linear increase of database size and proportional increase of the system components.

new perf. old perf.

ideal

components + database size


Distributed DBMS 1998 M. Tamer zsu & Patrick Valduriez Page 13.13

Barriers to Parallelism
n

Startup
The time needed to start a parallel operation may

dominate the actual computation time

Interference
When accessing shared resources, each new process slows

down the others (hot spot problem)

Skew
The response time of a set of parallel processes is the time

of the slowest one

Parallel data management techniques intend to overcome these barriers

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.14

Page 7

Parallel DBMS Functional Architecture


User task 1 Session Mgr RM task 1 Request Mgr RM task n User task n

DM task 11

DM task 12

DM Data Mgr task n2

DM task n1

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.15

Parallel DBMS Functions


n

Session manager
Host interface Transaction monitoring for OLTP

Request manager
Compilation and optimization Data directory management Semantic data control Execution control

Data manager
Execution of DB operations Transaction management support Data management

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.16

Page 8

Parallel System Architectures


n

Multiprocessor architecture alternatives


Shared memory (shared everything) Shared disk Shared nothing (message-passing)

Hybrid architectures
Hierarchical (cluster) Non-Uniform Memory Architecture (NUMA)

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.17

Shared-Memory Architecture
P1 Pn

interconnect

Global Memory

Examples: DBMS on symmetric multiprocessors (Sequent, Encore, Sun, etc.)

Simplicity, load balancing, fast communication Network cost, low extensibility


1998 M. Tamer zsu & Patrick Valduriez Page 13.18

Distributed DBMS

Page 9

Shared-Disk Architecture
P1 M1 Pn Mn

interconnect

Examples : DEC's VAXcluster, IBM's IMS/VS Data Sharing


network cost, extensibility, migration from uniprocessor complexity, potential performance problem for copy coherency
1998 M. Tamer zsu & Patrick Valduriez Page 13.19

Distributed DBMS

Shared-Nothing Architecture
interconnect

P1 D1 M1

Pn Dn Mn

Examples : Teradata (NCR), NonStopSQL (TandemCompaq), Gamma (U. of Wisconsin), Bubba (MCC)
Extensibility, availability Complexity, difficult load balancing
1998 M. Tamer zsu & Patrick Valduriez

Distributed DBMS

Page 13.20

Page 10

Hierarchical Architecture
P1 Pn P1 Pn

interconnect

interconnect

Global Memory

Global Memory

n n

Combines good load balancing of SM with extensibility of SN Alternatives


Limited number of large nodes, e.g., 4 x 16 processor

nodes High number of small nodes, e.g., 16 x 4 processor nodes, has much better cost-performance (can be a cluster of workstations)
Distributed DBMS 1998 M. Tamer zsu & Patrick Valduriez Page 13.21

Shared-Memory vs. Distributed Memory


n

Mixes two different aspects : addressing and memory


Addressing

Single address space : Sequent, Encore, KSR Multiple address spaces : Intel, Ncube Physical memory u Central : Sequent, Encore u Distributed : Intel, Ncube, KSR
u u

NUMA : single address space on distributed physical memory


Eases application portability Extensibility

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.22

Page 11

NUMA Architectures
n n

Cache Coherent NUMA (CC-NUMA)


statically divide the main memory among the nodes

Cache Only Memory Architecture (COMA)


convert the per-node memory into a large cache of the

shared address space

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.23

COMA Architecture
Disk Disk Disk

P1

P2

Pn

Cache Memory

Cache Memory

Cache Memory

Hardware shared virtual memory

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.24

Page 12

Parallel DBMS Techniques


n

Data placement
Physical placement of the DB onto multiple nodes Static vs. Dynamic

Parallel data processing


Select is easy Join (and all other non-select operations) is more difficult

Parallel query optimization


Choice of the best parallel execution plans Automatic parallelization of the queries and load balancing

Transaction management
Similar to distributed transaction management

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.25

Data Partitioning
n

Each relation is divided in n partitions (subrelations), where n is a function of relation size and access frequency Implementation
Round-robin
u u

Maps i-th element to node i mod n Simple but only exact-match queries Supports range queries but large index Only exact-match queries but small index

B-tree index
u

Hash function
u

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.26

Page 13

Partitioning Schemes
Hashing

Round-Robin

a-g

h-m

u-z

Interval
Distributed DBMS 1998 M. Tamer zsu & Patrick Valduriez Page 13.27

Replicated Data Partitioning


n

High-availability requires data replication


simple solution is mirrored disks
u

hurts load balancing when one node fails interleaved partitioning (Teradata) chained partitioning (Gamma)

more elaborate solutions achieve load balancing


u u

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.28

Page 14

Interleaved Partitioning
Node Primary copy Backup copy r 2.3 r 3.2 r 3.2 1 R1 2 R2 r 1.1 3 R3 r 1.2 r 2.1 4 R4 r 1.3 r 2.2 r 3.1

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.29

Chained Partitioning

Node Primary copy Backup copy

1 R1 r4

2 R2 r1

3 R3 r2

4 R4 r3

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.30

Page 15

Placement Directory
n

Performs two functions


F1 (relname, placement attval) = lognode-id F2 (lognode-id) = phynode-id

In either case, the data structure for f1 and f2 should be available when needed at each node

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.31

Join Processing
n

Three basic algorithms for intra-operator parallelism


Parallel nested loop join: no special assumption Parallel associative join: one relation is declustered on join

attribute and equi-join

Parallel hash join: equi-join

They also apply to other complex operators such as duplicate elimination, union, intersection, etc. with minor adaptation

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.32

Page 16

Parallel Nested Loop Join


node 1 R1:
send partition

node 2 R2:

S1 node 3 node 4

S2

R
Distributed DBMS

S i=1,n(R

Si)
Page 13.33

1998 M. Tamer zsu & Patrick Valduriez

Parallel Associative Join


node 1 R1: R2: node 2

S1 node 3 node 4

S2

R
Distributed DBMS

S i=1,n(Ri

Si)
Page 13.34

1998 M. Tamer zsu & Patrick Valduriez

Page 17

Parallel Hash Join


node R1: R2: node S1: node S2: node

node 1

node 2

R
Distributed DBMS

S i=1,P(Ri

Si)
Page 13.35

1998 M. Tamer zsu & Patrick Valduriez

Parallel Query Optimization


The objective is to select the "best" parallel execution plan for a query using the following components Search space
Models alternative execution plans as operator trees Left-deep vs. Right-deep vs. Bushy trees

Search strategy
Dynamic programming for small search space Randomized for large search space

Cost model (abstraction of execution system)


Physical schema info. (partitioning, indexes, etc.) Statistics and cost functions

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.36

Page 18

Execution Plans as Operators Trees


Result j3 j2 R3 R2 R4 R4 R3 R1 Result j6 j5 j4 R2

Left-deep
j1 R1

Right-deep

Result Result j9 j12 j8 j7 R1 R2 R3 R1 j10 R2 R3 j11 R4

Zig-zag

R4

Bushy

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.37

Equivalent Hash-Join Trees with Different Scheduling


Build3 Temp2 R4 Build2 Temp1 R3 Build1 Probe1 R3 Build1 Probe1 Probe2 R4 Build2 Probe2 Temp1 Probe3 Build3 Build3 Probe3 Temp2

R1
Distributed DBMS

R2
1998 M. Tamer zsu & Patrick Valduriez

R1

R2
Page 13.38

Page 19

Load Balancing
n

Problems arise for intra-operator parallelism with skewed data distributions


attribute data skew (AVS) tuple placement skew (TPS) selectivity skew (SS) redistribution skew (RS) join product skew (JPS)

Solutions
sophisticated parallel algorithms that deal with skew dynamic processor allocation (at execution time)

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.39

Data Skew Example


JPS
Res1

JPS
Res2

AVS/TPS
S2

Join1

Join2 S1

AVS/TPS

RS/SS AVS/TPS
R2 Scan1

RS/SS
Scan2 R1

AVS/TPS

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.40

Page 20

Some Parallel DBMSs


n

Prototypes
EDS and DBS3 (ESPRIT) Gamma (U. of Wisconsin) Bubba (MCC, Austin, Texas) XPRS (U. of Berkeley) GRACE (U. of Tokyo)

Products
Teradata (NCR) NonStopSQL (Tandem-Compac) DB2 (IBM), Oracle, Informix, Ingres, Navigator

(Sybase) ...

Distributed DBMS

1998 M. Tamer zsu & Patrick Valduriez

Page 13.41

Open Research Problems


n n n n n n n

Hybrid architectures OS support:using micro-kernels Benchmarks to stress speedup and scaleup under mixed workloads Data placement to deal with skewed data distributions and data replication Parallel data languages to specify independent and pipelined parallelism Parallel query optimization to deal with mix of precompiled queries and complex ad-hoc queries Support of higher functionality such as rules and objects
1998 M. Tamer zsu & Patrick Valduriez Page 13.42

Distributed DBMS

Page 21