For beginners and experts alike, rman is one of the most obscure Oracle tools, but in reality it is based
on quite simple principles. It is true, though, that a restoration can be destructive for a
database.
rman is dangerous because it works at the database physical level; to understand why, let's
say that rman is not "too" different from WinZip. Since everybody knows WinZip, it will
probably be easier to understand rman.
Is WinZip dangerous?
The creation of a zip archive is not dangerous, unless you give the output file the name of
an existing one, or the archive is so big that you fill a device or the temporary directory etc. The
same thing can be said about rman: taking an rman backup is not dangerous, as long as you
don't stop or close the database in the process and you don't overwrite previous backups (which
you can prevent).
rman creates a compressed backup of the physical database files, including controlfiles,
datafiles, archived logs and stores them somewhere. This somewhere can be a disk (like
WinZip) or a tape.
Therefore, the first important point to remember is:
rman creates compressed backups of the physical database; if you prefer, rman zips
the database, entirely or not.
Is Unzip dangerous?
Unzip, on the other hand, can be very dangerous. The risk is overwriting files that should not
be overwritten and, therefore, losing important content. The very same applies to rman
restores: since they overwrite database files, you should know whether this is the right
decision because, otherwise, you may lose your data. This is the main difference between an
import and an rman restoration: an rman restoration (restore) overwrites one or more
datafiles; this means WATCH OUT.
On our Windows 2000 workstation we open a CMD window and type the following few lines:
E:\>rman target /
Database altered.
Database altered.
Let's now repeat the operation having put the database in archive mode. Choosing C:\Temp as
the directory for the backup is, of course, a very bad idea. Backups are gold and should be kept on
a dedicated device; a good name could be E:\RMAN_BACKUPS\SID
E:\>rman target /
RMAN> run {
2> allocate channel t1 type disk;
3> backup database format 'E:\RMAN_BACKUPS\ARK9201\%d_%u_%s';
4> release channel t1;
5> }
allocated channel: t1
channel t1: sid=12 devtype=DISK
released channel: t1
In our case the resulting backup file is:
dir c:\rman_backups\ark9201
Volume in drive C is 80-01-14A2
Volume Serial Number is 64DA-0BF7
Directory of c:\rman_backups\ark9201
12/26/2004 02:29p .
12/26/2004 02:29p ..
12/26/2004 02:44p 418,471,936 ARK920_03G8KC6G_3
Taking only the database backup is not enough, for reasons that will become clear later. We must
take a backup of the archived logs as well, and the command is very similar:
RMAN> run {
2> allocate channel t1 type disk;
3> backup archivelog all delete input format
'C:\RMAN_BACKUPS\ARK9201\arch_%d_%u_%s';
4> release channel t1;
5> }
allocated channel: t1
channel t1: sid=12 devtype=DISK
released channel: t1
RMAN>
rman on UNIX
rman works the same way on a UNIX server, using an appropriate directory format:
released channel: t1
RMAN> host ;
allocated channel: t1
channel t1: sid=26 devtype=DISK
released channel: t1
Here is a backup of archived log files
allocated channel: t1
channel t1: sid=26 devtype=DISK
released channel: t1
Sometimes you will want to back up only certain archived logs and not all of them. This is the case, for
example, when you ran out of space and were forced to move the archived logs on disk
to another directory. In this case the clause "archivelog all" would give an error, because
rman would not be able to find the files.
run {
allocate channel dev1 type 'sbt_tape';
backup archivelog
from logseq 12 until logseq 15 thread 1
delete input;
}
Keeping an eye on the list above, the command would back up the archived redo logs from
ORA920_1_12.arc to ORA920_1_15.arc. Remember that the name depends on the parameter
... in your init.ora. In our case the value is ...
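When the sequence range changes from one run to the next, it can help to generate the command instead of retyping it. The following is a minimal sketch, not part of the original procedure: the function name archivelog_range_script is hypothetical, and it only prints the rman block shown above with the range filled in (plus a release channel line); it does not start rman.

```shell
#!/bin/sh
# Print (do not run) an rman block that backs up a range of archived logs.
# Arguments: first sequence, last sequence, thread (defaults to 1).
archivelog_range_script() {
    from_seq="$1"
    until_seq="$2"
    thread="${3:-1}"
    # Refuse an inverted range instead of emitting a useless command.
    if [ "$from_seq" -gt "$until_seq" ]; then
        echo "first sequence must not exceed last sequence" >&2
        return 1
    fi
    cat <<EOF
run {
  allocate channel dev1 type 'sbt_tape';
  backup archivelog
    from logseq $from_seq until logseq $until_seq thread $thread
    delete input;
  release channel dev1;
}
EOF
}

# Same range as in the example above: sequences 12 to 15 of thread 1.
archivelog_range_script 12 15 1
```

The printed text can be reviewed, saved to a file and then executed from the RMAN prompt with @file.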
RMAN> show
2> all;
RMAN>
Are we sure that our backups do not contain corrupted blocks?
There is, of course, little point in taking backups if we don't know whether they are physically
usable, i.e. whether all their blocks are clean. rman offers the validate command:
RMAN> validate backupset 6;
There are cases when backups fail and it becomes unclear what has actually been saved. To avoid any
risk, the clause not backed up since time can be used. This clause makes rman save only the datafiles
(or the archived logs) that haven't been backed up for a certain span of time and skip the rest.
RMAN> run { allocate channel t1 type disk;
2> backup database format '/u01/oracle/archive/%d_%u_%s' not backed up
since time 'sysdate -5';
3> release channel t1;}
allocated channel: t1
channel t1: sid=25 devtype=DISK
released channel: t1
An example of how to restore a data file can be found in the first article about restoration.
The basics of an rman restore
Care is needed for rman restores because they will normally overwrite existing datafiles; before taking any action it is
imperative to understand why a restore is needed and how to proceed.
In our opinion it is better to explain basic rman concepts with understandable examples.
1. In order to restore a database or a data file some kind of backup must have been taken; in our example, let's suppose
that our database consists of three datafiles and that a backup is taken on Mar 10th 2005 in the afternoon.
The state of the datafiles can differ because in general it is not guaranteed that the very last changes in the database
have actually been written to disk. This is why the timestamps 15:09 and 15:15 have been used in the picture.
2. During the week archive log backups are taken daily, for example to tape using the command:
run {
allocate channel t1 type 'sbt_tape';
backup archivelog all delete input;
release channel t1;
}
3. A week later a disk crashes and the database file number 3 is lost:
In this example, it must be understood that only the DBFile3 has to be restored, without overwriting the datafiles that
have not been affected by the crash, or a week’s work would be lost.
The tablespace containing the damaged file should be put offline with the SQLPLUS command:
SQLPLUS / AS SYSDBA
ALTER TABLESPACE this_tbs OFFLINE IMMEDIATE;
run {
allocate channel t1 type 'sbt_tape';
recover database;
release channel t1;
}
During the execution of this command, rman will restore the archived redo logs and apply them to the damaged
tablespace. At the end of the process the tablespace can be brought online and the database is back to normal.
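For a single damaged tablespace, one common way to express the whole sequence is to restore the lost file from the backup first, then recover it, and finally bring the tablespace online from within rman. The sketch below is an illustration under assumptions, not the author's script: the function name restore_tablespace_script is hypothetical, this_tbs is the placeholder name used above, and the function only prints the rman block so that it can be checked before anything is overwritten.

```shell
#!/bin/sh
# Print (do not run) an rman block that restores and recovers ONE tablespace,
# leaving all other datafiles untouched.
restore_tablespace_script() {
    tbs="$1"
    if [ -z "$tbs" ]; then
        echo "usage: restore_tablespace_script TABLESPACE_NAME" >&2
        return 1
    fi
    cat <<EOF
run {
  allocate channel t1 type 'sbt_tape';
  restore tablespace $tbs;
  recover tablespace $tbs;
  sql 'alter tablespace $tbs online';
  release channel t1;
}
EOF
}

# The tablespace is assumed to be offline already (see the ALTER TABLESPACE above).
restore_tablespace_script this_tbs
```

Naming the tablespace explicitly in both the restore and the recover step is what keeps the undamaged datafiles out of the operation.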
• UTL_INADDR
• SYS_CONTEXT
• V$INSTANCE
• V$SESSION
UTL_INADDR
The UTL_INADDR package was introduced in Oracle 8.1.6 to provide a means of retrieving host
names and IP addresses of remote hosts from PL/SQL.
The GET_HOST_ADDRESS function returns the IP address of the specified host name.
SQL> SELECT UTL_INADDR.get_host_address('bart') FROM dual;
UTL_INADDR.GET_HOST_ADDRESS('BART')
-----------------------------------------------------------------------
---------
192.168.2.4
SQL>
The IP address of the database server is returned if the specified host name is NULL or is
omitted.
SQL> SELECT UTL_INADDR.get_host_address from dual;
GET_HOST_ADDRESS
-----------------------------------------------------------------------
---------
192.168.2.5
SQL>
An error is returned if the specified host name is not recognized.
SQL> SELECT UTL_INADDR.get_host_address('banana') from dual;
SELECT UTL_INADDR.get_host_address('banana') from dual
*
ERROR at line 1:
ORA-29257: host banana unknown
ORA-06512: at "SYS.UTL_INADDR", line 19
ORA-06512: at "SYS.UTL_INADDR", line 40
ORA-06512: at line 1
SQL>
The GET_HOST_NAME function returns the host name of the specified IP address.
SQL> SELECT UTL_INADDR.get_host_name('192.168.2.4') FROM dual;
UTL_INADDR.GET_HOST_NAME('192.168.2.4')
-----------------------------------------------------------------------
---------
bart
SQL>
The host name of the database server is returned if the specified IP address is NULL or omitted.
SQL> SELECT UTL_INADDR.get_host_name FROM dual;
GET_HOST_NAME
-----------------------------------------------------------------------
---------
C4210gR2
1 row selected.
SQL>
An error is returned if the specified IP address is not recognized.
SQL> SELECT UTL_INADDR.get_host_name('1.1.1.1') FROM dual;
SELECT UTL_INADDR.get_host_name('1.1.1.1') FROM dual
*
ERROR at line 1:
ORA-29257: host 1.1.1.1 unknown
ORA-06512: at "SYS.UTL_INADDR", line 4
ORA-06512: at "SYS.UTL_INADDR", line 35
ORA-06512: at line 1
SQL>
SYS_CONTEXT
The SYS_CONTEXT function is able to return the following host and IP address information for the
current session:
• TERMINAL - An operating system identifier for the current session. This is often the client
machine name.
• HOST - The host name of the client machine.
• IP_ADDRESS - The IP address of the client machine.
• SERVER_HOST - The host name of the server running the database instance.
The following examples show the typical output for each variant.
SQL> SELECT SYS_CONTEXT('USERENV','TERMINAL') FROM dual;
SYS_CONTEXT('USERENV','TERMINAL')
--------------------------------------------------------------------
marge
1 row selected.
SYS_CONTEXT('USERENV','HOST')
--------------------------------------------------------------------
marge
1 row selected.
SYS_CONTEXT('USERENV','IP_ADDRESS')
--------------------------------------------------------------------
192.168.2.3
1 row selected.
SYS_CONTEXT('USERENV','SERVER_HOST')
--------------------------------------------------------------------
C4210gr2
1 row selected.
SQL>
V$INSTANCE
The HOST_NAME column of the V$INSTANCE view contains the host name of the server running
the instance.
SQL> SELECT host_name FROM v$instance;
HOST_NAME
------------------------------------------------
C4210gR2
1 row selected.
SQL>
V$SESSION
The V$SESSION view contains the following host information for all database sessions:
• TERMINAL - The operating system terminal name for the client. This is often set to the
client machine name.
• MACHINE - The operating system name for the client machine. This may include the
domain name if present.
The following examples show the typical output for each column.
SQL> SELECT terminal, machine FROM v$session WHERE username =
'TIM_HALL';
TERMINAL MACHINE
------------------------------
----------------------------------------------------
MARGE ORACLE-BASE\MARGE
1 row selected.
Oracle Flashback
In Oracle, rolling back a transaction is possible before the final commit or back to a
savepoint, but the database offers another, probably little-known, possibility:
Flashback.
The essential requirement is that the database is using UNDO tablespaces and that
the undo_retention is high enough. Generally speaking, rolling back the changes is
only possible if the correction is made not later than the number of seconds specified
by the parameter undo_retention.
SQL> show parameter undo
1 row created.
SQL> INSERT INTO t_a values (4);
SQL> INSERT INTO t_a values (5);
1 row created.
SQL> COMMIT;
Commit complete.
Session altered.
SYSDATE
--------------------
12:53:43 31-dec-2004
5 rows deleted
SQL> COMMIT
Commit completed
COUNT(*)
----------
0
The deletion was a blunder and the problem is realized 20 minutes later; since the
undo_retention is 1800 seconds (30 minutes) it is probably still possible to rescue
the data using the clause "AS OF TIMESTAMP" and specifying a point in time in the
past.
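The time check behind this reasoning is simple arithmetic. As a rough sketch (the function names to_seconds and flashback_possible are hypothetical, and both timestamps are assumed to fall on the same day), the elapsed time can be compared with undo_retention like this:

```shell
#!/bin/sh
# Convert HH:MM:SS to seconds since midnight (awk copes with leading zeros).
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed if the time elapsed between the change and now is within undo_retention.
# Arguments: time of the change, current time, undo_retention in seconds.
flashback_possible() {
    changed_at="$1"
    now="$2"
    retention="$3"
    elapsed=$(( $(to_seconds "$now") - $(to_seconds "$changed_at") ))
    [ "$elapsed" -ge 0 ] && [ "$elapsed" -le "$retention" ]
}

# Timestamps taken from the SYSDATE outputs in this example.
if flashback_possible 12:53:43 13:11:20 1800; then
    echo "within undo_retention"
else
    echo "too late"
fi
```

Being inside the window is a necessary condition rather than a guarantee, which is why the text says "probably": the undo data must also still be present in the UNDO tablespace.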
SYSDATE
--------------------
13:11:20 31-dec-2004
C1
----------
1
2
3
4
5
By specifying SELECT AS OF TIMESTAMP we are still able to access the records that
have been deleted, even if the table is now empty.
Table created.
COUNT(*)
----------
5
C1
----------
1
2
3
4
5
It was therefore possible to restore the contents as they were 20 minutes before. If
the table is dropped, unfortunately no rollback is possible.
SQL>
SQL> DROP TABLE t_a;
Table dropped.
How does Oracle know that a database needs recovering? What checks does it carry
out when a database is opened? This article explains a couple of common cases of
Oracle recovery and the role of the system change number SCN in the recovery.
What is the SCN system change number?
Modifications are logged in the online redo log by the log writer (LGWR), but not
necessarily in the datafiles. In fact, there are times when the modified data is still in
memory in a block that has become "dirty", meaning that it no longer contains the
same data as the block in the datafile.
There are moments, though, when all modifications are written to all the datafiles:
this happens at a checkpoint, at a database shutdown, or when a redo log is
switched.
At that point the header of each datafile is updated with the SCN of the system,
which will be the same as the one recorded in the controlfile. The Oracle process CKPT (the
checkpoint process) is responsible for this update. When the checkpoint is completed
the database reaches a "consistent" state, meaning that it is clean: there is no
contradiction between the SCNs and timestamps of the various components.
For the purpose of our discussion, it is important to familiarise ourselves with the
dynamic view V$DATAFILE_HEADER, which contains the information that Oracle
writes in the datafile headers.
CONTROLFILE_CHANGE#
-------------------
6488361
CHECKPOINT_CHANGE#
------------------
6488359
We notice that the CHECKPOINT_CHANGE# in the datafile headers is identical to the
CHECKPOINT_CHANGE# in the controlfile(s) because, when the queries were
executed, the database had just been opened.
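The comparison between the two numbers can be pictured with a small sketch. It is a deliberate simplification (Oracle checks more than a single pair of SCNs), the function name compare_scn is hypothetical, and the second and third pairs of values below are illustrative rather than taken from a real session.

```shell
#!/bin/sh
# Classify the relationship between the checkpoint SCN found in a datafile
# header and the one recorded in the controlfile.
compare_scn() {
    header_scn="$1"
    controlfile_scn="$2"
    if [ "$header_scn" -eq "$controlfile_scn" ]; then
        echo "consistent"
    elif [ "$header_scn" -lt "$controlfile_scn" ]; then
        # The datafile is an older copy: redo must be applied to it.
        echo "datafile is older: media recovery needed"
    else
        # The controlfile is an older copy than the datafile.
        echo "controlfile is older: recover using backup controlfile"
    fi
}

compare_scn 6488359 6488359
compare_scn 6488359 6488582
compare_scn 6488582 6488359
```

The first case is the freshly opened database described above; the other two correspond to a restored datafile and to a restored controlfile respectively.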
Being on UNIX, it is possible to make copies of datafiles and controlfiles when the
database is open. Since we don't want to break anything, our database is in archive
mode, we have already taken a good rman backup and we are therefore ready to
restore if something goes wrong.
These are the steps of our exercise to simulate the restore and recovery of a
datafile:
1. A copy of one of the datafiles is taken.
2. Normal activity continues.
3. The database is closed.
4. The copy taken at step 1 is put back in its original place; this file is
therefore "older" than the others (even if the timestamp might be more
recent).
5. The database is opened.
     FILE# NAME
---------- ----------------------------------------
1 /u05/oradata/DEVDB/system01DEVDB.dbf
2 /u04/oradata/DEVDB/rbs01DEVDB.dbf
3 /u05/oradata/DEVDB/temp01DEVDB.dbf
4 /u04/oradata/DEVDB/tools01DEVDB.dbf
5 /u05/oradata/DEVDB/users01DEVDB.dbf
6 /u05/oradata/DEVDB/data01DEVDB.dbf
8 /u04/oradata/DEVDB/appidx01DEVDB.dbf
9 /u05/oradata/DEVDB/appdata01DEVDB.dbf
8 rows selected.
We create a table t_test before copying the datafile and a table t_test2 after copying
it:
SQL> create table t_test(c1 number) tablespace tools;
Table created.
8 rows selected.
1 row created.
SQL> /
1 row created.
SQL> commit;
Commit complete.
8 rows selected.
The SCN hasn't changed yet. We now make the copy:
SQL> !cp /u04/oradata/DEVDB/tools01DEVDB.dbf /tmp
System altered.
8 rows selected.
oracle-localora@# cp /tmp/tools01DEVDB.dbf
/u04/oradata/DEVDB/tools01DEVDB.dbf
NAME FIRST_CHANGE#
-----------------------------------------------------------------------
---------
/opt/oracle/admin/DEVDB/arch/1_60.dbf 6488563
Let's take a copy of the controlfile before any DDL activity, such as creating a new
table and populating it. This will simulate the restore of a controlfile:
SQL> !cp /u01/oradata/DEVDB/ctl1DEVDB.dbf /tmp
SQL> commit;
Commit complete.
We now shut down the database; this will cause a checkpoint to happen, meaning
that all datafile headers will be updated with the SCN contained in the controlfile.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
Before any dangerous experiment, the wise DBA always takes a copy of the
controlfile(s):
oracle-localora@# cp /u01/oradata/DEVDB/ctl1DEVDB.dbf
/u01/oradata/DEVDB/ctl1DEVDB.dbf.save
oracle-localora@# cp /u02/oradata/DEVDB/ctl2DEVDB.dbf
/u02/oradata/DEVDB/ctl2DEVDB.dbf.save
oracle-localora@# cp /u03/oradata/DEVDB/ctl3DEVDB.dbf
/u03/oradata/DEVDB/ctl3DEVDB.dbf.save
Let's now overwrite the controlfiles with the copy we created on the /tmp directory:
oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u01/oradata/DEVDB/ctl1DEVDB.dbf
cp: overwrite /u01/oradata/DEVDB/ctl1DEVDB.dbf (yes/no)? y
oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u02/oradata/DEVDB/ctl2DEVDB.dbf
cp: overwrite /u02/oradata/DEVDB/ctl2DEVDB.dbf (yes/no)? y
oracle-localora@# cp /tmp/ctl1DEVDB.dbf /u03/oradata/DEVDB/ctl3DEVDB.dbf
cp: overwrite /u03/oradata/DEVDB/ctl3DEVDB.dbf (yes/no)? y
oracle-localora@#
Let's try to open the database.
SQL> startup
ORACLE instance started.
8 rows selected.
CHECKPOINT_CHANGE#
------------------
6488582
Oracle complains that the controlfile is an old copy, judging from the SCN;
during other kinds of recovery you will be told that the command "recover
database using backup controlfile" is needed, essentially for the same reason. Let's
see what happens if we try a normal recovery:
SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/u05/oradata/DEVDB/system01DEVDB.dbf'
ORA-01207: file is more recent than controlfile - old controlfile
Instead we use:
MEMBER
-----------------------------------------------------------------------
---------
/u03/oradata/DEVDB/redog2m1DEVDB.dbf
/u04/oradata/DEVDB/redog2m2DEVDB.dbf
/u01/oradata/DEVDB/redog1m1DEVDB.dbf
/u02/oradata/DEVDB/redog1m2DEVDB.dbf
C1
----------
10
11
12
C1
----------
1
1
The contents of the database are as expected.
Error message: port already in use
Sometimes it happens that an installation fails with the error message "port already
in use" (this can be the case with the Oracle snmpx daemon on UNIX systems).
If you don't know which TCP port is generating the error, the following UNIX
command lists those currently being used.
This list shows the port 21 (ftp), 23 (telnet), 1521 (an Oracle listener) etc. If you
suspect a particular TCP port (for example 161), search for it directly.
A sample output is
..
/proc/76
/proc/777
sockname: AF_INET 0.0.0.0 port: 161
sockname: AF_INET 179.146.111.98 port: 161
sockname: AF_INET 179.146.111.101 port: 161
sockname: AF_INET 179.146.111.102 port: 161
sockname: AF_INET 179.146.111.100 port: 161
/proc/779
/proc/784
...
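Picking the owning process out of a long listing by eye is error-prone. The following is a minimal sketch that post-processes a listing in the format of the sample above; it assumes that each /proc/<pid> line is followed by the socket lines of that process, and the function name find_port_owner is hypothetical. It does not produce the listing itself.

```shell
#!/bin/sh
# Read a listing on standard input and print the process IDs whose sockets
# use the given TCP port.
find_port_owner() {
    port="$1"
    awk -v port="$port" '
        # Remember the process the following socket lines belong to.
        /^\/proc\// { pid = substr($0, 7); next }
        # A socket line ends in "port: <number>".
        NF >= 2 && $(NF - 1) == "port:" && $NF == port {
            if (pid != last) { print pid; last = pid }
        }
    '
}

# Shortened copy of the sample output shown above.
find_port_owner 161 <<'EOF'
/proc/76
/proc/777
sockname: AF_INET 0.0.0.0 port: 161
sockname: AF_INET 179.146.111.98 port: 161
/proc/779
EOF
```

Each process ID is printed once, even when the process has bound the port on several addresses.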
We now have the process ID 777, and it is easy to determine what it is doing:
Our investigation is therefore completed: the TCP port 161 is being used by the
process /opt/buw/bin/snmpd.
(1808).
Action:
ensure that no other process is using this socket and retry the
operation
Therefore, with the script above, let's check whether some other process is already
using the port 1808:
...
/proc/1162
/proc/1165
/proc/3687
/proc/3689
/proc/3691
/proc/4429
sockname: AF_INET 169.166.228.84 port: 1808
/proc/5632
... ...
root@dbserver[on pts/7]# ps -aef|grep -i 4429
root 7961 6922 0 11:36:15 pts/7 0:00 grep -i 4429
oracle 4429 1 0 Dec 09 ? 43:56
/u00/app/oracle/product/8.1.7/bin/vppdc
root@dbserver[on pts/7]# kill -3 4429
root@dbserver[on pts/7]# kill -5 4429
root@dbserver[on pts/7]# kill -9 4429
The problem, therefore, was that the data gatherer of version 8.1.7 was still running.
The reason why sometimes we forget this process is that the agent in 9i is also the
data gatherer.
Again, this is a real case.
What is a cluster?
This article gives some information about installing and using the Oracle Real
Application Cluster (RAC). One definition of a cluster is: "A group of computers linked
to provide fast and reliable service. Cluster technologies have evolved over the past
15 years to provide servers that are both highly available and scalable. Clustering is
one of several approaches that exploits parallel processing — the use of multiple
subsystems as building blocks".
Clusters have existed for quite a long time; the first solution was offered by DEC for
its VMS operating system. Clusters on UNIX are more recent.
Shared storage
A cluster will share some storage. There will be some software in place to manage
this storage, which could be the Veritas Volume Manager, or Sun Solstice, or UNIX
itself if you are using raw devices. Although the clustering software and the storage
manager are likely to be provided by the same vendor, this is not necessarily so.
Veritas states that the I/O performance of the VSF matches that of the raw devices.
The CFS allows the use of normal Oracle commands such as "alter database datafile
... resize 500M". With raw devices this is not possible, because the system manager must
first create the device with a fixed size using the Volume Manager. Likewise, the
archived redo logs can be created on the CFS.
Oracle Cluster can be installed also on a single node, but the cluster management
must be installed in advance and running. The tool on WindowsNT/2000 and Linux is
called oracm and is shipped with Oracle.
How many copies of init.ora should you keep? Where should TNS_ADMIN be defined? The new
cluster syntax introduced a "dot" notation in the init.ora file to specify parameters for
different instances. This makes it possible to keep only one copy of init.ora for all instances.
To fix ideas, let's say that the cluster nodes are called sercluster1 and
sercluster2, that the database name is EUROPE and that the instance names are EU1,
EU2 etc. If the shared mountpoint is /orasoft, then ORACLE_BASE=/orasoft/oracle and
ORACLE_HOME=$ORACLE_BASE/9.2.0. One possible way to proceed is to create a
tree $ORACLE_BASE/admin/EUROPE/... on the shared device. In particular, we will
define an initEUROPE.ora in $ORACLE_BASE/admin/EUROPE/pfile
The soft links must now be created:
ln -s $ORACLE_BASE/admin/EUROPE/pfile/initEUROPE.ora
$ORACLE_HOME/dbs/initEU1.ora
ln -s $ORACLE_BASE/admin/EUROPE/pfile/initEUROPE.ora
$ORACLE_HOME/dbs/initEU2.ora
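With more than two instances the same link has to be created once per instance name. A small loop avoids typing mistakes; the sketch below is only an illustration, and the function name link_instance_pfiles is hypothetical.

```shell
#!/bin/sh
# Create one init<SID>.ora soft link per instance, all pointing to the shared file.
# Arguments: shared parameter file, target directory, then one or more SIDs.
link_instance_pfiles() {
    shared_pfile="$1"
    dbs_dir="$2"
    shift 2
    for sid in "$@"; do
        # Stop at the first failure (for example, if the link already exists).
        ln -s "$shared_pfile" "$dbs_dir/init$sid.ora" || return 1
    done
}

# Usage with the names of this example (not executed here):
#   link_instance_pfiles "$ORACLE_BASE/admin/EUROPE/pfile/initEUROPE.ora" \
#       "$ORACLE_HOME/dbs" EU1 EU2
```

Because ln -s is used without a force option, an existing init file is never silently replaced.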
To set the environment correctly on the two nodes, a possible script would be:
#!/bin/ksh
# Shared Oracle software tree, identical on both nodes
export ORACLE_BASE=/orasoft/oracle
export ORACLE_HOME=$ORACLE_BASE/9.2.0
# Choose the instance name according to the node we are running on
myHost=`hostname`
if [ "$myHost" = "sercluster1" ]; then
    ORACLE_SID=EU1
else
    ORACLE_SID=EU2
fi
export ORACLE_SID
... ... ...
The immediate question is: how can you use the same init.ora if the instance names
are different? The answer is that in a RAC environment it is now possible to specify a
parameter with a "dot" notation. For example:
cluster_database=true
cluster_database_instances=2
#
EU1.instance_name=EU1
EU1.instance_number=1
EU1.thread=1
EU1.undo_tablespace=UNDOEU1
EU2.instance_name=EU2
EU2.instance_number=2
EU2.thread=2
EU2.undo_tablespace=UNDOEU2
undo_management=auto
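To see which value a given instance ends up with, the lookup rule can be emulated roughly: an entry prefixed with the instance name wins over an unprefixed one. This is a simplified sketch and not Oracle's parser (it ignores quoting and treats names as case-sensitive); the function name effective_param is hypothetical.

```shell
#!/bin/sh
# Print the value that instance SID would get for parameter NAME in FILE:
# "SID.NAME=value" takes precedence over a plain "NAME=value".
effective_param() {
    sid="$1"
    name="$2"
    file="$3"
    awk -F= -v sid="$sid" -v name="$name" '
        /^[ \t]*#/ { next }                                  # skip comment lines
        {
            key = $1; gsub(/[ \t]/, "", key)                 # strip blanks around the name
            val = $2; gsub(/^[ \t]+|[ \t]+$/, "", val)       # trim the value
            if (key == (sid "." name)) { specific = val; has_specific = 1 }
            else if (key == name)      { generic = val;  has_generic = 1 }
        }
        END {
            if (has_specific)     print specific
            else if (has_generic) print generic
        }
    ' "$file"
}

# Demonstration on a shortened copy of the parameters shown above.
demo_pfile="$(mktemp)"
cat > "$demo_pfile" <<'EOF'
cluster_database=true
#
EU1.thread=1
EU1.undo_tablespace=UNDOEU1
EU2.thread=2
EU2.undo_tablespace=UNDOEU2
undo_management=auto
EOF
effective_param EU2 undo_tablespace "$demo_pfile"
effective_param EU1 undo_management "$demo_pfile"
rm -f "$demo_pfile"
```

The first call resolves an instance-specific entry, the second falls back to the setting shared by all instances.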
For this operation, the parameter cluster_database should be set to FALSE; in fact,
the first part is not different from creating a database with a single instance and this
setting avoids some unnecessary complexity at this stage.
cluster_database = FALSE
spool off
The second instance can start, but it can neither mount nor open the database. Before this
is possible, the redo logs and the UNDO tablespace for the second instance must be
created.
LISTENER_EU1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002))
)
SID_LIST_LISTENER_EU1 =
(SID_LIST =
(SID_DESC =
(ORACLE_HOME = /orasoft/oracle/9.2.0)
(SID_NAME = EU1)
)
)
LISTENER_EU2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = sercluster2)(PORT = 2002))
)
SID_LIST_LISTENER_EU2 =
(SID_LIST =
(SID_DESC =
(ORACLE_HOME = /orasoft/oracle/9.2.0)
(SID_NAME = EU2)
)
)
EUROPE =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = sercluster1)(PORT = 2002))
(ADDRESS = (PROTOCOL = TCP)(HOST = sercluster2)(PORT = 2002))
(LOAD_BALANCE = yes)
)
(CONNECT_DATA =
(SERVICE_NAME = EUROPE)
)
)