Oracle8i Backup and Recovery Guide Release 2 (8.1.6) Part Number A76993-01
This chapter describes how to recover from common media failures, and includes the following topics:
Media failures fall into two general categories: permanent and temporary. Use different recovery strategies depending on the type of failure.
Permanent media failures are serious hardware problems that cause permanent loss of data on the disk. Lost data cannot be recovered except by repairing or replacing the failed storage device and restoring backups of the files stored on the damaged storage device.
Temporary media failures are hardware problems that make data temporarily inaccessible, but do not corrupt the data. Following are two examples of temporary media failures:
If a media failure affects datafiles, the appropriate recovery procedure depends on:
The following sections explain the appropriate recovery strategies for the database mode:
If either a permanent or temporary media failure affects any datafiles of a database operating in NOARCHIVELOG mode, then Oracle automatically shuts down the database. Depending on the type of media failure, you can use one of the following recovery methods:
If the media failure is... | Then... |
---|---|
Temporary | Correct the hardware problem and restart the database. Usually, instance recovery is possible, and all committed transactions can be recovered using the online redo log. |
Permanent | Follow the procedure "Recovering a Database in NOARCHIVELOG Mode". |
If either a permanent or temporary media failure affects the datafiles of a database operating in ARCHIVELOG mode, then the following scenarios can occur.
Damaged Datafiles | Database Status | Solution |
---|---|---|
Datafiles in the SYSTEM tablespace or datafiles with active rollback segments. | Oracle shuts down. | If the hardware problem is temporary, then fix it and restart the database. Usually, instance recovery recovers lost transactions. If the hardware problem is permanent, follow the procedure in "Performing Closed Database Recovery". |
Non-SYSTEM datafiles or datafiles that do not contain active rollback segments. | Oracle takes affected datafiles offline, but the database stays open. | If the unaffected portions of the database must remain available, then do not shut down the database. Take tablespaces containing problem datafiles offline using the temporary option, then follow the procedure in "Performing Open Database Recovery". |
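The two tables above amount to a small decision procedure. The following Python sketch restates it under stated assumptions: the function name and its three boolean parameters are hypothetical and not part of any Oracle interface, and the returned strings only paraphrase the table cells.

```python
def datafile_recovery_action(archivelog: bool, permanent: bool,
                             critical_datafile: bool = False) -> str:
    """Return the action the tables above prescribe for a datafile media failure.

    critical_datafile is True when the damaged file belongs to the SYSTEM
    tablespace or contains active rollback segments (ARCHIVELOG mode only).
    """
    restart = "Correct the hardware problem and restart the database"
    if not archivelog:
        # NOARCHIVELOG: Oracle shuts down for any datafile media failure.
        if permanent:
            return 'Follow "Recovering a Database in NOARCHIVELOG Mode"'
        return restart
    if critical_datafile:
        # ARCHIVELOG, SYSTEM or active rollback segments: Oracle shuts down.
        if permanent:
            return 'Follow "Performing Closed Database Recovery"'
        return restart
    # ARCHIVELOG, other datafiles: the database stays open.
    return ('Take affected tablespaces offline (temporary option), '
            'then follow "Performing Open Database Recovery"')


if __name__ == "__main__":
    print(datafile_recovery_action(archivelog=True, permanent=True,
                                   critical_datafile=True))
```

The sketch only encodes which procedure applies. The procedures themselves are described in the sections named in the returned strings.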
If database recovery rolls forward through an ADD DATAFILE operation, then Oracle stops recovery when applying the ADD DATAFILE redo record and lets you confirm the location of the added file.
For example, suppose you create a new tablespace containing two datafiles: /db/db2.f and /db/db3.f. If you later perform media recovery through the CREATE TABLESPACE operation, Oracle may signal the following error when applying the CREATE TABLESPACE redo data:
ORA-00283: recovery session canceled due to errors
ORA-01244: unnamed datafile(s) added to controlfile by media recovery
ORA-01110: data file 3: '/db/db2.f'
ORA-01110: data file 2: '/db/db3.f'
SELECT file#, name FROM v$datafile;
FILE# NAME
----- ----------------
    1 /db/db1.f
    2 /db/UNNAMED00002
    3 /db/UNNAMED00003
ALTER DATABASE RENAME FILE '/db/UNNAMED00002' TO '/db/db3.f';
ALTER DATABASE RENAME FILE '/db/UNNAMED00003' TO '/db/db2.f';
RECOVER DATABASE
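Matching each unnamed file to its intended name is a lookup by file number: the ORA-01110 messages give the intended name for each file number, and V$DATAFILE gives the placeholder name currently recorded for that number. The Python sketch below illustrates the pairing. The function name and the input shapes (the error text as one string, the query result as a list of tuples) are assumptions made for illustration, and the sketch only builds statement text without connecting to a database.

```python
import re

# Matches messages of the form: ORA-01110: data file 3: '/db/db2.f'
ORA_01110 = re.compile(r"ORA-01110: data file (\d+): '([^']+)'")


def rename_statements(error_text, datafile_rows):
    """Build RENAME FILE statements for files that recovery left unnamed.

    error_text    -- the ORA- messages signaled by recovery
    datafile_rows -- (file#, name) tuples as selected from v$datafile
    """
    intended = {int(num): path for num, path in ORA_01110.findall(error_text)}
    statements = []
    for file_no, current in sorted(datafile_rows):
        # Only rename placeholder entries that the error messages mention.
        if file_no in intended and "UNNAMED" in current:
            statements.append(
                f"ALTER DATABASE RENAME FILE '{current}' "
                f"TO '{intended[file_no]}';"
            )
    return statements


if __name__ == "__main__":
    errors = ("ORA-01244: unnamed datafile(s) added to controlfile by media recovery "
              "ORA-01110: data file 3: '/db/db2.f' "
              "ORA-01110: data file 2: '/db/db3.f'")
    rows = [(1, "/db/db1.f"), (2, "/db/UNNAMED00002"), (3, "/db/UNNAMED00003")]
    for stmt in rename_statements(errors, rows):
        print(stmt)
```

With the example values from this section, the sketch produces the same two RENAME FILE statements shown above.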
The transportable tablespace feature of Oracle allows a user to transport a set of tablespaces from one database to another. Transporting a tablespace into a database is like creating a tablespace with pre-loaded data. Using this feature is often an advantage because:
Like normal tablespaces, transportable tablespaces are recoverable. While you can recover normal tablespaces without a backup, you must have a version of the transported datafiles in order to recover a plugged-in tablespace.
To recover a transportable tablespace, restore a backup of the transported datafiles and issue normal recovery commands. The backup can be the initial version of the transported datafiles or any backup taken after the tablespace is transported. Just as when recovering through a CREATE TABLESPACE operation, Oracle may signal ORA-01244 when recovering through a transportable tablespace operation. In this case, rename the unnamed files to the correct locations using the procedure in "Recovering Through an ADD DATAFILE Operation".
See Also: Oracle8i Administrator's Guide for detailed information about using the transportable tablespace feature.
If a media failure has affected the online redo logs of a database, then the appropriate recovery procedure depends on:
The following sections describe the appropriate recovery strategies for these situations:
If the online redo log of a database is multiplexed, and at least one member of each online redo log group is not affected by the media failure, Oracle allows the database to continue functioning as normal. Oracle writes error messages to the LGWR trace file and the alert.log of the database.
Solve the problem by taking one of the following actions:
SELECT group#, status, member FROM v$logfile;
GROUP# STATUS  MEMBER
------ ------- -------------------
0001           /oracle/dbs/log1a.f
0001           /oracle/dbs/log1b.f
0002           /oracle/dbs/log2a.f
0002   INVALID /oracle/dbs/log2b.f
0003           /oracle/dbs/log3a.f
0003           /oracle/dbs/log3b.f
For example, to drop member log2b.f from group 2, issue:
ALTER DATABASE DROP LOGFILE MEMBER '/oracle/dbs/log2b.f';
For example, to add member log2c.f to group 2, issue:
ALTER DATABASE ADD LOGFILE MEMBER '/oracle/dbs/log2c.f' TO GROUP 2;
If the file you want to add already exists, then it must be the same size as the other group members, and you must specify REUSE:
ALTER DATABASE ADD LOGFILE MEMBER '/oracle/dbs/log2b.f' REUSE TO GROUP 2;
If a media failure damages all members of an online redo log group, different scenarios can occur, depending on the type of online redo log group affected by the failure and the archiving mode of the database.
If the damaged log group is inactive, then it is not needed for instance recovery; if it is active, then it is needed for instance recovery.
Your first task is to determine whether the damaged group is active or inactive.
SELECT group#, status, member FROM v$logfile;
GROUP# STATUS  MEMBER
------ ------- -------------------
0001           /oracle/dbs/log1a.f
0001           /oracle/dbs/log1b.f
0002   INVALID /oracle/dbs/log2a.f
0002   INVALID /oracle/dbs/log2b.f
0003           /oracle/dbs/log3a.f
0003           /oracle/dbs/log3b.f
SELECT group#, members, status, archived FROM v$log;
GROUP# MEMBERS STATUS   ARCHIVED
------ ------- -------- --------
0001         2 INACTIVE YES
0002         2 ACTIVE   NO
0003         2 CURRENT  NO
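Reading the two query results together identifies the groups in which every member is damaged and shows whether each such group is inactive, active, or current. The Python sketch below illustrates that cross-check. The function name and the tuple layouts are assumptions chosen to mirror the column order of the two queries, and the sketch works on already-fetched rows rather than querying a database.

```python
def fully_damaged_groups(logfile_rows, log_rows):
    """Return {group#: (status, archived)} for groups whose members are all INVALID.

    logfile_rows -- (group#, status, member) tuples from v$logfile
    log_rows     -- (group#, members, status, archived) tuples from v$log
    """
    invalid_count = {}
    for group, status, _member in logfile_rows:
        if status == "INVALID":
            invalid_count[group] = invalid_count.get(group, 0) + 1

    damaged = {}
    for group, members, status, archived in log_rows:
        # A group is lost only when every one of its members is invalid.
        if members > 0 and invalid_count.get(group, 0) == members:
            damaged[group] = (status, archived)
    return damaged


if __name__ == "__main__":
    logfile = [
        (1, "", "/oracle/dbs/log1a.f"), (1, "", "/oracle/dbs/log1b.f"),
        (2, "INVALID", "/oracle/dbs/log2a.f"), (2, "INVALID", "/oracle/dbs/log2b.f"),
        (3, "", "/oracle/dbs/log3a.f"), (3, "", "/oracle/dbs/log3b.f"),
    ]
    log = [(1, 2, "INACTIVE", "YES"), (2, 2, "ACTIVE", "NO"), (3, 2, "CURRENT", "NO")]
    print(fully_damaged_groups(logfile, log))
```

For the sample output in this section, only group 2 is reported, with status ACTIVE and ARCHIVED set to NO.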
If all members of an online redo log group with INACTIVE status are damaged, then the procedure depends on whether you can fix the media problem that damaged the inactive redo log group.
You can clear an inactive redo log group when the database is open or closed. The procedure depends on whether the damaged group has been archived.
STARTUP MOUNT
ALTER DATABASE CLEAR LOGFILE GROUP 2;
Clearing an unarchived log allows it to be reused without archiving it. This action makes backups unusable if they were started before the last change in the log, unless the file was taken offline prior to the first change in the log. Hence, if you need the cleared log file for recovery of a backup, you cannot recover that backup.
STARTUP MOUNT
ALTER DATABASE CLEAR LOGFILE UNARCHIVED GROUP 2;
If there is an offline datafile that requires the cleared unarchived log to bring it online, then the keywords UNRECOVERABLE DATAFILE are required. The datafile and its entire tablespace have to be dropped because the redo necessary to bring it online is being cleared, and there is no copy of it. For example, enter:
ALTER DATABASE CLEAR LOGFILE UNARCHIVED GROUP 2 UNRECOVERABLE DATAFILE;
% cp /disk1/oracle/dbs/*.f /disk2/backup
ALTER DATABASE BACKUP CONTROLFILE TO '/oracle/dbs/cf_backup.f';
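This section shows three forms of the CLEAR LOGFILE statement: the plain form for an archived group, the UNARCHIVED form, and the UNARCHIVED form with UNRECOVERABLE DATAFILE. The Python sketch below shows how the choice follows from two facts about the damaged group. The function and parameter names are hypothetical, and the sketch only assembles statement text.

```python
def clear_logfile_statement(group: int, archived: bool,
                            offline_datafile_needs_log: bool = False) -> str:
    """Assemble the CLEAR LOGFILE statement for a damaged inactive group.

    archived                   -- True if the group has already been archived
    offline_datafile_needs_log -- True if an offline datafile needs this
                                  unarchived log to be brought online
    """
    parts = ["ALTER DATABASE CLEAR LOGFILE"]
    if not archived:
        # Clearing without archiving makes older backups unrecoverable
        # through this log, so take a new backup afterward.
        parts.append("UNARCHIVED")
    parts.append(f"GROUP {group}")
    if not archived and offline_datafile_needs_log:
        parts.append("UNRECOVERABLE DATAFILE")
    return " ".join(parts) + ";"


if __name__ == "__main__":
    print(clear_logfile_statement(2, archived=True))
    print(clear_logfile_statement(2, archived=False))
    print(clear_logfile_statement(2, archived=False,
                                  offline_datafile_needs_log=True))
```

The three calls in the usage example reproduce the three statements shown in this section for group 2.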
The ALTER DATABASE CLEAR LOGFILE statement can fail with an I/O error due to media failure when it is not possible to:
In these cases, the CLEAR LOGFILE statement (before receiving the I/O error) would have successfully informed the control file that the log was being cleared and did not require archiving. The I/O error occurred at the step in which CLEAR LOGFILE attempts to create the new redo log file and write zeros to it.
If the database is still running and the lost active log is not the current log, then issue the ALTER SYSTEM CHECKPOINT statement. If successful, your active log is rendered inactive, and you can follow the procedure in "Losing an Inactive Online Redo Log Group". If unsuccessful, or if your database has halted, perform one of these procedures, depending on the archiving mode.
Note that the current log is the one LGWR is currently writing to. If a LGWR I/O fails, then LGWR terminates and the instance crashes. In this case, you must restore a backup, perform incomplete recovery, and open the database with the RESETLOGS option.
% cp /disk2/backup/*.f /disk1/oracle/dbs
STARTUP MOUNT
ALTER DATABASE OPEN RESETLOGS;
SHUTDOWN IMMEDIATE
% cp /disk1/oracle/dbs/*.f /disk2/backup
ALTER DATABASE RENAME FILE '/oracle/dbs/log_1.rdo' TO '/temp/log_1.rdo';
ALTER DATABASE RENAME FILE '/oracle/dbs/log_2.rdo' TO '/temp/log_2.rdo';
ALTER DATABASE OPEN RESETLOGS;
If you have lost multiple groups of the online redo log, then use the recovery method for the most difficult log to recover. The order of difficulty, from most difficult to least, follows:
If the database is operating in ARCHIVELOG mode, and the only copy of an archived redo log file is damaged, then the damaged file does not affect the present operation of the database. The following situations can arise, however, depending on when the redo log was written and when you backed up the datafile.
If a media failure has affected the control files of a database (whether control files are multiplexed or not), the database continues to run until the first time that an Oracle background process needs to access the control files. At this point, the database and instance are automatically shut down.
If the media failure is temporary and the database has not yet shut down, avoid the automatic shutdown of the database by immediately correcting the media failure. If the database shuts down before you correct the temporary media failure, however, then you can restart the database after fixing the problem and restoring access to the control files.
The appropriate recovery procedure for media failures that permanently prevent access to control files of a database depends on whether you have multiplexed the control files. The following sections describe the appropriate procedures:
Use the following procedures to recover a database if all the following conditions are met:
If all control files of a multiplexed control file configuration have been damaged, follow the procedure in "Losing All Copies of the Current Control File".
Note:
SHUTDOWN ABORT
For example, to replace bad_cf.f with good_cf.f, you might enter:
% cp /oracle/good_cf.f /oracle/dbs/bad_cf.f
STARTUP
SHUTDOWN ABORT
For example, to copy good_cf.f as new_cf.f, you might issue:
% cp /oracle/dbs/good_cf.f /oracle/dbs/new_cf.f
For example, to list new_cf.f in the CONTROL_FILES initialization parameter, you might enter:
CONTROL_FILES = '/oracle/dbs/good_cf.f', '/oracle/dbs/new_cf.f'
STARTUP
If all control files of a database have been lost or damaged by a permanent media failure, but all online redo logfiles remain intact, then you can recover the database by creating a new control file.
Depending on the existence and currency of a control file backup, you have the following options for generating the text of the CREATE CONTROLFILE statement:
If you... | Then... |
---|---|
Executed ALTER DATABASE BACKUP CONTROLFILE TO TRACE NORESETLOGS after you made the last structural change to the database, and if you have saved the SQL command trace output | Use the CREATE CONTROLFILE statement from the trace output as-is. |
Performed your most recent execution of ALTER DATABASE BACKUP CONTROLFILE TO TRACE before you made a structural change to the database | Edit the output of ALTER DATABASE BACKUP CONTROLFILE TO TRACE to reflect that change. For example, if you recently added a datafile to the database, add that datafile to the DATAFILE clause of the CREATE CONTROLFILE statement. |
Have not backed up the control file using the TO TRACE option, but used the TO filename option of ALTER DATABASE BACKUP CONTROLFILE | Use the control file copy to obtain SQL output. Copy the backup control file and execute STARTUP MOUNT before ALTER DATABASE BACKUP CONTROLFILE TO TRACE NORESETLOGS. If the control file copy predated a recent structural change, edit the TO TRACE output to reflect the structural change. |
Do not have a control file backup in either TO TRACE format or TO filename format | Generate the CREATE CONTROLFILE statement manually (see Oracle8i SQL Reference). |
CREATE CONTROLFILE REUSE DATABASE SALES NORESETLOGS ARCHIVELOG
  MAXLOGFILES 32
  MAXLOGMEMBERS 2
  MAXDATAFILES 32
  MAXINSTANCES 16
  MAXLOGHISTORY 1600
LOGFILE
  GROUP 1 (
    '/diska/prod/sales/db/log1t1.dbf',
    '/diskb/prod/sales/db/log1t2.dbf'
  ) SIZE 100K,
  GROUP 2 (
    '/diska/prod/sales/db/log2t1.dbf',
    '/diskb/prod/sales/db/log2t2.dbf'
  ) SIZE 100K
DATAFILE
  '/diska/prod/sales/db/database1.dbf',
  '/diskb/prod/sales/db/filea.dbf';
RECOVER DATABASE
An accidental or erroneous operational or programmatic change to the database can cause loss or corruption of data. Recovery may require a return to a state prior to the error.
The manner in which you perform media recovery depends on whether your database participates in a distributed database system. The Oracle distributed database architecture is autonomous. Therefore, depending on the type of recovery operation selected for a single, damaged database, you may have to coordinate recovery operations globally among all databases in the distributed database system.
Table 6-2 summarizes different types of recovery operations and whether coordination among nodes of a distributed database system is required.
In special circumstances, one node in a distributed database may require recovery to a past time. To preserve global data consistency, it is often necessary to recover all other nodes in the system to the same point in time. This operation is called coordinated, time-based, distributed database recovery. The following tasks should be performed with the standard procedures of time-based and change-based recovery described in this chapter.
Check the alert.log of the database for the RESETLOGS message.
If the message is, "RESETLOGS after complete recovery through change xxx", then you have applied all the changes in the database and performed a complete recovery. Do not recover any of the other databases in the distributed system, or you will unnecessarily remove changes in them. Recovery is complete.
If the message is, "RESETLOGS after incomplete recovery UNTIL CHANGE xxx", you have successfully performed an incomplete recovery. Record the change number from the message and proceed to the next step.
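Telling the two RESETLOGS messages apart and recording the change number can be automated when the alert.log is scanned by a script. The Python sketch below is one way to do that. It assumes the message text has exactly the wording quoted above with a numeric change number in place of xxx; the function name and the sample change number 309121 are hypothetical.

```python
import re

# Matches both wordings quoted above; the change number is captured.
RESETLOGS = re.compile(
    r"RESETLOGS after (complete recovery through|incomplete recovery UNTIL) "
    r"CHANGE (\d+)",
    re.IGNORECASE,
)


def classify_resetlogs(line):
    """Return ("complete" | "incomplete", change_number), or None if no match."""
    match = RESETLOGS.search(line)
    if match is None:
        return None
    kind = "complete" if match.group(1).lower().startswith("complete") else "incomplete"
    return kind, int(match.group(2))


if __name__ == "__main__":
    # 309121 is an invented change number used only for illustration.
    print(classify_resetlogs("RESETLOGS after incomplete recovery UNTIL CHANGE 309121"))
```

A result of "complete" means no other database needs recovery. A result of "incomplete" yields the change number to record for the remaining steps.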
If a master database is independently recovered to a past time (that is, coordinated, time-based distributed database recovery is not performed), any dependent remote snapshot that was refreshed in the interval of lost time will be inconsistent with its master table. In this case, the administrator of the master database should instruct the remote administrators to perform a complete refresh of any inconsistent snapshot.
Copyright © 1996-2000, Oracle Corporation. All Rights Reserved.