1 Prerequisite: back up the database
Check the database mode
SYS@orcl>select open_mode,log_mode from v$database;
OPEN_MODE LOG_MODE
-------------------- ------------
READ WRITE ARCHIVELOG
Change the RMAN backup settings
RMAN> configure controlfile autobackup on;
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored
RMAN> configure controlfile autobackup format for device type disk to '/tmp/backup/%F';
old RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/tmp/backup/cs_%F';
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/tmp/backup/%F';
new RMAN configuration parameters are successfully stored
Create a test tablespace and load data into it
SYS@orcl>create tablespace tbs02 datafile '/u01/app/oracle/oradata/orcl/tbs002.dbf' size 1m;
Tablespace created.
SYS@orcl>create table bruce.test01 tablespace tbs02 as select * from emp;
Table created.
SYS@orcl>select count(*) from bruce.test01;
COUNT(*)
----------
14
Back up the database from RMAN
RMAN> backup database format '/tmp/backup/%U' tag=bruce20221216;
Starting backup at 2022-12-16 11:13:48
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orcl/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orcl/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orcl/undotbs01.dbf
input datafile file number=00005 name=/u01/app/oracle/oradata/orcl/tbs01_001.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orcl/users01.dbf
input datafile file number=00002 name=/u01/app/oracle/tbs003.dbf
input datafile file number=00008 name=/u01/app/oracle/oradata/orcl/tbs002.dbf
channel ORA_DISK_1: starting piece 1 at 2022-12-16 11:13:48
channel ORA_DISK_1: finished piece 1 at 2022-12-16 11:13:55
piece handle=/tmp/backup/141fh3vc_1_1 tag=BRUCE20221216 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
Finished backup at 2022-12-16 11:13:55
Starting Control File and SPFILE Autobackup at 2022-12-16 11:13:55
piece handle=/tmp/backup/c-1648706630-20221216-03 comment=NONE
Finished Control File and SPFILE Autobackup at 2022-12-16 11:13:56
List the backups
RMAN> list backup of database;
List of Backup Sets
===================
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ -------------------
33 Full 1.23G DISK 00:00:03 2022-12-16 11:13:51
BP Key: 33 Status: AVAILABLE Compressed: NO Tag: BRUCE20221216
Piece Name: /tmp/backup/141fh3vc_1_1
List of Datafiles in backup set 33
File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name
---- -- ---- ---------- ------------------- ----------- ------ ----
1 Full 3900401 2022-12-16 11:13:48 NO /u01/app/oracle/oradata/orcl/system01.dbf
2 Full 3900401 2022-12-16 11:13:48 NO /u01/app/oracle/tbs003.dbf
3 Full 3900401 2022-12-16 11:13:48 NO /u01/app/oracle/oradata/orcl/sysaux01.dbf
4 Full 3900401 2022-12-16 11:13:48 NO /u01/app/oracle/oradata/orcl/undotbs01.dbf
5 Full 3900401 2022-12-16 11:13:48 NO /u01/app/oracle/oradata/orcl/tbs01_001.dbf
7 Full 3900401 2022-12-16 11:13:48 NO /u01/app/oracle/oradata/orcl/users01.dbf
8 Full 3900401 2022-12-16 11:13:48 NO /u01/app/oracle/oradata/orcl/tbs002.dbf
RMAN> list backup of controlfile;
List of Backup Sets
===================
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ -------------------
34 Full 10.19M DISK 00:00:00 2022-12-16 11:13:55
BP Key: 34 Status: AVAILABLE Compressed: NO Tag: TAG20221216T111355
Piece Name: /tmp/backup/c-1648706630-20221216-03
Control File Included: Ckp SCN: 3900420 Ckp time: 2022-12-16 11:13:55
RMAN> list backup of spfile;
List of Backup Sets
===================
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ -------------------
34 Full 10.19M DISK 00:00:00 2022-12-16 11:13:55
BP Key: 34 Status: AVAILABLE Compressed: NO Tag: TAG20221216T111355
Piece Name: /tmp/backup/c-1648706630-20221216-03
SPFILE Included: Modification time: 2022-12-16 09:47:46
SPFILE db_unique_name: ORCL
--Backup files--
[root@ora-server backup]# pwd
/tmp/backup
[root@ora-server backup]# ll
total 1299680
-rw-r----- 1 oracle oinstall 1320173568 Dec 16 11:13 141fh3vc_1_1
-rw-r----- 1 oracle oinstall 10698752 Dec 16 11:13 c-1648706630-20221216-03
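2 Failure recovery
2.1 Creating the fault
Locate the datafile that belongs to tablespace tbs02 and modify it.
A datafile is a binary file, so its contents can be altered with vi. When editing, leave the header untouched and change bytes in the middle of the file where possible.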
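Editing a binary file in vi is easy to get wrong, so the same kind of damage can also be produced with dd. The following is only a minimal sketch: it works on a scratch file created with mktemp rather than on a real datafile, and the block size of 8192 and block number 91 are assumptions borrowed from the dbv output later in this article. Anything like this belongs on a test database that has been shut down and backed up.

```shell
# Sketch only: operates on a scratch file, never on a live datafile.
BLOCK_SIZE=8192      # assumed db_block_size
BLOCK_NO=91          # block to damage, well away from the header blocks
SCRATCH=$(mktemp)    # hypothetical stand-in for a copy of a datafile
trap 'rm -f "$SCRATCH"' EXIT

# Build a zero-filled file of 129 blocks (block 0 plus 128 further blocks).
dd if=/dev/zero of="$SCRATCH" bs="$BLOCK_SIZE" count=129 2>/dev/null
SIZE_BEFORE=$(($(wc -c < "$SCRATCH")))

# Overwrite 16 bytes in the middle of the target block.
# bs=1 makes seek a byte offset; conv=notrunc keeps the rest of the file.
OFFSET=$((BLOCK_NO * BLOCK_SIZE + 4096))
printf 'XXXXXXXXXXXXXXXX' | dd of="$SCRATCH" bs=1 seek="$OFFSET" conv=notrunc 2>/dev/null

SIZE_AFTER=$(($(wc -c < "$SCRATCH")))
echo "offset=$OFFSET size_before=$SIZE_BEFORE size_after=$SIZE_AFTER"
```

The file size must be identical before and after the edit; a changed size means the file was truncated or extended instead of being patched in place.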
2.2 The dbverify command
Check whether the datafile contains corrupt blocks:
[oracle@ora-server orcl]$ dbv file=/u01/app/oracle/oradata/orcl/tbs001.dbf blocksize=8192
DBVERIFY: Release 12.2.0.1.0 - Production on Fri Dec 30 10:28:11 2022
Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/tbs001.dbf
Page 91 is marked corrupt
Corrupt block relative dba: 0x0240005b (file 9, block 91)
Bad check value found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x0240005b
last change scn: 0x0000.0000.004688dd seq: 0x1 flg: 0x06
spare3: 0x0
consistency value in tail: 0x88dd0601
check value in block header: 0x716f
computed block checksum: 0x6000
DBVERIFY - Verification complete
Total Pages Examined : 128
Total Pages Processed (Data) : 109
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 17
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 1
Total Pages Marked Corrupt : 1
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 4622557 (0.4622557)
The output shows that block 91 of file 9 is corrupt. dbv can only detect corruption; it cannot repair it.
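The "relative dba" value that dbv prints encodes both the file number and the block number. As a rough sketch, assuming a smallfile tablespace (where the top 10 bits hold the relative file number and the low 22 bits hold the block number), the value 0x0240005b from the output above can be decoded with shell arithmetic:

```shell
# Decode a relative data block address (rdba) as printed by dbv.
# Assumption: smallfile tablespace, 10-bit file number + 22-bit block number.
RDBA=0x0240005b
FILE_NO=$(($RDBA >> 22))          # upper 10 bits
BLOCK_NO=$(($RDBA & 0x3FFFFF))    # lower 22 bits
echo "file=$FILE_NO block=$BLOCK_NO"
```

This agrees with the "(file 9, block 91)" that dbv prints next to the rdba.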
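2.3 The blockrecover command
Use the blockrecover command in RMAN to repair the corrupt block. blockrecover is the legacy syntax; from 11g on, the same repair is written as recover datafile 9 block 91.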
RMAN> blockrecover datafile 9 block 91;
Starting recover at 30-DEC-22
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1163 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=8 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=396 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=783 device type=DISK
searching flashback logs for block images
finished flashback log search, restored 1 blocks
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 30-DEC-22
Exit RMAN and check again
[oracle@ora-server ~]$ dbv file=/u01/app/oracle/oradata/orcl/tbs001.dbf blocksize=8192
DBVERIFY: Release 12.2.0.1.0 - Production on Fri Dec 30 11:05:45 2022
Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/tbs001.dbf
DBVERIFY - Verification complete
Total Pages Examined : 128
Total Pages Processed (Data) : 110
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 17
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 1
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 4622557 (0.4622557)
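When dbv is run from a script, comparing the two runs by eye is not practical. A minimal sketch of pulling out the "Total Pages Marked Corrupt" counter follows. The sample text is copied from the first dbv run above; in practice the input would be dbv's real output, for example a file written through its logfile parameter.

```shell
# Extract the "Total Pages Marked Corrupt" counter from dbv output.
# DBV_OUTPUT is a shortened sample taken from the first run shown above.
DBV_OUTPUT='Total Pages Examined : 128
Total Pages Processed (Data) : 109
Total Pages Failing (Data) : 0
Total Pages Marked Corrupt : 1
Total Pages Influx : 0'

# Split on ":" and strip the blanks around the number.
CORRUPT=$(printf '%s\n' "$DBV_OUTPUT" |
    awk -F: '/Total Pages Marked Corrupt/ { gsub(/ /, "", $2); print $2 }')

if [ "$CORRUPT" -gt 0 ]; then
    echo "corrupt pages: $CORRUPT"
else
    echo "no corrupt pages"
fi
```

Fed with the output of the second run, where the counter is 0, the same script takes the other branch.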
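2.4 The analyze command
The analyze command performs a logical check that a table and its indexes match. It only reports errors and does not mark corrupt blocks.
- Create test table test01 and add a corresponding index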
SYS@orcl>create table bruce.test01 tablespace users as select * from emp;
Table created.
SYS@orcl>create index bruce.ind_01 on bruce.test01(empno) tablespace users;
Index created.
- Use analyze to check that the table and index match
SYS@orcl>analyze table bruce.test01 validate structure cascade;
Table analyzed.
SYS@orcl>analyze index bruce.ind_01 validate structure;
Index analyzed.
- Move the test table
SYS@orcl>alter table bruce.test01 move;
- Check the table and index again
09:14:01 SYS@orcl>analyze index bruce.ind_01 validate structure;
analyze index bruce.ind_01 validate structure
*
ERROR at line 1:
ORA-01502: index 'BRUCE.IND_01' or partition of such index is in unusable state
09:16:13 SYS@orcl>analyze table bruce.test01 validate structure cascade;
analyze table bruce.test01 validate structure cascade
*
ERROR at line 1:
ORA-01502: index 'BRUCE.IND_01' or partition of such index is in unusable state
Moving the base table changes the rowids of its rows, so the index becomes unusable.
- Fix: drop and recreate the index, or rebuild it
09:24:13 SYS@orcl>alter index bruce.ind_01 rebuild;
Index altered.
- Check again
09:25:14 SYS@orcl>analyze table bruce.test01 validate structure cascade;
Table analyzed.
09:25:45 SYS@orcl>analyze index bruce.ind_01 validate structure;
Index analyzed.
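2.5 Database initialization parameters
This section covers the settings of db_block_checking and db_block_checksum.
db_block_checking defaults to FALSE. When it is set to TRUE, every data block is checked: the database reads through the block's contents to confirm that the block is self-consistent. Depending on the workload, this typically adds 1% to 10% overhead.
The parameter accepts the following values:
- off (or false): no block checking in any tablespace except the system tablespace
- low: basic block header checks after the contents of a block change in memory
- medium: all low checks, plus block checks for all non-index-organized table blocks
- full (or true): all low and medium checks, plus checks for index blocks
Default value: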
09:26:04 SYS@orcl>show parameter db_block_checking
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_block_checking string FALSE
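This parameter can be changed with either alter session or alter system.
db_block_checksum controls block checksums. Its default is TYPICAL (the legacy value TRUE is still accepted and means the same). The checksum is a number computed from all the bytes stored in a block. When DBWR writes a dirty block, it also writes this number into the block header; when the block is read later, the checksum is recomputed and compared with the value in the header. This typically adds 1% to 2% overhead, and Oracle recommends leaving the parameter enabled.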
09:34:05 SYS@orcl>show parameter db_block_checksum;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_block_checksum string TYPICAL
2.6 Using dbms_repair
09:37:28 SYS@orcl>create tablespace block datafile '/u01/app/oracle/oradata/orcl/block.dbf' size 1m;
Tablespace created.
09:37:48 SYS@orcl>grant dba to bruce identified by 123456;
Grant succeeded.
09:38:09 SYS@orcl>conn bruce/123456
Connected.
09:38:37 BRUCE@orcl>create table test tablespace block as select * from bruce.test01;
Table created.
09:38:53 BRUCE@orcl>insert into test select * from test;
14 rows created.
09:39:14 BRUCE@orcl>/
28 rows created.
09:39:21 BRUCE@orcl>/
56 rows created.
09:39:22 BRUCE@orcl>/
112 rows created.
09:39:23 BRUCE@orcl>/
224 rows created.
09:39:25 BRUCE@orcl>/
448 rows created.
09:39:27 BRUCE@orcl>/
896 rows created.
09:39:35 BRUCE@orcl>/
1792 rows created.
09:39:36 BRUCE@orcl>/
3584 rows created.
09:39:38 BRUCE@orcl>/
7168 rows created.
09:39:39 BRUCE@orcl>/
insert into test select * from test
*
ERROR at line 1:
ORA-01653: unable to extend table BRUCE.TEST by 8 in tablespace BLOCK
09:39:42 BRUCE@orcl>commit;
Commit complete.
09:39:53 BRUCE@orcl>select count(*) from test;
COUNT(*)
----------
14336
09:40:04 BRUCE@orcl>create index i_test on test(ename);
Index created.
09:40:17 BRUCE@orcl>alter system checkpoint;
System altered.
- Simulate data block corruption
09:40:33 BRUCE@orcl>conn / as sysdba
Connected.
09:40:40 SYS@orcl>shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
09:41:09 SYS@orcl>
--Modify block.dbf with UltraEdit, then start the database again
10:06:21 SYS@orcl>startup
ORACLE instance started.
Total System Global Area 1660944384 bytes
Fixed Size 8621376 bytes
Variable Size 1459618496 bytes
Database Buffers 184549376 bytes
Redo Buffers 8155136 bytes
Database mounted.
Database opened.
10:26:58 SYS@orcl>select count(*) from bruce.test;
select count(*) from bruce.test
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 9, block # 91) --a corrupt block is reported
ORA-01110: data file 9: '/u01/app/oracle/oradata/orcl/block.dbf'
- Check the datafile with dbv
[oracle@ora-server orcl]$ dbv file=/u01/app/oracle/oradata/orcl/block.dbf blocksize=8192
DBVERIFY: Release 12.2.0.1.0 - Production on Fri Dec 30 10:28:11 2022
Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/block.dbf
Page 91 is marked corrupt
Corrupt block relative dba: 0x0240005b (file 9, block 91)
Bad check value found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x0240005b
last change scn: 0x0000.0000.004688dd seq: 0x1 flg: 0x06
spare3: 0x0
consistency value in tail: 0x88dd0601
check value in block header: 0x716f
computed block checksum: 0x6000
DBVERIFY - Verification complete
Total Pages Examined : 128
Total Pages Processed (Data) : 109
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 17
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 1
Total Pages Marked Corrupt : 1
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 4622557 (0.4622557)
- Create the admin tables used to record corrupt-block information
--Repair table (for table data)
10:27:20 SYS@orcl>exec DBMS_REPAIR.ADMIN_TABLES('REPAIR_TABLE',1,1,'USERS');
PL/SQL procedure successfully completed.
--Orphan key table (for index data)
10:36:30 SYS@orcl>exec DBMS_REPAIR.ADMIN_TABLES('ORPHAN_TABLE',2,1,'USERS');
PL/SQL procedure successfully completed.
- Check for corrupt blocks
10:37:24 SYS@orcl>set serveroutput on
10:43:15 SYS@orcl>declare
10:43:31 2 cc number;
10:43:35 3 begin
10:43:38 4 dbms_repair.check_object(schema_name => 'BRUCE',object_name => 'TEST',corrupt_count => cc);
10:44:06 5 dbms_output.put_line(a => to_char(cc));
10:44:27 6 end;
10:44:30 7 /
1
PL/SQL procedure successfully completed.
- Confirm how the corrupt block has been marked
10:47:23 SYS@orcl>select object_name,relative_file_id,block_id,marked_corrupt,corrupt_description,repair_description,CHECK_TIMESTAMP from repair_table;
OBJECT_NAME RELATIVE_FILE_ID BLOCK_ID MARKED_COR
-------------------------------------------------------------------------------------------------------------------------------- ---------------- ---------- ----------
CORRUPT_DESCRIPTION
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
REPAIR_DESCRIPTION
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CHECK_TIM
---------
TEST 91 TRUE
mark block software corrupt
30-DEC-22
- Skip the corrupt block
10:50:45 SYS@orcl>exec dbms_repair.skip_corrupt_blocks(schema_name => 'BRUCE',object_name => 'TEST',flags => 1);
PL/SQL procedure successfully completed.
10:51:18 SYS@orcl>select count(*) from bruce.test;
COUNT(*)
----------
14166
--14336-14166=170 rows are now missing
- Handle the orphaned key values in the index
10:57:39 SYS@orcl>declare
10:59:30 2 cc number;
10:59:34 3 begin
10:59:37 4 dbms_repair.dump_orphan_keys(schema_name => 'BRUCE',object_name => 'I_TEST',object_type => 2,
11:00:18 5 repair_table_name => 'REPAIR_TABLE',orphan_table_name => 'ORPHAN_TABLE',key_count => CC);
11:00:23 6 end;
11:00:26 7 /
PL/SQL procedure successfully completed.
11:01:13 SYS@orcl>select * from orphan_table;
...
170 rows selected.