Common Ways to Handle Oracle Data Block Corruption

1 Prerequisite: back up the database

Check the database mode

SYS@orcl>select open_mode,log_mode from v$database;

OPEN_MODE	     LOG_MODE
-------------------- ------------
READ WRITE	     ARCHIVELOG

Change the RMAN backup settings

RMAN> configure controlfile autobackup on;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

RMAN> configure controlfile autobackup format for device type disk to '/tmp/backup/%F';

old RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/tmp/backup/cs_%F';
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/tmp/backup/%F';
new RMAN configuration parameters are successfully stored

Create a test tablespace and load some data into it

SYS@orcl>create tablespace tbs02 datafile '/u01/app/oracle/oradata/orcl/tbs002.dbf' size 1m;

Tablespace created.

SYS@orcl>create table bruce.test01 tablespace tbs02 as select * from emp;

Table created.

SYS@orcl>select count(*) from bruce.test01;

  COUNT(*)
----------
	14

Back up the database from RMAN

RMAN> backup database format '/tmp/backup/%U' tag=bruce20221216;

Starting backup at 2022-12-16 11:13:48
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/orcl/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/orcl/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/orcl/undotbs01.dbf
input datafile file number=00005 name=/u01/app/oracle/oradata/orcl/tbs01_001.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/orcl/users01.dbf
input datafile file number=00002 name=/u01/app/oracle/tbs003.dbf
input datafile file number=00008 name=/u01/app/oracle/oradata/orcl/tbs002.dbf
channel ORA_DISK_1: starting piece 1 at 2022-12-16 11:13:48
channel ORA_DISK_1: finished piece 1 at 2022-12-16 11:13:55
piece handle=/tmp/backup/141fh3vc_1_1 tag=BRUCE20221216 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
Finished backup at 2022-12-16 11:13:55

Starting Control File and SPFILE Autobackup at 2022-12-16 11:13:55
piece handle=/tmp/backup/c-1648706630-20221216-03 comment=NONE
Finished Control File and SPFILE Autobackup at 2022-12-16 11:13:56

Review the backups

RMAN> list backup of database;

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time    
------- ---- -- ---------- ----------- ------------ -------------------
33      Full    1.23G      DISK        00:00:03     2022-12-16 11:13:51
        BP Key: 33   Status: AVAILABLE  Compressed: NO  Tag: BRUCE20221216
        Piece Name: /tmp/backup/141fh3vc_1_1
  List of Datafiles in backup set 33
  File LV Type Ckp SCN    Ckp Time            Abs Fuz SCN Sparse Name
  ---- -- ---- ---------- ------------------- ----------- ------ ----
  1       Full 3900401    2022-12-16 11:13:48              NO    /u01/app/oracle/oradata/orcl/system01.dbf
  2       Full 3900401    2022-12-16 11:13:48              NO    /u01/app/oracle/tbs003.dbf
  3       Full 3900401    2022-12-16 11:13:48              NO    /u01/app/oracle/oradata/orcl/sysaux01.dbf
  4       Full 3900401    2022-12-16 11:13:48              NO    /u01/app/oracle/oradata/orcl/undotbs01.dbf
  5       Full 3900401    2022-12-16 11:13:48              NO    /u01/app/oracle/oradata/orcl/tbs01_001.dbf
  7       Full 3900401    2022-12-16 11:13:48              NO    /u01/app/oracle/oradata/orcl/users01.dbf
  8       Full 3900401    2022-12-16 11:13:48              NO    /u01/app/oracle/oradata/orcl/tbs002.dbf

RMAN> list backup of controlfile;

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time    
------- ---- -- ---------- ----------- ------------ -------------------
34      Full    10.19M     DISK        00:00:00     2022-12-16 11:13:55
        BP Key: 34   Status: AVAILABLE  Compressed: NO  Tag: TAG20221216T111355
        Piece Name: /tmp/backup/c-1648706630-20221216-03
  Control File Included: Ckp SCN: 3900420      Ckp time: 2022-12-16 11:13:55

RMAN> list backup of spfile;

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time    
------- ---- -- ---------- ----------- ------------ -------------------
34      Full    10.19M     DISK        00:00:00     2022-12-16 11:13:55
        BP Key: 34   Status: AVAILABLE  Compressed: NO  Tag: TAG20221216T111355
        Piece Name: /tmp/backup/c-1648706630-20221216-03
  SPFILE Included: Modification time: 2022-12-16 09:47:46
  SPFILE db_unique_name: ORCL

-- Backup files --

[root@ora-server backup]# pwd 
/tmp/backup
[root@ora-server backup]# ll
total 1299680
-rw-r----- 1 oracle oinstall 1320173568 Dec 16 11:13 141fh3vc_1_1
-rw-r----- 1 oracle oinstall   10698752 Dec 16 11:13 c-1648706630-20221216-03

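2 Failure recovery

2.1 Creating the fault

Find the datafile that belongs to tablespace tbs02 and modify it.

A datafile is a binary file. Edit its contents with vi, taking care not to touch the header information; change bytes in the middle of the file wherever possible.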
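
To make "the middle of the file" concrete, here is a rough sketch of how a block number maps to a byte position. It assumes the commonly seen layout in which block N starts at byte N x block size, with 8192 as the block size (the value passed to dbv in this article). The names block_offset and scribble are hypothetical helpers, and the demo runs against a throwaway scratch file, never a live datafile.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumption: same value as blocksize=8192 given to dbv


def block_offset(block_no: int, block_size: int = BLOCK_SIZE) -> int:
    """Byte offset at which block `block_no` starts (assumed layout: N * block_size)."""
    if block_no < 0:
        raise ValueError("block number must be non-negative")
    return block_no * block_size


def scribble(path: str, block_no: int, junk: bytes = b"XXXXXXXX", skew: int = 4096) -> int:
    """Overwrite a few bytes inside one block and return the absolute offset written."""
    if skew < 0 or skew + len(junk) > BLOCK_SIZE:
        raise ValueError("write would fall outside the chosen block")
    pos = block_offset(block_no) + skew
    with open(path, "r+b") as fh:
        fh.seek(pos)
        fh.write(junk)
    return pos


if __name__ == "__main__":
    # Scratch file of 128 zero-filled blocks standing in for a 1 MB test datafile.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(128 * BLOCK_SIZE))
    print(block_offset(91))        # 745472
    print(scribble(tmp.name, 91))  # 749568
    os.remove(tmp.name)
```

Writing at a fixed distance into a block well past the first few blocks keeps the change away from the file header, which is the point of the advice above.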

2.2 The dbverify command

Check whether the datafile contains corrupt blocks:

[oracle@ora-server orcl]$ dbv file=/u01/app/oracle/oradata/orcl/tbs001.dbf blocksize=8192

DBVERIFY: Release 12.2.0.1.0 - Production on Fri Dec 30 10:28:11 2022

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/tbs001.dbf
Page 91 is marked corrupt
Corrupt block relative dba: 0x0240005b (file 9, block 91)
Bad check value found during dbv: 
Data in bad block:
 type: 6 format: 2 rdba: 0x0240005b
 last change scn: 0x0000.0000.004688dd seq: 0x1 flg: 0x06
 spare3: 0x0
 consistency value in tail: 0x88dd0601
 check value in block header: 0x716f
 computed block checksum: 0x6000

DBVERIFY - Verification complete

Total Pages Examined         : 128
Total Pages Processed (Data) : 109
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 17
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 4622557 (0.4622557)

The output shows that block 91 of file 9 is corrupt. The dbv command can only detect corruption; it cannot repair it.
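
The "relative dba" value in the DBVERIFY output packs the file number and the block number into a single 32-bit value. A minimal sketch of the decoding follows, assuming the usual smallfile layout (upper 10 bits for the relative file number, lower 22 bits for the block number). The function names are hypothetical.

```python
from typing import Tuple


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a 32-bit relative DBA into (relative file number, block number)."""
    if not 0 <= rdba <= 0xFFFFFFFF:
        raise ValueError("rdba must fit in 32 bits")
    file_no = rdba >> 22        # upper 10 bits
    block_no = rdba & 0x3FFFFF  # lower 22 bits
    return file_no, block_no


def encode_rdba(file_no: int, block_no: int) -> int:
    """Inverse of decode_rdba, handy for round-trip checks."""
    return (file_no << 22) | block_no


print(decode_rdba(0x0240005B))     # (9, 91)
print(hex(encode_rdba(9, 91)))     # 0x240005b
```

The decoded pair matches the "(file 9, block 91)" that dbv prints next to the hexadecimal value.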

2.3 The blockrecover command

Use the blockrecover command in RMAN to repair the corrupt block

RMAN> blockrecover datafile 9 block 91;

Starting recover at 30-DEC-22
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1163 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=8 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=396 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=783 device type=DISK
searching flashback logs for block images
finished flashback log search, restored 1 blocks

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 30-DEC-22

Exit RMAN and check again

[oracle@ora-server ~]$ dbv file=/u01/app/oracle/oradata/orcl/tbs001.dbf blocksize=8192

DBVERIFY: Release 12.2.0.1.0 - Production on Fri Dec 30 11:05:45 2022

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/tbs001.dbf

DBVERIFY - Verification complete

Total Pages Examined         : 128
Total Pages Processed (Data) : 110
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 17
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 4622557 (0.4622557)
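
Comparing two DBVERIFY summaries by eye is easy to get wrong. The sketch below pulls the "Total Pages ..." counters out of the text and reports which ones changed. It assumes the summary format shown in this article; the function names are hypothetical.

```python
import re

# Matches lines such as "Total Pages Marked Corrupt   : 1"
SUMMARY_LINE = re.compile(r"^(Total Pages [^:]+?)\s*:\s*(\d+)$")


def parse_dbv_summary(text: str) -> dict:
    """Return a dict such as {'Total Pages Examined': 128, ...}."""
    counters = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            name = " ".join(match.group(1).split())  # collapse padding spaces
            counters[name] = int(match.group(2))
    return counters


def changed_counters(before: dict, after: dict) -> dict:
    """Map each counter whose value differs to a (before, after) pair."""
    return {k: (before[k], after[k])
            for k in before if k in after and before[k] != after[k]}


before = parse_dbv_summary("""
Total Pages Examined         : 128
Total Pages Processed (Data) : 109
Total Pages Marked Corrupt   : 1
""")
after = parse_dbv_summary("""
Total Pages Examined         : 128
Total Pages Processed (Data) : 110
Total Pages Marked Corrupt   : 0
""")
print(changed_counters(before, after))
# {'Total Pages Processed (Data)': (109, 110), 'Total Pages Marked Corrupt': (1, 0)}
```

With the excerpts from the two runs above, the repaired block moves from the "Marked Corrupt" counter to the "Processed (Data)" counter.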

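2.4 The analyze command

The analyze command runs a logical check that a table and its indexes match. It only reports errors and does not mark corrupt blocks.

  1. Create the test table test01 and add the corresponding index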
SYS@orcl>create table bruce.test01 tablespace users as select * from emp;

Table created.

SYS@orcl>create index bruce.ind_01 on bruce.test01(empno) tablespace users;

Index created.
  2. Use the analyze command to check that the table and index match
SYS@orcl>analyze table bruce.test01 validate structure cascade;

Table analyzed.

SYS@orcl>analyze index bruce.ind_01 validate structure;

Index analyzed.
  3. Move the test table
SYS@orcl>alter table bruce.test01 move;
  4. Check the table and index again
09:14:01 SYS@orcl>analyze index bruce.ind_01 validate structure;
analyze index bruce.ind_01 validate structure
*
ERROR at line 1:
ORA-01502: index 'BRUCE.IND_01' or partition of such index is in unusable state

09:16:13 SYS@orcl>analyze table bruce.test01 validate structure cascade;
analyze table bruce.test01 validate structure cascade
*
ERROR at line 1:
ORA-01502: index 'BRUCE.IND_01' or partition of such index is in unusable state

Moving the base table changes the rowids of its rows, which leaves the index unusable.

  5. Fix: drop and recreate the index, or rebuild it
09:24:13 SYS@orcl>alter index bruce.ind_01 rebuild;

Index altered.
  6. Check again
09:25:14 SYS@orcl>analyze table bruce.test01 validate structure cascade;

Table analyzed.

09:25:45 SYS@orcl>analyze index bruce.ind_01 validate structure;

Index analyzed.

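2.5 Database initialization parameters

Settings for the parameters db_block_checking and db_block_checksum

db_block_checking defaults to FALSE. Setting it to TRUE makes the database check all data blocks: it reads through the data in a block to confirm that the block is self-consistent. Depending on the workload, this typically adds 1% to 10% overhead.
The parameter accepts the following values (FALSE and TRUE are kept for backward compatibility and mean OFF and FULL respectively):

  1. off: no block checking is performed in any tablespace other than the system tablespace
  2. low: basic block header checks are performed after the contents of a block change in memory
  3. medium: all low checks, plus block checking for all tables that are not index-organized
  4. full: all low and medium checks, plus checks on index blocks

Default value: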

09:26:04 SYS@orcl>show parameter db_block_checking

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_block_checking		     string	 FALSE

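This parameter can be changed with either alter session or alter system.

db_block_checksum controls block checksums. Its default value is TYPICAL (TRUE is accepted for backward compatibility and means the same thing). The checksum is a number computed from all the bytes stored in a block. When DBWR writes a dirty block, it also stores this number in the block header; when the block is read back later, the checksum is recomputed and compared with the value in the header. This typically costs the database 1% to 2% overhead, and Oracle recommends keeping the parameter enabled.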

09:34:05 SYS@orcl>show parameter db_block_checksum;

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_block_checksum		     string	 TYPICAL
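
The write-then-verify cycle described for db_block_checksum can be illustrated with a toy model. This is only a sketch: the 16-bit XOR fold and the 2-byte header used here are assumptions chosen for brevity, not Oracle's actual checksum algorithm or block layout, and the function names are hypothetical.

```python
def xor16(data: bytes) -> int:
    """Fold the data into 16 bits by XOR-ing consecutive little-endian 2-byte words."""
    acc = 0
    for i in range(0, len(data), 2):
        acc ^= int.from_bytes(data[i:i + 2], "little")
    return acc


def seal(payload: bytes) -> bytes:
    """'Write' path: store the checksum of the payload in a 2-byte toy header."""
    return xor16(payload).to_bytes(2, "little") + payload


def verify(block: bytes) -> bool:
    """'Read' path: recompute the checksum and compare it with the stored value."""
    stored = int.from_bytes(block[:2], "little")
    return stored == xor16(block[2:])


block = seal(b"SMITH,CLERK,800;" * 4)
print(verify(block))            # True

damaged = bytearray(block)
damaged[10] ^= 0xFF             # flip every bit of one payload byte
print(verify(bytes(damaged)))   # False
```

A mismatch between the stored and the recomputed value is the same kind of symptom that dbv reports as "Bad check value found" in the outputs in this article.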

2.6 Using dbms_repair

09:37:28 SYS@orcl>create tablespace block datafile '/u01/app/oracle/oradata/orcl/block.dbf' size 1m;

Tablespace created.

09:37:48 SYS@orcl>grant dba to bruce identified by 123456;

Grant succeeded.

09:38:09 SYS@orcl>conn bruce/123456
Connected.

09:38:37 BRUCE@orcl>create table test tablespace block as select * from bruce.test01;

Table created.

09:38:53 BRUCE@orcl>insert into test select * from test;

14 rows created.

09:39:14 BRUCE@orcl>/

28 rows created.

09:39:21 BRUCE@orcl>/

56 rows created.

09:39:22 BRUCE@orcl>/

112 rows created.

09:39:23 BRUCE@orcl>/

224 rows created.

09:39:25 BRUCE@orcl>/

448 rows created.

09:39:27 BRUCE@orcl>/

896 rows created.

09:39:35 BRUCE@orcl>/

1792 rows created.

09:39:36 BRUCE@orcl>/

3584 rows created.

09:39:38 BRUCE@orcl>/

7168 rows created.

09:39:39 BRUCE@orcl>/
insert into test select * from test
*
ERROR at line 1:
ORA-01653: unable to extend table BRUCE.TEST by 8 in tablespace BLOCK

09:39:42 BRUCE@orcl>commit;

Commit complete.

09:39:53 BRUCE@orcl>select count(*) from test;

  COUNT(*)
----------
     14336

09:40:04 BRUCE@orcl>create index i_test on test(ename);

Index created.

09:40:17 BRUCE@orcl>alter system checkpoint;

System altered.
  1. Simulate a corrupt data block
09:40:33 BRUCE@orcl>conn / as sysdba
Connected.
09:40:40 SYS@orcl>shutdown immediate 
Database closed.
Database dismounted.
ORACLE instance shut down.
09:41:09 SYS@orcl>

-- Modify block.dbf with UltraEdit, then start the database

10:06:21 SYS@orcl>startup
ORACLE instance started.

Total System Global Area 1660944384 bytes
Fixed Size		    8621376 bytes
Variable Size		 1459618496 bytes
Database Buffers	  184549376 bytes
Redo Buffers		    8155136 bytes
Database mounted.
Database opened.
10:26:58 SYS@orcl>select count(*) from bruce.test;
select count(*) from bruce.test
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 9, block # 91)		-- reports a corrupt block
ORA-01110: data file 9: '/u01/app/oracle/oradata/orcl/block.dbf'
  2. Check the datafile with dbv
[oracle@ora-server orcl]$ dbv file=/u01/app/oracle/oradata/orcl/block.dbf blocksize=8192

DBVERIFY: Release 12.2.0.1.0 - Production on Fri Dec 30 10:28:11 2022

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/block.dbf
Page 91 is marked corrupt
Corrupt block relative dba: 0x0240005b (file 9, block 91)
Bad check value found during dbv: 
Data in bad block:
 type: 6 format: 2 rdba: 0x0240005b
 last change scn: 0x0000.0000.004688dd seq: 0x1 flg: 0x06
 spare3: 0x0
 consistency value in tail: 0x88dd0601
 check value in block header: 0x716f
 computed block checksum: 0x6000

DBVERIFY - Verification complete

Total Pages Examined         : 128
Total Pages Processed (Data) : 109
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 17
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 4622557 (0.4622557)
  3. Create the admin tables that record corrupt-block information
-- repair table (table data)
10:27:20 SYS@orcl>exec DBMS_REPAIR.ADMIN_TABLES('REPAIR_TABLE',1,1,'USERS');

PL/SQL procedure successfully completed.

-- orphan key table (index data)
10:36:30 SYS@orcl>exec DBMS_REPAIR.ADMIN_TABLES('ORPHAN_TABLE',2,1,'USERS');

PL/SQL procedure successfully completed.
  4. Check for corrupt blocks
10:37:24 SYS@orcl>set serveroutput on
10:43:15 SYS@orcl>declare
10:43:31   2  cc number;
10:43:35   3  begin
10:43:38   4  dbms_repair.check_object(schema_name => 'BRUCE',object_name => 'TEST',corrupt_count => cc);
10:44:06   5  dbms_output.put_line(a => to_char(cc));
10:44:27   6  end;
10:44:30   7  /
1

PL/SQL procedure successfully completed.
  5. Confirm that the corrupt block has been marked
10:47:23 SYS@orcl>select object_name,relative_file_id,block_id,marked_corrupt,corrupt_description,repair_description,CHECK_TIMESTAMP from repair_table;

OBJECT_NAME															 RELATIVE_FILE_ID   BLOCK_ID MARKED_COR
-------------------------------------------------------------------------------------------------------------------------------- ---------------- ---------- ----------
CORRUPT_DESCRIPTION
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
REPAIR_DESCRIPTION
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CHECK_TIM
---------
TEST																        91 TRUE

mark block software corrupt
30-DEC-22
  6. Skip the corrupt block
10:50:45 SYS@orcl>exec dbms_repair.skip_corrupt_blocks(schema_name => 'BRUCE',object_name => 'TEST',flags => 1);

PL/SQL procedure successfully completed.

10:51:18 SYS@orcl>select count(*) from bruce.test;

  COUNT(*)
----------
     14166

-- 14336 - 14166 = 170 rows are missing
  7. Handle the orphaned key values in the index
10:57:39 SYS@orcl>declare
10:59:30   2  cc number;
10:59:34   3  begin
10:59:37   4  dbms_repair.dump_orphan_keys(schema_name => 'BRUCE',object_name => 'I_TEST',object_type => 2,
11:00:18   5  repair_table_name => 'REPAIR_TABLE',orphan_table_name => 'ORPHAN_TABLE',key_count => CC);
11:00:23   6  end;
11:00:26   7  /

PL/SQL procedure successfully completed.
11:01:13 SYS@orcl>select * from orphan_table;

...

170 rows selected.
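
As a cross-check of the numbers in this walkthrough, the sketch below reconciles the row counts: 14 seed rows doubled by 10 successful self-inserts, the count visible after skipping the corrupt block, and the number of rows returned from orphan_table. The function name is hypothetical; the figures are taken from the transcripts above.

```python
def rows_after_doubling(initial_rows: int, successful_inserts: int) -> int:
    """Each successful `insert into t select * from t` doubles the row count."""
    return initial_rows * 2 ** successful_inserts


total_rows = rows_after_doubling(14, 10)  # 10 inserts succeeded before ORA-01653
visible_rows = 14166                      # count(*) after skip_corrupt_blocks
orphan_keys = 170                         # rows selected from orphan_table

lost_rows = total_rows - visible_rows
print(total_rows)                # 14336
print(lost_rows)                 # 170
print(lost_rows == orphan_keys)  # True
```

The match between lost rows and orphan keys is what one would expect here, since index I_TEST holds one entry per row and the rows in the skipped block are no longer reachable from the table.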