muugua posted on 2018-9-11 07:56:27

Oracle Study Notes -- Backing Up and Restoring the ASM Disk Header (2)

1. Make sure all ASM instances are shut down.
  
2. Make a backup of the first 4k of the bad disk (the damaged disk header) with dd:

       dd if=<bad disk> of=<backup file> bs=4096 count=1
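
As a rough sketch of step 2, the header copy can be wrapped in a small helper that refuses to overwrite an earlier backup and checks the copy against the disk. The function name backup_header and both paths are hypothetical and not part of the original procedure; only the dd invocation itself comes from the step above.

```shell
#!/bin/sh
# Sketch only: backup_header is a hypothetical helper, not an Oracle tool.
# $1 = path of the damaged ASM disk, $2 = file that receives the 4 KiB copy.
backup_header() {
    disk=$1
    backup=$2

    # Never overwrite an existing backup; it may be the only good copy.
    if [ -e "$backup" ]; then
        echo "backup file already exists: $backup" >&2
        return 1
    fi

    # Same dd call as in step 2: one block of 4096 bytes.
    dd if="$disk" of="$backup" bs=4096 count=1 2>/dev/null || return 1

    # Verify that the backup is byte-identical to the first 4096 bytes.
    head -c 4096 "$disk" | cmp -s - "$backup"
}
```

It could be called as `backup_header /ocfs02/asm/data03 /tmp/data03.hdr.bak`, where the backup path is only an illustration.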
  

  
3. Check the existing disks and see which one has “file 1 block 1” (f1b1):

To find the disk with f1b1, run:

       kfed read <disk> | grep f1b1
  

  
Example:
  
$ kfed read /ocfs02/asm/data01 | grep f1b1
  
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002
  
$ kfed read /ocfs02/asm/data02 | grep f1b1
  
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000
  
       Since data01 has a non-zero f1b1locn value, data01 is the disk with “file 1 block 1”.
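
When a diskgroup has many disks, the same check can be looped. The sketch below assumes kfed is on the PATH and prints lines in the format shown in the example above; the helper names f1b1_locn and find_f1b1_disk are made up for illustration.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Read "kfed read" output on stdin and print the value of kfdhdb.f1b1locn.
f1b1_locn() {
    awk -F'[:;]' '
        { key = $1; gsub(/[ \t]/, "", key) }
        key == "kfdhdb.f1b1locn" { val = $2; gsub(/[ \t]/, "", val); print val; exit }
    '
}

# Print every disk given as an argument whose f1b1locn is non-zero.
find_f1b1_disk() {
    for disk in "$@"; do
        locn=$(kfed read "$disk" | f1b1_locn)
        if [ -n "$locn" ] && [ "$locn" -ne 0 ]; then
            echo "$disk"
        fi
    done
}
```

For the example disks this would be run as `find_f1b1_disk /ocfs02/asm/data01 /ocfs02/asm/data02`.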
  
       Confirm this by checking whether “KFBTYP_LISTHEAD” appears in the 2nd allocation unit:

kfed read <disk> aunum=2 | grep kfbh.type
  

  
Also specify the AU size with AUSZ=# if using a non-default allocation unit size.
  

  
Example:
  
$ kfed read /ocfs02/asm/data01 aunum=2 |grep kfbh.type
  
kfbh.type: 5 ; 0x002: KFBTYP_LISTHEAD
  

  
       If the lost disk is the “file 1 block 1” disk, then scan every AU of the bad disk until you find a header that claims to be FILE_DIRECTORY (KFBTYP_FILEDIR).
  

  
       Once you find it, set f1b1locn to that AU number and continue. If the file directory cannot be found anywhere, there is no choice but to re-create the diskgroup and restore from a backup.
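
The AU scan described above could be scripted roughly as follows. This is a sketch under assumptions: find_filedir_au is a hypothetical helper, the upper AU bound must be supplied by the caller, and only the block that kfed dumps by default for each AU is inspected. For a non-default AU size the AU-size option mentioned earlier would also have to be passed to kfed.

```shell
#!/bin/sh
# Sketch only: find_filedir_au is a hypothetical helper.
# $1 = disk path, $2 = highest AU number to probe (chosen by the caller).
# Prints the first AU whose dumped header mentions KFBTYP_FILEDIR.
find_filedir_au() {
    disk=$1
    last_au=$2
    au=0
    while [ "$au" -le "$last_au" ]; do
        if kfed read "$disk" aunum="$au" 2>/dev/null | grep -q 'KFBTYP_FILEDIR'; then
            echo "$au"
            return 0
        fi
        au=$((au + 1))
    done
    # Nothing found in the probed range.
    return 1
}
```

A non-zero exit status means no file directory header was seen in the probed range, which is the case where the post says the diskgroup has to be re-created.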
  

  
4. Make a copy of a good disk header with kfed, taken from a disk that IS NOT the disk that contains f1b1 and is in the SAME diskgroup as the bad disk.

       In the test above, f1b1 is on data01, so data01 must not be used as the source.
  

  
In our example this is data02:

       kfed read <good disk> > fix.txt
  

  
Example:
  
$ kfed read /ocfs02/asm/data02 > fix.txt
  

  
5. Edit fix.txt and change the following fields to the proper values (use the ASM alert log for reference):

       kfdhdb.dsknum

       kfdhdb.dskname

       kfdhdb.fgname
  

  
Example:
  
Check the alert log for proper names:
  

  
NOTE: cache opening disk 0 of grp 1:DATA_0000 path:/ocfs02/asm/data01
  
NOTE: cache opening disk 1 of grp 1:DATA_0001 path:/ocfs02/asm/data02
  
NOTE: cache opening disk 2 of grp 1:DATA_0002 path:/ocfs02/asm/data03
  

  
Old values from fix.txt:
  
       kfdhdb.dsknum:1 ; 0x024: 0x0001
  
       kfdhdb.grptyp:1 ; 0x026: KFDGTP_EXTERNAL
  

  
       kfdhdb.hdrsts:3 ; 0x027: KFDHDR_MEMBER
  

  
       kfdhdb.dskname:DATA_0001 ; 0x028: length=9
  
       kfdhdb.grpname:DATA ; 0x048: length=4
  
       kfdhdb.fgname:DATA_0001 ; 0x068: length=9
  

  
New values from fix.txt:
  
       kfdhdb.dsknum:2 ; 0x024: 0x0002
  
       kfdhdb.grptyp:1 ; 0x026: KFDGTP_EXTERNAL
  

  
       kfdhdb.hdrsts:3 ; 0x027: KFDHDR_MEMBER
  

  
       kfdhdb.dskname:DATA_0002 ; 0x028: length=9
  
       kfdhdb.grpname:DATA ; 0x048: length=4
  
       kfdhdb.fgname:DATA_0002 ; 0x068: length=9
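
After editing, the three identity fields can be read back from fix.txt and compared with the values taken from the alert log. The helpers below are hypothetical and only read the file; they assume each line has the `name: value ; offset: detail` shape shown above.

```shell
#!/bin/sh
# Sketch only: field_value and check_identity are hypothetical helpers.

# Print the value of one field, i.e. the text between ":" and ";".
# $1 = kfed text file, $2 = field name such as kfdhdb.dsknum
field_value() {
    awk -F'[:;]' -v name="$2" '
        { key = $1; gsub(/[ \t]/, "", key) }
        key == name { val = $2; gsub(/[ \t]/, "", val); print val; exit }
    ' "$1"
}

# Succeed only if disk number, disk name and failgroup name all match.
# $1 = file, $2 = expected dsknum, $3 = expected dskname, $4 = expected fgname
check_identity() {
    file=$1
    [ "$(field_value "$file" kfdhdb.dsknum)" = "$2" ] &&
    [ "$(field_value "$file" kfdhdb.dskname)" = "$3" ] &&
    [ "$(field_value "$file" kfdhdb.fgname)" = "$4" ]
}
```

For the example, `check_identity fix.txt 2 DATA_0002 DATA_0002` should succeed once the edit is done.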
  

  
6. Find the disk directory through the file directory by dumping aunum=2 and blknum=2 for the disk with f1b1:

      kfed read <f1b1 disk> aunum=2 blknum=2 | more
  

  
Example:
  
$ kfed read /ocfs02/asm/data01 aunum=2 blknum=2 | more

kfffde.xptr.au: 2 ; 0x4a0: 0x00000002

kfffde.xptr.disk: 2 ; 0x4a4: 0x0002

kfffde.xptr.flags: 0 ; 0x4a6: L=0 E=0 D=0 S=0

kfffde.xptr.chk: 42 ; 0x4a7: 0x2a

kfffde.xptr.au: 4294967295 ; 0x4a8: 0xffffffff

kfffde.xptr.disk: 65535 ; 0x4ac: 0xffff

kfffde.xptr.flags: 0 ; 0x4ae: L=0 E=0 D=0 S=0

kfffde.xptr.chk: 42 ; 0x4af: 0x2a
  

  
       After the initial file directory header, you will see the extent map. If the diskgroup is external redundancy, then each entry refers to an extent of the file. For normal redundancy, every pair of entries is an extent set; similarly, for high redundancy every three entries form the extent set. Here we see the disk directory is at au = 2 in disk number = 2.

       In this example it turned out to be on the second AU, but it is not guaranteed that it will always be there.
  

  
7. Once the disk directory location is found, find the info for your disk number:

       kfed read <disk> aunum=2 blknum=0 | more
  

  
Example:
  

  
$ kfed read /ocfs02/asm/data02 aunum=2 blknum=0 | more

kfbh.type: 6 ; 0x002: KFBTYP_DISKDIR

...

kfddde.entry.incarn: 1 ; 0x024: A=1 NUMM=0x0

-- A=1 marks an allocated entry; A=0 means the entry has been deleted.
  
...
  

  
kfddde.dsknum: 2 ; 0x3b4: 0x0002
  
kfddde.state: 2 ; 0x3b6: KFDSTA_NORMAL
  
kfddde.ub1spare: 0 ; 0x3b7: 0x00

kfddde.dskname: DATA_0002 ; 0x3b8: length=9

kfddde.fgname: DATA_0002 ; 0x3d8: length=9



kfddde.crestmp.hi: 32885842 ; 0x3f8: HOUR=0x12 DAYS=0x2 MNTH=0x3 YEAR=0x7d7

kfddde.crestmp.lo: 3860343808 ; 0x3fc: USEC=0x0 MSEC=0x20b SECS=0x21 MINS=0x39

kfddde.failstmp.hi: 0 ; 0x400: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0

kfddde.failstmp.lo: 0 ; 0x404: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
  

  
       The various kfddde fields refer to the disk directory entries. Only entries whose entry.incarn shows A=1 are allocated entries. You might find entries with dskname populated, but if A=0 then that entry was deleted.
  

  
8. Now go back to fix.txt and adjust crestmp.hi and crestmp.lo to match what the disk directory shows. If they are already the same, leave them.
  

  
Example:
  

  
Before:
  
kfdhdb.crestmp.hi: 32879468 ; 0x0a8: HOUR=0xc DAYS=0x1b MNTH=0xc YEAR=0x7d6

kfdhdb.crestmp.lo: 296378368 ; 0x0ac: USEC=0x0 MSEC=0x298 SECS=0x1a MINS=0x4

kfdhdb.mntstmp.hi: 32879468 ; 0x0b0: HOUR=0xc DAYS=0x1b MNTH=0xc YEAR=0x7d6

kfdhdb.mntstmp.lo: 309633024 ; 0x0b4: USEC=0x0 MSEC=0x128 SECS=0x27 MINS=0x4
  

  
After:
  
kfdhdb.crestmp.hi: 32885842 ; 0x0a8: HOUR=0x12 DAYS=0x2 MNTH=0x3 YEAR=0x7d7

kfdhdb.crestmp.lo: 3860343808 ; 0x0ac: USEC=0x0 MSEC=0x20b SECS=0x21 MINS=0x39



kfdhdb.mntstmp.hi: 32885842 ; 0x0b0: HOUR=0x12 DAYS=0x2 MNTH=0x3 YEAR=0x7d7

kfdhdb.mntstmp.lo: 3870944256 ; 0x0b4: USEC=0x0 MSEC=0x27b SECS=0x2b MINS=0x39
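
To double-check a timestamp pair before merging, the decimal values can be decoded back into the HOUR/DAYS/MNTH/YEAR and USEC/MSEC/SECS/MINS parts that kfed prints. The bit layout used below is inferred from the sample values in this post (it reproduces all of them), not taken from Oracle documentation, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: bit widths are inferred from the kfed dumps in this post.

# Decode a *.hi timestamp word: 5 bits hour, 5 bits day, 4 bits month, rest year.
decode_stmp_hi() {
    v=$1
    printf 'HOUR=0x%x DAYS=0x%x MNTH=0x%x YEAR=0x%x\n' \
        $(( $v & 31 )) $(( ($v >> 5) & 31 )) $(( ($v >> 10) & 15 )) $(( $v >> 14 ))
}

# Decode a *.lo timestamp word: 10 bits usec, 10 bits msec, 6 bits sec, 6 bits min.
decode_stmp_lo() {
    v=$1
    printf 'USEC=0x%x MSEC=0x%x SECS=0x%x MINS=0x%x\n' \
        $(( $v & 1023 )) $(( ($v >> 10) & 1023 )) $(( ($v >> 20) & 63 )) $(( ($v >> 26) & 63 ))
}
```

For example, `decode_stmp_hi 32885842` prints `HOUR=0x12 DAYS=0x2 MNTH=0x3 YEAR=0x7d7` and `decode_stmp_lo 3860343808` prints `USEC=0x0 MSEC=0x20b SECS=0x21 MINS=0x39`, matching the disk directory dump in step 7.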
  

  
9. Do a kfed merge to write the edited header from fix.txt onto the damaged disk:

kfed merge <bad disk> text=fix.txt
  

  
Example:
  
kfed merge /ocfs02/asm/data03 text=fix.txt
  

  
       If you are using ASMLIB, at this point you will need to run the following to fix the ASMLIB portion of the header:

       /etc/init.d/oracleasm force-renamedisk /dev/sdbg1
  
       /etc/init.d/oracleasm scandisks
  
       /etc/init.d/oracleasm listdisks
  

  
10. Startup nomount the ASM instance:

SQL> startup nomount;
  

  
11. Check v$asm_disk.header_status to verify that the disk header is in a “MEMBER” state.
  
Example:
  

  
SQL> select path, header_status from v$asm_disk where path like '%data03%';

PATH
--------------------------------------------------------------------------------
HEADER_STATU
------------
/ocfs02/asm/data03
MEMBER
  

  
12. Mount the diskgroup:

       alter diskgroup <diskgroup name> mount;
  

  
       If the diskgroup fails to mount at this point, consider either re-creating the diskgroup and restoring from backup, or engaging BDE to assist.

       You may also want to try clearing the first 4k of the disk with dd and then running kfed merge again, in case there are any extra characters causing problems (MAKE SURE YOU HAVE A BACKUP OF THE FIRST 4K FIRST):
  

  
Example:
  
dd if=<bad disk> of=<backup file> bs=4096 count=1

dd if=/dev/zero of=<bad disk> bs=4096 count=1
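
Because zeroing the header is destructive, a guarded version might look like the sketch below. The helper clear_header is hypothetical; it refuses to run unless a 4096-byte backup file exists. It also adds conv=notrunc, which is not in the original commands: when the ASM disk is a regular file rather than a block device (the /ocfs02/asm paths in this post look file-backed, which is an assumption), dd without conv=notrunc would truncate the file to 4k.

```shell
#!/bin/sh
# Sketch only: clear_header is a hypothetical helper, not an Oracle tool.
# $1 = disk whose first 4 KiB will be zeroed, $2 = existing 4 KiB header backup.
clear_header() {
    disk=$1
    backup=$2

    # Insist on a complete header backup before destroying anything.
    if [ ! -f "$backup" ]; then
        echo "no header backup found: $backup" >&2
        return 1
    fi
    size=$(wc -c < "$backup" | tr -d ' ')
    if [ "$size" != "4096" ]; then
        echo "backup is $size bytes, expected 4096" >&2
        return 1
    fi

    # conv=notrunc keeps the rest of a file-backed disk intact.
    dd if=/dev/zero of="$disk" bs=4096 count=1 conv=notrunc 2>/dev/null
}
```

After a successful `clear_header <bad disk> <backup file>`, the kfed merge from step 9 would be repeated.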
  

  
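4.2 Notes

       Every diskgroup in my test environment has only one disk, so I could not test this procedure. I can only recover the header from a backup and cannot rebuild it.



If you do rebuild the header, take the following parameters from the file directory: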
  
kfdhdb.dsknum: 0 ; 0x024: 0x0000

kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL

kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER

kfdhdb.dskname: DATA ; 0x028: length=4

kfdhdb.grpname: DATA ; 0x048: length=4

kfdhdb.fgname: DATA ; 0x068: length=4
  

  
Take the following parameters from the disk directory:

kfdhdb.crestmp.hi: 32885842 ; 0x0a8: HOUR=0x12 DAYS=0x2 MNTH=0x3 YEAR=0x7d7

kfdhdb.crestmp.lo: 3860343808 ; 0x0ac: USEC=0x0 MSEC=0x20b SECS=0x21 MINS=0x39

