
[Experience sharing] Oracle bug causes an instance crash: ORA-07445

Posted on 2013-12-24 09:01:11
Symptom:
The customer's database (a RAC environment, 11.1.0.6) suffered an unexpected instance crash, accompanied by an ORA-07445 error:
Sun Jun 23 01:00:06 2013
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xF] [PC:0x755773D, kcbw_get_bh()+67]
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_mman_2015.trc (incident=298938):
ORA-07445: exception encountered: core dump [kcbw_get_bh()+67] [SIGSEGV] [ADDR:0xF] [PC:0x755773D] [Address not mapped to object] []
Incident details in: /oracle/app/11gR1/diag/rdbms/xij/xij1/incident/incdir_298938/xij1_mman_2015_i298938.trc
Sun Jun 23 01:00:07 2013
Trace dumping is performing id=[cdmp_20130623010007]
Sun Jun 23 01:00:09 2013
Sweep Incident[298938]: completed
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_pmon_1981.trc:
ORA-00822: MMAN process terminated with error
PMON (ospid: 1981): terminating the instance due to error 822
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00822: MMAN process terminated with error
Sun Jun 23 01:00:09 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_m000_22430.trc:
ORA-00822: MMAN process terminated with error
System state dump is made for local instance
System State dumped to trace file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_diag_1987.trc
Sun Jun 23 01:00:09 2013
ORA-1092 : opiodr aborting process unknown ospid (11096_47524616916112)
Sun Jun 23 01:00:09 2013
ORA-1092 : opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092 : opiodr aborting process unknown ospid (6317_47353365785744)
Sun Jun 23 01:00:09 2013
ORA-1092 : opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092 : opiodr aborting process unknown ospid (28698_47056912551056)
Sun Jun 23 01:00:09 2013
ORA-1092 : opitsk aborting process
Sun Jun 23 01:00:09 2013
ORA-1092 : opiodr aborting process unknown ospid (18927_47567504653456)
Sun Jun 23 01:00:10 2013
ORA-1092 : opitsk aborting process
Sun Jun 23 01:00:10 2013
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_q001_3487.trc:
ORA-00822: MMAN process terminated with error
ORA-1092 : opidrv aborting process Q001 ospid (3487_47252506410128)
Sun Jun 23 01:00:11 2013
ORA-1092 : opitsk aborting process
Sun Jun 23 01:00:11 2013
License high water mark = 510
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_m000_22430.trc:
ORA-00822: MMAN process terminated with error
ORA-00822: MMAN process terminated with error
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00449: background process 'LGWR' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00449: background process 'LGWR' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_j000_22268.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00822: MMAN process terminated with error
ORA-06512: at "WKSYS.WK_JOB", line 442
ORA-00449: background process 'MMON' unexpectedly terminated with error 822
ORA-00822: MMAN process terminated with error
ORA-06512: at line 1
ORA-1092 : opidrv aborting process J000 ospid (22268_47357930925200)
Sun Jun 23 01:00:20 2013
Instance terminated by PMON, pid = 1981
Sun Jun 23 01:00:21 2013
USER (ospid: 22527): terminating the instance
Instance terminated by USER, pid = 22527
Sun Jun 23 01:00:26 2013
Starting ORACLE instance (normal)
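
The chain of events in the log above (a SIGSEGV in the MMAN process raises ORA-07445, PMON then sees ORA-00822 and terminates the instance) can be pulled out mechanically. The following is a minimal Python sketch, not part of the original diagnosis; the regular expressions and the `summarize_crash` helper are assumptions based only on the lines shown in this excerpt, not a general alert-log parser.

```python
import re

# Lines copied from the alert log above. The patterns below are assumptions
# derived from this excerpt only, not a complete alert-log grammar.
ALERT_EXCERPT = """\
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_mman_2015.trc (incident=298938):
ORA-07445: exception encountered: core dump [kcbw_get_bh()+67] [SIGSEGV] [ADDR:0xF] [PC:0x755773D] [Address not mapped to object] []
Errors in file /oracle/app/11gR1/diag/rdbms/xij/xij1/trace/xij1_pmon_1981.trc:
ORA-00822: MMAN process terminated with error
PMON (ospid: 1981): terminating the instance due to error 822
Instance terminated by PMON, pid = 1981
"""

CORE_DUMP = re.compile(r"ORA-07445: .*core dump \[(\w+)\(\)\+(\d+)\] \[(\w+)\]")
INCIDENT = re.compile(r"\(incident=(\d+)\)")
TERMINATION = re.compile(
    r"(\w+) \(ospid: \d+\): terminating the instance due to error (\d+)"
)


def summarize_crash(log_text):
    """Collect the first failing function, incident id and terminating process."""
    summary = {}
    for line in log_text.splitlines():
        core = CORE_DUMP.search(line)
        if core and "function" not in summary:
            summary["function"] = core.group(1)      # e.g. kcbw_get_bh
            summary["offset"] = int(core.group(2))   # offset inside the function
            summary["signal"] = core.group(3)        # e.g. SIGSEGV
        incident = INCIDENT.search(line)
        if incident and "incident" not in summary:
            summary["incident"] = int(incident.group(1))
        term = TERMINATION.match(line)
        if term and "terminated_by" not in summary:
            summary["terminated_by"] = term.group(1)
            summary["exit_error"] = int(term.group(2))
    return summary


if __name__ == "__main__":
    print(summarize_crash(ALERT_EXCERPT))
```

The failing function name and the incident number extracted this way are the two values needed later: the function name as the search keyword on Metalink, and the incident number for packaging the diagnostics.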

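Analysis:
ORA-07445 is usually caused by a bug in Oracle itself.
First, I used IPS to collect the error information from the alert log (for how to use IPS, see my other article, "Simple IPS Usage" (《IPS简单使用方法》)).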
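
The exact IPS commands used are not shown in this post. As a rough sketch only, the Python helper below builds an `adrci` invocation that packages a single incident with `ips pack`. The helper name `build_ips_pack_command` and the output directory `/tmp` are assumptions made for illustration; the home path and incident number are taken from the alert log above.

```python
import subprocess


def build_ips_pack_command(homepath, incident_id, out_dir):
    """Build an adrci command line that packages one incident with IPS.

    Hypothetical helper: the post does not show the commands actually run.
    adrci accepts a semicolon-separated script through its exec= parameter.
    """
    script = "set homepath {}; ips pack incident {} in {}".format(
        homepath, incident_id, out_dir
    )
    return ["adrci", "exec=" + script]


def run_ips_pack(homepath, incident_id, out_dir):
    """Run the command on the database host (requires adrci on PATH)."""
    return subprocess.run(
        build_ips_pack_command(homepath, incident_id, out_dir), check=True
    )


if __name__ == "__main__":
    # Only print the command here; /tmp is an assumed output directory.
    print(build_ips_pack_command("diag/rdbms/xij/xij1", 298938, "/tmp"))
```

Building the argument list separately from running it makes it easy to review the command before executing it as the oracle user on the affected node.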
A search of Metalink turned up three Notes describing bugs similar to the customer's problem:
ORA-7445 (kcbw_get_bh) [ID 1341402.1]
Bug 9728912 [https://bug.oraclecorp.com/pls/b ... o_top?rptno=9728912] - PMON terminates instance due to ORA-7445 [kcbw_numperchunk] / ORA-7445 [kcbw_get_bh]] [ID 9728912.8]
Instance Crashed On ORA-7445 kcbw_numperchunk [ID 1364264.1]
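According to the Notes, however, the related bugs had already been fixed in 11.1.0.6.
Next, look at the other critical errors recorded in the customer's database:
Node1: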
adrci> show problem

ADR Home = /oracle/app/11gR1/diag/rdbms/xij/xij1:
*************************************************************************
PROBLEM_ID           PROBLEM_KEY                                                 LAST_INCIDENT        LASTINC_TIME
-------------------- ----------------------------------------------------------- -------------------- ----------------------------------------
5                    ORA 7445 [kcbw_get_bh()+67]                                 298938               2013-06-23 01:00:06.373716 +08:00
11                   ORA 600                                                     276161               2013-06-04 18:12:12.709933 +08:00
10                   ORA 600 [729]                                               276160               2013-06-04 18:09:27.857128 +08:00
7                    ORA 7445 [kgghash()+367]                                    253234               2013-06-03 15:27:04.349337 +08:00
9                    ORA 7445 [kksMapCursor()+323]                               256538               2013-05-27 09:54:58.684956 +08:00
8                    ORA 7445 [qkabxo()+22]                                      251194               2013-05-01 22:03:37.715416 +08:00
2                    ORA 600 [kghfrh:ds]                                         238818               2013-01-28 11:35:23.755034 +08:00
6                    ORA 7445 [eoa_pm_push()+31]                                 239218               2013-01-28 11:24:42.835685 +08:00
3                    ORA 7445 [ioei_get_method_counts()+39]                      71129                2012-10-17 11:17:39.735719 +08:00
4                    ORA 7445 [jol_calculate_transitive_interface_set()+1165]    74233                2012-10-17 11:05:51.570021 +08:00
1                    ORA 600 [kghfru:ds]                                         6369                 2012-09-07 17:35:55.001585 +08:00
11 rows fetched
Node2:
[oracle@XIJ02 ~]$ adrci

ADRCI: Release 11.1.0.6.0 - Beta on Mon Jun 24 14:59:37 2013

Copyright (c) 1982, 2007, Oracle. All rights reserved.
ADR base = "/oracle/app/11gR1"
adrci>
adrci>
adrci> set homepath diag/rdbms/xij/xij2
adrci>
adrci> show problem
ADR Home = /oracle/app/11gR1/diag/rdbms/xij/xij2:
*************************************************************************
PROBLEM_ID           PROBLEM_KEY                                                 LAST_INCIDENT        LASTINC_TIME
-------------------- ----------------------------------------------------------- -------------------- ----------------------------------------
1                    ORA 7445 [kgghash()+367]                                    209965               2013-06-16 23:34:39.333982 +08:00
2                    ORA 7445 [kksMapCursor()+323]                               190129               2013-05-27 09:54:56.121652 +08:00
2 rows fetched
adrci>
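
To make the two `show problem` listings easier to compare, the rows can be tallied with a short script. This is an illustrative Python sketch: the row pattern and the `parse_problems` helper are assumptions based on the output shown above, and the data is simply the rows copied from both nodes.

```python
import re
from collections import Counter

# Rows copied from the adrci "show problem" output of each node.
NODE1 = """\
5 ORA 7445 [kcbw_get_bh()+67] 298938 2013-06-23 01:00:06.373716 +08:00
11 ORA 600 276161 2013-06-04 18:12:12.709933 +08:00
10 ORA 600 [729] 276160 2013-06-04 18:09:27.857128 +08:00
7 ORA 7445 [kgghash()+367] 253234 2013-06-03 15:27:04.349337 +08:00
9 ORA 7445 [kksMapCursor()+323] 256538 2013-05-27 09:54:58.684956 +08:00
8 ORA 7445 [qkabxo()+22] 251194 2013-05-01 22:03:37.715416 +08:00
2 ORA 600 [kghfrh:ds] 238818 2013-01-28 11:35:23.755034 +08:00
6 ORA 7445 [eoa_pm_push()+31] 239218 2013-01-28 11:24:42.835685 +08:00
3 ORA 7445 [ioei_get_method_counts()+39] 71129 2012-10-17 11:17:39.735719 +08:00
4 ORA 7445 [jol_calculate_transitive_interface_set()+1165] 74233 2012-10-17 11:05:51.570021 +08:00
1 ORA 600 [kghfru:ds] 6369 2012-09-07 17:35:55.001585 +08:00
"""
NODE2 = """\
1 ORA 7445 [kgghash()+367] 209965 2013-06-16 23:34:39.333982 +08:00
2 ORA 7445 [kksMapCursor()+323] 190129 2013-05-27 09:54:56.121652 +08:00
"""

# Assumed row layout: id, "ORA <code>", optional [argument], incident, timestamp.
ROW = re.compile(
    r"^\s*(\d+)\s+(ORA \d+)(?:\s+\[([^\]]*)\])?\s+(\d+)\s+"
    r"(\d{4}-\d{2}-\d{2} \S+ \S+)\s*$"
)


def parse_problems(text):
    """Parse problem rows into dicts; lines that do not match are skipped."""
    problems = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            problems.append({
                "problem_id": int(m.group(1)),
                "code": m.group(2),
                "argument": m.group(3),  # None when the key has no [argument]
                "last_incident": int(m.group(4)),
                "last_time": m.group(5),
            })
    return problems


if __name__ == "__main__":
    node1, node2 = parse_problems(NODE1), parse_problems(NODE2)
    combined = node1 + node2
    print("entries per node:", len(node1), len(node2))
    print("by error code:", Counter(p["code"] for p in combined))
    print("distinct keys:", len({(p["code"], p["argument"]) for p in combined}))
```

Note that the tally counts problem entries per node; the two keys reported on Node2 (`kgghash` and `kksMapCursor`) also appear on Node1, so the number of distinct problem keys is smaller than the number of entries.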
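Resolution:
Across the customer's two nodes, a total of 13 database problems suspected to be caused by bugs were found. Overall, Oracle 11.1.0.6 is not a particularly stable release and contains a variety of bugs.
Oracle fixed most of the bugs found in 11.1.0.6 in 11.1.0.7, which is considerably more stable by comparison, so the recommendation to the customer is to upgrade the database to 11.1.0.7 or 11.2.0.3.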



Appendix:
(Triage Tool 3.01, routed by file analysis):
Failing Function: kcbw_get_bh
Route To: BUFFER CACHE:MANAGEABILITY
Error Argument: [kcbw_get_bh]
Type of Error: ORA-07445
File Name: xij1_mman_2015_i298938.trc
Comment: Routed by Error Argument, Conventional routing
DB Version: 11.1.0.6.0
Platform: Linux CPU: x86_64
OS Version: 2.6.18-194.el5
Stack Trace: kcbw_get_bh kcbw_get_first_buffer kcbw_next_free kmgs_extract_mem_from_granule kmgs_process_request_immediate kmgs_process_request kmgsdrv ksbabs ksbrdp opirip




Thread URL: https://www.iyunv.com/thread-12186-1-1.html


Posted on 2014-1-12 07:48:07
Dreaming of owning the whole world, but so what once you have it? By then it no longer matters.
