loin posted on 2013-11-8 09:09:16

IBM-P55A Memory Fault Handling

I. Fault Localization
1.1. Fault information

Log summary

# uname -Mu
IBM,9133-55A IBM,0306A1F9H

# errpt -d H
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
BFE4C025   0919054612 P H sysplanar0     UNDETERMINED ERROR
……

# errpt -aj BFE4C025
---------------------------------------------------------------------------
LABEL:           SCAN_ERROR_CHRP
IDENTIFIER:      BFE4C025

Date/Time:
Sequence Number: 20422
Machine Id:      0008DB9FD600
Node Id:         wxdbser11
Class:           H
Type:            PERM
Resource Name:   sysplanar0
Resource Class:  planar
Resource Type:   sysplanar_rspc
Location:

Description
UNDETERMINED ERROR

Failure Causes
UNDETERMINED

Recommended Actions
RUN SYSTEM DIAGNOSTICS.

Detail DataPROBLEM DATA0644 00E0 0000 06EC 8E00 8E00 0000 0000 0000 0000 4942 4D00 5048 0030 0100 DD002012 0918 2146 2039 2012 0918 2146 2060 4500 010A 0000 0000 0000 0000 0000 0000501D 5359 501D 5359 5548 0018 0100 DD00 2303 2000 0000 E500 0000 A800 0000 00005053 00B0 0101 DD00 0201 0009 0000 00A8 0300 00F0 28D9 0110 C139 20FF 4100 00FF0081 1931 0000 0001 00A0 008A 5901 703A 4231 3233 4535 3030 2020 2020 2020 20202020 2020 2020 2020 2020 2020 2020 2020 C000 0018 5C2C 4D1C 5537 3837 422E 3030312E 444E 5747 574E 362D 5031 2D43 392D 4334 0000 4944 1C1D 3135 5237 3137 32003331 3245 5948 3130 4D4A 3931 4130 3838 4D52 2003 0000 0000 0000 004D 0081 028F0000 004D 5901 7038 0000 004D 5901 703A 5544 0094 0204 4400 0000 01A3 2F6F 70742F66 6970 732F 6269 6E2F 6365 6373 6572 7665 7200 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 6669 7073 3234 302F 6231 3231 3061 5F303732 352E 3234 3000 0000 0000 0000 0000 0000 0001 0000 0002 0000 0801 0000 00010000 0009 4D54 001C 0100 4400 3931 3333 2D35 3541 3036 4131 4639 4800 0000 00005544 0058 0000 6700 FED0 000A 6F6E 2020 0000 0000 EF77 F658 0000 0001 0000 08010000 0009 3032 2F32 312F 3230 3132 2031 343A 3333 3A30 3800 0000 0000 0000 00003032 2F32 312F 3230 3132 2031 343A 3333 3A30 3800 0000 0000 5544 003C 00CE 6700CECE 4757 0081 028F 0000 0007 0000 0005 0000 00E6 0000 0000 0000 0001 0000 D0060062 6C65 0000 0020 0000 0023 0000 0000 0081 1931 5544 003C 00CE 6700 CECE 47570081 028F 0000 0007 0000 0005 0000 00E6 0000 0000 0000 0001 0000 D006 0000 0AC50000 0020 0000 0023 0000 0000 0081 1931 5544 00A8 0133 DD00 4D53 2020 4455 4D500000 0002 0000 0003 0081 14B2 534D 4120 4455 4D50 0081 1931 0000 0000 0000 00000000 A800 5046 4135 0000 0000 A800 2000 0034 0000 434D 5020 4441 5441 0000 0000E301 0100 0303 0000 0000 0000 4341 4C4C 4F55 5453 0000 0002 5901 7038 0305 00005901 703A 0305 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 5544 03BC 0201 DD004350 5452 0000 0006 1400 8119 3100 1D00 F280 0038 8701 0000 000A 0000 0104 E0020020 8728 003B 0381 2E00 3300 03EF F900 1080 4014 0303 1B74 0001 6CF9 C006 02034210 0800 0224 F808 0022 97E9 0303 FA40 0001 00EC E900 0210 C103 0300 AFD3 000534F8 0032 2017 FD03 038F BF00 024C 04E8 0030 073C 0703 2620 0000 096C E000 1188B700 0000 BF08 0000 0000 0100 80E6 8000 00B7 23C3 901A 4100 B323 C360 0230 002000B6 23C3 0318 6200 004B B223 C340 BE24 00B5 23C3 1641 2000 B123 C320 F020 000040B4 23C3 0108 A300 00B0 9423 C300 8637 C0E3 23C3 9B8F 09FF 0000 E423 C3FF FF40C068 E823 C424 01EB 23C3 BFC3 3FD4 1500 23C3 3F49 81B9 23C3 EB00 0AEB 0000 D832C8D0 3BC4 2D08 3C00 00D1 23C3 09DF F810 0000 D923 C344 A955 002A 00BA 3BC4 3F40C0BB 3BC4 FF0A FC00 00BC 3BC4 1C24 00B8 A83B C400 37C0 BD2C C4A2 0200 2100 A065C506 0000 A165 C504 9A00 00F9 0436 C053 0020 00F4 71C0 0031 0081 1600 3137 0FFF0402 4012 8414 C0FF 08DE AA66 0400 F260 0865 C539 C108 F240 2EFF C02B C033 C0080CEF EFBF 0055 F024 0111 7EC8 1281 C821 3BC4 5522 03C0 223B C42B 40C0 248A C855258D C827 8AC8 288D C831 3BC4 30C0 0727 C03B C4CF DF00 0041 343B C4C1 C700 00359FC8 5437 9CC8 389F C881 23C3 8102 A664 8082 23C2 7EFB 6181 5680 0900 2492 4948488B 0000 08D2 0855 49B5 804B 4AAB 0200 00D4 080F FF0F C13C 899E C0D5 085F 4340FF90 4381 0CD7 0800 8BBE 4143 81D8 0820 FCCF C185 E108 013C 1087 6604 E208 FFFE8342 2400 3E84 8704 C000 0055 0843 813C 0335 843F 00FF 8340 2F84 D442 3884 D4423284 FDD4 4226 84D4 4265 85D4 814D 8400 6AC0 BE5C 465F 8340 5F85 E981 6884 D442BEAA D2C8 C1D2 C8C2 D2C8 C4D2 C8C5 AAD2 C8C6 3BC4 1222 00C7 E4C8 C8AA E1C8 C9F6C8CA DBC8 CBD8 C8CC A8ED C8CD D2C8 CECF C781 1931 0A03 00F2 8002 C3AC 98C0 2450E81A C238 FBC1 4954 0002 02D7 F000 331F 5EFB C39F 06C4 0001 CFF8 06C2 F7C2 4A058300 0774 E0F7 C385 5FC0 406D 05C0 DCB0 0031 0F9F 81FB C3AD FC00 0FEC B3F7 C702DA40 0001 2CA8 EBC7 ED07 4300 0E04 EBEB C2DE C1EE C240 1BDF C0A4 B000 218F 8B81DEC3 7F0C 000B 8FE9 
DEC2 80D6 C2D8 8B00 015C E000 0413 9876 003B E2C1 2B08 0000018C A800 1188 B780 CEC3 1B08 0002 3CE8 0010 1290 D5F7 C30E 8800 0302 BCA0 00108014 F3C3 8804 5400 017F B0FB C725 800C 0003 F4A9 F7C2 DEC6 D90C 82FA C0B0 0030073C C9C3 2900 E400 03CF A800 0210 63C1 E2C3 6CC0 0204 B0E6 C2B5 C202 CBD0 00050FF1 BDC7 1C00 6C00 027F A000 3217 53FD A9C3 4800 000C E0DE C2B5 C24D BCBD C0CCB8F7 C2A5 C209 0000 3C3C F0A9 C143 4341 C3FC C8FF 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00000000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

Diagnostic Analysis
Diagnostic Log sequence number: 26336
Resource tested: sysplanar0
Resource Description: System Planar
Location:
SRC: B123E500
Description: Memory subsystem including external cache Predictive Error, general. Refer to the system service documentation for more information.
Additional Words: 2-030000F0 3-28D90110 4-C13920FF 5-410000FF
                  6-00811931 7-00000001 8-00A0008A 9-5901703A
Possible FRUs:
Priority: M FRU: 15R7172 S/N: YH10MJ91A088 CCIN: 312E
Location: U787B.001.DNWGWN6-P1-C9-C4
……

# lsattr -El mem0
goodsize 32064 Amount of usable physical memory in Mbytes False
size     32064 Total amount of physical memory in Mbytes  False

# prtconf | more
……
Memory Size: 32064 MB
Good Memory Size: 32064 MB
……

# lscfg -vp
……
Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C1
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A082
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C1

Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C2
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A439
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C2

Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C3
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A43F
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C3

Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C4
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A088
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C4

Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C5
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A089
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C5

Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C6
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A086
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C6

Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C7
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A43G
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C7

Memory DIMM:
Record Name.................VINI
Flag Field..................XXMS
Hardware Location Code......U787B.001.DNWGWN6-P1-C9-C8
Customer Card ID Number.....312E
Serial Number...............YH10MJ91A438
Part Number.................15R7172
FRU Number.................. 15R7172
Size........................4096
Version.....................RS6K
Physical Location: U787B.001.DNWGWN6-P1-C9-C8
……
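
The suspect DIMM slot is read off the "Possible FRUs" part of the Diagnostic Analysis in the errpt detail report. As a rough sketch of how that lookup could be scripted, the helper below scans errpt -a style text and prints each FRU location code. The function name fru_locations is hypothetical (it is not an AIX command), and the sketch assumes the report is laid out one field per line, as errpt normally prints it.

```shell
# Hypothetical helper (not part of AIX): read `errpt -a` style text on stdin
# and print the location code of every FRU listed under "Possible FRUs:".
fru_locations() {
    awk '
        /^---/ { in_frus = 0 }                          # separator: a new report entry starts
        /^[ \t]*Possible FRUs:/ { in_frus = 1; next }   # the FRU list starts here
        in_frus && /^[ \t]*Location:/ {
            sub(/^[ \t]*Location:[ \t]*/, "")           # keep only the location code
            if ($0 != "") print
        }
    '
}

# Intended use on the AIX host (assumption: same identifier as in the log above):
#   errpt -aj BFE4C025 | fru_locations
```

The "Location:" line in the report header is ignored on purpose, because it appears before the FRU list and is empty in this log.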



1.2. Fault localization


System memory statistics
System memory size           32064 MB
Usable system memory size    32064 MB
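
The two figures above come from the goodsize and size attributes of mem0. A minimal sketch of automating the comparison follows; mem_check is a hypothetical name, and the "OK"/"DEGRADED" labels are illustrative choices, not AIX terminology. It only assumes the two-attribute layout shown in the lsattr output in section 1.1.

```shell
# Hypothetical helper: read `lsattr -El mem0` output on stdin and report
# whether usable memory (goodsize) still equals total memory (size).
mem_check() {
    awk '
        $1 == "goodsize" { good = $2 }
        $1 == "size"     { total = $2 }
        END {
            if (good == "" || total == "") { print "UNKNOWN"; exit 2 }
            if (good + 0 == total + 0) {
                status = "OK"          # all physical memory is usable
            } else {
                status = "DEGRADED"    # some memory is not usable
            }
            print status " " good " MB usable of " total " MB"
        }
    '
}

# Intended use on the AIX host:
#   lsattr -El mem0 | mem_check
```

With the values from this case (32064 and 32064) the sketch reports OK, which matches the observation that the error is predictive and no memory has been lost yet.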


Memory location statistics
No.  Location                      Capacity  QUAD
1    U787B.001.DNWGWN6-P1-C9-C1    4096 MB   Quad A
2    U787B.001.DNWGWN6-P1-C9-C2    4096 MB   Quad B
3    U787B.001.DNWGWN6-P1-C9-C3    4096 MB   Quad A
4    U787B.001.DNWGWN6-P1-C9-C4    4096 MB   Quad B
5    U787B.001.DNWGWN6-P1-C9-C5    4096 MB   Quad B
6    U787B.001.DNWGWN6-P1-C9-C6    4096 MB   Quad A
7    U787B.001.DNWGWN6-P1-C9-C7    4096 MB   Quad B
8    U787B.001.DNWGWN6-P1-C9-C8    4096 MB   Quad A

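The usable system memory size has not changed; the DIMM in slot U787B.001.DNWGWN6-P1-C9-C4 has raised a predictive warning.
If the memory is replaced, replace it as a pair (one set of two DIMMs). The slots to replace are P1-C9-C4 and P1-C9-C5.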
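
Before ordering parts it can help to cross-check the FRU serial number reported by errpt (YH10MJ91A088) against the DIMM that lscfg shows in slot C4. The following is a sketch only: dimm_info is a hypothetical helper name, and it assumes lscfg -vp prints each DIMM record one field per line with the "Physical Location:" line at the end of the record, as in the log in section 1.1.

```shell
# Hypothetical helper: read `lscfg -vp` output on stdin and print the serial
# and part number of the DIMM whose physical location equals $1.
dimm_info() {
    awk -v want="$1" '
        /Memory DIMM:/ { serial = ""; part = "" }      # a new DIMM record begins
        /Serial Number/ {
            v = $0; sub(/^.*Serial Number\.*/, "", v)  # drop the label and dot leader
            gsub(/[ \t]+$/, "", v); serial = v
        }
        /Part Number/ {
            v = $0; sub(/^.*Part Number\.*/, "", v)
            gsub(/[ \t]+$/, "", v); part = v
        }
        /Physical Location:/ {
            v = $0; sub(/^.*Physical Location:[ \t]*/, "", v)
            gsub(/[ \t]+$/, "", v)
            if (v == want) print "S/N " serial " P/N " part
        }
    '
}

# Intended use on the AIX host:
#   lscfg -vp | dimm_info U787B.001.DNWGWN6-P1-C9-C4
```

For the data in this post the slot C4 record carries serial number YH10MJ91A088 and part number 15R7172, the same values errpt lists for the suspect FRU.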


II. Fault Handling
2.1. Prerequisites

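Caution
Make sure the system is shut down and the power is disconnected.
Wear an anti-static wrist strap while working.
Back up your data before adding or replacing hardware components. An incorrectly installed part may cause data loss.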



2.2. Preparation


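Preparation checklist
Type      Item                                      Status
Hardware  1 laptop                                  Ready
          1 network cable                           Ready
          1 flat-head and 1 Phillips screwdriver    Ready
          1 anti-static wrist strap                 Ready
          2 new memory DIMMs                        Ready
Software  HMC environment                           Ready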


Other






2.3. Procedure


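Operation list
No.  Operation                                                    Details                                                                         Status
1    Confirm the system is shut down                              Advise the customer to back up application and business data
2    Put on the anti-static wrist strap                           Confirm the strap is worn and attached to an unpainted part of the rack
3    Disconnect the power                                         Disconnect the primary and secondary power supplies
4    Remove the service access cover
5    Remove the processor board
6    Place the removed processor board on an anti-static surface
7    Open and remove the processor front cover
8    Confirm the location of the memory to be replaced
9    Take the memory out of its anti-static packaging
10   Install the memory
11   Reinstall the processor board
12   Confirm the fault impact is gone                             Confirm the newly replaced hardware raises no alarms
                                                                  Confirm the new hardware is ready in the system
                                                                  The user confirms application and business data are unaffected
13   Wrap up                                                      Clean up the site and finish the work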
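
For step 12, one way to confirm that no new alarms were raised is to look for hardware-class entries in the error log that are newer than the time of the replacement. The sketch below is an illustration, not part of the original procedure: new_hw_errors is a hypothetical helper, and the cutoff argument is assumed to use the same MMDDhhmmYY layout as the TIMESTAMP column of the errpt summary shown in section 1.1.

```shell
# Hypothetical helper: read `errpt` summary output on stdin and print every
# hardware-class entry (column C equals H) logged after the cutoff $1.
# Timestamps are MMDDhhmmYY, so they are reordered before being compared.
new_hw_errors() {
    awk -v cutoff="$1" '
        # Turn MMDDhhmmYY into YYMMDDhhmm so string order follows time order.
        function key(ts) { return substr(ts, 9, 2) substr(ts, 1, 8) }
        $4 == "H" && length($2) == 10 && key($2) > key(cutoff) { print }
    '
}

# Intended use on the AIX host, with the replacement time as the cutoff
# (the value below is only an example):
#   errpt | new_hw_errors 1108090013
```

An empty result suggests that no hardware errors were logged after the cutoff. The header line of the summary is skipped automatically because its fourth column is "C", not "H".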



III. Reference Information












o2geao posted on 2013-11-11 07:48:38

Only when you read until you cramp up do the ideas come pouring out!

blueice posted on 2013-11-13 21:28:26

If only hardware could be COPYed too!

cencenhai posted on 2013-11-14 13:53:51

Learned something, nice post, it makes a lot of sense.

沈阳格力专卖店 posted on 2013-11-16 03:06:27

Dinner is in the pot and I am in bed *^_^*

小fish posted on 2013-11-17 11:55:09

First! First!

ph033378 posted on 2013-11-18 17:38:21

Replying to the threads you read is a virtue! :lol