Replacing a Failed Disk with the Hot Spare Using MegaCli64: A Hands-On Walkthrough

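The faulty machine is a 12-bay Dell EMC server with 10 disks in RAID 10 plus 1 hot spare. For installing MegaCli64, see this article:
Installing MegaCli64 on Proxmox (Debian) to manage a hardware RAID controller

I also strongly recommend reading this: MegaCli operation manual

Once it is installed, first check the array status: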
root@JS-2002:~/megacli/Linux# MegaCli64 -LDInfo -Lall -aALL

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 9.093 TB
Sector Size : 512
Mirror Data : 9.093 TB
State : Degraded
Strip Size : 64 KB
Number Of Drives per span:2
Span Depth : 5
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No

Exit Code: 0x00
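When the listing is long, the only field that matters at this step is State. As a minimal sketch, the helper below pulls that field out of the -LDInfo output. The function name vd_states is made up for this example, and it assumes one "Field : value" pair per line, which is how MegaCli64 prints in a terminal. The sample text is a trimmed copy of the output above, so the snippet runs without the controller present.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): print the State of every virtual drive
# found in "MegaCli64 -LDInfo" output read from stdin.
# Assumes one "Field : value" pair per line.
vd_states() {
    awk -F':' '/^State[ \t]*:/ {
        sub(/^[ \t]+/, "", $2)   # trim leading blanks from the value
        sub(/[ \t]+$/, "", $2)   # trim trailing blanks from the value
        print $2
    }'
}

# Sample input: a few fields copied from the output above.
sample='Virtual Drive: 0 (Target Id: 0)
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
State : Degraded
Strip Size : 64 KB'

printf '%s\n' "$sample" | vd_states

# On the real machine (not run here):
#   MegaCli64 -LDInfo -Lall -aALL | vd_states
```

Anything other than Optimal printed here is a cue to look at the individual physical drives.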

root@JS-2002:~/megacli/Linux# MegaCli64 -pdinfo -physdrv[:3] -a0

Enclosure Device ID: N/A
Slot Number: 3
Drive's position: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: N/A
Device Id: 3
WWN: 5000C500260EACC4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 5
Predictive Failure Count: 3
Last Predictive Failure Event Seq Number: 30255
PD Type: SAS

Raw Size: 1.819 TB [0xe8e088b0 Sectors]
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 0008
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c500260eacc5
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST32000444SS    00089WM3PSCZ
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :29C (84.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes

Exit Code: 0x00
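The fields that mark this disk as failing are Predictive Failure Count (3) and the S.M.A.R.T alert flag (Yes). A rough sketch of summarizing them in one line follows. The helper name pd_health and the rule "any non-zero predictive or media error count, or a S.M.A.R.T alert, means suspect" are assumptions made for this example, not something MegaCli itself defines.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): summarize the failure indicators of one drive
# from "MegaCli64 -pdinfo" output read from stdin.
# The SUSPECT rule below is an assumption chosen for illustration.
pd_health() {
    awk -F':' '
        /^Slot Number/              { slot = $2 }
        /^Predictive Failure Count/ { pfc = $2 }
        /^Media Error Count/        { mec = $2 }
        /S\.M\.A\.R\.T alert/       { smart = $2 }
        END {
            # strip blanks around the captured values
            gsub(/[ \t]/, "", slot); gsub(/[ \t]/, "", pfc)
            gsub(/[ \t]/, "", mec);  gsub(/[ \t]/, "", smart)
            verdict = (pfc + 0 > 0 || mec + 0 > 0 || smart == "Yes") ? "SUSPECT" : "OK"
            printf "slot=%s predictive=%s media=%s smart=%s -> %s\n", slot, pfc, mec, smart, verdict
        }'
}

# Sample input: the relevant lines copied from the output above.
pd_sample="Slot Number: 3
Media Error Count: 0
Other Error Count: 5
Predictive Failure Count: 3
Last Predictive Failure Event Seq Number: 30255
Drive has flagged a S.M.A.R.T alert : Yes"

printf '%s\n' "$pd_sample" | pd_health

# On the real machine (not run here):
#   MegaCli64 -pdinfo -physdrv[:3] -a0 | pd_health
```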

Next, take this disk offline and mark it as missing:

root@JS-2002:~/megacli/Linux# MegaCli64 -PDOffline -PhysDrv [:3] -a0

Adapter: 0: EnclId-N/A SlotId-3 state changed to OffLine.

Exit Code: 0x00

root@JS-2002:~/megacli/Linux# MegaCli64 -pdmarkmissing -physdrv[:3] -aAll

EnclId-N/A SlotId-3 is marked Missing.

Exit Code: 0x00

Then flag the disk as ready for removal:



root@JS-2002:~/megacli/Linux# MegaCli64 -pdprprmv -physdrv[:3] -a0

Prepare for removal Success

Exit Code: 0x00

Checking the array again at this point, its state is Degraded:



root@JS-2002:~/megacli/Linux# MegaCli64 -LDInfo -Lall -aALL

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 9.093 TB
Sector Size : 512
Mirror Data : 9.093 TB
State : Degraded
Strip Size : 64 KB
Number Of Drives per span:2
Span Depth : 5
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: No

Exit Code: 0x00

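Next, bring in the "hot spare". It had never actually been configured as a hot spare; it was merely plugged in. The most important thing here is working out what the Array and row parameters should be, which took me a long time to find.
A RAID 10 array is really several RAID 1 pairs striped together as RAID 0, so our 10-disk RAID 10 is split into 5 RAID 1 groups (Span Depth: 5, two drives per span). The number after -Array selects one of those groups, and -row is the position inside that RAID 1 pair, either 0 or 1. Seen that way it is easy to understand: all you need is the Span and Arm of the disk being replaced, which pdinfo reports: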



Enclosure Device ID: N/A

Slot Number: 3

Drive's position: DiskGroup: 0, Span: 1, Arm: 1

Enclosure position: N/A

Device Id: 3
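The mapping just described can be written down as a small text transformation. This is only a sketch of this article's reading (Array = Span, row = Arm); the helper name position_to_args is made up for the example, and since the replace command in this walkthrough fails for a different reason, the mapping itself is not confirmed by the run shown here.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): turn a "Drive's position" line from pdinfo
# into -ArrayN -rowN arguments, assuming Array = Span and row = Arm
# as this article reads it.
position_to_args() {
    sed -nE "s/^Drive's position:.*Span: *([0-9]+), *Arm: *([0-9]+).*/-Array\1 -row\2/p"
}

# Sample input: the position line of the disk in slot 3.
printf '%s\n' "Drive's position: DiskGroup: 0, Span: 1, Arm: 1" | position_to_args
```

Lines that are not a "Drive's position" line produce no output, so the whole pdinfo listing can be piped through it.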

The disk sits at Span 1, Arm 1, which means Array 1, row 1, so:



root@JS-2002:~/megacli/Linux# MegaCli64 -PdReplaceMissing -PhysDrv[:10] -Array1 -row1 -a0

Adapter: 0: Failed to replace Missing PD at Array 1, Row 1.

FW error description: 

  The specified device is in a state that doesn't support the requested command.  

Exit Code: 0x32

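The replacement failed because the new disk was present as a plain non-RAID disk. So we simply pulled it out and plugged it into slot 3, and, as if by magic, the rebuild started: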



Coerced Size: 1.818 TB [0xe8d00000 Sectors]

Sector Size:  0

Firmware state: Rebuild

Device Firmware Level: HPD7
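To confirm when the rebuild has finished, the Firmware state field can be read out of pdinfo repeatedly until it goes back to Online. A minimal sketch follows; pd_firmware_state is a hypothetical helper name, and the sample lines are copied from the excerpt above.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): print the "Firmware state" value from
# "MegaCli64 -pdinfo" output read from stdin.
pd_firmware_state() {
    awk -F':' '/^Firmware state/ {
        sub(/^[ \t]+/, "", $2)   # trim leading blanks from the value
        sub(/[ \t]+$/, "", $2)   # trim trailing blanks from the value
        print $2
    }'
}

# Sample input: lines copied from the excerpt above.
printf '%s\n' 'Sector Size:  0' 'Firmware state: Rebuild' 'Device Firmware Level: HPD7' | pd_firmware_state

# On the real machine (not run here):
#   MegaCli64 -pdinfo -physdrv[:3] -a0 | pd_firmware_state
# MegaCli also has a rebuild progress query, which as far as I know is:
#   MegaCli64 -PDRbld -ShowProg -PhysDrv [:3] -a0
```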

Done!
https://paste.ubuntu.com/p/dVXG3qvnGF/



This article is under CC BY-NC-SA 4.0 license.
Please quote the original link:https://www.liujason.com/article/391.html