Hello!
One of our customers has a DL380 G5 with a Smart Array P400 RAID controller and 8 SAS drives:
- array A of 3xHDD in RAID5;
- array B of 4xHDD in RAID10;
- 1 drive as global hot spare;
All drives are the same 72 GB 15K type.
One of the drives in array B has failed. Data on the degraded array is still accessible. The controller does not rebuild the array onto the hot spare drive as expected. Array B is shown as failed and the RAID10 as degraded, but still protected by the hot spare:
=> controller slot=1 physicaldrive all show
Smart Array P400 in Slot 1
array A
physicaldrive 2I:1:1 (port 2I:box 1:bay 1, SAS, 72 GB, OK)
physicaldrive 2I:1:2 (port 2I:box 1:bay 2, SAS, 72 GB, OK)
physicaldrive 2I:1:3 (port 2I:box 1:bay 3, SAS, 72 GB, OK)
physicaldrive 1I:1:8 (port 1I:box 1:bay 8, SAS, 72 GB, OK, active spare)
array B (Failed)
physicaldrive 1I:1:5 (port 1I:box 1:bay 5, SAS, 72 GB, OK)
physicaldrive 1I:1:6 (port 1I:box 1:bay 6, SAS, 72 GB, OK)
physicaldrive 1I:1:7 (port 1I:box 1:bay 7, SAS, 72 GB, Failed)
physicaldrive 2I:1:4 (port 2I:box 1:bay 4, SAS, 72 GB, OK)
physicaldrive 1I:1:8 (port 1I:box 1:bay 8, SAS, 72 GB, OK, active spare)
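Since the server is remote, a listing like the one above could be checked with a small script instead of by eye. This is only a minimal sketch: the function names parse_drives and failed_drives are hypothetical, and the line format is assumed from the output pasted in this post, not from any documented hpacucli output specification.

```python
import re

# Text copied from the "physicaldrive all show" output in this post.
SAMPLE = """\
Smart Array P400 in Slot 1
array A
physicaldrive 2I:1:1 (port 2I:box 1:bay 1, SAS, 72 GB, OK)
physicaldrive 2I:1:2 (port 2I:box 1:bay 2, SAS, 72 GB, OK)
physicaldrive 2I:1:3 (port 2I:box 1:bay 3, SAS, 72 GB, OK)
physicaldrive 1I:1:8 (port 1I:box 1:bay 8, SAS, 72 GB, OK, active spare)
array B (Failed)
physicaldrive 1I:1:5 (port 1I:box 1:bay 5, SAS, 72 GB, OK)
physicaldrive 1I:1:6 (port 1I:box 1:bay 6, SAS, 72 GB, OK)
physicaldrive 1I:1:7 (port 1I:box 1:bay 7, SAS, 72 GB, Failed)
physicaldrive 2I:1:4 (port 2I:box 1:bay 4, SAS, 72 GB, OK)
physicaldrive 1I:1:8 (port 1I:box 1:bay 8, SAS, 72 GB, OK, active spare)
"""

# Assumed line shapes, based only on the sample above.
ARRAY_RE = re.compile(r"^array (\S+)")
DRIVE_RE = re.compile(r"^physicaldrive (\S+) \((.*)\)$")


def parse_drives(text):
    """Return a list of (array, drive_id, status, is_spare) tuples."""
    drives = []
    current_array = None
    for raw in text.splitlines():
        line = raw.strip()
        m = ARRAY_RE.match(line)
        if m:
            current_array = m.group(1)
            continue
        m = DRIVE_RE.match(line)
        if m:
            fields = [f.strip() for f in m.group(2).split(",")]
            # A spare carries an extra trailing field such as "active spare".
            is_spare = "spare" in fields[-1]
            status = fields[-2] if is_spare else fields[-1]
            drives.append((current_array, m.group(1), status, is_spare))
    return drives


def failed_drives(text):
    """Return (array, drive_id) for every drive whose status is not OK."""
    return [(a, d) for a, d, status, _ in parse_drives(text) if status != "OK"]


if __name__ == "__main__":
    for array, drive in failed_drives(SAMPLE):
        print(f"array {array}: drive {drive} is not OK")
```

The script only reads text that was already captured; it does not call hpacucli itself, so it cannot change the controller configuration.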
=> controller slot=1 array B show
Smart Array P400 in Slot 1
Array: B
Interface Type: SAS
Unused Space: 0 MB
Status: Failed
MultiDomain Status: OK
=> controller slot=1 array B logicaldrive all show detail
Smart Array P400 in Slot 1
array B (Failed)
Logical Drive: 2
Size: 136.7 GB
Fault Tolerance: RAID 1+0
Heads: 255
Sectors Per Track: 32
Cylinders: 35124
Stripe Size: 16 KB
Status: OK
MultiDomain Status: OK
Array Accelerator: Enabled
Unique Identifier: 600508B1001048395656563645580003
Disk Name: /dev/cciss/c0d1
Mount Points: /u02 136.7 GB
Logical Drive Label: A277E219P61620H9VVV6EX5C47
It looks like the configuration of array B is read-only; we cannot change any settings. hpacucli shows an error like "operation is not supported with the current configuration".
hplog shows some strange errors:
0042 Unknown 23:01 01/16/2012 23:01 01/16/2012 0001
LOG: Unknown Event (Class 19, Code 0)
0043 Caution 01:20 01/17/2012 17:49 08/03/2012 0002
LOG: Internal SAS Enclosure Device Failure (Bay 7, Box 1, Port 1I, Slot 1)
0044 Caution 17:46 08/03/2012 17:46 08/03/2012 0001
LOG: POST Error: 1786-Drive Array Recovery Needed
0045 Unknown 17:48 08/03/2012 17:48 08/03/2012 0001
LOG: Unknown Event (Class 19, Code 0)
We tried resetting the controller by powering the server off and restarting it, but with no result.
The controller firmware is old (5.20). Could the error be a firmware-specific issue? Is it safe to update to the current 7.22 under these conditions? Any other ideas?
The server is far away from us. The local technician can replace the drive and press the power button, but nothing more. The customer can schedule only a short downtime window. We can't risk losing any data on the server.