[Background]
We recently started experiencing errors with one of our applications. The support tech suggested that the corruption might be caused by bad sectors on the disks. I ran HP Insight Diagnostics from the SmartStart CD and found that 4 of 5 disks had failed the check with:
Error F155: The read/write hard error rate recorded in the monitor and performance log is above the acceptable threshold.
[Hardware]
HP ProLiant DL360 G5 with Smart Array P400i
5 HDDs in RAID 5 + 1 hot spare
No red LEDs on the drives yet
Storage:
Removable Drive 1: 1.0 GB, USB 2.0 Flash Disk
Optical Drive 1: DVD, HL-DT-ST RW/DVD GCC-C10N
Hard Drive 1, Storage Controller in Slot 0: 146.8 GB, 10k RPM, SAS, HP DG146ABAB4
Hard Drive 2, Storage Controller in Slot 0: 146.8 GB, 10k RPM, SAS, HP DG146ABAB4
Hard Drive 3, Storage Controller in Slot 0: 146.8 GB, 10k RPM, SAS, HP DG146A4960
Hard Drive 4, Storage Controller in Slot 0: 146.8 GB, 10k RPM, SAS, HP DG146ABAB4
Hard Drive 5, Storage Controller in Slot 0: 146.8 GB, 10k RPM, SAS, HP DG146BABCF
Hard Drive 6, Storage Controller in Slot 0: 146.8 GB, 10k RPM, SAS, HP DG146ABAB4
Logical Drive 1, Storage Controller in Slot 0: 587.1 GB, RAID 5 - OK
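As a sanity check on the layout, the reported logical drive size is consistent with five 146.8 GB disks in RAID 5 (one disk's worth of capacity used for parity) and the sixth disk held back as the hot spare. A minimal sketch of that arithmetic follows; the function name is just illustrative, and the small difference from the reported 587.1 GB is presumably rounding in the diagnostics output.

```python
def raid5_usable_gb(disk_gb: float, disks_in_array: int) -> float:
    """Usable RAID 5 capacity: one disk's worth of space holds parity."""
    if disks_in_array < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return disk_gb * (disks_in_array - 1)


# Five array members of 146.8 GB each; the hot spare adds no capacity.
usable = raid5_usable_gb(146.8, 5)
print(f"{usable:.1f} GB usable")
```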
The server is still running and the drives themselves have not failed. They just didn't pass the checks in the diagnostics. They should be replaced as soon as possible. Backups are still being performed.
Is it possible to replace a drive and let it rebuild, then replace another drive, rebuild, etc. with the drives in their current state? How would I start the process? Do I just pull one of the affected drives, let the hot spare take over, and insert a new drive?