
Smart Array E200i stuck on "Ready for rebuild"

ProLiant ML350 G5 running SLES 11 in a RAID5 configuration.

I've been installing new SAS drives in our array one at a time, waiting for each rebuild to complete successfully before adding the next drive.  The last drive I installed resulted in numerous write errors on the swap-device, and services became slow and sluggish.  Disabling the swap file for the short term has alleviated the slowdowns.  However, the array is still showing a status of "ready for rebuild" with no further activity.  It's trying to rebuild the boot partition (/dev/cciss/c0d0).  The HPSMH is showing drive 6 as having 14961 read errors.  That number has not increased since turning off swap.

/var/log/messages only seems to report errors on the "swap-device" and not necessarily on any real data area.  Those are write errors, whereas hpsmh reports read errors.

The drive that POST complained about is in 1I:1:3, but hpsmh reports read errors on drive 2I:1:6.

Those are the 2 most recent drive swaps I performed.

I have tried booting from a Linux live CD and used dd to image the drives/partitions.  It was going excruciatingly slowly on the boot partition because of the disk errors, so I proceeded to image the other drives/partitions.  Once we got into the backup process, it turned out it was going to take 15 hours to complete and then another 15 to restore once the array was reconfigured after removing the faulty drives.  That was not feasible, so that plan was scrapped.
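For a rough sense of scale, here is a small Python sketch of the throughput that a 15 hour backup would imply. It assumes (my assumption, not something I measured) that the 15 hour estimate covers both logical drives in full, using the 32.0 GB and 924.9 GB sizes from the controller output further down, and that those sizes are decimal gigabytes. The function name is just for illustration.

```python
def implied_rate_mb_per_s(total_gb: float, hours: float) -> float:
    """Average throughput in MB/s needed to copy total_gb (decimal GB) in the given hours."""
    return total_gb * 1000 / (hours * 3600)


# Logical drive sizes as reported by hpacucli (assumed to be decimal GB).
logical_drive_sizes_gb = [32.0, 924.9]

# Assumption: the 15 hour estimate applies to imaging both logical drives.
rate = implied_rate_mb_per_s(sum(logical_drive_sizes_gb), 15)
print(f"{rate:.1f} MB/s")
```

Under those assumptions the backup was averaging somewhere around 18 MB/s, and the restore would need the same again.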

So, at this point, I'm kind of stuck.  I thought about throwing a USB 3.0 expansion card in to increase the throughput of the backups, but it turns out this server does not respond well to USB 3.0 because of its age.

My question is this:  If I remove the drive (bay 6) that hpsmh is reporting as having read errors while logical drive 1 is stuck on "ready for rebuild", am I going to destroy my RAID?  The same goes for the drive that /var/log/messages reports as having write errors: if I remove that drive and install a fresh drive, will I destroy the RAID and/or boot partition?

I'm not terribly fluent with Linux or RAID configuration, so let me know if I'm not clear with any of this.   I have a diagnostic file created with hpacucli if that would be helpful.  My other option is to upgrade the server itself due to its age.  I was planning to upgrade servers this year anyway.

---------------------------------------------

details about the controller and drives:


# /opt/compaq/hpacucli/bld/hpacucli ctrl all show config detail
 

Smart Array E200i in Slot 0 (Embedded)
   Bus Interface: PCI
   Slot: 0
   Serial Number: QT83MP3021    
   Cache Serial Number: P9A3A0BXQUD0PI
   RAID 6 (ADG) Status: Disabled
   Controller Status: OK
   Hardware Revision: A
   Firmware Version: 1.86
   Rebuild Priority: Medium
   Expand Priority: Medium
   Surface Scan Delay: 15 secs
   Surface Scan Mode: Idle
   Post Prompt Timeout: 0 secs
   Cache Board Present: True
   Cache Status: OK
   Cache Status Details: A cache error was detected. Run a diagnostic report for more information.
   Cache Ratio: 50% Read / 50% Write
   Drive Write Cache: Disabled
   Total Cache Size: 128 MB
   Total Cache Memory Available: 96 MB
   No-Battery Write Cache: Disabled
   Cache Backup Power Source: Batteries
   Battery/Capacitor Count: 1
   Battery/Capacitor Status: OK
   SATA NCQ Supported: False
 
   Array: A
      Interface Type: SAS
      Unused Space: 0  MB
      Status: OK
      Array Type: Data
 
 
 
      Logical Drive: 1
         Size: 32.0 GB
         Fault Tolerance: RAID 5
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 8224
         Strip Size: 64 KB
         Full Stripe Size: 448 KB
         Status: Ready for Rebuild
         Caching:  Enabled
         Parity Initialization Status: Initialization Completed
         Unique Identifier: 600508B1001032333720202020200002
         Disk Name: /dev/cciss/c0d0
         Mount Points: / 30.0 GB
         OS Status: LOCKED
         Logical Drive Label: AFC929D4QT7BMU0237     4700
         Drive Type: Data
      Logical Drive: 2
         Size: 924.9 GB
         Fault Tolerance: RAID 5
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 64 KB
         Full Stripe Size: 448 KB
         Status: OK
         Caching:  Enabled
         Parity Initialization Status: Initialization Completed
         Unique Identifier: 600508B1001032333720202020200003
         Disk Name: /dev/cciss/c0d1
         Mount Points: None
         OS Status: LOCKED
         Logical Drive Label: AC2929DFQT7BMU0237     5207
         Drive Type: Data
 
      physicaldrive 1I:1:1
         Port: 1I
         Box: 1
         Bay: 1
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDF
         Serial Number: 3SE218NT00009038WDUR
         Model: HP      EG0300FAWHV    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
 
      physicaldrive 1I:1:2
         Port: 1I
         Box: 1
         Bay: 2
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDF
         Serial Number: 3SE1YRLG00009037N4EM
         Model: HP      EG0300FAWHV    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
 
      physicaldrive 1I:1:3
         Port: 1I
         Box: 1
         Bay: 3
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDE
         Serial Number: 6SE1K84Y0000B116KXFY
         Model: HP      EG0300FAWHV    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
 
      physicaldrive 1I:1:4
         Port: 1I
         Box: 1
         Bay: 4
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 146 GB
         Rotational Speed: 10000
         Firmware Revision: HPDF
         Serial Number: 3SD25WGZ00009021VSVN
         Model: HP      DG0146FAMWL    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
 
      physicaldrive 2I:1:5
         Port: 2I
         Box: 1
         Bay: 5
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPD6
         Serial Number: ECA1PCC0JRC31251
         Model: HP      EG0300FBDSP    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
 
      physicaldrive 2I:1:6
         Port: 2I
         Box: 1
         Bay: 6
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDF
         Serial Number: 3SE21B0E00009038VY45
         Model: HP      EG0300FAWHV    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
 
      physicaldrive 2I:1:7
         Port: 2I
         Box: 1
         Bay: 7
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDF
         Serial Number: 3SE1YGHH00009037KFUJ
         Model: HP      EG0300FAWHV    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
 
      physicaldrive 2I:1:8
         Port: 2I
         Box: 1
         Bay: 8
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDF
         Serial Number: 3SE200DE00009014LY7R
         Model: HP      EG0300FAWHV    
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
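
Since the output above is long, here is a minimal Python sketch that pulls out only the plain "Status:" line for each array, logical drive, and physical drive, and lists anything that is not "OK". The function name and the embedded excerpt are for illustration only; the excerpt is trimmed from the real output above, and in practice the full text saved from the hpacucli command would be passed in instead.

```python
def summarize_status(report: str) -> dict:
    """Map each array, logical drive, and physical drive header to its 'Status:' value."""
    statuses = {}
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        # Section headers as they appear in the hpacucli output.
        if line.startswith(("Array:", "Logical Drive:", "physicaldrive ")):
            current = line
        # Only the bare "Status:" key; "OS Status:", "Cache Status:" etc. are skipped.
        elif line.startswith("Status:") and current is not None:
            statuses[current] = line.split(":", 1)[1].strip()
    return statuses


# Trimmed excerpt of the report above (illustrative, not the full output).
excerpt = """
   Array: A
      Status: OK
      Logical Drive: 1
         Status: Ready for Rebuild
         OS Status: LOCKED
      Logical Drive: 2
         Status: OK
      physicaldrive 1I:1:3
         Status: OK
      physicaldrive 2I:1:6
         Status: OK
"""

statuses = summarize_status(excerpt)
not_ok = {name: value for name, value in statuses.items() if value != "OK"}
print(not_ok)
```

On this excerpt the only entry that is not "OK" is logical drive 1. Both 1I:1:3 and 2I:1:6 report "Status: OK" here, even though POST complained about the first and hpsmh shows read errors on the second.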
