Channel: ProLiant Servers (ML,DL,SL) topics

Diagnosing Smart Array P440ar serial log errors

I'm trying to pinpoint some platform issues, so I'm doing a deep dive into potential disk/RAID controller issues. Checking the controller serial log, I'm seeing a lot of different errors. Some are easy to diagnose (certain KCQ codes). Others, not so much (other KCQs). I've spent the last week digging into SCSI command codes, KCQs, ASC/ASCQs, sense codes, opcodes, etc. I've got a fair handle on most of the errors I'm seeing, except for one.

There's one group of errors that I cannot decipher:

[2019-07-03 07:24:51] Drive SN: xxxxxxxxxxxxxxxx CDB=0x85092E0000000100B600000000002F00 CC Sense Data--
00: 70 00 01 00 00 00 00 06 80 00 00 00 00 1D 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1C: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[2019-07-03 07:24:51] Recovered [host] PR=0x8146b050 D030 Op=85 PLErr=02 IopErr=04 S=02
[2019-07-03 07:24:51] KCQ=1:00:1D
[2019-07-03 07:24:51] Drive SN: xxxxxxxxxxxxxxxx CDB=0x85092E00000001000000000000002F00 CC Sense Data--
00: 70 00 01 00 00 00 00 06 80 00 00 00 00 1D 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1C: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[2019-07-03 07:24:51] Recovered [host] PR=0x8146b050 D031 Op=85 PLErr=02 IopErr=04 S=02
[2019-07-03 07:24:51] KCQ=1:00:1D
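For reference, the KCQ reported in these entries can be re-derived from the raw sense bytes. The following is a minimal Python sketch, assuming the dump is SCSI fixed-format sense data (response code 0x70 or 0x71), where the sense key is the low nibble of byte 2, the ASC is byte 12, and the ASCQ is byte 13. The function names are hypothetical and not part of any HPE tool.

```python
# First 14 sense bytes copied from the log entry (all later bytes are zero).
SENSE_HEX = "70 00 01 00 00 00 00 06 80 00 00 00 00 1D"


def decode_fixed_sense(hex_dump):
    """Return (sense_key, asc, ascq) from a fixed-format sense data dump."""
    data = bytes.fromhex(hex_dump)  # fromhex() skips the spaces between bytes
    response_code = data[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    if len(data) < 14:
        raise ValueError("sense data too short to contain ASC/ASCQ")
    sense_key = data[2] & 0x0F  # low nibble of byte 2
    asc = data[12]              # additional sense code
    ascq = data[13]             # additional sense code qualifier
    return sense_key, asc, ascq


def format_kcq(sense_key, asc, ascq):
    """Format the triple the way the controller log prints it."""
    return f"{sense_key:X}:{asc:02X}:{ascq:02X}"


if __name__ == "__main__":
    print(format_kcq(*decode_fixed_sense(SENSE_HEX)))
```

Sense key 1 is RECOVERED ERROR, which matches the "Recovered" text in the log line that follows each dump.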

I get a pair of these errors for each SSD attached (16 SSDs, 32 messages). I do not get any errors for the SAS disks. I can't find any reference to a KCQ of 1:00:1D.
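One way to narrow this down is to decode the CDB itself. The sketch below assumes that opcode 0x85 is the SCSI ATA PASS-THROUGH (16) command and uses the field layout from the SCSI/ATA Translation (SAT) standard as best understood here; the function name and dictionary keys are hypothetical. Under that reading, both CDBs carry ATA command 0x2F (READ LOG EXT) with CK_COND set, and ASC/ASCQ 00h/1Dh is listed by T10 as ATA PASS THROUGH INFORMATION AVAILABLE, which would be consistent with the entries appearing only for SATA devices. Whether that relates to the performance problem is not something the log alone can confirm.

```python
# CDBs copied from the two log entries.
CDB_HEX_1 = "85092E0000000100B600000000002F00"
CDB_HEX_2 = "85092E00000001000000000000002F00"


def decode_ata_pass_through_16(cdb_hex):
    """Decode selected fields of an ATA PASS-THROUGH (16) CDB (assumed SAT layout)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not a 16-byte CDB with opcode 0x85")
    return {
        "protocol": (cdb[1] >> 1) & 0x0F,   # 4 = PIO Data-In
        "extend": cdb[1] & 0x01,            # 1 = 48-bit ATA command
        "ck_cond": (cdb[2] >> 5) & 0x01,    # 1 = return ATA registers in sense data
        "t_dir": (cdb[2] >> 3) & 0x01,      # 1 = transfer from device
        "count": (cdb[5] << 8) | cdb[6],    # for READ LOG EXT: number of log pages
        "lba_low": cdb[8],                  # for READ LOG EXT: log address
        "ata_command": cdb[14],             # 0x2F = READ LOG EXT
    }


if __name__ == "__main__":
    for cdb_hex in (CDB_HEX_1, CDB_HEX_2):
        fields = decode_ata_pass_through_16(cdb_hex)
        print({name: hex(value) for name, value in fields.items()})
```

If this decoding is right, the two entries per SSD differ only in the log address being read (0xB6 in the first, 0x00 in the second).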

These errors correlate with periods of poor performance. It almost seems like there's a bus reset happening, but I would expect that to impact the SAS drives as well. Any ideas?

Thanks in advance.

