Channel: ProLiant Servers (ML,DL,SL) topics

Random Error Out of the Blue: Uncorrectable PCI Express Error


Woke up to a stalled server, checked the iLO log, and found:

 

System Error 09/18/2014 03:56 09/18/2014 03:56 1
 

Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible

 
PCI Bus 09/18/2014 03:56 09/18/2014 03:56 1


Uncorrectable PCI Express Error (Slot 2, Bus 0, Device 1, Function 0, Error status 0x00014000)
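For anyone reading the error status value: a minimal sketch of decoding 0x00014000, assuming the iLO "Error status" field mirrors the PCI Express AER Uncorrectable Error Status register. That mapping is an assumption, not something the log itself confirms, and the function name is just illustrative.

```python
# Bit positions of the PCI Express AER Uncorrectable Error Status register.
# ASSUMPTION: the "Error status" value in the iLO entry uses this layout.
AER_UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
    21: "ACS Violation",
}


def decode_error_status(status):
    """Return the names of all bits set in a 32-bit status value, lowest bit first."""
    names = []
    for bit in range(32):
        if status & (1 << bit):
            names.append(AER_UNCORRECTABLE_BITS.get(bit, f"Reserved/unknown bit {bit}"))
    return names


if __name__ == "__main__":
    # Value taken from the IML entry above.
    print(decode_error_status(0x00014000))
```

Under that assumption, the value has bits 14 and 16 set, which would read as Completion Timeout plus Unexpected Completion.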

 

We're running VMware on the machine. I jumped into vCenter to track down the device and found:

 

Intel Corporation Sandy Bridge IIO PCI Express Root Port 1a #1

 

Segment Number: 0
Bus Number: 0
Device Number: 1
Function Number: 0
Capabilities: Bridge Subsystem ID, MSI, PCI Express, Power management
PCI Device ID: 0x3c02
Device ID: PCI 0:0:1:0
Vendor ID: 0x8086
Subsystem ID: 0x0
Subsystem Vendor ID: 0x0
Secondary Bus Number: 7
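As a rough sketch of how to read these fields: the root port itself sits at 0:0:1:0, and anything plugged in directly behind it should appear on the secondary bus (7 here). The helper names below are hypothetical, the address 0:7:0:0 is a made-up example rather than a device from this server, and the fields are assumed to be decimal as displayed.

```python
from typing import NamedTuple


class PciAddress(NamedTuple):
    segment: int
    bus: int
    device: int
    function: int


def parse_pci_address(text):
    """Parse 'PCI 0:0:1:0' or '0:0:1:0' into segment, bus, device, function.

    ASSUMPTION: the fields are decimal, as they appear in the listing above.
    """
    fields = text.replace("PCI", "").strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected segment:bus:device:function, got {text!r}")
    return PciAddress(*(int(f) for f in fields))


def is_directly_behind(addr, secondary_bus):
    """True if the device sits on the bus immediately behind the bridge."""
    return addr.bus == secondary_bus


if __name__ == "__main__":
    root_port = parse_pci_address("PCI 0:0:1:0")  # from the listing above
    candidate = parse_pci_address("0:7:0:0")      # hypothetical device address
    print(root_port)
    print(is_directly_behind(candidate, secondary_bus=7))
```

So in the hardware list, the device to look for is the one whose bus number matches the root port's Secondary Bus Number.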

 

The only thing we've changed from the stock server is the addition of a FusionIO 320GB PCIe SSD, about a year ago.

 

Per this thread: http://h30499.www3.hp.com/t5/ProLiant-Servers-ML-DL-SL/DL380p-Gen8-with-uncorrectabl-PCI-express-error/td-p/5995669#.VBrE8vldVRg

 

I checked our System ROM, and we're at 02/25/2012, four days after the suggested version.

 

Thoughts?

 

How can I physically identify "PCIe Root Port 1a #1" to see what's plugged into it that might have generated the error?

 

Thanks!!

 

Jeff

 

 

 

 

