We have six DL360 Gen9 servers, ALL of which occasionally fail to start correctly and hang during Linux boot. We also have more than 20 DL360p Gen8 servers on which the problem DOES NOT occur.
After 3 weeks of extensive testing to find out why the problem occurs only rarely rather than on every boot, we were able to narrow it down to a small but critical issue.
The server hangs when BIOS Int13h is called to write data to disk. This happens only occasionally, and only while the operating system is loading, before the disk controller driver is loaded, which presumably replaces the BIOS Int13h calls.
It is quite easy to reproduce this problem by booting MS-DOS 6.22 or FreeDOS from hard disk or from a USB stick and running the following batch file:
:loop
copy c:\command.com c:\test.tst
goto loop
It takes anywhere from 10 seconds to 4 hours for the problem to appear (this is probably why the Linux boot does not hang every time but only occasionally). Under MS-DOS, either a red screen of death appears or the copy command hangs. Under FreeDOS, the following error is displayed on the screen: Invalid Opcode at D057 0206 0A82 ...
We have installed SmartArray controller firmware 3.56 and BIOS 1.52. Screenshots of the error under MS-DOS and FreeDOS are attached.
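To get a rough measure of how many copies succeed before the failure, a variant of the batch file above could append a marker line to a log file on each pass. This is only a sketch: the log file name c:\loop.log and the marker text are arbitrary choices for illustration, and the extra write also goes through the BIOS, so it may change the timing.

```bat
rem Variant of the loop above that records each completed pass.
rem c:\loop.log is a hypothetical file name chosen for illustration.
:loop
copy c:\command.com c:\test.tst
echo pass completed >> c:\loop.log
goto loop
```

After a hang and reboot, `find /c "pass" c:\loop.log` should print the number of logged passes. If a disk cache with write caching is loaded, the last few entries may be missing, so the count is approximate.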
HP support refuses to accept this as an issue. They claim that they cannot see any error in AHS, that all the hardware (exchangeable parts) is working correctly, and that there is nothing they can do.
Does anybody know of a solution, or how to correctly report this issue to HPE? Others must be, or will be, affected by this problem, even if only occasionally.