HP server burn-in
In my experience, most companies run either Dell or HP servers. For those that choose HP, the predominant choice is the popular, high-volume commodity ProLiant platform. Server burn-in is a topic that comes up occasionally: some shops skip it completely, while others want to do something but don’t quite know what. One could use popular desktop stress-testing products like Prime95 or SuperPi, which primarily stress the CPU and RAM. Sandra from SiSoftware, another popular desktop benchmarking suite, can be purchased in a very expensive enterprise version, but there is an easier way to ensure your new ProLiant server is production-ready.
Many don’t realize that HP bundles some great tools with its SmartStart suite for free! HP Insight Diagnostics is all you need to confidently test and burn in your hardware. As a matter of good practice, before you turn your new server over to production and entrust it with critical tasks, you should thoroughly test the hardware. All of it! From the CPU to the PCIe slots, fans, power supplies, and everything in between. Sandra can run specific benchmark and diagnostic jobs targeting individual components, but can it run exhaustive read tests against every PCI slot in the server? Exhaustively test the cache of each CPU in a multiprocessor (MP) server? No.
Insight Diagnostics can be run in two modes: Offline or Online. Offline tests are far broader and more granular than their online counterparts. Screens below show a few of the components that can be stressed and evaluated in this mode. Online mode, accessed via a running server’s System Management Homepage (servername:2381), is far more limited, providing diagnosis only for Smart Array volumes and supported power supplies. Its advantage is that online diagnostics can be run without disrupting active server operation.
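Before opening the System Management Homepage in a browser, it can be handy to confirm that the port answers at all. Here is a minimal sketch in Python, assuming the homepage is served over HTTPS on port 2381 as noted above. The helper names `smh_url` and `smh_reachable` are purely illustrative and not part of any HP tool.

```python
import socket

SMH_PORT = 2381  # port mentioned above for the System Management Homepage


def smh_url(host: str, port: int = SMH_PORT) -> str:
    """Build the homepage URL (HTTPS is an assumption here)."""
    return f"https://{host}:{port}/"


def smh_reachable(host: str, port: int = SMH_PORT, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `smh_reachable("servername")` with your own host name only tells you that something is listening on the port. It does not log in or run any diagnostics.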
To get started in Offline (burn-in) mode, boot your server with the latest SmartStart CD (or attach the ISO via virtual media in iLO) and enter the “maintain server” menu. Launch the diagnostics option and click the Test tab. Here you can choose the specific devices you want to test or simply choose all. Every single component of the server can be exhaustively tested. Then set the number of loops and the maximum test time. One full loop on a fifth-generation DL380 takes over an hour. I’d run this on a new server for at least a few hours, and up to a day or two if you have the time. If all tests pass on all components, you should feel confident that your new hardware is ready for production.
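To pick a loop count for a given burn-in window, a rough estimate is enough. The sketch below assumes every loop takes about the same time. The 70-minute figure used in the comments is only a placeholder for "over an hour" and should be replaced with what you observe on your own hardware. `loops_in_window` is an illustrative helper, not something provided by Insight Diagnostics.

```python
def loops_in_window(window_hours: float, loop_minutes: float) -> int:
    """Return how many complete test loops fit in the burn-in window."""
    if window_hours < 0 or loop_minutes <= 0:
        raise ValueError("window must be >= 0 and loop time must be > 0")
    return int((window_hours * 60) // loop_minutes)


# Assumed loop time of 70 minutes (hypothetical; measure your own server).
few_hours = loops_in_window(4, 70)    # 240 minutes  -> 3 complete loops
full_day = loops_in_window(24, 70)    # 1440 minutes -> 20 complete loops
two_days = loops_in_window(48, 70)    # 2880 minutes -> 41 complete loops
```

Setting the maximum test time as well as the loop count gives you a ceiling in case a loop runs longer than expected.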
 