This IBM® Redbooks® publication describes how to perform infrastructure health checks, such as checking the configuration and verifying the functionality of the common subsystems (nodes or servers, switch fabric, parallel file system, job management, problem areas, and so on). It documents how to monitor the overall health of the cluster infrastructure in order to deliver cost-effective, highly scalable, and robust solutions to technical computing clients. The publication is targeted toward technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) responsible for delivering cost-effective Technical Computing and IBM High Performance Computing (HPC) solutions that optimize business results, product development, and scientific discoveries. This book provides a broad understanding of a new architecture.
|Title|IBM High Performance Computing Cluster Health Check|
|Author|Dino Quintero, Ross Aiken, Shivendra Ashish, Manmohan Brahma, Murali Dhandapani, Rico Franke, Jie Gong, Markus Hilger, Herbert Mehlhose, Justin I. Morosi, Thorsten Nitsch, Fernando Pizzano, IBM Redbooks|
|Publisher|IBM Redbooks - 2014-04-03|