Problem-solving in High Performance Computing by Igor Ljubuncic

Chapter 8

Monitoring and prevention

Abstract

In this chapter, the reader will learn about situational awareness and about active, proactive, and reactive methods for building a full understanding of the system (the bird’s-eye view) and making the right, data-driven decisions. The reader will also learn why it is important to monitor and process the significant data, and how to avoid pitfalls and false positives in data trends. Finally, this chapter addresses monitoring and auditing facilities, and the correlation between environment and system events.

Keywords

monitoring
trend
report
log
sar
audit
nagios
zabbix
Our work so far has been focused on investigating problems and following industry best practices of problem solving. In a large ...