Problem-solving in High Performance Computing by Igor Ljubuncic

Chapter 3

Basic investigation

Abstract

In this chapter, we focus on the methodology and steps needed to perform successful first-level system debugging and analysis. We will use system logs and statistics to understand how a problem manifests.

Keywords

top
ps
dmesg
iostat
vmstat
sar

Profile the system status

Previous chapters taught us the models to apply when approaching what may appear to be a problem in our environment. The idea is to carefully isolate the problem, reduce it to a minimal set of variables, and then use industry-accepted methods to prove or disprove our theories. Now, we will learn about the tools that can help us in this quest.

Environment monitors

Typically, data center hosts are configured to ...