When Your Computer Becomes a Crime Scene: Linux Detective Skills You Need

So you’ve mastered the basics from Part 1 — you can nice your processes and run htop like a pro. But what happens when your system starts acting weird and basic monitoring isn’t enough? Welcome to the advanced course, where we turn you into a digital Sherlock Holmes who can solve any performance mystery.

Imagine your computer just had a performance hiccup at 2 AM, and now it’s 9 AM and your boss is asking why the server was slow. Basic tools show you what’s happening now, but you need to know what happened then. Time to bring out the big guns!

Why Should You Care About Advanced Monitoring?

Because being a Linux user without these tools is like being a doctor without a stethoscope. You might see the obvious problems (patient is unconscious), but you’ll miss the subtle ones (irregular heartbeat). Advanced monitoring helps you catch issues before they become disasters, optimize performance like a race car mechanic, and most importantly — look like an absolute wizard when you fix problems others can’t even diagnose.

Performance Metrics: Your System’s Health Checkup

The Sysstat Package — Your Swiss Army Knife

Meet mpstat and pidstat - they're like having a doctor for your computer that can check specific organs (CPU cores, processes) instead of just taking your temperature. These tools are part of the sysstat package, which is basically a medical toolkit for your system.

MPSTAT — The CPU Specialist

This tool breaks down what each CPU core is doing, like having individual heart monitors for each chamber of your heart. While top shows you overall CPU usage, mpstat tells you if core #3 is having a bad day while the others are chilling.

The Magic Syntax:

mpstat [-P {cpu | ALL}] [interval] [count]

Real-world detective work:

# Check all CPU cores every 2 seconds, 5 times in a row
mpstat -P ALL 2 5
# Monitor just CPU 0 every second for 10 iterations
mpstat -P 0 1 10

What you’ll discover:

  • User mode (%usr): Time spent running your programs (the actual work)
  • System mode (%sys): Time spent in the kernel on housekeeping (OS overhead)
  • Idle (%idle): Time spent twiddling thumbs (the good kind of waiting)
  • I/O wait (%iowait): Time spent idle while waiting for slow storage (the bad kind of waiting)
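
If you'd rather not eyeball those columns, here is a minimal sketch that picks out the struggling cores for you. It assumes the usual sysstat layout, where the closing "Average:" block starts with a header row naming %idle and %iowait. The name busy_cores and the 20% default threshold are made up for this example; neither is part of sysstat.

```bash
# Hypothetical helper (not part of sysstat): list cores whose average idle
# time is below a threshold. Reads `mpstat -P ALL` output on stdin.
busy_cores() {
    local threshold="${1:-20}"   # idle percentage below which a core counts as busy
    LC_ALL=C awk -v limit="$threshold" '
        /%idle/ {                                  # header row: remember column positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $1 == "Average:" && ("%idle" in col) && $(col["CPU"]) != "all" {
            idle = $(col["%idle"])
            if (idle + 0 < limit + 0)
                printf "cpu%s idle=%s%% iowait=%s%%\n", $(col["CPU"]), idle, $(col["%iowait"])
        }
    '
}
```

Try it with LC_ALL=C mpstat -P ALL 2 5 | busy_cores 20. Forcing the C locale keeps the decimal separator a dot, which is what the numeric comparison expects.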

PIDSTAT — The Process Detective

While mpstat looks at CPUs, pidstat stalks individual processes like a very polite private investigator. It's perfect for answering questions like "Which process is eating all my memory?" or "What's hammering my disk at 3 AM?"

The Detective Toolkit:

pidstat [-u] [-r] [-d] [-p pid] [interval] [count]
  • -u: CPU usage (who's hogging the processor?)
  • -r: Memory stats (who's the memory monster?)
  • -d: I/O activity (who's hammering the disk?)

Example investigation:

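# Watch Firefox's resource usage every 2 seconds, continuously
# (pgrep -d, joins multiple PIDs with commas, the list format -p expects)
pidstat -u -r -d -p "$(pgrep -d, firefox)" 2
# Report per-process disk I/O every 5 seconds (look for high kB_rd/s and kB_wr/s)
pidstat -d 5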
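
pidstat doesn't sort its output, so finding the top I/O consumers still takes a little legwork. The sketch below ranks processes by combined read and write rate. It assumes the "Average:" block that pidstat prints when you pass a count, with a header row containing kB_rd/s, kB_wr/s, PID and Command. top_io is a hypothetical helper name, not a sysstat command.

```bash
# Hypothetical helper: rank processes by average disk throughput.
# Reads `pidstat -d <interval> <count>` output on stdin.
top_io() {
    local n="${1:-5}"   # how many processes to show
    LC_ALL=C awk '
        /kB_wr\/s/ {                               # header row: remember column positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $1 == "Average:" && ("PID" in col) {
            # total kB/s, PID, command (only the first word of the command name)
            printf "%.2f %s %s\n", $(col["kB_rd/s"]) + $(col["kB_wr/s"]), $(col["PID"]), $(col["Command"])
        }
    ' | LC_ALL=C sort -rn | head -n "$n"
}
```

For example, LC_ALL=C pidstat -d 5 3 | top_io 3 prints the three busiest processes over a 15-second window.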

Diagnostic and Debugging Tools: When Things Go Really Wrong

/proc/<pid> — The Process’s Personal Diary

Every running process has a folder in /proc/ that's like reading someone's diary - but legally! This virtual filesystem contains everything you could ever want to know about a process.

The juicy details:

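# How was this process started? (arguments are NUL-separated, so turn them into spaces)
tr '\0' ' ' < /proc/1234/cmdline; echo
# What environment variables does it have? (also NUL-separated)
tr '\0' '\n' < /proc/1234/environ

# What's its current memory footprint? (the Vm* fields, e.g. VmRSS and VmSize)
grep -i '^Vm' /proc/1234/status

# What files does it have open?
ls -l /proc/1234/fd/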

Pro tip: Replace 1234 with any actual process ID from ps or htop!
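
To save some typing, the diary entries above can be bundled into one small function. This is only a sketch for Linux systems with /proc mounted; proc_summary is a hypothetical name, and reading another user's process may need sudo.

```bash
# Hypothetical helper: print a short summary of one process from /proc.
proc_summary() {
    local pid="$1" dir="/proc/$1"
    [ -d "$dir" ] || { echo "no such process: $pid" >&2; return 1; }
    # cmdline is NUL-separated, so translate the separators to spaces
    printf 'cmdline: %s\n' "$(tr '\0' ' ' < "$dir/cmdline")"
    # resident and virtual memory plus thread count (kernel threads have no Vm* lines)
    grep -E '^(VmRSS|VmSize|Threads):' "$dir/status"
    # each entry in fd/ is one open file descriptor
    printf 'open fds: %s\n' "$(ls "$dir/fd" 2>/dev/null | wc -l)"
}
```

Running proc_summary $$ describes your current shell, which is a safe way to try it out.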

PSTREE — The Family Tree Detective

Ever wonder which process is the parent of that mysterious background task? pstree draws you a beautiful family tree showing who spawned whom. It's perfect for tracking down runaway processes or understanding complex service hierarchies.

Family drama investigation:

# Show the full family tree with process IDs
pstree -p
# Focus on a specific user's processes
pstree username

# Show just the children of a specific process
pstree -p 1234
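
pstree shows who a process spawned; sometimes you want the opposite direction, from a mystery process back up to its ancestors. Here is a rough sketch that walks the PPid field in /proc/<pid>/status. The ancestors function is a made-up name for this example, and it assumes a Linux /proc. If your pstree supports it, pstree -s -p 1234 gives a similar view.

```bash
# Hypothetical helper: print a process and each of its ancestors, one per line.
ancestors() {
    local pid="$1" name ppid
    while [ -r "/proc/$pid/status" ]; do
        name=$(awk '/^Name:/ {print $2}' "/proc/$pid/status")   # first word of the name only
        ppid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
        printf '%s(%s)\n' "$name" "$pid"
        # PPid 0 means we reached the top of the tree (or of the container)
        [ "$ppid" -gt 0 ] 2>/dev/null || break
        pid="$ppid"
    done
}
```

ancestors 1234 prints the process first, then its parent, grandparent, and so on.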

LSOF — The “Who’s Using What” Detective

In Linux, everything is a file (network sockets, pipes, actual files, devices). lsof (List Open Files) tells you which process is using which file - it's like having X-ray vision for your system.

Practical magic tricks:

# See what's using port 443 (HTTPS)
sudo lsof -i TCP:443 -s TCP:LISTEN
# Find which process is using a file
lsof /path/to/important/file

# See all network connections
sudo lsof -i

# Find all files opened by Firefox
lsof -c firefox

# Find whichever service is listening on TCP port 80 (handy for port conflicts)
sudo lsof -iTCP:80 -sTCP:LISTEN

Real-world scenarios:

  • Can’t unmount a USB drive? lsof /media/usb will show you what's still using it
  • Suspicious network activity? lsof -i reveals all network connections
  • Port conflict? lsof -i :8080 shows what's already using port 8080
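
For scripting, lsof's default table is awkward to parse, but its -F option emits one field per line with a letter prefix (p for PID, c for command, n for name). The sketch below turns that into one line per listening socket. listeners is a hypothetical helper name, and the field handling is a simplification that ignores every other field type.

```bash
# Hypothetical helper: summarize `lsof -F pcn` field output, one line per socket.
listeners() {
    awk '
        /^p/ { pid = substr($0, 2) }     # p<pid> starts a new process block
        /^c/ { cmd = substr($0, 2) }     # c<command name>
        /^n/ { printf "%s pid=%s cmd=%s\n", substr($0, 2), pid, cmd }   # n<address:port>
    '
}
```

A possible invocation is sudo lsof -nP -iTCP -sTCP:LISTEN -Fpcn | listeners, where -n and -P keep addresses and ports numeric.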

STRACE — The Process Wiretap

When a program misbehaves and you need to know exactly what it’s doing, strace is like putting a wire on it. It logs every system call (read, write, open, connect, etc.) so you can see exactly where things go wrong.

The wiretap syntax:

# Monitor all system calls of a running process
strace -p 1234
# Start a program under surveillance
strace ./my_suspicious_program

# Focus on specific types of system calls (modern programs usually open files via openat)
strace -e trace=open,openat,read,write my_program

# See only network-related calls
strace -e trace=network curl google.com

Detective scenarios:

# Debug a "Permission denied" error (include openat, which most programs use instead of open)
strace -e trace=open,openat ./failing_program
# See what files a program tries to access
strace -e trace=file ls /home

# Monitor a web service's network activity (quoting lets strace attach to every matching PID)
sudo strace -e trace=network -p "$(pgrep apache2)"

Pro tip: strace output can be overwhelming. Use -e trace= to filter, or pipe to grep to find specific patterns!
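
When you're hunting errors specifically, counting failures usually beats scrolling. This sketch tallies failed system calls by name and errno from a saved trace. It assumes plain strace output without timestamp flags, and failed_calls is a hypothetical helper name, not an strace feature.

```bash
# Hypothetical helper: count failed system calls in strace output read from stdin.
failed_calls() {
    awk '
        / = -1 E/ {                                          # calls that returned an error
            line = $0
            sub(/^(\[pid +[0-9]+\] |[0-9]+ +)/, "", line)    # drop the PID prefix added by -f
            call = line; sub(/\(.*/, "", call)               # syscall name is the text before "("
            err = line; sub(/.* = -1 /, "", err); sub(/ .*/, "", err)   # errno name, e.g. EACCES
            count[call " " err]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -rn
}
```

For instance, run strace -f -o trace.log ./failing_program and then failed_calls < trace.log. A pile of EACCES results on openat points straight at a permissions problem.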

Advanced Investigation Techniques

The Performance Mystery Solving Process

When your system acts weird, follow this detective methodology:

  1. Start broad with htop or atop - what's the overall situation?
  2. Get specific with mpstat -P ALL 1 5 - which CPU cores are struggling?
  3. Find the culprit with pidstat -u -r -d 2 - which processes are misbehaving?
  4. Go deep with strace -p <suspicious_pid> - what is the bad process actually doing?
  5. Check resources with lsof -p <pid> - what files/ports is it using?
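
The five steps above can be sketched as a small evidence collector that saves each tool's output to its own file. Everything here is an assumption for illustration: the names run_step and triage, the file labels, the time limits, and the use of top in batch mode as a stand-in for the interactive htop. It also assumes the coreutils timeout command is available. Tracing someone else's process will usually need sudo.

```bash
# Hypothetical helper: run one command with a time limit and save its output,
# or record that the tool is missing.
# usage: run_step <outdir> <label> <seconds> <command> [args...]
run_step() {
    local outdir="$1" label="$2" limit="$3"; shift 3
    if command -v "$1" >/dev/null 2>&1; then
        timeout "$limit" "$@" > "$outdir/$label.txt" 2>&1
    else
        echo "skipped: $1 is not installed" > "$outdir/$label.txt"
    fi
    return 0   # a failing step should not stop the investigation
}

# Hypothetical wrapper: triage <pid> [outdir]
triage() {
    local pid="$1" outdir="${2:-triage-$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$outdir" || return 1
    run_step "$outdir" 1-overview  10 top -b -n 1              # start broad
    run_step "$outdir" 2-cores     10 mpstat -P ALL 1 5        # which cores struggle?
    run_step "$outdir" 3-processes 10 pidstat -u -r -d 2 3     # which processes misbehave?
    run_step "$outdir" 4-syscalls   5 strace -f -c -p "$pid"   # syscall summary, cut off after 5 s
    run_step "$outdir" 5-files     10 lsof -p "$pid"           # files and ports in use
    echo "$outdir"
}
```

Calling triage 1234 leaves a timestamped directory you can attach to a ticket or read through after the incident.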

The “System Was Slow Yesterday” Investigation

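This is where atop shines. Most of the tools above only show what is happening right now, but when atop's logging service is running it writes periodic snapshots to daily files under /var/log/atop/ that you can replay later:

# Replay 3:00 PM to 3:30 PM from the log for 17 August 2024 (use yesterday's date in the filename)
atop -r /var/log/atop/atop_20240817 -b 15:00 -e 15:30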
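
Typing the date by hand gets old quickly, so here is a small sketch that builds the log path for you. It assumes GNU date (for the -d option) and the atop_YYYYMMDD naming shown above. Both atop_log_for and the ATOP_LOGDIR override are hypothetical names invented for this example.

```bash
# Hypothetical helper: print the atop log path for a given day (default: yesterday).
atop_log_for() {
    local day="${1:-yesterday}"
    local logdir="${ATOP_LOGDIR:-/var/log/atop}"   # ATOP_LOGDIR is an assumed override, not an atop setting
    # GNU date understands phrases like "yesterday" as well as dates like 2024-08-17
    printf '%s/atop_%s\n' "$logdir" "$(date -d "$day" +%Y%m%d)"
}
```

With that in place, atop -r "$(atop_log_for yesterday)" -b 15:00 -e 15:30 replays yesterday afternoon without any date arithmetic.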

The “What’s Eating My Bandwidth?” Hunt

Combine tools for network detective work:

# Find processes using network
sudo lsof -i
# Monitor network system calls (quoting lets strace attach to every matching PID)
sudo strace -e trace=network -p "$(pgrep -f "suspicious_app")"

# Check what ports are listening
sudo lsof -i -s TCP:LISTEN

TLDR — The Advanced Cheat Sheet

Performance Deep Dive:

  • mpstat -P ALL 2 5 - CPU breakdown per core over time
  • pidstat -u -r -d 2 - Per-process CPU, memory, and I/O stats
  • atop -r <logfile> - Historical performance data

Process Investigation:

  • /proc/<pid>/status - Detailed process info
  • pstree -p - Process family tree with IDs
  • lsof -p <pid> - Files and ports used by process
  • strace -p <pid> - Real-time system call monitoring

Network & File Debugging:

  • sudo lsof -i :port - What's using a specific port
  • lsof <filename> - What's using a specific file
  • strace -e trace=network <command> - Network activity monitoring

The Golden Rule: Start broad, get specific, go deep. Every performance mystery has clues — you just need to know where to look!