Linux Boot Issues Troubleshooting Guide: GRUB, Kernel Panic & Driver Fixes

Picture this: It’s Monday morning, you’re sipping your coffee, and suddenly your Linux server decides to play dead. No boot, no mount, no mercy. Sound familiar? If your server is throwing digital tantrums, you’re not alone — and more importantly, you’re about to become the hero who fixes everything.

Think of your Linux boot process like a morning routine. Just as you need to wake up, get dressed, and grab your keys before leaving the house, your server needs to load the bootloader, start the kernel, and mount file systems before it’s ready to work. When any step fails, chaos ensues.

Why Should You Care About Boot and File System Issues?

Server downtime costs money — lots of it. Every minute your system is down, users can’t access services, applications crash, and your phone starts ringing with angry calls. Master these troubleshooting skills and you’ll:

  • Save your weekends from emergency server calls
  • Reduce downtime from hours to minutes
  • Impress your boss with lightning-fast problem resolution
  • Sleep better knowing you can handle any boot disaster

The 5-Step Troubleshooting Foundation

Before diving into specific issues, let’s establish our troubleshooting methodology — think of it as your server CPR process:

  1. Identify the problem — What’s actually broken?
  2. Establish a theory — Why might this be happening?
  3. Test your theory — Prove or disprove your hypothesis
  4. Implement the solution — Fix it or escalate if needed
  5. Prevent it happening again — Root cause analysis and preventive measures

Throughout each step, document everything. Trust me, future-you will thank present-you.

When Your Server Won’t Even Turn On

The most terrifying scenario: complete radio silence from your server.

What you’ll see: Nothing. Blank screen. No response to power button.

Common culprits:

  • Hardware failure — Power supply, motherboard, or RAM issues
  • Loose connections — Cables aren’t seated properly
  • Power issues — Circuit breaker tripped or power supply failed

Your action plan:

# Check physical connections first
# Verify power cables, network cables, monitor connections
# Check server logs if you have remote management
ipmitool -I lanplus -H <server-ip> -U <username> sel list

# Test with minimal hardware
# Remove all non-essential components and try booting

GRUB Misconfiguration Nightmares

GRUB is your system’s doorman — when it’s confused, nobody gets in.

What you’ll see:

  • “GRUB error” messages
  • “No such device” errors
  • Booting to a GRUB prompt instead of your OS

The fix:

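# Boot from rescue media and mount your root partition
# (mount a separate /boot partition at /mnt/boot too, if you have one)
mount /dev/sda1 /mnt
# Bind the virtual file systems GRUB needs inside the chroot
for fs in dev proc sys; do mount --bind /$fs /mnt/$fs; done
chroot /mnt

# Reinstall GRUB to the disk (not a partition) and regenerate its config
grub-install /dev/sda
update-grub
# On BIOS-based Red Hat systems: grub2-install /dev/sda && grub2-mkconfig -o /boot/grub2/grub.cfg

# Check your GRUB configuration
head -20 /boot/grub/grub.cfg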
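
Chrooting into the wrong partition is an easy mistake from rescue media. As a minimal sketch (the function name and the list of directories are illustrative assumptions, not part of GRUB), a bash helper can sanity-check the mount point first:

```shell
# looks_like_root MOUNT_POINT
# Returns 0 if the mount point contains the top-level entries a Linux root
# file system normally has. The directory list is an assumption for
# illustration; adjust it to your layout.
looks_like_root() {
    local mnt="$1" dir
    for dir in etc boot bin; do
        # Accept real directories and symlinks (e.g. /bin -> usr/bin)
        if [ ! -e "$mnt/$dir" ] && [ ! -L "$mnt/$dir" ]; then
            echo "missing: $mnt/$dir" >&2
            return 1
        fi
    done
    return 0
}
```

A typical use would be `looks_like_root /mnt && chroot /mnt`, so the chroot only runs when the check passes.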

Kernel Corruption and Panic Attacks

When your kernel corrupts, it’s like your server forgot how to speak Linux.

What you’ll see:

  • Kernel panic messages during boot
  • System freezing at “Loading kernel”
  • Random reboots or complete system hangs

Emergency response:

# Boot from an older kernel (if available)
# Select "Advanced options" in GRUB menu

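# Look for kernel errors in the current boot and the previous one
dmesg | grep -iE 'error|panic'
journalctl -k -b -1

# Reinstall kernel if corrupted (Debian/Ubuntu). linux-image-generic is
# only a metapackage, so name the versioned image package itself
apt install --reinstall linux-image-<version>

# or for Red Hat systems:
yum reinstall kernel # kernel-core on RHEL 8 and later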
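
One frequent cause of a panic right after the kernel loads is a missing or empty initramfs for that kernel version. The following bash sketch lists such versions. It assumes Debian/Ubuntu-style file names (`vmlinuz-<version>` and `initrd.img-<version>`), and the function name is made up for this example:

```shell
# kernels_missing_initrd [BOOT_DIR]
# Prints every kernel version in BOOT_DIR that has no non-empty initramfs.
# Assumes Debian/Ubuntu naming: vmlinuz-<version> / initrd.img-<version>.
kernels_missing_initrd() {
    local boot_dir="${1:-/boot}" kernel version
    for kernel in "$boot_dir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue          # glob matched nothing
        version="${kernel##*/vmlinuz-}"       # strip path and prefix
        # -s is true only for files that exist and are larger than zero bytes
        if [ ! -s "$boot_dir/initrd.img-$version" ]; then
            echo "$version"
        fi
    done
}
```

Run it as `kernels_missing_initrd /boot` from a working kernel, or point it at `/mnt/boot` from rescue media.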

File System Mount Failures: When Storage Goes Rogue

Nothing ruins your day like a file system that refuses to mount.

What you’ll see:

  • “mount: wrong fs type, bad option, bad superblock” errors
  • “Superblock corrupt” messages
  • Applications can’t access directories
  • Backup jobs failing mysteriously

Your rescue mission:

For basic mount errors:

# Check what's actually mounted
mount | grep -v tmpfs

# Try manual mount with verbose output
mount -v -t ext4 /dev/sda1 /mnt

# Check file system type
blkid /dev/sda1

# Fix UUID issues in fstab
nano /etc/fstab
# Replace old UUIDs with current ones from blkid
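
Comparing UUIDs by eye gets tedious on servers with many volumes. Here is a rough bash sketch that reports fstab UUIDs the system no longer knows about. The function name and the two-file interface are assumptions chosen to keep it easy to test:

```shell
# stale_fstab_uuids FSTAB UUID_LIST
# Prints every UUID referenced in FSTAB that is absent from UUID_LIST,
# a file with one UUID per line (for example from: blkid -s UUID -o value).
stale_fstab_uuids() {
    local fstab="$1" known="$2" uuid
    # Keep only entries whose first field starts with UUID=; commented
    # lines start with '#', so they never match.
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); gsub(/"/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        # -x whole line, -i ignore case, -F literal string
        grep -qxiF -- "$uuid" "$known" || echo "$uuid"
    done
}
```

In practice you might run `blkid -s UUID -o value > /tmp/uuids` followed by `stale_fstab_uuids /etc/fstab /tmp/uuids`. Any UUID it prints is a candidate for the fstab fix described above.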

For superblock corruption:

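# Boot into rescue mode first so the file system is unmounted
# Run file system check
fsck -y /dev/sda1

# If the primary superblock is damaged, list the backup superblocks
dumpe2fs /dev/sda1 | grep -i superblock
# (mke2fs -n /dev/sda1 also prints them without writing anything)

# Then repair from a backup. 8193 only applies to 1 KiB blocks;
# file systems with 4 KiB blocks typically use 32768
fsck.ext4 -b 32768 /dev/sda1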
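
The block number you pass to `fsck.ext4 -b` depends on the file system's block size, which is why a single hard-coded value does not work everywhere. As a worked example in bash, assuming mke2fs defaults (blocks per group equal to 8 times the block size, and a first data block of 1 only for 1 KiB blocks), the first backup superblock can be computed like this. The function name is hypothetical:

```shell
# first_backup_superblock BLOCK_SIZE
# Computes where the first backup superblock sits, assuming mke2fs defaults:
#   blocks per group = 8 * block size
#   first data block = 1 for 1 KiB blocks, 0 otherwise
first_backup_superblock() {
    local block_size="$1"
    local blocks_per_group=$(( 8 * block_size ))
    local first_data_block=0
    if [ "$block_size" -eq 1024 ]; then
        first_data_block=1
    fi
    # The backup lives at the start of block group 1
    echo $(( blocks_per_group + first_data_block ))
}
```

With these assumptions, `first_backup_superblock 1024` prints 8193 and `first_backup_superblock 4096` prints 32768. If the file system was created with non-default options, trust the list from `dumpe2fs` or `mke2fs -n` instead.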

The “Partition Not Writable” Mystery

Your file system mounts, but everything’s read-only. Frustrating doesn’t begin to cover it.

Symptoms:

  • “Read-only file system” errors when writing
  • Applications can’t save files
  • Log files stop updating

Investigation steps:

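# Check if mounted read-only: look for the 'ro' flag in the options
grep sda1 /proc/mounts
mount | grep sda1

# Find out why: the kernel often remounts read-only after disk or file system errors
dmesg | grep -iE 'remount|i/o error'

# Check the file system for errors before remounting
fsck -n /dev/sda1 # -n makes no changes; results on a mounted file system can be noisy

# Remount as read-write once the check is clean
mount -o remount,rw /dev/sda1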
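
Grepping for `ro` by hand also matches options such as `errors=remount-ro`, which can mislead you. A small bash sketch (the function name is an assumption) that checks the option list field by field avoids that:

```shell
# ro_mounts [MOUNTS_FILE]
# Prints "device mountpoint" for every entry mounted read-only.
# Reads /proc/mounts by default; any file in the same format works.
ro_mounts() {
    local mounts_file="${1:-/proc/mounts}"
    # Field 4 is the comma-separated option list; match "ro" exactly so
    # options like errors=remount-ro are not counted.
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $1, $2; break }
    }' "$mounts_file"
}
```

Calling `ro_mounts` with no arguments inspects the live system. Expect some legitimate read-only entries (squashfs images, for example) alongside any problem partitions.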

When Your Disk is “Full” But Isn’t

The classic “No space left on device” error when df -h shows available space.

Two main culprits:

Scenario 1: Actual disk full

# Check disk usage
df -h

# Find space hogs
du -sh /* | sort -hr

# Clean up common space wasters
# Rotate logs
logrotate -f /etc/logrotate.conf

# Clean old package files
apt autoremove && apt autoclean

# Remove unused Docker data (-a deletes every image not used by a container)
docker system prune -a
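
To catch a filling disk before it bites, you could filter `df` output against a threshold. This is a minimal bash sketch, assuming the POSIX output format of `df -P` and mount points without spaces. The function name is invented for the example:

```shell
# full_filesystems THRESHOLD
# Reads `df -P` output on stdin and prints "mountpoint usage%" for every
# file system at or above THRESHOLD percent.
# Assumes mount points contain no spaces (they would shift the columns).
full_filesystems() {
    local threshold="$1"
    # Skip the header line; column 5 is the usage such as "93%",
    # column 6 is the mount point.
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}
```

Usage would look like `df -P | full_filesystems 90`. On GNU systems the same filter should also work for inodes with `df -Pi`, since the usage percentage stays in the fifth column.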

Scenario 2: Inode exhaustion

This happens when huge numbers of small files use up every inode: the disk still has free blocks, but no new files can be created.

# Check inode usage
df -i

# If inodes are at 100%, find directories with too many small files
find /var -type f | wc -l
find /tmp -type f | wc -l

# Common culprit: mail directories with thousands of small files
ls /var/mail/ | wc -l
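
Total counts tell you a tree is crowded, but not which directory is to blame. As a sketch that assumes GNU `find` (for `-printf`), this bash function ranks directories by the number of files they hold directly. The function name is hypothetical:

```shell
# busiest_dirs DIR [COUNT]
# Lists the COUNT directories (default 10) under DIR that directly contain
# the most regular files, busiest first. Requires GNU find for -printf.
busiest_dirs() {
    local root="$1" count="${2:-10}"
    # -xdev stays on one file system; %h prints each file's parent directory
    find "$root" -xdev -type f -printf '%h\n' 2>/dev/null |
        sort | uniq -c | sort -rn | head -n "$count"
}
```

For example, `busiest_dirs /var 5` would show the five fullest directories under /var, each prefixed with its file count.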

User Quota Troubles

Individual users hitting storage limits can cause localized chaos.

# Check user quotas
repquota -a

# Check specific user
quota -u username

# Adjust quotas if needed (be careful!)
edquota username

Your Emergency Response Toolkit

Keep these commands handy for quick diagnosis:

# Quick system health check
systemctl status
dmesg | tail -20
journalctl -xe

# Storage quick check
df -h && df -i
lsblk
mount | grep -v tmpfs

# Boot problem investigation
systemctl list-units --failed
journalctl --boot=-1

TLDR Cheat Sheet

Server won’t power on: Check cables → Check power → Check hardware → Call vendor

GRUB errors: Boot rescue → mount root and bind /dev, /proc, /sys → chroot → grub-install /dev/sda → update-grub

Mount failures: Check blkid → Fix /etc/fstab → Run fsck → Remount

Read-only partition: Check dmesg → Run fsck -n → mount -o remount,rw → Check permissions

Disk full: df -h vs df -i → Clean logs → Remove old files → Extend partition

Prevention is better than cure: Regular backups, monitoring disk space, keeping spare hardware, and documenting your configurations will save you countless headaches.

Remember: every server tantrum is just a puzzle waiting to be solved. With these tools and techniques, you’ll go from stressed-out admin to confident problem-solver. Your future self (and your sleep schedule) will thank you.
