Picture this: It’s Monday morning, you’re sipping your coffee, and suddenly your Linux server decides to play dead. No boot, no mount, no mercy. Sound familiar? If your server is throwing digital tantrums, you’re not alone — and more importantly, you’re about to become the hero who fixes everything.
Think of your Linux boot process like a morning routine. Just as you need to wake up, get dressed, and grab your keys before leaving the house, your server needs to load the bootloader, start the kernel, and mount file systems before it’s ready to work. When any step fails, chaos ensues.
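Want to see that routine in action on your own box? On systemd-based distributions (an assumption, but it covers most modern servers), a couple of stock commands break the last boot down stage by stage:
# Show time spent in firmware, bootloader, kernel, and userspace
systemd-analyze
# Show which units sat on the critical path of the last boot
systemd-analyze critical-chain
# Replay the kernel and early-boot messages from the current boot
journalctl -b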
Why Should You Care About Boot and File System Issues?
Server downtime costs money — lots of it. Every minute your system is down, users can’t access services, applications crash, and your phone starts ringing with angry calls. Master these troubleshooting skills and you’ll:
- Save your weekends from emergency server calls
- Reduce downtime from hours to minutes
- Impress your boss with lightning-fast problem resolution
- Sleep better knowing you can handle any boot disaster
The 5-Step Troubleshooting Foundation
Before diving into specific issues, let’s establish our troubleshooting methodology — think of it as your server CPR process:
- Identify the problem — What’s actually broken?
- Establish a theory — Why might this be happening?
- Test your theory — Prove or disprove your hypothesis
- Implement the solution — Fix it or escalate if needed
- Prevent it happening again — Root cause analysis and preventive measures
Throughout each step, document everything. Trust me, future-you will thank present-you.
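A low-effort way to do that is to record your whole terminal session while you work. Here's a minimal sketch using the standard script utility; the log path is just an example:
# Record everything typed and printed in this shell to a dated log file
script -a /root/troubleshooting-$(date +%F).log
# ... run your diagnosis and fixes ...
# Stop recording when you're done
exit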
When Your Server Won’t Even Turn On
The most terrifying scenario: complete radio silence from your server.
What you’ll see: Nothing. Blank screen. No response to power button.
Common culprits:
- Hardware failure — Power supply, motherboard, or RAM issues
- Loose connections — Cables aren’t seated properly
- Power issues — Circuit breaker tripped or power supply failed
Your action plan:
# Check physical connections first
# Verify power cables, network cables, monitor connections
# Check server logs if you have remote management
ipmitool -I lanplus -H <server-ip> -U <username> sel list
# Test with minimal hardware
# Remove all non-essential components and try booting
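If the machine has a baseboard management controller (BMC) reachable over the network, you can also ask it directly whether the box even has power before anyone drives to the data center. A sketch assuming IPMI over LAN is enabled, using the same placeholders as above:
# Ask the BMC for chassis state (power, faults, last power event)
ipmitool -I lanplus -H <server-ip> -U <username> chassis status
# Check just the power state
ipmitool -I lanplus -H <server-ip> -U <username> chassis power status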
GRUB Misconfiguration Nightmares
GRUB is your system’s doorman — when it’s confused, nobody gets in.
What you’ll see:
- “GRUB error” messages
- “No such device” errors
- Booting to a GRUB prompt instead of your OS
The fix:
# Boot from rescue media, mount your root partition, and chroot into your system
mount /dev/sda1 /mnt
# Bind the virtual file systems so grub-install works inside the chroot
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt
# Reinstall GRUB to the disk (not the partition)
grub-install /dev/sda
update-grub
# Check your GRUB configuration
head -20 /boot/grub/grub.cfg
Kernel Corruption and Panic Attacks
When your kernel corrupts, it’s like your server forgot how to speak Linux.
What you’ll see:
- Kernel panic messages during boot
- System freezing at “Loading kernel”
- Random reboots or complete system hangs
Emergency response:
# Boot from an older kernel (if available)
# Select "Advanced options" in GRUB menu
# Check kernel integrity
dmesg | grep -i error
journalctl --boot=-1 | grep kernel
# Reinstall kernel if corrupted
apt install --reinstall linux-image-generic
# or for Red Hat systems:
yum reinstall kernel
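A boot failure blamed on the kernel is often really a damaged or missing initramfs, so it's worth confirming the files in /boot and regenerating the initramfs too. A sketch; exact file names and commands vary by distribution:
# Confirm the kernel and initramfs images exist (file names vary by distribution)
ls -lh /boot
# Rebuild the initramfs on Debian/Ubuntu
update-initramfs -u -k all
# or for Red Hat systems:
dracut --force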
File System Mount Failures: When Storage Goes Rogue
Nothing ruins your day like a file system that refuses to mount.
What you’ll see:
- “Mount: wrong fs type” errors
- “Superblock corrupt” messages
- Applications can’t access directories
- Backup jobs failing mysteriously
Your rescue mission:
For basic mount errors:
# Check what's actually mounted
mount | grep -v tmpfs
# Try manual mount with verbose output
mount -v -t ext4 /dev/sda1 /mnt
# Check file system type
blkid /dev/sda1
# Fix UUID issues in fstab
nano /etc/fstab
# Replace old UUIDs with current ones from blkid
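For reference, a UUID-based fstab entry usually looks like the line below. The UUID and mount point here are placeholders, so substitute the values blkid reports for your partition:
# <file system>                             <mount point>  <type>  <options>  <dump>  <pass>
UUID=0a1b2c3d-1234-5678-9abc-def012345678   /data          ext4    defaults   0       2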
For superblock corruption:
# Boot into rescue mode first
# Run file system check
fsck -y /dev/sda1
# For ext4 file systems, try a backup superblock
# (8193 is typical for 1 KiB block sizes, 32768 for 4 KiB)
fsck.ext4 -b 8193 /dev/sda1
# Inspect superblock details (dumpe2fs only displays, it does not repair)
dumpe2fs /dev/sda1 | head -20
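Backup superblock locations depend on the block size, so rather than guessing 8193 you can ask the ext tools where the backups actually live. A sketch; mke2fs -n is a dry run that only prints what it would do, but double-check the flag before pointing any mkfs-family command at a disk that holds data:
# List backup superblock locations without creating a file system (dry run)
mke2fs -n /dev/sda1
# Or read them from the existing superblock metadata
dumpe2fs /dev/sda1 | grep -i 'backup superblock'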
The “Partition Not Writable” Mystery
Your file system mounts, but everything’s read-only. Frustrating doesn’t begin to cover it.
Symptoms:
- “Permission denied” errors when writing
- Applications can’t save files
- Log files stop updating
Investigation steps:
# Check if mounted read-only
cat /proc/mounts | grep sda1
# Look for 'ro' flag indicating read-only mount
mount | grep sda1
# Remount as read-write
mount -o remount,rw /dev/sda1
# Before trusting the remount, check the file system for errors
fsck -n /dev/sda1  # -n runs a read-only check, no repairs
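One very common cause is the kernel itself: when it detects a file system error it can remount the partition read-only (for example when the errors=remount-ro mount option is set, as it often is for root file systems). The kernel log usually says so:
# Look for the kernel remounting the file system read-only after an error
dmesg | grep -iE 'remount|read-only'
# See what the file system is configured to do when it hits an error
tune2fs -l /dev/sda1 | grep -i 'errors behavior'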
When Your Disk is “Full” But Isn’t
The classic “No space left on device” error when df -h shows available space.
Two main culprits:
Scenario 1: Actual disk full
# Check disk usage
df -h
# Find space hogs
du -sh /* 2>/dev/null | sort -hr  # 2>/dev/null hides errors from /proc and friends
# Clean up common space wasters
# Rotate logs
logrotate -f /etc/logrotate.conf
# Clean old package files
apt autoremove && apt autoclean
# Remove orphaned Docker images
docker system prune -a
Scenario 2: Inode exhaustion
This happens when huge numbers of small files exhaust the available inodes, so new files cannot be created even though free space remains.
# Check inode usage
df -i
# If inodes are at 100%, find directories with too many small files
find /var -type f | wc -l
find /tmp -type f | wc -l
# Common culprit: mail directories with thousands of small files
ls /var/mail/ | wc -l
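Once you've found the offending directory, deleting the old files in bulk frees the inodes. A hedged sketch; the path and age threshold are examples only, so count first and be sure of what you're deleting before adding -delete:
# Count candidates first (example path and 30-day threshold, adjust to your case)
find /var/spool/exampleapp -type f -mtime +30 | wc -l
# Then remove them once you're confident
find /var/spool/exampleapp -type f -mtime +30 -delete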
User Quota Troubles
Individual users hitting storage limits can cause localized chaos.
# Check user quotas
repquota -a
# Check specific user
quota -u username
# Adjust quotas if needed (be careful!)
edquota username
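edquota drops you into an editor, which is clumsy in scripts; setquota does the same job non-interactively. A sketch with a made-up user and made-up limits (block limits are in 1 KiB units), so adapt the numbers before running it:
# Give user 'alice' roughly a 5 GB soft / 6 GB hard block limit, no inode limit, on /home
setquota -u alice 5000000 6000000 0 0 /home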
Your Emergency Response Toolkit
Keep these commands handy for quick diagnosis:
# Quick system health check
systemctl status
dmesg | tail -20
journalctl -xe
# Storage quick check
df -h && df -i
lsblk
mount | grep -v tmpfs
# Boot problem investigation
systemctl list-units --failed
journalctl --boot=-1
TLDR Cheat Sheet
Server won’t power on: Check cables → Check power → Check hardware → Call vendor
GRUB errors: Boot rescue → chroot → grub-install /dev/sda → update-grub
Mount failures: Check blkid → Fix /etc/fstab → Run fsck → Remount
Read-only partition: mount -o remount,rw → Run fsck → Check permissions
Disk full: df -h vs df -i → Clean logs → Remove old files → Extend partition
Prevention is better than cure: Regular backups, monitoring disk space, keeping spare hardware, and documenting your configurations will save you countless headaches.
Remember: every server tantrum is just a puzzle waiting to be solved. With these tools and techniques, you’ll go from stressed-out admin to confident problem-solver. Your future self (and your sleep schedule) will thank you.