Linux File System Hunting


Most people learn Linux by memorizing commands. I decided to do something different — I went hunting.

Instead of running ls and mkdir in a terminal for the hundredth time, I opened up the Linux file system and asked: what's really going on here? What do these folders actually do? Why do they exist? What problems are they solving?

What followed was one of the most eye-opening afternoons I've had as a CS student. Here's what I discovered.


1. /etc — The Brain of the Entire System

I always knew /etc existed. I never knew it was basically the control room of Linux.

Almost every configuration file that decides how your system behaves lives here. Want to change how your system resolves domain names? /etc/hosts. Want to control who can log in? /etc/passwd and /etc/shadow. Want to define your hostname? /etc/hostname.

cat /etc/hostname

What blew my mind was realizing that /etc doesn't contain programs — it contains instructions for programs. Every service running on your Linux machine comes to /etc to understand its own rules. It's a convention — not enforced by the kernel — but universally followed.

The name itself comes from old Unix terminology meaning "et cetera," but practically, it means: everything that doesn't belong anywhere else but is critical to how the system runs.

Why it matters: If you ever need to debug why something behaves a certain way on Linux, /etc is usually a good first stop.


2. /etc/resolv.conf — How Your System Actually Resolves DNS

I typed google.com in a browser and wondered: how does my machine know where that actually is? The answer lives in one small file:

cat /etc/resolv.conf

This file tells your system which DNS nameserver to contact when it needs to resolve a domain name. On most modern systems, you'll see something like:

nameserver 127.0.0.53

That's not Google's DNS. That's systemd-resolved — a local DNS stub running on your own machine. Your system first talks to itself, which then forwards the query upstream.

resolvectl status

Running this shows you the actual upstream DNS servers in use — often assigned by your router via DHCP. What I found genuinely interesting: the DNS resolution chain is:

browser → local stub (127.0.0.53) → ISP or custom recursive resolver → root, TLD, and authoritative nameservers

That single file is the entry point to that entire chain.

Why it matters: If your internet connection works but domain names aren't resolving, this file is the first place to look. It's also why switching DNS servers (to Cloudflare's 1.1.1.1 or Google's 8.8.8.8) can sometimes make lookups faster.
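To make the file format concrete, here's a rough sketch that pulls the nameserver addresses out of a resolv.conf-style file. The function name list_nameservers and the sample file contents are my own illustration, not something shipped with the system.

```shell
# list_nameservers: print the address from each "nameserver" line
# of a resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Demo on a throwaway sample file, so the sketch does not depend
# on this machine's real configuration.
sample=$(mktemp)
printf '%s\n' \
    '# sample resolv.conf (made up for illustration)' \
    'nameserver 127.0.0.53' \
    'options edns0 trust-ad' \
    'search lan' > "$sample"
list_nameservers "$sample"    # prints 127.0.0.53
rm -f "$sample"
```

Calling list_nameservers with no argument reads the real /etc/resolv.conf instead.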


3. /etc/hosts — DNS That Existed Before DNS

Before the internet had a proper DNS system, every machine on the network maintained a file mapping hostnames to IP addresses. That file still exists today:

cat /etc/hosts

You'll see entries like:

127.0.0.1   localhost
::1         localhost

Your machine checks this file before querying any DNS server. That order is defined in /etc/nsswitch.conf — where files (i.e., /etc/hosts) comes before dns.

This means you can map any domain to any IP locally. Developers use this to point myapp.local to 127.0.0.1. Sysadmins use it for internal services. And historically, malware would modify this file to redirect legitimate domains to malicious IPs — which is why monitoring it matters from a security perspective.
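Here's a small sketch of what a lookup in that file amounts to. The function lookup_host is a hypothetical helper I wrote for illustration; the real resolver does more (it follows the order in /etc/nsswitch.conf, which is what getent hosts uses too).

```shell
# lookup_host: print the first address mapped to NAME in a
# hosts-format file (defaults to /etc/hosts).
lookup_host() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }      # strip comments
        NF < 2 { next }         # skip blank or incomplete lines
        {
            # field 1 is the address, the remaining fields are names
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$file"
}

# Demo on a sample file (myapp.local is the dev mapping from the text above).
sample=$(mktemp)
printf '%s\n' \
    '127.0.0.1   localhost' \
    '127.0.0.1   myapp.local   # local dev site' \
    '::1         localhost' > "$sample"
lookup_host myapp.local "$sample"    # prints 127.0.0.1
rm -f "$sample"
```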

The insight: A file format that dates back to the ARPANET era is still consulted by your system's resolver before it sends a DNS query.


4. /proc — A Filesystem That Doesn't Actually Exist on Disk

This one genuinely surprised me. /proc looks like a folder full of files, but none of those files physically exist on your hard drive. They are generated in real-time by the kernel every time you read them.

cat /proc/cpuinfo

This gives you live information about your CPU — cores, model name, flags. But there's no cpuinfo file sitting on disk. The kernel fabricates it the moment you ask.

cat /proc/meminfo

Same for memory usage. Everything is live.

What's really interesting is /proc/<PID>/ — for every running process, there's a directory named after its Process ID. Inside:

cat /proc/1/status

On most modern distributions, process 1 is systemd, the first userspace process the kernel starts. Inside its /proc directory, you can see its memory usage, its state, which user it's running as, and even its open file descriptors.

sudo ls -la /proc/1/fd

fd shows every file currently open by that process. This is exactly how tools like lsof work — they're just reading /proc under the hood.

Why it matters: /proc is the window through which the kernel exposes its internals to userspace. It's how monitoring tools, debuggers, and sysadmins understand what's actually running on a machine.
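Because these are plain "Key: value" text files, a monitoring script can read them with ordinary text tools. Here's a minimal sketch, assuming the status format shown by cat /proc/1/status; the function name status_field is my own.

```shell
# status_field: print the value of one "Key:<tab>value" field from a
# /proc/<PID>/status-style file (defaults to /proc/self/status).
status_field() {
    awk -F':[ \t]*' -v key="$1" '$1 == key { print $2; exit }' \
        "${2:-/proc/self/status}"
}

# On a Linux machine, /proc/self points at whichever process is doing
# the reading, which here is the awk process itself.
if [ -r /proc/self/status ]; then
    status_field Name
    status_field State
fi
```

To inspect another process, pass its file explicitly, for example status_field VmRSS /proc/1/status.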


5. /proc/net/route — The Routing Table Hidden in Plain Sight

I knew routers had routing tables. I didn't know my own Linux machine did too.

cat /proc/net/route

The output looks like hex gibberish at first:

Iface   Destination  Gateway   Flags  ...
eth0    00000000     0101A8C0  0003   ...

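But once you decode it, it makes sense. Each address is a 32-bit value printed as hex in host byte order (little-endian on x86), so you read it one byte at a time from right to left: C0 A8 01 01 becomes 192.168.1.1, your default gateway. The 00000000 destination means "everything else": the default route.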

A more human-readable version:

ip route show

What I realized: every time your machine sends a packet, it consults this table to decide where to send it. If the destination is on your local network — send directly. Otherwise — send to the gateway. This decision happens at the kernel level, every single time, for every single packet.

The insight: Your Linux machine is essentially a router for itself. The routing table isn't just a networking concept — it's a live kernel data structure exposed through /proc.
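The hex decoding described above can be sketched in a few lines of shell. This assumes a little-endian machine (as in the example output), and hex_to_ip is just a name I picked.

```shell
# hex_to_ip: convert an 8-digit hex value from /proc/net/route into
# dotted-decimal form. Assumes a little-endian host, so the byte pairs
# are read from right to left.
hex_to_ip() {
    hex=$1
    b1=$(printf '%s' "$hex" | cut -c7-8)   # last pair is the first octet
    b2=$(printf '%s' "$hex" | cut -c5-6)
    b3=$(printf '%s' "$hex" | cut -c3-4)
    b4=$(printf '%s' "$hex" | cut -c1-2)   # first pair is the last octet
    printf '%d.%d.%d.%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4"
}

hex_to_ip 0101A8C0    # prints 192.168.1.1
hex_to_ip 00000000    # prints 0.0.0.0
```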


6. /etc/passwd and /etc/shadow — The Split Security Design

When I first ran:

cat /etc/passwd

I expected to see passwords. Instead I saw entries like:

dipan:x:1000:1000:,,,:/home/dipan:/bin/bash

That x where the password should be? It means the actual password hash is stored elsewhere — in /etc/shadow:

sudo cat /etc/shadow

The entries here look like:

dipan:$6$rounds=...$<long hash>:19500:0:99999:7:::

This split exists for a critical security reason: /etc/passwd needs to be readable by everyone (many programs use it to map user IDs to names), but passwords obviously shouldn't be. So Linux separates them — /etc/shadow is readable only by root.

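The hash prefix itself ($6$) tells you the algorithm used: $6$ means SHA-512 crypt, and newer distributions may show $y$ (yescrypt) instead. The numbers after the hash are the date of the last password change (in days since January 1, 1970), the minimum and maximum password age, and the warning period before expiry: a full password lifecycle policy packed into one line.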

Why it matters: This design is a real-world lesson in the principle of least privilege — give each file only the permissions it absolutely needs.
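To see how much is packed into one shadow entry, here's a sketch that splits a line into named fields. The function describe_shadow is hypothetical, and the sample entry uses a made-up salt and hash, not real credentials.

```shell
# describe_shadow: split one /etc/shadow-style line into named fields.
describe_shadow() {
    printf '%s\n' "$1" | awk -F: '{
        split($2, h, "[$]")          # "$6$salt$hash" -> h[2] = "6"
        alg = "unknown"
        if (h[2] == "1") alg = "MD5"
        if (h[2] == "5") alg = "SHA-256"
        if (h[2] == "6") alg = "SHA-512"
        if (h[2] == "y") alg = "yescrypt"
        print "user=" $1
        print "algorithm=" alg
        print "last_change_days=" $3  # days since 1970-01-01
        print "min_age=" $4
        print "max_age=" $5
        print "warn_days=" $6
    }'
}

# Single quotes keep the shell from expanding the $ signs in the entry.
describe_shadow 'dipan:$6$examplesalt$FAKEHASHFORDEMO:19500:0:99999:7:::'
```

For the sample entry this reports SHA-512, a maximum age of 99999 days (effectively "never expires"), and a 7-day warning period.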


7. /var/log — The System's Memory

If your system misbehaves, /var/log is where it leaves a trail:

ls /var/log

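On Debian or Ubuntu, you'll find files like syslog, auth.log, kern.log, dpkg.log, and more. Other distributions use different names, such as messages and secure on RHEL-family systems.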

tail -f /var/log/syslog

Watching this live is fascinating. Every kernel event, every daemon starting or crashing, every network change — it flows through here in real time.

The auth.log specifically records every login attempt — successful or not:

grep "Failed password" /var/log/auth.log

On any internet-exposed machine, this is almost always full of brute-force SSH attempts from bots trying common username/password combos. Seeing it for the first time is a bit unsettling — and a good reminder of why strong passwords (or key-based auth) matter.
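To see who is knocking hardest, you can group those failures by source address. Here's a rough sketch, assuming the usual sshd "Failed password for ... from <ip> port ..." line format; failed_by_ip is my own helper name and the sample log lines are invented, using documentation-only IP ranges.

```shell
# failed_by_ip: read sshd log lines on stdin and print "count ip"
# for failed password attempts, busiest source first.
failed_by_ip() {
    awk '/Failed password/ {
             # the address is the word right after "from"
             for (i = 1; i < NF; i++)
                 if ($i == "from") { n[$(i + 1)]++; break }
         }
         END { for (ip in n) print n[ip], ip }' | sort -rn
}

failed_by_ip <<'EOF'
Jan 10 12:00:01 host sshd[1234]: Failed password for root from 203.0.113.5 port 51234 ssh2
Jan 10 12:00:03 host sshd[1236]: Failed password for invalid user admin from 203.0.113.5 port 51240 ssh2
Jan 10 12:00:09 host sshd[1240]: Failed password for root from 198.51.100.7 port 40022 ssh2
Jan 10 12:01:15 host sshd[1250]: Accepted publickey for dipan from 192.0.2.10 port 50000 ssh2
EOF
# prints:
# 2 203.0.113.5
# 1 198.51.100.7
```

On a real machine you would feed it the log itself, for example failed_by_ip < /var/log/auth.log, which may need sudo depending on the file's permissions.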

On modern systemd-based systems, there's also:

journalctl -xe

This queries the binary journal — a structured, indexed log format that allows filtering by service, time range, or priority in ways plain text logs can't match.

The insight: Logs aren't just for debugging. They're a forensic timeline. If something goes wrong on a Linux system, /var/log is often the only witness.


8. /etc/systemd/system — How Services Actually Start

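I always used systemctl start nginx without thinking about what that actually does. The answer is a unit file. Units installed by packages live in /lib/systemd/system (or /usr/lib/systemd/system), and units you write or override yourself go in /etc/systemd/system. Here's the one for SSH: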

cat /lib/systemd/system/ssh.service

A .service file is a plain text description of how a service should behave:

[Unit]
Description=OpenBSD Secure Shell server
After=network.target
 
[Service]
ExecStart=/usr/sbin/sshd -D
Restart=on-failure
 
[Install]
WantedBy=multi-user.target

This single file tells systemd:

  • Start after network.target (this only orders startup; it doesn't wait for the network to be fully online)
  • Run this binary
  • If it crashes or exits with an error, restart it automatically
  • WantedBy=multi-user.target → start this at boot when the system reaches normal multi-user mode

What I found interesting: dependencies between services are declared right here in plain text. systemd builds a full dependency graph at boot and starts services in the correct order, or in parallel when possible, which makes boot faster.

Why it matters: Understanding .service files means you can write your own. Any script or program can become a managed system service with automatic restart, logging, and boot integration.
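As a starting point for writing your own, here's a sketch that prints a minimal unit file using the same directives as the SSH example. The function make_unit, the service description, and the /usr/local/bin/myapp path are all hypothetical placeholders.

```shell
# make_unit: print a minimal .service file for a command.
#   $1 = human-readable description, $2 = full command line to run
make_unit() {
    desc=$1
    cmd=$2
    cat <<EOF
[Unit]
Description=$desc
After=network.target

[Service]
ExecStart=$cmd
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Hypothetical program path, for illustration only.
make_unit "My example app" "/usr/local/bin/myapp --serve"
```

To actually install it, you would save the output as /etc/systemd/system/myapp.service (as root), then run sudo systemctl daemon-reload followed by sudo systemctl enable --now myapp.service.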


9. /dev — Where Hardware Becomes a File

Linux follows one core philosophy: everything is a file. Nowhere is this more literal than /dev:

ls /dev

You'll see entries like sda, sdb (your disks), tty0, tty1 (terminals), null, zero, random, and urandom.

Device         What it does
/dev/null      Write anything here and it disappears. It's the void. Used to suppress output in scripts.
/dev/zero      Read from this and you get an infinite stream of zero bytes. Used to create blank files or wipe disks.
/dev/urandom   Read from this and you get cryptographically secure random bytes, used internally to generate encryption keys and session tokens.

head -c 16 /dev/urandom | base64

That gives you 16 random bytes, base64-encoded so they're printable, which you can use as a random token. The entropy for this comes from hardware events like disk timings, keyboard interrupts, and network packet arrival times.
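Base64 output can contain +, / and = characters, which are awkward in URLs and filenames. Here's a variant sketch that prints the same random bytes as hex instead; random_hex is my own helper name.

```shell
# random_hex: print N random bytes from /dev/urandom as 2*N hex
# characters (N defaults to 16).
random_hex() {
    head -c "${1:-16}" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

random_hex 16    # 32 hex characters, different on every run
```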

The insight: The disk your OS runs on (/dev/sda) is just a file. The terminal you're typing into (/dev/tty) is just a file. This abstraction is what makes Linux both powerful and philosophically elegant.


10. /boot — The Moment Before Linux Exists

Before the kernel runs, something has to load it. That's what /boot is for:

ls /boot

You'll typically find:

  • vmlinuz: the compressed Linux kernel itself
  • initrd.img: an initial RAM disk, a tiny temporary filesystem loaded into memory before the real filesystem is mounted
  • grub/: the GRUB bootloader config

cat /boot/grub/grub.cfg

This file tells GRUB what OS options to show you and which kernel to load. If you've ever dual-booted Linux and Windows, this file controls that menu.

The initrd.img especially fascinated me — it's a complete minimal Linux environment packed into a single file. When your machine boots, the kernel first unpacks this into RAM, uses it to load the necessary drivers to access your actual disk, then mounts the real filesystem and hands over control.

The insight: The Linux boot process is a relay race:

BIOS/UEFI → GRUB → kernel → initrd → systemd → your desktop

/boot holds the files for three of those stages: GRUB, the kernel, and the initrd.


What This Hunt Taught Me

Going through the Linux file system this way — not to run commands, but to understand — completely changed how I see an OS.

Linux doesn't hide its internals. Everything is a file. Everything is readable (with the right permissions). The kernel exposes itself through /proc. Services declare their own behavior in plain text. Logs record everything.

The more you look, the more you realize Linux is not a black box — it's a transparent machine that trusts you to understand it, if you're willing to look.