
Lecture 1 - OS Design Principles


Disclaimer

⚠️ Ye who have ventured here have abandoned all hope...

The following content contains intense cybersecurity themes and may not be suitable for the faint-hearted

Students and beginners, proceed at your own thrill — this is not a walk in the park

This article explains how a computer becomes capable of running an operating system by gradually evolving the system architecture. We begin with extremely simple hardware designs (similar to early computers in the 1970s) and progressively add components until we reach a design capable of running modern operating systems like Linux.

We will build our understanding through three architectures:

  1. CPU + RAM

  2. CPU + RAM + ROM

  3. CPU + RAM + ROM + Persistent Storage

Each stage solves a fundamental problem in computer startup.

Design Architecture 1: CPU + RAM

The Goal: Establish a basic hardware execution environment where the processor can read and execute instructions.

First, we will sketch a simple diagram of the current architecture, which contains only a CPU and RAM.

In the above diagram, each component performs the following functions.

Note: The current scenario illustrates 1970s computer architecture without any modern hardware/software design functionalities like Parallel Processing, Multiprocessing, Threading, Interrupts, etc. We will cover each design paradigm as we move on.

CPU

  • Can execute instructions from a fixed memory address (called the reset vector).

In this simplified architecture we assume that the reset vector points into RAM. Real computers instead map firmware (ROM) at the reset vector so that valid instructions exist immediately after power-on.

RAM

  • Can store instructions and data, but is volatile (its contents are lost on a power cycle)

Power Cycle

  • A power cycle means turning a device off completely and then turning it back on.

  • It's used to reset the hardware and bring the system back to a known initial state.

  • This violently purges all temporary memory (RAM), CPU registers, caches, and hardware latches.

  • It's like giving the entire system a fresh start — especially useful when the system is frozen, behaving unexpectedly, or needs to reinitialize hardware.

Anatomy of a Power Cycle

Here is the exact CPU sequence during a power cycle:

1. Power is removed

  • When the system turns off, the CPU loses power.

  • All its internal states (registers, cache, control logic) are erased.

  • The CPU becomes completely inactive.

2. Power is restored

  • Once power is turned on again:

  • The Power Supply Unit (PSU) stabilizes and sends a Power Good electrical signal to the motherboard.

  • This tells the CPU: "Voltage levels are stable - you can start now."

3. CPU reset sequence begins

  • The CPU automatically starts executing code from a fixed memory address called the reset vector.

4. CPU state after power cycle

  • When the CPU restarts:

    • All registers are set to default values.

    • The instruction pointer (IP) is set to the reset vector.

    • Caches and buffers are empty.

    • The CPU operates as if it's being used for the first time (a completely blank state).

Current Design

  • Upon power-on, the CPU immediately attempts to fetch its first instruction from the hardcoded reset vector

Problem - Volatility Trap

  • Because RAM is volatile, its contents at power-on are random ("indeterminate") until explicitly initialized; they are not guaranteed to be zero.

  • Therefore, the memory at the reset vector, where the CPU expects its very first instruction, contains random garbage.

  • If the first instruction is garbage or invalid, the CPU may execute undefined instructions, trigger an invalid opcode exception, or simply hang.

  • The CPU cannot safely execute empty or random RAM. We need a valid instruction already in RAM at the reset vector.

  • How do we ensure that the RAM contains a valid instruction at the reset vector?

Solution - Manual Bootstrapping

  • The Approach: The operator must manually inject valid code into RAM before letting the CPU execute.

  • The Steps:

    • The operator flips physical toggle switches on a front panel, or uses a paper tape, punched card reader, or console input to feed binary instructions directly into the RAM.

    • Once this small "bootstrap" code is in RAM, the CPU is released to start executing from the reset vector.

  • The Altair 8800 (1975) required users to enter about 20–30 bytes of machine code using front-panel switches.

  • Example:

    loop: IN    port      ; read one byte from the input device
          STORE memory    ; store it at the next RAM location
          JUMP  loop      ; repeat for the next byte
    
  • After entering it, the operator would start execution and the program would load BASIC from tape.

  • This bootstrap code is simply a small program manually entered into RAM so the CPU has valid instructions to execute. At this stage, it may perform simple tasks such as testing the system or running a basic program.

Limitations

  • Because RAM gets wiped out on every power cycle, an operator has to manually punch this code in every single time the computer restarts.

  • Is it possible to automate the bootstrap process so that manual code entry is no longer required on every boot?

Design Architecture 2: CPU + RAM + ROM

The Goal: Automate the bootstrap process so manual code entry is no longer required on every boot.

To solve the volatility trap of early computers, a new hardware component, ROM (Read-Only Memory), was introduced into the earlier CPU + RAM design.

CPU

  • Executes instructions from a fixed memory address (the reset vector).

  • Can be later configured to execute arbitrary code from any memory location.

RAM

  • Can store instructions/data, but is volatile (its contents are lost on a power cycle)

  • Used by the CPU to execute code once it has been loaded from ROM.

ROM

  • Stores fixed code that cannot be easily modified.

  • Usually contains the firmware.

  • The code stored in ROM (i.e. firmware) contains the instructions that tell the CPU to move data into the RAM.

Current Design

  • Instead of punching code into RAM, we embed the bootstrap code permanently into ROM. We map the CPU's reset vector to point to the ROM chip.

  • Execution: When the CPU powers on, it begins executing instructions from the reset vector. On many systems, the instruction at the reset vector is a small jump that transfers control to the firmware stored in ROM. The firmware performs basic hardware initialization and then loads a small bootstrap program into RAM.

  • The Handover: Once the firmware has loaded the bootstrap program into RAM, control is transferred to it. The CPU then begins executing the program from RAM, which continues the system startup process.

  • Advantage: No need to directly punch the code into the RAM every time the system restarts.
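The key property this architecture relies on can be sketched in a few lines of Python. This is a toy model, not real hardware behavior: the sizes, the placeholder firmware byte, and mapping the reset vector to ROM address 0 are all illustrative assumptions.

```python
import random

ROM_SIZE, RAM_SIZE = 16, 64
RESET_VECTOR = 0  # in this toy model the reset vector is mapped to ROM address 0

# ROM contents are fixed at manufacture time and survive power cycles.
rom = [0xEA] * ROM_SIZE  # placeholder firmware bytes

def power_cycle():
    """Model a power cycle: RAM is volatile, so it comes back as random garbage."""
    return [random.randrange(256) for _ in range(RAM_SIZE)]

ram = power_cycle()
ram = power_cycle()  # cycle power again: RAM is re-randomized each time...

# ...but ROM is unchanged, so the CPU always finds valid
# instructions waiting at the reset vector.
assert rom[RESET_VECTOR] == 0xEA
assert rom == [0xEA] * ROM_SIZE
```

No matter how many times the model is power-cycled, the ROM contents are identical, which is exactly why pointing the reset vector at ROM removes the need for manual bootstrapping.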

Problem - Rigidity of Fixed Code

  • Fixed Code: ROM stores fixed code, so it cannot be modified.

  • No Flexibility: A computing setup that requires customization or flexibility cannot change the bootstrap code burned into the ROM.

  • Result: No scope for customizing the bootstrap process.

  • How do we customize the bootstrap process?

Design Architecture 3: CPU + RAM + ROM + Long Storage Device

The Goal: To gain the flexibility to change or update the bootstrap code without replacing the physical ROM hardware, and to enable the loading of larger, more complex programs.

In this design, we solve the "Fixed Code" problem of Architecture 2 by adding a Long Storage Device (e.g., Hard Drive, SSD, or Magnetic Tape).

CPU

  • Executes instructions from a fixed memory address (the reset vector).

  • Follows the instructions in ROM to fetch the program from the Storage Device and load it into the RAM.

  • Can be later configured to execute arbitrary code from any memory location.

RAM

  • Can store instructions/data, but is volatile (its contents are lost on a power cycle)

  • Used by the CPU to execute the program loaded from the storage device.

ROM

  • Stores fixed code that cannot be easily modified.

  • Contains the instructions needed to initialize the Long Storage Device.

  • Provides instructions to allow CPU to read from storage device

  • Provides instructions to allow CPU to write to storage device

  • If the ROM software (bootstrap code) is written to support reading from an external storage device, then it can:

    • Access the storage device

    • Read the program or instructions from it.

    • Load those instructions into a designated execution area in RAM (such as 0x7C00 for BIOS systems).

    • Trigger the CPU to start executing the loaded code from RAM.

Long Storage Device

  • A persistent storage medium such as a hard disk, SSD, or magnetic tape that retains data even after power is removed.

  • Stores programs permanently. This may include bootstrap programs or other executable software that the firmware can load into RAM.

  • Unlike ROM, the data here can be updated or changed by the user at any time.

Current Design - The Storage Handoff

  • Power On: The CPU wakes up and begins executing from a hardcoded address (the Reset Vector).

Early x86 CPUs like the 8086 used reset vector 0xFFFF0, while modern x86 processors use 0xFFFFFFF0.

  • Hardware Mapping: This address points directly to the ROM. No address translation occurs yet because the MMU (Memory Management Unit) is not yet active.

  • The Fetch: The chipset routes reads to the BIOS ROM, which contains the minimal instructions needed to "wake up" the rest of the hardware.

  • The Handover: The CPU executes the ROM code, which finds the program on the Long Storage Device and copies it into RAM.

  • Execution: The CPU stops reading from the ROM and begins running the program loaded from the storage device directly from the RAM.
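The whole handoff can be condensed into a minimal Python sketch. The dictionary-based "storage" and "RAM", the toy reset-vector value, and the three-byte program are all made up for illustration; only the load address 0x7C00 comes from the BIOS convention discussed in this article.

```python
# Toy model of the storage handoff: ROM firmware copies a program
# from persistent storage into RAM, then "jumps" to it.
RESET_VECTOR = 0xFFF0          # hardcoded start address (toy value)
LOAD_ADDRESS = 0x7C00          # where the firmware places the program in RAM

storage = {"program": [0x01, 0x02, 0x03]}   # persistent program bytes
ram = {}                                     # volatile, empty after power-on

def firmware():
    """ROM code: copy the program from storage into RAM, return its entry point."""
    for offset, byte in enumerate(storage["program"]):
        ram[LOAD_ADDRESS + offset] = byte
    return LOAD_ADDRESS

entry_point = firmware()       # the fetch + copy
pc = entry_point               # the handover: CPU now executes from RAM
assert pc == 0x7C00
assert [ram[pc + i] for i in range(3)] == [0x01, 0x02, 0x03]
```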

Note: ROM configures the hardware so that the initial bootstrap code is available to the CPU at the reset vector, typically using memory-mapped I/O. We will revisit this section in upcoming lectures to study concepts such as the MMU more thoroughly.

Inference

  • So far, we have satisfied the hardware requirements necessary to run a program like the Operating System, one that can control many other smaller programs

  • From here onwards, we will focus on the historical and technical context of each functionality required to build the Operating System (our universal program)

  • In a modern BIOS system, the ROM isn't just a "loader" - it's an interface. It provides basic services (like "write this text to the screen") that the early bootloader uses before the Operating System is fully awake. For more information refer to the BIOS Boot Sequence at the end of this blog.

  • This architecture is still used today. When you press the power button on a Linux computer, the CPU first executes firmware stored in ROM. The firmware then loads a bootloader such as GRUB from disk, which finally loads the Linux kernel into RAM.

On most Linux systems:

  • Bootloader: GRUB

  • Kernel image: /boot/vmlinuz

  • Initial RAM filesystem: /boot/initramfs.img (the exact filename varies by distribution)

Historical Context Note: This architecture allowed computers to move from being single-purpose tools (like a calculator) to general-purpose machines (like a PC) that could run entirely different software simply by swapping the contents of the Storage Device.

File System

The Goal: Store multiple different programs on the same long storage device.

Problem - Blind Storage

  • Currently, if we want to run a different program from the storage medium, we have to power cycle the computer or rewrite the bootstrap program. The bootstrap program is currently hardcoded to pull a single, specific block of data from the drive.

  • Is there a way to store multiple programs on the same storage device?

Solution - Book-Keeping (Mini File System)

  • The Approach: Implement a simple book-keeping system on the storage device to track what program is stored at which memory location.

  • To store multiple programs, we'll introduce a simple book-keeping system on the storage device: a mini file system with three fields:

| Field         | Description                                |
|---------------|--------------------------------------------|
| program_id    | Program identifier                         |
| start_address | Starting address of the program in storage |
| end_address   | Ending address of the program in storage   |

  • The initial book-keeping system will look like this
    <program id><start address><end address>

  • This book-keeping helps us remember where each program lives on the storage device.

  • It's the foundation of our primitive file system (the conceptual ancestor of FAT32)
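The book-keeping table above can be sketched as a tiny Python structure. This is a toy model: the record format follows the three fields described here, but the program IDs and byte offsets are invented for illustration.

```python
# A toy book-keeping table: one record per program on the storage device.
# Addresses are illustrative byte offsets into the storage medium.
bookkeeping = [
    {"program_id": 1, "start_address": 0x0000, "end_address": 0x01FF},
    {"program_id": 2, "start_address": 0x0200, "end_address": 0x05FF},
]

def locate(program_id):
    """Look up where a program lives on the storage device."""
    for record in bookkeeping:
        if record["program_id"] == program_id:
            return record["start_address"], record["end_address"]
    raise KeyError(f"no program with id {program_id}")

start, end = locate(2)
assert (start, end) == (0x0200, 0x05FF)
```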

Meta-Program

The Goal: Switch between multiple programs stored in our file system on the same storage device, without requiring a power cycle.

Problem - Loading Multiple Programs

  • Earlier we stored multiple programs into the same file system.

  • But how do we switch between programs on the same storage device?

Solution - OS Kernel

  • The Approach: Implement a small meta-program (an OS kernel) that manages the execution of multiple programs. This program resides in RAM while the system is running and coordinates the loading and execution of programs from the storage device.

  • With this addition, the RAM architecture is modified to include both the kernel and the currently running programs.

  • This meta-program provides additional capabilities beyond the basic bootstrap program.

As operating systems evolved, the bootstrap program eventually became specialized into what we now call a bootloader - a program whose primary task is to locate and load the operating system kernel.

Features of Meta-Program

  • It can perform read/write operations on the storage medium (either directly or through the ROM)

  • It understands the bookkeeping data (i.e., it includes bookkeeping logic) present on the storage medium

  • As an extension of the book-keeping logic, it can load any program using its unique ID

Steps to Load a Program by ID

  1. Using the given program ID, locate its start address and end address on the storage device

  2. Read data sequentially from the start address to the end address, copying it into RAM

  3. Execute a direct jump instruction (JMP <address in RAM>) to transfer control to the newly loaded program

  4. Wait until execution finishes before loading the next program
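The four steps above can be sketched as a small Python loader. Everything here is a toy model: the storage contents, the ID-to-address table, and the RAM load address are made-up values chosen only to make the locate-copy-jump sequence concrete.

```python
LOAD_ADDRESS = 0x1000   # toy RAM address where programs are loaded

storage = bytes(range(32))          # toy storage device contents
table = {1: (0, 7), 2: (8, 15)}     # program_id -> (start, end) offsets

ram = {}

def load_and_run(program_id):
    """Steps 1-3: locate the program, copy it into RAM, 'jump' to it."""
    start, end = table[program_id]                 # 1. look up addresses by ID
    for i, byte in enumerate(storage[start:end + 1]):
        ram[LOAD_ADDRESS + i] = byte               # 2. copy bytes into RAM
    return LOAD_ADDRESS                            # 3. the JMP target

pc = load_and_run(2)
assert pc == 0x1000
assert ram[0x1000] == 8 and ram[0x1007] == 15
```

Step 4 (waiting for execution to finish before loading the next program) is exactly the limitation the System Calls section addresses next.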

System Calls

The Goal: Allow a user program to safely return control to the Meta-Program (Operating System Kernel) when it finishes execution or requires a system-level operation.

Problem - The Point of No Return

  • Once a program is loaded and begins executing, the CPU continues running that program's instructions.

  • However, situations may arise where the program needs to:

    • Exit after completing its work

    • Read or write data from the storage device

    • Display output

    • Perform other system-level operations

  • At this point, control must return to the Meta-Program so it can decide what to do next.

  • If the program simply executes a CPU halt instruction such as:

    HLT
    

    the processor stops entirely, bringing the whole system to a halt.

  • This creates a problem: how can a program safely hand control back to the Meta-Program without stopping the entire machine?

Solution - Standardized Control Transfer

  • The solution is to create a standardized mechanism that allows programs to transfer control back to the Meta-Program in a controlled way.

  • This mechanism forms the basis of what we now call System Calls.

The Approach - A Simple API

  • The Meta-Program exposes a set of predefined memory addresses that act like functions.

  • Each address corresponds to a specific operation handled by the Meta-Program.

  • Examples might include:

    • exit()

    • read()

    • write()

    • display()

  • These addresses act as entry points into the Meta-Program.

Handling Control using API

  • The Meta-Program reserves specific memory locations that act as known jump destinations. These addresses serve as entry points into the Meta-Program for handling operations such as exiting a program or performing I/O.

  • The compiler for this system recognizes API calls such as:

    exit()
    
  • When the compiler encounters such a call, it emits a JMP instruction that transfers control to the predefined address inside the Meta-Program.

  • Example conceptually:

    exit()  →  JMP 0x2000   (address inside the Meta-Program)
    
  • When this instruction executes, control transfers from the user program back to the Meta-Program. Since the Meta-Program has reserved that address to handle the request, it can safely process the operation.

  • The Meta-Program then performs the requested task and decides which program should run next.
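The mechanism can be modeled as a dispatch table in Python. The handler addresses and return strings are invented for illustration (0x2000 matches the conceptual exit() example above); a real kernel would run machine code at those addresses rather than call Python functions.

```python
# Toy syscall table: fixed "addresses" inside the Meta-Program,
# each mapped to a handler for one operation.
def handle_exit():
    return "exit handled"

def handle_read():
    return "read handled"

syscall_table = {0x2000: handle_exit, 0x2004: handle_read}

def jmp(address):
    """Model of the JMP into the Meta-Program: dispatch to the reserved handler."""
    return syscall_table[address]()

# A user program's exit() compiles down to a jump to the reserved address.
assert jmp(0x2000) == "exit handled"
```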

Result

  • Instead of halting the CPU, the program voluntarily surrenders control back to the Meta-Program.

  • This mechanism forms the early conceptual foundation of System Calls (syscalls) - the structured interface through which programs communicate with the operating system kernel.

  • Syscalls thus serve as the API between user programs and the kernel.

Interrupts and Hardware Timers

The Goal: Maximize CPU efficiency and enable multitasking.

Problem - Wasted CPU Cycles

  • At this stage of our system, the CPU can execute only one program at a time.

  • However, programs often need to wait for hardware operations such as:

    • disk I/O

    • network responses

    • input/output devices

  • While the program waits, the CPU simply sits idle, doing nothing.

  • This wastes valuable processing power.

  • So the question becomes:

Can we make the CPU do useful work while one program is waiting?

  • Even though we only have one CPU, we would like to give the illusion that multiple programs are running at once.

Solution - Hardware Timers and Interrupts

  • Add a timer, bro ☺︎

  • The solution is to introduce a hardware timer.

  • A hardware timer is a small hardware component connected to the CPU that periodically generates a signal.

The timer is usually a separate hardware device, not part of the CPU core. It is connected to the CPU through interrupt lines.

Examples: Programmable Interval Timer (PIT), Local APIC timer, and HPET

Timer Functionality

  • A hardware timer is connected to the CPU.

  • The timer’s frequency is configurable, meaning we can control how often it triggers.

  • When the timer trips, it generates an interrupt signal to the CPU.

  • The CPU transfers control to a predefined memory address.

  • We modify the meta-program to include a special piece of code (the interrupt handler) at that predefined memory address - this code runs every time the timer triggers.

  • Inside this code, we implement the logic to shuffle or switch between processes.

Note: In real systems, the CPU does not directly know the interrupt handler address. Instead, it looks up the address in a special data structure called the Interrupt Vector Table (IVT) or Interrupt Descriptor Table (IDT).
This table maps interrupt signals (like the timer interrupt) to the corresponding interrupt handler in the operating system.

Example: x86 systems use an Interrupt Descriptor Table (IDT)
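The table lookup can be sketched in Python. This is a toy model: the handler is a plain function, and the vector number 0x20 is only illustrative (it is a common choice for the remapped timer interrupt on x86 PCs, but the exact number depends on how the system programs its interrupt controller).

```python
# Toy interrupt vector table: interrupt number -> handler.
TIMER_IRQ = 0x20   # illustrative vector number for the timer interrupt

def timer_handler():
    """The Meta-Program's code that runs on every timer tick."""
    return "switch processes"

ivt = {TIMER_IRQ: timer_handler}

def raise_interrupt(number):
    """When a device raises an interrupt, the CPU looks up and runs its handler."""
    return ivt[number]()

assert raise_interrupt(TIMER_IRQ) == "switch processes"
```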

Interrupt

  • When the timer trips, it interrupts the CPU from whatever it was doing.

  • This event is called an interrupt.

  • The special code that runs in response is called an interrupt handler.

  • The interrupt handler allows the Meta-Program to:

    • pause the currently running program

    • select another program to run

    • resume execution of that program

Result

  • The timer periodically forces control back to the Meta-Program, allowing it to decide which program should run next.

  • This mechanism enables context switching, which is the foundation of multitasking operating systems.

Context Switching

Program vs Process

To understand how the interrupt handler enables multitasking, we must first distinguish between two terms.

Program

A program is passive code stored on disk.

Process

A process is an active instance of that program running in memory. It has its own CPU registers, stack, and current execution point.

Problem

  • How do we shuffle or switch between processes?

States of Process

  • Our processes will have two main states:

    • Running: The process is currently being executed by the CPU

    • Suspended: The process has temporarily paused its execution, waiting to be resumed later

  • To switch between processes, the system must keep track of the state of each process.

Components of Process State

A process's state includes:

  • Execution status: Whether it's running or suspended.

  • CPU registers: Program Counter (PC), Stack Pointer (SP), general-purpose registers, flags, etc.

These values together form the context of the process.

Solution - Context Switching

  • To switch between processes, the system saves the state of the currently running process and restores the state of a previously suspended process.

Steps to shuffle processes

  • Save the state of the current (running) process

    • Take the currently running process and copy all relevant CPU registers into memory reserved for that process. Example registers saved:

      • Program Counter (PC)

      • Stack Pointer (SP)

      • general-purpose registers

      • flags

    • Mark the process as Suspended.

  • Restore the state of the next (suspended) process

    • Take another process that is currently Suspended.

    • Copy its saved CPU registers from memory back into the CPU.

    • Mark the process as Running.

  • Resume execution

    • The CPU resumes execution from the restored Program Counter (PC).

    • Conceptually, this is equivalent to:

      JMP <address in process memory>

    • Execution continues exactly where the process left off.
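The save/restore steps above can be sketched in Python. This is a toy model: the two processes, their register values, and the use of dictionaries in place of real CPU registers and per-process memory areas are all illustrative assumptions.

```python
# Toy context switch: save the running process's registers, restore another's.
processes = {
    "A": {"state": "Running",   "registers": {"PC": 0x1000, "SP": 0x8000}},
    "B": {"state": "Suspended", "registers": {"PC": 0x2000, "SP": 0x9000}},
}
cpu = dict(processes["A"]["registers"])   # registers currently live in the CPU

def context_switch(current, nxt):
    """Save the current context to memory, then load the next one into the CPU."""
    processes[current]["registers"] = dict(cpu)   # save current context
    processes[current]["state"] = "Suspended"
    cpu.clear()
    cpu.update(processes[nxt]["registers"])       # restore next context
    processes[nxt]["state"] = "Running"
    return cpu["PC"]                              # resume from the restored PC

pc = context_switch("A", "B")
assert pc == 0x2000
assert processes["A"]["state"] == "Suspended"
assert processes["B"]["state"] == "Running"
```

Switching back later restores A's saved program counter, so A resumes exactly where it left off.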

Result

  • This mechanism is called Context Switching.

  • By repeatedly saving and restoring process contexts, the operating system can alternate between multiple processes.

  • Even though the CPU executes only one instruction at a time, this rapid switching gives the illusion that multiple programs are running simultaneously.

Inference

  • We have now built a system that uses hardware interrupts to periodically regain control of the CPU, allowing the Meta-Program to switch between processes.

  • By saving and restoring process context, the system achieves multitasking on a single processor.

Next Steps

Now that multiple processes share the same RAM, an important question arises:

What prevents one program from accidentally overwriting the memory of another — or even the Meta-Program itself?

This leads to the concept of memory protection and isolation.

Extra Notes

The BIOS Boot Sequence (The Legacy Path)

1. The Wake-Up (ROM + CPU)

When you press the power button, the CPU begins execution at the hardcoded Reset Vector, which is mapped to firmware stored in ROM.

  • The BIOS (Basic Input/Output System) starts running.

  • It performs the POST (Power-On Self Test) to verify that essential hardware components, such as the RAM, CPU, and storage controllers, are functioning correctly.

2. The Search (ROM → Long Storage)

The BIOS contains a Boot Order stored in its settings.

  • It checks the configured storage devices (hard drive, SSD, etc.) and searches for the first sector of the disk, known as Sector 0.

  • This 512-byte block is called the Master Boot Record (MBR).

3. The Micro-Load (CPU moves Storage → RAM)

The BIOS firmware reads those 512 bytes from disk and copies them into RAM at address:

0x7C00

This address was chosen for the original IBM PC in the early 1980s and became a de facto standard followed by later BIOS implementations.

4. The Handover (The Jump)

Once the MBR is loaded into RAM, the BIOS transfers control by jumping to that address:

JMP 0x7C00

The CPU stops executing firmware code in ROM and begins executing the bootloader code in RAM.

5. The Multi-Stage Loading (The OS takes over)

Since 512 bytes is far too small for an entire operating system, the code in the MBR acts as a Stage-1 bootloader.

Its job is to:

  • Locate the rest of the bootloader on disk

  • Load the operating system kernel into RAM

  • Transfer control to the kernel.

On most Linux systems this bootloader is GRUB (Grand Unified Bootloader), which then loads the Linux kernel located in /boot/vmlinuz.
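The BIOS side of this sequence can be condensed into a short Python sketch. The in-memory "MBR" is a toy stand-in for a real disk sector; the sector size (512 bytes), the load address 0x7C00, and the 0x55 0xAA boot signature at offset 510 are the real BIOS conventions.

```python
# Toy BIOS loader: read sector 0 (the MBR), verify the boot signature,
# then "copy" it to 0x7C00 and jump there.
SECTOR_SIZE = 512
LOAD_ADDRESS = 0x7C00

mbr = bytearray(SECTOR_SIZE)       # stand-in for the disk's first sector
mbr[510], mbr[511] = 0x55, 0xAA    # boot signature a BIOS checks for

def bios_boot(sector):
    """Reject non-bootable sectors; otherwise return the jump target."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise RuntimeError("No bootable device found")
    return LOAD_ADDRESS            # the BIOS jumps here after copying the sector

assert bios_boot(bytes(mbr)) == 0x7C00
```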

To understand the boot process in more detail, refer to:
https://0xax.gitbooks.io/linux-insides/content/Booting/linux-bootstrap-1.html