Lecture 06: Multi-Threading
Everything about threads
🔹 Process vs Thread
Process
- A process is an independent program in execution.
- Each process has its own memory space (heap, stack, data, code segments).
- Processes do not share global variables by default (unless you use shared memory, pipes, sockets, etc.).
- Context switching between processes is heavy, because the OS must save/restore memory maps.
Thread
- A thread is a lightweight unit of execution inside a process.
- All threads in a process share the same memory space (code, heap, global variables).
- But each thread has its own stack (local variables and function calls).
- Context switching between threads is lighter, since memory is shared.
🔹 Global Variables in Threads vs Processes
In Processes
- If you define a global variable in one process, other processes cannot access it directly.
- Example:
#include <stdio.h>
#include <unistd.h>

int counter = 0; // global

int main() {
    if (fork() == 0) {
        // Child process: works on its own copy of counter
        counter++;
        printf("Child: %d\n", counter);
    } else {
        // Parent process
        counter++;
        printf("Parent: %d\n", counter);
    }
    return 0;
}
Output: both the child and the parent print 1, because each has a separate copy of counter after fork().
In Threads
- All threads in the same process share global variables.
- Example:
#include <stdio.h>
#include <pthread.h>

int counter = 0; // global, shared by all threads

void* thread_func(void* arg) {
    counter++; // unsynchronized update: a race condition
    printf("Thread: %d\n", counter);
    return NULL;
}

int main() {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread_func, NULL);
    pthread_create(&t2, NULL, thread_func, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
⚠️ Here both threads share counter, so:
- Without synchronization → race conditions may occur.
- With locks (mutex/semaphore) → safe increments.
🔹 Diagram
1. Processes (Separate Memory)
+------------------+ +------------------+
| Process A | | Process B |
| | | |
| Global Var = 5 | | Global Var = 5 | <-- Separate copies
| Heap | | Heap |
| Stack | | Stack |
| Code | | Code |
+------------------+ +------------------+
- Even if both declare the same global variable, they are not shared.
2. Threads (Shared Memory)
+--------------------------------------------+
| Process |
| |
| Global Var = 5 <-----------------------+ |
| Heap (Shared across threads) | |
| Code | |
| |
| +------------+ +------------+ |
| | Thread 1 | | Thread 2 | |
| | Stack | | Stack | |
| +------------+ +------------+ |
+--------------------------------------------+
- Global variables are shared, but each thread has its own stack.
🔹 Key Differences
Feature | Processes | Threads |
---|---|---|
Memory space | Independent | Shared |
Global variables | Separate | Shared |
Overhead | High (context switch) | Low |
Communication | IPC needed (pipes, sockets, etc.) | Direct via shared memory |
Fault tolerance | Safer (one crash doesn't kill others) | Riskier (one crash can kill the process) |
🔹 Issues with Global Variables in Threads
- Race Conditions → two threads updating at the same time → inconsistent data.
- Example: counter++ is not atomic (it is a load, an increment, and a store, which can interleave between threads).
- Synchronization Needed → use mutexes, semaphores, or atomic operations.
- Deadlocks → improper locking order can freeze threads.
- Scalability Problems → too many threads sharing global state → contention.
🔹 Benefits of Threads
- Resource Sharing
- All threads of a process share code, data, and files.
- Easier communication compared to inter-process communication (IPC).
- Responsiveness
- If one thread is blocked (e.g., waiting for I/O), others can continue execution.
- Example: In a web browser, one thread renders the page while another downloads images.
- Faster Context Switching
- Switching between threads is lighter than switching between processes since memory and resources are shared.
- Economical
- Creating and managing threads requires fewer system resources than processes.
- Better Utilization of Multiprocessor Architectures
- Threads can run in parallel on multiple CPU cores → higher throughput.
- Scalability
- Threads allow applications (like servers, databases, simulations) to handle many tasks at once efficiently.
🔹 Types of Threads
1. User-Level Threads (ULTs)
- Managed by user-level libraries, not the OS kernel.
- OS sees only the main process, not individual threads.
- Advantages:
- Fast to create, switch, and schedule (no kernel involvement).
- Can be implemented on any OS (even if the OS doesn't support threads).
- Disadvantages:
- If one thread makes a blocking system call → the entire process is blocked.
- Cannot take advantage of multiprocessor systems (kernel only schedules the process as one unit).
2. Kernel-Level Threads (KLTs)
- Managed directly by the operating system kernel.
- OS schedules and manages threads individually.
- Advantages:
- True parallelism on multiprocessor systems.
- If one thread blocks, others in the same process can continue.
- Disadvantages:
- Slower to create and manage (kernel overhead).
- More memory and resources required.
3. Hybrid Threads
- Combine user-level and kernel-level threads.
- User-level library creates and manages threads, but they are mapped onto kernel threads.
- Example models:
- Many-to-One: Many user threads mapped to one kernel thread.
- One-to-One: Each user thread maps to a kernel thread (used in modern OS like Windows, Linux).
- Many-to-Many: Many user threads mapped to many kernel threads (flexible, but complex).
- Advantages:
- Balances efficiency and parallelism.
- Better performance for large-scale applications.
🔹 How Threads Work in the OS
- Thread Creation
- The OS (or a user-level library) allocates a stack and control block for each thread.
- Scheduling
- User-level threads: Scheduler is part of the thread library.
- Kernel-level threads: Scheduler is part of the OS kernel.
- Hybrid: Both user library and kernel cooperate.
- Context Switching
- Saves and restores the threadβs registers, stack pointer, and program counter.
- Faster than process context switch since memory is shared.
- Synchronization
- Since threads share global variables and memory, the OS provides synchronization primitives:
- Mutexes
- Semaphores
- Monitors
- Spinlocks
- These prevent race conditions and ensure consistency.
- Communication
- Easier compared to processes (no need for IPC).
- Threads communicate through shared memory (global variables, heap).
- Execution on Multiprocessors
- Kernel-level and hybrid threads can be scheduled on different CPUs β true parallelism.
- OS thread scheduler balances load across cores.
🔹 Diagram Idea (Types of Threads)
User-Level Threads (ULTs)
+-------------------+
| Process |
| Thread 1 | Managed by user library
| Thread 2 | OS sees only one process
+-------------------+
Kernel-Level Threads (KLTs)
+-------------------+
| Process |
| Thread 1 (OS) | Managed by kernel
| Thread 2 (OS) | OS schedules independently
+-------------------+
Hybrid Threads
+-------------------+
| Process |
| User Thread 1 -> Kernel Thread A
| User Thread 2 -> Kernel Thread B
+-------------------+
Diagram
This Mermaid diagram contrasts a single-threaded process with a multi-threaded one:
graph TD
subgraph P1 [Single-Threaded Process]
direction TB
Code1[Code Segment]
Data1[Data Segment]
Heap1[Heap]
Stack1[Stack Thread 1]
end
subgraph P2 [Multi-Threaded Process]
direction TB
Code2[Code Segment shared]
Data2[Data Segment shared]
Heap2[Heap shared]
subgraph Threads
direction LR
T1[Stack Thread 1]
T2[Stack Thread 2]
T3[Stack Thread 3]
end
end
P1 -->|Independent| P2
🔹 Process Suspension
- Suspension = process is put into a waiting state (paused), but not destroyed.
- Reasons:
- I/O wait (waiting for disk, network, etc.)
- Resource request (waiting for memory, lock, semaphore)
- User/system request (admin pauses a process)
- Parent process suspension (child often suspended with it)
- Suspended process stays in main memory or can be swapped to disk.
🔹 Thread Suspension
- A thread inside a process can also be suspended:
- E.g., waiting for I/O or waiting for a lock (mutex).
- Since threads share memory, suspension of one thread does not suspend the whole process (other threads keep running).
- ⚠️ Exception: If the main thread (the one that created the others) suspends itself, the OS may block the entire process.
🔹 Process Termination
- When a process terminates:
- All threads inside it are also terminated automatically.
- OS reclaims resources: memory, files, I/O, descriptors.
- Types:
- Normal termination β process finishes execution.
- Error termination β program crashes, illegal instruction, divide by zero, etc.
- Killed by another process → e.g., kill -9 in Linux.
- Parent termination → may cause child termination (depends on the OS).
🔹 Thread Termination
- A thread terminates when:
- It finishes its task (normal exit).
- It is explicitly killed (e.g., pthread_cancel in POSIX).
- The process itself terminates → all threads inside die.
- Thread termination may cause:
- Resource cleanup (stack freed, registers released).
- Impact on the process: if it's the main thread, the entire process normally ends; in POSIX, the main thread can call pthread_exit() instead of returning so the remaining threads keep running.
🔹 Linking Between Suspension & Termination
- Process → Threads
- Suspending a process = all threads inside are suspended.
- Terminating a process = all threads inside are terminated.
- Thread → Process
- Suspending a thread = only that thread is paused (others may still run).
- Terminating the main thread can end the whole process.
🔹 Tree Structure (Multi-Threaded Process)
Think of a process tree like a family:
Process (Parent)
│
└── Thread 1 (Main thread)
    ├── Thread 2 (Worker)
    └── Thread 3 (Worker)
- If the Process dies → all threads are killed.
- If Thread 1 dies (and it's the main thread) → the process usually dies → all other threads die too.
- If Thread 2 or 3 dies → only that branch ends; the process continues.
🔹 Diagram (Mermaid)
Here's a diagram for clarity:
graph TD
subgraph P [Process]
T1[Thread 1 - Main]
T2[Thread 2]
T3[Thread 3]
end
P -->|Termination| X[All threads die]
T1 -->|If main thread ends| X
T2 -->|Worker thread ends| P
T3 -->|Worker thread ends| P
🔹 Example Scenarios
Example 1: Web Browser
- Process: Browser main process.
- Threads:
- UI thread β renders interface.
- Network thread β downloads files.
- Tab threads β run web pages.
- If the browser process crashes → all tabs & downloads stop.
- If one tab thread crashes → only that tab stops; the browser continues.
Example 2: Word Processor
- Process: MS Word.
- Threads:
- Main thread → handles user input.
- Auto-save thread → saves the file every 5 min.
- Spell-check thread → checks text in the background.
- If the process is suspended (e.g., the OS swaps it out) → all threads pause.
- If the auto-save thread dies → the program still works, but no backup saves.
🧵 User-Level Threads (ULTs)
🔹 What They Are
- Managed in user space (library level), not by the OS kernel.
- The OS sees only the process → it doesn't know threads exist inside it.
- A threading library (e.g., GNU Pth, or the "green threads" of early Java) does the scheduling & switching.
🔹 How They Work
- OS → Process
- The OS scheduler gives CPU time to the whole process.
- It does not know about internal threads.
- Process → Threads
- Inside the process, a thread library (user-level) decides which thread runs.
- Context switching between threads is done by the library, not the OS.
- Execution Flow
Hardware (CPU) → OS Kernel → Process → Thread Library → Thread Execution
🔹 Advantages of ULTs
- Fast switching: No kernel involvement → just library calls.
- Portable: Same library works across OSes.
- Custom scheduling: Process can implement its own policies (priority, round robin, etc.).
🔹 Issues / Limitations
- Blocking Problem ⚠️
- If one ULT makes a blocking system call (e.g., read() on disk), the entire process blocks.
- Why? Because the OS only sees one process, not multiple threads.
- → All threads inside are paused.
- No true parallelism 🚫
- On a multicore CPU, the OS schedules the whole process onto one core at a time.
- So threads cannot run on multiple cores simultaneously.
- Resource allocation blind spot
- The OS allocates resources (CPU time, memory) per process, not per thread.
- Threads have to share what the process gets.
🔹 Resource Allocation to ULTs
- Memory: All ULTs share code, data, heap. Each has its own stack.
- CPU time: The OS gives a time slice to the process. The thread library divides it among threads.
- I/O: OS handles I/O at process level, so one blocking call can freeze everything.
🔹 Flow Diagram (ULT Execution)
flowchart TD
CPU --> OS["OS Scheduler (sees Process)"]
OS --> P["Process Control Block"]
P --> TL["User Thread Library"]
TL --> T1["Thread 1"]
TL --> T2["Thread 2"]
TL --> T3["Thread 3"]
T1 -.->|Blocking I/O| P
T2 -.->|Shares Process resources| P
T3 -.->|Shares Process resources| P
🔹 Example: Web Server with ULTs
- Suppose a web server process runs 100 user-level threads for client connections.
- OS only sees 1 process.
- If one thread does a blocking read() → the OS pauses the whole process → all 100 clients freeze.
🧵 Kernel-Level Threads (KLTs)
🔹 What They Are
- Managed directly by the OS kernel.
- OS scheduler sees each thread as an independent schedulable entity.
- Examples: Windows threads, Linux pthreads (mapped 1:1 to kernel threads by NPTL), Solaris threads.
🔹 How They Work (Flow)
- Hardware → OS Kernel
- The CPU runs instructions and traps into the kernel for scheduling.
- OS Kernel → Threads
- Kernel thread table stores registers, state, stack pointer for each thread.
- OS scheduler picks which thread (not just which process) runs next.
- Threads → Process
- Threads still belong to a process (share its address space, code, heap).
- But they can be scheduled independently across CPUs.
👉 Execution Path:
Hardware (CPU) → OS Kernel Scheduler → Thread (inside Process)
🔹 Advantages of KLTs
- True Parallelism 🖥️
- Multiple threads of the same process can run simultaneously on multiple cores.
- Example: Browser rendering one tab while downloading in another.
- Non-blocking behavior 🚦
- If one thread makes a blocking system call, the OS can still schedule another thread from the same process.
- Fine-grained scheduling
- OS decides priority, time-slicing, and load balancing across CPUs.
🔹 Disadvantages of KLTs
- Higher overhead ⚠️
- Switching between kernel threads requires a mode switch (user ↔ kernel).
- More expensive than ULT context switching (which happens in user space).
- OS dependency
- Implementation depends on OS support → less portable than ULTs.
- Complex management
- Requires kernel data structures, synchronization, and more bookkeeping.
🔹 Resource Allocation
- Memory: Shared among threads, but each thread has its own stack & registers stored in the kernel.
- CPU time: Kernel scheduler assigns time to individual threads.
- I/O: One thread can block on I/O while others in the same process continue running.
🔹 Diagram (Kernel-Level Threads)
flowchart TD
CPU --> OS[OS Kernel Scheduler]
OS --> P[Process Control Block]
subgraph P [Process]
T1[Kernel Thread 1]
T2[Kernel Thread 2]
T3[Kernel Thread 3]
end
OS -.-> T1
OS -.-> T2
OS -.-> T3
🔹 Example: Database Server
- Process: MySQL server.
- Threads:
- Query thread (handling SQL request).
- I/O thread (reading from disk).
- Background cleanup thread.
- If the query thread blocks → the OS can still run the I/O thread.
- On a multicore machine, queries can run truly in parallel.
🔹 ULT vs KLT Quick Table
Feature | User-Level Threads (ULT) | Kernel-Level Threads (KLT) |
---|---|---|
Managed by | User-level library | OS kernel |
Visibility to OS | OS sees only process | OS sees each thread |
Context switch | Very fast (no kernel call) | Slower (requires kernel involvement) |
Blocking syscalls | Blocks whole process | Only blocks the calling thread |
Parallelism | Not true parallelism (one core only) | True parallelism across multiple cores |
Portability | High (library only) | Low (OS-specific support needed) |
🔄 Multithreading Models
Since we can have threads managed by the user space library (user-level threads, ULTs) and also by the kernel (kernel-level threads, KLTs), the OS needs a way to map one to the other. This mapping is what we call multithreading models.
There are three classic models:
1. Many-to-One Model
- How it works:
- Multiple user threads (ULTs) are mapped to a single kernel thread.
- The OS only sees one thread per process.
- The user-level thread library handles scheduling among user threads.
- Pros:
- Simple and efficient for context switching (only in user space).
- No need for kernel support for threading.
- Cons:
- Blocking problem: If one ULT makes a blocking system call, the whole process (all threads) is blocked.
- Cannot take advantage of multiprocessor systems since only one kernel thread runs.
- Diagram:
[ULT1] \
[ULT2] > ----> [Single KLT] ----> [CPU]
[ULT3] /
2. One-to-One Model
- How it works:
- Each user thread is mapped to its own kernel thread.
- The OS manages scheduling and execution directly.
- Pros:
- True parallelism on multiprocessors.
- If one thread blocks, others can still run.
- Cons:
- Overhead: Creating a user thread requires creating a kernel thread (more expensive).
- Some OSes limit the number of threads that can be created due to kernel resource usage.
- Diagram:
[ULT1] ---> [KLT1] ---> CPU
[ULT2] ---> [KLT2] ---> CPU
[ULT3] ---> [KLT3] ---> CPU
3. Many-to-Many Model
- How it works:
- Multiple user threads are mapped to multiple kernel threads.
- The thread library schedules user threads onto a smaller or equal pool of kernel threads.
- The OS and user library share responsibility for scheduling.
- Pros:
- Flexible: Can exploit parallelism but also avoid excessive kernel thread creation.
- Solves the blocking problem: if one ULT blocks, another can still be scheduled on a different KLT.
- Cons:
- More complex to implement (requires cooperation between user and kernel).
- Higher coordination overhead.
- Diagram:
[ULT1] \
[ULT2] > ---> [KLT1] ---> CPU
[ULT3] / \
\---> [KLT2] ---> CPU
4. Two-Level Model (Variation of Many-to-Many)
- How it works:
- Similar to Many-to-Many but allows some ULTs to be bound to specific KLTs.
- Gives flexibility and also efficiency for certain βcriticalβ threads.
- Example:
- A thread that performs real-time I/O may be bound to a kernel thread for direct execution, while other threads are multiplexed.
🔄 Workflow (Flow of Execution)
- The user creates a thread → the user-level library handles the creation.
- Depending on the model:
- In Many-to-One: only one kernel thread represents all user threads.
- In One-to-One: each ULT → a new KLT in the kernel.
- In Many-to-Many: the ULT scheduler assigns ULTs → available KLTs.
- The kernel schedules KLTs → maps them to CPUs.
- Execution happens → the ULT runs on top of its assigned KLT.
💡 Quick comparison table:
Model | Parallelism? | Blocking Issue? | Efficiency | Complexity |
---|---|---|---|---|
Many-to-One | ❌ No | ⚠️ Yes | High | Low |
One-to-One | ✅ Yes | ✅ No | Lower | Medium |
Many-to-Many | ✅ Yes | ✅ No | Medium | High |
Two-Level | ✅ Yes | ✅ No | Medium | Higher |