Info
This is a summary of Chapter 38 from the book Operating Systems: Three Easy Pieces by Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau. The chapter covers Redundant Arrays of Inexpensive Disks (RAID): the main RAID levels, their advantages and trade-offs, and their performance characteristics.
1. Introduction
Disks are often slow, limited in capacity, and vulnerable to failure. RAID systems address these issues by combining multiple disks to improve:
- Performance – Faster I/O operations via parallelism.
- Capacity – Larger storage by combining multiple disks.
- Reliability – Fault tolerance by introducing redundancy.
Key Questions:
- How can RAID enhance disk storage?
- What trade-offs exist between different RAID levels?
2. RAID Overview
RAID presents a transparent interface to the operating system, appearing as a single large disk. Internally, it consists of:
- Multiple disks working together.
- A RAID controller managing disk interactions.
- Memory (DRAM/NVRAM) for caching and buffering.
- Specialized hardware/software for redundancy management.
RAID Benefits:
- Improves performance through parallel I/O.
- Increases storage capacity by aggregating disks.
- Enhances reliability via redundancy mechanisms.
3. RAID Fault Model
RAID assumes a fail-stop model:
- Disks are either fully operational or completely failed.
- Failures are immediately detectable by the RAID controller.
- More complex failures (e.g., silent corruption, latent sector errors) are not initially considered but are relevant in real-world scenarios.
4. RAID Performance Metrics
RAID levels are evaluated based on:
- Capacity – Usable storage after redundancy.
- Reliability – Fault tolerance (number of disks that can fail).
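- Performance – Impact on read/write speeds, measured as steady-state throughput under sequential and random workloads.
Notation: $N$ is the number of disks, $B$ the number of blocks per disk, $S$ the sequential bandwidth of a single disk, and $R$ its random bandwidth.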
5. RAID Levels and Analysis
RAID 0 (Striping)
- No redundancy (not technically a RAID).
- Distributes data across disks to maximize performance.
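- Capacity: $N \cdot B$ ($N$ disks of $B$ blocks each, all usable).
- Reliability: 0 (failure of one disk = data loss).
- Performance:
  - Sequential Read/Write: $N \cdot S$ (uses all $N$ disks, each at its sequential bandwidth $S$).
  - Random Read/Write: $N \cdot R$ (each disk contributes its random bandwidth $R$).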
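
The striping rule can be sketched as a small address-mapping function. This is a minimal illustration, not a controller implementation; the name `raid0_map` and the `chunk_blocks` parameter (how many consecutive blocks stay on one disk before moving to the next) are assumptions made for the example.

```python
def raid0_map(logical_block: int, num_disks: int, chunk_blocks: int = 1) -> tuple[int, int]:
    """Map a logical block address to (disk index, block offset on that disk)."""
    chunk = logical_block // chunk_blocks   # which chunk the block falls in
    within = logical_block % chunk_blocks   # position inside that chunk
    disk = chunk % num_disks                # chunks are placed round-robin across disks
    stripe = chunk // num_disks             # row of chunks this one belongs to
    return disk, stripe * chunk_blocks + within


# With 4 disks and one block per chunk, blocks 0..7 fill two stripes.
for block in range(8):
    print(block, raid0_map(block, num_disks=4))
```

A larger `chunk_blocks` keeps more consecutive blocks on one disk, which reduces positioning overhead per request but also reduces the parallelism available to a single file.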
RAID 1 (Mirroring)
- Each disk has an exact copy on another disk.
- Tolerates single disk failure (or more, if lucky).
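- Capacity: $(N \cdot B)/2$ (half of the $N$ disks, each of $B$ blocks, store copies).
- Reliability: 1 disk guaranteed; up to $N/2$ disks if no mirror pair loses both members.
- Performance:
  - Reads: $N \cdot R$ for random reads (can use either copy, so all $N$ disks serve requests at the single-disk random rate $R$); only $(N/2) \cdot S$ for sequential reads, where $S$ is the single-disk sequential rate.
  - Writes: $(N/2) \cdot S$ sequential and $(N/2) \cdot R$ random (must write to both copies).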
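
The "or more, if lucky" case is easier to see in code. The sketch below assumes a striped-mirror arrangement in which disks `2p` and `2p + 1` form mirror pair `p`; the function names and that pairing are illustrative assumptions, not something the summary specifies.

```python
def raid1_map(logical_block: int, num_disks: int) -> tuple[tuple[int, int], int]:
    """Return ((primary disk, mirror disk), block offset) for a logical block."""
    pairs = num_disks // 2
    pair = logical_block % pairs       # stripe blocks across the mirror pairs
    offset = logical_block // pairs    # same offset on both disks of the pair
    return (2 * pair, 2 * pair + 1), offset


def survives(failed: set[int], num_disks: int) -> bool:
    """Data survives as long as no mirror pair has lost both of its disks."""
    return all(not {2 * p, 2 * p + 1} <= failed for p in range(num_disks // 2))


print(raid1_map(3, num_disks=4))        # block 3 lives on the second pair
print(survives({0, 2}, num_disks=4))    # one disk lost from each pair
print(survives({0, 1}, num_disks=4))    # both disks of one pair lost
```

Two failures are survivable only when they land in different pairs, which is why the guaranteed tolerance is still a single disk.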
RAID 4 (Parity-Based)
- Uses one disk for parity calculations.
- Recovers lost data using XOR calculations.
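- Capacity: $(N-1) \cdot B$ (one of the $N$ disks, each of $B$ blocks, holds parity).
- Reliability: 1 disk failure tolerance.
- Performance:
  - Sequential Read/Write: $(N-1) \cdot S$, with $S$ the sequential bandwidth of one disk.
  - Random Read: $(N-1) \cdot R$, with $R$ the random bandwidth of one disk.
  - Random Write: $R/2$, severely bottlenecked by the parity disk.
🔴 Small-Write Problem: Every small write must read and then rewrite the parity block, so writes to different data disks all serialize on the single parity disk.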
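
A short sketch of the XOR arithmetic behind parity, recovery, and the small-write update. The function names are hypothetical and blocks are modeled as equal-length `bytes` objects; a real array does this in the controller, usually with dedicated hardware.

```python
from functools import reduce


def parity(blocks: list[bytes]) -> bytes:
    """XOR equal-sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving: list[bytes], parity_block: bytes) -> bytes:
    """Rebuild the one missing block: XOR of the survivors and the parity."""
    return parity(surviving + [parity_block])


def update_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Small-write update: P_new = (C_old XOR C_new) XOR P_old."""
    return parity([old_data, new_data, old_parity])


stripe = [b"\x0f", b"\xf0", b"\x33"]
p = parity(stripe)
# Pretend the middle disk failed and rebuild its block from the rest.
assert reconstruct([stripe[0], stripe[2]], p) == stripe[1]
# Overwrite the middle block without reading the other data disks.
new_p = update_parity(stripe[1], b"\xaa", p)
assert new_p == parity([stripe[0], b"\xaa", stripe[2]])
```

The update path reads the old data and old parity, then writes the new data and new parity. Because every such update reads and writes the parity disk, that disk can complete only about half a logical write per I/O slot, which is where the $R/2$ random-write figure comes from.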
RAID 5 (Rotating Parity)
- Same as RAID 4, but parity is distributed across disks.
- Reduces the parity disk bottleneck, allowing better performance.
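- Capacity: $(N-1) \cdot B$ (one disk's worth of the $N \cdot B$ total blocks goes to parity).
- Reliability: 1 disk failure tolerance.
- Performance:
  - Sequential Read/Write: $(N-1) \cdot S$, with $S$ the sequential bandwidth of one disk.
  - Random Read: $N \cdot R$ (all $N$ disks serve reads at the single-disk random rate $R$).
  - Random Write: $(N/4) \cdot R$ (better than RAID 4, but each write still costs four I/Os: read old data and parity, write new data and parity).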
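
One way to picture rotating parity is to compute, for each logical block, which disk holds the data and which holds that stripe's parity. The sketch below assumes a left-symmetric layout with one block per chunk, which is one common arrangement; real arrays may rotate parity differently, and `raid5_map` is a name invented for this example.

```python
def raid5_map(logical_block: int, num_disks: int) -> tuple[int, int, int]:
    """Return (data disk, parity disk, stripe) for a logical block."""
    data_per_stripe = num_disks - 1                 # one block per stripe is parity
    stripe = logical_block // data_per_stripe
    index = logical_block % data_per_stripe         # position among the stripe's data blocks
    parity_disk = (num_disks - 1) - (stripe % num_disks)   # parity shifts left each stripe
    data_disk = (parity_disk + 1 + index) % num_disks      # data starts just after the parity
    return data_disk, parity_disk, stripe


# With 5 disks, the parity block visits every disk once per 5 stripes.
print([raid5_map(stripe * 4, num_disks=5)[1] for stripe in range(5)])
```

Because consecutive stripes place parity on different disks, independent small writes usually update different parity disks and can proceed in parallel.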
6. Summary Table
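$N$ = number of disks, $B$ = blocks per disk, $S$ = sequential bandwidth of one disk, $R$ = random bandwidth of one disk.

RAID | Capacity | Reliability | Seq. Read | Seq. Write | Rand. Read | Rand. Write |
---|---|---|---|---|---|---|
RAID 0 | $N \cdot B$ | 0 | $N \cdot S$ | $N \cdot S$ | $N \cdot R$ | $N \cdot R$ |
RAID 1 | $(N \cdot B)/2$ | 1 (or more if lucky) | $(N/2) \cdot S$ | $(N/2) \cdot S$ | $N \cdot R$ | $(N/2) \cdot R$ |
RAID 4 | $(N-1) \cdot B$ | 1 disk | $(N-1) \cdot S$ | $(N-1) \cdot S$ | $(N-1) \cdot R$ | $R/2$ (Bottlenecked) |
RAID 5 | $(N-1) \cdot B$ | 1 disk | $(N-1) \cdot S$ | $(N-1) \cdot S$ | $N \cdot R$ | $(N/4) \cdot R$ |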
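
To compare levels for a concrete array, the standard formulas from the chapter can be evaluated directly. In this sketch `n` is the number of disks, `b` the blocks per disk, and `s` and `r` the sequential and random bandwidth of one disk; the function name and the sample numbers are hypothetical, chosen only for illustration.

```python
def raid_metrics(level: int, n: int, b: float, s: float, r: float) -> dict[str, float]:
    """Capacity, guaranteed failures tolerated, and steady-state throughput."""
    if level == 0:
        row = (n * b, 0, n * s, n * s, n * r, n * r)
    elif level == 1:
        row = (n * b / 2, 1, n / 2 * s, n / 2 * s, n * r, n / 2 * r)
    elif level == 4:
        row = ((n - 1) * b, 1, (n - 1) * s, (n - 1) * s, (n - 1) * r, r / 2)
    elif level == 5:
        row = ((n - 1) * b, 1, (n - 1) * s, (n - 1) * s, n * r, n / 4 * r)
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    keys = ("capacity", "tolerated_failures", "seq_read",
            "seq_write", "rand_read", "rand_write")
    return dict(zip(keys, row))


# Hypothetical array: 8 disks, 1000 blocks each, 50 MB/s sequential, 1 MB/s random.
for level in (0, 1, 4, 5):
    print(f"RAID {level}:", raid_metrics(level, n=8, b=1000, s=50.0, r=1.0))
```

For these sample numbers, random writes come out at 0.5 MB/s on RAID 4 and 2.0 MB/s on RAID 5, which shows how much the single parity disk costs.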
7. Conclusion
- RAID 0: Best performance, no reliability.
- RAID 1: Best reliability, expensive (requires 2x storage).
- RAID 4: Efficient storage, poor random-write performance (single parity disk).
- RAID 5: Good balance of reliability and storage efficiency.
Choosing the right RAID level depends on whether performance, reliability, or storage efficiency is the priority.