- Previously
- Enforced modularity on a single machine via virtualization.
- Virtual memory, bounded buffers, threads.
- Saw monolithic vs. microkernels.
- Talked about VMs as a means to run multiple instances of an OS on a single machine with enforced modularity (bug in one OS won't crash the others).
- Big thing to solve was how to implement the VMM. Solution: Trap and emulate. How the emulation works depends on the situation.
- Another key problem: How to trap instructions that don't generate interrupts.
- What's left? Performance
- Performance requirements significantly influence a system's design.
- Today: General techniques for improving performance.
- Technique 1: Buy New Hardware
- Why? Moore's Law => processing power doubles every ~1.5 years; DRAM density increases over time; disk price (per GB) decreases; ...
- But:
- Not all aspects improve at the same pace.
- Moore's Law is plateauing.
- Hardware improvements don't always keep pace with load increases.
- Conclusion: Need to design for performance, potentially re-design as load increases.
- General Approach
- Measure the system and find the bottleneck (the portion that limits performance).
- Relax (improve) the bottleneck.
- Measurement
- To measure, need metrics:
- Throughput: Number of requests over a unit of time.
- Latency: Amount of time for a single request.
- The relationship between them changes depending on the context.
- As the system becomes heavily loaded:
- Latency and throughput start low. Throughput increases as users enter; latency stays flat...
- ...until the system is at maximum throughput. Then throughput plateaus and latency increases.
- For heavily-loaded systems: Focus on improving throughput.
- Need to compare measured throughput to possible throughput: Utilization.
- Utilization sometimes makes bottleneck obvious (CPU is 100% utilized vs. disk is 20% utilized), sometimes not (CPU and disk are 50% utilized, and at alternating times).
- Helpful to have a model in place: What do we expect from each component?
- When bottleneck is not obvious, use measurements to locate candidates for bottlenecks, fix them, see what happens (iterate).
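- A tiny sketch of that comparison, with made-up components and numbers (nothing below is from a real system):

```python
# Utilization = measured throughput / possible throughput, per component.
# The component closest to 100% is the first candidate bottleneck.
possible = {"cpu": 1000, "disk": 100, "net": 10000}  # ops/sec our model allows
measured = {"cpu": 990, "disk": 20, "net": 150}      # ops/sec we observed

utilization = {c: measured[c] / possible[c] for c in possible}
print(utilization)                            # cpu: 0.99, disk: 0.2, net: 0.015
print(max(utilization, key=utilization.get))  # 'cpu': fix it, re-measure, iterate
```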
- How to Relax the Bottleneck
- Better algorithms, etc. These are application-specific. 6.033 focuses on generally-applicable techniques.
- Batching, caching, concurrency, scheduling.
- Examples of these techniques follow. The examples relate to operating systems (that's what you know), but the techniques apply to all systems.
- Disk Throughput
- How does an HDD (magnetic disk) work?
- Several platters on a rotating axle.
- Platters have circular tracks on either side, divided into sectors.
- Cylinder: Group of aligned tracks.
- Disk arm has one head for each surface, all move together.
- Each disk head reads/writes sectors as they rotate past. Size of a sector = unit of read/write operation (typically 512B).
- To read/write:
- Seek arm to desired track.
- Wait for platter to rotate the desired sector under the head.
- Read/write as the platter rotates.
- What about SSDs?
- Organized into cells, each of which holds one (or two, or three) bits.
- Cells organized into pages; pages into blocks.
- Reads happen at page-level. Writes also at page-level, but to new pages (no overwrites of pages).
- Erases (and thus overwrites) are at block-level.
- Takes a high voltage to erase.
- How long does R/W take on HDD?
- Example disk specs:
- Capacity: 400GB
- Platters: 5
- # heads: 10
- # sectors per track: 567–1170 (inner to outer)
- # bytes per sector: 512
- Rotational speed: 7200 RPM => 8.3ms per revolution
- Seek time: Avg read seek 8.2ms, avg write seek 9.2ms.
- Given as part of disk specs
- Rotation time: 0–8.3ms.
- Platters only rotate in one direction.
- R/W as platter rotates: 35–62MB/sec.
- Also given in disk specs.
- So reading a random 4KB block: 8.2ms seek + 4.1ms rotation (half a revolution, on average) + ~0.1ms transfer = 12.4ms.
- 4096 B / 12.4 ms = 322KB/s.
- => 99% of the time is spent moving the disk, not transferring data.
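- A quick sanity check of that arithmetic in Python (numbers taken from the example specs above):

```python
# Random 4KB read on the example disk.
seek_ms = 8.2           # average read seek, from the specs
rotation_ms = 8.3 / 2   # wait half a revolution, on average
transfer_ms = 0.1       # ~4KB at ~40MB/s
total_ms = seek_ms + rotation_ms + transfer_ms   # ~12.4ms
print(total_ms)                                  # 12.45
print(4096 / (total_ms / 1000) / 1024)           # ~321 KB/s
```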
- Can we do better?
- Use flash? For this particular random-read workload, yes; SSDs would help if available.
- Batch individual transfers?
- .8ms to seek to next track + 8.3ms to read entire track = 9.1ms.
- .8ms is single-track seek time for our disk (again, from specs).
- 1 track contains ~1000 sectors * 512B = 512KB.
- Throughput: 512KB/9.1ms = 55MB/s.
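- And the same check for the batched, track-at-a-time read:

```python
# Sequential track-at-a-time read on the same disk.
seek_ms = 0.8              # single-track seek, from the specs
rotation_ms = 8.3          # one full revolution reads the whole track
track_bytes = 1000 * 512   # ~1000 sectors * 512B
total_ms = seek_ms + rotation_ms              # 9.1ms per track
print(track_bytes / (total_ms / 1000) / 1e6)  # ~55-56 MB/s, vs. ~0.3 MB/s random
```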
- Lesson: Avoid random access. Try to do long sequential reads.
- But how?
- If your system reads/writes entire big files, lay them out contiguously on disk. Hard to achieve in practice!
- If your system reads lots of small pieces of data, group them.
- Caching
- Already saw in DNS. Common performance-enhancement for systems.
- How do we measure how well it works?
- Average access time = hit_time * hit_rate + miss_time * miss_rate.
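- e.g., assuming a 100ns DRAM hit and a 10ms disk miss at a 90% hit rate: 0.9 * 100ns + 0.1 * 10ms ≈ 1ms. Misses dominate, so small hit-rate improvements pay off disproportionately.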
- Want high hit rate. How do we know what to put in the cache?
- Can't keep everything.
- So really: How do we know what to *evict* from the cache?
- Popular eviction policy: Least-recently used.
- Evict data that was used the least recently.
- Works well for popular data.
- Bad for sequential access (think: Sequentially accessing a dataset that is larger than the cache).
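- A minimal LRU sketch in Python (illustrative only; `LRUCache` and its methods are made-up names, not a standard API):

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least-recently-used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # oldest (least recent) first

    def get(self, key):
        if key not in self.entries:
            return None                    # miss: caller fetches from disk
        self.entries.move_to_end(key)      # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the LRU entry
```

- Note how sequentially scanning capacity+1 distinct keys through this cache evicts each entry just before it's needed again: the pathological case above.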
- Caching is good when:
- All data fits in the cache.
- There is locality, temporal or spatial.
- Caching is bad for:
- Writes (a write must reach the disk, not just the cache, since the disk is the non-volatile store and the two must stay consistent; so the cache doesn't save the write).
- Moral: To build a good cache, need to understand access patterns.
- Like disk performance: to relax the disk as the bottleneck, needed to understand details of how it works.
- Concurrency/Scheduling
- Suppose server alternates between CPU and disk:
    CPU:  --A--     --B--     --C--
    Disk:      --A--     --B--     --C--
- Apply concurrency, can get:
    CPU:  --A----B----C-- ...
    Disk:      --A----B-- ...
- This is a scheduling problem: Different orders of execution can lead to different performance.
- Example:
- 5 threads issue concurrent reads to sectors 71, 10, 92, 45, and 29.
- Naive algorithm: Seek to each sector in turn.
- Better algorithm: Sort by track and perform the reads in that order (see the sketch below). Gets even higher throughput as load increases.
- Drawback: It's unfair.
- No one right answer to scheduling. Tradeoff between performance and fairness.
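- A sketch of that sorted schedule, using the example's sector numbers as stand-ins for arm position (`sorted_schedule` is a made-up name):

```python
def sorted_schedule(pending):
    # Serve queued reads in sector order instead of arrival order:
    # the arm sweeps across the disk once rather than seeking back
    # and forth, so each seek is shorter and throughput goes up.
    return sorted(pending)

print(sorted_schedule([71, 10, 92, 45, 29]))   # [10, 29, 45, 71, 92]
# Unfairness: under continuous load, new requests near the arm's
# current position keep jumping ahead of far-away ones, which can starve.
```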
- Parallelism
- Goal: Have multiple disks, want to access them in parallel.
- Problem: How do we divide data across the disks?
- Depends on bottleneck:
- Case 1: Many requests for many small files. Limited by disk seeks. Put each file on a single disk, and allow multiple disks to seek multiple records in parallel.
- Case 2: Few large reads. Limited by sequential throughput. Stripe files across disks.
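- A sketch of Case 2's striping, assuming fixed-size chunks dealt round-robin across disks (`stripe` is a made-up helper):

```python
def stripe(data, n_disks, chunk_size=4096):
    # Deal consecutive chunks round-robin across the disks, so one
    # large sequential read can pull from all disks in parallel.
    disks = [bytearray() for _ in range(n_disks)]
    for i, start in enumerate(range(0, len(data), chunk_size)):
        disks[i % n_disks] += data[start:start + chunk_size]
    return disks

parts = stripe(b"x" * 100_000, n_disks=4)
print([len(p) for p in parts])   # ~25KB per disk, readable in parallel
```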
- Another case: Parallelism across many computers.
- Problem: How do we deal with machine failures?
- (One) Solution: Go to recitation tomorrow!
- Summary
- We can't magically apply any of the previous techniques. Have to understand what goes on underneath.
- Batching: How disk access works.
- Caching: What is the access pattern?
- Scheduling/concurrency: How disk access works, how system is being used (the workload).
- Parallelism: What is the workload?
- Techniques apply to multiple types of hardware.
- E.g., caching is useful regardless of whether you have HDD or SSD.
- Useful numbers for your day-to-day lives:
- Latency:
- 0.000001ms: Instruction time (1 ns)
- 0.0001ms: DRAM load (100 ns)
- 0.1ms: LAN network
- 10ms: Random disk I/O
- 25–50ms: Internet east -> west coast
- Throughput:
- 10,000 MB/s: DRAM
- 1,000 MB/s: LAN (or 100 MB/s)
- 100 MB/s: Sequential disk (or 500 MB/s)
- 1 MB/s: Random disk I/O