-
Resources
-
Chapter 1: The Anatomy of PostgreSQL: Architecture and Process Model
-
- Join this Course to access resources
- Join this Course to access resources
- Join this Course to access resources
-
-
Chapter 2: The Journey of a Query: Lexing, Parsing, and the Traffic Cop
-
Chapter 3: The PostgreSQL Rule System and Query Rewriting
-
- Join this Course to access resources
-
-
Chapter 4: The Query Planner Part I: Statistics and Cost Estimation
-
- Join this Course to access resources
-
-
Chapter 5: The Query Planner Part II: Path Generation and GEQO
-
- Join this Course to access resources
-
-
Chapter 6: The Executor: Processing the Plan Tree
-
- Join this Course to access resources
-
-
Chapter 7: Advanced Indexing Under the Hood
-
- Join this Course to access resources
-
-
Chapter 8: Multiversion Concurrency Control (MVCC) and Vacuuming
-
- Join this Course to access resources
-
-
Chapter 9: Memory Management and Caching Strategies
-
- Join this Course to access resources
-
-
Chapter 10: The Write-Ahead Log (WAL) and Crash Recovery
-
- Join this Course to access resources
-
-
Chapter 11: Replication: Physical and Logical
-
- Join this Course to access resources
-
-
Chapter 12: Distributed PostgreSQL and Sharding
-
- Join this Course to access resources
-
-
Chapter 13: Extending the Engine
-
- Join this Course to access resources
-
-
Project Based Assignments
Project 2: Forensic Analysis of MVCC and Vacuuming
Project 2: Forensic Analysis of MVCC and Vacuuming
PostgreSQL's Multiversion Concurrency Control (MVCC) provides excellent concurrent performance but requires diligent maintenance to prevent physical storage bloat. In this project, you will deliberately cause and then resolve massive table bloat.
Create a test database and use a pgbench custom script (provided) to simulate an intense Online Transaction Processing (OLTP) workload consisting of rapid, continuous UPDATE and DELETE statements on a single table.
Using the pageinspect extension, you will inspect the physical 8KB pages on disk. You must capture and explain the state of the tuple headers (xmin, xmax, ctid) to prove the existence of "dead" tuples. Calculate the exact bloat percentage of your table and its indexes. Next, you will design a custom autovacuum strategy specifically for this table by tuning parameters like autovacuum_vacuum_scale_factor to aggressively reclaim space. You must demonstrate the resolution of the bloat and explain how the Free Space Map (FSM) was updated.
Rubric:
| Criteria | Excellent | Proficient | Needs Improvement |
|---|---|---|---|
| MVCC Mechanics Application | Perfectly explains row visibility and the lifecycle of a tuple using xmin, xmax, and snapshot isolation concepts. | Explains xmin/xmax generally but struggles to explain edge cases or concurrent visibility. | Misunderstands how PostgreSQL determines if a row is dead or alive. |
| Physical Page Inspection | Successfully uses pageinspect to map physical page layouts and accurately calculates exact bloat metrics. | Uses pageinspect but relies on generic queries rather than calculating specific byte-level bloat. | Unable to successfully use the extension or interpret the binary page data. |
| Autovacuum Tuning | Sets highly optimized, table-specific autovacuum parameters and proves their mathematical efficiency. | Alters autovacuum settings globally or uses arbitrary numbers without mathematical justification. | Fails to trigger autovacuum effectively or relies exclusively on manual VACUUM FULL. |
| Documentation of FSM | Clearly demonstrates how the Free Space Map changes before and after the vacuuming process. | Mentions the FSM but does not provide empirical evidence of its changes. | Ignores the Free Space Map entirely. |
There are no comments for now.