The knowledge involved in building large-scale system, covering everthing form architecture to algorithms, from macro to micro.
Comments and suggestions are welcomed.
(Content is being sorted out, a little bit confusing right now)
☑︎ means read
⭐️ means recommend
Distributed Systems
- Time, Clocks, and the Ordering of Events in a Distributed System
- The End of an Architectural Era (It’s Time for a Complete Rewrite)
- Practical Uses of Synchronized Clocks in Distributed Systems
- Information Storage in a Decentralized Computer System
Data Processing
Stream Processing
SQL Workloads
Distributed Storage System
- Atlas: Baidu’s Key-value Storage System for Cloud Data
- PAST: Persistent and Anonymous Storage in a Peer-to-Peer Networking Environment
- OceanStore: An Architecture for Global-Scale Persistent Storage
- Alluxio: A Virtual Distributed File System
Distributed Computation
- Ray: A Distributed Framework for Emerging AI Applications
- Starling: A Scalable Query Engine on Cloud Functions
Database
Motivation
History
Architecture
- Architecture of a Database System
- Column Stores vs Row Stores : How Different Are They Really
- Socrates: The New SQL Server in the Cloud
- Large-scale Incremental Processing Using Distributed Transactions and Notifications
- Readings in Database Systems, 5th Edition
- Aria: A Fast and Practical Deterministic OLTP Database
- ☑ Bigtable: A Distributed Storage System for Structured Data
- Alibaba Hologres: A Cloud-Native Service for Hybrid Serving/Analytical Processing
- LLAMA: A Cache/Storage Subsystem for Modern Hardware
- POLARDB: InnoDB based shared-everything storage solution
- FoundationDB: A Distributed Unbundled Transactional Key Value Store
- POLARIS: The Distributed SQL Engine in Azure Synapse
- The Snowflake Elastic Data Warehouse
- Spanner: Google’s Globally-Distributed Database
Key-Value Store
RocksDB
Greenplum
- ☑︎ Greenplum: A Hybrid Database for Transactional and Analytical Workloads
- ☑︎ Greenplum: A Hybrid Database for Transactional and Analytical Workloads (中文翻译)
PostgreSQL
- THE DESIGN OF POSTGRES
- THE IMPLEMENTATION OF POSTGRES
- [ROWE87] The POSTGRES Data Model
- PostgreSQL Concurrency with MVCC | Heroku Dev Center
CockroachDB
- Living Without Atomic Clocks
- CockroachDB’s Consistency Model
- ☑︎ Life of a SQL Query
- ☑︎ SQL query planning and optimizations
- Index selection in CockroachDB
RisingWave
MVCC
- An Empirical Evaluation of In-Memory Multi-Version Concurrency Control
- ☑︎ PostgreSQL Concurrency with MVCC | Heroku Dev Center
- ☑︎ 浅谈数据库并发控制 - 锁和 MVCC - 面向信仰编程
- Implementing Your Own Transactions with MVCC
- SQL Transaction Isolation Levels Explained
Column Storage
SQL Parser
SQL Test
Query Engine
Push vs Pull
Query Optimization
- ☑︎ Access Path Selection in a Relational Database Management System
- Statistical Profile Estimation in Database Systems
- Building a Modern Database Using LLVM
- EFFICIENCY IN THE COLUMBIA DATABASE QUERY OPTIMIZER
- The Volcano Optimizer Generator: Extensibility and Efficient Search
- Volcano-An Extensible and Parallel Query Evaluation System
- The Cascades Framework for Query Optimization
- 逻辑优化 | PingCAP Docs
- ☑︎ Cascades Optimizer - 知乎
- ☑︎ Cost-Based Optimizer | CockroachDB Docs
- Optimizing Data Shuffling in Data-Parallel Computation by Understanding User-Defined Functions
- Spotting Code Optimizations in Data-Parallel Pipelines through PeriSCOPE
- Understanding the internals of GPORCA Optimizer - YouTube
Query Compilation
- ☑︎ 查询编译综述 - 知乎
- Runtime Code Generation in Cloudera Impala
- Generating code for holistic query evaluation
- Efficiently Compiling Efficient Query Plans for Modern Hardware
- Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last
- How to Architect a Query Compiler, Revisited
- Code generation for efficient query processing in managed runtimes
- Vectorization vs. Compilation in Query Execution
- Adaptive Execution of Compiled Queries
Vectorization
- ☑︎ Columnar Databases and Vectorization
- ☑ Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask
- ☑︎ How We Built a Vectorized Execution Engine
- ☑︎ Vectorizing the Merge Joiner in CockroachDB
- ☑︎ 40x faster hash joiner with vectorized execution
- ☑︎ Balancing vectorized query execution with bandwidth-optimized storage
SIMD
- ⭐️ Intel® Intrinsics Guide
- Fast array reversal with SIMD! - DEV Community
- x86 Assembly/SSE - Wikibooks, open books for an open world
- SSE Instructions - x86 Assembly Language Reference Manual
- c - SSE (SIMD): multiply vector by scalar - Stack Overflow
Pipeline
Peloton
- ☑︎ Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last
- MonetDB/X100: Hyper-Pipelining Query Execution
Join Algorithm
Comparison
- An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory
- ☑︎Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited
- A Seven-Dimensional Analysis of Hashing Methods and its Implications on Query Processing
- Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs
- Sort vs. Hash Revisited
Hash Join
- Hash join in MySQL 8 | MySQL Server Blog
- giorgospan/Radix-Hash-Join: Radix Hash Join - SIGMOD Contest 2018.
- Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs
- Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware
Sort Merge Join
Zig-Zag Merge Join
Other Perspectives
- A Practical Approach to Groupjoin and Nested Aggregates
- Fast and Effective Distribution-Key Recommendation for Amazon Redshift
Lock & Transaction
- Concurrency Control and Recovery
- Unreliable Guide To Locking
- Purchase Link: Transaction Processing: Concepts and Techniques
- Concurrency Control and Recovery in Database Systems
- Purchase Link: Theory of Database Concurrency Control (Principles of computer science series)
- Purchase Link: Access path selection in a relational database management system
- The Benchmark Handbook: For Database and Transaction Processing Systems
Log
- Physical log
- Logical logging
- Physiological logging
- Write Ahead Logging (WAL)
Deadlock Handling
- Deadlock avoidance
- Deadlock detection
- Timeout
- Wait-for graph
Two-Phase Locking(2PL)
- University of Waterloo CS 448 Database Systems - Two Phase Locking
- DBMS | Concurrency Control Protocol | Two Phase Locking (2-PL)-I - GeeksforGeeks
Classification of 2PL
- Basic 2PL
- Strict 2PL
- Conservative 2PL
- Rigorous 2PL
Isolation Level
- Read uncommited
- Read commited
- Repeatable read
- Serializale
Concurrency Control
- Lock
- Optimistic concurrency control
- Multiversion concurrency control (MVCC)
Optimistic Concurrency Control
- On Optimistic Methods for Concurrency Control
- Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores
Recovery
Write Ahead Logging (WAL)
Write Behind Logging
Key-Value Storage
Online Schema Change
- Online, Asynchronous Schema Change in F1
- cockroach/docs/RFCS/20151014_online_schema_change.md at master · cockroachdb/cockroach
Others
- Using Crash Hoare Logic for Certifying the FSCQ File System
- OLTP Through the Looking Glass, and What We Found There
Storage
In-Memory Cache & Storage
- MICA: A Holistic Approach to Fast In-Memory Key-Value Storage
- NSDI ‘14 - MICA: A Holistic Approach to Fast In-Memory Key-Value Storage
Colossus
- Colossus under the hood: a peek into Google’s scalable storage system
- Building Large-Scale Internet Services
Local Storage
BLOB Storage
- ☑ Finding a needle in Haystack: Facebook’s photo storage
- ☑ f4: Facebook’s Warm BLOB Storage System
- ☑︎ [Paper Notes] Facebook Haystack and F4 - 纯纯的 Blog
Distributed File System
Time Series Storage
Others
Disk Error Correction
Reed-Solomon
Data Structures & Algorithms
LST-Tree
B/B+ Tree
- ☑ B-Tree | Set 1 (Introduction) - GeeksforGeeks
- ☑ B-Tree | Set 2 (Insert) - GeeksforGeeks
- ☑ B-Tree | Set 3 (Delete) - GeeksforGeeks
- A Survey of B-Tree Locking Techniques
- Persistent B+-Trees in Non-Volatile Main Memory
- How to Build a Non-Volatile Memory Database Management System
- Modern B-Tree Techniques (2011) | Hacker News
Bw-Tree
Tree (Others)
- ☑ Database File Indexing - B+ Tree (Introduction) - GeeksforGeeks
- Introduction to R-tree - GeeksforGeeks
- The SB-tree: An Index-Sequential Structure for High-Performance Sequential Access
Range Filter
Skip List
- ☑︎ Skip List | Set 1 (Introduction) - GeeksforGeeks
- ☑︎ Skip List | Set 2 (Insertion) - GeeksforGeeks
- ☑︎ Skip List | Set 3 (Searching and Deletion) - GeeksforGeeks
- ☑︎ Skip Lists: A Probabilistic Alternative to Balanced Trees
- ☑︎ disque/skiplist.c at master · antirez/disque
- Skip Lists: Done Right
Materialized View
- Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing
- Napa: Powering Scalable Data Warehousing with Robust Query Performance at Google
- [Paper Notes] Napa: Powering Scalable Data Warehousing with Robust Query Performance at Google - 纯纯的 Blog
Distributed Algorithm
Consistency
- Consistency Models
- Seeing is Believing: A Client-Centric Specification of Database Isolation
- Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions
Course
Eventual Consistency
Consensus Algorithm
Raft
- ☑ In Search of an Understandable Consensus Algorithm (Extended Version)
- CONSENSUS: BRIDGING THEORY AND PRACTICE
- Just say NO to Paxos Overhead: Replacing Consensus with Network Ordering
Paxos
Zab
- Zab: High-performance broadcast for primary-backup systems
- ZooKeeper’s atomic broadcast protocol: Theory and practice
Distrubuted Hash Table (DHT)
Chord
Kademlia
File Format
Tracing
Scheduling
Scheduling Algorithm
Allocator
GPU Programming
Concurrency Programming
- ☑ Concurrency Freaks: Lock-Free and Wait-Free, definition and examples
- OneFile: A Wait-free Persistent Transactional Memory
- Let’s talk locks! - Speaker Deck
- Basics of Futexes - Eli Bendersky’s website
- A Practical Wait-Free Simulation for Lock-Free Data Structures
Structured Concurrency
Hazard Pointers
- Lock-Free Data Structures with Hazard Pointers
- folly/Hazptr.h at master · facebook/folly
- P1121R3: Hazard Pointers
PLT
Course
Distributed Systemes
- ☑ MIT 6.824
- CS294-91 Distributed Computing
- 6.852: Distributed Algorithms - Massachusetts Institute of Technology - Spring 2008
System Programming
UICD CS 241: System Programming
mit 6.033
Database
- CMU 15-445/645 :: Intro to Database Systems (Fall 2021)
- CMU 15-721 :: Advanced Database Systems (Spring 2020)
- ☑ 6.830 / 6.814: Database Systems - Data Systems Group @ MIT
Operating Systems
Data Structures
Waiting For Classification
System Programming
MMAP
- Are You Sure You Want to Use MMAP in Your Database Management System?
- re: Are You Sure You Want to Use MMAP in Your Database Management System? - Ayende @ Rahien
Memory Allocation
- ☑ Dynamic Memory Allocation in C using malloc(), calloc(), free() and realloc() - GeeksforGeeks
- Writing a Memory Allocator – Dmitry Soshnikov
Compiler
LLVM
- The Architecture of Open Source Applications: LLVM
- LLVM Language Reference Manual — LLVM 10 documentation
- LLVM’s Analysis and Transform Passes — LLVM 10 documentation
- ☑ My First Language Frontend with LLVM Tutorial — LLVM 10 documentation
Disk
SSD
Others
Filter
- Probabilistic Filters By Example: Cuckoo Filter and Bloom Filters
- Cuckoo Filter: Practically Better Than Bloom
- Bloom Filters by Example
- Ribbon filter: practically smaller than Bloom and Xor
Rust
HKT && GAT && ACT
- Generalizing over Generics in Rust (Part 1) - AKA Higher Kinded Types in Rust
- Generalizing over Generics in Rust (Part 1.5): Mechanisms
- Solving the Generalized Streaming Iterator Problem without GATs
- 🔬 Tracking issue for generic associated types (GAT) · Issue #44265 · rust-lang/rust
- Higher-Rank Trait Bounds - The Rustonomicon
Closures
Ownership
Golang
GC
- Golang’s Real-time GC in Theory and Practice - Making Pusher
- Go 1.5 concurrent garbage collector pacing - Google Docs
IEEE754
Computer Architecture
- Build a Modern Computer from First Principles: From Nand to Tetris (Project-Centered Course) | Coursera
- NandGame - Build a computer from scratch.
- Branch prediction
- Data alignment and caches