Knowledge involved in building large-scale systems, covering everything from architecture to algorithms, from macro to micro.

Comments and suggestions are welcome.

(The content is still being organized, so it may be a little confusing right now.)

☑︎ means read

⭐️ means recommended

Distributed Systems

Data Processing

Stream Processing

SQL Workloads

Distributed Storage System

Distributed Computation

Database

Motivation

History

Architecture

Key-Value Store

RocksDB

Greenplum

PostgreSQL

CockroachDB

RisingWave

MVCC

Column Storage

SQL Parser

SQL Test

Query Engine

Push vs Pull

Query Optimization

Query Compilation

Vectorization

SIMD

Pipeline

Peloton

Join Algorithm

Comparison

Hash Join

Sort Merge Join

Zig-Zag Merge Join

Other Perspectives

Lock & Transaction

Log

  • Physical logging
  • Logical logging
  • Physiological logging
  • Write Ahead Logging (WAL)

Deadlock Handling

  • Deadlock avoidance
  • Deadlock detection
    • Timeout
    • Wait-for graph

Two-Phase Locking (2PL)

Classification of 2PL

  • Basic 2PL
  • Strict 2PL
  • Conservative 2PL
  • Rigorous 2PL

Isolation Level

  • Read uncommitted
  • Read committed
  • Repeatable read
  • Serializable

Concurrency Control

  • Lock
  • Optimistic concurrency control
  • Multiversion concurrency control (MVCC)

Optimistic Concurrency Control

Recovery

Write Ahead Logging (WAL)

Write Behind Logging

Key-Value Storage

Online Schema Change

Others

Storage

In-Memory Cache & Storage

Colossus

Local Storage

BLOB Storage

Distributed File System

Time Series Storage

Others

Disk Error Correction

Reed-Solomon

Data Structures & Algorithms

LSM-Tree

B/B+ Tree

Bw-Tree

Tree (Others)

Range Filter

Skip List

Materialized View

Distributed Algorithm

Consistency

Course

Eventual Consistency

Consensus Algorithm

Raft

Paxos

Zab

Distributed Hash Table (DHT)

Chord

Kademlia

File Format

Tracing

Scheduling

Scheduling Algorithm

Allocator

GPU Programming

Concurrent Programming

Structured Concurrency

Hazard Pointers

PLT

Course

Distributed Systems

System Programming

UIUC CS 241: System Programming

MIT 6.033

Database

Operating Systems

Data Structures

Awaiting Classification

System Programming

MMAP

Memory Allocation

Compiler

LLVM

Disk

SSD

Others

Filter

Rust

HKT && GAT && ACT

Closures

Ownership

Golang

GC

IEEE 754

Computer Architecture

TLA+

Communication

Lexical Analyser

Type Theory

Network

Parallel Computing