About
These are my notes on Roberto Vitillo’s Understanding Distributed Systems. The book’s website can be found here.
Introduction
A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.
— Leslie Lamport
- Motivations for building distributed systems include:
  - High availability: resilience to single-node failures
  - Large workloads that are too big to fit on a single node
  - Performance requirements (e.g. high resolution and low latency for video streaming)
Part I: Communication
Chapter 2: Reliable Links
- We can derive the maximum theoretical bandwidth of a network link by dividing the size of the congestion window by the round-trip time:
  $$\text{Bandwidth} = \frac{\text{WinSize}}{\text{RTT}}$$
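As a worked example of this relationship, here is a minimal sketch. The function name and the sample window size and round-trip time are illustrative assumptions, not figures from the book.

```python
def max_bandwidth_bytes_per_sec(window_size_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: at most one congestion window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_size_bytes / rtt_seconds


# Hypothetical link: 64 KiB congestion window, 100 ms round-trip time.
bandwidth = max_bandwidth_bytes_per_sec(64 * 1024, 0.1)
print(f"{bandwidth * 8 / 1_000_000:.2f} Mbit/s")  # prints "5.24 Mbit/s"
```

The takeaway is that halving the round-trip time doubles the ceiling, which is one reason to place servers close to clients.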
Chapter 3: Secure Links
- A secure communication link must make 3 guarantees:
  - Encryption: TLS uses asymmetric encryption to establish a shared secret key and symmetric encryption for the data that follows, ensuring that data can only be read by the communicating processes
  - Authentication: the server and client should each authenticate that the other is who they claim to be, via certificates issued by certificate authorities (CAs)
  - Integrity: TLS verifies the integrity of the data by calculating a message digest using a secure hash function
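The integrity guarantee can be illustrated with a keyed message digest. This is a minimal sketch using HMAC-SHA256 from Python's standard library, not how a TLS library is actually invoked; the key, the messages, and the `sign`/`verify` helper names are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute a keyed digest (HMAC-SHA256) of the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical shared secret, standing in for a key negotiated during the handshake.
shared_key = b"hypothetical-shared-secret"
message = b"transfer 10 coins to alice"
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))                          # True
print(verify(shared_key, b"transfer 99 coins to mallory", tag))  # False
```

A receiver that recomputes the digest and gets a mismatch knows the message was altered (or corrupted) in transit and can discard it.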
Chapter 4: Discovery
- The Domain Name System (DNS) is a distributed, hierarchical, and eventually consistent key-value store
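To make "eventually consistent" concrete, here is a toy model of a caching resolver. It is a sketch, not real DNS: the `CachingResolver` class, the in-memory `records` table, and the injected clock are all assumptions made for illustration. Cached answers are served until their time-to-live (TTL) expires, so an update to a record becomes visible only after the cache entry times out.

```python
import time


class CachingResolver:
    """Toy resolver that caches answers from an authoritative table until the TTL expires."""

    def __init__(self, authoritative, ttl_seconds, clock=time.monotonic):
        self.authoritative = authoritative  # dict: name -> address
        self.ttl = ttl_seconds
        self.clock = clock                  # injectable so the example is deterministic
        self.cache = {}                     # name -> (address, expires_at)

    def resolve(self, name):
        now = self.clock()
        entry = self.cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]                 # cache hit: may be stale
        address = self.authoritative[name]  # cache miss: ask the source of truth
        self.cache[name] = (address, now + self.ttl)
        return address


now = [0.0]  # fake clock, advanced manually below
records = {"example.com": "192.0.2.1"}
resolver = CachingResolver(records, ttl_seconds=60, clock=lambda: now[0])

first = resolver.resolve("example.com")  # fetched and cached
records["example.com"] = "192.0.2.2"     # the record changes at the source
stale = resolver.resolve("example.com")  # still served from the cache
now[0] = 61.0                            # the TTL has elapsed
fresh = resolver.resolve("example.com")  # re-fetched, now up to date

print(first, stale, fresh)  # 192.0.2.1 192.0.2.1 192.0.2.2
```

A shorter TTL narrows the window of staleness at the cost of more lookups against the authoritative source.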
Chapter 5: APIs
- A textual format like JSON is self-describing and human-readable, at the expense of increased verbosity and parsing overhead
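The verbosity trade-off can be seen by encoding the same record both ways. This is a minimal sketch; the `reading` record and its fixed binary layout are hypothetical, and `struct` stands in for a real binary serialization format.

```python
import json
import struct

# Hypothetical record to serialize.
reading = {"sensor_id": 42, "temperature": 21.5}

# JSON carries the field names in every message, so it can be read without a schema.
as_json = json.dumps(reading).encode("utf-8")

# Binary layout: unsigned 32-bit int followed by a 64-bit float, network byte order.
# The field names and types live in the code (the schema), not in the payload.
as_binary = struct.pack("!Id", reading["sensor_id"], reading["temperature"])

print(as_json)
print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes as binary")

# Decoding the binary form requires knowing the same layout on the receiving side.
sensor_id, temperature = struct.unpack("!Id", as_binary)
```

The binary payload is smaller, but it is opaque without the layout string, which is the flip side of JSON being self-describing.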