· Brittany Ellich · papers  · 3 min read

Time, Clocks, and the Ordering of Events in a Distributed System

Notes / Computer Science History / Time, Clocks, and the Ordering of Events in a Distributed System

Author(s): Leslie Lamport

Date: 1978

Notes

  • “A distributed system consists of a collection of distinct processes which are spatially separated, and which communicate with one another by exchanging messages.”
  • Clocks aren’t perfectly accurate and aligning real clocks is difficult.
  • How do we know the order of two events?
  • We can’t rely on ordering based on the clocks of the individual systems
  • Partial Ordering
    • A simple “happens before” relationship to determine the order between two events within the same process (state machines)
    • When two events cross in the event stream, we can’t really know which one came first. For partial ordering, it’s arbitrary
  • Logical Clocks
    • Sending a timestamp along with the message, the receiver can use the timestamp and increment it to a greater value, setting its own local clock
    • In a distributed system, each process keeps its own clock
  • Total Ordering
    • Use logical clocks to order events and then use an arbitrary way to determine which came first for concurrent events
    • Can be used to solve the mutual exlusion problem. Only one process can use a resource at a time so the processes must
    • Each process maintains its own request queue which is never seen by another process.
    • Defines the following algorithm: 1. Process P1 sends a message to all other processes to request a resource with a timestamp on the message 2. All other processes acknowledge the request on their request queue, and send a timestamped acknowledgement message to P1 3. P1 releases the resource by removing the requests resource message from its request queue and sends a timestamped releases resource message to all other processes 4. All other processes receive that releases resource message and removes the requests resource message from their request queues
      • Process P1 gets the resource when there is a requests resource message in its request queue, ordered before any other request in its queue, and P1 has received a message from every other process acknowledging the request after its initial timestamp
      • This algorithm is ideal in situations where there is no central process or central storage
  • Physical Clocks
    • A clock synchronizing algorithm to reduce drift among physical clocks using the current value of the process physical clock and the timestamp of the message received
  • Drawbacks
    • Fails to call out retries or errors/failing scenarios
  • “Almost nobody thought it was about state machines”…”Never understood why it’s so important in computer science… but what people get out of it is a way of thinking about distributed systems that felt obvious to me but wasn’t obvious…and got them thinking a different way”. - Interview with Leslie Lamport
    • Thinks about this as a physics problem, not a math problem.

Resources

Back to Blog

Related Posts

View All Posts »
Build the thing you wish to see in the world

Build the thing you wish to see in the world

For most of my career, I've been confusing building products with building businesses—and that confusion kept me from pursuing a lot of ideas. Two weeks off helped me realize that not everything needs to be a startup, and some of the best things we build are the ones we build just because we want them to exist.

A Software Engineer's Guide to Agentic Software Development

A Software Engineer's Guide to Agentic Software Development

I've cracked the code on breaking the eternal cycle - features win, tech debt piles up, codebase becomes 'legacy', and an eventual rewrite. Using coding agents at GitHub, I now merge multiple tech debt PRs weekly while still delivering features. Tickets open for months get closed. 'Too hard to change' code actually improves. This is the story of the workflow.