More The Pragmatic Engineer episodes

How AWS S3 is built

Published 21 Jan 2026

Duration: 4694 seconds (about 1 hour 18 minutes)

A discussion of Amazon Web Services S3, covering its massive scale, its engineering complexities, and its capabilities for managing and processing large volumes of data.

Episode Description

Brought to You By: Statsig, the unified platform for flags, analytics, experiments, and more. Sonar, the makers of SonarQube, the industry standard for au...

Overview

The podcast provides an in-depth look at the scale, engineering, and evolution of AWS S3, highlighting its position as the world's largest cloud storage service. It processes over a quadrillion requests annually, stores hundreds of exabytes of data, and manages more than 500 trillion objects. The discussion covers S3's foundational design based on eventual consistency, its shift toward strong consistency, and the use of formal methods to ensure correctness and reliability.

The podcast also examines S3's expanding role in data lakes and big data applications, including its growing support for structured and vector data through Parquet, Iceberg, and S3 Vectors. Engineering challenges like crash consistency, failure allowances, and replication strategies are explored in detail. Innovations such as S3 Tables, Intelligent-Tiering, and cost management features are highlighted as key advancements. The overall focus is on S3's ability to evolve and adapt, supporting new data formats, integrating with AI and analytics workflows, and maintaining high availability, durability, and performance at scale.

Recent Episodes of The Pragmatic Engineer

27 May 2026 Building OpenCode with Dax Raad

OpenCode's rapid growth to 10 million users highlights challenges like feature overload and AI's limited impact on development speed, while underscoring tensions between innovation, product cohesion, sustainable practices, and the complexities of AI-driven workflows in software engineering.

20 May 2026 Why Rust is different, with Alice Ryhl

Rust prioritizes memory safety and performance through ownership, borrow checking, and `unsafe` blocks, without garbage collection. The episode covers its governance, community-driven tools like Cargo and Tokio, safety features including null safety and exhaustive pattern matching, and ongoing efforts to flatten the learning curve and integrate AI-driven development, as well as how Rust stands out in systems programming compared to TypeScript, JavaScript, and C++.

13 May 2026 TypeScript, C# and Turbo Pascal with Anders Hejlsberg

Anders Hejlsberg's contributions to programming languages such as Turbo Pascal, Delphi, C#, and TypeScript, which shaped design philosophies, developer tools, and .NET, alongside discussions on AI's impact on coding, type systems, and the evolution of language innovation.

29 Apr 2026 Building Pi, and what makes self-modifying software so fascinating

Pi, a minimalist self-modifiable AI coding agent for OpenClaw, examines engineering workflow challenges, ethical concerns, code quality issues, governance of non-expert contributions, and the evolving tension between AI-driven development, open-source ethics, and the enduring role of human expertise in software complexity.

22 Apr 2026 Designing Data-intensive Applications with Martin Kleppmann

The second edition of *Designing Data-Intensive Applications* updates its focus to cloud-native systems, serverless architectures, and data lakes while addressing distributed system challenges, ethical engineering, decentralized software, and emerging trends like AI integration and cryptographic supply chain applications.
