The podcast provides an in-depth look at the scale, engineering, and evolution of AWS S3, highlighting its position as the world's largest cloud storage service. It processes over a quadrillion requests annually, stores hundreds of exabytes of data, and manages more than 500 trillion objects. The discussion covers S3's foundational design based on eventual consistency, its shift toward strong consistency, and the use of formal methods to ensure correctness and reliability.
The podcast also examines S3's expanding role in data lakes and big data applications, including its growing support for structured and vector data formats such as Parquet, Iceberg, and S3 Vectors. Engineering challenges are explored in detail, including crash consistency, designing within failure tolerances, and replication strategies. Innovations such as S3 Tables, intelligent tiering, and cost management features are highlighted as key advancements. The overall focus is on S3's ability to evolve and adapt: supporting new data formats, integrating with AI and analytics workflows, and maintaining high availability, durability, and performance at scale.
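As a concrete illustration of the data lake usage pattern discussed above, here is a minimal sketch of how Parquet objects are commonly laid out in S3 using hive-style partitioned keys, the convention that table formats and query engines built on S3 expect. The bucket, table, and helper names here are hypothetical examples, not anything from the podcast:

```python
from datetime import date

def partitioned_key(table: str, day: date, part: int) -> str:
    """Build a hive-style partitioned S3 key for a Parquet file,
    e.g. 'events/year=2024/month=01/day=15/part-00003.parquet'."""
    return (
        f"{table}/year={day.year}/month={day.month:02d}/"
        f"day={day.day:02d}/part-{part:05d}.parquet"
    )

key = partitioned_key("events", date(2024, 1, 15), 3)
print(key)  # events/year=2024/month=01/day=15/part-00003.parquet

# Uploading the object would then use the standard boto3 call
# (requires AWS credentials; bucket name is hypothetical):
# import boto3
# boto3.client("s3").put_object(Bucket="my-data-lake", Key=key, Body=parquet_bytes)
```

Partitioning keys this way lets engines prune entire prefixes when filtering by date, which is one reason flat object storage like S3 works well as a data lake substrate.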