The Age of Async Agents - Cognition's Walden Yan & OpenInspect's Cole Murray

Published 28 May 2026

Duration: 01:08:02

The evolution of AI agent development shifts toward autonomous workflows via tools like Devin for code generation and OpenInspect for cloud management, addressing growth, infrastructure challenges, security, scalability, enterprise adoption, open-source initiatives, diverse non-engineering use cases, and the role of human oversight in AI-native coding.

Episode Description

The new AIEWF website is live! CFPs close in 2 days and we will run our first New Engineer Orientation this weekend, get your tickets booked ASAP as t...

Overview

The podcast explores the evolution of AI agent development, emphasizing a shift from manual model management to autonomous agent-driven workflows. Key advancements include tools like Devin (for autonomous code generation) and OpenInspect (for cloud agent management), alongside improved models such as Sonnet 3.7 and GPT 5.2, which enable agents to execute tasks like pull request generation with minimal human input. Growth metrics highlight a surge in Devin's usage (from 16% to 80% of PR contributions) and increased interest in cloud agent tools. The episode also covers the challenges of scaling agent infrastructure, such as the limitations of cloud VMs and the move toward custom solutions. Architectural trade-offs between in-box and out-of-box agent systems are discussed, with a preference for separating agent logic from execution environments to balance security and flexibility.

The discussion also addresses infrastructure and deployment complexities, including the preference for VMs over Docker for agent execution and the need for sandbox strategies to ensure consistent testing environments. Memory systems for agents remain a challenge, with efforts to refine auto-generated memory management and align it with file system-like navigation for better scalability. Enterprise adoption highlights the role of companies like Cognition in onboarding teams, though challenges persist, such as AI literacy gaps and alignment with existing workflows. Open-source projects like OpenDevin and OpenInspect are explored as alternatives to proprietary solutions, while debates around monetization and the gray area of agent systems between infrastructure and service offerings are raised. Finally, the podcast touches on broader challenges, including code quality in AI-generated workflows, the risk of reward hacking, and the evolving role of AI in non-engineering tasks like competitor research and SRE triage.

What If

What if you shifted your entire development workflow to cloud-native agent systems like OpenInspect, bypassing local environments entirely?
- Move: Migrate all development tasks (PR generation, testing, debugging) to OpenInspect or similar platforms, using pre-configured sandboxes with automated snapshot restoration.
- Why Now?: The surge in Devin's PR contributions (80% in March) and reduced reliability of local development environments (IDE weeds) demonstrate a clear trend toward cloud-first workflows.
- Expected Upside: Faster onboarding, consistent environments across tasks, and reduced dependency on local infrastructure, enabling parallel development of multiple projects.

What if you built a hybrid agent architecture that combines out-of-box security with in-box simplicity for solo operations?
- Move: Use OpenDevin as the "brain" (controller) and isolated environments (e.g., Docker containers, Firecracker VMs) as the "hands" (sandbox), keeping secrets and state out of the central agent.
- Why Now?: Enterprise adoption challenges (e.g., security concerns with in-box agents) and the rise of custom infra like blockdiff file systems show demand for secure yet scalable agent setups.
- Expected Upside: Mitigate security risks while maintaining simplicity, enabling safe experimentation with autonomous code generation and testing without compromising development speed.

What if you implemented a file-system-like memory system for your agent to track priorities and recurring tasks in real time?
- Move: Create a custom "memory.md" file structure for your agent, using markdown files to document project-specific tasks, priorities, and user preferences (e.g., draft vs. open PR status).
- Why Now?: Devin's struggles with auto-generated memory overload (95% from user interactions) and rigid behaviors (e.g., "open as a draft PR") highlight the need for user-editable, structured memory.
- Expected Upside: Improved task contextualization for agents, allowing them to better align with your priorities and reduce errors from misinterpreted or outdated memory entries.
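
The file-based memory idea in the last scenario can be sketched in a few lines. This is only an illustration under stated assumptions, not how Devin or OpenInspect store memory: the one-file-per-topic layout, the `MemoryEntry` type, the "auto"/"user" source labels, and the function names are all hypothetical.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class MemoryEntry:
    topic: str   # becomes the file name, e.g. "pull-requests" -> pull-requests.md
    text: str    # the rule or preference the agent should remember
    source: str  # hypothetical label: "auto" (agent-generated) or "user"


def write_memory(root: Path, entry: MemoryEntry) -> Path:
    """Append one entry to <root>/<topic>.md as a markdown bullet."""
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{entry.topic}.md"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- [{entry.source}] {entry.text}\n")
    return path


def read_memory(root: Path, topic: str) -> list[str]:
    """Return the bullet entries for a topic, without the leading '- '."""
    path = root / f"{topic}.md"
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


def replace_rule(root: Path, topic: str, old: str, new: str) -> int:
    """Rewrite every line containing `old`; return how many lines changed."""
    path = root / f"{topic}.md"
    if not path.exists():
        return 0
    lines = path.read_text(encoding="utf-8").splitlines()
    changed = 0
    for i, line in enumerate(lines):
        if old in line:
            lines[i] = line.replace(old, new)
            changed += 1
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return changed


def demo() -> list[str]:
    """Record two memories, then correct an overly rigid auto-generated one."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "memory"
        write_memory(root, MemoryEntry("pull-requests", "Open every PR as a draft", "auto"))
        write_memory(root, MemoryEntry("pull-requests", "Tag a reviewer on each PR", "user"))
        replace_rule(root, "pull-requests", "as a draft", "as ready for review")
        return read_memory(root, "pull-requests")
```

Because the memory lives in plain markdown files, a person can audit or edit it with any text editor, which is the point of the "user-editable, structured memory" the scenario calls for.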

Takeaway

Adopt cloud-native agent tools like Devin for PR automation: Leverage tools like Devin to automate pull request generation from specifications. The usage metrics cited in the episode show Devin's share of PR contributions rising from 16% to 80%. Focus on integrating these tools into your workflow to minimize repetitive tasks.
Use OpenInspect for GitHub code review with manual oversight: Implement OpenInspect to review code and handle alerts, but recognize that it requires manual tagging for actions like resolving merge conflicts. Ensure you manually vet and trigger critical tasks to maintain control.
Optimize environment setup with existing dev infrastructure: Avoid redundant setups by reusing existing development environments (e.g., dev boxes) and enforcing scoped secrets per machine. This reduces security risks and streamlines agent sandboxing.
Tune agent memory systems to avoid rigid behaviors: Regularly audit and adjust auto-generated memories (e.g., changing "draft PRs" to "open PRs") to prevent agents from developing unintended, inflexible rules based on over-reliance on specific details.
Prioritize environment consistency with Docker/VMs: Use Docker where it keeps agent infrastructure aligned with existing developer workflows, but reserve full VMs for agent execution when required (e.g., for OS-specific tasks like iOS development). Pre-snapshotted sandboxes can speed up setup and testing.
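
The "scoped secrets per machine" takeaway, combined with the brain/hands split, can be illustrated with a small in-process sketch. It is not the architecture of any product mentioned in the episode: the `Sandbox` and `Controller` classes, the sandbox names, and the secret keys are hypothetical, and a real sandbox would be a separate VM or container reached over the network rather than a Python object.

```python
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    """The 'hands': one execution environment with its own scoped secrets."""
    name: str
    secrets: dict[str, str] = field(default_factory=dict)

    def run(self, command: str, needs: tuple[str, ...] = ()) -> str:
        # Refuse any command that needs a secret this machine was not given.
        missing = [key for key in needs if key not in self.secrets]
        if missing:
            raise PermissionError(f"{self.name} lacks secrets: {missing}")
        # Placeholder for real execution inside a VM or container.
        return f"[{self.name}] ran {command!r} with {sorted(needs)}"


class Controller:
    """The 'brain': decides what to run and where, but passes only secret
    names to the sandbox, never secret values."""

    def __init__(self, sandboxes: list[Sandbox]) -> None:
        self._sandboxes = {s.name: s for s in sandboxes}

    def dispatch(self, sandbox_name: str, command: str,
                 needs: tuple[str, ...] = ()) -> str:
        return self._sandboxes[sandbox_name].run(command, needs)
```

A usage example under the same assumptions: a controller with a `web-dev` sandbox holding only `NPM_TOKEN` can dispatch `npm publish` there, while a request to run a command that needs `APPLE_KEY` on that machine raises `PermissionError`, so a compromised or confused task cannot reach credentials scoped to another machine.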
