More The TWIML AI Podcast episodes

How Capital One Delivers Multi-Agent Systems with Rashmi Shetty

Published 16 Apr 2026

Duration: 00:54:52

Capital One's *Chat Concierge* multi-agentic AI system streamlines car-buying through self-reflection, real-time APIs, and LLM-driven workflows, addressing enterprise AI challenges like governance, scalability, and legacy system integration while prioritizing compliance, observability, and flexible platform adoption.

Episode Description

In this episode, Rashmi Shetty, senior director of enterprise generative AI platform at Capital One, joins us to explore how the company is designing,...

Overview

Capital One's Chat Concierge is a multi-agentic AI system designed to streamline the car-buying process. It combines self-reflection, layered reasoning, and real-time API checks to assist with tasks such as vehicle matching, test-drive scheduling, financing approval, and trade-in valuation. The system is an example of enterprise-grade agentic AI and reflects Capital One's shift, since 2023, from traditional machine learning to large language models (LLMs) and generative AI. The multi-agentic approach handles complex, multifaceted problems by decomposing them into tasks for specialized agents, which allows autonomous decision-making while staying within governance, compliance, and risk management frameworks. The architecture emphasizes scalability, operationalization, and integration with existing data pipelines, building on cloud-based governance and AutoML advancements.

Key challenges in developing agentic systems include managing multi-layered complexity, treating latency and performance as critical product features, and aligning with legacy infrastructure. Observability is central to these systems: it requires real-time monitoring of agent behavior, reasoning, and feedback loops, along with replay analysis to trace decision pathways. The platform prioritizes developer tools that balance speed, safety, and governance, offering SDKs and frameworks that accelerate deployment while maintaining compliance, risk control, and integration with enterprise data. Capital One's strategy underscores the importance of building on existing data governance foundations, enabling rapid iteration from experimentation to production, and fostering collaboration between developers and compliance teams so that agentic AI meets regulatory and operational needs.

Recent Episodes of The TWIML AI Podcast

21 May 2026 Relational Foundation Models for Enterprise Data with Jure Leskovec

Relational foundation models and graph-based machine learning, like GNNs, enable accurate predictions on structured data across biomedical research and industries by capturing complex relationships, integrating multi-scale data, and overcoming traditional limitations through automated feature extraction and hybrid modeling.

7 May 2026 How to Find the Agent Failures Your Evals Miss with Scott Clark

Distributional employs post-production analytics, unsupervised learning, and LLMs to analyze agent traces, detect patterns and anti-patterns like hallucinations, address distributional shifts, and generate actionable insights for AI system refinement in security and enterprise settings, emphasizing adaptive analytics and domain expertise.

30 Apr 2026 How to Engineer AI Inference Systems with Philip Kiely

AI inference deployment is accelerating, emphasizing inference engineering's critical role in optimizing generative models with advanced hardware and complex systems, while addressing challenges like latency, scalability, and modality-specific optimizations amid evolving industry trends and fragmented yet open-source-driven markets.

26 Mar 2026 The Race to Production-Grade Diffusion LLMs with Stefano Ermon

The episode traces generative models' evolution from early image generation to diffusion models' stability, highlights Mercury II's advancements in speed and efficiency, and addresses ongoing challenges in scalability, multimodal integration, and future research in controllability and cross-modal unification.
