Controlling AI Models from the Inside

Published 20 Jan 2026

Duration: 2635 seconds (43 min 55 s)

This episode covers AI safety: the ongoing problem of AI-generated harm, the limitations of current security measures, and emerging approaches to real-time monitoring and more sophisticated safety protocols.

Episode Description

As generative AI moves into production, traditional guardrails and input/output filters can prove too slow, too expensive, and/or too limited. In this...

Overview

The podcast explores ongoing challenges in AI safety, particularly the risk that AI systems generate harmful or unintended content such as violence, pornography, or dangerous advice. It distinguishes between using AI for security and securing AI systems themselves, and stresses the importance of proactive safety measures that go beyond basic input and output filtering. Current approaches are criticized as reactive: they often rely on post-hoc analysis of outputs and struggle to detect harmful content in complex media such as video and audio.

The discussion highlights emerging solutions that use internal model instrumentation to identify unsafe behavior in real-time, offering a more efficient and scalable alternative. It also addresses the value of interpretability in AI, the need for layered defense strategies, and the potential of edge devices to support safety mechanisms with lower computational requirements. The conversation touches on economic and practical barriers to implementing strong safety measures and the difficulty of tailoring these systems to industry-specific needs, while envisioning a future of more adaptable and context-aware AI security frameworks.

Recent Episodes of Practical AI

25 Mar 2026 AI at the Edge is a different operating environment

Edge AI in 2026 focuses on deploying efficient, task-specific models at data sources for real-time applications such as automation and IoT. It is driven by silicon advances and economic ROI, faces challenges such as latency and privacy, and relies on strategies such as model cascading and hardware-software synergy.

17 Mar 2026 Humility in the Age of Agentic Coding

AI is transforming software development. The episode covers productivity gains from tools such as code generation, challenges in accuracy and reliability, debates over factual limitations and non-deterministic outputs, and ethical concerns around job displacement. It also discusses integrating AI into workflows through projects like Rue, which explore AI-human collaboration and the evolving role of developers.

9 Mar 2026 AI policy and the battle for computing power

AI development is being driven by the private sector, raising concerns about its alignment with democratic principles and sparking a need for international cooperation to establish safety standards.

18 Feb 2026 Cognitive Synthesis and Neural Athletes

Leadership styles need to shift towards empathy and authenticity to drive effectiveness, particularly in hybrid work environments and an increasingly AI-driven world.

13 Feb 2026 AI incidents, audits, and the limits of benchmarks

Experts highlight the need for robust AI safety measures, including developing methods to catalog and prevent AI incidents, and using data and third-party audits to identify and address flaws.
