More MLOps.community episodes

Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs

Published 24 Feb 2026

Duration: 01:25:49

The podcast dives into AI development, software engineering, and GPU innovation, focusing on efficient workloads and the trade-offs between quick solutions and scalable systems.

Episode Description

March 3rd, Computer History Museum CODING AGENTS CONFERENCE — come join us while there are still tickets left: https://luma.com/codingagents Chris Fregly...

Overview

The podcast covers multiple topics related to product innovation, software development, and engineering practices, focusing on trends and challenges in AI and machine learning. It examines SageMaker HyperPod clusters, which use pre-warmed GPUs to improve efficiency for AI workloads, and discusses the rise of "throwaway" applications designed for specific, short-term needs without long-term maintenance. The conversation also addresses the trade-off between quick, functional solutions and scalable, robust systems, emphasizing the role of software engineers in developing reliable production-grade applications. Additionally, the episode explores the use of AI tools in code generation and debugging, including a feature called the "playground skill" for visualizing code flow.

The discussion extends to issues with GPU hardware and limitations in AI infrastructure, highlighting the need for better documentation and transparency. It also touches on the growing interest in optimizing AI models for specific applications and newer hardware such as NVIDIA's Blackwell. The guest reflects on writing a book about co-design principles that integrate hardware, software, and algorithms, and underscores the importance of open-source tools and community collaboration in advancing AI development and deployment.

Recent Episodes of MLOps.community

31 Mar 2026 This One Shift Makes Developers Obsolete

Processing live-stream data involves transcription, AI-driven skill categorization, GitHub organization, multimedia-comment correlation, and knowledge graphs. The episode also tackles redundancy, AI costs, and MLOps trends, along with AI agent debates, adversarial workflows, security risks, and tooling like Open Claw and Agent Zero.

30 Mar 2026 Operationalizing AI Agents: From Experimentation to Production // Databricks Roundtable

Deploying AI agents in real-world systems demands robust safety protocols, human oversight, and structured testing to address risks like errors and vulnerabilities. The roundtable balances innovation with responsibility through observability, governance, domain expertise, and tools like MLflow, across use cases from workflow automation to critical system reliability.

27 Mar 2026 arrowspace: Vector Spaces and Graph Wiring

Epiplexity introduces a framework that redefines entropy and complexity in terms of structural information. Topological search and graph-based methods enhance semantic accuracy in machine learning by preserving data structure through high-dimensional embeddings and hybrid geometric-topological analysis, outperforming traditional approaches in retrieval and reasoning tasks.

20 Mar 2026 Agentic Marketplace

AI-driven agent systems in OLX's classifieds marketplace aim to improve user experiences by overcoming UI constraints through dynamic intent extraction, hybrid chat/UI models, and trust-building in real estate and motors. Future work focuses on logistics automation, secure transactions, and human-agent integration.

17 Mar 2026 Durable Execution and Modern Distributed Systems

Temporal enhances developer productivity by enabling crash-proof workflows through a deterministic programming model that separates business logic from fault tolerance. It simplifies distributed systems with durable execution, workflows, activities, and persistence layers such as Cassandra or Postgres.
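The core idea behind durable execution — record each activity's result in a persisted history, then deterministically replay the workflow function after a crash so completed steps are not re-executed — can be illustrated with a small toy sketch. This is a conceptual illustration only, not Temporal's actual SDK or API; the `DurableContext` class, `run_activity` method, and `order_workflow` function are all hypothetical names invented for this example.

```python
class DurableContext:
    """Toy sketch of replay-based durable execution.

    Activity results are appended to a history list (standing in for a
    persistence layer like Cassandra or Postgres). After a restart, the
    same workflow function re-runs deterministically, and any result
    already in the history is replayed instead of re-executed.
    """

    def __init__(self, history=None):
        self.history = list(history or [])  # persisted activity results, in order
        self.cursor = 0                     # position during the current (re)run

    def run_activity(self, fn, *args):
        if self.cursor < len(self.history):
            result = self.history[self.cursor]  # replay: skip the side effect
        else:
            result = fn(*args)                  # first execution: run and record
            self.history.append(result)
        self.cursor += 1
        return result


# Business logic stays a plain function; fault tolerance lives in the context.
def order_workflow(ctx):
    total = ctx.run_activity(lambda: 3 * 7)                 # e.g. a price lookup
    receipt = ctx.run_activity(lambda t: f"charged {t}", total)
    return receipt


first_run = DurableContext()
print(order_workflow(first_run))                 # activities actually execute

# Simulate a crash and restart: rebuild the context from persisted history.
# The workflow code is identical; recorded steps are replayed, not re-run.
restarted = DurableContext(history=first_run.history)
print(order_workflow(restarted))
```

The point of the sketch is the separation Temporal's model relies on: the workflow function must be deterministic so that replaying it against the stored history reconstructs exactly the state it had before the crash, while non-deterministic or side-effecting work is pushed into activities whose results are persisted.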