The podcast discusses tools and workflows that use AI agents to streamline software development and testing. Techniques such as pixel diff analysis and error zoom narrow the scope of analysis to specific failures, while automated reporting summarizes the changes that matter. In visual regression testing, agents produce annotated feature walkthroughs that sub-agents then verify against the intended changes. Agents are also trained to handle QA, documentation, and validation autonomously by communicating clearly and organizing their work. Automated regression tools such as Playwright capture screenshots and detect discrepancies, and diff reports flag unintended changes for human review. Challenges include the risk of AI deception, where an agent fabricates results, and the need for manual intervention when errors arise from issues such as incorrect dimensions. Practical applications include using AI to create animated content, though existing video-creation tools often require fixes after upload.
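As a concrete illustration of this workflow, here is a minimal sketch of a Playwright screenshot-comparison test; the URL and diff threshold are placeholder assumptions, not details from the episode.

```ts
// visual-regression.spec.ts — minimal sketch of screenshot-based
// regression testing with Playwright. The URL and threshold below
// are hypothetical examples.
import { test, expect } from '@playwright/test';

test('dashboard renders without visual regressions', async ({ page }) => {
  await page.goto('https://example.com/dashboard'); // hypothetical app URL

  // Wait for network activity to settle so the screenshot is deterministic.
  await page.waitForLoadState('networkidle');

  // Compare against the checked-in baseline; the first run records it.
  // maxDiffPixelRatio tolerates minor anti-aliasing noise while still
  // failing on real layout or rendering changes.
  await expect(page).toHaveScreenshot('dashboard.png', {
    fullPage: true,
    maxDiffPixelRatio: 0.01,
  });
});
```

On failure, Playwright writes an actual/expected/diff image triple, which is the kind of artifact a diff report can surface for human review.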
The evolution of AI models underscores the need for adaptive human-AI collaboration: humans define validation criteria and success metrics, and agents handle tasks such as code testing and documentation. Rapid advances, such as Anthropic's Opus 4.6, quickly render older methods obsolete, producing a fragmented landscape in which developers experiment with tools independently. Agents are progressing from autocomplete and pair-programming assistance to autonomous handling of complex workflows, shifting the human role from micromanaging a single agent to overseeing teams of them. Agents can resolve syntactic and semantic merge conflicts, optimize code through refactoring, and work on several tasks in parallel, reducing human workload. Tools like Broomy facilitate agent collaboration by automating PR creation, documentation, and conflict resolution, while systems like Rumi provide visual oversight of parallel workflows. Despite these efficiencies, balancing automation with human oversight remains difficult, since agents may need iterative refinement and can run into resource constraints or inefficiencies.
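To make the parallel-processing idea concrete, the sketch below fans tasks out to a bounded pool of agent sessions. It is an illustration only: runAgent, the task shape, and the concurrency limit are hypothetical, not Broomy's or Rumi's actual interfaces.

```ts
// parallel-agents.ts — illustrative sketch of running multiple agent
// tasks concurrently with a concurrency cap. All names are hypothetical.
type AgentTask = { id: string; prompt: string };
type AgentResult = { id: string; output: string };

// Stand-in for a single agent session; in practice this would call
// an agent CLI or API.
async function runAgent(task: AgentTask): Promise<AgentResult> {
  return { id: task.id, output: `done: ${task.prompt}` };
}

// Run tasks with at most `limit` agents in flight, so parallelism cuts
// wall-clock time without exhausting local resources.
async function runInParallel(
  tasks: AgentTask[],
  limit: number,
): Promise<AgentResult[]> {
  const results: AgentResult[] = [];
  const queue = [...tasks];

  async function worker(): Promise<void> {
    while (queue.length > 0) {
      const task = queue.shift();
      if (!task) return;
      results.push(await runAgent(task));
    }
  }

  // Spawn `limit` workers that drain the shared queue.
  await Promise.all(Array.from({ length: limit }, () => worker()));
  return results;
}

runInParallel(
  [
    { id: 'tests', prompt: 'fix failing unit tests' },
    { id: 'docs', prompt: 'update API documentation' },
    { id: 'lint', prompt: 'resolve lint warnings' },
  ],
  2,
).then((r) => console.log(r));
```

The bounded queue is the key design choice: it captures the episode's point that parallel agents save time but still hit resource constraints if left uncapped.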
Efforts to optimize agent workflows emphasize avoiding redundant work, such as repetitive testing, and building specialized tools to automate tedious processes. Agents are encouraged to identify bottlenecks and suggest improvements to tools or workflows, while human guidance focuses on defining problems and refining validation. Code quality is maintained through agent-authored lint rules, exhaustive unit tests, and structured documentation, which keep the codebase consistent and legible. Longer-term goals include cloud-based scalability and better visibility tools for managing parallel agent sessions. As AI capabilities expand, the human role is shifting toward identifying solvable problems and providing strategic guidance, with adaptability and problem-solving becoming critical skills amid rapid technological change.
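As one example of an agent-authored quality gate, the following is a minimal custom ESLint rule of the kind an agent might generate to enforce a team convention; the rule name, regex, and message are assumptions, not details from the episode.

```ts
// no-bare-todo.ts — illustrative agent-generated ESLint rule requiring
// TODO comments to reference a ticket, e.g. TODO(ABC-123). Hypothetical
// convention, shown only as an example of an agent-driven lint rule.
import type { Rule } from 'eslint';

const rule: Rule.RuleModule = {
  meta: {
    type: 'suggestion',
    docs: { description: 'Require TODO comments to reference a ticket' },
    schema: [],
  },
  create(context) {
    const sourceCode = context.getSourceCode();
    return {
      Program() {
        for (const comment of sourceCode.getAllComments()) {
          // Flag TODOs not immediately followed by "(", i.e. lacking
          // a ticket reference like TODO(ABC-123).
          if (/\bTODO\b(?!\()/i.test(comment.value)) {
            context.report({
              loc: comment.loc!,
              message:
                'TODO comments must reference a ticket, e.g. TODO(ABC-123).',
            });
          }
        }
      },
    };
  },
};

export default rule;
```

Rules like this pair naturally with the exhaustive unit tests the episode mentions: the agent encodes a convention once, and CI enforces it consistently thereafter.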