More Software Engineering Radio episodes

Dave Airlie on Linux Kernel Maintenance

Published 3 Jun 2026

Duration: 01:09:27

The Linux kernel, the largest global software project, uses a hierarchical maintainer system in which 80 to 150 maintainers manage subsystems like DRM through public review, structured development cycles, and evolving practices to address scalability, quality, and integration challenges.

Episode Description

Dave Airlie, a Distinguished Engineer at Red Hat, speaks with host Gregory M. Kapfhammer about Linux kernel maintenance. After over-viewing the scale...

Overview

The Linux kernel is described as the largest software engineering project globally, managed by a dynamic group of 80 to 150 maintainers who oversee specialized subsystems like graphics, networking, storage, and memory management. These subsystems operate under hierarchical maintainership, with top-level figures such as Linus Torvalds delegating authority to intermediate maintainers who coordinate broader teams. A notable focus is the DRM (Direct Rendering Manager) subsystem, which manages GPU support and involves over 300 contributors per release and a complex hierarchy of co-maintainers and developers. Maintainers like Dave Airlie play a critical role in aggregating and reviewing patches and submitting them to Linus Torvalds via weekly pull requests, while relying on public mailing lists for transparency. The process emphasizes collaboration and quality control, with maintainers prioritizing oversight over direct coding.

The kernel's development cycle spans 9 to 10 weeks, including a two-week merge window for new features and a series of release candidates (RC1-RC7) leading to the final release. Regression management is a key challenge, with issues like display failures or performance drops requiring prompt fixes, often through patch reverts or iterative adjustments. Patch management follows strict guidelines, emphasizing concise, self-contained changes and detailed commit messages, while tools like Git, mailing lists, and limited CI systems facilitate collaboration. Challenges include scaling with the DRM subsystem's complexity, handling cross-subsystem conflicts, and balancing vendor-specific code sharing. Additionally, the lack of centralized CI workflows and reliance on email-based communication present ongoing hurdles for maintaining stability and efficiency.
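The cycle arithmetic above can be sketched in a few lines of Python. This is only an illustration: it assumes release candidates land one week apart, that RC1 closes the two-week merge window, and that the final release follows the last RC by a week. The function name `cycle_schedule` and the start date are hypothetical.

```python
from datetime import date, timedelta


def cycle_schedule(merge_window_opens: date, last_rc: int = 7) -> list[tuple[str, date]]:
    """Return (milestone, date) pairs for one development cycle.

    Assumptions (not from any official tool): weekly release candidates,
    rc1 tagged when the two-week merge window closes.
    """
    week = timedelta(weeks=1)
    rc1 = merge_window_opens + 2 * week  # merge window lasts two weeks
    schedule = [("merge window opens", merge_window_opens)]
    for n in range(1, last_rc + 1):
        schedule.append((f"rc{n}", rc1 + (n - 1) * week))
    # The final release follows the last release candidate by one week.
    schedule.append(("final release", rc1 + last_rc * week))
    return schedule


start = date(2026, 1, 4)  # hypothetical start date, for illustration only
for name, day in cycle_schedule(start):
    print(f"{day}  {name}")
```

With seven release candidates the cycle comes to nine weeks; passing `last_rc=8` models a cycle that needs one extra candidate and comes to ten.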

Kernel maintenance also grapples with evolving technologies, such as the integration of Rust for safer code practices and the experimental use of AI for code review and regression detection. However, these innovations remain in early stages, with the community prioritizing structured, community-driven processes over rapid experimentation. Long-term stability is ensured through LTS (Long-Term Support) kernels, which receive security and bug fixes without major updates, though subsystems like graphics exhibit inconsistent support. Despite procedural rigor, the development process retains the appeal of hardware-software interaction, encouraging contributors to address niche issues or join specialized communities like GPU driver maintainers.

What If

  • What if you scaled a solo kernel maintenance workflow for a complex subsystem like DRM?

    • Move: Adopt a hierarchical maintainer model with sub-team co-maintainers and automate patch aggregation using tools like B4 or lei to manage contributor flows.
    • Why Now?: The DRM subsystem has 300+ contributors and 2,000 to 3,000 changes per cycle, requiring structured scalability to prevent burnout and regression.
    • Expected Upside: Streamlined patch triage, reduced manual effort, and improved contributor alignment through clear delegation.
  • What if you integrated AI into your patch review process to catch regressions faster?

    • Move: Test AI tools (e.g., Claude, Gemini) on historical DRM patches to identify pattern-based regressions and validate patch correctness before submission.
    • Why Now?: The Linux kernel community is experimenting with AI for regression detection, and early tests show 50% false positive rates but potential for reducing costly post-release bugs.
    • Expected Upside: Faster feedback loops, reduced manual review burden, and earlier detection of subtle regressions in performance-critical code.
  • What if you developed a Rust-based kernel driver module to reduce memory management errors?

    • Move: Prototype a GPU or graphics driver in Rust using the graphics subsystem's community resources, focusing on safety-critical areas like DMA buffer handling.
    • Why Now?: The Linux kernel community is exploring Rust for safety, with a growing younger contributor base and safety features that could mitigate common C-related bugs (e.g., lifetime management).
    • Expected Upside: Safer, more maintainable code with fewer runtime errors, and alignment with future kernel development trends like Rust integration.

Takeaway

  • Adopt Git and Mailing List Workflows for Patch Submission: Use Git for version control and tools like B4 or lei to streamline patch submission via Linux kernel mailing lists. Ensure patches are concise (under 100 lines), self-contained, and include detailed commit messages to align with community standards.
  • Engage in Subsystem-Specific Communities: Focus on joining niche mailing lists or platforms like IRC/Discord for subsystems (e.g., DRM, GPU drivers) rather than general kernel lists. This increases visibility and collaboration with maintainers and contributors working on similar challenges.
  • Align Development with Kernel Release Cycles: Plan contributions to align with the 9-to-10-week kernel cycle, submitting changes to the linux-next tree by RC6 to avoid deferral. Prioritize completing work before the merge window and avoid post-merge-window feature changes to maintain stability.
  • Avoid Cross-Platform Code Sharing Pitfalls: Refrain from duplicating code between operating systems (e.g., Windows/Linux drivers) to prevent "second-class driver" issues. Instead, focus on upstreaming work to the Linux kernel to ensure compatibility and maintainability.
  • Leverage Patch Series Best Practices: Structure contributions as patch series with a clear cover letter, including testing methods, fuzzing results, and rationale for changes. Use curated task lists (e.g., from subsystem maintainers) to identify low-effort fixes and gradually build credibility with the community.
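The size guideline in the first takeaway (patches under 100 lines) can be checked mechanically before sending. The following is a minimal sketch, assuming the patch is in the text format produced by `git format-patch`. The helper names `changed_lines` and `is_concise` and the sample patch are hypothetical, and counting added plus removed lines is one possible reading of "under 100 lines".

```python
def changed_lines(patch_text: str) -> int:
    """Count added and removed lines inside the diff hunks of a patch."""
    count = 0
    in_hunk = False
    for line in patch_text.splitlines():
        if line.rstrip() == "--":          # signature separator: diff is over
            break
        if line.startswith("diff --git"):
            in_hunk = False                # file headers follow, not content
        elif line.startswith("@@"):
            in_hunk = True                 # hunk header: content lines follow
        elif in_hunk and line[:1] in ("+", "-"):
            count += 1
    return count


def is_concise(patch_text: str, limit: int = 100) -> bool:
    """True when the patch changes fewer than `limit` lines."""
    return changed_lines(patch_text) < limit


# Hypothetical patch, for illustration only.
SAMPLE = """\
Subject: [PATCH] example: fix off-by-one in hypothetical helper

Explain what was wrong and why this change fixes it.
---
 demo.c | 2 +-
diff --git a/demo.c b/demo.c
--- a/demo.c
+++ b/demo.c
@@ -1,3 +1,3 @@
 int last(int n) {
-    return n;
+    return n - 1;
 }
--
2.43.0
"""

print(changed_lines(SAMPLE), is_concise(SAMPLE))
```

A check like this only covers size. Whether a patch is self-contained and whether its commit message is detailed enough still need a human reviewer.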

Recent Episodes of Software Engineering Radio

27 May 2026 Dwayne McDaniel on the Engineering Challenges of Secrets Management

Managing secrets like credentials and API keys in software development risks leaks that cause supply chain attacks (e.g., PyPI, Clot, Cisco) due to secrets sprawl, plaintext storage, and misuse, prompting solutions like time-bound credentials, decentralized systems, vault tools (e.g., HashiCorp Vault), and strategies such as credential rotation and encrypted storage, amid over 28.65 million hard-coded secrets in GitHub in 2025.

20 May 2026 Rob Moffat on Risk-First Software Development

Recommended: Risk identification and management is a forgotten art

Software development prioritizes risk management through frameworks like test-driven development and agile, addressing hidden risks, AI deployment challenges, open-source dependencies, and organizational prioritization to balance innovation with safeguards.

13 May 2026 SE Radio 720: Martin Dilger on Understanding Eventsourcing

Recommended: Useful Architectural Pattern.

Event sourcing is a system design approach that records changes as sequential events to ensure historical traceability, uses event modeling for aligning systems with human workflows, contrasts with CRUD architectures, and emphasizes slice-based design, event streams, and practical applications like legacy modernization and workflow simplification.

6 May 2026 Birol Yildiz on Building an Agentic AI SRE

AI agents in SRE leverage autonomous decision-making, agentic search, and lightweight architectures to replace static runbooks, balancing autonomy with reliability challenges, context management, and human oversight in dynamic environments.

29 Apr 2026 Will Sentance on JS Modernization

JavaScript's evolution from a 1995 scripting language to a performance-optimized modern tool balances innovation with backward compatibility through TC39's incremental updates, browser advancements, community-driven libraries, key features like async/await and symbols, engine optimizations, and a design philosophy prioritizing flexibility and user-driven standardization for large-scale frameworks.
