More Latent Space episodes

Artificial Analysis: The Independent LLM Analysis House with George Cameron and Micah Hill-Smith

Published 9 Jan 2026

Duration: 1:18:14 (4694 seconds)

Artificial Analysis, an independent AI benchmarking platform, provides standardized evaluations and reports to address the lack of comprehensive and impartial AI benchmarks.

Episode Description

Don't miss George's AIE talk: https://www.youtube.com/watch?v=sRpqPgKeXNk

From launching a side project in a Sydney basement to becoming the independent...

Overview

The podcast outlines the development of Artificial Analysis, an independent benchmarking platform launched in January 2024 to evaluate AI models and hosting providers. Initially a side project, it has evolved into a service offering public and private benchmarking, standardized reports, and custom evaluations for enterprises and AI companies. The platform aims to address the lack of comprehensive and impartial AI benchmarks, focusing on challenges such as model accuracy, cost, and performance in AI development. It operates through a free website, generating revenue from enterprise subscriptions and private evaluations.

The discussion touches on the evolution of benchmarking methodologies, the difficulties in evaluating AI models, and the introduction of new metrics like the Omniscience Index. It also highlights the increasing importance of measuring hallucination rates and model openness. Other topics include trends in AI model costs and performance, the emergence of agentic workflows, and the growing complexity and diversity of the AI ecosystem.

Recent Episodes of Latent Space

20 Mar 2026: Dreamer: the Personal Agent OS, with David Singleton

Dreamer is an AI platform democratizing access to agentic tools for non-technical users via customizable AI assistants, community-built apps, cross-device integration, and privacy-focused features, with a beta emphasis on accessibility, real-world productivity use cases, and third-party developer opportunities.
