AI Explained Official Podcast

Episodes

OpenAI Backtracks on Superintelligence + Altman Brings His Timeline Forward

Jan 8 2025

Sam Altman unexpectedly brings his timelines to AGI forward, while OpenAI backtrack on superintelligence. None of these changes were heralded, but they are significant. Plus the new year brings new assessments of the true capability of models to automate 'large swathes of the economy'. I'll give my prediction on that front for 2025, announce a new Simple Bench competition, and showcase Kling 1.6 vs Veo 2 vs Sora, and much more.

wandb.me/simple-bench
(Colab): https://colab.research.google.com/drive/1AVijcPnEkl8Gy_754XbRdG5m7Q5-9slg?usp=sharing

TheAgentCompany Paper: https://arxiv.org/pdf/2412.14161v1
Sam Altman Major Interview: https://www.bloomberg.com/features/2025-sam-altman-interview/?srnd=phx-ai
OpenAI Agent Coming Jan 2025: https://www.theinformation.com/articles/why-openai-is-taking-so-long-to-launch-agents?rc=sy0ihq
Altman Singularity: https://x.com/sama/status/1875603249472139576
Altman Original Timeline: https://www.youtube.com/watch?v=7dCPytNTnjk&t=621s
https://www.ft.com/content/34a7a082-e685-4e02-bca7-61ff89d99ed2
OpenAI Original Emails: https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-email-archives-from-musk-v-altman-and-openai-blog
DeepMind Sky News 2014 Article: https://news.sky.com/story/google-buys-uk-intelligence-firm-deepmind-10419783
Altman Blog Reflections: https://blog.samaltman.com/reflections
OpenAI Changes Who Gets AGI: https://openai.com/index/why-our-structure-must-evolve-to-advance-our-mission/?s=09
OpenAI 5 Levels: https://www.bloomberg.com/news/articles/2024-07-11/openai-sets-levels-to-track-progress-toward-superintelligent-ai
Altman 2015: https://blog.samaltman.com/machine-intelligence-part-1
OpenAI React to Anthropic: https://www.theinformation.com/articles/how-anthropic-got-inside-openais-head?rc=sy0ihq
Microsoft $100B Definition: https://www.theinformation.com/articles/microsoft-and-openai-wrangle-over-terms-of-their-blockbuster-partnership?rc=sy0ihq
Epoch Scramble for Task Benchmark: https://x.com/tamaybes/status/1876692639363612919
GPQA Progress: https://epoch.ai/data/ai-benchmarking-dashboard
Task Length Crucial for ARC-AGI: https://anokas.substack.com/p/llms-struggle-with-perception-not-reasoning-arcagi
RL Environment Tweet: https://x.com/vedantmisra/status/1876327518157807990
Jason Wei Talk: https://www.youtube.com/watch?v=yhpjpNXJDco
Miles Brunda

24 mins

o3 - wow

Dec 21 2024

o3 isn’t one of the biggest developments in AI for 2+ years because it beats a particular benchmark. It is so because it demonstrates a reusable technique through which almost any benchmark could fall, and at short notice. I’ll cover all the highlights, benchmarks broken, and what comes next. Plus, the costs OpenAI didn’t want us to know, Genesis, ARC-AGI 2, Gemini-Thinking, and much more.

FrontierMath: https://epoch.ai/frontiermath
https://arxiv.org/pdf/2411.04872
Chollet Statement: https://arcprize.org/blog/oai-o3-pub-breakthrough
MLC Paper: https://www.scientificamerican.com/article/new-training-method-helps-ai-generalize-like-people-do/?utm_campaign=socialflow&utm_source=twitter&utm_medium=social
AlphaCode 2: https://storage.googleapis.com/deepmind-media/AlphaCode2/AlphaCode2_Tech_Report.pdf
Human Performance on ARC-AGI: https://arxiv.org/pdf/2409.01374v1
Wei Tweet ‘3 months’: https://x.com/_jasonwei/status/1870184982007644614
Deliberative Alignment Paper: https://openai.com/index/deliberative-alignment/
Brown Safety Tweet: https://x.com/polynoamial/status/1870196476908834893
Swe-Bench Verified: https://openai.com/index/introducing-swe-bench-verified/
Amodei Prediction: https://x.com/OfirPress/status/1858567863788769518
David Dohan: 16 hours https://x.com/dmdohan/status/1870171404093796638
OpenAI Personal Writing: https://openai.com/index/learning-to-reason-with-llms/
https://simple-bench.com/
John Hallman Tweet: https://x.com/johnohallman/status/1870233375681945725

00:00 - Introduction
01:19 - What is o3?
03:18 - FrontierMath
05:15 - o4, o5
06:03 - GPQA
06:24 - Coding, Codeforces + SWE-verified, AlphaCode 2
08:13 - 1st Caveat
09:03 - Compositionality?
10:16 - SimpleBench?
13:11 - ARC-AGI, Chollet

22 mins

Never Browse Alone? - Gemini 2 Live and ChatGPT Vision

Dec 12 2024

The ‘Gemini 2 Era’ begins … with screen-sharing? But really, it’s a great free tool, for satisfying curiosity rather than bleeding-edge intelligence. I give you the benchmarks, the highlights and of course, the latest from OpenAI Advanced Voice Mode with Vision.
Plus Deep Research in Gemini Advanced, Simple Bench updates, Santa and what might be, for some of you, Google’s deflating admission.

00:00 - Introduction
00:38 - Live Interaction
03:43 - Gemini 2.0 Flash Benchmarks
05:10 - Audio and Image Output
06:38 - Project Mariner (+ WebVoyager Bench)
08:49 - But Progress Slowing Down?
10:43 - OpenAI Announcements + Games

https://aistudio.google.com/live
Gemini 2.0 Flash Benchmarks: https://deepmind.google/technologies/gemini/
Project mariner: https://deepmind.google/technologies/project-mariner/
WebVoyager: https://x.com/laurentsifre/status/1858918588683296875/photo/1
Gemini Game play: https://www.youtube.com/watch?v=IKuGNHJBGsc
Advanced Voice Mode OpenAI: https://www.youtube.com/watch?v=NIQDnWlwYyQ
https://simple-bench.com/
Claude Computer Use: https://docs.anthropic.com/en/docs/build-with-claude/computer-use
Oriol Vinyals Interview: https://www.youtube.com/watch?v=78mEYaztGaw&t=687s

14 mins

Sora is Out, But is it a Distraction?

Dec 10 2024

After a 10-month wait, OpenAI have released Sora to paying users. With just a prompt it can generate videos of up to 20 seconds in lower resolutions, and 10 seconds at 1080p if you can fork out $200/month. I’ve tested it and read the system card. The user interface is quite beautiful, even if the videos themselves operate under entirely new rules of physics. But I can’t help wondering if OpenAI want us to focus on releases like this, rather than some quietly broken promises.

80,000 hours Website, Podcast + Channel:
https://80000hours.org/
https://open.spotify.com/show/2WzJwXWBDnn4iZ7odKwDib https://www.youtube.com/@eightythousandhours/videos

https://openai.com/sora/

Sora Countries: https://help.openai.com/en/articles/10250692-sora-supported-countries
Sora Credits: https://help.openai.com/en/articles/10245774-sora-billing-credits-faq
https://runwayml.com/ and https://pika.art/home

DeepMind Veo: https://deepmind.google/technologies/veo/

Sam Altman Ads as Last Resort: https://www.windowscentral.com/software-apps/openai-could-chase-intrusive-ads-as-last-resort

But OpenAI Considering Ads: https://www.inc.com/ben-sherry/is-openai-getting-into-the-advertising-business-the-company-is-sending-mixed-messages/91033533

OpenAI Backtracks on Microsoft AGI Clause: https://www.ft.com/content/2c14b89c-f363-4c2a-9dfc-13023b6bce65

As Microsoft Boast of Labor Savings: https://www.theinformation.com/articles/microsofts-new-sales-pitch-for-ai-spend-less-money-on-humans?rc=sy0ihq

OpenAI Military Pivot: https://www.technologyreview.com/2024/12/04/1107897/openais-new-defense-contract-completes-its-military-pivot/

Employees Have Doubts: https://www.washingtonpost.com/technology/2024/12/06/openai-anduril-employee-military-ai/?nid=top_pb_signin&arcId=KZIV7PLRHBCVNPAIAAAVUNRHIM&account_location=ONSITE_HEADER_ARTICLE

16 mins

o1 Pro Mode – Full Analysis (plus o1 paper highlights)

Dec 5 2024

Oh boy. o1 pro mode out on the same night as o1 full. I read the 49-page paper, ran my own tests, spent my fuel allowance on Pro Mode and will give you all the highlights. Suffice it to say, the story is not as simple as it first appears.
Weights and Biases’ Weave: wandb.me/ai_explained
Plus, GPT-4.5? MLE Bench, Simple Update, Image Analysis and much more

o1 System Card: https://cdn.openai.com/o1-system-card-20241205.pdf
Apollo Research: https://www.apolloresearch.ai/research/scheming-reasoning-evaluations
Altman Tweet: https://x.com/AnonCEOMakeItAi/status/1864763052622504344
ChatGPT Pro: https://openai.com/index/introducing-chatgpt-pro/
Tibor Blaho: https://x.com/btibor91/status/1864709670470066605
Simple-bench.com

00:00 - Introduction
00:27 - ChatGPT Pro is $200
01:25 - OpenAI Benchmarks
03:20 - o1 System Card, o1 and o1 Pro Mode vs o1-preview
06:18 - Simple Bench surprising results on sample
08:31 - Weights & Biases
09:05 - Image Analysis Compared
12:51 - More Benchmarks and Safety

17 mins

AI Breaks Its Silence: OpenAI’s ‘Next 12 Days’, Genie 2, and a Word of Caution

Dec 5 2024

Calm before the storm? Whatever analogy you want to use, things had gotten quiet toward the end of 2024. But then tonight we got Genie 2, and a series of scheduled announcements from OpenAI. Sora is coming soon, as is o1, but I dive deeper into what it all means and whether reliability is on a path to being solved, featuring two recent papers.
Assembly AI Speech to Text: https://www.assemblyai.com/?utm_source=youtube&utm_medium=influencer&utm_campaign=ai_explained
Plus Kling Motion Brush, Simple Bench QwQ update and much more.

Genie 2: https://deepmind.google/discover/blog/genie-2-a-large-scale-foundation-world-model/
Jim Cramer: https://x.com/jimcramer/status/1864068878692675625
Give Us Full o1: https://x.com/tszzl/status/1863882905422106851
Verge Scoop: https://x.com/tomwarren/status/1864326361415925861
O1 Learning to Reason Benchmarks: https://openai.com/index/learning-to-reason-with-llms/
SIMA AI: https://arxiv.org/pdf/2404.10179
Genie Paper: https://arxiv.org/pdf/2402.15391
My Video on Genie: https://www.youtube.com/watch?v=gGKsfXkSXv8
Oasis Minecraft: https://x.com/risphereeditor/status/1852619965511204974
LLMs Procedural Knowledge Paper: https://arxiv.org/pdf/2411.12580
Bag of Heuristics Paper: https://arxiv.org/pdf/2410.21272
Jensen Huang Hallucinations: https://www.tomshardware.com/tech-industry/artificial-intelligence/jensen-says-we-are-several-years-away-from-solving-the-ai-hallucination-problem-in-the-meantime-we-have-to-keep-increasing-our-computation
DeepSeek Interview: https://www.chinatalk.media/p/deepseek-ceo-interview-with-chinas
Kling Motion Brush: https://klingai.com/image-to-video

Tim Rocktaschel Book: https://geni.us/ArtificialIntelligence

00:43 - OpenAI 12 Days, Sora Turbo, o1
03:06 - Genie 2
08:26 - Jensen Huang and Altman Hallucination Predictions
09:45 - Bag of Heuristics Paper
11:40 - Procedural Knowledge Paper
13:02 - AssemblyAI Universal 2
13:45 - SimpleBench QwQ and Chinese Models
14:42 - Kling Motion Brush

15 mins

New Google Model Ranked ‘No. 1 LLM’, But There’s a Problem

Nov 15 2024

A new and mysterious Gemini model appears at the top of the leaderboard, but is that the full story? I dig behind the headline to show you some anti-climactic results, give some context with leaks from the last 48 hours about diminishing returns to scaling, and add the response of Altman, OpenAI and co. The future is about to look a lot stranger...

80,000 hours Podcast and Channel: https://open.spotify.com/show/2WzJwXWBDnn4iZ7odKwDib
https://www.youtube.com/@eightythousandhours/videos

You can now gift memberships to AI Insiders (my Patreon w/ exclusive vids, network): https://www.patreon.com/AIExplained/gift

‘There is no wall’: https://x.com/sama/status/1856941766915641580
https://x.com/vedantmisra/status/1857148554105544708
Gemini Ranking: https://lmarena.ai/?leaderboard
API not yet up: https://x.com/OfficialLoganK/status/1857106844805681153
‘Just Die Chat’: https://x.com/koltregaskes/status/1856754648146653428
Google CEO tweet: https://x.com/sundarpichai/status/1857114106928718329
Sutskever Quote: https://www.reuters.com/technology/artificial-intelligence/openai-rivals-seek-new-path-smarter-ai-current-methods-hit-limitations-2024-11-11/
Another OpenAI Staffer Leaves: https://x.com/RichardMCNgo/status/1856843040427839804
Bloomberg Report: https://www.bloomberg.com/news/articles/2024-11-13/openai-google-and-anthropic-are-struggling-to-build-more-advanced-ai?s=09
Noam Brown on what OpenAI Researchers Believe: https://x.com/polynoamial/status/1855037689533178289
Clive Chan: https://x.com/itsclivetime/status/1855704120495329667
Chollet Responds to Altman: https://x.com/fchollet/status/1857060079586975852
https://x.com/sama/status/1856940152460869718
Altman Emails: https://x.com/TechEmails/status/1857285960997712356
Change of Heart: https://sd11.senate.ca.gov/news/senator-wiener-responds-openai-opposition-sb-1047
Amodei on ‘Empirical Regularities’: https://lexfridman.com/dario-amodei-transcript/
Verge Report: https://www.theverge.com/2024/10/25/24279600/google-next-gemini-ai-model-openai-december
OpenAI Agents in January: https://www.bloomberg.com/news/articles/2024-11-13/openai-nears-launch-of-ai-agents-to-automate-tasks-for-users?srnd=phx-ai

15 mins

Leak: ‘GPT-5 exhibits diminishing returns’, Sam Altman: ‘lol’

Nov 10 2024

The last few days have seen two narratives emerge. One, derived from yesterday’s OpenAI leak in TheInformation, that GPT-5/Orion is a disappointment, and less of a leap than GPT-3 to GPT-4. The second comes from a series of 4 clips (shown in this video) from Sam Altman, regarding the ‘clear path’ to AGI. Let’s go beyond the headlines (and through papers like FrontierMath) to get closer to the ground truth…

Plus Universal-2, Sora comments, Claude 3.5 Haiku SimpleBench update, and a great new AI video.

Assembly AI Speech to Text: https://www.assemblyai.com/?utm_source=youtube&utm_medium=influencer&utm_campaign=ai_explained

00:39 – Bear Case, TheInformation Leak
04:01 – Bull Case, Sam Altman
06:20 – FrontierMath
11:29 – o1 Paradigm
13:11 – Text to Video Greatness and Universal-2

TheInformation Leak: https://www.theinformation.com/articles/openai-shifts-strategy-as-rate-of-gpt-ai-improvements-slows?rc=sy0ihq
Noam Brown Replies: https://x.com/polynoamial/status/1855453104394637444
Sam Altman Y-Combinator Interview: https://www.youtube.com/watch?v=xXCBz_8hM9w&t=1556s
Altman Reply: https://x.com/sama/status/1855100359511097828
https://simple-bench.com/
FrontierMath Paper: https://arxiv.org/pdf/2411.04872
Frontier Math Blog Post: https://epochai.org/frontiermath
Tao: https://x.com/EpochAIResearch/status/1854996368814936250
MMLU Are We Done (cites me!): https://arxiv.org/pdf/2406.04127
Universal-2: https://www.assemblyai.com/research/universal-2
Noam Brown ‘We don’t know’: https://www.youtube.com/watch?v=Gr_eYXdHFis
Anthropic Founder Response: https://x.com/jackclarkSF/status/1855485569998217231
Sora (Runway Comment): https://x.com/c_valenzuelab/status/1855026417354129455
Sora New Vid: https://www.youtube.com/watch?v=_iETa2KDRuw
Darri3D Video: https://www.reddit.com/r/ChatGPT/comments/1gn0n3z/can_you/

16 mins
