Episodes

  • OpenAI's o1 System Card, Literally Migraine Inducing
    Dec 23 2024

    The idea of model cards, introduced as a measure to increase transparency and understanding of LLMs, has been perverted into the marketing gimmick exemplified by OpenAI's o1 system card. To demonstrate the adversarial stance we believe is necessary to draw meaning from these press-releases-in-disguise, we conduct a close read of the system card. Be warned, there's a lot of muck in this one.

    Note: All figures/tables discussed in the podcast can be found on the podcast website at https://kairos.fm/muckraikers/e009/

    • (00:00) - Recorded 2024.12.08
    • (00:54) - Actual intro
    • (03:00) - System cards vs. academic papers
    • (05:36) - Starting off sus
    • (08:28) - o1.continued
    • (12:23) - Rant #1: figure 1
    • (18:27) - A diamond in the rough
    • (19:41) - Hiding copyright violations
    • (21:29) - Rant #2: Jacob on "hallucinations"
    • (25:55) - More ranting and "hallucination" rate comparison
    • (31:54) - Fairness, bias, and bad science comms
    • (35:41) - System, dev, and user prompt jailbreaking
    • (39:28) - Chain-of-thought and Rao-Blackwellization
    • (44:43) - "Red-teaming"
    • (49:00) - Apollo's bit
    • (51:28) - METR's bit
    • (59:51) - Pass@???
    • (01:04:45) - SWE Verified
    • (01:05:44) - Appendix bias metrics
    • (01:10:17) - The muck and the meaning

    Links

    • o1 system card
    • OpenAI press release collection - 12 Days of OpenAI

    Additional o1 Coverage

    • NIST + AISI report - US AISI and UK AISI Joint Pre-Deployment Test
    • Apollo Research's paper - Frontier Models are Capable of In-context Scheming
    • VentureBeat article - OpenAI launches full o1 model with image uploads and analysis, debuts ChatGPT Pro
    • The Atlantic article - The GPT Era Is Already Ending

    On Data Labelers

    • 60 Minutes article + video - Labelers training AI say they're overworked, underpaid and exploited by big American tech companies
    • Reflections article - The hidden health dangers of data labeling in AI development
    • Privacy International article - Humans in the AI loop: the data labelers behind some of the most powerful LLMs' training datasets

    Chain-of-Thought Papers Cited

    • Paper - Measuring Faithfulness in Chain-of-Thought Reasoning
    • Paper - Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
    • Paper - On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models
    • Paper - Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models

    Other Mentioned/Relevant Sources

    • Andy Jones blogpost - Rao-Blackwellization
    • Paper - Training on the Test Task Confounds Evaluation and Emergence
    • Paper - Best-of-N Jailbreaking
    • Research landing page - SWE Bench
    • Code Competition - Konwinski Prize
    • Lakera game - Gandalf
    • Kate Crawford's Atlas of AI
    • BlueDot Impact's course - Intro to Transformative AI

    Unrelated Developments

    • Cruz's letter to Merrick Garland
    • AWS News Blog article - Introducing Amazon Nova foundation models: Frontier intelligence and industry leading price performance
    • BleepingComputer article - Ultralytics AI model hijacked to infect thousands with cryptominer
    • The Register article - Microsoft teases Copilot Vision, the AI sidekick that judges your tabs
    • Fox Business article - OpenAI CEO Sam Altman looking forward to working with Trump admin, says US must build best AI infrastructure
    1 hr and 17 mins
  • How to Safely Handle Your AGI
    Dec 2 2024

    While on the campaign trail, Trump made claims about repealing Biden's Executive Order on AI, but what will actually change once he takes office? We take this opportunity to examine policies being discussed or implemented by leading governments around the world.

    • (00:00) - Intro
    • (00:29) - Hot off the press
    • (02:59) - Repealing the AI executive order?
    • (11:16) - "Manhattan" for AI
    • (24:33) - EU
    • (30:47) - UK
    • (39:27) - Bengio
    • (44:39) - Comparing EU/UK to USA
    • (45:23) - China
    • (51:12) - Taxes
    • (55:29) - The muck

    Links

    • SFChronicle article - US gathers allies to talk AI safety as Trump's vow to undo Biden's AI policy overshadows their work
    • Trump's Executive Order on AI (the AI governance executive order at home)
    • Biden's Executive Order on AI
    • Congressional report brief which advises a "Manhattan Project for AI"

    Non-USA

    • CAIRNE resource collection on CERN for AI
    • UK Frontier AI Taskforce report (2023)
    • International interim report (2024)
    • Bengio's paper - AI and Catastrophic Risk
    • Davidad's Safeguarded AI program at ARIA
    • MIT Technology Review article - Four things to know about China’s new AI rules in 2024
    • GovInsider article - Australia’s national policy for ethical use of AI starts to take shape
    • Future of Privacy Forum article - The African Union’s Continental AI Strategy: Data Protection and Governance Laws Set to Play a Key Role in AI Regulation

    Taxes

    • Macroeconomic Dynamics paper - Automation, Stagnation, and the Implications of a Robot Tax
    • CESifo paper - AI, Automation, and Taxation
    • GavTax article - Taxation of Artificial Intelligence and Automation

    Perplexity Pages

    • CERN for AI page
    • China's AI policy page
    • Singapore's AI policy page
    • AI policy in Africa, India, Australia page

    Other Sources

    • Artificial Intelligence Made Simple article - NYT's "AI Outperforms Doctors" Story Is Wrong
    • Intel report - Reclaim Your Day: The Impact of AI PCs on Productivity
    • Heise Online article - Users on AI PCs slower, Intel sees problem in unenlightened users
    • The Hacker News article - North Korean Hackers Steal $10M with AI-Driven Scams and Malware on LinkedIn
    • Futurism article - Character.AI Is Hosting Pedophile Chatbots That Groom Users Who Say They're Underage
    • Vice article - 'AI Jesus' Is Now Taking Confessions at a Church in Switzerland
    • Politico article - Ted Cruz: Congress 'doesn't know what the hell it's doing' with AI regulation
    • US Senate Committee on Commerce, Science, and Transportation press release - Sen. Cruz Sounds Alarm Over Industry Role in AI Czar Harris’s Censorship Agenda
    58 mins
  • The End of Scaling?
    Nov 19 2024

    Multiple news outlets, including The Information, Bloomberg, and Reuters [see sources], are reporting an "end of scaling" for the current AI paradigm. In this episode we look into these articles, as well as a wide variety of economic forecasting, empirical analysis, and technical papers, to understand the validity and impact of these reports. We also use this as an opportunity to contextualize the realized versus promised fruits of "AI".

    • (00:23) - Hot off the press
    • (01:49) - The end of scaling
    • (10:50) - "Useful tools" and "agentic" "AI"
    • (17:19) - The end of quantization
    • (25:18) - Hedging
    • (29:41) - The end of upwards mobility
    • (33:12) - How to grow an economy
    • (38:14) - Transformative & disruptive tech
    • (49:19) - Finding the meaning
    • (56:14) - Bursting AI bubble and Trump
    • (01:00:58) - The muck

    Links

    • The Information article - OpenAI Shifts Strategy as Rate of ‘GPT’ AI Improvements Slows
    • Bloomberg article - OpenAI, Google and Anthropic Are Struggling to Build More Advanced AI
    • Reuters article - OpenAI and others seek new path to smarter AI as current methods hit limitations
    • Paper on the end of quantization - Scaling Laws for Precision
    • Tim Dettmers Tweet on "Scaling Laws for Precision"

    Empirical Analysis

    • WU Vienna paper - Unslicing the pie: AI innovation and the labor share in European regions
    • IMF paper - The Labor Market Impact of Artificial Intelligence: Evidence from US Regions
    • NBER paper - Automation, Career Values, and Political Preferences
    • Pew Research Center report - Which U.S. Workers Are More Exposed to AI on Their Jobs?

    Forecasting

    • NBER/Acemoglu paper - The Simple Macroeconomics of AI
    • NBER/Acemoglu paper - Harms of AI
    • IMF report - Gen-AI: Artificial Intelligence and the Future of Work
    • Submission to Open Philanthropy AI Worldviews Contest - Transformative AGI by 2043 is <1% likely

    Externalities and the Bursting Bubble

    • NBER paper - Bubbles, Rational Expectations and Financial Markets
    • Clayton Christensen lecture capture - Clayton Christensen: Disruptive innovation
    • The New Republic article - The “Godfather of AI” Predicted I Wouldn’t Have a Job. He Was Wrong.
    • Latent Space article - $2 H100s: How the GPU Rental Bubble Burst

    On Productization

    • Palantir press release on introduction of Claude to US security and defense
    • Ars Technica article - Claude AI to process secret government data through new Palantir deal
    • OpenAI press release on partnering with Condé Nast
    • Candid Technology article - Shutterstock and Getty partner with OpenAI and BRIA
    • E2B
    • Stripe agents
    • Robopair

    Other Sources

    • CBS News article - Google AI chatbot responds with a threatening message: "Human … Please die."
    • Biometric Update article - Travelers to EU may be subjected to AI lie detector
    • Techcrunch article - OpenAI’s tumultuous early years revealed in emails from Musk, Altman, and others
    • Richard Ngo Tweet on leaving OpenAI
    1 hr and 7 mins
  • US National Security Memorandum on AI, Oct 2024
    Nov 6 2024

    October 2024 saw a National Security Memorandum and a US framework for using AI in national security contexts. We go through the content so you don't have to, pull out the important bits, and summarize our main takeaways.

    • (00:48) - The memorandum
    • (06:28) - What the press is saying
    • (10:39) - What's in the text
    • (13:48) - Potential harms
    • (17:32) - Miscellaneous notable stuff
    • (31:11) - What's the US government's take on AI?
    • (45:45) - The civil side - comments on reporting
    • (49:31) - The commenters
    • (01:07:33) - Our final hero
    • (01:10:46) - The muck


    Links
    • United States National Security Memorandum on AI
    • Fact Sheet on the National Security Memorandum
    • Framework to Advance AI Governance and Risk Management in National Security

    Related Media

    • CAIS Newsletter - AI Safety Newsletter #43
    • NIST report - Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile
    • ACLU press release - ACLU Warns that Biden-Harris Administration Rules on AI in National Security Lack Key Protections
    • Wikipedia article - Presidential Memorandum
    • Reuters article - White House presses gov't AI use with eye on security, guardrails
    • Forbes article - America’s AI Security Strategy Acknowledges There’s No Stopping AI
    • DefenseScoop article - New White House directive prods DOD, intelligence agencies to move faster adopting AI capabilities
    • NYTimes article - Biden Administration Outlines Government ‘Guardrails’ for A.I. Tools
    • Forbes article - 5 Things To Know About The New National Security Memorandum On AI – And What ChatGPT Thinks
    • Federal News Network interview - A look inside the latest White House artificial intelligence memo
    • Govtech article - Reactions Mostly Positive to National Security AI Memo
    • The Information article - Biden Memo Encourages Military Use of AI

    Other Sources

    • Physical Intelligence press release - π0: Our First Generalist Policy
    • OpenAI press release - Introducing ChatGPT Search
    • WhoPoo App!!
    1 hr and 16 mins
  • Understanding Claude 3.5 Sonnet (New)
    Oct 30 2024

    Frontier developers continue their war on sane versioning schema to bring us Claude 3.5 Sonnet (New), along with "computer use" capabilities. We discuss not only the new model, but also why Anthropic may have released this model and tool combination now.


    • (00:00) - Intro
    • (00:22) - Hot off the press
    • (05:03) - Claude 3.5 Sonnet (New) Two 'o' 3000
    • (09:23) - Breaking down "computer use"
    • (13:16) - Our understanding
    • (16:03) - Diverging business models
    • (32:07) - Why has Anthropic chosen this strategy?
    • (43:14) - Changing the frame
    • (48:00) - Polishing the lily

    Links

    • Anthropic press release - Introducing Claude 3.5 Sonnet (New)
    • Model Card Addendum

    Other Anthropic Relevant Media

    • Paper - Sabotage Evaluations for Frontier Models
    • Anthropic press release - Anthropic's Updated RSP
    • Alignment Forum blogpost - Anthropic's Updated RSP
    • Tweet - Response to scare regarding Anthropic training on user data
    • Anthropic press release - Developing a computer use model
    • Simon Willison article - Initial explorations of Anthropic’s new Computer Use capability
    • Tweet - ARC Prize performance
    • The Information article - Anthropic Has Floated $40 Billion Valuation in Funding Talks

    Other Sources

    • LWN.net article - OSI readies controversial Open AI definition
    • National Security Memorandum
    • Framework to Advance AI Governance and Risk Management in National Security
    • Reuters article - Mother sues AI chatbot company Character.AI, Google over son's suicide
    • Medium article - A Small Step Towards Reproducing OpenAI o1: Progress Report on the Steiner Open Source Models
    • The Guardian article - Google's solution to accidental algorithmic racism: ban gorillas
    • TIME article - Ethical AI Isn’t to Blame for Google’s Gemini Debacle
    • Latacora article - The SOC2 Starting Seven
    • Grandview Research market trends - Robotic Process Automation Market Trends
    1 hr and 1 min
  • Winter is Coming for OpenAI
    Oct 22 2024

    Brace yourselves, winter is coming for OpenAI - at least, that's what we think. In this episode we look at OpenAI's recent massive funding round and ask "why would anyone want to fund a company that is set to lose a net 5 billion USD for 2024?" We scrape through a whole lot of muck to find the meaningful signals in all this news, and there is a lot of it, so get ready!


    • (00:00) - Intro
    • (00:28) - Hot off the press
    • (02:43) - Why listen?
    • (06:07) - Why might VCs invest?
    • (15:52) - What are people saying
    • (23:10) - How *is* OpenAI making money?
    • (28:18) - Is AI hype dying?
    • (41:08) - Why might big companies invest?
    • (48:47) - Concrete impacts of AI
    • (52:37) - Outcome 1: OpenAI as a commodity
    • (01:04:02) - Outcome 2: AGI
    • (01:04:42) - Outcome 3: best plausible case
    • (01:07:53) - Outcome 1*: many ways to bust
    • (01:10:51) - Outcome 4+: shock factor
    • (01:12:51) - What's the muck
    • (01:21:17) - Extended outro

    Links

    • Reuters article - OpenAI closes $6.6 billion funding haul with investment from Microsoft and Nvidia
    • Goldman Sachs report - GenAI: Too Much Spend, Too Little Benefit
    • Apricitas Economics article - The AI Investment Boom
    • Discussion of "The AI Investment Boom" on YCombinator
    • State of AI in 13 Charts
    • Fortune article - OpenAI sees $5 billion loss in 2024 and soaring sales as big ChatGPT fee hikes planned, report says

    More on AI Hype (Dying)

    • Latent Space article - The Winds of AI Winter
    • Article by Gary Marcus - The Great AI Retrenchment has Begun
    • TimmermanReport article - AI: If Not Now, When? No, Really - When?
    • MIT News article - Who Will Benefit from AI?
    • Washington Post article - The AI Hype bubble is deflating. Now comes the hard part.
    • Andreessen Horowitz article - Why AI Will Save the World

    Other Sources

    • Human-Centered Artificial Intelligence Foundation Model Transparency Index
    • Cointelegraph article - Europe gathers global experts to draft ‘Code of Practice’ for AI
    • Reuters article - Microsoft's VP of GenAI research to join OpenAI
    • Twitter post from Tim Brooks on joining DeepMind
    • Edward Zitron article - The Man Who Killed Google Search
    1 hr and 23 mins
  • Open Source AI and 2024 Nobel Prizes
    Oct 16 2024

    The Open Source AI Definition is out after years of drafting; will it reestablish brand meaning for the “Open Source” term? Also, the 2024 Nobel Prizes in Physics and Chemistry are heavily tied to AI; we scrutinize not only this year's prizes, but also Nobel Prizes as a concept.

    • (00:00) - Intro
    • (00:30) - Hot off the press
    • (03:45) - Open Source AI background
    • (10:30) - Definitions and changes in RC1
    • (18:36) - “Business source”
    • (22:17) - Parallels with legislation
    • (26:22) - Impacts of the OSAID
    • (33:58) - 2024 Nobel Prize Context
    • (37:21) - Chemistry prize
    • (45:06) - Physics prize
    • (50:29) - Takeaways
    • (52:03) - What’s the real muck?
    • (01:00:27) - Outro

    Links

    • Open Source AI Definition, Release Candidate 1
    • OSAID RC1 announcement
    • All Nobel Prizes 2024

    More Reading on Open Source AI

    • Kairos.FM article - Open Source AI is a lie, but it doesn't have to be
    • The Register article - The open source AI civil war approaches
    • MIT Technology Review article - We finally have a definition for open-source AI

    On Nobel Prizes

    • Paper - Access to Opportunity in the Sciences: Evidence from the Nobel Laureates
    • Physics prize - scientific background, popular info
    • Chemistry prize - scientific background, popular info
    • Reuters article - Google's Nobel prize winners stir debate over AI research
    • Wikipedia article - Nobel disease

    Other Sources

    • Pivot.ai article - People are ‘blatantly stealing my work,’ AI artist complains
    • Paper - GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
    • Paper - Reclaiming AI as a Theoretical Tool for Cognitive Science | Computational Brain & Behavior

    1 hr and 1 min
  • SB1047
    Sep 30 2024

    Why is Mark Ruffalo talking about SB1047, and what is it anyway? Tune in for our thoughts on the now-vetoed California legislation that had Big Tech scared.

    • (00:00) - Intro
    • (00:31) - Updates from a relatively slow week
    • (03:32) - Disclaimer: SB1047 vetoed during recording (still worth a listen)
    • (05:24) - What is SB1047
    • (12:30) - Definitions
    • (17:18) - Understanding the bill
    • (28:42) - What are the players saying about it?
    • (46:44) - Addressing critiques
    • (55:59) - Open Source
    • (01:02:36) - Takeaways
    • (01:15:40) - Clarification on impact to big tech
    • (01:18:51) - Outro

    Links
    • SB1047 legislation page
    • SB1047 CalMatters page
    • Newsom vetoes SB1047
    • CAIS newsletter on SB1047
    • Prominent AI nerd letter
    • Anthropic's letter
    • SB1047 ~explainer


    Additional SB1047 Related Coverage

    • Opposition to SB1047 'makes no sense'
    • Newsom on SB1047
    • Andreessen Horowitz on SB1047
    • Classy move by Dan
    • Ex-OpenAI employee says Altman doesn't want regulation


    Other Sources

    • o1 doesn't measure up in new benchmark paper
    • OpenAI losses and gains
    • OpenAI crypto hack
    • "Murati out" -Mira Murati, probably
    • Altman pitching datacenters to White House
    • Sam Altman, 'podcast bro'
    • Paper: Contract Design with Safety Inspections


    1 hr and 19 mins