What the Freakiness of 2025 in AI Tells Us About 2026

Your video will begin in 10
Skip ad (5)
ultimate hustle

Thanks! Share it with your friends!

You disliked this video. Thanks for the feedback!

Added by admin
11 Views
It’s probably not possible to satisfactorily condense a 12 month’s worth of weird progress in AI, as well as predictions for the year to come, into one video. But I’m gonna try anyway because it has been a very strange time.

http://matsprogram.org/s26-aie


My new app! https://lmcouncil.ai


Patreon Interview: https://www.patreon.com/posts/robot-in-your-27-146376094

Chapters:
00:00 - Introduction
00:34 - Reasoning Models … and limits
02:54 - A playable world
03:36 - Realism
03:50 - AI Slop gone mainstream
05:03 - DolphinGemma
05:39 - Public Mood
07:34 - AI Enlisted
08:30 - GPT-5
11:05 - Open Weight not out
13:00 - METR Breakout
17:30 - VASA-1
18:28 - Lateral Productivity
20:15 - 1 or 1000 benchmarks needed?
24:54 - Continual Learning + Altman on Superintelligence
28:08 - Automated Information Discovery ft AlphaEvolve


Hassabis on Generality: https://x.com/demishassabis/status/2003097405026193809
https://www.youtube.com/watch?v=PqVbypvxDto

Gemini 3: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
Reasoning Trade-offs: https://arxiv.org/pdf/2504.13837

DolphinGemma: https://blog.google/technology/ai/dolphingemma/?s=09

Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/

METR Time Horizon: https://arxiv.org/pdf/2503.14499
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
Flaws: https://x.com/ShashwatGoel7/status/2002369517499105443
https://shash42.substack.com/p/how-to-game-the-metr-plot
https://x.com/METR_Evals/status/2002203627377574113

GPT-5 - Altman phd in everything: https://edition.cnn.com/2025/08/14/business/chatgpt-rollout-problems

https://simple-bench.com/

AI Slop: https://www.youtube.com/watch?v=I_3vxoJDD9k
https://www.theguardian.com/technology/2025/dec/16/boost-for-artists-in-ai-copyright-battle-as-only-3-per-cent-back-uk-active-opt-out-plan

Survey: https://x.com/SearchlightInst/status/2001057144842387920/photo/1

Nvidia Nemotron: https://x.com/percyliang/status/2000608134205985169

OpenAI Compute Flywheel: https://x.com/OpenAI/status/2001363007209914399/photo/1
Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ

AI in Govt: https://x.com/jdcmedlock/status/1939814516503847259

Benchmark Gaming: https://techcrunch.com/2025/04/07/meta-exec-denies-the-company-artificially-boosted-llama-4s-benchmark-scores/

AlphaEvolve: https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf?utm_source=deepmind.google&utm_medium=referral&utm_campaign=gdm&utm_content=
Continual Learning: https://abehrouz.github.io/files/NL.pdf

Job Risk: https://archive.ph/20250708204527/https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic

GPT4o: https://x.com/AISafetyMemes/status/1916889492172013989

Vasa-1: https://www.microsoft.com/en-us/research/project/vasa-1/

Three Views: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j/ai-timelines
Turing Test: https://x.com/tunguz/status/1907185471211422147

Karpathy Year in Review: https://karpathy.bearblog.dev/year-in-review-2025/

LLM Brainrot: https://arxiv.org/pdf/2510.13928

Lateral Productivity: https://www.aisi.gov.uk/frontier-ai-trends-report

Emotional Quotient: https://arxiv.org/pdf/2511.08394

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/


AI Insiders ($9!): https://www.patreon.com/AIExplained



Reasoning Models would boost results but not change paradigm
Genie 3 makes the world playable (literally, in the case of yesterday’s news)
DolphinDecoding
Veo 3.1 / Sora 2 / Nano Banana Pro / Elevenlabs Voice/Music
But AI slop everywhere
Public attitude very mixed (Hassabis)
AI gets enlisted in government, but too early to tell jobs/productivity impact
GPT 5 underwhelms but steady gain in coding/users/HLE (strange that it has to explain its future revenue)
Chinese models in dogged pursuit (not quite made it to my top 4/council). And soon Nvidia? Not Meta for now but SAM amazing
METR continues but will be misinterpreted

And 5 Points about 2026

Vasa wrong but …
Lateral Productivity + safety report, extends to robotics (Patreon mention/freaky robot clip)
Benchmark analogy (AI-2027/lesswrong charts), Amodei (codex report suggests not) and Sutskever (but recanted) and Altman (phd) were former, Altman not anymore (but must watch out for brain rot), and we can’t even decide how general we are … Hassabis/Yann debate
Information Progress analogy (still 2-3 years more of exquisite compression (Deepthink) and on next phase there are …
…examples already, Alphas (will it reduce the human bottleneck?) + Emotional Quotient
Category
Artificial Intelligence

Post your comment

Comments

Be the first to comment