AI Research

Creator
Creator
Seonglae ChoSeonglae Cho
Created
Created
2024 Aug 8 13:11
Editor
Edited
Edited
2025 Dec 17 18:46

The scalability of methods is crucial in evaluating AI research

  • Data generalization is essential, methods that only work in specific environments are useless.
  • Cost-benefit analysis is important, good performance alone doesn't guarantee practicality.
  • Error analysis is mandatory, without understanding why it fails, improvement is impossible.
More inspiration from nature present, more confident on top-down belief that sustains you when experiments contradict you multifaceted beauty (
Ilya Sutskever
)
AI Research Notion
 

Top labs

 
 
https://researchtrend.ai/
 
 
Ilya Sutskever – We're moving from the age of scaling to the age of research
Ilya & I discuss SSI’s strategy, the problems with pre-training, how to improve the generalization of AI models, and how to ensure AGI goes well. 𝐄𝐏𝐈𝐒𝐎𝐃𝐄 𝐋𝐈𝐍𝐊𝐒 * Transcript: https://www.dwarkesh.com/p/ilya-sutskever-2 * Apple Podcasts: https://podcasts.apple.com/us/podcast/dwarkesh-podcast/id1516093381?i=1000738363711 * Spotify: https://open.spotify.com/episode/7naOOba8SwiUNobGz8mQEL?si=39dd68f346ea4d49 𝐒𝐏𝐎𝐍𝐒𝐎𝐑𝐒 - Gemini 3 is the first model I’ve used that can find connections I haven’t anticipated. I recently wrote a blog post on RL’s information efficiency, and Gemini 3 helped me think it all through. It also generated the relevant charts and ran toy ML experiments for me with zero bugs. Try Gemini 3 today at https://gemini.google - Labelbox helped me create a tool to transcribe our episodes! I’ve struggled with transcription in the past because I don’t just want verbatim transcripts, I want transcripts reworded to read like essays. Labelbox helped me generate the *exact* data I needed for this. If you want to learn how Labelbox can help you (or if you want to try out the transcriber tool yourself), go to https://labelbox.com/dwarkesh - Sardine is an AI risk management platform that brings together thousands of device, behavior, and identity signals to help you assess a user’s risk of fraud & abuse. Sardine also offers a suite of agents to automate investigations so that as fraudsters use AI to scale their attacks, you can use AI to scale your defenses. Learn more at https://sardine.ai/dwarkesh To sponsor a future episode, visit https://dwarkesh.com/advertise 𝐓𝐈𝐌𝐄𝐒𝐓𝐀𝐌𝐏𝐒 00:00:00 – Explaining model jaggedness 00:09:39 - Emotions and value functions 00:18:49 – What are we scaling? 00:25:13 – Why humans generalize better than models 00:35:45 – Straight-shotting superintelligence 00:46:47 – SSI’s model will learn from deployment 00:55:07 – Alignment 01:18:13 – “We are squarely an age of research company” 01:29:23 -- Self-play and multi-agent 01:32:42 – Research taste
Ilya Sutskever – We're moving from the age of scaling to the age of research

AI research paper dataset from
Arxiv

 
 

Backlinks

AI Agent

Recommendations