LLM intuitions from OpenAI

Observation

  • Compute cost is decreasing exponentially
  • Teaching intelligence through lower-level induced incentives requires more compute
  • A low-level structure (the Transformer) serves a high-level incentive structure (intelligence)
  • Unlike humans, machines have a different time budget
  • Some abilities only emerge with scale
  • Emergent abilities: "this idea doesn't work" really means "it doesn't work yet"
  • What ends up working really well usually doesn't work well at first
  • Such scalability and generalization require time and compute (a rough estimate is sketched below)
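To get a rough sense of the "time and compute required" point, here is a back-of-the-envelope sketch using the common ~6 × parameters × tokens approximation for dense Transformer training FLOPs. The model size, token count, GPU throughput, and utilization below are illustrative assumptions, not figures from the talk.

```python
# Back-of-the-envelope training compute, using the common approximation
# FLOPs ~= 6 * parameters * training tokens for a dense Transformer.
# All concrete numbers below are illustrative assumptions.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs."""
    return 6 * n_params * n_tokens

def training_days(n_params: float, n_tokens: float, n_gpus: int,
                  flops_per_gpu: float, utilization: float = 0.4) -> float:
    """Wall-clock training days at a given cluster size and sustained utilization."""
    sustained = n_gpus * flops_per_gpu * utilization
    return training_flops(n_params, n_tokens) / sustained / 86_400

# Example: a 70B-parameter model on 1.4T tokens, 1024 GPUs at ~1e15 FLOP/s peak each.
print(f"total:   {training_flops(70e9, 1.4e12):.2e} FLOPs")
print(f"elapsed: {training_days(70e9, 1.4e12, n_gpus=1024, flops_per_gpu=1e15):.1f} days")
```

Even under these optimistic assumptions, the run takes weeks of cluster time, which is why scalability arguments are ultimately compute arguments.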
 

Approach

  • Make them learn how we think
  • Matmul + sequence length + model dimension (see the FLOP sketch after this list)
  • For superintelligence, we don't necessarily have to follow human methods (loss 0)
  • Learning the objective and reasoning from induced incentives
  • Learning something general for millions of real-world tasks
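A minimal sketch of the "matmul + length + dimension" point: per-layer Transformer compute is dominated by matrix multiplications whose cost grows with sequence length and model dimension. The breakdown below follows the standard attention + MLP decoder block; the concrete sizes are illustrative assumptions.

```python
# Per-layer FLOPs of a standard decoder block, to show that "scaling" is
# essentially large matrix multiplications over sequence length (L) and
# model dimension (d). Shapes follow the usual attention + MLP layout;
# the concrete numbers below are illustrative.

def block_matmul_flops(seq_len: int, d_model: int, ffn_mult: int = 4) -> dict:
    L, d = seq_len, d_model
    return {
        "qkv_and_output_proj": 2 * L * d * (4 * d),             # four d x d projections
        "attention_scores":    2 * L * L * d,                   # Q @ K^T
        "attention_values":    2 * L * L * d,                   # softmax(scores) @ V
        "mlp":                 2 * L * d * (2 * ffn_mult * d),  # up + down projection
    }

flops = block_matmul_flops(seq_len=4096, d_model=8192)
for name, f in flops.items():
    print(f"{name:>22}: {f:.2e} FLOPs")
print(f"{'total':>22}: {sum(flops.values()):.2e} FLOPs")
```

At these sizes the projection and MLP matmuls dominate, which is why scaling up mostly means doing these multiplications efficiently across many machines.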
 
 

Induced incentive

  • The LLM's objective is not language itself; it is an induced one
  • ChatGPT is a community platform where the public participates in aligning AGI
 
 
 

Email

Seeking Your Guidance: Career Advice for Pursuing Opportunities at OpenAI
Dear Hyung Won Chung,
I hope this message finds you well. I'm Seonglae Cho, and I had the pleasure of attending your LLM talk at Yonsei University on 11/23.
As someone preparing to pursue a career at AI companies, I am reaching out in the hope of receiving some career advice. Currently, my goal is to secure a position at OpenAI, and to achieve this I am actively job hunting; I am presently in the coding-test process with Anthropic.
I would greatly appreciate any guidance or directional comments you might offer to help me focus my efforts toward joining OpenAI. While I am currently focusing more on engineering, my future goal is to contribute to research on superintelligence. I have also attached my resume to provide further background about myself.
Thank you sincerely for taking the time to read this message amid your busy schedule. I also wanted to express my gratitude for sharing your insights during yesterday's lecture; your firsthand experience from industry was invaluable.
Best regards,
Seonglae Cho
 
 
 
 
Look at the core, don't worry about the incidental, and build conviction through experience.
 
 
 
Large Language Models (in 2023)
I gave a talk at Seoul National University. I titled the talk "Large Language Models (in 2023)". This was an ambitious attempt to summarize our exploding field. Trying to summarize the field forced me to think about what really matters in it. While scaling undeniably stands out, its far-reaching implications are more nuanced. I share my thoughts on scaling from three angles:

1:02 1) A change in perspective is necessary because some abilities only emerge at a certain scale. Even if some abilities don't work with the current generation of LLMs, we should not claim that they don't work. Rather, we should think they don't work yet. Once larger models are available, many conclusions change. This also means that some conclusions from the past are invalidated, and we need to constantly unlearn intuitions built on top of such ideas.

7:12 2) From first principles, scaling up the Transformer amounts to efficiently doing matrix multiplications with many, many machines. I see many researchers in the field of LLMs who are not familiar with how scaling is actually done. This section is targeted at technical audiences who want to understand what it means to train large models.

27:52 3) I talk about what we should think about for further scaling (think 10000x GPT-4 scale). To me, scaling isn't just doing the same thing with more machines. It entails finding the inductive bias that is the bottleneck in further scaling. I believe that the maximum likelihood objective function is the bottleneck in achieving the scale of 10000x GPT-4 level. Learning the objective function with an expressive neural net is the next paradigm that is a lot more scalable. With compute cost going down exponentially, scalable methods eventually win. Don't compete with that.

In all of these sections, I strive to describe everything from first principles. In an extremely fast-moving field like LLMs, no one can keep up. I believe that understanding the core ideas by deriving them from first principles is the only scalable approach.

Disclaimer: I give my personal opinions and the talk material doesn't reflect my employer's opinion in any way.
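To make the contrast in 3) concrete, here is a minimal PyTorch sketch of a fixed maximum-likelihood objective (next-token cross-entropy) versus a learned objective, using the common RLHF-style preference loss for a reward model as one example of "learning the objective function with an expressive neural net". This is not from the talk; the tensor shapes, module names, and toy data are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# (a) Fixed objective: maximum likelihood / next-token cross-entropy.
#     `logits` is [batch, seq, vocab]; `targets` is [batch, seq].
def mle_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    return F.cross_entropy(logits.flatten(0, 1), targets.flatten())

# (b) Learned objective: a reward model scores whole responses and is itself
#     trained from human preference pairs (chosen vs. rejected), as in RLHF.
#     `reward_model` is any module mapping response features to a scalar.
def preference_loss(reward_model: torch.nn.Module,
                    chosen: torch.Tensor, rejected: torch.Tensor) -> torch.Tensor:
    r_chosen = reward_model(chosen).squeeze(-1)
    r_rejected = reward_model(rejected).squeeze(-1)
    # Bradley-Terry style objective: push the chosen response to score higher.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage with random tensors and a linear reward head (illustrative only).
logits, targets = torch.randn(2, 8, 100), torch.randint(0, 100, (2, 8))
reward_model = torch.nn.Linear(16, 1)
chosen, rejected = torch.randn(2, 16), torch.randn(2, 16)
print(mle_loss(logits, targets).item(),
      preference_loss(reward_model, chosen, rejected).item())
```

The point of the contrast: in (a) the objective is hand-specified and fixed, while in (b) the objective itself is a trainable, expressive function, which is the kind of direction the description argues is more scalable.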
 
 
 

Recommendations