Micro-pessimist, Macro-optimist
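Eval is a direction. Developing a benchmark means finding the direction the benchmark is intended to measure within the space the benchmark is designed over.
Use hostages: bundle a task I really love with a task I dislike, so I do the two together. Alternatively, set up a hook or the environment in advance so that the task naturally comes into view or nags at me until I do it.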
Jax https://build.nvidia.com/spark/jax/overview
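Stop injecting intent into research. Observe through experiments; my role is only to choose and to check that the implementation is correct.
"I tried this idea this way and it worked" is not persuasive. Instead say: "With this motivation, to solve this problem, I tried this approach based on a specific insight; it had problems in practice, so I added a certain technique and it got much better."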
The more inspiration from nature is present, the more confident you can be in the top-down belief that sustains you when experiments contradict you: multifaceted beauty (Ilya Sutskever)
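There is only one kind of highly paid researcher: the one whose taste gives the insight to tell whether something will work before spending compute or scaling, which cuts that cost and speeds up progress.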
Rather than new theoretical insights, a "well-packaged framework + stable empirical gains" is much more effective. Concise conceptual packaging with experimental reproducibility + computational discipline with safe novelty creates a sweet spot that appears to be a reasonable extension.
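Reviews: rather than being unconditionally pessimistic, make the author feel encouraged and share my own perspective, instead of only making realistic remarks that it won't work.
- Streaming is the first step toward multimodality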
Plan
Go to bed early and get the TLDR / AlphaSignal routine back to normal
- acl
- icml (very important; Seoul)
oversquashing
- icml
moe sae
routerrl
(confidence manifold)
- acl workshop / iclr workshop
neuronpedia
confidence manifold
agent state graph for ‣
- cvpr / neurips
math interpretability
agent crdt
probe as a tool
- Second half of the year
SNN Recurrent Model
robotics redteaming or control within simulation
streaming llm Streaming Vision Speech Model
- ARR march
agent state graph
routerrl
single token
stream writer?
Expected trends
Modeling
Test Time compute
AI Agent
- Memory, Memo
- Tool Calling
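Bad research result descriptions
- Bad example: “This neuron/feature represents lying”
- Good example: “Intervening on this activation direction (ablate/patch/steer) changes a specific behavior/accuracy/logit in a predictable way”
State results in cause → effect form. Center them on “measurement + graphs”, not on “qualitative” impressions
- effect size: change in accuracy/loss/a specific metric (before vs. after the intervention)
- sweep: curve over intervention strength (α)
- compare: method A vs baseline vs random direction vs matched-control direction
- error bars, or at least the distribution over multiple seeds/examples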
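The checklist above can be sketched as a small NumPy experiment. This is only a toy illustration under stated assumptions, not a real interpretability pipeline: the activations are random, `metric` is a hypothetical linear readout standing in for accuracy/loss/logit, and `candidate` is a made-up direction under test. It shows the effect size per α, an α sweep, and a random-direction control with spread over seeds; a matched-control direction is not included.

```python
import numpy as np

def unit(v):
    """Normalize a vector to unit length."""
    return v / np.linalg.norm(v)

def steer(acts, direction, alpha):
    """Add alpha * unit(direction) to every activation vector."""
    return acts + alpha * unit(direction)

def metric(acts, readout):
    """Toy metric (assumption): mean logit of a linear readout."""
    return float(np.mean(acts @ readout))

def effect_size(acts, direction, alpha, readout):
    """Metric after the intervention minus metric before it."""
    return metric(steer(acts, direction, alpha), readout) - metric(acts, readout)

def sweep(acts, direction, alphas, readout):
    """Effect size as a function of intervention strength alpha."""
    return np.array([effect_size(acts, direction, a, readout) for a in alphas])

def random_baseline(acts, alphas, readout, n_seeds=20):
    """Mean and std of the sweep curve over random control directions."""
    d = acts.shape[1]
    curves = np.stack([
        sweep(acts, np.random.default_rng(seed).standard_normal(d), alphas, readout)
        for seed in range(n_seeds)
    ])
    return curves.mean(axis=0), curves.std(axis=0)

# Synthetic setup (all values are illustrative assumptions).
rng = np.random.default_rng(0)
d = 64
acts = rng.standard_normal((256, d))
readout = unit(rng.standard_normal(d))
candidate = readout + 0.1 * rng.standard_normal(d)  # direction under test
alphas = np.linspace(0.0, 4.0, 9)

target_curve = sweep(acts, candidate, alphas, readout)
rand_mean, rand_std = random_baseline(acts, alphas, readout)

for a, t, m, s in zip(alphas, target_curve, rand_mean, rand_std):
    print(f"alpha={a:.1f}  target={t:+.3f}  random={m:+.3f} (std {s:.3f})")
```

With a real model, `steer` would be replaced by an actual activation hook and `metric` by the task metric; the reporting structure (sweep curve, random control, spread over seeds) is the part that carries over.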
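Don't get swayed by other people; hold on to what I believe. Don't carelessly hand off things like the CorrSteer title or the CRL conclusion. For the Umar UI/UX too, argue strongly for what I believe is right
Food is food, sex is sex. Find something that really differentiates you. Don't be emotional. (On the Venice flight) For my long-term goals, fix what still remains for me: speak confidently and comfortably, present without nerves, and write papers well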
Personal 2026 Paper plans
Agent State Graph
RouterRL MoE
Streaming Vision Speech Model
Proprioception as a Tool
Math Interpretability ODE/PDE
OptimismBench
Programming Intelligence
Latent Prompting Paper
Realtime VLA Steering
Better Steering Paper 2026
SNN Recurrent Model
Continuous Thought Transformer
Inner cortex outer cortex alignment
Feature Distillation
Font OCR project
Social Knowledge Diffusion
AI coding leveraging parser interpretability
Multi-party AI
Multi-user multi turn RL
Future Prediction Agent
Curvature Geometry Manifold
Towards Confidence Manifold
Agent Tool Classifier
Oversquashing project
Neuronpedia Single token ratio
MoE SAE BAIR
Benchmarks that measure how well AI interacts with humans
Token Manifold
future prediction using moe experts
Intransigent AI
Engineering project
Things I need
- ChatKit, GIS? Interesting things I would use every day
- ElevenLabs
Personal AI Product Idea
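- A machine that throws stones so they skip well across water
- Personal drone delivery, not an Amazon-style service that ships anything anywhere: a replacement for express courier, but far more precise. Legal restrictions will probably be tricky, though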
AI Paper Ideas
TreeFormer idea
Robot ai behavior neuron mechanistic interpretability
RL based Jailbreaking CoT
Jailbreaking with Stochastic Process theory
LLM self-play RL
Transformer Quantization using Sparse Activation
Back Forward Propagation
Binomial attention head is effective for scaling model
Long-term information reward (infovore) project
Sparsity based AI Evaluation
RL Time series Transformer like Decision Transformer
AI Neuron Idea density
Structured knowledge documenting Dataset
Attention mechanism based retriever
Attention Sink with abstraction
Sentence Transformer gravity learning
Multimodal Harmful request blocking
Attention Head-Specific Memory
find cot steering vector
LLM alphabetical encoding Tokenizer
SAE on diffusion model or state space model
MAML to SAE layers & model transferability
Seonglae Cho