First keynote
- motivation behavior information
- embodiment bias expectation persona dog/humanoid
- only a few want humanoid?
- personality matching is important as a part of alignment vertical
- why you do ai
- generalized models are bad at specialized things: Proprioception as a Tool
LLM reasoning oral
overthinking reduction @router rl
- DECS
- Decoupled Reward
- Curriculum Data Scheduling
- NRP detector, cnk?
- Dataset
- DeepScaleR
- https://huggingface.co/datasets/open-r1/OpenR1-Math-220k
- nrp
- GPQA-D
- LiveCodeBench
- baselines
- TLMRE
- ThinkPrune
- AdaptThink
- LC-R1
- takeaway
- efficient reasoning should penalize redundancy, not reasoning itself
- DECS separates them out
- The authors point out that existing length-penalty-based methods cause performance degradation because of a misalignment between the trajectory-level reward and token-level optimization. They identify two theoretical flaws:
- (1) high-entropy exploratory tokens within correct trajectories are wrongly penalized, and (2) redundant tokens after the Necessary Reasoning Prefix (NRP) are instead rewarded with a positive advantage.
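To make the takeaway concrete, here is a minimal sketch of a reward that penalizes only the tokens after the NRP instead of the whole trajectory. This is my own illustration, not the paper's formulation: the function name `decoupled_token_rewards`, the `nrp_end` index (assumed to come from some NRP detector), and the `redundancy_penalty` value are all hypothetical.

```python
from typing import List


def decoupled_token_rewards(
    num_tokens: int,
    nrp_end: int,
    is_correct: bool,
    redundancy_penalty: float = 0.1,  # hypothetical value, not from the talk
) -> List[float]:
    """Assign one reward per token.

    Tokens inside the necessary prefix (index < nrp_end) share the outcome
    reward. Tokens after it are treated as redundant and get a small negative
    reward, but only when the answer is correct; an incorrect trajectory has
    no necessary prefix, so every token just gets the outcome reward of 0.
    """
    outcome = 1.0 if is_correct else 0.0
    rewards = []
    for t in range(num_tokens):
        if is_correct and t >= nrp_end:
            rewards.append(-redundancy_penalty)  # redundant tail
        else:
            rewards.append(outcome)  # necessary reasoning is never penalized
    return rewards


print(decoupled_token_rewards(num_tokens=5, nrp_end=3, is_correct=True))
```

The point of the split is that reasoning tokens before `nrp_end` never see a length penalty, so shortening pressure only acts on the redundant tail.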
belief deviation
- GAE @router rl
- btr entropy?
- relation between memorization and exploration - summarizing from past observations of environments
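For reference on the GAE bullet above, a minimal sketch of the standard Generalized Advantage Estimation recursion. It assumes a single finite trajectory with a bootstrap value appended to `values`; the function name and the default `gamma` / `lam` are my own choices, not taken from the talk.

```python
from typing import List


def compute_gae(
    rewards: List[float],
    values: List[float],
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Return per-step advantages.

    `values` must have len(rewards) + 1 entries; the last one is the
    bootstrap value of the state after the final step (0.0 if terminal).
    """
    assert len(values) == len(rewards) + 1
    advantages = [0.0] * len(rewards)
    gae = 0.0
    # Walk backwards so each step can reuse the accumulated future residuals.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages


print(compute_gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0))
```

With `lam=0` this reduces to the one-step TD residual, and with `lam=1` it becomes the discounted return minus the value baseline.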
MemAgent
- hqarl good dataset?
- context distribution
Dense Retrieval learner - next chunk prediction as a retrieval loss
- LLM learning is token-level: next-token prediction (NTP) given a prefix
- retrieval is chunk-level, with NTP on other things
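One way to read "next chunk prediction as a retrieval loss" is as a softmax cross-entropy over candidate chunks, the chunk-level analogue of NTP over a vocabulary. The sketch below is only my assumption of what such a loss could look like, not the presented method; `next_chunk_loss`, the dot-product scoring, and `temperature` are all hypothetical.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def next_chunk_loss(
    prefix_emb: List[float],
    chunk_embs: List[List[float]],
    target_idx: int,
    temperature: float = 1.0,
) -> float:
    """Cross-entropy of picking the true next chunk among the candidates.

    Scores are dot products between the prefix embedding and each chunk
    embedding, so the candidates play the role the vocabulary plays in NTP.
    """
    logits = [dot(prefix_emb, c) / temperature for c in chunk_embs]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target_idx]


chunks = [[1.0, 0.0], [0.0, 1.0]]
print(next_chunk_loss([1.0, 0.0], chunks, target_idx=0))
```

The loss is lower when the prefix embedding points toward the true next chunk than toward a distractor.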
AI Task vector with Model Merging
good
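As a reminder of the usual task-vector idea (task vector = fine-tuned weights minus base weights, merged by adding scaled task vectors back onto the base), a minimal sketch follows. Whether the talk used exactly this form is an assumption on my part; weights are plain dicts of float lists here, and `task_vector`, `merge`, and `scale` are hypothetical names.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def task_vector(base: Weights, finetuned: Weights) -> Weights:
    """Element-wise difference: finetuned minus base, per parameter name."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def merge(base: Weights, task_vectors: List[Weights], scale: float = 1.0) -> Weights:
    """Add the scaled sum of task vectors onto the base weights."""
    merged = {name: list(vals) for name, vals in base.items()}
    for tv in task_vectors:
        for name, delta in tv.items():
            merged[name] = [m + scale * d for m, d in zip(merged[name], delta)]
    return merged


base = {"w": [1.0, 2.0]}
tv_a = task_vector(base, {"w": [2.0, 2.0]})
tv_b = task_vector(base, {"w": [1.0, 4.0]})
print(merge(base, [tv_a, tv_b]))
```

In practice `scale` is usually tuned on held-out data, since adding several task vectors at full strength can cause interference between tasks.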
CRV
good
Second Oral
associate tokens
- can we approximate the parameters?
- bigram
- interchangability
- context
Sequences of Logits Reveal the Low Rank Structure of Language Models
Causal Structure Learning in Hawkes Processes
- causal discovery in partially observed multivariate hawkes process
- continuous to discrete reduction using bins?
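To pin down what "continuous to discrete reduction using bins" would mean mechanically, here is a minimal sketch that turns continuous event timestamps into per-bin counts. It only illustrates the binning step under my own assumptions (observation window `[0, t_end)`, fixed `bin_width`); it is not the paper's estimator, and the function name is hypothetical.

```python
import math
from typing import List


def bin_event_times(event_times: List[float], t_end: float, bin_width: float) -> List[int]:
    """Count events of one process in consecutive bins covering [0, t_end)."""
    num_bins = math.ceil(t_end / bin_width)
    counts = [0] * num_bins
    for t in event_times:
        if 0.0 <= t < t_end:  # ignore events outside the observation window
            idx = min(int(t // bin_width), num_bins - 1)
            counts[idx] += 1
    return counts


print(bin_event_times([0.1, 0.4, 1.2, 2.9], t_end=3.0, bin_width=1.0))
```

Applying this per dimension of a multivariate process gives a discrete-time count series, at the cost of losing event ordering inside each bin.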
temporal SAE
leveraging sequential property of language model
- single level token
- temporal features are naturally in-distribution for steering over generation
optimal transport
hawkes process
causality and temporal smoothness
EigenBench
- human evaluation does not really scale
- an unbiased metric for value alignment
Seonglae Cho