Elicitation frontier

Creator: Seonglae Cho
Created: 2025 Nov 13 0:51
Edited: 2025 Nov 13 0:59

Latent capability

Quantifies how many parameters must change to "elicit" latent capabilities that LLMs already possess but do not readily display. Fine-tuning only 10-100 randomly selected parameters recovers 50% of the performance gap relative to full-parameter fine-tuning, while 1,000-10,000 parameters recover 95% (that random selection works at all suggests the knowledge is distributed throughout the model, not concentrated in specific layers or modules). Plotting performance against the log of the number of trained parameters yields a logistic S-curve. This pattern holds across model sizes (1B-8B), families (Llama/Qwen), and tasks (multiple choice, generation, CoT).
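The reported S-curve can be sketched as a logistic in log parameter count. The slope and midpoint below are illustrative values chosen so that roughly 50% of the gap is recovered around tens of parameters and roughly 95% around thousands; they are not fit to the paper's data:

```python
import math

def recovery(n_params, a=1.47, c=math.log10(30)):
    """Logistic S-curve: fraction of the full-FT performance gap recovered
    as a function of log10(number of fine-tuned parameters).
    a (slope) and c (midpoint) are illustrative assumptions."""
    z = a * (math.log10(n_params) - c)
    return 1.0 / (1.0 + math.exp(-z))

# With these illustrative constants, recovery(30) is 0.5 and
# recovery(3000) is about 0.95, matching the rough trend above.
```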
MDL
(minimum description length) distinguishes teaching from eliciting. If a latent capability already exists in the model, MDL drops sharply even when only a few parameters change (efficient compression is possible). If the capability is absent, MDL barely changes until many parameters are modified (genuine learning, i.e. teaching, is required). The performance gain therefore measures how efficiently fine-tuning awakens structures the model already knows. Because minimal parameter adjustment can easily reveal dangerous capabilities, the elicitation frontier can also serve as a capability-forecasting tool.
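Eliciting via a minimal parameter adjustment can be sketched as gradient descent restricted to a small random subset of parameters. The toy linear model, data, subset size, and learning rate below are hypothetical stand-ins, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a model: a linear map with a mean-squared-error loss.
W = rng.normal(size=(16, 4))
x = rng.normal(size=(8, 16))
y = rng.normal(size=(8, 4))

def loss_and_grad(W):
    err = x @ W - y
    return (err ** 2).mean(), 2 * x.T @ err / err.size

# Random subset of k parameters allowed to update (eliciting, not teaching).
k = 10
mask = np.zeros(W.size, dtype=bool)
mask[rng.choice(W.size, size=k, replace=False)] = True
mask = mask.reshape(W.shape)

lr = 0.01
W_before = W.copy()
for _ in range(100):
    _, g = loss_and_grad(W)
    W -= lr * g * mask  # gradients outside the subset are zeroed

changed = int((W != W_before).sum())       # at most k entries moved
loss_before, _ = loss_and_grad(W_before)
loss_after, _ = loss_and_grad(W)           # loss still drops despite k << W.size
```

The point of the sketch is that a tiny, randomly chosen fraction of parameters suffices to reduce the loss, mirroring the elicitation claim above.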