Abstractive Non-Deterministic Turing Machine with probability
Autoregression is a statistical technique that uses past values to predict future values in a time series. It's a regression of a variable against itself.
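A minimal sketch of "a regression of a variable against itself", assuming the simplest case, an AR(1) model x[t] = c + phi * x[t-1]. The function names `fit_ar1` and `predict_next` are hypothetical and chosen only for this illustration; the fit is ordinary least squares of each value on the one before it.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (assumed AR(1) model)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    # Slope = covariance(prev, curr) / variance(prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def predict_next(series, c, phi):
    """Predict the next value from the most recent one."""
    return c + phi * series[-1]


# Toy series generated by x[t] = 1 + 0.5 * x[t-1], starting from 1.0
series = [1.0]
for _ in range(10):
    series.append(1 + 0.5 * series[-1])

c, phi = fit_ar1(series)
next_value = predict_next(series, c, phi)
```

A language model is autoregressive in the same sense: each new token is predicted from the tokens generated so far, although the predictor is a neural network and not a linear fit.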
beam search
only a predetermined number (the beam width) of the best partial solutions are kept as candidates
explores a graph by expanding the most promising nodes in a limited set
uses breadth-first search (BFS) to build its search tree
Beam search is most often used to keep large search problems tractable when there is not enough memory to store the entire search tree
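The notes above can be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: `beam_search`, `step_fn`, the `TOY` transition table, and the `<s>` / `</s>` markers are all made up for the example, and `step_fn` stands in for whatever model returns next-token probabilities.

```python
import math


def beam_search(step_fn, start, beam_width, max_len, eos=None):
    """Level-by-level search that keeps only the `beam_width` best partial sequences."""
    beam = [(0.0, [start])]  # each entry: (log-probability, sequence so far)
    for _ in range(max_len):
        candidates = []
        for logp, seq in beam:
            if eos is not None and seq[-1] == eos:
                candidates.append((logp, seq))  # finished sequences carry over unchanged
                continue
            for token, p in step_fn(seq).items():
                if p > 0:
                    candidates.append((logp + math.log(p), seq + [token]))
        # Prune: keep only the best partial solutions as candidates
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if eos is not None and all(seq[-1] == eos for _, seq in beam):
            break
    return beam


# Hypothetical toy model: next-token probabilities depend only on the last token
TOY = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"x": 0.9, "y": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
}


def toy_step(seq):
    return TOY[seq[-1]]


wide = beam_search(toy_step, "<s>", beam_width=2, max_len=5, eos="</s>")
narrow = beam_search(toy_step, "<s>", beam_width=1, max_len=5, eos="</s>")
```

In this toy table a beam width of 1 (equivalent to greedy decoding) commits to "a" first and ends with total probability 0.33, while a width of 2 keeps "b" alive long enough to find the path with probability 0.36. Memory use grows with the beam width, not with the size of the full search tree.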
sampling strategy
greedy
top_k
- randomly select among the k tokens with the highest probability
nucleus
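- sample from the smallest set of high-probability tokens whose cumulative probability reaches a threshold p; tends to give the most diverse results of the three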
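The three strategies can be compared on a single next-token distribution. The sketch below is only illustrative: the function names and the example `probs` dictionary are hypothetical, and it assumes that top_k and nucleus sampling renormalize the kept probabilities and then sample in proportion to them, which is the common convention.

```python
import random


def greedy(probs):
    """Always pick the single most probable token."""
    return max(probs, key=probs.get)


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}


def nucleus_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    kept, cumulative = [], 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, pr))
        cumulative += pr
        if cumulative >= p:
            break
    total = sum(pr for _, pr in kept)
    return {tok: pr / total for tok, pr in kept}


def sample(probs, rng=random):
    """Draw one token in proportion to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Hypothetical next-token distribution
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}

greedy_token = greedy(probs)
top2 = top_k_filter(probs, k=2)
nucleus = nucleus_filter(probs, p=0.9)
token = sample(nucleus, random.Random(0))
```

With top_k the candidate set has a fixed size, while with nucleus sampling its size adapts to the distribution: a peaked distribution keeps few tokens and a flat one keeps many.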
This resembles how technology develops: the diverge-and-converge paradigm