GPT 1

Creator

Seonglae Cho

Created

2020 Aug 23 11:16

Editor

Seonglae Cho

Edited

2024 Dec 3 21:06

Refs

OpenAI GPT (Radford et al., 2018; released June 2018)

12 Transformer decoder layers

Implication: the size of the model affects the performance of the pretrained model.
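To make "model size" concrete, here is a rough parameter count for a 12-layer decoder stack. Only the layer count comes from these notes; the other hyperparameters (hidden size 768, feed-forward size 3072, context length 512, vocabulary of 40478 tokens) are assumptions taken from the commonly reported GPT-1 configuration, and the helper name `gpt1_param_count` is hypothetical. The output projection is assumed to share weights with the token embedding, so it is not counted separately.

```python
def gpt1_param_count(n_layer=12, d_model=768, d_ff=3072, n_ctx=512, vocab=40478):
    """Approximate parameter count of a decoder-only Transformer.

    All defaults except n_layer are assumed values, not taken from these notes.
    """
    # Token embeddings plus learned position embeddings.
    embeddings = vocab * d_model + n_ctx * d_model

    # Self-attention: fused Q/K/V projection, then the output projection.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)

    # Position-wise feed-forward: expand to d_ff, then project back.
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

    # Two layer norms per block, each with a gain and a bias vector.
    layer_norms = 2 * 2 * d_model

    per_layer = attn + mlp + layer_norms
    return embeddings + n_layer * per_layer


total = gpt1_param_count()
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the total lands a little under 120M, with the 12 blocks accounting for most of it; changing `n_layer` or `d_model` shows how quickly the count grows.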

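Uses Conv1D in place of the FFNN's dense layers, which is said to work well for large-scale training. Note that a Conv1D with kernel width 1 is mathematically the same operation as a per-token linear layer.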
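The relationship between Conv1D and a dense feed-forward layer can be checked directly. The sketch below is a plain-Python illustration, not the original implementation: it assumes the convolution has kernel width 1, and the function names `conv1d` and `linear` as well as the toy dimensions are made up for the example. With width 1, the convolution never mixes neighbouring positions, so it reduces to the same affine map applied to each token.

```python
import random


def conv1d(x, kernel, bias):
    """Valid 1D convolution over the sequence axis.

    x: [seq_len][d_in], kernel: [width][d_in][d_out], bias: [d_out].
    """
    width, d_in, d_out = len(kernel), len(kernel[0]), len(bias)
    out = []
    for t in range(len(x) - width + 1):
        row = list(bias)
        for j in range(width):          # offset inside the kernel window
            for i in range(d_in):       # input channel
                for o in range(d_out):  # output channel
                    row[o] += x[t + j][i] * kernel[j][i][o]
        out.append(row)
    return out


def linear(x, weight, bias):
    """Per-token affine map: every position is transformed independently."""
    return [
        [bias[o] + sum(tok[i] * weight[i][o] for i in range(len(tok)))
         for o in range(len(bias))]
        for tok in x
    ]


random.seed(0)
seq_len, d_in, d_out = 4, 3, 5  # toy sizes, chosen only for illustration
x = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(seq_len)]
weight = [[random.gauss(0, 1) for _ in range(d_out)] for _ in range(d_in)]
bias = [random.gauss(0, 1) for _ in range(d_out)]

conv_out = conv1d(x, [weight], bias)  # wrap the weight as a width-1 kernel
lin_out = linear(x, weight, bias)

# Largest elementwise gap between the two outputs (float rounding only).
max_diff = max(abs(a - b) for ra, rb in zip(conv_out, lin_out) for a, b in zip(ra, rb))
print(max_diff < 1e-9)
```

Any practical difference between the two is therefore an implementation detail, such as weight layout, and not a different model.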

Improving language understanding with unsupervised learning

We’ve obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we’re also releasing. Our approach is a combination of two existing ideas: transformers and unsupervised pre-training. These results provide a convincing example that pairing supervised learning methods with unsupervised pre-training works very well; this is an idea that many have explored in the past, and we hope our result motivates further research into applying this idea on larger and more diverse datasets.

https://openai.com/index/language-unsupervised/

Improving Language Understanding by Generative Pre-Training (paper PDF)

https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
