GPT 1

Creator
Seonglae Cho
Created
2020 Aug 23 11:16
Editor
Seonglae Cho
Edited
2024 Dec 3 21:06
Refs
OpenAI GPT (Radford et al., 2018; released June 2018)
12 Transformer decoder layers
An early indication that model size affects the performance of the pretrained model
Uses Conv1D (kernel-size-1 convolutions) instead of a plain FFNN for the position-wise feed-forward, an approach said to work well for large-scale training (see the sketch below)
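As a rough illustration of the notes above, here is a minimal PyTorch sketch of one GPT-1-style decoder block, with the position-wise feed-forward written as kernel-size-1 Conv1D layers rather than plain linear layers. This is not OpenAI's original TensorFlow code: the attention is PyTorch's nn.MultiheadAttention standing in for the paper's implementation, and only the hyperparameters (768-dim model, 12 heads, 3072-dim feed-forward, 12 such layers) follow the GPT-1 paper.

```python
import torch
import torch.nn as nn

class GPT1Block(nn.Module):
    """One GPT-1-style decoder layer: masked self-attention followed by a
    position-wise feed-forward built from kernel-size-1 Conv1D layers
    (illustrative sketch; hyperparameters follow the paper: d_model=768,
    12 heads, d_ff=3072, stacked 12 times)."""

    def __init__(self, d_model=768, n_heads=12, d_ff=3072, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ln1 = nn.LayerNorm(d_model)
        # Position-wise FFN as two 1x1 convolutions over the sequence dimension.
        self.ff = nn.Sequential(
            nn.Conv1d(d_model, d_ff, kernel_size=1),
            nn.GELU(),
            nn.Conv1d(d_ff, d_model, kernel_size=1),
        )
        self.ln2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):
        # Causal mask: each position may attend only to itself and earlier tokens.
        seq_len = x.size(1)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1
        )
        attn_out, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.ln1(x + self.drop(attn_out))  # post-LayerNorm, as in the original Transformer
        # Conv1d expects (batch, channels, seq), so transpose around the FFN.
        ff_out = self.ff(x.transpose(1, 2)).transpose(1, 2)
        return self.ln2(x + self.drop(ff_out))

# GPT-1 stacks 12 of these blocks on top of token + learned position embeddings.
blocks = nn.Sequential(*[GPT1Block() for _ in range(12)])
```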
Alec Radford, Ilya Sutskever et al.
Improving language understanding with unsupervised learning
We’ve obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we’re also releasing. Our approach is a combination of two existing ideas: transformers and unsupervised pre-training. These results provide a convincing example that pairing supervised learning methods with unsupervised pre-training works very well; this is an idea that many have explored in the past, and we hope our result motivates further research into applying this idea on larger and more diverse datasets.
Blog: https://openai.com/index/language-unsupervised/
Paper (PDF): https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
 

Backlinks

GPT

Copyright Seonglae Cho