What is GPT - Improving Language Understanding by Generative Pre-Training (paper explained)

2,995,032 Views
AI Lover
Published on 12/19/22 / In How-to & Learning

GPT is the first in the series of papers that demonstrated the effectiveness of unsupervised pre-training for language processing tasks. This video covers GPT-1, which proved to be an impactful work in the GPT series that now includes GPT-2 and GPT-3.


Paper: https://www.cs.ubc.ca/~amuham0....1/LING530/papers/rad
code: https://github.com/openai/finetune-transformer-l
Official OpenAI blog: https://openai.com/blog/language-unsupervised/


Paper Abstract:
Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
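
Illustrative sketch (not from the official code):
The "task-aware input transformations" mentioned in the abstract turn structured inputs into a single token sequence, so the same pre-trained Transformer can be fine-tuned with minimal architecture changes. The rough Python sketch below shows the idea for the four task types the paper describes. The special-token strings ("<s>", "<$>", "<e>"), the function names, and the whitespace tokenization are assumptions made for illustration only; the paper itself uses a BPE vocabulary and learned embeddings for the start, delimiter, and extract tokens.

```python
from typing import List

# Hypothetical special-token strings (assumption, chosen for illustration).
START, DELIM, EXTRACT = "<s>", "<$>", "<e>"


def tokenize(text: str) -> List[str]:
    # Placeholder whitespace tokenizer; the paper uses byte-pair encoding.
    return text.split()


def classification_input(text: str) -> List[str]:
    # Classification: [start] text [extract]
    return [START, *tokenize(text), EXTRACT]


def pair_input(first: str, second: str) -> List[str]:
    # Shared helper: [start] first [delim] second [extract]
    return [START, *tokenize(first), DELIM, *tokenize(second), EXTRACT]


def entailment_input(premise: str, hypothesis: str) -> List[str]:
    # Entailment: premise and hypothesis joined by a delimiter token.
    return pair_input(premise, hypothesis)


def similarity_inputs(text_a: str, text_b: str) -> List[List[str]]:
    # Similarity has no inherent ordering, so both orderings are built;
    # the model processes each one and the representations are combined.
    return [pair_input(text_a, text_b), pair_input(text_b, text_a)]


def multiple_choice_inputs(context: str, answers: List[str]) -> List[List[str]]:
    # QA / commonsense reasoning: one sequence per candidate answer;
    # each sequence is scored and the scores are normalized with a softmax.
    return [pair_input(context, answer) for answer in answers]


if __name__ == "__main__":
    print(entailment_input("A man is sleeping", "A person rests"))
    for seq in multiple_choice_inputs("Story so far .", ["ending one", "ending two"]):
        print(seq)
```

In this sketch the model architecture never changes between tasks. Only the way the input is laid out does, which is the point the abstract makes about requiring minimal changes to the model.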


AI Bites
YouTube: https://www.youtube.com/c/AIBites
Twitter: https://twitter.com/ai_bites
Patron: https://www.patreon.com/ai_bites
github: https://github.com/ai-bites
