Lecture 12.3 Famous transformers (BERT, GPT-2, GPT-3)

2,314,014 Views
AI Lover
Published on 12/19/22 / In How-to & Learning

ERRATA:
In the "original transformer" (slide 51), in the source attention, the key and value come from the encoder, and the query comes from the decoder.
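The corrected routing can be shown with a minimal single-head scaled dot-product sketch, assuming hypothetical dimensions and random weights purely for illustration: queries are projected from the decoder states, while keys and values are projected from the encoder states.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def source_attention(dec_states, enc_states, Wq, Wk, Wv):
    # Query from the decoder; key and value from the encoder (per the erratum).
    Q = dec_states @ Wq
    K = enc_states @ Wk
    V = enc_states @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (t_dec, t_enc) attention logits
    return softmax(scores, axis=-1) @ V      # one mixed encoder value per decoder position

# Hypothetical sizes: 5 source positions, 3 target positions, model dim 8.
rng = np.random.default_rng(0)
d = 8
enc = rng.standard_normal((5, d))
dec = rng.standard_normal((3, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = source_attention(dec, enc, Wq, Wk, Wv)
print(out.shape)  # one d-dimensional output per decoder position
```

The output has one row per decoder position, which is why the queries must come from the decoder side: they determine how many outputs the attention produces.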

In this lecture we look at the details of some famous transformer models: how they were trained, and what they could do after training.

slides: https://dlvu.github.io/slides/dlvu.lecture12.pdf
course website: https://dlvu.github.io
Lecturer: Peter Bloem
