Lecture 12.3 Famous transformers (BERT, GPT-2, GPT-3)

Published: 12/19/22
Generative AI

ERRATA:
In the "original transformer" (slide 51), in the source attention, the key and value come from the encoder, and the query comes from the decoder.
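The correction above can be made concrete with a small NumPy sketch of source (cross) attention. This is an illustration, not code from the lecture: projection matrices are left out (identity projections), and the function name and shapes are assumptions made for the example.

```python
import numpy as np

def cross_attention(decoder_states, encoder_states):
    """Source ("cross") attention as corrected in the errata:
    queries come from the decoder; keys and values come from the
    encoder. Learned projections are omitted for brevity."""
    d_k = decoder_states.shape[-1]
    q = decoder_states                  # queries: from the decoder
    k = encoder_states                  # keys: from the encoder
    v = encoder_states                  # values: from the encoder
    scores = q @ k.T / np.sqrt(d_k)     # (dec_len, enc_len)
    # softmax over encoder positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                  # (dec_len, d_model)

# Example: 4 decoder positions attending over 6 encoder positions.
dec = np.random.randn(4, 8)
enc = np.random.randn(6, 8)
out = cross_attention(dec, enc)         # shape (4, 8)
```

Each decoder position produces one output vector, a weighted mix of the encoder's value vectors, which is why the output length follows the decoder while the attention runs over encoder positions.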

In this lecture we look at the details of some famous transformer models: how they were trained, and what they could do once trained.

slides: https://dlvu.github.io/slides/dlvu.lecture12.pdf
course website: https://dlvu.github.io
Lecturer: Peter Bloem



