PyTorch Image Captioning Tutorial

2,772,528 Views

AI Lover

Published on 12/15/22 / In How-to & Learning

In this tutorial we walk through how an image captioning system works and implement one from scratch in PyTorch, using the Flickr8k caption dataset. There are several ways to improve the model: train a larger model (the one used here is relatively small), train for longer, or add attention, similar to this paper: https://arxiv.org/abs/1502.03044.
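For a rough idea of the CNN + RNN structure before watching, here is a minimal sketch. It is not the exact code from the video: the tiny convolutional stack, the choice of an LSTM, the class names (EncoderCNN, DecoderRNN, CaptionModel) and all sizes are assumptions made for illustration, and random tensors stand in for Flickr8k images and tokenized captions.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Maps images (N, 3, H, W) to one feature vector per image (N, embed_size)."""

    def __init__(self, embed_size):
        super().__init__()
        # Assumption: a tiny conv stack as a stand-in for a real image backbone.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (N, 32, 1, 1)
        )
        self.fc = nn.Linear(32, embed_size)

    def forward(self, images):
        x = self.features(images).flatten(1)  # (N, 32)
        return self.fc(x)  # (N, embed_size)


class DecoderRNN(nn.Module):
    """Predicts caption tokens from the image feature and the previous tokens."""

    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        embeddings = self.embed(captions)  # (N, T, embed_size)
        # The image feature is fed as the first step of the input sequence.
        inputs = torch.cat([features.unsqueeze(1), embeddings], dim=1)
        hiddens, _ = self.lstm(inputs)  # (N, T + 1, hidden_size)
        return self.fc(hiddens)  # (N, T + 1, vocab_size)


class CaptionModel(nn.Module):
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.encoder = EncoderCNN(embed_size)
        self.decoder = DecoderRNN(embed_size, hidden_size, vocab_size)

    def forward(self, images, captions):
        return self.decoder(self.encoder(images), captions)


torch.manual_seed(0)
vocab_size = 100  # assumption: toy vocabulary size
images = torch.randn(2, 3, 64, 64)  # random stand-in for an image batch
captions = torch.randint(0, vocab_size, (2, 5))  # random stand-in token ids

model = CaptionModel(embed_size=32, hidden_size=64, vocab_size=vocab_size)
# Feed all tokens except the last; with the image feature prepended,
# output step t is trained to predict caption token t.
logits = model(images, captions[:, :-1])  # (2, 5, vocab_size)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab_size), captions.reshape(-1))
```

Feeding the image feature as the first input step is one common way to condition the RNN on the image. The attention variant mentioned above would instead let the decoder look at spatial CNN features at every step.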

Video of dataset (link in that video description to download the dataset yourself):
https://youtu.be/9sHcLvVXsns

✅ Support My Channel Through Patreon:
https://www.patreon.com/aladdinpersson

PyTorch Playlist:
https://www.youtube.com/playli....st?list=PLhhyoLH6Ijf

Github Repository:
https://github.com/aladdinpers....son/Machine-Learning

I stole the thumbnail image from yunjey's GitHub tutorial on image captioning, which I also used as a resource. The implementation in the video differs a bit, but the repository is definitely worth checking out:
https://github.com/yunjey/pytorch-tutorial

OUTLINE:
00:00 - Introduction
00:12 - Explanation of Image Captioning
05:15 - Overview of the code
06:07 - Implementation of CNN and RNN
20:03 - Setting up the training
30:36 - Fixing errors
32:18 - Small evaluation and ending
