TensorFlow Tutorial 19 - Custom Dataset for Text with TextLineDataset
In this video I show you how to create an input pipeline for text data. We focus on TextLineDataset, a fairly general method that you can adapt to many different text data structures. The tutorial mainly demonstrates how to load the IMDB dataset from a text file, but I also give some ideas for what to do if your text data is structured differently, for example split over multiple text files, or a translation dataset split over two text files.
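As a rough idea of what such a pipeline can look like, here is a minimal sketch and not the exact code from the video. The CSV layout (a header row followed by `split,label,review`), the file name, and the helper names `is_train` and `parse` are all assumptions made for illustration; the real IMDB file may be structured differently.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical sample file (assumed layout: header row, then split,label,review).
rows = [
    "split,label,review",
    "train,pos,A wonderful and moving film",
    "test,neg,Dull plot and wooden acting",
    "train,neg,I wanted my two hours back",
]
path = os.path.join(tempfile.mkdtemp(), "imdb_sample.csv")
with open(path, "w") as f:
    f.write("\n".join(rows) + "\n")


def is_train(line):
    # Keep only the rows whose first field is "train".
    fields = tf.strings.split(line, ",", maxsplit=2)
    return tf.equal(fields[0], "train")


def parse(line):
    # Split into at most three fields so commas inside the review are kept.
    fields = tf.strings.split(line, ",", maxsplit=2)
    label = tf.cast(tf.equal(fields[1], "pos"), tf.int32)
    return fields[2], label


# One dataset element per line of the file; skip(1) drops the header row.
ds = tf.data.TextLineDataset(path).skip(1)
train_ds = ds.filter(is_train).map(parse)

examples = [(text.numpy().decode("utf-8"), int(label.numpy()))
            for text, label in train_ds]
print(examples)
```

TextLineDataset also accepts a list of file names, which is one way to handle data split over several files.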
Download the data (IMDB) used in the video here:
https://www.kaggle.com/dataset..../ff33c576e11e20d0c3a
I learned a lot and was inspired to make these TensorFlow videos by the TensorFlow Specialization on Coursera. Below you'll find both affiliate and non-affiliate links. The price for you is the same, but a small commission goes back to the channel if you buy through the affiliate link.
affiliate: https://bit.ly/3t3tgI5
non-affiliate: https://bit.ly/3kZgN5B
GitHub Repository:
https://github.com/aladdinpers....son/Machine-Learning
✅ Equipment I use and recommend:
https://www.amazon.com/shop/aladdinpersson
❤️ Become a Channel Member:
https://www.youtube.com/channe....l/UCkzW5JSFwvKRjXABI
✅ One-Time Donations:
PayPal: https://bit.ly/3buoRYH
Ethereum: 0xc84008f43d2E0bC01d925CC35915CdE92c2e99dc
▶️ You Can Connect with me on:
Twitter - https://twitter.com/aladdinpersson
LinkedIn - https://www.linkedin.com/in/al....addin-persson-a95384
GitHub - https://github.com/aladdinpersson
TensorFlow Playlist:
https://www.youtube.com/playli....st?list=PLhhyoLH6Ijf
OUTLINE:
0:00 - Introduction and Dataset Overview
1:39 - Load using TextLineDataset
4:13 - Filtering Dataset
8:12 - Creating Vocabulary
13:43 - Numericalizing with TokenTextEncoder
18:10 - Applying map on datasets
20:35 - Simple Model
22:30 - Dataset in Several Files
25:50 - Sketch Load Translation Dataset
29:22 - Ending
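
The vocabulary and numericalizing steps in the outline can be sketched in plain Python as follows. This is a simplified stand-in and not TokenTextEncoder itself: the whitespace tokenizer, the `<pad>` and `<unk>` entries, and the function names `tokenize`, `build_vocab` and `encode` are all assumptions chosen for illustration.

```python
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, then split on whitespace.
    return text.lower().split()


def build_vocab(texts, min_freq=1):
    # Count tokens over all texts, then assign ids by descending frequency
    # (ties broken alphabetically). Ids 0 and 1 are reserved.
    counts = Counter(tok for text in texts for tok in tokenize(text))
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab):
    # Map each token to its id; unseen tokens map to the <unk> id.
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(text)]


texts = ["a good film", "a bad film"]
vocab = build_vocab(texts)
ids = encode("a great film", vocab)
print(vocab)
print(ids)
```

In a tf.data pipeline, an encoder like this is typically applied to every example through the dataset's map step.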