Convolutional Neural Networks - Ep. 8 (Deep Learning SIMPLIFIED)

3,023,588 Views

Generative AI

Published on 12/17/22 / In How-to & Learning

Out of all the current Deep Learning applications, machine vision remains one of the most popular. Since Convolutional Neural Nets (CNNs) are among the best available tools for machine vision, these nets have helped Deep Learning become one of the hottest topics in AI.

Deep Learning TV on
Facebook: https://www.facebook.com/DeepLearningTV/
Twitter: https://twitter.com/deeplearningtv

CNNs are deep nets that are used for image, object, and even speech recognition. Pioneered by Yann LeCun of New York University, these nets are widely used in the tech industry; Facebook, for example, uses them for facial recognition. If you start reading about CNNs, you will quickly discover the ImageNet challenge, a project that was started to showcase the state of the art and to help researchers access high-quality image data. Every top Deep Learning team in the world joins the competition, but each time it’s a CNN that ends up taking first place.

A CNN tends to be a difficult concept to grasp. If you’ve ever struggled while trying to learn about these nets, please comment and share your experiences.

CNNs have multiple types of layers, the first of which is the convolutional layer. To visualize this layer, imagine a set of evenly spaced flashlights all shining directly at a wall. Every flashlight is looking for the exact same pattern through a process called convolution. A flashlight’s area of search is fixed in place, and it is bounded by the individual circle of light cast on the wall. The entire set of flashlights forms one filter, whose output records where in the image the given pattern appears. A CNN typically uses multiple filters in parallel, each scanning for a different pattern in the image. Thus the entire convolutional layer is a 3-dimensional grid of these flashlights.
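To make the flashlight picture concrete, here is a minimal sketch of one filter scanning a tiny image. The image, the filter values, and the function name `convolve2d` are all made up for illustration, and the code uses plain Python lists rather than any particular Deep Learning library. It also assumes a stride of 1 and no padding. Strictly speaking this is the "cross-correlation" form that Deep Learning libraries commonly call convolution.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # One "flashlight": a weighted sum over the patch it shines on.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
# [0, 2, 0]
# [0, 2, 0]
# [0, 2, 0]
```

Every position uses the same filter weights, and the non-zero column in the output marks where the edge sits in the image. That is the "location data" a filter produces.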

Connecting some dots
- A series of filters forms layer one, called the convolutional layer. The weights and biases in this layer determine the effectiveness of the filtering process.

- Each flashlight represents a single neuron. In an ordinary layer, neurons simply activate or fire. In the convolutional layer, by contrast, neurons search for patterns through convolution. Neurons from different filters search for different patterns, and thus they will process the input differently.

- Unlike the nets we've seen thus far where every neuron in a layer is connected to every neuron in the adjacent layers, a CNN has the flashlight effect. A convolutional neuron will only connect to the input neurons that it “shines” upon.
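One consequence of the flashlight effect is easy to check with arithmetic. Because each convolutional neuron only looks at a small patch, and every neuron in a filter looks for the same pattern, a convolutional layer needs far fewer weights than a fully connected one. The sketch below counts weights and biases for both cases; the layer sizes and the helper names `dense_params` and `conv_params` are assumptions chosen for illustration.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases when every input connects to every output neuron."""
    return n_inputs * n_outputs + n_outputs


def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus biases when each filter shares one small kernel across the image."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters


# Hypothetical 32x32 grayscale image flattened to 1024 inputs.
fully_connected = dense_params(n_inputs=32 * 32, n_outputs=32 * 32)

# Hypothetical convolutional layer: eight 3x3 filters over one input channel.
convolutional = conv_params(kernel_h=3, kernel_w=3, in_channels=1, n_filters=8)

print(fully_connected)  # 1049600
print(convolutional)    # 80
```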

The convolved input is then sent to the next layer for activation. CNNs use backprop for training, but because an activation function called ReLU (rectified linear unit) is used, the nets are far less prone to the vanishing gradient problem.
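A rough way to see why ReLU helps is to compare gradients. During backprop, the gradient is multiplied by the activation's slope at every layer. A sigmoid's slope is never more than 0.25, so the product shrinks quickly with depth, while ReLU's slope is exactly 1 for any positive input. The sketch below is a simplified illustration that ignores the weights entirely; the depth of 10 layers is an arbitrary assumption.

```python
import math


def relu(x):
    """Rectified linear unit: pass positive values through, clamp the rest to zero."""
    return max(0.0, x)


def relu_grad(x):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def sigmoid_grad(x):
    """Slope of the sigmoid, which peaks at 0.25 when x is 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


depth = 10  # hypothetical number of layers

# Best case for sigmoid: every layer sits at its steepest point (x = 0).
sigmoid_chain = sigmoid_grad(0.0) ** depth

# ReLU with positive inputs at every layer.
relu_chain = relu_grad(1.0) ** depth

print(sigmoid_chain)  # 9.5367431640625e-07
print(relu_chain)     # 1.0
```

Note that a ReLU neuron with a negative input passes back a gradient of zero, so this is a reduction of the problem rather than a guarantee.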

In real-world applications, image convolution results in hundreds of millions of weights and biases, which has an adverse effect on performance. Thus, after ReLU, the activations are typically pooled in an adjacent layer to reduce dimensionality. Afterwards, there is usually a fully connected layer that acts as a classifier.
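As an illustration of pooling, the sketch below uses max pooling over non-overlapping 2x2 windows. That is one common choice, assumed here rather than taken from the text above. The input values and the function name `max_pool2d` are made up for the example.

```python
def max_pool2d(feature_map, size=2):
    """Keep only the largest activation in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 grid of activations coming out of ReLU.
activations = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

pooled = max_pool2d(activations)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 grid shrinks to 2x2, so the next layer has a quarter as many values to process.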

CNNs that are in use typically have an architecture with repeated sets of layers. Set 1 is a convolutional layer followed by a ReLU. This set can be repeated a few times, and the repeated structure is followed by a pooling layer. This resulting combination forms set 2, which is also repeated a few more times. The final resulting structure is then attached to a fully connected layer at the end. This architecture allows the net to continuously build complex patterns from simple ones, all while lowering computing costs with dimensionality reduction.
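The repeated structure can be written down as a short sketch. The code below only builds a list of layer names and tracks how the image's side length shrinks; it does not train anything. The repeat counts, the 32-pixel input, the unpadded 3x3 filters, and the 2x2 pooling are all assumptions picked for illustration, as are the function names.

```python
def build_cnn_layout(convs_per_set, n_sets):
    """Lay out (conv, relu) x convs_per_set followed by pool, repeated n_sets times, then a classifier."""
    layers = []
    for _ in range(n_sets):
        for _ in range(convs_per_set):
            layers += ["conv", "relu"]      # set 1
        layers.append("pool")               # set 1 repeated, plus pooling, forms set 2
    layers.append("fully_connected")
    return layers


def trace_size(layers, size, kernel=3, pool=2):
    """Follow the side length of a square input through the layout.

    Assumes unpadded kernel x kernel convolutions with stride 1
    and non-overlapping pool x pool pooling.
    """
    for layer in layers:
        if layer == "conv":
            size = size - kernel + 1
        elif layer == "pool":
            size = size // pool
    return size


layout = build_cnn_layout(convs_per_set=2, n_sets=2)
print(layout)
print(trace_size(layout, size=32))  # 5
```

With these assumed settings, a 32x32 input shrinks to 30, 28, 14, 12, 10, and finally 5 pixels per side before it reaches the fully connected layer.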

CNNs are a powerful tool, but there is one drawback: they require tens of millions of labelled data points for training. They also must be trained with GPUs for the process to be completed in a reasonable amount of time.

Credits
Nickey Pickorita (YouTube art) -
https://www.upwork.com/freelan....cers/~0147b8991909b2
Isabel Descutner (Voice) -
https://www.youtube.com/user/IsabelDescutner
Dan Partynski (Copy Editing) -
https://www.linkedin.com/in/danielpartynski
Jagannath Rajagopal (Creator, Producer and Director) -
https://ca.linkedin.com/in/jagannathrajagopal
