Sound Generation with Deep Learning || Approaches and Challenges

2,462,757 Views

AI Lover

Published on 12/17/22 / In How-to & Learning

Join The Sound Of AI Slack community:
https://join.slack.com/t/thesoundofai/shared_invite/zt-f71npumr-anli6W4QCuZ8UCj2gLoBkw

This video introduces the sound generation task and shows how to classify different types of sound generation systems.

I discuss the approaches and Deep Learning models used to generate sound. I also outline the challenges encountered with different methods, and discuss the features used to train generative sound systems.

Slides:
https://github.com/musikalkemi....st/generating-sound-

Interested in hiring me as a consultant/freelancer?
https://valeriovelardo.com/

Follow Valerio on Facebook:
https://www.facebook.com/TheSoundOfAI

Connect with Valerio on LinkedIn:
https://www.linkedin.com/in/valeriovelardo

Follow Valerio on Twitter:
https://twitter.com/musikalkemist

===============================

Content:
0:00 Intro
0:33 Defining the sound generation task
1:17 Classification of sound generation systems
2:14 Types of generated sounds
3:41 Sound representations
4:07 Generation from raw audio
7:40 Challenges of raw audio generation
10:21 Generation from spectrograms
16:12 Advantages of generation from spectrograms
18:07 Challenges of generation from spectrograms
20:26 Can we generate sound with MFCCs?
21:26 DL architectures for sound generation
22:13 Inputs for generation
24:03 Details about the sound generative system we'll build
24:44 What's next?
===============================

Mentioned papers:

WaveNet: A Generative Model for Raw Audio
https://arxiv.org/pdf/1609.03499.pdf

Jukebox: A Generative Model for Music
https://arxiv.org/pdf/2005.00341

DrumGAN: Synthesis of Drum Sounds with Timbral Feature Conditioning Using Generative Adversarial Networks
https://arxiv.org/pdf/2008.12073

MelNet: A Generative Model for Audio in the Frequency Domain
https://arxiv.org/pdf/1906.01083.pdf
