Sound Generation with Deep Learning || Approaches and Challenges
Join The Sound Of AI Slack community:
https://join.slack.com/t/thesoundofai/shared_invite/zt-f71npumr-anli6W4QCuZ8UCj2gLoBkw
In this video, you'll get an overview of the sound generation task and learn how to classify different types of sound generation systems.
I discuss the approaches and deep learning models used to generate sound, outline the challenges each method faces, and cover the features used to train generative sound systems.
Slides:
https://github.com/musikalkemi....st/generating-sound-
Interested in hiring me as a consultant/freelancer?
https://valeriovelardo.com/
Follow Valerio on Facebook:
https://www.facebook.com/TheSoundOfAI
Connect with Valerio on LinkedIn:
https://www.linkedin.com/in/valeriovelardo
Follow Valerio on Twitter:
https://twitter.com/musikalkemist
===============================
Content:
0:00 Intro
0:33 Defining the sound generation task
1:17 Classification of sound generation systems
2:14 Types of generated sounds
3:41 Sound representations
4:07 Generation from raw audio
7:40 Challenges of raw audio generation
10:21 Generation from spectrograms
16:12 Advantages of generation from spectrograms
18:07 Challenges of generation from spectrograms
20:26 Can we generate sound with MFCCs?
21:26 DL architectures for sound generation
22:13 Inputs for generation
24:03 Details about the sound generative system we'll build
24:44 What's next?
===============================
Mentioned papers:
WaveNet: A Generative Model for Raw Audio
https://arxiv.org/pdf/1609.03499.pdf
Jukebox: A Generative Model for Music
https://arxiv.org/pdf/2005.00341
DrumGAN: Synthesis of Drum Sounds with Timbral Feature Conditioning Using Generative Adversarial Networks
https://arxiv.org/pdf/2008.12073
MelNet: A Generative Model for Audio in the Frequency Domain
https://arxiv.org/pdf/1906.01083.pdf