How to Run Large AI Models from Hugging Face on a Single GPU without OOM
3,087 Views · Published on 06/02/23 · In How-to & Learning
This demo shows how to run large AI models from #huggingface on a single GPU without an Out of Memory (OOM) error. Take OPT-175B or BLOOM-176B: these large language models normally require very high-end hardware or a multi-GPU setup, but thanks to bitsandbytes, with just a few tweaks to your code you can run such models on a single node.
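To see why quantization matters here, some back-of-envelope memory math helps (this illustration is my addition, not from the video): the weights alone need roughly `n_params × bytes_per_param` of GPU memory, and bitsandbytes' int8 quantization drops that from 2 bytes per parameter (fp16) to 1.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate GPU memory (in GB) needed just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

# OPT-175B in fp16 (2 bytes/param) vs int8 (1 byte/param):
print(weight_memory_gb(175e9, 2))  # 350.0 GB -> far beyond any single GPU
print(weight_memory_gb(175e9, 1))  # 175.0 GB -> still multi-GPU territory
# The 3B-parameter BLOOM model used in this tutorial, on a 16 GB Tesla T4:
print(weight_memory_gb(3e9, 2))    # 6.0 GB in fp16
print(weight_memory_gb(3e9, 1))    # 3.0 GB in int8 -- a comfortable fit
```

Note this counts only the weights; activations, the KV cache, and CUDA overhead add more on top, which is why halving the weight footprint is often the difference between OOM and a working demo.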
In this tutorial, we load the 3-billion-parameter BLOOM model from Hugging Face and run #LLM inference on Google Colab (Tesla T4) without OOM.
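The loading step can be sketched as below. This is a minimal sketch of the approach, assuming the `transformers` + `bitsandbytes` integration (`load_in_8bit=True`) and the public `bigscience/bloom-3b` checkpoint; it needs a CUDA GPU and `pip install transformers accelerate bitsandbytes`.

```python
def load_bloom_8bit(model_name: str = "bigscience/bloom-3b"):
    """Load a causal LM with bitsandbytes int8 quantization on the available GPU."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",   # let accelerate place layers on the GPU
        load_in_8bit=True,   # quantize weights to int8 via bitsandbytes
    )
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 20) -> str:
    """Run greedy generation and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Usage (downloads ~3B parameters; run on a GPU runtime such as a Colab T4):
#   model, tok = load_bloom_8bit()
#   print(generate(model, tok, "Hello, my name is"))
```

The only changes versus a standard fp16 load are the `load_in_8bit=True` flag and `device_map="auto"`, which is the "few tweaks to your code" the description refers to.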
This is brilliant! Kudos to the team.
bitsandbytes - https://github.com/TimDettmers/bitsandbytes
Google Colab Notebook - https://colab.research.google.com/drive/1qOjXfQIAU