How to run Large AI Models from Hugging Face on Single GPU without OOM

AI Lover
Published on 06/02/23 / In How-to & Learning

This demo shows how to run large AI models from #huggingface on a single GPU without out-of-memory (OOM) errors. Take OPT-175B or BLOOM-176B: these large language models normally require very high-memory machines or multi-GPU setups, but thanks to bitsandbytes, with just a few tweaks to your code you can run them on a single node.
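To see why a single GPU runs out of memory, some back-of-the-envelope arithmetic helps (this illustration is mine, not from the video): dense model weights cost roughly 2 bytes per parameter in fp16 and 1 byte in int8, which is exactly the saving bitsandbytes exploits.

```python
# Rough weight-memory estimate for a dense model (illustration only;
# ignores activations, optimizer state, and KV cache).

def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight memory in GiB."""
    return num_params * bytes_per_param / 1024**3

fp16_gb = weight_memory_gb(176e9, 2)        # BLOOM-176B in fp16: ~327.8 GiB
int8_gb = weight_memory_gb(176e9, 1)        # same model in int8: ~163.9 GiB
bloom3b_int8_gb = weight_memory_gb(3e9, 1)  # BLOOM-3B in int8: ~2.8 GiB

print(f"BLOOM-176B fp16: {fp16_gb:.1f} GiB, int8: {int8_gb:.1f} GiB")
print(f"BLOOM-3B int8:   {bloom3b_int8_gb:.1f} GiB")
```

A Tesla T4 has 16 GB of memory, so a 3B-parameter model in int8 fits comfortably, while the 176B model stays out of reach of any single card regardless of precision.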

In this tutorial, we'll load the 3-billion-parameter BLOOM model from Hugging Face and run #LLM inference on Google Colab (Tesla T4) without OOM.
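The loading step can be sketched as below. This is a minimal sketch, not the video's exact notebook: the checkpoint name "bigscience/bloom-3b" and the generation settings are my assumptions, and running it requires the transformers, accelerate, and bitsandbytes packages plus a CUDA GPU.

```python
# Sketch: loading BLOOM with int8 weights via transformers + bitsandbytes.
# Imports are kept inside the functions so the module itself has no
# heavyweight dependencies; the actual load needs a CUDA GPU.

def load_bloom_8bit(model_name: str = "bigscience/bloom-3b"):
    """Load a causal LM with int8 weights so it fits on a single Tesla T4."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",   # let accelerate place layers on the GPU
        load_in_8bit=True,   # bitsandbytes int8 quantization
    )
    return model, tokenizer

def generate(model, tokenizer, prompt: str, max_new_tokens: int = 40) -> str:
    """Run greedy generation and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Usage (on a GPU runtime):
#   model, tokenizer = load_bloom_8bit()
#   print(generate(model, tokenizer, "Large language models can"))
```

The key point is that `load_in_8bit=True` is the only change versus a normal fp16 load, which is what makes the bitsandbytes approach a "few tweaks" rather than a rewrite.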

This is brilliant! Kudos to the team.

bitsandbytes - https://github.com/TimDettmers/bitsandbytes
Google Colab Notebook - https://colab.research.google.....com/drive/1qOjXfQIAU

