How to Run Large AI Models from Hugging Face on a Single GPU without OOM
3,087 Views · Published on 06/02/23 · In How-to & Learning
This demo shows how to run large AI models from #huggingface on a single GPU without an Out of Memory (OOM) error. Take OPT-175B or BLOOM-176B: these large language models normally require very high-end hardware or a multi-GPU setup, but thanks to bitsandbytes, with just a few tweaks to your code you can run such models on a single node.
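To see why quantization matters here, some back-of-envelope memory math helps (this illustration is my addition, not from the video): the weights alone need roughly `n_params × bytes_per_param` of GPU memory, and bitsandbytes' int8 quantization drops that from 2 bytes per parameter (fp16) to 1.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate GPU memory (in GB) needed just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

# OPT-175B in fp16 (2 bytes/param) vs int8 (1 byte/param):
print(weight_memory_gb(175e9, 2))  # 350.0 GB -> far beyond any single GPU
print(weight_memory_gb(175e9, 1))  # 175.0 GB -> still multi-GPU territory
# The 3B-parameter BLOOM model used in this tutorial, on a 16 GB Tesla T4:
print(weight_memory_gb(3e9, 2))    # 6.0 GB in fp16
print(weight_memory_gb(3e9, 1))    # 3.0 GB in int8 -- a comfortable fit
```

Note this counts only the weights; activations, the KV cache, and CUDA overhead add more on top, which is why halving the weight footprint is often the difference between OOM and a working demo.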
In this tutorial, we load the 3-billion-parameter BLOOM model from Hugging Face and run #LLM inference on Google Colab (Tesla T4) without OOM.
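The loading step can be sketched as below. This is a minimal sketch of the approach, assuming the `transformers` + `bitsandbytes` integration (`load_in_8bit=True`) and the public `bigscience/bloom-3b` checkpoint; it needs a CUDA GPU and `pip install transformers accelerate bitsandbytes`.

```python
def load_bloom_8bit(model_name: str = "bigscience/bloom-3b"):
    """Load a causal LM with bitsandbytes int8 quantization on the available GPU."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",   # let accelerate place layers on the GPU
        load_in_8bit=True,   # quantize weights to int8 via bitsandbytes
    )
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 20) -> str:
    """Run greedy generation and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Usage (downloads ~3B parameters; run on a GPU runtime such as a Colab T4):
#   model, tok = load_bloom_8bit()
#   print(generate(model, tok, "Hello, my name is"))
```

The only changes versus a standard fp16 load are the `load_in_8bit=True` flag and `device_map="auto"`, which is the "few tweaks to your code" the description refers to.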
This is brilliant! Kudos to the team.
bitsandbytes - https://github.com/TimDettmers/bitsandbytes
Google Colab Notebook - https://colab.research.google.com/drive/1qOjXfQIAU