How to Configure a GPU for Working with LLM Models

Configuring a GPU for working with large language models (LLMs) requires attention to several factors: hardware compatibility, driver installation, software configuration, and environment optimization. In this article, we will discuss step by step how to prepare your GPU for efficient work with LLMs.

1. Choosing the Right GPU

Before starting the configuration, it is important to choose the right GPU. LLMs require a lot of GPU memory (VRAM) and computational power. The most popular options are:

  • NVIDIA GeForce RTX 3090 / RTX 4090 (24 GB VRAM): consumer cards suitable for small and medium models.
  • NVIDIA A100 (40/80 GB VRAM) and H100 (80 GB VRAM): data-center cards for large models.
  • AMD Instinct accelerators (e.g., MI210/MI250): AMD's data-center option, supported via ROCm.
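
As a rough rule of thumb, 16-bit inference needs about 2 bytes of VRAM per model parameter, plus overhead for activations and the KV cache. Here is a minimal sketch of that estimate (the model sizes and the 20% overhead factor are illustrative assumptions, not measurements):

    def estimate_vram_gib(num_params_billion, bytes_per_param=2):
        # Weights alone: parameters * bytes per parameter
        raw = num_params_billion * 1e9 * bytes_per_param
        # Add ~20% for activations/KV cache and convert bytes -> GiB
        return raw * 1.2 / 1024**3

    for label, params_b in [("7B", 7), ("13B", 13), ("70B", 70)]:
        print(f"{label} model: ~{estimate_vram_gib(params_b):.0f} GiB in fp16")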

2. Installing GPU Drivers

For NVIDIA Cards

  1. Downloading Drivers:

    • Visit the official NVIDIA driver download page (https://www.nvidia.com/Download/index.aspx) and select your card, or simply use the distribution packages shown in the next step.

  2. Installing Drivers:

    sudo apt update
    sudo apt install -y nvidia-driver-535
    sudo reboot
    
  3. Checking Installation:

    nvidia-smi
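
If Python is already available (the full Python setup is covered in section 4), the driver can also be queried programmatically through the NVML bindings. This optional sketch assumes pip install nvidia-ml-py:

    from pynvml import (nvmlInit, nvmlDeviceGetCount, nvmlDeviceGetHandleByIndex,
                        nvmlDeviceGetName, nvmlDeviceGetMemoryInfo)

    nvmlInit()                                   # talks to the driver installed above
    for i in range(nvmlDeviceGetCount()):
        handle = nvmlDeviceGetHandleByIndex(i)
        mem = nvmlDeviceGetMemoryInfo(handle)
        print(nvmlDeviceGetName(handle), f"- {mem.total / 1024**3:.1f} GiB VRAM")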
    

For AMD Cards

  1. Downloading Drivers:

    • Follow AMD's official ROCm installation guide for your distribution (https://rocm.docs.amd.com), or use the distribution packages shown in the next step.

  2. Installing Drivers:

    sudo apt update
    sudo apt install -y rocm-opencl-runtime
    sudo reboot
    
  3. Checking Installation:

    rocminfo
    

3. Installing CUDA and cuDNN

Installing CUDA

  1. Downloading CUDA:

    • Download the CUDA repository package for your distribution from https://developer.nvidia.com/cuda-downloads.

  2. Installing CUDA:

    sudo dpkg -i cuda-repo-<distro>_<version>_amd64.deb
    sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/<distro>/x86_64/3bf863cc.pub
    sudo apt update
    sudo apt install -y cuda
    
  3. Adding CUDA to the PATH Variable:

    echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
    echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
    source ~/.bashrc
    

Installing cuDNN

  1. Downloading cuDNN:

    • Register on the NVIDIA cuDNN page.
    • Download the appropriate version for your system.
  2. Installing cuDNN:

    sudo dpkg -i cudnn-local-repo-<distro>_<version>_amd64.deb
    sudo apt update
    sudo apt install -y libcudnn8
    

4. Configuring the Development Environment

Installing Python and Libraries

  1. Installing Python:

    sudo apt update
    sudo apt install -y python3 python3-pip python3-venv
    
  2. Creating a Virtual Environment:

    python3 -m venv llm-env
    source llm-env/bin/activate
    
  3. Installing Libraries:

    pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
    pip install transformers datasets accelerate
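
Once these are installed, you can confirm from Python that PyTorch actually sees the GPU, the CUDA runtime, and cuDNN:

    import torch

    print(torch.cuda.is_available())        # True if the driver and CUDA runtime are visible
    print(torch.cuda.get_device_name(0))    # GPU model name
    print(torch.version.cuda)               # CUDA version the wheel was built against
    print(torch.backends.cudnn.version())   # cuDNN version, if available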
    

5. Configuring the LLM Model

Example of Model Configuration with the Transformers Library

from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading the model and tokenizer
model_name = "bigscience/bloom-560m"  # small BLOOM variant; the full bigscience/bloom needs hundreds of GB of VRAM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Preparing the input
input_text = "How to configure GPU for working with LLM models?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generating the response
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
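
If a model does not fit in VRAM, a common option is loading it in 8-bit through the bitsandbytes integration in Transformers. This sketch assumes pip install bitsandbytes has been run:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the weights in 8-bit, roughly halving VRAM usage versus fp16
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model_8bit = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",  # same model as above
    device_map="auto",
    quantization_config=quant_config,
)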

6. Optimizing Performance

Using the Accelerate Library

The Accelerate library allows for easy scaling of models across multiple GPUs.

from accelerate import Accelerator
import torch

accelerator = Accelerator()
# `model` as loaded in section 5; the AdamW optimizer here is just an example
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model, optimizer = accelerator.prepare(model, optimizer)
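
A minimal, hypothetical training step then routes the backward pass through Accelerate (assuming dataloader yields dicts with input_ids and labels and has also been passed through accelerator.prepare):

for batch in dataloader:
    outputs = model(**batch)
    loss = outputs.loss
    accelerator.backward(loss)  # use this instead of loss.backward()
    optimizer.step()
    optimizer.zero_grad()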

Using DeepSpeed

DeepSpeed is a library for training and running large models efficiently, for example by partitioning optimizer state, gradients, and parameters across GPUs (ZeRO). A training script can be launched across several GPUs with:

deepspeed --num_gpus=4 train.py
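
Here train.py stands for your own training script. A minimal sketch of what its DeepSpeed setup might look like (the configuration values are illustrative assumptions, not recommendations):

import deepspeed

# `model` as in section 5; DeepSpeed accepts the config as a plain dict
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # ZeRO stage 2: shard optimizer state and gradients
}

model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)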

Summary

Configuring a GPU for working with LLMs comes down to a handful of steps: choosing the right hardware, installing drivers, setting up CUDA/ROCm and the software stack, and optimizing the environment. After following the steps in this article, you should be able to prepare your GPU for efficient work with large language models.
