Fine-Tuning BERT Efficiently with LoRA for Sentiment Analysis (SST-2)

When working with large pre-trained language models like BERT, full fine-tuning can be memory-hungry, slow, and expensive. Parameter-Efficient Fine-Tuning (PEFT) offers a lighter alternative. In this project, I used one of its most popular methods, LoRA (Low-Rank Adaptation), to fine-tune bert-base-uncased for binary sentiment classification on the SST-2 dataset.
Why Use PEFT?
PEFT methods like LoRA enable:
Reduced memory usage: only a small number of parameters are trained, so gradients and optimizer state stay small
Faster fine-tuning: well suited to limited compute environments
Preservation of the base model: its weights stay frozen, so one base model can be reused across tasks
Instead of updating all of BERT's parameters, LoRA freezes them and adds small trainable low-rank matrices alongside selected weight matrices, typically the attention projections. The result is far less training overhead, often with performance close to full fine-tuning.
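To make the low-rank idea concrete, here is a minimal NumPy sketch of a single LoRA-adapted linear layer. It is an illustration, not the peft implementation: the 768×768 shape matches the hidden size of bert-base-uncased, r=8 and alpha=32 mirror the configuration used later in this post, and the variable names, the small random init for A, and the lora_forward helper are all assumptions made for this example.

```python
import numpy as np

d_model, r, alpha = 768, 8, 32

rng = np.random.default_rng(0)
W = rng.normal(size=(d_model, d_model))        # frozen pre-trained weight
A = rng.normal(scale=0.01, size=(r, d_model))  # trainable, small random init (assumed)
B = np.zeros((d_model, r))                     # trainable, zero init so the update starts at 0

def lora_forward(x):
    """Frozen projection plus the scaled low-rank update (alpha / r) * B @ A."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(4, d_model))
full_params = W.size            # parameters in the frozen matrix
lora_params = A.size + B.size   # parameters LoRA actually trains for this matrix
print(full_params, lora_params)  # 589824 12288
```

Because B starts at zero, the adapted layer initially produces exactly the same output as the frozen one, and training only has to learn the two thin matrices.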
Task: Sentiment Classification on SST-2
The Stanford Sentiment Treebank v2 (SST-2) is a standard binary classification task from the GLUE benchmark. Given a sentence from a movie review, the model must predict whether the sentiment is positive or negative.
Implementation Steps
Libraries used:
transformers
datasets
peft
evaluate
torch
1. Tokenization & Dataset Loading
from datasets import load_dataset
from transformers import AutoTokenizer
dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
def preprocess(example):
    return tokenizer(example["sentence"], truncation=True, padding="max_length", max_length=128)
tokenized = dataset.map(preprocess, batched=True)
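The tokenizer call above pads or truncates every sentence to exactly 128 tokens. The sketch below shows, in plain Python, what that means for input_ids and attention_mask. The pad_or_truncate helper is hypothetical and simplified: the real tokenizer also keeps the closing [SEP] token when it truncates, which this version does not. The ids 101 and 102 are BERT's [CLS] and [SEP], 0 is its padding id, and the two ids in the middle are placeholders.

```python
def pad_or_truncate(token_ids, max_length=128, pad_id=0):
    """Return fixed-length input_ids and the matching attention_mask."""
    ids = list(token_ids)[:max_length]   # cut off anything past max_length
    mask = [1] * len(ids)                # 1 marks real tokens
    n_pad = max_length - len(ids)        # how many padding slots remain
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": mask + [0] * n_pad,  # 0 marks padding to be ignored
    }

example = pad_or_truncate([101, 2023, 3185, 102], max_length=8)
print(example["input_ids"])       # [101, 2023, 3185, 102, 0, 0, 0, 0]
print(example["attention_mask"])  # [1, 1, 1, 1, 0, 0, 0, 0]
```

Padding everything to a fixed length is simple but wastes compute on short sentences. Padding each batch to its longest member is a common alternative.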
2. Apply LoRA to BERT
from peft import get_peft_model, LoraConfig, TaskType
from transformers import AutoModelForSequenceClassification
base_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
peft_config = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=32, lora_dropout=0.1)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()
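As a rough cross-check on what print_trainable_parameters() reports, the trainable parameter count can be estimated by hand. The sketch below assumes that peft's default LoRA targets for BERT are the query and value projections in each of the 12 encoder layers, and that the classification head (a 768-to-2 linear layer) is also trained for sequence classification. Both are assumptions about library defaults, so treat the figure as an estimate, not the exact printed output.

```python
hidden, layers, r, num_labels = 768, 12, 8, 2
targets_per_layer = 2  # assumed: the query and value projections

lora_per_matrix = r * hidden + hidden * r            # A (r x hidden) plus B (hidden x r)
lora_total = layers * targets_per_layer * lora_per_matrix
head = hidden * num_labels + num_labels              # classifier weights plus bias
estimated_trainable = lora_total + head
print(lora_total, head, estimated_trainable)  # 294912 1538 296450
```

Against the roughly 110 million parameters of bert-base-uncased, that estimate is well under 1% of the model.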
3. Training with Hugging Face Trainer
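from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(
    output_dir="output",
    eval_strategy="epoch",  # evaluate at the end of every epoch
    save_strategy="epoch",  # must match eval_strategy when load_best_model_at_end=True
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    load_best_model_at_end=True,  # "best" means lowest validation loss by default
    fp16=True,  # mixed precision; requires a GPU
)
trainer = Trainer(
    model=model,
    args=training_args,
    # Small subsets keep the run short: 2,000 training and 500 validation examples
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["validation"].select(range(500)),
    tokenizer=tokenizer,  # newer transformers releases call this argument processing_class
    compute_metrics=lambda eval_pred: {
        # eval_pred[0] holds the logits, eval_pred[1] the true labels
        "accuracy": (eval_pred[0].argmax(axis=-1) == eval_pred[1]).mean()
    },
)
trainer.train()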
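The inline lambda passed as compute_metrics can also be written as a named function, which is easier to read and to test on its own. This is a sketch under the same assumption as the lambda: the Trainer passes a (logits, labels) pair of NumPy arrays. The name compute_accuracy is made up for this example.

```python
import numpy as np

def compute_accuracy(eval_pred):
    """Turn (logits, labels) into the metrics dict the Trainer expects."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # predicted class = index of the largest logit
    return {"accuracy": float((preds == labels).mean())}

# Tiny hand-made check: two of the three predictions match the labels.
fake_logits = np.array([[0.1, 0.9], [2.0, -1.0], [0.3, 0.2]])
fake_labels = np.array([1, 0, 1])
metrics = compute_accuracy((fake_logits, fake_labels))
print(round(metrics["accuracy"], 3))  # 0.667
```

It would be passed to the Trainer as compute_metrics=compute_accuracy in place of the lambda.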
Results
After just 3 epochs on a small subset of the SST-2 dataset (2k train, 500 val):
| Epoch | Train Loss | Val Loss | Val Accuracy |
| 1 | 0.6689 | 0.6788 | 0.530 |
| 2 | 0.5914 | 0.5938 | 0.654 |
| 3 | 0.5376 | 0.5145 | 0.770 |
Validation accuracy rose from 53% after the first epoch to 77% after the third, with LoRA training only a small fraction of the model's parameters.
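Since the validation subset has only 500 examples, it helps to see how much uncertainty sits behind the final accuracy. The sketch below is a back-of-the-envelope calculation using the normal approximation to a binomial proportion. It is an added illustration, not part of the original experiment.

```python
import math

n_eval = 500      # validation examples used in this post
accuracy = 0.770  # epoch 3 accuracy from the table above

correct = round(accuracy * n_eval)                       # number of correct predictions
std_err = math.sqrt(accuracy * (1 - accuracy) / n_eval)  # binomial standard error
margin_95 = 1.96 * std_err                               # half-width of an approximate 95% interval
print(correct, round(std_err, 4), round(margin_95, 3))   # 385 0.0188 0.037
```

In other words, 77% here is best read as roughly 77% plus or minus 4 points. Evaluating on the full validation split would tighten that range.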
Inference Example
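After training, the model can be uploaded to the Hugging Face Hub. The snippet below loads it from a local path and assumes the best checkpoint was saved to output/best_model (for example with trainer.save_model).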
from transformers import pipeline
pipe = pipeline("text-classification", model="output/best_model", tokenizer="output/best_model")
pipe("The movie was absolutely wonderful!")
#Output → [{'label': 'POSITIVE', 'score': 0.98}]
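To show where a label and score like these come from, here is a small NumPy sketch of the last step of a text-classification pipeline: a softmax over the two logits, then a lookup of the winning class name. The logits are invented for illustration, and the NEGATIVE/POSITIVE names are an assumption about the model's id2label mapping. SST-2 uses 0 for negative and 1 for positive, but a config without an id2label mapping would report generic names such as LABEL_1 instead.

```python
import numpy as np

id2label = {0: "NEGATIVE", 1: "POSITIVE"}  # assumed label names

def logits_to_prediction(logits):
    """Softmax the logits and return the top label with its probability."""
    logits = np.asarray(logits, dtype=float)
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    probs = exp / exp.sum()
    best = int(probs.argmax())
    return {"label": id2label[best], "score": float(probs[best])}

pred = logits_to_prediction([-2.0, 2.0])  # made-up logits
print(pred["label"], round(pred["score"], 3))  # POSITIVE 0.982
```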
Key Takeaways
With the Hugging Face peft and transformers libraries, LoRA fine-tuning takes only a few lines of code.
You can fine-tune large models on consumer-grade GPUs or free platforms like Kaggle.
SST-2 is a great benchmark to test your fine-tuning workflows quickly.
Written by

Sudhin Karki
I am a machine learning enthusiast motivated to build ML-integrated apps. I am currently exploring groundbreaking ML/DL papers and trying to understand how they are shaping the future.