When working with large pre-trained language models like BERT, full fine-tuning can be memory-hungry, slow, and expensive. Parameter-Efficient Fine-Tuning (PEFT) offers a smarter approach — and in this project, I used one of its most popular methods, LoRA (Low-Rank Adaptation), to fine-tune bert-base-uncased for a binary sentiment classification task on the SST-2 dataset.

Why Use PEFT?

PEFT methods like LoRA enable:

Reduced memory usage — only a small number of parameters are trained
Faster fine-tuning — ideal for limited compute environments
Preservation of the base model — making it easier to reuse

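Instead of updating all of BERT's parameters, LoRA freezes the pre-trained weights and adds a pair of small trainable low-rank matrices next to selected weight matrices in the attention layers. Their product acts as a learned update to the frozen weight. The result: far fewer trainable parameters and, in many reported cases, accuracy close to full fine-tuning.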
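To make the low-rank idea concrete, here is a minimal pure-Python sketch of a single LoRA-adapted linear layer. It assumes the usual formulation, output = W x + (alpha / r) · B (A x), with B initialized to zero. The sizes and helper names are illustrative only and do not come from the peft library.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x). W is frozen; only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d, r, alpha = 16, 2, 8  # toy sizes (the article uses r=8, lora_alpha=32 on BERT)

W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]     # frozen d x d weight
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]  # trainable r x d
B = [[0.0] * r for _ in range(d)]                                   # trainable d x r, starts at zero
x = [1.0] * d

# With B = 0 the adapted layer reproduces the frozen layer exactly,
# so training starts from the pre-trained model's behavior.
assert lora_forward(W, A, B, x, alpha, r) == matvec(W, x)

full_params = d * d          # parameters updated by full fine-tuning of W
lora_params = r * d + d * r  # parameters in A and B
print(full_params, lora_params)  # prints: 256 64
```

Even at this toy size the adapter has a quarter of the parameters of the full matrix, and the gap widens as the hidden size grows relative to the rank.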

Task: Sentiment Classification on SST-2

The Stanford Sentiment Treebank v2 (SST-2) is a standard binary classification task from the GLUE benchmark. Given a sentence from a movie review, the model must predict whether the sentiment is positive or negative.

Implementation Steps

Libraries used:

transformers
datasets
peft
evaluate
torch

1. Tokenization & Dataset Loading

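from datasets import load_dataset
from transformers import AutoTokenizer

# SST-2 from the GLUE benchmark; each example includes a "sentence" and a "label"
dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def preprocess(example):
    # Truncate or pad every sentence to exactly 128 tokens
    return tokenizer(example["sentence"], truncation=True, padding="max_length", max_length=128)

# batched=True passes many examples to the tokenizer per call
tokenized = dataset.map(preprocess, batched=True)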
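For readers unfamiliar with what truncation=True and padding="max_length" do to each example, the following pure-Python sketch mimics the effect on a list of token IDs. The function name is hypothetical, and the IDs between the special tokens are made up. It assumes the bert-base-uncased conventions of 101 for [CLS], 102 for [SEP], and 0 for [PAD].

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long.

    Overlong inputs keep their final token (the [SEP] marker), mirroring how
    the tokenizer preserves special tokens when it truncates.
    """
    if len(token_ids) > max_length:
        token_ids = token_ids[:max_length - 1] + [token_ids[-1]]
    n_pad = max_length - len(token_ids)
    input_ids = token_ids + [pad_id] * n_pad
    attention_mask = [1] * len(token_ids) + [0] * n_pad  # 0 marks padding
    return input_ids, attention_mask

short = [101, 7, 8, 102]               # [CLS] ... [SEP], made-up word IDs
long = [101, 1, 2, 3, 4, 5, 102]

print(pad_or_truncate(short, 6))  # ([101, 7, 8, 102, 0, 0], [1, 1, 1, 1, 0, 0])
print(pad_or_truncate(long, 4))   # ([101, 1, 2, 102], [1, 1, 1, 1])
```

The attention mask is what lets the model ignore the padded positions, which is why the tokenizer returns it alongside the input IDs.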

2. Apply LoRA to BERT

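from peft import get_peft_model, LoraConfig, TaskType
from transformers import AutoModelForSequenceClassification

# BERT with a freshly initialized 2-class classification head
base_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
# r: rank of the LoRA matrices; lora_alpha: scaling factor; lora_dropout: dropout on the LoRA branch
peft_config = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=32, lora_dropout=0.1)
# Wrap the model: the base weights are frozen and LoRA adapters are injected
model = get_peft_model(base_model, peft_config)

# Report how many parameters are trainable versus the total
model.print_trainable_parameters()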
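As a rough sanity check on what print_trainable_parameters() should report, the count can be estimated by hand. This back-of-the-envelope sketch rests on several assumptions: LoRA is attached to the query and value projections in each of BERT-base's 12 layers, the hidden size is 768, the new classification head is also trained, and the full model has roughly 110 million parameters. The actual figure depends on the peft version and its defaults.

```python
hidden = 768            # BERT-base hidden size
layers = 12             # BERT-base encoder layers
r = 8                   # LoRA rank from the config above
num_labels = 2
targets_per_layer = 2   # assumption: query and value projections get adapters

lora_per_matrix = r * hidden + hidden * r       # A is r x hidden, B is hidden x r
lora_total = layers * targets_per_layer * lora_per_matrix
classifier = hidden * num_labels + num_labels   # weights and bias of the new head
trainable = lora_total + classifier

approx_total = 110_000_000  # assumption: approximate size of bert-base-uncased
print(trainable)                                  # prints: 296450
print(f"{100 * trainable / approx_total:.2f}%")   # prints: 0.27%
```

Under these assumptions, well under 1% of the model's parameters receive gradient updates.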

3. Training with Hugging Face Trainer

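from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="output",
    eval_strategy="epoch",         # evaluate at the end of every epoch
    save_strategy="epoch",         # checkpoint on the same schedule, as load_best_model_at_end requires
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    load_best_model_at_end=True,   # reload the best checkpoint (lowest validation loss by default)
    fp16=True,                     # mixed precision; requires a CUDA GPU
)

trainer = Trainer(
    model=model,
    args=training_args,
    # Small subsets keep the run short: 2,000 training and 500 validation examples
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["validation"].select(range(500)),
    tokenizer=tokenizer,  # newer transformers releases prefer processing_class=tokenizer
    # eval_pred[0] holds the logits, eval_pred[1] the true labels
    compute_metrics=lambda eval_pred: {
        "accuracy": (eval_pred[0].argmax(axis=-1) == eval_pred[1]).mean()
    }
)

trainer.train()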
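The compute_metrics lambda above is compact, so here is the same calculation spelled out in plain Python on made-up logits. The function names and the sample values are illustrative only. The real callback receives NumPy arrays rather than lists.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class matches the true label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

# Four made-up examples: one row of [negative_logit, positive_logit] per sentence
logits = [[2.0, -1.0], [0.1, 0.3], [-0.5, 0.4], [1.2, 0.9]]
labels = [0, 1, 0, 0]

print(accuracy(logits, labels))  # prints: 0.75
```

The third example is predicted as class 1 but labeled 0, which is the single miss that brings the score to 3 out of 4.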

Results

After just 3 epochs on a small subset of the SST-2 dataset (2k train, 500 val):

Epoch	Train Loss	Val Loss	Accuracy
1	0.6689	0.6788	0.530
2	0.5914	0.5938	0.654
3	0.5376	0.5145	0.770

Validation accuracy rose from 53.0% after the first epoch to 77.0% after the third, while LoRA trained only a small fraction of the model's parameters.
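With only 500 validation examples, the final accuracy carries noticeable sampling noise. As a rough guide, and assuming a simple normal approximation to the binomial (one common choice, not something the training run itself reports), the uncertainty can be estimated like this:

```python
import math

def accuracy_interval(accuracy, n, z=1.96):
    """Approximate 95% interval for an accuracy measured on n examples."""
    standard_error = math.sqrt(accuracy * (1 - accuracy) / n)
    return accuracy - z * standard_error, accuracy + z * standard_error

low, high = accuracy_interval(0.770, 500)  # epoch 3 accuracy on 500 examples
print(f"{low:.3f} to {high:.3f}")  # prints: 0.733 to 0.807
```

So the epoch 3 figure is best read as "somewhere in the mid-to-high 70s" rather than as a precise 77.0%.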

Inference Example

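Once the fine-tuned model has been saved locally or uploaded to the Hugging Face Hub, it can be loaded into a pipeline. The path below assumes the model was saved to output/best_model; the training code above does not create that directory on its own.

from transformers import pipeline

# Assumes the fine-tuned model and tokenizer are available in output/best_model
pipe = pipeline("text-classification", model="output/best_model", tokenizer="output/best_model")
pipe("The movie was absolutely wonderful!")
# Output → [{'label': 'POSITIVE', 'score': 0.98}]
# The label reads 'POSITIVE' only if id2label is set in the model config; by default it is 'LABEL_1'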
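To show where the label and score in the pipeline output come from, here is a small pure-Python sketch of the final step: a softmax over the two logits followed by picking the most probable class. The logits are made up, the function name is hypothetical, and the label mapping assumes the SST-2 convention of 0 for negative and 1 for positive.

```python
import math

# Assumption: SST-2 label IDs, 0 = negative and 1 = positive
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}

def classify(logits, id2label=ID2LABEL):
    """Softmax the logits and return the top class in pipeline-style format."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return {"label": id2label[best], "score": probs[best]}

result = classify([-1.9, 2.0])  # made-up logits for a clearly positive sentence
print(result["label"], round(result["score"], 2))  # prints: POSITIVE 0.98
```

A gap of about 3.9 between the two logits is enough to produce a score near 0.98, so high confidence values reflect the logit margin, not calibrated certainty.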

Key Takeaways

LoRA + Hugging Face makes PEFT extremely easy and scalable.
You can fine-tune large models on consumer-grade GPUs or free platforms like Kaggle.
SST-2 is a great benchmark to test your fine-tuning workflows quickly.

Fine-Tuning BERT Efficiently with LoRA for Sentiment Analysis (SST-2)