Training a PyTorch CNN on CIFAR-10 Using a DataCrunch GPU Instance


If you've ever wanted fast, affordable access to powerful GPUs without managing complex infrastructure, DataCrunch might be what you’re looking for.
In this blog, I’ll walk you through how I used a Tesla V100 GPU on DataCrunch to train a convolutional neural network (CNN) on the CIFAR-10 dataset using PyTorch.
I’ll cover:
How to launch a GPU instance on DataCrunch
How to set up the Python environment and PyTorch
How to train and evaluate a CNN on CIFAR-10
What GPU usage looks like during training
Whether you're a student, hobbyist, or someone curious about deep learning on the cloud, this guide aims to show how simple it is to get started on a clean, GPU-powered machine.
Why Use DataCrunch?
There are a lot of cloud platforms offering GPUs, but DataCrunch has a few distinct advantages, especially for machine learning beginners.
Simple and Fast Access
You can launch GPU instances immediately without needing approvals or quota requests. Once you're signed in, you just need to top up your account and you can deploy any available instance.
Dedicated GPU Performance
DataCrunch gives you dedicated access to GPUs, so you’re not sharing compute time with others. They also offer spot instances, which are significantly lower-cost machines that can be interrupted or stopped at any time.
Transparent and Flexible Pricing
DataCrunch employs a pay-as-you-go model, charging in 10-minute intervals with no hidden fees, which provides granular billing and cost control. You can select between:
Fixed Pricing: A stable hourly rate for consistent budgeting.
Dynamic Pricing: Rates that adjust daily based on market demand, potentially lowering costs during off-peak times.
This flexibility allows you to align your compute expenses with your project's needs and budget.
Built for Machine Learning Workloads
The platform is designed with ML use cases in mind. You can start with pre-configured environments like JupyterLab, or launch a clean OS image and set everything up yourself.
If you're looking to experiment, train models, or build ML workflows with minimal setup time, DataCrunch makes it easy to get started.
Setting Up the Instance
After signing in to your DataCrunch account, the first step is to create a project. Projects help organize your compute resources and manage billing effectively.
Once your project is set up, navigate to the Instances tab and click on Deploy Instance to configure your machine.
For this walkthrough, I selected the following configuration:
GPU: Tesla V100
Storage: 50 GB
Pricing: Fixed
Operating System: Ubuntu 24.04 (JupyterLab)
Once you've configured your instance, just click “Deploy now” and it should be ready within a minute. You can access it directly through your browser using JupyterLab or connect via SSH from your terminal.
Installing PyTorch and Dependencies
Once your instance is running, open the JupyterLab interface or connect via SSH, depending on what you selected during setup. For this guide, I used JupyterLab.
To install PyTorch, torchvision, and tqdm, open a terminal and run:
pip install torch torchvision tqdm
That’s all you need to get started with model training.
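Before kicking off training, it's worth confirming that PyTorch can actually see the GPU. A minimal check like the one below, run in a notebook cell or the Python REPL, should report the Tesla V100 (the exact device name string may differ on your instance):

import torch

# Confirm that CUDA is available and which GPU PyTorch will use
print(torch.__version__)                  # installed PyTorch version
print(torch.cuda.is_available())          # should print True on the GPU instance
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. a Tesla V100 on this setup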
Training the CNN on CIFAR-10
The CIFAR-10 dataset is a standard benchmark for image classification. It consists of 60,000 color images (32x32 pixels) across 10 categories such as airplanes, cats, and trucks. There are 50,000 training images and 10,000 test images.
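If you want to poke at the data before training, a quick inspection like the sketch below downloads the training split and prints its size, class names, and the shape of a single image (channels-first, 3x32x32, after ToTensor):

import torchvision
import torchvision.transforms as transforms

# Download CIFAR-10 and inspect its basic structure
dataset = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True,
    transform=transforms.ToTensor())

print(len(dataset))        # 50000 training images
print(dataset.classes)     # ['airplane', 'automobile', 'bird', ...]
image, label = dataset[0]
print(image.shape, label)  # torch.Size([3, 32, 32]) and an integer class index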
For this example, I use stronger data augmentation and train a deeper convolutional neural network with batch normalization and dropout in PyTorch. The code below handles data loading with random crops and flips, model definition with multiple convolutional blocks, training with SGD, momentum, weight decay and a StepLR scheduler, live progress and accuracy reporting, and evaluation on the 10,000-image test set.
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.optim.lr_scheduler import StepLR
from tqdm import tqdm
import time

# Data augmentation (random crop + horizontal flip) for training; plain normalization for testing
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2470, 0.2435, 0.2616)),
])
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2470, 0.2435, 0.2616)),
])

trainset = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=128, shuffle=True, num_workers=4, pin_memory=True)
testset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(
    testset, batch_size=100, shuffle=False, num_workers=4, pin_memory=True)

# Three convolutional blocks (64 -> 128 -> 256 channels), each with batch norm,
# ReLU, and max pooling, followed by a small fully connected classifier with dropout
class BetterNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),

            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),

            nn.Conv2d(128, 256, 3, padding=1),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(256 * 4 * 4, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        return self.classifier(x)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = BetterNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(
    net.parameters(),
    lr=0.1,
    momentum=0.9,
    weight_decay=5e-4
)
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

# Training loop: 50 epochs with live loss/accuracy reporting via tqdm
start = time.time()
for epoch in range(1, 51):
    net.train()
    running_loss = 0
    correct = 0
    total = 0
    loop = tqdm(trainloader, desc=f'Epoch {epoch}', leave=False)
    for batch_idx, (inputs, targets) in enumerate(loop, start=1):
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        _, pred = outputs.max(1)
        total += targets.size(0)
        correct += pred.eq(targets).sum().item()
        # Show average loss per batch and running accuracy in the progress bar
        loop.set_postfix(loss=running_loss / batch_idx, acc=100. * correct / total)
    scheduler.step()
    if epoch % 10 == 0:
        print(f"Epoch {epoch:3d} | "
              f"Train Acc: {100. * correct / total:.2f}%")
end = time.time()
print(f"Training finished in {(end - start) / 60:.1f} min")

# Evaluation on the 10,000-image test set
net.eval()
correct = 0
total = 0
with torch.no_grad():
    for inputs, targets in testloader:
        inputs, targets = inputs.to(device), targets.to(device)
        outputs = net(inputs)
        _, pred = outputs.max(1)
        total += targets.size(0)
        correct += pred.eq(targets).sum().item()
print(f"Test Accuracy: {100. * correct / total:.2f}%")
Performance and Observations
Training the BetterNet CNN on a Tesla V100 took approximately 6.4 minutes for 50 epochs with a batch size of 128. The model achieved a training accuracy of 94.54% by the final epoch.
After training I evaluated the model on the CIFAR-10 test set. The final test accuracy was 90.49%, which reflects the impact of stronger data augmentation and a deeper network.
During training the GPU remained fully responsive and stable. I ran the entire session in JupyterLab and did not encounter any delays or interruptions while executing the notebook.
Monitoring the GPU during training showed the Tesla V100 running at about 68°C with 74% utilization and only 1.7 GB of its 16 GB of memory in use, so there was plenty of headroom for larger models or bigger batch sizes. Overall, this environment delivers consistent compute performance and easy scalability for future deep learning experiments.
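One common way to check these numbers is nvidia-smi in a terminal; if you would rather do it from inside the notebook, PyTorch exposes the memory figures directly. A minimal sketch (utilization and temperature still come from nvidia-smi):

import torch

# Report GPU memory usage for the current CUDA device from within Python
device = torch.device('cuda')
props = torch.cuda.get_device_properties(device)
print(props.name)  # GPU model, e.g. a Tesla V100 on this setup
print(f"Total memory:     {props.total_memory / 1e9:.1f} GB")
print(f"Allocated memory: {torch.cuda.memory_allocated(device) / 1e9:.1f} GB")
print(f"Reserved memory:  {torch.cuda.memory_reserved(device) / 1e9:.1f} GB")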
Final Thoughts
This walkthrough showed how easy it is to get up and running with machine learning on DataCrunch. Launching a GPU instance, setting up the environment, and training a CNN on CIFAR-10 took minimal effort, with no blockers or unnecessary complexity.
The experience felt fast and efficient throughout. I didn’t have to deal with quota requests, hidden configurations, or shared hardware performance issues. Everything just worked, which is exactly what you want when you're experimenting or iterating on models.
DataCrunch’s flexibility in pricing, straightforward interface, and reliable performance make it a strong option for anyone who needs GPU power without the overhead of managing cloud infrastructure. Whether you're a beginner learning the basics or someone running quick training jobs, it’s a platform worth exploring.
Join the community and take your projects to the next level. Sign up now on DataCrunch!