How to Run YOLOv5 Inference From Golang with Python API

Marek Skopowski

There are multiple ways to run YOLO (You Only Look Once) inference from Golang:

  • Call a YOLO model via a Python REST API

  • Communicate over gRPC with a Python service that runs a YOLO model

  • Use onnxruntime_go to run the YOLO model natively in Go

Today we’re going to focus on the approach that is quickest to get running: calling a YOLO model via a Python REST API.

Architecture

We’re going to write a simple Golang application that sends a given image to the Python REST API and prints the model’s inference results to the terminal.

Golang ⇄ HTTP ⇄ Python (YOLOv5 inference)

With this approach, we’re getting:

  • Minimal setup

  • Decent performance, since the Python service does the heavy lifting (YOLO inference)

  • Easy to containerize (two separate services: Go + Python)

Project structure:

yolo-in-go-with-python/
├── go-backend/
│   ├── go.mod
│   └── main.go
├── yolo-api/
│   ├── detect.py
│   └── requirements.txt
└── example.jpg

Looks simple, right? It is!

Python Inference Server (YOLOv5)

A minimal FastAPI server (yolo-api/detect.py):

from fastapi import FastAPI, File, UploadFile
from fastapi.responses import JSONResponse
import torch
from PIL import Image
import io

# Initialize the FastAPI application
app = FastAPI()
# Load the pretrained YOLOv5s model from the Ultralytics repository
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

# Define the endpoint to handle object detection requests
@app.post("/detect")
async def detect(file: UploadFile = File(...)):
    # Read the uploaded image file as bytes
    image_bytes = await file.read()
    # Convert the byte data to a PIL Image
    image = Image.open(io.BytesIO(image_bytes))
    # Run the image through the YOLO model
    results = model(image)
    # Convert the detection results to a JSON response
    return JSONResponse(results.pandas().xyxy[0].to_dict(orient="records"))
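Each record in that response describes one detection as an xyxy box: (xmin, ymin) is the top-left corner and (xmax, ymax) is the bottom-right corner, in pixels of the uploaded image. As a rough sketch of how a consumer might work with such a record, the helper below derives the box width, height and center. Note that box_geometry is a hypothetical name used for illustration only; it is not part of YOLOv5 or of the server code above.

```python
def box_geometry(record: dict) -> dict:
    """Derive width, height and center point from one xyxy detection record."""
    width = record["xmax"] - record["xmin"]
    height = record["ymax"] - record["ymin"]
    return {
        "width": width,
        "height": height,
        "center_x": record["xmin"] + width / 2,
        "center_y": record["ymin"] + height / 2,
    }


# Illustrative record with round numbers (not real model output)
sample = {"xmin": 10.0, "ymin": 20.0, "xmax": 110.0, "ymax": 70.0,
          "confidence": 0.9, "class": 41, "name": "cup"}
print(box_geometry(sample))
# prints: {'width': 100.0, 'height': 50.0, 'center_x': 60.0, 'center_y': 45.0}
```

The same arithmetic applies on the Go side once the JSON is decoded; it is shown in Python here only to stay next to the server code.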

Requirements:

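The base server requirements are:

torch
fastapi
uvicorn
pillow
python-multipart

FastAPI needs python-multipart to parse file uploads, and the YOLOv5 hub code pulls in further dependencies of its own (pandas, for example, for results.pandas()). The requirements.txt in the project contains the full frozen list of dependencies I used for this tutorial.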

I strongly recommend using a virtual environment (venv, https://docs.python.org/3/library/venv.html) rather than installing the requirements into your system-wide Python.

  1. Install deps: pip install -r requirements.txt

  2. Start a server: uvicorn detect:app --host 0.0.0.0 --port 8000

You should see something like this:

uvicorn detect:app --host 0.0.0.0 --port 8000
Using cache found in /.../.cache/torch/hub/ultralytics_yolov5_master
/.../.cache/torch/hub/ultralytics_yolov5_master/utils/general.py:32: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  import pkg_resources as pkg
YOLOv5 🚀 2025-6-29 Python-3.11.6 torch-2.2.2 CPU

Fusing layers... 
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 
INFO:     Started server process [28206]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

You can ignore the NNPACK warning: it only means that an optional optimization couldn’t be applied on this hardware. Everything still works correctly!
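Before wiring up the Go client, it can help to check the endpoint on its own. The sketch below uses only the Python standard library to build the same kind of multipart/form-data request that a client has to send. The helper names build_multipart and post_image are hypothetical; the URL and the form field name "file" are taken from the server code above.

```python
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Return (body, content_type) for a single-file multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    # The closing boundary (with trailing "--") marks the end of the form
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_image(url: str, image_path: str) -> bytes:
    """POST the image at image_path to url and return the raw response body."""
    with open(image_path, "rb") as f:
        body, content_type = build_multipart("file", os.path.basename(image_path), f.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


# Usage, assuming the server is running and you are inside yolo-api/:
# print(post_image("http://localhost:8000/detect", "../example.jpg").decode())
```

If this returns a JSON list, the server side is working and any later problem is in the client.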

Go Client to Call YOLOv5

The Go client (go-backend/main.go):

package main

import (
    "bytes"
    "fmt"
    "io"
    "log"
    "mime/multipart"
    "net/http"
    "os"
    "path/filepath"
)

const (
    filePath   = "../example.jpg"
    yoloAPIURL = "http://localhost:8000/detect"
)

// main is the entry point for the application. It prepares the image,
// sends it to the YOLO API, and prints the result.
func main() {
    // Prepare the image file as a multipart form
    body, contentType, err := prepareMultipartForm(filePath)
    if err != nil {
        log.Fatal("Error preparing multipart form: ", err)
    }

    // Send the HTTP POST request to the YOLO API
    respBytes, err := sendYOLORequest(yoloAPIURL, body, contentType)
    if err != nil {
        log.Fatal("Error sending YOLO request: ", err)
    }

    // Print the detection results
    fmt.Println(string(respBytes))
}

// prepareMultipartForm creates a multipart/form-data body from the given file path.
// It returns the form body, content type, and any error encountered.
func prepareMultipartForm(filePath string) (*bytes.Buffer, string, error) {
    body := &bytes.Buffer{}
    writer := multipart.NewWriter(body)

    // Open the file
    file, err := os.Open(filePath)
    if err != nil {
        return nil, "", fmt.Errorf("failed to open file: %w", err)
    }
    defer func() {
        if e := file.Close(); e != nil {
            log.Println("Failed to close file", e)
        }
    }()

    // Create a new form file field
    part, err := writer.CreateFormFile("file", filepath.Base(filePath))
    if err != nil {
        return nil, "", fmt.Errorf("failed to create form file: %w", err)
    }

    // Copy the image data into the form
    _, err = io.Copy(part, file)
    if err != nil {
        return nil, "", fmt.Errorf("failed to copy file: %w", err)
    }

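    // Close the multipart writer so the terminating boundary is written;
    // without it the server cannot parse the body
    if err = writer.Close(); err != nil {
        return nil, "", fmt.Errorf("failed to close multipart writer: %w", err)
    }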

    return body, writer.FormDataContentType(), nil
}

// sendYOLORequest sends the image as a multipart POST request to the specified YOLO API.
// It returns the response body or an error.
func sendYOLORequest(apiURL string, body *bytes.Buffer, contentType string) ([]byte, error) {
    // Create a new HTTP POST request with the multipart data
    req, err := http.NewRequest(http.MethodPost, apiURL, body)
    if err != nil {
        return nil, fmt.Errorf("failed to create request: %w", err)
    }
    req.Header.Set("Content-Type", contentType)

    // Send the request and get the response
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, fmt.Errorf("failed to execute request: %w", err)
    }
    defer func() {
        if e := resp.Body.Close(); e != nil {
            log.Println("Failed to close body", e)
        }
    }()

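    // Read the response body
    respBytes, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, fmt.Errorf("failed to read response body: %w", err)
    }

    // Treat non-200 responses (e.g. 422 for a malformed upload) as errors
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("unexpected status %s: %s", resp.Status, respBytes)
    }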

    return respBytes, nil
}

Run the Inference

  1. Make sure the Python REST API is up and running. If it isn’t, start it:
cd yolo-api && uvicorn detect:app --host 0.0.0.0 --port 8000
  2. Run the Go app:
cd go-backend && go run main.go

You should see output like this:

go run main.go
[{"xmin":451.77557373046875,"ymin":256.8055114746094,"xmax":572.8908081054688,"ymax":355.9529724121094,"confidence":0.8660547733306885,"class":41,"name":"cup"},{"xmin":216.73318481445312,"ymin":242.79660034179688,"xmax":417.9637756347656,"ymax":352.3187561035156,"confidence":0.3558332026004791,"class":67,"name":"cell phone"},{"xmin":0.4250640869140625,"ymin":0.6914291381835938,"xmax":276.78839111328125,"ymax":174.0032958984375,"confidence":0.27563828229904175,"class":73,"name":"book"},{"xmin":211.1724090576172,"ymin":242.36141967773438,"xmax":421.87457275390625,"ymax":351.2012634277344,"confidence":0.26584678888320923,"class":63,"name":"laptop"}]

Pretty-printed with jq, it looks like this:

go run main.go | jq .
[
  {
    "xmin": 451.77557373046875,
    "ymin": 256.8055114746094,
    "xmax": 572.8908081054688,
    "ymax": 355.9529724121094,
    "confidence": 0.8660547733306885,
    "class": 41,
    "name": "cup"
  },
  {
    "xmin": 216.73318481445312,
    "ymin": 242.79660034179688,
    "xmax": 417.9637756347656,
    "ymax": 352.3187561035156,
    "confidence": 0.3558332026004791,
    "class": 67,
    "name": "cell phone"
  },
  {
    "xmin": 0.4250640869140625,
    "ymin": 0.6914291381835938,
    "xmax": 276.78839111328125,
    "ymax": 174.0032958984375,
    "confidence": 0.27563828229904175,
    "class": 73,
    "name": "book"
  },
  {
    "xmin": 211.1724090576172,
    "ymin": 242.36141967773438,
    "xmax": 421.87457275390625,
    "ymax": 351.2012634277344,
    "confidence": 0.26584678888320923,
    "class": 63,
    "name": "laptop"
  }
]

As you can see, this mostly matches what is in our image.

There’s no “laptop” in it, but that detection’s confidence score is low (about 0.27), so we shouldn’t be surprised. Also, the “cell phone” is probably a tablet.
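Low-confidence hits like these are usually dropped with a threshold. Below is a minimal sketch of such a filter, assuming it runs on the list of records before it is returned (or on the decoded JSON on the consumer side). The name filter_detections and the 0.5 default are illustrative choices, not part of the original code. The YOLOv5 hub model also documents a model.conf setting that applies a threshold inside the model itself, which may be the simpler option.

```python
import json


def filter_detections(records: list[dict], min_confidence: float = 0.5) -> list[dict]:
    """Keep only detections whose confidence is at least min_confidence."""
    return [r for r in records if r["confidence"] >= min_confidence]


# Trimmed copy of the response shown above (box coordinates omitted for brevity)
raw = """[
  {"confidence": 0.8660547733306885, "class": 41, "name": "cup"},
  {"confidence": 0.3558332026004791, "class": 67, "name": "cell phone"},
  {"confidence": 0.27563828229904175, "class": 73, "name": "book"},
  {"confidence": 0.26584678888320923, "class": 63, "name": "laptop"}
]"""
detections = json.loads(raw)

for det in filter_detections(detections, min_confidence=0.5):
    print(det["name"], round(det["confidence"], 2))
# prints: cup 0.87
```

With a 0.5 threshold only the cup survives; the right value depends on how costly false positives and missed objects are for your use case.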

Meanwhile, the terminal running the Python REST API logs each incoming request:

INFO:     127.0.0.1:51144 - "POST /detect HTTP/1.1" 200 OK
INFO:     127.0.0.1:51146 - "POST /detect HTTP/1.1" 200 OK
INFO:     127.0.0.1:51147 - "POST /detect HTTP/1.1" 200 OK
INFO:     127.0.0.1:51150 - "POST /detect HTTP/1.1" 200 OK

Have fun with detections!
