
Computer Vision at the Edge: Deploying YOLO Models on Cheap Hardware

Deploy YOLOv8 and YOLOv11 object detection models on Raspberry Pi, Jetson Nano, and budget hardware. Complete guide with optimization and benchmarks.

Yash Pritwani
14 min read

Why Edge Computer Vision?

Sending every camera frame to the cloud for processing means latency, bandwidth costs, and privacy concerns. Edge inference — running the model on the camera device itself — solves all three. A Raspberry Pi 5 running YOLOv8n processes 15-25 frames per second. That is fast enough for security cameras, retail analytics, quality inspection, and robotics.


Hardware Options and Their Capabilities

| Device | Price | FPS (YOLOv8n) | Power | Best For |
|--------|-------|---------------|-------|----------|
| Raspberry Pi 5 (8GB) | $80 | 15-25 | 5W | Prototyping, light duty |
| NVIDIA Jetson Nano | $150 | 30-45 | 10W | Production edge AI |
| Jetson Orin Nano | $250 | 60-80 | 15W | Multi-camera, real-time |
| Orange Pi 5 (RK3588) | $90 | 20-35 (NPU) | 8W | Budget production |
| Old laptop + GTX 1650 | $200 used | 80-120 | 60W | High throughput |

At TechSaaS, we have deployed edge vision systems on everything from Raspberry Pis to refurbished laptops with discrete GPUs. The GTX 1650 in our Proxmox server handles real-time inference beautifully.

Setting Up YOLOv8 on a Raspberry Pi 5

# Install system dependencies
sudo apt update && sudo apt install -y python3-pip libopencv-dev

# Install ultralytics (YOLO library)
pip3 install ultralytics opencv-python-headless

# Test with a sample image
yolo detect predict model=yolov8n.pt source=bus.jpg

For real-time camera inference:

from ultralytics import YOLO
import cv2

model = YOLO("yolov8n.pt")  # Nano model, fastest

cap = cv2.VideoCapture(0)  # USB camera or CSI camera
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    results = model(frame, verbose=False, conf=0.5)

    # Draw bounding boxes
    annotated = results[0].plot()

    # Process detections
    for box in results[0].boxes:
        cls = int(box.cls[0])
        conf = float(box.conf[0])
        label = model.names[cls]
        print(f"Detected: {label} ({conf:.2f})")

    cv2.imshow("YOLO Edge", annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
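To check what FPS your own board actually sustains, a small benchmark over synthetic frames (no camera required) can be wrapped around the same model object. This is a hedged sketch: `benchmark_fps` is a helper introduced here, not part of the Ultralytics API, and the commented usage line assumes the `yolov8n.pt` weights from above.

```python
import time
import numpy as np

def benchmark_fps(model, runs=50, size=(480, 640)):
    """Measure sustained inference FPS on a synthetic BGR frame."""
    frame = np.zeros((*size, 3), dtype=np.uint8)
    model(frame, verbose=False)  # warm-up: the first call pays one-time setup cost
    start = time.perf_counter()
    for _ in range(runs):
        model(frame, verbose=False)
    return runs / (time.perf_counter() - start)

# Example (downloads weights on first run):
# from ultralytics import YOLO
# print(f"{benchmark_fps(YOLO('yolov8n.pt')):.1f} FPS")
```

Run it once with the PyTorch weights and again after each optimization step below to see what each change buys you on your specific hardware.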

Model Optimization for Edge Devices

ONNX Export

Converting to ONNX typically gives a 1.5-2x speedup:

from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.export(format="onnx", imgsz=640, simplify=True, opset=12)

Then use the ONNX model:

model = YOLO("yolov8n.onnx")
results = model("image.jpg")


TensorRT for NVIDIA Devices

On Jetson or any NVIDIA GPU, TensorRT provides the best performance:

model = YOLO("yolov8n.pt")
model.export(format="engine", imgsz=640, half=True)  # FP16 TensorRT

# Use the optimized engine
model = YOLO("yolov8n.engine")
results = model("image.jpg")  # 2-3x faster than PyTorch

INT8 Quantization

For maximum speed on CPU, quantize to 8-bit integers:

model.export(format="tflite", imgsz=640, int8=True,
             data="calibration_dataset.yaml")  # INT8 export needs a calibration set

This reduces model size by about 4x and typically speeds up CPU inference 2-3x, with only a 1-2% accuracy drop.

Building a Complete Edge Vision Pipeline

Here is a production-ready pipeline that detects objects and sends alerts:

import time
import requests
from ultralytics import YOLO
import cv2

class EdgeVisionPipeline:
    def __init__(self, model_path, webhook_url, alert_classes=None):
        self.model = YOLO(model_path)
        self.webhook_url = webhook_url
        self.alert_classes = alert_classes or ["person"]
        self.last_alert = {}
        self.cooldown = 30  # seconds between alerts per class

    def should_alert(self, class_name: str) -> bool:
        now = time.time()
        last = self.last_alert.get(class_name, 0)
        if now - last > self.cooldown:
            self.last_alert[class_name] = now
            return True
        return False

    def send_alert(self, detections: list):
        payload = {
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
            "device": "edge-cam-01",
            "detections": detections
        }
        try:
            requests.post(self.webhook_url, json=payload, timeout=5)
        except requests.RequestException as exc:
            print(f"Alert delivery failed: {exc}")  # log and keep running

    def run(self, camera_id=0):
        cap = cv2.VideoCapture(camera_id)
        print(f"Starting edge vision on camera {camera_id}")

        while True:
            ret, frame = cap.read()
            if not ret:
                time.sleep(1)
                continue

            results = self.model(frame, verbose=False, conf=0.5)
            alerts = []

            for box in results[0].boxes:
                label = self.model.names[int(box.cls[0])]
                if label in self.alert_classes and self.should_alert(label):
                    alerts.append({
                        "class": label,
                        "confidence": round(float(box.conf[0]), 2)
                    })

            if alerts:
                self.send_alert(alerts)

# Usage
pipeline = EdgeVisionPipeline(
    model_path="yolov8n.onnx",
    webhook_url="https://n8n.techsaas.cloud/webhook/vision-alert",
    alert_classes=["person", "car", "dog"]
)
pipeline.run(camera_id=0)

Training Custom Models

YOLO excels at custom object detection. To detect specific items (defective products, safety gear, specific vehicles):

# Prepare dataset in YOLO format
# images/train/, images/val/
# labels/train/, labels/val/

# Train on a GPU machine (doesn't need to be the edge device)
yolo detect train data=custom_dataset.yaml model=yolov8n.pt epochs=100 imgsz=640

# Export optimized model for your edge device
yolo export model=runs/detect/train/weights/best.pt format=onnx

Label your data with tools like Label Studio (self-hostable) or Roboflow. 200-500 labeled images per class is usually sufficient for good results.
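The `custom_dataset.yaml` referenced above follows the standard Ultralytics dataset format. A minimal sketch — the paths and class names are placeholders for your own data:

```yaml
# custom_dataset.yaml — Ultralytics dataset definition (placeholder values)
path: /data/custom_dataset   # dataset root
train: images/train          # relative to path
val: images/val
names:
  0: defect
  1: helmet
```

Label files live under `labels/train/` and `labels/val/`, one `.txt` per image, with one `class x_center y_center width height` row per object (coordinates normalized to 0-1).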

Power and Thermal Management

Edge devices run 24/7. Manage power and heat:

# Raspberry Pi: monitor the SoC temperature
vcgencmd measure_temp

# Run the pipeline as a systemd service that restarts automatically on crashes
# /etc/systemd/system/vision-pipeline.service
[Unit]
Description=Edge vision pipeline

[Service]
ExecStart=/usr/bin/python3 /usr/local/bin/vision-pipeline.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Use a heatsink and fan. At 25 FPS continuous, a Pi 5 runs at 65-75°C with active cooling — below the 80°C threshold where the SoC begins to throttle.
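The pipeline can also watch thermals itself and back off before the firmware throttles it. A sketch assuming the Raspberry Pi sysfs thermal zone path; `cpu_temp_c` and `should_throttle` are helpers named here for illustration:

```python
def cpu_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the SoC temperature in Celsius from sysfs (Raspberry Pi convention)."""
    with open(path) as f:
        return int(f.read().strip()) / 1000.0

def should_throttle(limit_c=75.0):
    """True if the SoC is hot enough that the loop should skip inference frames."""
    try:
        return cpu_temp_c() >= limit_c
    except OSError:
        return False  # sysfs path absent (e.g. not running on a Pi)
```

In the capture loop, call `should_throttle()` before inference and `time.sleep()` instead of running the model when it returns True.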


Real-World Deployment Tips

1. Process every Nth frame: if you need detection rather than tracking, running inference on every 3rd frame cuts compute (and power draw) to a third.
2. Region of interest: crop the frame to analyze only the relevant areas.
3. Model selection matters: YOLOv8n (nano) vs YOLOv8s (small) is roughly a 3x speed difference for about a 5% accuracy loss on most tasks.
4. Network resilience: buffer alerts locally when WiFi drops, and send them when reconnected.
5. Remote model updates: pull new models via HTTP without redeploying the application.
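The network-resilience tip amounts to a small local queue that retries on reconnect. A hedged sketch — `AlertBuffer` is a helper introduced here, written to drop into the pipeline class's `send_alert` but self-contained on its own:

```python
import collections
import requests

class AlertBuffer:
    """Queue alerts locally and flush them when the network comes back."""

    def __init__(self, webhook_url, maxlen=1000):
        self.webhook_url = webhook_url
        self.pending = collections.deque(maxlen=maxlen)  # drops oldest when full

    def send(self, payload: dict) -> None:
        self.pending.append(payload)
        self.flush()

    def flush(self) -> None:
        while self.pending:
            try:
                requests.post(self.webhook_url, json=self.pending[0], timeout=5)
            except requests.RequestException:
                return  # still offline; keep payloads for the next attempt
            self.pending.popleft()  # drop only after a successful post
```

Call `flush()` periodically (e.g. once per loop iteration) so queued alerts go out as soon as connectivity returns.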

Edge AI is one of the fastest-growing fields in tech. At TechSaaS, we help companies deploy computer vision solutions on cost-effective hardware — no cloud dependency, no per-inference fees, no data leaving the premises.

#computer-vision #yolo #edge-computing #raspberry-pi #object-detection

Need help with AI & machine learning?

TechSaaS provides expert consulting and managed services for cloud infrastructure, DevOps, and AI/ML operations.