
Raspberry Pi AI HAT+: Run Real AI Models on a $70 Board

By Kindly Morrow · 16 min read


The Raspberry Pi AI HAT+ is an M.2 add-on board that connects a Hailo-8L neural processing unit (NPU) to your Raspberry Pi 5. It delivers 13 TOPS (tera operations per second) of dedicated AI inference, entirely on-device. No cloud APIs. No subscriptions. No data leaving your network. You plug it in, install the software stack, and start running object detection, pose estimation, or small language models locally on a $70 board sitting on your desk.

That's the short version. The rest of this guide covers exactly what it does, what you can build with it, how it compares to running AI on the Pi's CPU alone, and whether you actually need one.


What Is the Raspberry Pi AI HAT+?

The AI HAT+ is an official Raspberry Pi accessory. It connects to the Pi 5's PCIe 2.0 interface through the M.2 HAT+ form factor (the same connector used by the Raspberry Pi M.2 HAT+ for NVMe SSDs).

Inside is a Hailo-8L chip. This is a purpose-built neural network accelerator, not a general-purpose processor. It does one thing: run pre-compiled neural network graphs as fast as possible while consuming minimal power. The Hailo-8L is rated at 13 TOPS, roughly 3x the rated throughput of an Intel Neural Compute Stick 2 (about 4 TOPS) or a Google Coral USB Accelerator (4 TOPS).

Key specs:

  • NPU: Hailo-8L, 13 TOPS INT8 inference
  • Interface: M.2 key M, PCIe 2.0 x1
  • Power draw: ~2.5W under full load (from the Pi's 5V rail)
  • Compatible boards: Raspberry Pi 5 only (Pi 4 and earlier lack PCIe)
  • Price: ~$70 USD
  • Software: Hailo TAPPAS framework, GStreamer pipelines, Python API
  • Supported frameworks: TensorFlow Lite, ONNX (via Hailo Model Zoo compilation)

The "L" in Hailo-8L stands for "Lite." Hailo also makes the full Hailo-8 (26 TOPS) and the newer Hailo-10H (40 TOPS, found in the AI HAT+ 2). The 8L is the entry point, and 13 TOPS is genuinely enough for most single-camera, single-model applications.


What 13 TOPS Actually Means in Practice

TOPS is a marketing-friendly number. What matters is how fast the chip runs real models.

Here are approximate benchmarks for common tasks on the AI HAT+ (Hailo-8L, 13 TOPS) versus the Pi 5's CPU alone:

| Task | Model | AI HAT+ (Hailo-8L) | Pi 5 CPU Only | Speedup |
|---|---|---|---|---|
| Object detection | YOLOv8n (nano) | ~60 fps | ~3-5 fps | 12-20x |
| Object detection | YOLOv5s (small) | ~30 fps | ~1-2 fps | 15-30x |
| Pose estimation | MoveNet Lightning | ~45 fps | ~5-6 fps | 8-9x |
| Image classification | MobileNetV2 | ~200 fps | ~15 fps | 13x |
| Face detection | RetinaFace | ~50 fps | ~4 fps | 12x |
| Semantic segmentation | DeepLabV3 (MobileNet) | ~20 fps | ~1 fps | 20x |

Sources: Hailo Model Zoo benchmarks (hailo.ai/model-zoo), community reports from Hackster.io and the Raspberry Pi forums, Jeff Geerling's testing (March 2026). Exact numbers vary by input resolution and quantization.

The important pattern: the Hailo-8L takes tasks from "slideshow" frame rates into "usable video" territory. YOLOv8n at 60 fps means you can run real-time object detection on a live camera feed with no perceptible lag. That's the difference between a demo and a product.


What You Can Build With It

This is the section that matters. The AI HAT+ turns your Pi 5 from a general-purpose computer into a capable edge AI node. Here are six real projects, each with a code angle.

1. Person Detection Security Camera

What it does: A Pi 5 with a camera module and the AI HAT+ runs YOLOv8n to detect people (not just motion) in a video feed. It sends MQTT alerts to Home Assistant, saves snapshots to local storage, and optionally streams RTSP video to Frigate NVR.

Why the AI HAT+ matters: Motion detection catches every swaying branch and passing shadow. Person detection is specific. The Hailo-8L runs YOLOv8n at 60 fps, so there's zero detection lag. On the CPU alone, the same model runs at 3-5 fps, which means people can cross the frame without being detected.

The code angle: The GStreamer pipeline handles camera input, Hailo inference, and RTSP output in a single command. Python scripts subscribe to detection events via the Hailo TAPPAS API and publish MQTT messages. A 50-line Python script is all it takes to bridge the AI HAT+ to Home Assistant.
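To make the bridge concrete, here's a minimal sketch of the detection-to-MQTT logic. The detection dictionary shape, confidence threshold, and topic name are assumptions for illustration; the real TAPPAS callback delivers detections through its own buffer metadata API, and publishing would go through a client like paho-mqtt.

```python
import json
import time

PERSON_CLASS = "person"  # COCO class label emitted by YOLOv8n

def person_alerts(detections, min_confidence=0.6):
    """Filter raw detections down to confident person hits and build
    MQTT-ready JSON payloads (one per detection)."""
    payloads = []
    for det in detections:
        if det["label"] != PERSON_CLASS or det["confidence"] < min_confidence:
            continue
        payloads.append(json.dumps({
            "event": "person_detected",
            "confidence": round(det["confidence"], 2),
            "bbox": det["bbox"],  # [x, y, w, h] in pixels
            "timestamp": int(time.time()),
        }))
    return payloads

# Example frame: one confident person, one cat, one low-confidence person
frame = [
    {"label": "person", "confidence": 0.91, "bbox": [120, 40, 60, 180]},
    {"label": "cat",    "confidence": 0.88, "bbox": [300, 200, 40, 30]},
    {"label": "person", "confidence": 0.35, "bbox": [10, 10, 20, 50]},
]
alerts = person_alerts(frame)
# publishing would be e.g. client.publish("home/camera/driveway", alerts[0])
```

Only the confident person detection survives the filter, which is exactly why this setup beats plain motion detection: the cat and the low-confidence blob never reach Home Assistant.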

Total hardware cost: Pi 5 ($80) + AI HAT+ ($70) + Camera Module 3 ($25) + case and power supply ($20) = ~$195

2. Voice-Activated Home Assistant (Running Locally)

What it does: A local voice assistant that listens for a wake word, transcribes speech, processes commands, and responds. All on-device. No Alexa. No Google. No internet connection required after initial setup.

Why the AI HAT+ matters: Wake-word detection and speech-to-text are neural network tasks. The AI HAT+ handles the wake-word model continuously while the Pi's CPU handles the command parsing and text-to-speech response. This division of labor keeps the system responsive. Running everything on the CPU alone creates noticeable pauses between "hearing" you and "responding."

The code angle: OpenWakeWord (a Python library) handles wake-word detection, offloaded to the Hailo. Wyoming protocol integration connects to Home Assistant's voice pipeline. Whisper.cpp (the C++ port of OpenAI's Whisper) handles speech-to-text on the CPU while the NPU watches for the next wake word. The whole stack is Python and shell scripts.
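The division of labor between NPU and CPU boils down to a small state machine: idle until the wake-word model fires, then hand the audio to the transcriber, then go back to idle. A toy sketch (the score and transcript inputs are stand-ins for what OpenWakeWord and whisper.cpp would actually deliver):

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()       # NPU watching for the wake word
    LISTENING = auto()  # CPU transcribing the spoken command

class VoiceLoop:
    """Toy state machine for the wake -> transcribe -> respond cycle."""
    def __init__(self, wake_word="computer"):
        self.wake_word = wake_word
        self.state = State.IDLE
        self.transcript = None

    def on_wake_score(self, score, threshold=0.5):
        # In the real stack this score comes from OpenWakeWord on the Hailo
        if self.state is State.IDLE and score >= threshold:
            self.state = State.LISTENING

    def on_transcription(self, text):
        # In the real stack this comes from whisper.cpp on the CPU
        if self.state is State.LISTENING:
            self.transcript = text
            self.state = State.IDLE  # hand the mic back to the wake-word model
            return text
        return None

loop = VoiceLoop()
loop.on_wake_score(0.2)   # below threshold: stays idle
loop.on_wake_score(0.9)   # wake word heard: start listening
cmd = loop.on_transcription("turn off the lights")
```

The key design point is the immediate return to IDLE after transcription, so the NPU is always watching for the next wake word while the CPU finishes responding.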

Total hardware cost: Pi 5 + AI HAT+ + USB microphone + speaker ($10-20) = ~$175

3. Gesture-Controlled Desk Gadget

What it does: A small display (3.5" to 7" touchscreen) on your desk that responds to hand gestures detected by a camera. Wave to dismiss a notification. Point to scroll through a dashboard. Hold up a palm to pause music. No touch required.

Why the AI HAT+ matters: Hand landmark detection (MediaPipe's hand model or MoveNet) needs to run at 30+ fps to feel responsive. Anything slower and the gestures feel delayed. The Hailo-8L handles this comfortably while the Pi drives the display and manages application logic.

The code angle: MediaPipe's hand landmark model (compiled through the Hailo Model Zoo) detects 21 hand keypoints per frame. A Python classifier maps keypoint patterns to gestures. The display runs a lightweight web UI (Flask or FastAPI) that receives gesture events over a local WebSocket.
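A gesture classifier over 21 keypoints can be surprisingly simple. The sketch below uses the MediaPipe-style landmark indexing (wrist at index 0, fingertips at 8/12/16/20) and a crude heuristic: a finger is "extended" when its tip is farther from the wrist than its middle joint. The synthetic landmarks are illustrative; real coordinates come from the compiled model.

```python
import math

# MediaPipe-style hand landmark indices (wrist = 0)
FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky tips
FINGER_PIPS = [6, 10, 14, 18]   # corresponding middle (PIP) joints

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def extended_fingers(landmarks):
    """Count fingers whose tip is farther from the wrist than its PIP joint."""
    wrist = landmarks[0]
    return sum(
        1 for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
        if dist(landmarks[tip], wrist) > dist(landmarks[pip], wrist)
    )

def classify(landmarks):
    n = extended_fingers(landmarks)
    if n >= 4:
        return "palm"   # e.g. pause music
    if n == 0:
        return "fist"   # e.g. dismiss notification
    return "other"

# Synthetic landmarks: wrist at origin, tips pushed out for an open palm
palm = [(0, 0)] * 21
for pip, tip in zip(FINGER_PIPS, FINGER_TIPS):
    palm[pip] = (0, 5)
    palm[tip] = (0, 10)

# Fist: tips curled in closer to the wrist than the PIP joints
fist = [(0, 0)] * 21
for pip, tip in zip(FINGER_PIPS, FINGER_TIPS):
    fist[pip] = (0, 5)
    fist[tip] = (0, 3)
```

Because the NPU delivers fresh keypoints 30+ times a second, even this naive rule feels instant; a real build would add temporal smoothing so a single noisy frame doesn't flip the gesture.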

Total hardware cost: Pi 5 + AI HAT+ + Camera Module 3 + 5" display = ~$230

4. License Plate Reader for Your Driveway

What it does: A camera pointed at your driveway detects vehicles and reads license plates. Known plates (your household, frequent visitors) get logged silently. Unknown plates trigger an alert to your phone. All local processing, no cloud plate recognition API fees.

Why the AI HAT+ matters: License plate recognition is a two-stage pipeline. First, detect the vehicle and locate the plate region (YOLOv8 or similar). Second, run OCR on the plate region. Both stages are neural network inference. The Hailo-8L handles both at video frame rates. On CPU alone, this pipeline tops out around 1-2 fps, which means you miss plates on faster-moving vehicles.

The code angle: The Hailo TAPPAS framework supports cascaded model pipelines (detection, then classification/OCR on the detected region). PaddleOCR or a custom CRNN model handles the character recognition step. Results push to a SQLite database and a simple Flask API. Home Assistant automation triggers notifications for unknown plates.

Total hardware cost: Pi 5 + AI HAT+ + Camera Module 3 (or IR camera for night) = ~$195-220

5. Real-Time Pose Estimation for Workout Tracking

What it does: A camera watches you exercise and tracks body joint positions in real-time. Count reps automatically. Check form on squats and deadlifts. Log workout data locally. Display an overlay on a monitor showing your skeleton and rep count.

Why the AI HAT+ matters: MoveNet or BlazePose needs to track 17-33 body keypoints per frame. At 5-6 fps (CPU only), the skeleton jitters and rep counting becomes unreliable. At 45 fps (Hailo-8L), the tracking is smooth enough for real-time form feedback.

The code angle: MoveNet Lightning (compiled for Hailo) outputs joint coordinates as a NumPy array. A Python script calculates joint angles (knee angle for squats, hip angle for deadlifts) and applies simple threshold logic for rep counting. Pygame or OpenCV draws the skeleton overlay. Data logs to a CSV or SQLite database for tracking over time.
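The angle-plus-threshold logic is small enough to show in full. This sketch computes the angle at a joint from three keypoints and counts a rep with simple hysteresis (the thresholds are illustrative; tune them to your camera angle and exercise):

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by points a-b-c,
    e.g. hip-knee-ankle for a squat."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

class RepCounter:
    """Count a rep each time the knee angle dips below `down` and then
    recovers above `up`. The two thresholds (hysteresis) stop keypoint
    jitter from registering phantom reps."""
    def __init__(self, down=100, up=160):
        self.down, self.up = down, up
        self.in_rep = False
        self.reps = 0

    def update(self, angle):
        if not self.in_rep and angle < self.down:
            self.in_rep = True
        elif self.in_rep and angle > self.up:
            self.in_rep = False
            self.reps += 1
        return self.reps

counter = RepCounter()
for angle in [170, 150, 95, 90, 120, 165, 170, 92, 168]:  # two squats
    counter.update(angle)
```

At 45 fps the angle stream is dense enough that the hysteresis bands are easy to tune; at 5 fps you can miss the bottom of a fast rep entirely, which is the CPU-only failure mode described above.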

Total hardware cost: Pi 5 + AI HAT+ + Camera Module 3 + HDMI monitor = ~$220 (assuming you have a monitor)

6. Object Counting for Inventory and Maker Projects

What it does: Point a camera at a conveyor, a parts bin, or a shelf. The system counts specific objects as they appear, move, or are removed. Useful for small-batch manufacturing, 3D print farms, or even counting screws in a parts drawer.

Why the AI HAT+ matters: Object counting requires detection plus tracking across frames. The Hailo-8L runs YOLOv8n fast enough to track objects at full video speed. DeepSORT or ByteTrack (tracking algorithms) run on the CPU and assign persistent IDs to detected objects across frames.

The code angle: TAPPAS provides example pipelines for detection-plus-tracking. A Python script wraps the pipeline output, maintains a count, and exposes it via an API or MQTT. A web dashboard (Node-RED, Grafana, or a custom Flask app) shows the running count and history.
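The persistent-ID idea can be illustrated with a stripped-down centroid tracker. This is a sketch of the concept only; real pipelines use ByteTrack or DeepSORT, which handle occlusion and re-identification far more robustly:

```python
import math

class CentroidCounter:
    """Assign a persistent ID to each detection by nearest-centroid
    matching; the running count is the number of distinct IDs seen."""
    def __init__(self, max_dist=50):
        self.max_dist = max_dist
        self.tracks = {}   # id -> last known centroid
        self.next_id = 0

    def update(self, centroids):
        assigned = {}
        unmatched = dict(self.tracks)
        for c in centroids:
            # Match to the closest existing track within max_dist
            best = min(unmatched.items(),
                       key=lambda kv: math.dist(kv[1], c),
                       default=None)
            if best and math.dist(best[1], c) <= self.max_dist:
                tid = best[0]
                del unmatched[tid]
            else:
                tid = self.next_id   # new object: mint a fresh ID
                self.next_id += 1
            assigned[tid] = c
        self.tracks = assigned
        return self.next_id          # total distinct objects seen so far

counter = CentroidCounter()
counter.update([(10, 10)])                       # first object -> ID 0
counter.update([(14, 12)])                       # same object, moved slightly
total = counter.update([(18, 14), (200, 200)])   # a second object appears
```

The count only increments when a detection can't be matched to an existing track, which is what keeps one slowly moving part from being counted dozens of times.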

Total hardware cost: Pi 5 + AI HAT+ + Camera Module 3 = ~$175


Hardware Requirements

The AI HAT+ has specific requirements. Not every Pi will work.

What You Need

| Component | Requirement | Notes |
|---|---|---|
| Board | Raspberry Pi 5 (4GB or 8GB) | Pi 4, Pi 3, and Pi Zero lack PCIe; the AI HAT+ cannot connect to them. |
| OS | Raspberry Pi OS (Bookworm, 64-bit) | Required for Hailo kernel driver support. Ubuntu 24.04 also works with a manual driver install. |
| Power supply | 27W USB-C (5.1V, 5A) | The official Pi 5 power supply. The AI HAT+ adds ~2.5W to the Pi's draw; a 15W supply will cause brownouts under load. |
| Cooling | Active cooling (fan, or heatsink with fan) | The Hailo-8L generates heat; passive cooling may throttle during sustained inference. The official Pi 5 Active Cooler works. |
| Camera | Raspberry Pi Camera Module 3 (recommended) | Any MIPI-CSI camera works. USB cameras also work but add latency. |
| Storage | 32GB+ microSD or NVMe SSD | Models and the TAPPAS framework take several GB. |

What You Don't Need

  • A Pi 5 with 8GB RAM. The Hailo-8L has its own memory (no LPDDR allocation from the Pi). 4GB Pi 5 works fine for single-model inference.
  • An internet connection after initial setup. All inference runs locally.
  • A GPU. The Hailo-8L is the GPU, conceptually. The Pi's VideoCore VII is still available for display output.

Form Factor Note

The AI HAT+ sits on top of the Pi 5 and connects via the PCIe FFC (flat flexible cable) that comes in the box. It occupies the same physical space as the M.2 HAT+ for NVMe drives, so you cannot use both the AI HAT+ and an NVMe HAT+ simultaneously on a single Pi 5. If you need both AI inference and fast SSD storage, use a USB 3.0 SSD instead of NVMe, or look at the AI HAT+ 2 ($130, Hailo-10H, 40 TOPS), which includes its own 8GB of LPDDR4X RAM and may eventually support a different stacking arrangement.


Setup Walkthrough

This is the high-level process. Full step-by-step details are in the official Raspberry Pi AI HAT+ documentation.

Step 1: Physical Installation

  1. Power off the Pi 5 completely.
  2. Connect the PCIe FFC cable from the Pi 5's PCIe connector to the AI HAT+ board.
  3. Mount the AI HAT+ on the GPIO header standoffs (included).
  4. Attach your camera to the Pi 5's camera connector (under the AI HAT+ board).
  5. Connect power, HDMI, and peripherals.

Step 2: Software Setup

```shell
# Update your system
sudo apt update && sudo apt full-upgrade -y

# Install the Hailo runtime and TAPPAS framework
sudo apt install hailo-all -y

# Reboot to load the kernel driver
sudo reboot
```

Step 3: Verify the Hardware

```shell
# Check that the Hailo device is detected
hailortcli fw-control identify

# Expected output: something like
# Hailo-8L (Device: 0000:01:00.0)
# Firmware Version: 4.x.x
```

Step 4: Run Your First Inference

```shell
# Run a YOLOv8 object detection demo on a camera feed
# (TAPPAS includes pre-compiled demo pipelines)
hailo-tappas-run detection --input /dev/video0
```

If you see bounding boxes appearing around objects in your camera feed, the AI HAT+ is working. The entire setup process takes about 15-20 minutes, most of which is waiting for apt install to finish.
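If you plan to script the health check (say, for a monitoring cron job), a small parser over the `hailortcli` output is enough. The output format below is assumed from the sample in Step 3, and the firmware string is illustrative; adjust the patterns to whatever your runtime version actually prints:

```python
import re

# Illustrative output in the shape shown by `hailortcli fw-control identify`
SAMPLE = """\
Hailo-8L (Device: 0000:01:00.0)
Firmware Version: 4.17.0
"""

def parse_identify(output):
    """Pull the chip name, PCIe address, and firmware version out of
    identify-style output; raise if no device line is found."""
    device = re.search(r"^(Hailo-\S+)\s+\(Device:\s*([\d:.a-f]+)\)", output, re.M)
    fw = re.search(r"Firmware Version:\s*([\d.]+)", output)
    if not device:
        raise RuntimeError(
            "No Hailo device found -- check the FFC cable and power supply")
    return {
        "chip": device.group(1),
        "pcie": device.group(2),
        "firmware": fw.group(1) if fw else None,
    }

info = parse_identify(SAMPLE)
```

In a real script you'd feed it `subprocess.run(["hailortcli", "fw-control", "identify"], capture_output=True, text=True).stdout` and alert if the parse fails.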


Performance: AI HAT+ vs CPU vs Cloud APIs

Three ways to run AI inference on a Pi 5. Here's how they compare.

| Factor | AI HAT+ (Hailo-8L) | Pi 5 CPU Only | Cloud API (e.g., Google Vision) |
|---|---|---|---|
| YOLOv8n fps | ~60 | ~3-5 | Network-dependent (typically 2-5 fps round-trip) |
| Latency per frame | ~17ms | ~200-330ms | 200-500ms (network dependent) |
| Monthly cost | $0 (one-time $70) | $0 | $1.50-3.00 per 1,000 images |
| Internet required | No | No | Yes |
| Privacy | All data stays local | All data stays local | Images sent to third-party servers |
| Power consumption | ~12W total (Pi + HAT) | ~8W | ~8W + cloud infrastructure |
| Max concurrent models | 1-2 (depending on model size) | 1 (slowly) | Unlimited (pay per call) |

The math on cloud APIs gets expensive fast. A security camera processing 10 frames per second, 24/7, generates about 26 million frames per month. At Google Cloud Vision's pricing ($1.50 per 1,000 images), that's roughly $39,000/month. At that rate (600 frames per minute, or $0.90/minute), the AI HAT+ pays for itself in about 78 minutes of operation.

Even at modest usage (100 inferences per day), cloud APIs cost roughly $4.50-9/month. The AI HAT+ breaks even in 8-16 months. And you keep your data.
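The break-even arithmetic is easy to check in a few lines (prices as quoted above; plug in your own frame rate and API rate):

```python
def cloud_cost_per_month(fps, price_per_1k=1.50, days=30):
    """Monthly cloud-API cost for a camera running 24/7 at `fps`."""
    frames = fps * 60 * 60 * 24 * days
    return frames / 1000 * price_per_1k

def breakeven_minutes(hat_price, fps, price_per_1k=1.50):
    """Minutes of 24/7 operation until the one-time HAT price
    equals the equivalent cloud spend."""
    cost_per_minute = fps * 60 / 1000 * price_per_1k
    return hat_price / cost_per_minute

monthly = cloud_cost_per_month(10)    # 10 fps security camera
minutes = breakeven_minutes(70, 10)   # $70 AI HAT+ at the same rate
```

At 10 fps and $1.50 per 1,000 images, `monthly` comes out just under $39,000 and `minutes` lands around 78, which is why "always-on camera" is the canonical use case for local inference.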

According to Raspberry Pi's official benchmarks, the Hailo-8L achieves 87% of its theoretical 13 TOPS on well-optimized models from the Hailo Model Zoo. Custom models compiled through the Hailo Dataflow Compiler typically achieve 60-80% efficiency depending on architecture and quantization quality.


Who Actually Needs This

Be honest with yourself about your project before spending $70.

You Should Buy the AI HAT+ If:

  • You have a specific AI project in mind (security camera, pose tracking, voice assistant) and you're ready to build it
  • You want real-time inference (30+ fps) from a camera feed on a Pi
  • Privacy matters to you and you want all processing to stay local
  • You've tried running AI models on the Pi 5's CPU and hit the performance wall
  • You're building something that will run continuously (always-on camera, voice assistant) where cloud API costs would accumulate

You Should NOT Buy the AI HAT+ If:

  • You just want to "experiment with AI" without a specific project. Start with the Pi 5's CPU and TensorFlow Lite. Upgrade to the HAT+ when you hit the fps wall.
  • You need to run large language models. The Hailo-8L is optimized for vision and classification models. Running LLMs locally on a Pi requires the AI HAT+ 2 ($130, 40 TOPS, 8GB dedicated RAM) or an external GPU.
  • You need NVMe SSD speed AND AI inference. The HAT+ occupies the PCIe slot. You can't stack it with the NVMe HAT+.
  • Your project only needs occasional inference (one image per minute, batch processing). The CPU is fine for that. The HAT+ matters when you need sustained, real-time throughput.
  • You're running a Pi 4 or earlier. No PCIe port. No way to connect the HAT+.

The Upgrade Path

If you outgrow the AI HAT+ (13 TOPS), the AI HAT+ 2 launched January 15, 2026 at $130. It uses the Hailo-10H chip at 40 TOPS and includes 8GB of dedicated LPDDR4X memory. The dedicated RAM means it can run vision-language models (VLMs) and small LLMs without consuming the Pi 5's system memory. Tom's Hardware and Jeff Geerling both covered it extensively. If you know you'll need multi-model pipelines or language model inference, start with the AI HAT+ 2 instead.


Raspberry Pi AI HAT+ vs Alternatives

The AI HAT+ isn't the only edge AI accelerator. Here's how it compares.

| Accelerator | TOPS | Interface | Price | Pi 5 Compatible | Notes |
|---|---|---|---|---|---|
| Raspberry Pi AI HAT+ | 13 | PCIe (M.2) | $70 | Yes | Official Pi accessory. Best integration. |
| Raspberry Pi AI HAT+ 2 | 40 | PCIe (M.2) | $130 | Yes | 8GB dedicated RAM. Runs LLMs. |
| Google Coral USB | 4 | USB 3.0 | $60 | Yes (USB) | Mature ecosystem, lower performance. Edge TPU models only. |
| Intel NCS2 | ~4 | USB 3.0 | $70 | Yes (USB) | Discontinued; OpenVINO support ending. Not recommended for new projects. |
| Hailo-8 M.2 Module | 26 | M.2 | $100+ | With M.2 HAT | Bare module; requires a compatible M.2 carrier. Less Pi-specific documentation. |

The AI HAT+ wins on integration. It's designed for the Pi 5, supported by Raspberry Pi's official documentation, and installs with a single apt install. The Coral USB is the only serious alternative for Pi users, but at 4 TOPS it's roughly 3x slower for the same models.


Frequently Asked Questions

Does the Raspberry Pi AI HAT+ work with the Pi 4?

No. The AI HAT+ requires a PCIe 2.0 interface, which only the Raspberry Pi 5 provides. The Pi 4 uses a USB bus for its peripherals and has no PCIe connector. There is no adapter or workaround. If you're on a Pi 4 and want AI acceleration, the Google Coral USB Accelerator (USB 3.0, 4 TOPS) is your option.

How much power does the AI HAT+ draw?

The Hailo-8L draws approximately 2.5W under full inference load. Combined with the Pi 5 (which draws 5-8W depending on CPU utilization), total system power is about 10-12W. You need the official 27W (5.1V, 5A) Pi 5 power supply. Lower-rated supplies will cause voltage drops, random reboots, or throttled performance. This is the single most common "why isn't it working" issue on the Pi forums.

Can I use the AI HAT+ and an NVMe SSD at the same time?

Not directly. Both the AI HAT+ and the M.2 HAT+ (for NVMe SSDs) use the Pi 5's single PCIe slot. You choose one. If you need fast storage alongside AI inference, use a USB 3.0 SSD (which gives you around 350-400 MB/s, versus 90 MB/s on microSD). Some community members have experimented with PCIe bifurcation, but it's not officially supported.

What AI models can the Hailo-8L run?

The Hailo Model Zoo includes pre-compiled versions of YOLOv5, YOLOv7, YOLOv8, MobileNet, EfficientNet, ResNet, SSD MobileNet, RetinaFace, MoveNet, DeepLabV3, and dozens more. Custom models trained in TensorFlow or PyTorch can be compiled through the Hailo Dataflow Compiler (DFC). The main constraint: models must fit in the Hailo-8L's on-chip memory. Very large models (anything above ~20M parameters at INT8) may need to be pruned or split.

Is the AI HAT+ good for running LLMs like Llama or Mistral?

The original AI HAT+ (Hailo-8L, 13 TOPS) is not ideal for LLMs. It's optimized for vision models (CNNs, transformers for image/video tasks). LLMs require large amounts of memory for token generation, and the Hailo-8L doesn't have enough on-chip buffer for models with billions of parameters. For local LLM inference on a Pi, the AI HAT+ 2 ($130, Hailo-10H, 40 TOPS, 8GB LPDDR4X) is the better choice. It was specifically designed with LLM and VLM workloads in mind.

How loud is the cooling setup?

The AI HAT+ itself has no fan. It uses a passive heatsink. The noise comes from whatever cooling you use on the Pi 5. The official Raspberry Pi Active Cooler is nearly inaudible at normal room temperature. Under sustained AI inference load in a warm room, the fan may spin up to a low hum. It's quieter than a laptop under load. If noise is a concern (bedroom security camera, for example), a larger passive heatsink like the Pimoroni Pibow Heatsink Case can keep temperatures manageable without any fan.


Last updated: April 2026. Prices and availability reflect current listings at time of publication.
