Runs 100% locally on your Mac

Caption your AI
training images.
No cloud required.

Dataset Labeler uses a vision AI model running locally on your Mac to generate detailed, training-ready captions for every image in your dataset.

Free · macOS · No API keys · No internet needed
dataset-labeler — localhost:8080

Training AI models requires
thousands of captions.

Writing them manually takes days. Cloud tools cost money and send your images to external servers. Dataset Labeler solves both problems.

Fully automated
Drop a folder of images, click Generate. The AI writes detailed, structured captions for every image automatically.
🔒
Completely private
Your images never leave your computer. No uploads, no cloud processing, no external API calls whatsoever.
$0
Totally free
No subscription, no per-image cost, no token limits. Run it as much as you want on your own hardware.

Open the app.
Everything else is automatic.

The launcher handles the entire setup for you. No Terminal, no manual configuration.

01
Ollama is installed and started
The launcher checks whether Ollama is present on your Mac. If not, it opens the download page. If it's installed but not running, it starts it automatically (sketched in Python below).
✓ Automated
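Under the hood, the check is simple enough to sketch in a few lines of Python. This is an illustration rather than the launcher's actual code; it assumes Ollama's default local API port, 11434.

import shutil
import subprocess
import urllib.request
import webbrowser

def ensure_ollama() -> None:
    # Is the Ollama binary on the PATH at all?
    if shutil.which("ollama") is None:
        webbrowser.open("https://ollama.com/download")  # point the user at the installer
        return
    try:
        # A running Ollama server answers on localhost:11434 by default.
        urllib.request.urlopen("http://localhost:11434", timeout=2)
    except OSError:
        # Installed but not running: start the server in the background.
        subprocess.Popen(["ollama", "serve"])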
02
The vision model is downloaded
Qwen2.5-VL 7B is pulled locally if it isn't already present. This is a one-time ~6 GB download, with progress shown in a live log (sketched below).
✓ Automated
qwen2.5vl:7b
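A rough Python sketch of the same step, assuming the standard Ollama CLI (the launcher's real implementation may differ). "ollama pull" is idempotent, so rerunning it when the model already exists returns almost immediately.

import subprocess

def pull_model(tag: str = "qwen2.5vl:7b") -> None:
    with subprocess.Popen(
        ["ollama", "pull", tag],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
    ) as proc:
        for line in proc.stdout:
            print(line, end="")  # mirror Ollama's progress output into the live log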
03
A local HTTP server is started
Python's built-in HTTP server serves the app interface from your Documents folder on port 8080 (equivalent sketch below). No installation required.
✓ Automated
localhost:8080
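This is equivalent to running python3 -m http.server 8080 --directory <app folder>. A programmatic sketch, with an illustrative folder path:

from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

app_dir = Path.home() / "Documents" / "DatasetLabeler"  # illustrative path
handler = partial(SimpleHTTPRequestHandler, directory=str(app_dir))
ThreadingHTTPServer(("localhost", 8080), handler).serve_forever()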
04
The app opens in your browser
Your default browser opens the Dataset Labeler interface. Drop your images, configure your settings, and click Generate.
✓ Automated
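In Python terms this step is a single standard-library call; a minimal sketch:

import webbrowser

webbrowser.open("http://localhost:8080")  # opens whatever browser is the macOS default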
05
Download your labeled dataset
When captioning is complete, export a ZIP containing all your images paired with their matching .txt caption files, ready to use in any training pipeline (format sketched below).
images + .txt files → .zip
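The pairing convention is easy to reproduce in your own scripts. An illustrative Python sketch; the function name and the set of image extensions are assumptions, not the app's code:

import zipfile
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}  # assumed; the app may accept more

def export_dataset(folder: Path, out: Path) -> None:
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for img in sorted(folder.iterdir()):
            if img.suffix.lower() not in IMAGE_EXTS:
                continue
            caption = img.with_suffix(".txt")  # caption sits beside the image
            if caption.exists():
                zf.write(img, img.name)          # image.png
                zf.write(caption, caption.name)  # image.txt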

Built for serious
dataset creation.

Every feature is designed around one goal — making your training captions as accurate and detailed as possible.

Captioning
Exhaustive visual descriptions
The AI describes every visible element — subjects, clothing, hair, expression, pose, background, lighting, camera angle, color palette and visual style.
Subject Control
Named subject injection
Set a subject name and every caption will use it instead of generic words like "woman" or "person". Optionally force captions to start with the name (see the prompt sketch below).
Subject Name · Start with name
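A hypothetical sketch of how name injection can be wired into the prompt. The app's actual prompt wording isn't published, so every string below is illustrative:

def build_prompt(subject: str | None = None, start_with_name: bool = False) -> str:
    prompt = "Describe this image in exhaustive detail for a training caption."
    if subject:
        prompt += f' Always refer to the main subject as "{subject}",'
        prompt += ' never as "woman", "man", or "person".'
        if start_with_name:
            prompt += f' Begin the caption with "{subject}".'
    return prompt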
Focus
Feature emphasis presets
Tell the model to focus extra detail on a specific element: face, pose, clothing, hands, lighting and more. Perfect for specialized training.
face · pose · clothing · hands · lighting · background
Gallery
Masonry gallery view
Switch to gallery mode to visually browse your entire dataset. Click any image to open a fullscreen lightbox with keyboard navigation and inline caption editing.
Workflow
Concurrent processing
Process up to 5 images simultaneously. Filter by status: All, Pending, Done, or Errors. Redo individual captions without reprocessing the whole set (concurrency sketched below).
Up to 5× parallel · Per-image redo
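A sketch of what the five-way cap can look like against a local Ollama server. It assumes Ollama's standard /api/generate endpoint; the app's internals may differ:

import asyncio
import base64
import json
import urllib.request

SEMAPHORE = asyncio.Semaphore(5)  # never more than five captions in flight

def _caption_sync(image_path: str) -> str:
    payload = json.dumps({
        "model": "qwen2.5vl:7b",
        "prompt": "Caption this image in detail.",
        "images": [base64.b64encode(open(image_path, "rb").read()).decode()],
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload, headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

async def caption(image_path: str) -> str:
    async with SEMAPHORE:  # the semaphore enforces the 5-image cap
        return await asyncio.to_thread(_caption_sync, image_path)

async def caption_all(paths: list[str]) -> list[str]:
    return await asyncio.gather(*(caption(p) for p in paths))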
Export
One-click ZIP download
Download a ZIP containing all processed images with their paired .txt caption files. Drop it straight into your training pipeline.
image.png + image.txt · .zip archive

Your images stay
on your machine.

Most AI tools send your data to external servers. Dataset Labeler never does.

🔒
Air-gapped by design
The vision model runs entirely inside Ollama on your Mac. No image, caption, or metadata is ever transmitted to any external server — not even for analytics.
No internet required after setup
No account or login
No telemetry or analytics
No API keys ever

Built on proven,
open-source foundations.

Ollama
Local model runtime
ollama.com
Qwen2.5-VL
Vision language model
7B parameters
Electron
macOS launcher app
v30
React
App interface
v18

Start building your
dataset today.

Free download. macOS only. No setup beyond clicking Launch.

macOS 12+ · Apple Silicon & Intel · Free forever