Split image datasets into train, validation, and test sets. Supports random and stratified splits, custom ratios, and annotations that are split together with their images. Use when the user needs to split a dataset for machine learning training.
# Simple split (80/10/10)
python scripts/splitter.py split /path/to/images/ --ratios 80 10 10
# With annotations
python scripts/splitter.py split /path/to/images/ --annotations /path/to/labels/
# YOLO format output
python scripts/splitter.py split /path/to/images/ --output /path/to/dataset/ --yolo
# Stratified by class
python scripts/splitter.py split /path/to/images/ --annotations labels/ --stratify
$ python scripts/splitter.py split ./images --ratios 80 10 10
Splitting dataset...
Total images: 1000
Train: 800 (80%)
Val: 100 (10%)
Test: 100 (10%)
✓ Created train/ (800 images)
✓ Created val/ (100 images)
✓ Created test/ (100 images)
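The counts in the sample output follow directly from the ratios. The internals of `splitter.py` are not shown here, so the following is only a minimal sketch of how a seeded ratio split is commonly done. The function name `split_by_ratios`, the default seed, and the choice to give rounding leftovers to the test set are all assumptions made for illustration.

```python
import random


def split_by_ratios(items, ratios=(80, 10, 10), seed=42):
    """Shuffle items with a fixed seed and cut them into train/val/test lists.

    Hypothetical helper: not taken from splitter.py.
    """
    if len(ratios) != 3 or any(r < 0 for r in ratios) or sum(ratios) <= 0:
        raise ValueError("ratios must be three non-negative numbers with a positive sum")

    # Sort first so the result depends only on the seed, not on input order.
    shuffled = sorted(items)
    random.Random(seed).shuffle(shuffled)

    total = sum(ratios)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total

    # The test split takes the remainder, so every item lands in exactly one split.
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


names = [f"img_{i:04d}.jpg" for i in range(1000)]
splits = split_by_ratios(names)
print({k: len(v) for k, v in splits.items()})
# → {'train': 800, 'val': 100, 'test': 100}
```

Using a dedicated `random.Random(seed)` instance rather than the module-level generator keeps the split reproducible even if other code draws random numbers in between.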
pip install pillow
--ratios: Split ratios (train val test), default: 80 10 10
--seed: Random seed for reproducibility
--annotations: Path to annotations (will be split together)
--output: Output directory
--yolo: Output in YOLO dataset format
--stratify: Maintain class distribution
--copy: Copy files instead of moving
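To illustrate what "maintain class distribution" means for `--stratify`, here is a rough sketch that applies the ratios within each class separately. How `splitter.py` actually derives class labels from annotations is not documented here. This example assumes one label per image, and `stratified_split` and its `labels` mapping are hypothetical names.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(80, 10, 10), seed=42):
    """Split file names so each class keeps roughly the same train/val/test ratios.

    labels maps file name -> class label (assumed: exactly one label per image).
    Hypothetical helper: not taken from splitter.py.
    """
    by_class = defaultdict(list)
    for name, label in labels.items():
        by_class[label].append(name)

    rng = random.Random(seed)
    total = sum(ratios)
    splits = {"train": [], "val": [], "test": []}

    # Iterate classes in sorted order so the result depends only on the seed.
    for label in sorted(by_class):
        group = sorted(by_class[label])
        rng.shuffle(group)
        n_train = len(group) * ratios[0] // total
        n_val = len(group) * ratios[1] // total
        splits["train"] += group[:n_train]
        splits["val"] += group[n_train:n_train + n_val]
        splits["test"] += group[n_train + n_val:]  # remainder goes to test
    return splits


labels = {f"cat_{i}.jpg": "cat" for i in range(100)}
labels.update({f"dog_{i}.jpg": "dog" for i in range(50)})
result = stratified_split(labels)
print({k: len(v) for k, v in result.items()})
# → {'train': 120, 'val': 15, 'test': 15}
```

With 100 cat images and 50 dog images, each split keeps the 2:1 class balance of the full dataset. Images with several classes each, as in detection datasets, need a more involved strategy than this sketch covers.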