MIPAR offers three deep learning architectures. Choose based on your analysis goal and dataset characteristics.
Architecture Overview
SegNet — General Purpose Segmentation
| SegNet | |
|---|---|
| Purpose | Fast-training general purpose segmentation |
| Best For | Most segmentation tasks, quick iteration |
| Output | Probability or layer map |
| Min Classes | 2 |
Key Strength: Lightweight and trains quickly. Accommodates most training data effectively.
Architecture Tuning: For SegNet, filter size typically matters more than depth.
- Use filter size 5 or 7 for broader context (vs. the typical filter size of 3)
UNet — Robust or Complex Segmentation
| UNet | |
|---|---|
| Purpose | Robust segmentation with sharpest boundaries |
| Best For | Complex features, high variation, crispest boundaries needed |
| Output | Probability or layer map |
| Min Classes | 2 |
| Size Range | 20 MB (3×3) to 1 GB (6×3) |
Key Strength: Produces the crispest boundaries and generalizes well across image variation.
Architecture Sizing: Depth × Filter notation (e.g., 3×3, 4×3, 5×3)
- Increase depth to handle more variation (3→4→5→6)
- Rarely increase filter beyond 3 (memory scales fast)
- 3×3 = standard | 4×3 = moderate variation | 5×3 = high variation
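The depth-vs-variation mapping above can be sketched as a small lookup. This is a hypothetical helper for illustration only, not a MIPAR API; the variation labels are assumptions chosen to mirror the bullets above.

```python
# Hypothetical helper (not a MIPAR API): map a rough, qualitative
# estimate of dataset variation to a UNet depth x filter setting.
def suggest_unet_size(variation: str) -> str:
    """Depth grows with variation (3 -> 4 -> 5 -> 6); the filter
    count stays at 3 because memory scales quickly beyond that."""
    sizes = {
        "standard": "3x3",
        "moderate": "4x3",
        "high": "5x3",
        "extreme": "6x3",
    }
    if variation not in sizes:
        raise ValueError(f"unknown variation level: {variation!r}")
    return sizes[variation]

print(suggest_unet_size("moderate"))  # 4x3
```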
YOLO — Object Detection
| YOLO | |
|---|---|
| Purpose | Detect and count discrete objects |
| Best For | Counting, localization, single-class problems |
| Output | Bounding boxes with confidence scores |
| Complexity | Basic (fast) or Advanced (more accurate) |
| Min Classes | 1 |
Key Strength: Faster annotation (bounding boxes vs. pixel masks); works with a single class.
YOLO → Spotlight Pipeline:
- Train YOLO for fast detection → Feed boxes to Spotlight → Get pixel-accurate segmentation.
- This combines detection speed with segmentation precision. Ideal for counting + measuring workflows.
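Conceptually, the pipeline above is a detect-then-refine handoff. The sketch below is illustrative only: in MIPAR this handoff happens in the application, and the function names here are hypothetical placeholders, not real APIs.

```python
# Conceptual sketch of a detect-then-refine pipeline. The detector
# and segmenter callables are hypothetical stand-ins for the
# YOLO and Spotlight stages, not MIPAR functions.
def count_and_measure(image, detector, segmenter):
    """Detect objects as boxes, then refine each box into a
    pixel-accurate mask; returns (count, masks)."""
    boxes = detector(image)                            # fast detection
    masks = [segmenter(image, box) for box in boxes]   # precise masks
    return len(boxes), masks
```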
Decision Guide
| Your Goal | Dataset | Choice |
|---|---|---|
| Area, boundaries, morphology | Most cases | SegNet 3×3 (lightweight, fast training) |
| Area, boundaries, morphology | Complex/high variation | UNet 3×3 (increase depth if needed) |
| Crispest boundaries needed | Any | UNet 3×3 or higher |
| Count objects | Any | YOLO |
| Count + measure objects | Any | YOLO → Spotlight |
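The decision guide can be expressed as a simple rule function. This is a hypothetical sketch of the table's logic for illustration, not a MIPAR feature; the goal labels are assumptions.

```python
# Hypothetical encoding of the decision guide table above.
def choose_architecture(goal: str, complex_data: bool = False) -> str:
    """goal: 'segment', 'count', or 'count+measure'.
    complex_data: True for complex features / high variation."""
    if goal == "count":
        return "YOLO"
    if goal == "count+measure":
        return "YOLO -> Spotlight"
    if goal == "segment":
        # UNet for complex data or crisp boundaries; SegNet otherwise.
        return "UNet 3x3" if complex_data else "SegNet 3x3"
    raise ValueError(f"unknown goal: {goal!r}")
```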
Training Data Requirements
How many images?
MIPAR tiles images during training (e.g., 3×2 = 6 sub-images). Typical needs:
- Simple features: 3-5 images
- Fine-tuning: 5-8 images
- Complex features: 10-15 images
Quality > Quantity. Well-chosen images spanning variation matter more than volume.
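Because MIPAR tiles each image, the effective training-set size is images times tiles. A minimal sketch of that arithmetic (illustrative only, not a MIPAR API):

```python
# Effective training-set size: each image is split into
# cols x rows sub-images during training.
def training_subimages(num_images: int, cols: int, rows: int) -> int:
    return num_images * cols * rows

# e.g. 8 fine-tuning images with a 3x2 tiling -> 48 sub-images
print(training_subimages(8, 3, 2))  # 48
```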
Annotation by Model Type:
Segmentation: Partial annotation is OK. Trace representative examples; unlabeled pixels are ignored.
YOLO: Must mark ALL instances per class. No partial annotation.
Epoch guidelines:
| Training Sub-images | Epochs |
|---|---|
| < 200 | 500-700 |
| 200-500 | 300-500 |
| 500-1000 | 100-300 |
| > 1000 | 50-100 |
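The epoch table above can be encoded as a range lookup. A minimal sketch (hypothetical helper, not a MIPAR API):

```python
# Hypothetical encoding of the epoch guidelines table: returns a
# (low, high) epoch range for a given training sub-image count.
def suggest_epochs(subimages: int) -> tuple:
    if subimages < 200:
        return (500, 700)
    if subimages <= 500:
        return (300, 500)
    if subimages <= 1000:
        return (100, 300)
    return (50, 100)

print(suggest_epochs(48))  # (500, 700)
```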
Best Practices
Validation: Always enable. Set frequency = 10-20 epochs, Early Stop Patience = 2-3.
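The Early Stop Patience setting follows the standard early-stopping pattern: stop when validation loss has not improved for N consecutive validation checks. A minimal sketch of that logic (illustrative, not MIPAR's implementation):

```python
# Standard early-stopping logic: stop once validation loss fails
# to improve for `patience` consecutive validation checks.
def should_stop(val_losses, patience=3):
    best = float("inf")
    stale = 0
    for loss in val_losses:
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return True
    return False
```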
Augmentation:
- Geometry (rotation, shift, scale): Always useful
- Intensity: Rarely needed; enable only if images vary significantly in brightness. Disable if a preprocessing recipe already handles illumination.
Preprocessing Recipes:
Use if images vary in illumination. The recipe is embedded in the .DLM file and runs automatically when the model is applied.
Iterative Improvement:
Deploy → find failures → correct → retrain → repeat. Failure cases are most valuable training data.
Train New vs. Updating:
- Update (fine-tune) an existing model when your task is similar to the original and you have limited new data (5-10 images).
- Train a new model when your task differs significantly, you need to change architecture parameters, or the existing model performs poorly.