MIPAR offers three deep learning architectures. Choose based on your analysis goal and dataset characteristics.

Architecture Overview

SegNet — General Purpose Segmentation

Purpose: Fast-training, general-purpose segmentation
Best For: Most segmentation tasks; quick iteration
Output: Probability or layer map
Min Classes: 2

Key Strength: Lightweight and trains quickly. Accommodates most training data effectively.

Architecture Tuning: For SegNet, filter size typically matters more than depth.

  • Use filter 5 or 7 for broader context (the typical filter is 3)

UNet — Robust or Complex Segmentation

Purpose: Robust segmentation with the sharpest boundaries
Best For: Complex features, high variation, when the crispest boundaries are needed
Output: Probability or layer map
Min Classes: 2
Size Range: 20 MB (3×3) to 1 GB (6×3)

Key Strength: Produces the crispest boundaries and generalizes well across image variation.

Architecture Sizing: Depth × Filter notation (e.g., 3×3, 4×3, 5×3)

  • Increase depth to handle more variation (3→4→5→6)
  • Rarely increase filter beyond 3 (memory scales fast)
  • 3×3 = standard | 4×3 = moderate variation | 5×3 = high variation
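To see why the guidance above favors adding depth over growing the filter, some back-of-envelope parameter arithmetic helps. The sketch below is illustrative only: the channel counts and one-conv-per-level structure are assumptions about a generic UNet-style encoder, not MIPAR's actual architecture.

```python
# Rough parameter counts for a generic UNet-style encoder, to illustrate
# why kernel (filter) size inflates memory faster per step than depth.
# Channel schedule (64, doubling per level) is an assumption, not MIPAR's.

def conv_params(kernel, c_in, c_out):
    """Weights in one conv layer (biases ignored)."""
    return kernel * kernel * c_in * c_out

def encoder_params(depth, kernel, base_channels=64):
    """One conv per level; channels double each level (common UNet pattern)."""
    total, c_in = 0, 1  # single-channel (grayscale) input assumed
    for level in range(depth):
        c_out = base_channels * 2 ** level
        total += conv_params(kernel, c_in, c_out)
        c_in = c_out
    return total

p33 = encoder_params(depth=3, kernel=3)
p43 = encoder_params(depth=4, kernel=3)
p35 = encoder_params(depth=3, kernel=5)

print(p43 / p33)  # one extra level: new deepest level dominates, roughly 4x
print(p35 / p33)  # kernel 3 -> 5: every layer grows by (5/3)^2, about 2.8x
```

The key difference: extra depth adds capacity where it helps with variation, while a larger kernel multiplies the cost of every existing layer at once.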

YOLO — Object Detection

Purpose: Detect and count discrete objects
Best For: Counting, localization, single-class problems
Output: Bounding boxes with confidence scores
Complexity: Basic (fast) or Advanced (more accurate)
Min Classes: 1

Key Strength: Faster annotation (boxes vs. pixel masks); works with a single class.

YOLO → Spotlight Pipeline:

  • Train YOLO for fast detection → Feed boxes to Spotlight → Get pixel-accurate segmentation.
  • This combines detection speed with segmentation precision. Ideal for counting + measuring workflows.
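The hand-off above can be sketched as a small pipeline. This is a conceptual illustration only: `Box`, `count_and_measure`, and the toy detector/segmenter are hypothetical placeholders to show the data flow, not MIPAR API calls.

```python
# Conceptual YOLO -> Spotlight hand-off: detect boxes, then segment
# pixels inside each box. All names here are illustrative stand-ins.
from dataclasses import dataclass

@dataclass
class Box:
    x: int
    y: int
    w: int
    h: int
    confidence: float

def count_and_measure(image, detect, segment, min_confidence=0.5):
    """Keep confident detections, segment within each, measure pixel area."""
    boxes = [b for b in detect(image) if b.confidence >= min_confidence]
    areas = []
    for box in boxes:
        mask = segment(image, box)          # box-sized 2D grid of 0/1
        areas.append(sum(map(sum, mask)))   # pixel area per object
    return len(boxes), areas

# Toy stand-ins to exercise the flow:
def toy_detect(image):
    return [Box(0, 0, 2, 2, 0.9), Box(3, 3, 2, 2, 0.3)]

def toy_segment(image, box):
    return [[1] * box.w for _ in range(box.h)]  # fills the whole box

count, areas = count_and_measure(None, toy_detect, toy_segment)
print(count, areas)  # 1 object passes the threshold, area 4 px
```

The structure mirrors the workflow: detection supplies cheap localization and counting, segmentation supplies per-object measurements only where objects were found.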

Decision Guide

Your Goal | Dataset | Choice
Area, boundaries, morphology | Most cases | SegNet 3×3 (lightweight, fast training)
Area, boundaries, morphology | Complex/high variation | UNet 3×3 (increase depth if needed)
Crispest boundaries needed | Any | UNet 3×3 or higher
Count objects | Any | YOLO
Count + measure objects | Any | YOLO → Spotlight
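The decision guide above distills into a short lookup. The goal and dataset labels below are this article's terms, not MIPAR commands; treat it as a mnemonic, not an API.

```python
# The decision guide as a helper. Labels are the article's own wording.

def choose_architecture(goal, dataset="typical"):
    if goal == "count":
        return "YOLO"
    if goal == "count+measure":
        return "YOLO -> Spotlight"
    if goal == "crispest boundaries":
        return "UNet 3x3 or higher"
    # area / boundaries / morphology
    if dataset in ("complex", "high variation"):
        return "UNet 3x3 (increase depth if needed)"
    return "SegNet 3x3 (lightweight, fast training)"

print(choose_architecture("segmentation"))             # SegNet for most cases
print(choose_architecture("segmentation", "complex"))  # UNet for hard data
print(choose_architecture("count+measure"))            # YOLO -> Spotlight
```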

Training Data Requirements

How many images?
MIPAR tiles images during training (e.g., a 3×2 grid yields 6 sub-images per image). Typical needs:

  • Simple features: 3-5 images
  • Fine-tuning an existing model: 5-8 images
  • Complex features: 10-15 images

Quality > Quantity. Well-chosen images spanning variation matter more than volume.
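Because epoch guidance below is stated in sub-images, it helps to convert between image count and sub-image count. The helper below assumes the 3×2 tiling example above (6 sub-images per image); the actual tiling depends on your image and tile sizes.

```python
# Back-of-envelope conversion between images and training sub-images,
# assuming the article's 3x2 tiling example (6 sub-images per image).
import math

def subimages(n_images, tiles_per_image=6):
    return n_images * tiles_per_image

def images_needed(target_subimages, tiles_per_image=6):
    return math.ceil(target_subimages / tiles_per_image)

print(subimages(10))       # 60 sub-images from 10 images at 3x2 tiling
print(images_needed(200))  # 34 images to reach 200 sub-images
```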

Annotation by Model Type:

Segmentation: Partial annotation is OK – trace representative examples; unlabeled pixels are ignored.

YOLO: Must mark ALL instances per class. No partial annotation.

Epoch guidelines:

Training Sub-images | Epochs
< 200 | 500-700
200-500 | 300-500
500-1000 | 100-300
> 1000 | 50-100
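The epoch guidelines above can be expressed as a simple lookup. The ranges come straight from the table; where the table's rows overlap at a boundary (200, 500, 1000 sub-images), this sketch assigns the boundary to the lower-epoch row, which is a choice of mine, not the article's.

```python
# Epoch guidelines from the table, returned as (low, high) suggestions.
# Boundary values are assigned to the larger-dataset (fewer-epoch) row.

def suggested_epochs(n_subimages):
    if n_subimages < 200:
        return (500, 700)
    if n_subimages <= 500:
        return (300, 500)
    if n_subimages <= 1000:
        return (100, 300)
    return (50, 100)

print(suggested_epochs(150))   # small dataset: (500, 700)
print(suggested_epochs(1200))  # large dataset: (50, 100)
```

The pattern is the usual trade-off: fewer sub-images need more passes over the data to converge, while larger datasets risk overfitting-free coverage in far fewer epochs.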

Best Practices

Validation: Always enable. Set frequency = 10-20 epochs, Early Stop Patience = 2-3.

Augmentation:

  • Geometry (rotation, shift, scale): Always useful
  • Intensity: Rarely needed. Use only if images vary significantly in brightness; disable it if a preprocessing recipe already normalizes illumination.

Preprocessing Recipes:
Use if images vary in illumination. The recipe embeds in the .DLM file and runs automatically when the model is applied.

Iterative Improvement:
Deploy → find failures → correct → retrain → repeat. Failure cases are most valuable training data.

Train New vs. Updating:

  • Update (fine-tune) an existing model when your task is similar to the original and you have limited new data (5-10 images).
  • Train a new model when your task differs significantly, you need to change architecture parameters, or the existing model performs poorly.
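The update-vs-retrain rule of thumb above amounts to a short predicate. The 10-image cutoff comes from the article's "5-10 images" guidance; the function name and boolean framing are mine.

```python
# The article's "update vs. train new" guidance as a predicate.
# Cutoff of 10 new images reflects the "limited new data (5-10)" guideline.

def should_update_existing(task_similar, n_new_images,
                           needs_new_architecture, existing_performs_well):
    if needs_new_architecture or not existing_performs_well:
        return False  # train a new model
    return task_similar and n_new_images <= 10

print(should_update_existing(True, 8, False, True))   # True: fine-tune
print(should_update_existing(False, 8, False, True))  # False: new task, retrain
```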
