The challenge: predict per-pixel foreground/background masks given an image and sparse user "scribbles" (small markings with bg=0 / fg=1; most pixels are unlabeled). Below are the methods we tried, in chronological order. Train mIoU is the average over 228 training images, computed honestly out-of-fold for the deep models.
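For reference, the per-image mIoU metric over the two classes can be sketched as follows (a minimal sketch assuming binary 0/1 masks; the project's exact evaluation code is not shown):

```python
import numpy as np

def miou(pred, gt, classes=(0, 1)):
    """Mean IoU over background/foreground for one image.

    pred, gt: integer arrays of the same shape with values in `classes`.
    Classes absent from both pred and gt are skipped (empty union).
    """
    ious = []
    for c in classes:
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[1, 1],
                 [0, 0]])
gt   = np.array([[1, 0],
                 [0, 0]])
# fg IoU = 1/2, bg IoU = 2/3, so mIoU = 7/12
print(miou(pred, gt))
```

The train score reported above is this quantity averaged over all 228 images.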
Big-picture progression: the project's reported baseline (KNN, k=11) achieved 0.499 mIoU. Switching from per-image scribble-only learning to a globally trained U-Net with full ground-truth supervision was the single biggest jump (+0.29). CutMix augmentation, model ensembling, and finally pseudo-labeling each added a few more points, for a total of 0.499 → 0.843.
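The KNN baseline can be sketched roughly as follows. This is an illustrative reconstruction, not the project's code: the feature vector here (row, col, pixel intensity) and the brute-force distance computation are assumptions; only k=11 comes from the text.

```python
import numpy as np

def knn_scribble_baseline(image, scribbles, k=11):
    """Per-image KNN baseline (sketch): label every pixel by the majority
    vote of its k nearest scribbled pixels in feature space.

    image:     (H, W) float array of pixel intensities.
    scribbles: (H, W) int array with 0 = bg scribble, 1 = fg scribble,
               -1 = unlabeled (most pixels).
    Features are assumed to be (row, col, intensity); the real baseline
    may use a richer feature set.
    """
    h, w = image.shape
    rr, cc = np.mgrid[0:h, 0:w]
    feats = np.stack([rr.ravel(), cc.ravel(), image.ravel()], axis=1).astype(float)
    labeled = scribbles.ravel() >= 0
    train_x, train_y = feats[labeled], scribbles.ravel()[labeled]
    # Brute-force distances from every pixel to every labeled pixel.
    d = np.linalg.norm(feats[:, None, :] - train_x[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]          # indices of k nearest scribbles
    votes = train_y[nn].mean(axis=1)           # fraction of fg neighbors
    return (votes >= 0.5).astype(int).reshape(h, w)
```

Because each image is solved independently from a handful of labeled pixels, this caps out well below the globally trained U-Net, which sees full masks across the whole training set.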