✂️ Semantic Segmentation
Introduction
Instead of predicting one label (cat, dog, etc.) per image, we will predict one label per pixel!
Each pixel should belong to a class (cat, dog, etc.) or to a background class.
Applications
- Autonomous driving
- Medicine
Representing the task
Similar to how we treat standard categorical values, we’ll create our target by one-hot encoding the class labels - essentially creating an output channel for each of the possible classes.
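As a minimal sketch of this encoding, assuming NumPy and a toy label map with three hypothetical classes (0 = background, 1 = cat, 2 = dog), the helper `one_hot_mask` below (an illustrative name, not from any library) turns an `(H, W)` map of class ids into a `(C, H, W)` target with one channel per class:

```python
import numpy as np


def one_hot_mask(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Turn an (H, W) integer label map into a (num_classes, H, W) one-hot target."""
    # Indexing the identity matrix by class id gives an (H, W, C) array
    # whose last axis is the one-hot vector of each pixel.
    one_hot = np.eye(num_classes, dtype=np.float32)[labels]
    # Move the class axis to the front: one output channel per class.
    return np.moveaxis(one_hot, -1, 0)


# Toy 2x3 label map (class ids are assumptions for illustration).
labels = np.array([[0, 1, 1],
                   [2, 2, 0]])
target = one_hot_mask(labels, num_classes=3)

print(target.shape)  # (3, 2, 3)
print(target[1])     # channel of class 1: 1.0 where the pixel is labelled 1
```

Every pixel has exactly one active channel, so taking the argmax over the channel axis recovers the original label map.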
Models
Note that the model backbone can be a resnet, densenet, inception…
Naive model: Convolutions + Transpose Convolutions (stride=2)
Better model: Convs + TransposeConvs (stride=2) + skip connections between encoder and decoder = U-Net
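To see why stride 2 matters in both models, here is a small shape-bookkeeping sketch in plain Python. The kernel sizes and paddings (3x3 convolutions with padding 1, 2x2 transposed convolutions with no padding) and the 256-pixel input are assumptions chosen for illustration, not values from these notes. The two formulas are the standard output-size rules for convolutions and transposed convolutions with dilation 1.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def transpose_conv_out(size: int, kernel: int = 2, stride: int = 2,
                       padding: int = 0, output_padding: int = 0) -> int:
    """Spatial output size of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


size = 256  # assumed input height/width

# Encoder: each stride-2 convolution halves the resolution.
encoder_sizes = [size]
for _ in range(3):
    size = conv_out(size)
    encoder_sizes.append(size)

# Decoder: each stride-2 transposed convolution doubles it again.
decoder_sizes = [size]
for _ in range(3):
    size = transpose_conv_out(size)
    decoder_sizes.append(size)

print(encoder_sizes)  # [256, 128, 64, 32]
print(decoder_sizes)  # [32, 64, 128, 256]
```

Because the decoder revisits the encoder resolutions in reverse order, each decoder stage has an encoder feature map of the same size to connect to, which is what makes the U-Net-style connections possible.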
History of the state of the art
Name | Description | Date | Instances |
---|---|---|---|
FCN | Fully Convolutional Network | 2014 | |
SegNet | Encoder-decoder | 2015 | |
Unet | Concatenate like a densenet | 2015 | |
DeepLab | Atrous Convolution and CRF | 2016 | |
ENet | Real-time video segmentation | 2016 | |
PSPNet | Pyramid Scene Parsing Net | 2016 | |
FPN | Feature Pyramid Networks slides | 2016 | Yes |
DeepLabv3 | Increasing dilation & field-of-view | 2017 | |
LinkNet | Adds like a resnet | 2017 | |
DeepLabv3+ | | 2018 | |
PANet | Path Aggregation Network | 2018 | Yes |
Panoptic FPN | Panoptic Feature Pyramid Networks | 2019 | ? |
PointRend | Image Segmentation as Rendering | 2019 | ? |
Post-processing (OPTIONAL)
- Conditional Random Fields (CRF)
- Grabcut
Metrics and losses
- Pixel-wise cross entropy
- IoU (Jaccard index):
  $\mathrm{IoU} = \dfrac{|Pred \cap GT|}{|Pred \cup GT|} = \dfrac{TP}{TP + FP + FN}$
- Dice (F1):
  $\mathrm{Dice} = \dfrac{2\,|Pred \cap GT|}{|Pred| + |GT|} = \dfrac{2\,TP}{2\,TP + FP + FN}$
  - Ranges from 0 (worst) to 1 (best).
  - To formulate a loss function that can be minimized, we simply use $1 - \mathrm{Dice}$.
[Figures: pixel-wise cross entropy; Dice loss]
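The metrics and losses above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the function names are hypothetical, masks are binary, two empty masks are scored as a perfect match (1.0) by convention, and the small `eps` terms are added only for numerical stability. The Dice loss is computed on predicted probabilities (a "soft" Dice) so that it stays differentiable.

```python
import numpy as np


def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union of two boolean masks: TP / (TP + FP + FN)."""
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # Assumed convention: two empty masks count as a perfect match.
    return float(intersection / union) if union else 1.0


def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice coefficient of two boolean masks: 2*TP / (2*TP + FP + FN)."""
    intersection = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return float(2 * intersection / total) if total else 1.0


def soft_dice_loss(probs: np.ndarray, gt: np.ndarray, eps: float = 1e-6) -> float:
    """1 - Dice, computed on foreground probabilities instead of hard masks."""
    intersection = (probs * gt).sum()
    return float(1.0 - (2.0 * intersection + eps) / (probs.sum() + gt.sum() + eps))


def pixel_cross_entropy(probs: np.ndarray, one_hot: np.ndarray,
                        eps: float = 1e-12) -> float:
    """Mean over pixels of -log p(true class); both inputs are (C, H, W)."""
    return float(-(one_hot * np.log(probs + eps)).sum(axis=0).mean())


# Toy example with TP = 1, FP = 1, FN = 1.
gt = np.array([[1, 1, 0],
               [0, 0, 0]], dtype=bool)
pred = np.array([[1, 0, 0],
                 [1, 0, 0]], dtype=bool)

print(iou(pred, gt))   # 1 / (1 + 1 + 1)
print(dice(pred, gt))  # 2 / (2 + 1 + 1)
```

On the same pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU), so they always rank predictions in the same order; Dice simply penalizes a single mistake less heavily than IoU does.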
Notebook: CAMVID dataset
References
- Blog: An overview of semantic image segmentation
- Image Segmentation Using Deep Learning: A Survey Nov 2020
- https://www.jeremyjordan.me/semantic-segmentation
- https://www.jeremyjordan.me/evaluating-image-segmentation-models
- Check Res2Net
- Check catalyst segmentation tutorial (Ranger opt, albumentations, …)
- this repo