👀 Image recognition

What is an Image

💾 Formats

🐇 (Fast) Dataloaders

FFCV is a dataloader that dramatically increases data throughput during model training.

Related work: DeepSpeed

🔨 Image preprocessing

Normalization

  1. Mean subtraction: Center the data at zero. x = x - x.mean()
  2. Standardize: Put the data on the same scale. x = x / x.std()
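
The two steps above can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: the function name normalize, the eps guard against division by zero, and the random test image are assumptions added here.

```python
import numpy as np

def normalize(x: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Center x at zero and scale it to unit standard deviation."""
    x = x.astype(np.float64)
    x = x - x.mean()          # 1. mean subtraction
    x = x / (x.std() + eps)   # 2. standardization (eps is an added safeguard)
    return x

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3))  # hypothetical random RGB image
normalized = normalize(image)
```

In practice the statistics are often computed per channel, e.g. with x.mean(axis=(0, 1)), rather than over the whole array.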

PCA and Whitening

  1. Mean subtraction: Center the data at zero. x = x - x.mean()
  2. Decorrelation or PCA: Rotate the data onto the eigenvectors of its covariance matrix so that the features are no longer correlated.
  3. Whitening: Put the data on the same scale. whitened = decorrelated / np.sqrt(eigVals + 1e-5)
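
A minimal NumPy sketch of these three steps, assuming the data is arranged as a matrix X with one flattened image per row. The function name pca_whiten, the per-feature mean (axis=0), and the synthetic test data are assumptions for illustration; eigVals and the 1e-5 term follow the formula above.

```python
import numpy as np

def pca_whiten(X: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """PCA-whiten X of shape (n_samples, n_features)."""
    X = X - X.mean(axis=0)                   # 1. mean subtraction, per feature
    cov = X.T @ X / X.shape[0]               # feature covariance matrix
    eigVals, eigVecs = np.linalg.eigh(cov)   # eigh, since cov is symmetric
    decorrelated = X @ eigVecs               # 2. rotate onto the principal axes
    whitened = decorrelated / np.sqrt(eigVals + eps)  # 3. rescale each axis
    return whitened

# Hypothetical correlated data: Gaussian noise passed through a mixing matrix.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0, 0.0],
                   [0.0, 1.0, 0.3, 0.0],
                   [0.0, 0.0, 1.5, 0.2],
                   [0.0, 0.0, 0.0, 1.0]])
X = rng.normal(size=(500, 4)) @ mixing
whitened = pca_whiten(X)
cov_whitened = whitened.T @ whitened / whitened.shape[0]
```

After whitening, cov_whitened should be close to the identity matrix: the features are uncorrelated and share the same scale.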

ZCA whitening (zero-phase component analysis) is a very similar process: after PCA whitening, the data is rotated back to the original axes.
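
As a rough sketch, ZCA whitening can be written as PCA whitening followed by a rotation back with the same eigenvectors, so the result stays as close as possible to the original data. The function name zca_whiten, the row-per-sample layout, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def zca_whiten(X: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """ZCA-whiten X of shape (n_samples, n_features)."""
    X = X - X.mean(axis=0)                   # center each feature
    cov = X.T @ X / X.shape[0]               # feature covariance matrix
    eigVals, eigVecs = np.linalg.eigh(cov)
    # Rotate onto the principal axes, rescale, then rotate back.
    W = eigVecs @ np.diag(1.0 / np.sqrt(eigVals + eps)) @ eigVecs.T
    return X @ W

# Hypothetical correlated data: two features that share a common component.
rng = np.random.default_rng(1)
base = rng.normal(size=(400, 1))
X_corr = np.hstack([base + 0.5 * rng.normal(size=(400, 1)),
                    base + 0.5 * rng.normal(size=(400, 1))])
zca = zca_whiten(X_corr)
cov_zca = zca.T @ zca / zca.shape[0]
```

For images, the practical difference from PCA whitening is that ZCA-whitened samples still look like the original images, which is why it is often preferred as a preprocessing step.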

Subtract Local Mean
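
One possible reading of this step, sketched with NumPy only: subtract from every pixel the mean of its k x k neighbourhood. The function name subtract_local_mean, the window size k = 5, the reflect padding, and the grayscale input are all assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def subtract_local_mean(img: np.ndarray, k: int = 5) -> np.ndarray:
    """Subtract the k x k neighbourhood mean from each pixel (k must be odd)."""
    img = img.astype(np.float64)
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")      # keep the output size equal
    windows = sliding_window_view(padded, (k, k))  # shape (H, W, k, k)
    local_mean = windows.mean(axis=(-2, -1))
    return img - local_mean

flat = np.full((16, 16), 7.0)        # constant image: result should be all zeros
flat_result = subtract_local_mean(flat)

rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(16, 16))  # hypothetical grayscale image
gray_result = subtract_local_mean(gray)
```

The result removes slow changes in brightness (for example uneven lighting) and keeps local detail.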

CLAHE: Contrast Limited Adaptive Histogram Equalization

Data Augmentation

Five library options:

  • Fastai
  • TIMM
  • Torchvision
  • Albumentations
  • Kornia

TIMM

https://medium.com/towards-data-science/getting-started-with-pytorch-image-models-timm-a-practitioners-guide-4e77b4bf9055

OpenCV basics

Write text

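import cv2

# Put text: draw resolution_str (a string) onto frame (a BGR image) in white.
# org is the bottom-left corner of the text, given as (x, y) in pixels.
cv2.putText(img       = frame,
            text      = resolution_str,
            org       = (50, 100),
            fontFace  = cv2.FONT_HERSHEY_SIMPLEX,
            fontScale = 1,
            color     = (255, 255, 255),
            thickness = 2)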

Reference