Contrastive Language-Image Pre-training (CLIP)
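CLIP connects text and images: it is trained with contrastive learning on text and image pairs at the same time, so that a matching caption and image end up close together in a shared embedding space. CLIP maps input text to an embedding vector that can be compared directly against image embeddings. This enables zero-shot classification, where candidate labels are written as text prompts and no task-specific training is needed, which greatly reduces the need for labeled images.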
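The zero-shot idea can be sketched as follows. This is only a minimal illustration, not CLIP's actual implementation: the embedding vectors are made-up 3-dimensional toy values standing in for the outputs of real text and image encoders, the function names (`cosine_similarity`, `zero_shot_classify`) are hypothetical, and the temperature of 0.07 is an assumed scaling value.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Pick the text prompt whose embedding is most similar to the image.

    text_embeddings maps each candidate prompt to its embedding vector.
    temperature is an assumed value that sharpens the softmax.
    """
    labels = list(text_embeddings)
    logits = [
        cosine_similarity(image_embedding, text_embeddings[label]) / temperature
        for label in labels
    ]
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], dict(zip(labels, probs))


# Toy embeddings (assumption): a real system would get these from
# CLIP's text encoder and image encoder.
text_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

label, probs = zero_shot_classify(image_embedding, text_embeddings)
print(label)
```

No labeled training images appear anywhere in this sketch: adding a new class only means adding a new text prompt, which is why the labeling requirement drops.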
CLIP Usages