CLIP: OpenAI's Multi-Modal Model

Contrastive Language-Image Pretraining (CLIP) is a zero-shot multi-modal model that learns directly from raw text about images. CLIP, which efficiently…
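To make the zero-shot idea concrete, here is a minimal sketch of CLIP-style zero-shot image classification using the Hugging Face transformers library (an assumption; the post does not name a toolkit). The checkpoint, image URL, and candidate labels are illustrative choices, not anything prescribed by the article.

```python
# A minimal sketch of zero-shot classification with CLIP,
# assuming the Hugging Face `transformers` library is installed.
# The checkpoint, image URL, and labels are illustrative only.
from PIL import Image
import requests
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Any RGB image works; this COCO validation image is just an example.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Candidate classes are expressed as natural-language prompts;
# no task-specific training is required.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
# Image-text similarity scores, turned into probabilities over the labels.
probs = outputs.logits_per_image.softmax(dim=1)
print(dict(zip(labels, probs[0].tolist())))
```

Because the class names are just text prompts, swapping in a new label set requires no retraining, which is what makes CLIP zero-shot.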
