- Continuous Bag of Words (CBOW) and Skip-gram are the two main model architectures used by Word2Vec
- Word embeddings are vector representations of words in a multi-dimensional space that capture the semantic meanings and relationships between words
- Words with similar meanings or contexts are represented by similar vectors
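A minimal sketch of the "similar words → similar vectors" idea, using cosine similarity on made-up 3-dimensional vectors (the values are illustrative, not from a trained model):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: close to 1.0 means similar direction
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Illustrative (made-up) embeddings: "cat" and "dog" point in similar directions,
# while "car" points somewhere else entirely
cat = np.array([0.8, 0.1, 0.3])
dog = np.array([0.7, 0.2, 0.4])
car = np.array([0.1, 0.9, -0.5])

print(cosine_similarity(cat, dog))  # high (close to 1)
print(cosine_similarity(cat, car))  # much lower
```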
Word2Vec
- Word2Vec is a shallow, two-layer neural network designed to learn distributed word representations by predicting words from their context
- produces dense, low-dimensional, continuous vector representations of words, unlike TF-IDF, which produces sparse, high-dimensional representations
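A quick sketch of the dense-vs-sparse contrast, assuming gensim and scikit-learn are available; the toy corpus and the 50-dimensional vector size are arbitrary illustrative choices:

```python
from gensim.models import Word2Vec
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# Word2Vec: every word gets a dense 50-dimensional vector
w2v = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)
print(w2v.wv["cat"].shape)   # (50,) -- dense, low-dimensional

# TF-IDF: each document becomes a sparse vector as wide as the whole vocabulary
tfidf = TfidfVectorizer()
X = tfidf.fit_transform([" ".join(s) for s in sentences])
print(X.shape)               # (3, vocab_size) -- sparse, high-dimensional
```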
CBOW
- Objective: Predicts a target word given its surrounding context words
- How it works: trains a simple neural network to maximize the probability of the correct target word given its neighbors; the model uses the context words surrounding the target word in order to predict it
- the input layer encodes the context words, and the hidden layer holds the dimensions of the word representations (the embedding size)
- Process:
- Define context window size around target word
- use context words within this window as input
- compute hidden layer representations
- predict the target word from the hidden representation (see the CBOW sketch at the end of this section)
- Example:
- "Today is a ____ day": with a window size of 3, the context words "today", "is", "a", and "day" are used to predict the word "pleasant"
- CBOW is efficient for large datasets and computationally less expensive than SkipGram
- It differs from traditional Bag-of-Words (BoW): BoW represents text as a collection of words and their frequencies without considering context, whereas CBOW is an actual neural network that captures context
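A minimal NumPy sketch of the CBOW forward pass described in the process steps above (look up context embeddings → average → predict), using the "Today is a ____ day" example; the vocabulary, embedding size, and random weights are purely illustrative and untrained:

```python
import numpy as np

# Toy vocabulary for the "Today is a pleasant day" example; indices are arbitrary
vocab = ["today", "is", "a", "pleasant", "day"]
word2idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8             # vocabulary size, embedding dimension (illustrative)

rng = np.random.default_rng(0)
W_in = rng.normal(size=(V, D))   # input->hidden weights: one embedding row per word
W_out = rng.normal(size=(D, V))  # hidden->output weights: scores over the vocabulary

def cbow_forward(context_words):
    # 1) look up the embeddings of the context words
    idxs = [word2idx[w] for w in context_words]
    # 2) hidden layer = average of the context embeddings
    h = W_in[idxs].mean(axis=0)
    # 3) output layer = softmax over the vocabulary
    scores = h @ W_out
    probs = np.exp(scores - scores.max())
    return probs / probs.sum()

# Predict the blank in "today is a ____ day" from its context words
probs = cbow_forward(["today", "is", "a", "day"])
print(vocab[int(np.argmax(probs))])  # weights are untrained, so the prediction is random
```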
SkipGram
- Objective: Predict the surrounding context words given the target word (the opposite of CBOW)
- The model takes the target word and tries to predict the words appearing nearby in the sentence
- trains the model to maximize the probability of the context words given a specific word in the middle of the context window
- the goal is to make the central word closer to its neighbors in vector space
- Ex: given "cat", the model tries to predict nearby words like "pet", "animal", "furry"
- Skip-gram works better than CBOW for learning representations of rare words and tends to produce more meaningful embeddings for them
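A hedged gensim sketch of Skip-gram training (sg=1 selects Skip-gram, sg=0 would be CBOW); the tiny corpus is made up, so the nearest neighbors of "cat" will only loosely resemble the "pet"/"animal" example above:

```python
from gensim.models import Word2Vec

# Tiny illustrative corpus; a real model needs far more text for useful neighbors
sentences = [
    ["the", "cat", "is", "a", "furry", "pet"],
    ["a", "dog", "is", "a", "loyal", "pet"],
    ["the", "cat", "is", "an", "animal"],
    ["a", "furry", "animal", "can", "be", "a", "pet"],
]

# sg=1 trains the Skip-gram architecture: predict context words from the target word
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=100)

# With enough data, neighbors of "cat" would include words like "pet" or "animal";
# on this toy corpus the ranking is noisy
print(model.wv.most_similar("cat", topn=3))
```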