Bigram Models

A bigram model is a language model that estimates the probability of a sequence of words by predicting each word from only the single word that immediately precedes it. The probability of a whole sentence then factorizes as P(w1, …, wn) ≈ P(w1) × P(w2 | w1) × … × P(wn | wn−1).

Example:

Consider the following sentence:

  • “I love machine learning.”

To build a bigram model from this sentence, we break it down into bigrams (pairs of adjacent words), as sketched in the short code example after the list:

  • (“I”, “love”)
  • (“love”, “machine”)
  • (“machine”, “learning”)
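
A minimal Python sketch of this step (the lowercase, strip-the-period tokenizer is a simplifying assumption for illustration; any tokenizer would do):

    def bigrams(sentence):
        # Naive tokenization: lowercase, drop the trailing period, split on spaces.
        tokens = sentence.lower().rstrip(".").split()
        # Pair each token with the token that follows it.
        return list(zip(tokens, tokens[1:]))

    print(bigrams("I love machine learning."))
    # [('i', 'love'), ('love', 'machine'), ('machine', 'learning')]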

How the Bigram Model Works:

  1. Training:
    • Suppose you have a corpus of text and you want to train a bigram model. You would count how often each bigram appears in the text.
    • For instance, in a large corpus, you might find that “I love” appears 100 times, “love machine” appears 50 times, and so on.
  2. Probability Calculation:
    • The bigram model calculates the probability of a word given the previous word.
    • For example, the probability of “love” given “I” would be:
    • P(love | I) = Count(“I love”) / Count(“I”)
    • If “I love” appears 100 times and “I” appears 200 times in the corpus, then:
    • P(love | I) = 100 / 200 = 0.5
  3. Sentence Generation:
    • To generate a new sentence, the model starts with a seed word and repeatedly uses the bigram probabilities to pick each next word, as shown in the sketch after this list.
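
Putting the three steps together, here is a minimal sketch in Python. The three-sentence toy corpus and the sampling loop are illustrative assumptions, not a prescribed recipe:

    import random
    from collections import Counter

    # Step 1, training: count unigrams and bigrams in a (toy, assumed) corpus.
    corpus = ["the cat sat", "the cat ran", "the dog sat"]
    unigram_counts = Counter()
    bigram_counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        unigram_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))

    # Step 2, probability: P(next | prev) = Count(prev next) / Count(prev).
    def prob(next_word, prev_word):
        return bigram_counts[(prev_word, next_word)] / unigram_counts[prev_word]

    print(prob("cat", "the"))  # 2 / 3 ≈ 0.67

    # Step 3, generation: start from a seed word and repeatedly sample the
    # next word in proportion to its bigram count.
    def generate(seed, max_len=8):
        words = [seed]
        while len(words) < max_len:
            followers = [(nxt, c) for (prev, nxt), c in bigram_counts.items()
                         if prev == words[-1]]
            if not followers:  # no known continuation, stop here
                break
            choices, weights = zip(*followers)
            words.append(random.choices(choices, weights=weights)[0])
        return " ".join(words)

    print(generate("the"))  # e.g. "the cat sat"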

Practical Example:

Given a small corpus:

  • “I love machine learning.”
  • “I love coding.”
  • “Coding is fun.”

Bigrams and their counts:

  • (“I”, “love”): 2
  • (“love”, “machine”): 1
  • (“machine”, “learning”): 1
  • (“love”, “coding”): 1
  • (“coding”, “is”): 1
  • (“is”, “fun”): 1

Using these counts, you can calculate the probability of each bigram, which the model uses to predict or generate text. For example, P(love | I) = 2/2 = 1.0, while P(machine | love) = 1/2 = 0.5, because “love” occurs twice but is followed by “machine” only once.
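
As a check, the same counting scheme from the earlier sketch reproduces these counts and prints every conditional probability (punctuation is dropped and tokens are lowercased, so “I” becomes “i”):

    from collections import Counter

    corpus = ["I love machine learning", "I love coding", "Coding is fun"]

    unigram_counts = Counter()
    bigram_counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        unigram_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))

    # P(next | prev) = Count(prev next) / Count(prev)
    for (prev, nxt), count in sorted(bigram_counts.items()):
        print(f"P({nxt} | {prev}) = {count}/{unigram_counts[prev]}"
              f" = {count / unigram_counts[prev]:.2f}")

Note that P(is | coding) = 1/2 rather than 1/1: “coding” occurs twice in the corpus but is followed by “is” only once.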
