Sentiment Analysis
Sentiment analysis with NLTK¶
# If the cell below returns an error,
# uncomment the following two lines and execute this cell
# import nltk
# nltk.download('vader_lexicon')
from nltk.sentiment import SentimentIntensityAnalyzer as SIA
sia = SIA()
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
sia.polarity_scores(s)
{'neg': 0.316, 'neu': 0.684, 'pos': 0.0, 'compound': -0.7506}
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
sia.polarity_scores(s)
{'neg': 0.083, 'neu': 0.552, 'pos': 0.366, 'compound': 0.8481}
neg + neu + pos = 1.0
compound varies between -1 and +1.
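Both properties can be checked against the two results shown above. The sketch below hard-codes those dictionaries so that it runs without NLTK. `proportions_sum` and `label_from_compound` are hypothetical helpers, and the ±0.05 cut-offs for turning `compound` into a label are a commonly used convention taken here as an assumption, not something this notebook prescribes.

```python
import math

# Results copied from the two polarity_scores calls above.
lazy = {'neg': 0.316, 'neu': 0.684, 'pos': 0.0, 'compound': -0.7506}
happy = {'neg': 0.083, 'neu': 0.552, 'pos': 0.366, 'compound': 0.8481}

def proportions_sum(scores):
    """Sum of the three proportion scores; should be about 1.0."""
    return scores['neg'] + scores['neu'] + scores['pos']

def label_from_compound(compound, threshold=0.05):
    """Hypothetical helper: map compound to a coarse label (threshold is an assumption)."""
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'

for scores in (lazy, happy):
    # Each value is rounded to three decimals, so allow a small tolerance.
    assert math.isclose(proportions_sum(scores), 1.0, abs_tol=0.005)
    assert -1.0 <= scores['compound'] <= 1.0
    print(label_from_compound(scores['compound']))
```

The tolerance matters: the second result sums to 1.001 because each value is rounded independently.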
res = sia.polarity_scores(s)
type(res)
dict
The data returned from polarity_scores is a dictionary (dict). More on dictionaries.
# Access an element of a dictionary:
print(res['pos'])
0.366
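Beyond looking up a single key, a few other dictionary operations are handy with this result. The following is a minimal sketch that hard-codes the result printed above so it runs on its own; the names `missing` and `dominant` are only illustrative.

```python
# Result copied from the second polarity_scores call above.
res = {'neg': 0.083, 'neu': 0.552, 'pos': 0.366, 'compound': 0.8481}

# Loop over all key/value pairs.
for key, value in res.items():
    print(f'{key}: {value}')

# .get() returns a default instead of raising KeyError for a missing key.
missing = res.get('subjectivity', None)

# Find which of the three proportion scores is largest.
dominant = max(('neg', 'neu', 'pos'), key=res.get)
print(dominant)
```

Here `dominant` is `'neu'`: even in the clearly positive sentence, most of the words carry no sentiment for the lexicon.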
Sentiment analysis with TextBlob¶
from textblob import TextBlob
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
blob = TextBlob(s)
print(blob.sentiment)
Sentiment(polarity=-0.05555555555555556, subjectivity=0.8333333333333334)
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
blob = TextBlob(s)
print(blob.sentiment)
print(blob.sentiment.polarity)
Sentiment(polarity=0.5777777777777778, subjectivity=0.7999999999999999)
0.5777777777777778
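`blob.sentiment` behaves like a named tuple with the fields `polarity` and `subjectivity`, so it can be read by attribute or unpacked. TextBlob documents polarity as a float in [-1.0, 1.0] and subjectivity as a float in [0.0, 1.0]. The sketch below uses a stand-in `namedtuple` with the values printed above so that it runs without TextBlob installed; with TextBlob available, `blob.sentiment` would take the place of `sentiment`.

```python
from collections import namedtuple

# Stand-in with the same field names as TextBlob's result (assumption for this sketch).
Sentiment = namedtuple('Sentiment', ['polarity', 'subjectivity'])
sentiment = Sentiment(polarity=0.5777777777777778, subjectivity=0.7999999999999999)

# Access by attribute or unpack like a tuple.
polarity, subjectivity = sentiment
print(round(sentiment.polarity, 3), round(subjectivity, 3))

# Sanity-check the documented value ranges.
assert -1.0 <= polarity <= 1.0
assert 0.0 <= subjectivity <= 1.0
```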
Hugging Face¶
Installation¶
The Hugging Face transformers library requires one of these machine learning frameworks: PyTorch, TensorFlow, or Flax.
# Installation with torch
pip install transformers[torch]
For Mac users (zsh treats unquoted square brackets as a glob pattern, so quote the argument):
pip install 'transformers[torch]'
Sentiment analysis with DistilBERT¶
Link to the model on Hugging Face
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
Quick tour through using Hugging Face
from transformers import pipeline
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
res = classifier(s)
print(res)
[{'label': 'NEGATIVE', 'score': 0.9923615455627441}]
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
res = classifier(s)
print(res)
[{'label': 'POSITIVE', 'score': 0.9940286874771118}]
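The classifier returns a list with one dictionary per input text, each holding a `label` and a `score`. The score expresses how confident the model is in that label, not how strong the sentiment is. As a rough sketch, the two fields can be folded into one signed number; `signed_score` is a hypothetical helper that only makes sense for a two-label model like this one, and the results are hard-coded from the output above so the block runs without downloading the model.

```python
# Results copied from the two classifier calls above.
results = [
    [{'label': 'NEGATIVE', 'score': 0.9923615455627441}],
    [{'label': 'POSITIVE', 'score': 0.9940286874771118}],
]

def signed_score(result):
    """Hypothetical helper: positive score for POSITIVE, negative for NEGATIVE."""
    top = result[0]  # a single input string yields a list with one entry
    sign = 1.0 if top['label'].upper() == 'POSITIVE' else -1.0
    return sign * top['score']

for result in results:
    print(round(signed_score(result), 4))
```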
Sentiment analysis with FinBERT¶
(Trained to analyze sentiment on financial text.)
Link to the model on Hugging Face
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
model = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert")
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
res = classifier(s)
print(res)
[{'label': 'neutral', 'score': 0.8497675061225891}]
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
res = classifier(s)
print(res)
[{'label': 'neutral', 'score': 0.8964928388595581}]
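The two models do not share a label set: DistilBERT reports upper-case `POSITIVE`/`NEGATIVE`, while FinBERT reports lower-case labels and adds `neutral`. A minimal sketch for comparing them side by side, using results hard-coded from the outputs in this notebook (the dictionary keys `distilbert` and `finbert` are just illustrative names):

```python
# Results for the "enthusiastic programmer" sentence, copied from the outputs above.
outputs = {
    'distilbert': [{'label': 'POSITIVE', 'score': 0.9940286874771118}],
    'finbert': [{'label': 'neutral', 'score': 0.8964928388595581}],
}

# Lower-casing makes the label sets comparable across the two models.
labels = {name: out[0]['label'].lower() for name, out in outputs.items()}
print(labels)
```

The same sentence is rated positive by the general-purpose model and neutral by the finance-specific one, which is consistent with FinBERT having been trained on financial text.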
Writing text to please a sentiment analyzer¶
For example, using »dax« instead of »fox« raises the FinBERT score for the positive label from 0.72 to 0.81.
s = '''The market value of the enthusiastic programmer jumps over the increased
brown dax jumps high over the happy programmer jumps over the top of the overjoyed fire fox.'''.replace('\n', ' ')
res = classifier(s)
print(res)
[{'label': 'positive', 'score': 0.8132365345954895}]