Sentiment Analysis

Sentiment analysis with NLTK

# If the cell below returns an error,
# uncomment the following two lines and execute this cell

# import nltk
# nltk.download('vader_lexicon')
from nltk.sentiment import SentimentIntensityAnalyzer as SIA
sia = SIA()
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
sia.polarity_scores(s)
{'neg': 0.316, 'neu': 0.684, 'pos': 0.0, 'compound': -0.7506}
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
sia.polarity_scores(s)
{'neg': 0.083, 'neu': 0.552, 'pos': 0.366, 'compound': 0.8481}

The three scores sum to 1.0 (up to rounding): neg + neu + pos = 1.0

The compound score varies between -1 (most negative) and +1 (most positive).
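
A quick way to see these ranges in action is to score two short, unambiguous sentences (a sketch; the sentences are made up here, and only the sign of compound is asserted, not exact values):

# Clearly positive and clearly negative examples
print(sia.polarity_scores('I absolutely love this wonderful library!'))  # compound should be clearly positive
print(sia.polarity_scores('I hate this terrible, broken mess.'))         # compound should be clearly negative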

res = sia.polarity_scores(s)
type(res)
dict

The data returned from polarity_scores is a dictionary (dict). More on dictionaries.

# Access an element of a dictionary:
print(res['pos'])
0.366
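
Since the result is an ordinary dict, the usual dictionary operations apply. A small sketch that loops over all key/value pairs and checks that neg, neu, and pos add up to (approximately) 1.0:

# Iterate over all key/value pairs of the result dictionary
for key, value in res.items():
    print(key, value)

# neg + neu + pos should add up to 1.0 (up to rounding)
print(res['neg'] + res['neu'] + res['pos'])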

Sentiment analysis with TextBlob

from textblob import TextBlob
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
blob = TextBlob(s)
print(blob.sentiment)
Sentiment(polarity=-0.05555555555555556, subjectivity=0.8333333333333334)
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
blob = TextBlob(s)
print(blob.sentiment)
print(blob.sentiment.polarity)
Sentiment(polarity=0.5777777777777778, subjectivity=0.7999999999999999)
0.5777777777777778
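
TextBlob's polarity ranges from -1 to +1 and its subjectivity from 0 (objective) to 1 (subjective). A blob can also be split into sentences, each with its own sentiment; a short sketch with a made-up two-sentence text:

# Per-sentence sentiment (illustrative text)
text = 'The overjoyed programmer loves the quick brown fox. The lazy fire fox is boring.'
blob2 = TextBlob(text)
for sentence in blob2.sentences:
    print(sentence, sentence.sentiment.polarity)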

Hugging Face

Installation

Docs

Hugging Face Transformers requires one of these machine learning frameworks: PyTorch, TensorFlow, or Flax.

# Installation with torch
pip install transformers[torch]

For macOS users (the default zsh shell requires the brackets to be quoted):

pip install 'transformers[torch]'
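
To check that the installation worked, importing the packages and printing their versions is usually enough (a minimal sketch):

# Verify that torch and transformers can be imported
import torch
import transformers

print(torch.__version__)
print(transformers.__version__)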

Sentiment analysis with DistilBERT

Link to the model on Hugging Face

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
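
The tokenizer turns raw text into the tensors the model expects. A short sketch of what it produces for one of the example sentences (assuming PyTorch as the installed backend):

s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
encoded = tokenizer(s, return_tensors="pt")
print(encoded['input_ids'].shape)   # token IDs as a PyTorch tensor
print(tokenizer.tokenize(s)[:10])   # first ten word-piece tokens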

Quick tour through using Hugging Face

from transformers import pipeline

classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
res = classifier(s)
print(res)
[{'label': 'NEGATIVE', 'score': 0.9923615455627441}]
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
res = classifier(s)
print(res)
[{'label': 'POSITIVE', 'score': 0.9940286874771118}]
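
The pipeline wraps the tokenizer and the model. Roughly the same result can be obtained by hand for the current example sentence s (a sketch, assuming PyTorch as the backend):

import torch

# Tokenize, run the model, and turn the logits into probabilities
inputs = tokenizer(s, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)
print(model.config.id2label)   # mapping from class index to label
print(probs)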

Sentiment analysis with FinBERT

(Trained to analyze sentiment on financial text.)

Link to the model on Hugging Face

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")

model = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert")
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
s = 'The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.'
res = classifier(s)
print(res)
[{'label': 'neutral', 'score': 0.8497675061225891}]
s = 'The enthusiastic programmer jumps over the quick brown fox jumps over the happy programmer jumps over the overjoyed fire fox.'
res = classifier(s)
print(res)
[{'label': 'neutral', 'score': 0.8964928388595581}]
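
FinBERT is tuned for financial language, which is why both fox sentences come out neutral. The pipeline also accepts a list of strings, so several texts can be classified in one call; a sketch with two hypothetical finance-flavored sentences (outputs not shown):

# Classify a list of sentences in one call (illustrative sentences)
sentences = [
    'Quarterly revenue rose sharply and the outlook was raised.',
    'The company reported a steep loss and cut its dividend.',
]
for r in classifier(sentences):
    print(r)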

Writing text to please a sentiment analyzer

For example, writing »dax« instead of »fox« raises the positive score from 0.72 to 0.81.

s = '''The market value of the enthusiastic programmer jumps over the increased
brown dax jumps high over the happy programmer jumps over the top of the overjoyed fire fox.'''.replace('\n', ' ')
res = classifier(s)
print(res)
[{'label': 'positive', 'score': 0.8132365345954895}]
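
A small sketch that makes this kind of experiment repeatable: score several variants of the same sentence that differ in a single word (the variant words below are just illustrative choices):

# Compare the classifier's verdict for single-word variations
template = 'The market value of the enthusiastic programmer jumps over the increased brown {} jumps high over the happy programmer.'
for word in ['fox', 'dax', 'index']:
    print(word, classifier(template.format(word)))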