The Black Book: A Sentiment Analysis from X’s Standpoint

Oct 26, 2023

Background

About The Black Book

Before delving into the analysis, it’s essential to understand the film’s context and any unique aspects that might impact its reception.

The Black Book is a Nigerian crime thriller produced and directed by Editi Effiong and released on Netflix on 22 September 2023. It stars Richard Mofe-Damijo, Sam Dede, Shaffy Bello, Femi Branch, Alex Usifo, Ade Laoye, Ireti Doyle, and Olumide Oworu, among many others.

The film’s storyline, which revolves around the relentless quest for justice by Paul Edima, a former military guerrilla portrayed by Nollywood icon Richard Mofe-Damijo (RMD), has captured the hearts of audiences worldwide.

“The Black Book” delves into themes of redemption and the lingering shadows of the past, making it a must-watch for lovers of gripping cinematic narratives.

Origin of the Analysis

After the release of an exceptionally captivating trailer, a lively discussion ignited on X. It revolved around the perceived decline in film production within the Nigerian movie industry (Nollywood), with some asserting that “The Black Book” might serve as a catalyst for a return to Nollywood’s former glory.

This piqued my curiosity and prompted me to explore how the public received this action thriller on X, in particular the prevailing sentiment and overall mood the film generated.

This analysis covers the most engaging day, the most popular cast member, the most active contributors, and the overall sentiment towards the film.

Use case

An analysis of this kind is useful to any company with a social media presence: it predicts customer sentiment automatically, for example whether customers are satisfied or dissatisfied.

This removes the need for staff to painstakingly review large volumes of tweets and customer reviews.

Tools used for the Analysis

Python: for data collection and transformation.

Microsoft Power BI: for data visualization.

Data Collection

I collected a total of 2,222 posts from X between September 22, 2023 and October 17, 2023, using the Python libraries Selenium and Beautiful Soup.

Selenium managed interactions such as login and search queries, while Beautiful Soup parsed the HTML, identified relevant elements by their attributes, and extracted the desired information.

The data mined for each post comprised the username, tweet_id, tweet_text, iso_8601_timestamp, reply count, retweet count, and like count. Posts were found using relevant keywords (The Black Book Nigeria, The Black Book Police, The Black Book Corruption) and hashtags (#theblackbook, #theblackbooknetflix, #blackbook, #theblackbookmovie).

Data Cleaning

The data required no cleaning. It was collected in a Python dictionary in which each tweet was stored under a key made up of the tweeter’s username and the tweet ID, so a post encountered more than once during collection could not be stored twice.
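To illustrate why keying on username and tweet ID guards against duplicates, here is a minimal sketch. The `make_key` and `collect` helpers, the key format, and the sample records are all hypothetical and are not taken from the project’s code.

```python
# Hypothetical sketch: keying each post on username + tweet id means a post
# scraped twice overwrites its earlier copy instead of being stored again.

def make_key(username: str, tweet_id: str) -> str:
    # Assumed key format; the original project may combine the fields differently.
    return f"{username}_{tweet_id}"


def collect(records: list) -> dict:
    posts = {}
    for record in records:
        posts[make_key(record["username"], record["tweet_id"])] = record
    return posts


# Made-up records: the first two represent the same post seen on two passes.
scraped = [
    {"username": "user_a", "tweet_id": "101", "tweet_text": "Loved it"},
    {"username": "user_a", "tweet_id": "101", "tweet_text": "Loved it"},
    {"username": "user_b", "tweet_id": "102", "tweet_text": "Great cast"},
]

posts = collect(scraped)
print(len(scraped), "scraped ->", len(posts), "unique")
```

Note that this only removes exact repeats of the same post; two different posts with identical text would still be kept as separate entries.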

The data appeared to be free from discrepancies, anomalies, and data quality issues like missing values, duplicates, or outliers.

Data Preprocessing

To ensure the quality of the text data, I preprocessed it with natural language processing (NLP) techniques in the following steps:

Case conversion: Converting words to lowercase so that the same word is treated consistently.
Tokenization: Splitting text into words or units.
Removing stop words
Removing punctuation
Lemmatization: Reducing words to their base form.

import string

from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

# Requires the NLTK tokenizer, stopwords and wordnet data (see nltk.download).
STOP_WORDS = set(stopwords.words('english'))  # built once, not once per token
lemmatizer = WordNetLemmatizer()

def preprocess_text(text: str):
    # Return the cleaned text as a single space-separated string.
    lemmatized_tokens = preprocess_text_as_tokens(text)
    return ' '.join(lemmatized_tokens)

def preprocess_text_as_tokens(text: str):
    # Lowercase and tokenize, drop stop words and punctuation, then lemmatize.
    tokens = word_tokenize(text.lower())
    filtered_tokens = [token for token in tokens if token not in STOP_WORDS and token not in string.punctuation]
    return [lemmatizer.lemmatize(token) for token in filtered_tokens]

Sentiment Analysis

Sentiment analysis (or opinion mining) is a natural language processing (NLP) technique for identifying and classifying the emotional tone of different kinds of text. (epamSolutionsHub)

This type of analysis is used to determine whether a given text expresses a negative, positive, or neutral mood.

The Valence Aware Dictionary and sEntiment Reasoner (VADER) implementation in the Natural Language Toolkit (NLTK) was used to score each tweet for positivity, negativity, and neutrality.

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')  # one-time download of the lexicon VADER relies on

I used a simple classification rule for the analysis: a tweet was labelled positive (1) if its positivity score or its neutrality score was at least 0.5, and negative (0) otherwise.

analyzer = SentimentIntensityAnalyzer()

def get_sentiment(text: str):
    # Return 1 (positive) when the 'pos' or 'neu' score is at least 0.5, else 0 (negative).
    scores = analyzer.polarity_scores(text)
    if scores['pos'] >= 0.5 or scores['neu'] >= 0.5:
        return 1
    return 0
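To make the labelling rule concrete, the sketch below applies the same thresholds to three hand-written score dictionaries. The `label_from_scores` helper and the numbers are hypothetical; they are invented for illustration and are not real VADER output.

```python
# Hypothetical illustration of the labelling rule on hand-written score
# dictionaries. These numbers are invented, not real VADER output.

def label_from_scores(scores: dict) -> int:
    # Same thresholds as get_sentiment: 1 if 'pos' or 'neu' is at least 0.5, else 0.
    return 1 if scores["pos"] >= 0.5 or scores["neu"] >= 0.5 else 0


examples = {
    "mostly positive": {"neg": 0.0, "neu": 0.4, "pos": 0.6},
    "mostly neutral": {"neg": 0.3, "neu": 0.7, "pos": 0.0},
    "mostly negative": {"neg": 0.6, "neu": 0.3, "pos": 0.1},
}

labels = {name: label_from_scores(scores) for name, scores in examples.items()}
print(labels)
```

Under this rule, a post receives the negative label only when neither its positivity nor its neutrality score reaches 0.5, so mostly neutral posts are grouped with the positive ones.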

Data Visualization

The first step was to save the data as JSON, a structured format that is both human-readable and machine-readable.


import json

def load_to_json(data, file_name: str):
    # Despite its name, this writes `data` to '<file_name>.json'.
    with open(file_name + '.json', 'w') as outfile:
        json.dump(data, outfile)

The JSON file was then imported into Power BI to visualize the data and communicate my findings.

Additionally, I renamed my tweets and retweets columns to “posts” and “reposts” respectively.

Insights/Findings

The analysis revealed that Sunday had more posts than any other day of the week. This is plausible, as Sunday is typically a non-working day for many people.
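A day-of-week count like this can be derived from the iso_8601_timestamp field. The sketch below is a minimal illustration in Python rather than the Power BI steps I used; the timestamps are made up, and the weekday is computed in UTC, which is an assumption.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Made-up timestamps in an ISO 8601 style; the real field format may differ.
timestamps = [
    "2023-09-24T10:15:00.000Z",  # a Sunday
    "2023-09-24T18:40:00.000Z",  # a Sunday
    "2023-09-25T09:05:00.000Z",  # a Monday
]


def weekday_name(iso_timestamp: str) -> str:
    # Swap the trailing 'Z' for an explicit offset so that
    # datetime.fromisoformat also accepts it on older Python versions.
    parsed = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    return DAY_NAMES[parsed.weekday()]  # weekday(): Monday is 0


posts_per_day = Counter(weekday_name(ts) for ts in timestamps)
busiest_day, busiest_count = posts_per_day.most_common(1)[0]
print(busiest_day, busiest_count)
```

Posts made close to midnight could fall on a different weekday in local Nigerian time, so converting to the audience’s time zone before counting may shift a few posts.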

Overall, 94% of posts were classified as positive and 6% as negative, which suggests that the film was received positively by its viewers.
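The percentages follow directly from the 1/0 labels assigned to each post. As a minimal sketch, with a hypothetical `sentiment_share` helper and a made-up list of labels rather than the project’s data:

```python
def sentiment_share(labels: list) -> tuple:
    # Return (positive %, negative %) for a list of 1/0 sentiment labels.
    total = len(labels)
    if total == 0:
        return (0.0, 0.0)
    positive = sum(labels)
    negative = total - positive
    return (round(100 * positive / total, 1), round(100 * negative / total, 1))


# Made-up labels for four posts: three positive, one negative.
sample_labels = [1, 1, 1, 0]
positive_pct, negative_pct = sentiment_share(sample_labels)
print(positive_pct, negative_pct)
```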

The most prolific contributor during the analysis timeframe, with a total of 27 tweets, was Editi Effiong, the film’s producer. This is unsurprising given his significant role in the production of the film.

The post with the highest engagement in terms of likes and reposts was by Mr. Peter Obi, in which he showered praise upon the film’s editor. This post reached over one million views, 29.2k likes, and 7,040 reposts on X.
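Both the most active contributor and the most engaging post can be found with a few lines of Python. This is a sketch under stated assumptions: the posts are made up, the underscore field names (`like_count`, `retweet_count`) are assumed from the fields listed under Data Collection, and defining engagement as likes plus reposts is my simplification here.

```python
from collections import Counter

# Made-up posts; field names are assumed, not confirmed from the dataset.
posts = [
    {"username": "user_a", "like_count": 120, "retweet_count": 30},
    {"username": "user_b", "like_count": 900, "retweet_count": 250},
    {"username": "user_a", "like_count": 15, "retweet_count": 2},
]

# Most active contributor: the username that appears on the most posts.
top_user, top_user_posts = Counter(p["username"] for p in posts).most_common(1)[0]


def engagement(post: dict) -> int:
    # Assumed engagement measure: likes plus reposts.
    return post["like_count"] + post["retweet_count"]


# Most engaging post: the one with the highest engagement value.
most_engaged = max(posts, key=engagement)

print(top_user, top_user_posts)
print(most_engaged["username"], engagement(most_engaged))
```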

Finally, Richard Mofe-Damijo, frequently referred to as RMD, was the most prominent cast member, largely because he plays the film’s central character.

Conclusion

In a nutshell, the film enjoyed a strong social media presence in the initial phase of its release, followed by a noticeable decline. Interestingly, this decline had no adverse effect on viewers’ sentiment towards the film, which remained positive.

I appreciate your time spent reading my piece.

To learn more about this project, visit my GitHub