

Project Description:

In this project, students will implement a Naive Bayes Classifier (NBC) for sentiment analysis on a dataset containing reviews and their respective star ratings. The datasets, "train.csv" and "test.csv", will be provided. A review with a 5-star rating will be considered positive, while all other ratings will be considered negative. Do not use any publicly available code; your code will be checked against public implementations and AI-generated code.

The project consists of three tasks:

Task 1: Feature Selection (10 points)

• Students will preprocess "train.csv" and select the top 1000 words (by frequency) as word features for their model. All other words will be ignored.

• Please print out the top 20-50 words from the selected features.

• Preprocessing Guideline:

a. Convert all text to lowercase.

b. Remove special characters.

c. Tokenize the text into words.

d. Remove stop words.
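The preprocessing steps and frequency-based selection above could be sketched roughly as follows, using only the Python standard library. The stop-word set, the helper names `preprocess` and `select_features`, and the two sample reviews are illustrative assumptions; the brief does not name a stop-word list or the CSV column layout, so loading the reviews from "train.csv" is left out.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word set. The brief does not specify
# which stop-word list to use, so a real solution would need a fuller one.
STOP_WORDS = {"a", "an", "the", "and", "or", "is", "it", "this", "that",
              "to", "of", "in", "for", "was", "i"}


def preprocess(text):
    """Apply steps a-d: lowercase, strip special characters, tokenize, drop stop words."""
    text = text.lower()                        # a. lowercase
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # b. replace special characters with spaces
    tokens = text.split()                      # c. whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # d. stop-word removal


def select_features(reviews, n_features=1000):
    """Return the n_features most frequent words across all reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(preprocess(review))
    return [word for word, _ in counts.most_common(n_features)]


# Hypothetical sample reviews standing in for the text column of train.csv.
reviews = [
    "The food was GREAT, great service!",
    "Terrible food; it was cold and the service was slow.",
]
vocab = select_features(reviews, n_features=3)
print(vocab[:20])  # Task 1 asks for the top 20-50 selected words to be printed
```

Replacing special characters with spaces rather than deleting them keeps words separated when punctuation sits between them without whitespace.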

Task 2: Model Training and Evaluation (15 points)

• Using "train.csv" and "test.csv", students will train and evaluate their Naive Bayes Classifier with Laplace smoothing.

o Laplace Smoothing: Implement Laplace smoothing in the parameter estimation. For an attribute Xi with k values, the Laplace correction adds 1 to the numerator and k to the denominator of the maximum likelihood estimate.

o Evaluation measure: Accuracy

• Please describe your observations and provide an analysis of your model's performance.
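One possible reading of the Laplace correction above is sketched below. It assumes each selected word is a binary attribute (present or absent in a review), so k = 2 and the smoothed estimate is (count + 1) / (class size + 2). The brief does not fix the feature encoding, and the function names and toy data are hypothetical.

```python
import math


def train_nbc(docs, labels, vocab):
    """Estimate log-priors and Laplace-smoothed P(word present | class).

    docs: list of token lists; labels: list of class labels (e.g. 1 = positive,
    0 = negative); vocab: list of feature words.
    Assumption: each word is a binary attribute, so k = 2.
    """
    classes = sorted(set(labels))
    n_docs = {c: 0 for c in classes}
    n_present = {c: {w: 0 for w in vocab} for c in classes}
    for tokens, y in zip(docs, labels):
        n_docs[y] += 1
        for w in set(tokens):          # count each word at most once per review
            if w in n_present[y]:
                n_present[y][w] += 1

    k = 2  # number of values each binary attribute can take
    model = {}
    for c in classes:
        log_prior = math.log(n_docs[c] / len(docs))
        # Laplace correction: add 1 to the numerator and k to the denominator.
        p_present = {w: (n_present[c][w] + 1) / (n_docs[c] + k) for w in vocab}
        model[c] = (log_prior, p_present)
    return model


def predict(model, tokens):
    """Return the class with the highest log-posterior score."""
    present = set(tokens)
    best_class, best_score = None, -math.inf
    for c, (log_prior, p_present) in model.items():
        score = log_prior
        for w, p in p_present.items():
            score += math.log(p if w in present else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


def accuracy(model, docs, labels):
    """Fraction of reviews whose predicted class matches the true label."""
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


# Toy data (hypothetical): already-tokenized reviews with 1 = positive, 0 = negative.
toy_docs = [["great", "food"], ["great", "service"],
            ["cold", "food"], ["slow", "service"]]
toy_labels = [1, 1, 0, 0]
toy_vocab = ["great", "food", "service", "cold", "slow"]
model = train_nbc(toy_docs, toy_labels, toy_vocab)
print(predict(model, ["great", "pizza"]), accuracy(model, toy_docs, toy_labels))
```

Summing log-probabilities instead of multiplying raw probabilities avoids numerical underflow when there are 1000 features, and the smoothing ensures no estimate is exactly 0 or 1, so every logarithm is defined.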

Task 3: Learning Curve Analysis (5 points)

• Students will plot a learning curve by varying the amount of training data used [10%, 30%, 50%, 70%, 100%]. The testing set will remain unchanged.

• For this plotting task only, students may use external plotting packages such as Matplotlib.

• Students will describe their observations and provide an analysis of the learning curve.
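The learning-curve procedure above might be organized as in the following sketch. The `fit` and `score` arguments are hypothetical callables standing in for whatever training and accuracy functions a solution defines. Taking the first n rows assumes the training data is already in random order, which the brief does not state.

```python
def learning_curve(train_docs, train_labels, test_docs, test_labels,
                   fit, score, fractions=(0.1, 0.3, 0.5, 0.7, 1.0)):
    """Return (training-set sizes, test accuracies), one entry per fraction.

    fit(docs, labels) -> model and score(model, docs, labels) -> float are
    supplied by the caller. The test set is the same for every fraction.
    """
    sizes, accuracies = [], []
    for frac in fractions:
        # Assumption: rows are already shuffled, so a prefix is a fair sample.
        n = max(1, int(round(frac * len(train_docs))))
        model = fit(train_docs[:n], train_labels[:n])
        sizes.append(n)
        accuracies.append(score(model, test_docs, test_labels))
    return sizes, accuracies


def plot_learning_curve(sizes, accuracies, path="learning_curve.png"):
    """Plot test accuracy against training-set size and save the figure."""
    import matplotlib.pyplot as plt  # external package, allowed for plotting only

    plt.figure()
    plt.plot(sizes, accuracies, marker="o")
    plt.xlabel("Number of training reviews")
    plt.ylabel("Test accuracy")
    plt.title("Naive Bayes learning curve")
    plt.savefig(path)
    plt.close()
```

Calling `learning_curve(...)` with a solution's own training and accuracy functions, then passing the two returned lists to `plot_learning_curve`, would produce the figure for the report. The output file name is an arbitrary choice.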

Deliverables:

1. Python code implementation of the Naive Bayes Classifier.

2. README file for executing your code.

3. PDF report

πŸ“ Need Help With a Similar Assignment?

Our expert writers can deliver a 100% original, plagiarism-free paper tailored to your requirements with fast turnaround.

Get Professional Help Now →
WhatsApp
Limited Offer Get 25% off β€” use code BESTW25
No AI No Plagiarism On-Time Delivery Free Revisions
Claim Now