My Portfolio

Overview

The Tiny Language Classifier project grew out of my curiosity about how well Naive Bayes classifiers handle text classification. Language identification is a fundamental problem in natural language processing, and I wanted to see how far a simple model like Naive Bayes could go on it.

It is not a big project, but I think reaching for overly complex models to solve simple tasks is a common pitfall in machine learning these days.

Project Details

The model was trained and tested on a language identification dataset available on Hugging Face. The text was vectorized and fed into a Naive Bayes (MultinomialNB) classifier implemented with Scikit-Learn.
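A minimal sketch of that vectorize-then-classify pipeline could look like the code below. The details are assumptions for illustration only: the character n-gram CountVectorizer settings, the build_classifier helper name, and the handful of toy sentences standing in for the Hugging Face dataset are all hypothetical, not the project's actual configuration or data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-in data (hypothetical). The real project trains on a
# language identification dataset from Hugging Face instead.
train_texts = [
    "the cat sat on the mat",
    "where is the train station",
    "this is a very good book",
    "le chat est sur la table",
    "où est la gare s'il vous plaît",
    "c'est un très bon livre",
    "el gato está sobre la mesa",
    "dónde está la estación de tren",
    "este es un libro muy bueno",
]
train_labels = ["en", "en", "en", "fr", "fr", "fr", "es", "es", "es"]


def build_classifier():
    """Return an unfitted vectorizer + MultinomialNB pipeline.

    Assumption: character n-grams (1 to 3 characters, within word
    boundaries) are used as features. The original write-up only says
    the text is "vectorized", so this choice is illustrative.
    """
    return make_pipeline(
        CountVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
        MultinomialNB(),
    )


model = build_classifier()
model.fit(train_texts, train_labels)

# Predict the language of two unseen sentences.
samples = ["where is the good book", "où est le bon livre"]
predictions = model.predict(samples)
probabilities = model.predict_proba(samples)

for text, label in zip(samples, predictions):
    print(f"{text!r} -> {label}")
```

Wrapping both steps in a single pipeline keeps the vectorizer's vocabulary tied to the training data, so the same transformation is applied automatically at prediction time.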

Despite its simplicity, the model achieved over 90% accuracy on the test set.

Technologies Used

Python
Scikit-Learn
NumPy
Pandas
Hugging Face (datasets)
