202511191237 Status: idea Tags: Datascience, NLP
NLP Lemmatization
Lemmatization reduces words to their base form (lemma), considering meaning and POS. Unlike Stemming, it avoids producing non-dictionary words
Why?
- Improves accuracy : Treats words like ārunningā and āranā as the same.
- Reduces redundancy : Smaller, cleaner datasets for analysis and ML training.
- Better model performance : Consistent word forms improve context understanding
References
- Dit is iets wat we leren voor Datascience. dit was informatie vanuit avans 2-2 datascience 2025-11-12. en daarbij horen deze slides
- I was writing a note about NLP which mentions this.