202511171507 Status: idea Tags: Datascience
Natural Language Processing (NLP)
Natural Language Processing (NLP) is a branch of artificial intelligence (AI) that enables machines to understand and process human languages, either in text or audio form.
NLP is widely used for speech recognition, language translation and text summarization.
NLP can be divided into 2 categories:
Techniques
In general, the techniques are applied in roughly this order:
Tasks
NLP is mainly used for:
- Text generation
- Text summarization
- Machine translation
- Sentiment analysis
- Named entity recognition
- Text classification
Applications
What kinds of end products have NLP built in, for example?
- Voice assistants
- Speech-to-text
- Text analysis
- Chatbots
- Information retrieval
- Content recommendation
NLP vs NLU vs NLG
This table shows the differences between them.
| Aspect | NLP | NLG | NLU |
|---|---|---|---|
| Input | Raw or structured language | Structured data | Natural language text |
| Output | Structured or unstructured text | Human-readable text | Machine-readable meaning |
| Goal | Interpret & produce language | Generate natural-sounding text | Understand meaning & intent |
| Techniques | Parsing, tagging, vectorization | Templates, ML models, transformers | Syntax analysis, semantics, embeddings |
| Tasks | Translation, speech-to-text, summarization | Report writing, product descriptions | Intent detection, sentiment analysis |
| Tools | spaCy, NLTK, Hugging Face | GPT, T5, SimpleNLG | BERT, RoBERTa, Dialogflow |
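To make the Input/Output rows of the table concrete, here is a minimal sketch of the two directions: NLU turns text into machine-readable meaning, and NLG turns structured data into readable text. Everything in it is an assumption for illustration: the intent names, the keyword lists, the template and the example city are made up, and real systems would use trained models (such as the tools in the table) rather than keyword matching.

```python
# Hypothetical keyword lists per intent; a real NLU system would learn these.
INTENT_KEYWORDS = {
    "get_weather": {"weather", "rain", "sunny", "forecast"},
    "set_alarm": {"alarm", "wake", "remind"},
}


def understand(text):
    """NLU sketch: natural-language text -> machine-readable meaning."""
    tokens = set(text.lower().replace("?", "").split())
    # Pick the intent whose keyword set overlaps most with the input tokens.
    best = max(INTENT_KEYWORDS, key=lambda name: len(tokens & INTENT_KEYWORDS[name]))
    # No overlap at all means none of the known intents applies.
    if not tokens & INTENT_KEYWORDS[best]:
        best = "unknown"
    return {"intent": best, "tokens": sorted(tokens)}


def generate(data):
    """NLG sketch: structured data -> human-readable text (template-based)."""
    return "The forecast for {city} is {condition}, {temp} degrees.".format(**data)


meaning = understand("Will it rain tomorrow?")
reply = generate({"city": "Breda", "condition": "rainy", "temp": 12})
print(meaning["intent"])
print(reply)
```

Keyword matching and templates are the simplest possible stand-ins for each side; they show the shape of the data going in and out, not how production systems do it.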
Text representation techniques
Convert textual data into numerical vectors:
- One-Hot Encoding
- Bag of Words (BoW)
- TF-IDF
- N-Gram Models (e.g., with NLTK)
- Latent Semantic Analysis (LSA)
- Latent Dirichlet Allocation (LDA)
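The first four techniques in the list can be sketched in plain Python, without any library. This is a rough illustration, not the course's implementation: the toy corpus is invented, the tokenizer is deliberately naive, and the TF-IDF variant used here (tf = count / document length, idf = log(N / df)) is only one of several common definitions. LSA and LDA are left out because they need matrix factorization or probabilistic inference.

```python
import math
from collections import Counter

# Toy corpus (hypothetical example sentences, not from the course material).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


tokenized = [tokenize(d) for d in docs]

# Vocabulary: every distinct token, sorted so each token has a fixed index.
vocab = sorted({tok for doc in tokenized for tok in doc})
index = {tok: i for i, tok in enumerate(vocab)}


def one_hot(token):
    """Vector with a single 1 at the token's vocabulary index."""
    vec = [0] * len(vocab)
    vec[index[token]] = 1
    return vec


def bag_of_words(tokens):
    """Vector of raw token counts; word order is discarded."""
    counts = Counter(tokens)
    return [counts[tok] for tok in vocab]


def ngrams(tokens, n):
    """All contiguous n-token windows, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(tokens):
    """TF-IDF with tf = count / doc length and idf = log(N / df)."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    vec = []
    for tok in vocab:
        tf = counts[tok] / len(tokens)
        # df >= 1 always holds because vocab is built from these documents.
        df = sum(1 for doc in tokenized if tok in doc)
        vec.append(tf * math.log(n_docs / df))
    return vec


print(vocab)
print(one_hot("cat"))
print(bag_of_words(tokenized[0]))
print(ngrams(tokenized[0], 2))
print([round(x, 3) for x in tf_idf(tokenized[0])])
```

Note how "the" appears twice in the first document but still gets a lower TF-IDF weight than "cat": it also occurs in the second document, so its idf is smaller.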
References
- This is something we are learning for Data Science. The information comes from Avans 2-2 Data Science, 2025-11-12, and these slides go with it.
- GeeksforGeeks: https://www.geeksforgeeks.org/nlp/natural-language-processing-nlp-tutorial/