VNLP: Turkish NLP Package
arXiv:2403.01309 · cs.CL, cs.AI, cs.LG · Submitted March 2, 2024
We present VNLP: the first dedicated, complete, open-source, well-documented, lightweight, production-ready, state-of-the-art NLP package for the Turkish language.
VNLP covers a wide range of tasks:
- Sentence splitting & text normalization
- Sentiment Analysis
- Named Entity Recognition (NER)
- Morphological Analysis & Disambiguation
- Part-of-Speech (POS) Tagging
Its token classification models are based on “Context Model”, a novel architecture that is both an encoder and an auto-regressive model. VNLP also ships with pre-trained word embeddings and the corresponding SentencePiece Unigram tokenizers.
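To make the "auto-regressive" part of token classification concrete, here is a minimal sketch of one common reading of the term: tags are predicted left to right, and each prediction can see the whole sentence plus the tags already predicted. This is an assumption made for illustration, not a reproduction of the Context Model. The names `autoregressive_tag` and `toy_ner_scorer` are hypothetical, and the scorer is a hand-written rule standing in for a trained network.

```python
from typing import Callable, List, Sequence

# Hypothetical scorer signature: (tokens, position, earlier tags) -> tag.
Scorer = Callable[[Sequence[str], int, Sequence[str]], str]


def autoregressive_tag(tokens: Sequence[str], predict_tag: Scorer) -> List[str]:
    """Tag tokens left to right, feeding earlier predictions back in."""
    tags: List[str] = []
    for i in range(len(tokens)):
        # The scorer sees the full sentence (encoder-style context)
        # and the tags predicted so far (auto-regressive context).
        tags.append(predict_tag(tokens, i, tuple(tags)))
    return tags


def toy_ner_scorer(tokens: Sequence[str], i: int, prev_tags: Sequence[str]) -> str:
    """Hand-written stand-in for a trained model (illustration only)."""
    capitalized = tokens[i][:1].isupper()
    # A capitalized word right after a person tag continues that entity.
    if capitalized and prev_tags and prev_tags[-1] in ("B-PER", "I-PER"):
        return "I-PER"
    # A capitalized word that is not sentence-initial starts an entity.
    if capitalized and i > 0:
        return "B-PER"
    return "O"


if __name__ == "__main__":
    sentence = ["Ben", "Ali", "Kaya", "ile", "geldim"]
    print(list(zip(sentence, autoregressive_tag(sentence, toy_ner_scorer))))
```

The point of the loop is that the tag for "Kaya" depends on the tag already assigned to "Ali"; a model that classifies every token independently would have no access to that earlier decision.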
VNLP is installable from PyPI and provides Python and command-line APIs, Read the Docs documentation, and a live demo.