Kumru LLM
VNGRS AI · Published September 29, 2025
Kumru is a 7.4B-parameter decoder-only Turkish LLM pre-trained from scratch by the VNGRS AI team. It was built to provide a foundational model that can be deployed in-house, addressing security, compliance, and Turkish language-quality requirements.
Technical Specs
- 7.4B parameters, decoder-only architecture (comparable to Mistral-v0.3 / LLaMA-3)
- Pre-trained on 500 GB of cleaned and deduplicated Turkish corpora over 45 days on H100/H200 GPUs
- Exposed to 300B tokens during pre-training
- Fine-tuned on ~1 million examples for diverse use cases
- 8,192 token context length (~20 A4 pages of Turkish text)
- Runs on 16GB of VRAM (RTX A4000, RTX 3090), making it suitable for on-premise deployment
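A back-of-the-envelope check makes the 16GB figure plausible. The sketch below is not from the announcement: it assumes bf16 weights (2 bytes per parameter) and Mistral-7B-like attention dimensions (32 layers, 8 KV heads, head dimension 128) for the KV cache, since Kumru is described as Mistral-v0.3-class.

```python
# Rough VRAM estimate for Kumru-7B inference.
# Assumptions (not stated in the announcement): bf16 weights,
# Mistral-7B-like dims for the KV cache.

def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

def kv_cache_gb(seq_len: int, n_layers: int = 32, n_kv_heads: int = 8,
                head_dim: int = 128, bytes_per_val: int = 2) -> float:
    """KV cache for one sequence: keys + values across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_val / 1e9

weights = weight_memory_gb(7.4e9)  # ~14.8 GB
cache = kv_cache_gb(8192)          # ~1.07 GB at full context
print(f"weights ~ {weights:.1f} GB, KV cache ~ {cache:.2f} GB")
```

Under these assumptions, weights plus a full-context KV cache land just under 16GB, consistent with the stated requirement, though activation buffers would leave little headroom at the full 8,192-token context.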
Open-Source Version
Kumru-2B is released as an open-source model trained with the same recipe (8,192-token context, 300B pre-training tokens). It requires only 4.8GB of memory, enabling mobile deployment without quantization.
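The 4.8GB figure is consistent with storing the weights in half precision. The arithmetic below is a hypothetical sanity check, not an official spec: it assumes bf16 (2 bytes per parameter), and the parameter count it derives is inferred from the footprint, not stated in the announcement.

```python
# Hypothetical check: what fits in the stated 4.8 GB footprint?
# Assumption (not from the announcement): weights stored in bf16.

BYTES_BF16 = 2
BYTES_FP32 = 4

footprint_gb = 4.8
# Number of parameters that fit in 4.8 GB at 2 bytes each (~2.4B).
params_at_bf16 = footprint_gb * 1e9 / BYTES_BF16
# The same weights in fp32 would need twice the memory (~9.6 GB).
fp32_equivalent_gb = params_at_bf16 * BYTES_FP32 / 1e9

print(f"{params_at_bf16 / 1e9:.1f}B params fit in {footprint_gb} GB at bf16")
print(f"in fp32 the same weights would need {fp32_equivalent_gb:.1f} GB")
```

This is why no quantization is needed: a roughly 2B-parameter model already fits the stated footprint at full bf16 precision, whereas fp32 would double it.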
Results
Evaluated on the Cetvel benchmark (26 Turkish NLP tasks, including summarization, question answering, NLI, and machine translation). Both Kumru-7B and Kumru-2B surpass significantly larger models such as LLaMA-3.3-70B, Gemma-3-27B, and Qwen-2-72B on Turkish tasks.
Links: Blog Post · Demo · HuggingFace (2B)