Skip to content
GCC AI Research

Search

Results for "Gemma-7B"

GemmAr: Enhancing LLMs Through Arabic Instruction-Tuning

arXiv ·

The paper introduces InstAr-500k, a new Arabic instruction dataset of 500,000 examples designed to improve LLM performance in Arabic. Researchers fine-tuned the open-source Gemma-7B model using InstAr-500k and evaluated it on downstream tasks, achieving strong results on Arabic NLP benchmarks. They then released GemmAr-7B-V1, a model specifically tuned for Arabic NLP tasks. Why it matters: This work addresses the lack of high-quality Arabic instruction data, potentially boosting the capabilities of Arabic language models.

Fanar 2.0: Arabic Generative AI Stack

arXiv ·

Hamad Bin Khalifa University (HBKU) has released Fanar 2.0, the second generation of Qatar's Arabic-centric Generative AI platform, built entirely at QCRI. The core of Fanar 2.0 is Fanar-27B, which was continually pre-trained from a Gemma-3-27B backbone using 120 billion high-quality tokens and only 256 NVIDIA H100 GPUs. Fanar 2.0 includes capabilities like FanarGuard, Aura, Oryx, Fanar-Sadiq, Fanar-Diwan, and FanarShaheen for moderation, speech recognition, vision understanding, Islamic content, poetry generation, and translation. Why it matters: This shows that sovereign, resource-constrained AI development in the Arabic language is possible, producing competitive systems in the region.