Results for "AraNet"

AraNet: A Deep Learning Toolkit for Arabic Social Media

arXiv · Dec 30

Researchers introduce AraNet, a deep learning toolkit for Arabic social media processing. The toolkit uses BERT models trained on social media datasets to predict age, dialect, gender, emotion, irony, and sentiment. AraNet achieves state-of-the-art or competitive performance on these tasks without feature engineering. Why it matters: The public release of AraNet accelerates Arabic NLP research by providing a comprehensive, deep learning-based tool for various social media analysis tasks.

AraGPT2: Pre-Trained Transformer for Arabic Language Generation

arXiv · Dec 31

The paper introduces AraGPT2, a suite of pre-trained transformer models for Arabic language generation, with the largest model (AraGPT2-mega) containing 1.46 billion parameters. Trained on a large Arabic corpus of internet text and news, AraGPT2-mega demonstrates strong performance in synthetic news generation and zero-shot question answering. To address the risk of misuse, the authors also released a discriminator model with 98% accuracy in detecting AI-generated text. Why it matters: This release of both the model and discriminator fills a critical gap in Arabic NLP and encourages further research and applications in the field.

AraTrust: An Evaluation of Trustworthiness for LLMs in Arabic

arXiv · Mar 14

The paper introduces AraTrust, a new benchmark for evaluating the trustworthiness of LLMs when prompted in Arabic. The benchmark contains 522 multiple-choice questions covering dimensions such as truthfulness, ethics, safety, and fairness. In experiments on AraTrust, GPT-4 performed best, while open-source models such as AceGPT 7B and Jais 13B scored lower. Why it matters: This benchmark addresses a critical gap in evaluating LLMs for Arabic, which is essential for ensuring the safe and ethical deployment of AI in the Arab world.

Fanar aims to advance Arabic presence in digital space | Gulf Times

QCRI · Jun 14

The article content was unavailable, so no factual summary could be produced. Details of Fanar's initiative to advance Arabic presence in the digital space, including any specific actions, partnerships, or funding, could not be extracted. Why it matters: Without the article text, the significance of Fanar's potential contribution to Arabic digital presence cannot be evaluated.

NCVC and KAUST launch SAUDINet to advance terrestrial ecology in Saudi Arabia

KAUST · Mar 12

The National Center for Vegetation Cover Development and Combating Desertification (NCVC) and KAUST have launched the SAUDINet initiative. The initiative aims to advance terrestrial ecology research in Saudi Arabia, focusing on restoring degraded lands, enhancing carbon sequestration and preserving biodiversity. NCVC’s workforce will receive specialized training in biodiversity monitoring and ecological sampling, with samples analyzed in KAUST’s labs. Why it matters: The partnership aims to establish Saudi Arabia as a global leader in the study of arid ecosystems and address the lack of data from hyper-arid lands in climate models.

Enhanced Arabic Text Retrieval with Attentive Relevance Scoring

arXiv · Jul 31

This paper introduces an enhanced Dense Passage Retrieval (DPR) framework tailored for Arabic text retrieval. The core innovation is an Attentive Relevance Scoring (ARS) mechanism that improves semantic relevance modeling between questions and passages, replacing standard interaction methods. The method integrates pre-trained Arabic language models and architectural refinements, achieving improved retrieval and ranking accuracy for Arabic question answering. Why it matters: This work addresses the underrepresentation of Arabic in NLP research by providing a novel approach and publicly available code to improve Arabic text retrieval, which can benefit various applications like Arabic search engines and question-answering systems.