Researchers at MBZUAI have developed GeoChat, a new vision-language model (VLM) specifically designed for remote sensing imagery. GeoChat addresses the limitations of general-domain VLMs in accurately interpreting high-resolution remote sensing data, offering both image-level and region-specific dialogue capabilities. The model is trained on a novel remote sensing multimodal instruction-following dataset and demonstrates strong zero-shot performance across tasks like image captioning and visual question answering.
MBZUAI, in partnership with IBM Research, is developing GeoChat+, a vision-language model (VLM) for multi-modal, temporal remote sensing image analysis. GeoChat+ builds on the previous GeoChat model, enhancing capabilities with multi-modal images from various Earth observation systems like Sentinel-1, Sentinel-2, Landsat, and high-resolution imagery. GeoChat+ will integrate data from multiple satellites at different times to detect environmental changes and analyze the impact on soil quality, air quality, and erosion. Why it matters: This advancement promises to revolutionize geographic data analysis, providing detailed reports for high-risk regions and aiding reforestation efforts.
Video-ChatGPT is a new multimodal model that combines a video-adapted visual encoder with a large language model (LLM) to enable detailed video understanding and conversation. The authors introduce a new dataset of 100,000 video-instruction pairs for training the model. They also develop a quantitative evaluation framework for video-based dialogue models.
Researchers developed Atlas-Chat, a collection of LLMs for dialectal Arabic, focusing on Moroccan Arabic (Darija). They constructed an instruction dataset by consolidating existing Darija language resources and translating English instructions. Atlas-Chat models (2B, 9B, 27B) outperform state-of-the-art and Arabic-specialized LLMs like LLaMa, Jais, and AceGPT on Darija NLP tasks. Why it matters: This work addresses the gap in LLM support for low-resource Arabic dialects, providing a methodology for instruction-tuning and benchmarks for future research.
Researchers developed COVIBOT, a smart chatbot to spread awareness and assist during the COVID-19 pandemic in Saudi Arabia. The chatbot uses Azure Cognitive Services and is available in both English and Arabic. COVIBOT's use cases were tested and validated using a scenario-based approach.
This paper presents a UI-level evaluation of ALLaM-34B, an Arabic-centric LLM developed by SDAIA and deployed in the HUMAIN Chat service. The evaluation used a prompt pack spanning various Arabic dialects, code-switching, reasoning, and safety, with outputs scored by frontier LLM judges. Results indicate strong performance in generation, code-switching, MSA handling, reasoning, and improved dialect fidelity, positioning ALLaM-34B as a robust Arabic LLM suitable for real-world use.