MBZUAI researchers developed Mobile-VideoGPT, a compact multimodal model for real-time video understanding on edge devices. The system combines keyframe selection, efficient token projection, and a Qwen-2.5-0.5B language model. In the reported tests, Mobile-VideoGPT ran faster and scored higher than the models it was compared against while being significantly smaller. The model and code are publicly available. Why it matters: On-device video processing reduces reliance on remote servers and eases privacy concerns, which could accelerate the adoption of AI in mobile and embedded applications.
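To make the keyframe-selection idea concrete, here is a minimal sketch in Python. It assumes each frame already has a relevance score and simply keeps the top-scoring frames in their original order. The function name `select_keyframes`, the scores, and the top-k rule are illustrative assumptions; the summary does not say how Mobile-VideoGPT scores or picks frames.

```python
from typing import Sequence


def select_keyframes(scores: Sequence[float], k: int) -> list[int]:
    """Return the indices of the k highest-scoring frames, in temporal order.

    Hypothetical helper: the per-frame scores are assumed to come from some
    upstream scoring step that is not specified in the source.
    """
    if k <= 0:
        return []
    # Rank frames by score (highest first); ties go to the earlier frame.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    # Restore temporal order so downstream components see frames in sequence.
    return sorted(ranked[:k])


# Made-up scores for a six-frame clip.
frame_scores = [0.10, 0.80, 0.30, 0.95, 0.20, 0.60]
keyframes = select_keyframes(frame_scores, k=3)
print(keyframes)
```

Passing only the selected frames to the language model shrinks the visual input, which is one plausible way a small model can keep latency low on an edge device.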
Muhammad Shafique from NYU Abu Dhabi discusses building energy-efficient and robust EdgeAI systems. The talk covers trends, challenges, and techniques for optimizing the software and hardware stacks. These optimizations aim to enable embodied AI in autonomous systems, IoT-Healthcare, Industrial-IoT, and smart environments. Why it matters: The talk addresses key challenges in deploying AI on resource-constrained edge devices in the GCC region, particularly energy efficiency and security.
RightNow-Arabic-0.5B-Turbo is a new 518M-parameter Arabic-specialized decoder LLM, built on Qwen2.5-0.5B and designed to bridge the gap between small multilingual models and large Arabic-specialized ones. Its development pipeline had four stages: adding 27,032 Arabic tokens via vocabulary injection, continued pretraining on 504M Arabic tokens, supervised instruction fine-tuning, and direct preference optimization. The model achieved 35.9% mean accuracy across three Arabic benchmarks (COPA-ar, Arabic HellaSwag, ArabicMMLU), outperforming all same-class open models and recovering 67% of SILMA-9B's mean accuracy at 1/18 the parameters. All code and weights are publicly released. Why it matters: A specialized sub-1B Arabic LLM suitable for edge deployment makes capable Arabic NLP more accessible on resource-constrained devices.
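The headline numbers can be cross-checked with a little arithmetic. The sketch below derives the mean accuracy that the 67% recovery figure implies for SILMA-9B, along with the parameter ratio. It assumes SILMA-9B has a nominal 9 billion parameters; the exact count is not given in the summary, and the variable names are illustrative.

```python
# Figures quoted in the summary.
small_acc = 35.9            # mean accuracy (%) of RightNow-Arabic-0.5B-Turbo
recovered_fraction = 0.67   # share of SILMA-9B's mean accuracy recovered
small_params = 518e6        # parameters in the small model

# ASSUMPTION: nominal parameter count implied by the "9B" in the model name.
large_params_assumed = 9e9

# If 35.9% is 67% of SILMA-9B's mean accuracy, the implied SILMA-9B mean is:
implied_large_acc = small_acc / recovered_fraction

# Size ratio between the two models under the nominal-9B assumption.
param_ratio = large_params_assumed / small_params

print(f"Implied SILMA-9B mean accuracy: {implied_large_acc:.1f}%")
print(f"Parameter ratio (large / small): {param_ratio:.1f}x")
```

Under these assumptions the implied SILMA-9B mean accuracy is about 53.6%, and the size ratio comes out near 17.4x. The "1/18" figure is therefore either rounded or based on an exact parameter count somewhat above 9 billion.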