MBZUAI Assistant Professor Qirong Ho is researching AI operating systems to standardize algorithms and enable non-experts to create AI applications reliably. He emphasizes that countries mastering mass production of AI systems will benefit most from the Fourth Industrial Revolution. Ho is co-founder and CTO at Petuum Inc., an AI startup creating standardized building blocks for affordable and scalable AI production. Why it matters: This research aims to democratize AI development and promote widespread adoption across industries in the UAE and beyond.
A presentation discusses using programmable network devices to reduce communication bottlenecks in distributed deep learning. It explores in-network aggregation and data processing to lower memory requirements and make better use of available bandwidth. The talk also covers gradient compression and the potential role of programmable NICs. Why it matters: Optimizing distributed deep learning infrastructure is critical for scaling AI model training in resource-constrained environments.
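To make the two ideas in this summary concrete, here is a minimal sketch, not the presenter's implementation. It assumes top-k sparsification as the gradient compression scheme (one common choice; the talk's actual scheme is not stated here) and models the aggregation point as a plain element-wise sum. The function names and toy gradients are hypothetical.

```python
def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries of a gradient as {index: value}."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    return {i: grad[i] for i in order[:k]}


def aggregate(sparse_grads, size):
    """Sum sparse updates element-wise, as a single aggregation point would.

    Assumption: in a real deployment this sum would run on a programmable
    switch or NIC; here it is ordinary Python for illustration.
    """
    total = [0.0] * size
    for update in sparse_grads:
        for i, value in update.items():
            total[i] += value
    return total


# Toy gradients from two hypothetical workers.
workers = [
    [0.9, -0.1, 0.05, -2.0],
    [1.1, 0.2, -0.05, -1.0],
]
compressed = [top_k_sparsify(g, k=2) for g in workers]
summed = aggregate(compressed, size=4)
```

Each worker sends only two of its four values, and the workers receive one summed vector instead of exchanging full gradients with each other. The dropped small entries are the accuracy cost that compression schemes have to manage.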
The paper introduces Sparse-Quantized Representation (SpQR), a new compression format and quantization technique for large language models (LLMs). SpQR identifies outlier weights and stores them in higher precision while compressing the remaining weights to 3-4 bits. The method keeps relative accuracy loss, measured in perplexity, below 1% for LLaMA and Falcon LLMs and enables a 33B parameter LLM to run on a single 24GB consumer GPU. Why it matters: SpQR enables near-lossless compression of LLMs, making powerful models accessible on resource-constrained devices and accelerating inference without significant accuracy degradation.
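The split between outliers and low-bit weights can be illustrated with a minimal sketch. It is not SpQR itself: it assumes a simple magnitude threshold to pick outliers and plain min-max uniform quantization for the rest, and the function names, threshold, and toy weights are hypothetical.

```python
def quantize_with_outliers(weights, bits=3, outlier_threshold=1.0):
    """Store large-magnitude weights exactly; quantize the rest to `bits` bits.

    Assumption: outliers are chosen by a magnitude threshold, which is a
    simplification used only for this illustration.
    """
    outliers = {i: w for i, w in enumerate(weights) if abs(w) > outlier_threshold}
    dense = [w for i, w in enumerate(weights) if i not in outliers]
    lo, hi = (min(dense), max(dense)) if dense else (0.0, 0.0)
    levels = (1 << bits) - 1  # 7 steps, i.e. 8 code values, for 3 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [
        0 if i in outliers else round((w - lo) / scale)
        for i, w in enumerate(weights)
    ]
    return codes, scale, lo, outliers


def dequantize(codes, scale, lo, outliers):
    """Rebuild weights: exact values for outliers, grid values for the rest."""
    return [outliers.get(i, lo + c * scale) for i, c in enumerate(codes)]


weights = [0.10, -0.20, 0.05, 4.0, -0.15, 0.30]
codes, scale, lo, outliers = quantize_with_outliers(weights)
restored = dequantize(codes, scale, lo, outliers)
```

Without the outlier split, the single value 4.0 would stretch the quantization range and push the small weights onto a much coarser grid. As a rough size check that ignores outlier and metadata overhead, 33 billion weights at 4 bits each take about 16.5 GB, compared with 66 GB at 16 bits.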
Abdulrahman Mahmoud, a postdoctoral fellow at Harvard University, discusses software-directed tools and techniques for processor design and reliability enhancement in ML systems. He emphasizes the need for a nuanced approach to numerical data formats supported by robust hardware. He advocates for integrating reliability as a foundational element in the design process. Why it matters: This research addresses the critical challenge of hardware reliability in AI processors, particularly relevant as the field moves towards hardware-software co-design for sustained growth.
This paper presents a reinforcement learning framework for optimizing energy pricing in peer-to-peer (P2P) energy systems. The framework aims to maximize the profit of all components in a microgrid, including consumers, prosumers, the service provider, and a community battery. Experimental results on the Pymgrid dataset demonstrate the approach's effectiveness in price optimization while accounting for the interests of the different components and the impact of community battery capacity.