GCC AI Research

Results for "compression"

On the Utility of Gradient Compression in Distributed Training Systems

MBZUAI

Dr. Hongyi Wang, a CMU researcher, presented an evaluation of gradient compression methods in distributed training, finding that they yield limited speedup in most realistic setups. The research identifies the root causes and proposes properties that gradient compression methods need in order to deliver significant speedup. The talk was promoted by MBZUAI. Why it matters: Understanding the limitations of gradient compression techniques can help optimize distributed training strategies for AI models in the region.

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

arXiv

The paper introduces Sparse-Quantized Representation (SpQR), a new compression format and quantization technique for large language models (LLMs). SpQR identifies outlier weights and stores them in higher precision while compressing the remaining weights to 3-4 bits. The method keeps the relative accuracy loss, measured in perplexity, below 1% for LLaMA and Falcon LLMs and enables a 33B-parameter LLM to run on a single 24GB consumer GPU. Why it matters: This enables near-lossless compression of LLMs, making powerful models accessible on resource-constrained devices and accelerating inference without significant accuracy degradation.
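
The idea of keeping outliers in higher precision while quantizing the rest can be illustrated with a minimal sketch. This is not the paper's algorithm: the summary does not say how SpQR selects outliers, so the code below assumes a simple largest-magnitude rule, and the names `sparse_quantize`, `quantize_group`, `group_size`, and `outlier_frac` are hypothetical. It also includes a rough weight-only memory estimate for the 33B-parameter case, which ignores the overhead of scales, stored outliers, and activations.

```python
import random


def quantize_group(values, bits):
    """Uniform min-max quantization of one group; returns dequantized values."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)
    scale = (hi - lo) / levels
    return [lo + round((v - lo) / scale) * scale for v in values]


def sparse_quantize(weights, bits=3, group_size=16, outlier_frac=0.01):
    """Keep the largest-magnitude weights exact and quantize the rest per group.

    Assumption: magnitude is used as the outlier criterion for illustration only.
    """
    n_out = int(len(weights) * outlier_frac)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    outliers = {i: weights[i] for i in order[:n_out]}  # stored at full precision
    dense_idx = [i for i in range(len(weights)) if i not in outliers]

    restored = [0.0] * len(weights)
    for start in range(0, len(dense_idx), group_size):
        idx = dense_idx[start:start + group_size]
        dequantized = quantize_group([weights[i] for i in idx], bits)
        for i, q in zip(idx, dequantized):
            restored[i] = q
    for i, w in outliers.items():
        restored[i] = w
    return restored, outliers


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def weight_memory_gb(n_params, bits_per_weight):
    """Weight-only storage in GB (10^9 bytes); excludes all overhead."""
    return n_params * bits_per_weight / 8 / 1e9


# Synthetic weights: mostly small values plus a few large outliers.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(1024)]
for k in range(8):
    weights[k * 128 + 5] = 1.0 if k % 2 == 0 else -1.0

plain, _ = sparse_quantize(weights, bits=3, outlier_frac=0.0)
sparse, kept = sparse_quantize(weights, bits=3, outlier_frac=0.01)
plain_err = mse(weights, plain)
sparse_err = mse(weights, sparse)

print("3-bit error without outlier handling:", plain_err)
print("3-bit error with 1% outliers kept exact:", sparse_err)
print("33B weights at 4 bits (GB):", weight_memory_gb(33e9, 4))
print("33B weights at 3 bits (GB):", weight_memory_gb(33e9, 3))
```

On this synthetic data, pulling the outliers out of the quantization groups shrinks each group's value range, so the same 3-bit grid resolves the remaining weights more finely. The memory estimate suggests why 3-4 bits per weight leaves headroom on a 24GB card, although a real deployment also needs memory for metadata and activations.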

Towards Inclusive NLP: Assessing Compressed Multilingual Transformers across Diverse Language Benchmarks

arXiv

This paper benchmarks multilingual and monolingual LLM performance across Arabic, English, and Indic languages, examining the effects of model compression techniques such as pruning and quantization. Multilingual models outperform language-specific counterparts, demonstrating cross-lingual transfer. Quantization maintains accuracy while improving efficiency, but aggressive pruning compromises performance, particularly in larger models. Why it matters: The findings highlight strategies for scalable and fair multilingual NLP, addressing hallucination and generalization errors in low-resource languages.

Professor Marc Genton and former postdoctoral fellow win the 2017 Wilcoxon Award

KAUST

KAUST Professor Marc Genton and his former postdoc Stefano Castruccio jointly won the 2017 Wilcoxon Award for their paper in Technometrics. Their paper, "Compressing an ensemble with statistical models: An algorithm for global 3D spatio-temporal temperature," details a data-compression scheme for climate simulations. The method reduces data-storage requirements and speeds up climate research. Why it matters: This award highlights KAUST's contribution to statistical methods for climate modeling and big data analysis, particularly relevant for studying renewable energy resources in Saudi Arabia.

Machine Learning Integration for Signal Processing

TII

The Directed Energy Research Center (DERC) at the Technology Innovation Institute (TII) is integrating machine learning (ML) techniques into signal processing to accelerate research. One project used convolutional neural networks to predict COVID-19 pneumonia from chest X-rays with 97.5% accuracy. DERC researchers also demonstrated that ML-based signal and image processing can retrieve up to 68% of text information from electromagnetic emanations. Why it matters: This adoption of ML for signal processing at TII highlights the potential for advanced AI techniques to enhance research and security applications in the UAE.

Old images to anticipate the future

MBZUAI

MBZUAI researchers presented a new approach to video question answering at ICCV 2023. The method applies insights from analyzing still images to understanding video content, potentially reducing the computational resources needed to train video question answering models. Guangyi Chen, Kun Zhang, and colleagues aim to apply pre-trained image models to understand video concepts. Why it matters: This research could lead to more efficient and accessible video analysis tools, benefiting fields like healthcare and security where video data is abundant.