Skip to content
GCC AI Research

Search

Results for "memory augmentation"

Humanizing Technology with Assistive Augmentations

MBZUAI ·

This article discusses a talk on "Assistive Augmentation," designing human-computer interfaces to augment human abilities. Examples include 'AiSee' for blind users, 'Prospero' for memory training, and 'MuSS-Bits' for deaf users to feel music. Suranga Nanayakkara from the National University of Singapore will present the talk, highlighting insights from psychology, human-centered machine learning, and design thinking. Why it matters: Such assistive technologies can significantly improve the quality of life for individuals with disabilities and extend human capabilities.

Memory representation and retrieval in neuroscience and AI

MBZUAI ·

A Caltech researcher presented at MBZUAI on memory representation and retrieval, contrasting AI and neuroscience approaches. Current AI retrieval systems like RAG retrieve via fine-tuning and embedding similarity, while the presenter argued for exploring retrieval via combinatorial object identity or spatial proximity. The research explores circuit-level retrieval via domain fine-tuned LLMs and distributed memory for image retrieval using semantic similarity. Why it matters: The work suggests structured databases and retrieval-focused training can allow smaller models to outperform larger general-purpose models, offering efficiency gains for AI development in the region.

Retrieval Augmentation as a Shortcut to the Training Data

MBZUAI ·

This article discusses retrieval augmentation in text generation, where information retrieved from an external source is used to condition predictions. It references recent work on retrieval-augmented image captioning, showing that model size can be greatly reduced when training data is available through retrieval. The author intends to continue this work focusing on the intersection of retrieval augmentation and in-context learning, and controllable image captioning for language learning materials. Why it matters: This research direction has the potential to improve transfer learning in vision-language models, which could be especially relevant for downstream applications in Arabic NLP and multimodal tasks.

Research Talks: Bridging neuroscience and AI

MBZUAI ·

Caltech graduate student Surya Narayanan Hari presented his research on replicating human-like memory in machines at MBZUAI. He discussed how the thalamus, which filters sensory and motor signals in the brain, inspires the development of routed monolithic models in AI. Hari explained that memory retrieval occurs on object, embedding, and circuit levels in the human brain. Why it matters: This talk highlights the potential of neuroscience-inspired AI architectures for improving memory and information processing in AI systems, which could accelerate the development of more efficient and context-aware AI models in the region.

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv ·

A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. The benchmark addresses the limitations of existing datasets by combining temporal length and multimodal richness, using human-validated samples. LongShOTAgent, an agentic system, is also presented for analyzing long videos, with both the benchmark and agent demonstrating the challenges faced by state-of-the-art MLLMs.