GCC AI Research

Results for "experiment design"

Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization

arXiv ·

This paper introduces Diffusion-BBO, a new online black-box optimization (BBO) framework that uses a conditional diffusion model as an inverse surrogate model. The framework employs an Uncertainty-aware Exploration (UaE) acquisition function to propose scores in the objective space for conditional sampling. The approach is shown theoretically to achieve a near-optimal solution, and it empirically outperforms existing online BBO baselines across six scientific discovery tasks.

Technology and design bring on the Wearable Revolution

KAUST ·

Sonny Vu, CEO of Misfit Wearables, spoke at KAUST about the importance of design in technology and shared his entrepreneurial philosophy. He emphasized rapid prototyping, user feedback, and enjoyable user experiences, as seen in his previous company AgaMatrix and his wearable activity monitor, the Shine. Misfit Wearables successfully raised $100,000 through crowdfunding in just nine and a half hours. Why it matters: This highlights KAUST's role in fostering entrepreneurship and promoting innovative approaches to product development in the region, particularly in wearable technology.

Evaluating Web Search Engines Results for Personalization and User Tracking

arXiv ·

This paper presents six experiments evaluating personalization and user tracking in web search engine results. The experiments involve comparing search results based on VPN location (including UAE vs others), logged-in status, network type, search engine, browser, and trained Google accounts. The study measures total hits, first hit, and correlation between hits to identify patterns of personalization. Why it matters: The findings shed light on the extent of filter bubble effects and potential biases in search results for users in the UAE and globally.
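As a rough illustration of the kind of comparison described above, the sketch below scores two ranked result lists for the same query. It is not the paper's code: the function name `compare_results`, the use of Jaccard overlap, and the Spearman rank correlation over shared results are all assumptions chosen for illustration, and the summary does not say how the paper computes its correlation. The sketch also assumes each list contains no duplicate URLs.

```python
def compare_results(results_a, results_b):
    """Compare two ranked lists of result URLs (hypothetical helper).

    Returns whether the first hit matches, the Jaccard overlap of the
    two result sets, and the Spearman rank correlation computed over
    the results that appear in both lists.
    """
    set_a, set_b = set(results_a), set(results_b)
    shared = set_a & set_b
    union = set_a | set_b

    # First hit: do both conditions return the same top result?
    first_hit_same = bool(results_a) and bool(results_b) and results_a[0] == results_b[0]

    # Overlap: fraction of all distinct results that both lists contain.
    jaccard = len(shared) / len(union) if union else 1.0

    # Rank correlation: re-rank the shared results within each list,
    # then apply the Spearman formula for ranks without ties.
    order_a = [url for url in results_a if url in shared]
    order_b = [url for url in results_b if url in shared]
    rank_b = {url: i for i, url in enumerate(order_b)}
    n = len(order_a)
    if n < 2:
        spearman = None  # correlation is undefined for fewer than two shared results
    else:
        d_squared = sum((i - rank_b[url]) ** 2 for i, url in enumerate(order_a))
        spearman = 1 - 6 * d_squared / (n * (n * n - 1))

    return {"first_hit_same": first_hit_same, "jaccard": jaccard, "spearman": spearman}


# Example with made-up result lists for two VPN locations.
uae = ["u1", "u2", "u3", "u4"]
other = ["u1", "u3", "u2", "u5"]
scores = compare_results(uae, other)
```

A low overlap or rank correlation between two conditions (for example, two VPN locations) would suggest that the results differ by condition, although repeated runs would be needed to separate personalization from ordinary result churn.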

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv ·

A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. The benchmark addresses the limitations of existing datasets by combining temporal length and multimodal richness, using human-validated samples. LongShOTAgent, an agentic system, is also presented for analyzing long videos, with both the benchmark and agent demonstrating the challenges faced by state-of-the-art MLLMs.