Search

Results for "perturbation"

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

arXiv · Feb 1

Researchers from the National Center for AI in Saudi Arabia investigated the sensitivity of Large Language Model (LLM) leaderboards to minor benchmark perturbations. They found that small changes, like choice order, can shift rankings by up to 8 positions. The study recommends hybrid scoring and warns against over-reliance on simple benchmark evaluations, providing code for further research.

Award-winning algorithm aids observation

KAUST · Aug 27

KAUST researchers developed a machine learning algorithm to control a deformable mirror within the Subaru Telescope's exoplanet imaging camera, compensating for atmospheric turbulence. The algorithm, which computes a partial singular value decomposition (SVD), outperforms a standard SVD by a factor of four. The KAUST team received a best paper award at the PASC Conference for this work, which has already been deployed at the Subaru Telescope. Why it matters: This advancement enables sharper images of exoplanets, facilitating their identification and study, and showcases the impact of optimizing core linear algebra algorithms.