Results for "audio perturbations"

Your voice can jailbreak a speech model – here’s how to stop it, without retraining

MBZUAI

A new paper from MBZUAI demonstrates that state-of-the-art speech models can be easily jailbroken with audio perturbations into generating harmful content, with attack success rates of 76-93% on models such as Qwen2-Audio and LLaMA-Omni. The researchers adapted projected gradient descent (PGD) to the audio domain, optimizing waveforms that push the model toward harmful responses. They propose a defense based on post-hoc activation patching that hardens models at inference time without retraining. Why it matters: This research highlights a critical vulnerability in speech-based LLMs and offers a practical defense, contributing to the development of more secure and trustworthy AI systems in the region and globally.

VENOM: Text-driven Unrestricted Adversarial Example Generation with Diffusion Models

arXiv · Jan 14

The paper introduces VENOM, a text-driven framework for generating high-quality unrestricted adversarial examples using diffusion models. VENOM unifies image content generation and adversarial synthesis into a single reverse diffusion process, enhancing both attack success rate and image quality. The framework incorporates an adaptive adversarial guidance strategy with momentum to ensure the generated adversarial examples align with the distribution of natural images.

SemDiff: Generating Natural Unrestricted Adversarial Examples via Semantic Attributes Optimization in Diffusion Models

arXiv · Apr 16

This paper introduces SemDiff, a novel method for generating unrestricted adversarial examples (UAEs) by exploring the semantic latent space of diffusion models. SemDiff uses multi-attribute optimization to ensure attack success while preserving the naturalness and imperceptibility of generated UAEs. Experiments on high-resolution datasets demonstrate SemDiff's superior performance compared to state-of-the-art methods in attack success rate and imperceptibility, while also evading defenses.

ScoreAdv: Score-based Targeted Generation of Natural Adversarial Examples via Diffusion Models

arXiv · Jul 8

The paper introduces ScoreAdv, a novel approach for generating natural unrestricted adversarial examples (UAEs) using diffusion models. It incorporates an adversarial guidance mechanism and saliency maps to shift the sampling distribution and inject visual information. Experiments on the ImageNet and CelebA datasets demonstrate state-of-the-art attack success rates, image quality, and robustness against defenses.

Exploring Sound vs Vibration for Robust Fault Detection on Rotating Machinery

arXiv · Dec 17

The study introduces the Qatar University Dual-Machine Bearing Fault Benchmark dataset (QU-DMBF) containing sound and vibration data from two motors across 1080 conditions. It proposes a deep learning approach for sound-based fault detection, addressing limitations of vibration-based methods. Experiments on QU-DMBF show sound-based detection is more robust, independent of sensor location, and cost-effective while matching vibration-based performance. Why it matters: The new dataset and findings could shift the focus toward sound-based methods for more reliable and accessible predictive maintenance in industrial settings.