0% Complete
فارسی
Home
/
شانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
A Deep Learning Framework for Phase-Aware Feature Representation to Improve Sound Source Direction and Distance Estimation
Authors :
Zahra Abolfazli
1
Hamid Reza Abutalebi
2
1- Yazd University
2- Yazd University
Keywords :
(Sound Event Localization and Detection (SELD،phase spectrogram features،Conformer
Abstract :
This paper proposes a novel refinement of the network’s input features to improve distance and Direction-of-Arrival estimation in the sound event localization and detection system. Instead of relying on Mel energies, we propose using phase spectrograms as input feature, which effectively preserve inter-channel time delays and capture crucial wave propagation characteristics. Furthermore, we introduce architectural improvements for increased robustness. Specifically, Huber loss replaces MSE, reducing sensitivity to noise. Additionally, MHSA layers are replaced with Conformer blocks to better model both long-range dependencies and local interactions within the audio data. Our experimental results validate the effectiveness of the proposed phase-based feature representation and optimized architecture, demonstrating improvements in both DOA and distance estimation.
Papers List
List of archived papers
روشی برای تشخیص مرحله پیشرفت آلزایمر در تصاویرFMRI مبتنی بر شبکه های عصبی چگال
فرساد زمانی بروجنی - عباس بهره دار
طبقهبندی ترافیک رمز مبتنی بر یادگیری ماشین
افسانه معدنی - شقایق نادری - حسین قرایی
Artificial Empathy in AI-Based Mental Health: A Review
Shabnam Moradi
تحلیل و بررسی تکنیکهای محاسبات تقریبی
محمد میلاد صیاد - محمد رضا بینش مروستی - سید امیر اصغری
An integrated approach for estimating software cost estimation using Adaptive Neuro-Fuzzy Inference System and the Grey Wolf Optimization algorithm
Maryam Karimi - Taghi Javdani Gandomani - Mahdi Mosleh
A perceptual loss for screen content image super-resolution
Hossein Sekhavaty-Moghadam - Marzieh Hosseinkhani - Dr Azadeh Mansouri
Hardware Imperfection Effects in Wireless Virtual Reality System with Hybrid Beamforming
Nasim Alikhani - Abbas Mohammadi
طبقه بندی روش های شناسایی داده های تکراری در جهت تسهیل فرایند پاکسازی داده ها
مهدی جعفری - احمد عبدالله زاده بار فروش
NFV-Based Distributed Service Function Chaining with Imperfect Information
Mahsa Alikhani - Marzieh Sheikhi - Dr Vesal Hakami
A New Sentence Ordering Method Using BERT Pretrained Model
Melika Golestanipour - Seyedeh Zahra Razavi - Dr Heshaam Faili
more
Samin Hamayesh - Version 42.5.2