0% Complete
English
صفحه اصلی
/
شانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
A Deep Learning Framework for Phase-Aware Feature Representation to Improve Sound Source Direction and Distance Estimation
نویسندگان :
Zahra Abolfazli
1
Hamid Reza Abutalebi
2
1- Yazd University
2- Yazd University
کلمات کلیدی :
(Sound Event Localization and Detection (SELD،phase spectrogram features،Conformer
چکیده :
This paper proposes a novel refinement of the network’s input features to improve distance and Direction-of-Arrival estimation in the sound event localization and detection system. Instead of relying on Mel energies, we propose using phase spectrograms as input feature, which effectively preserve inter-channel time delays and capture crucial wave propagation characteristics. Furthermore, we introduce architectural improvements for increased robustness. Specifically, Huber loss replaces MSE, reducing sensitivity to noise. Additionally, MHSA layers are replaced with Conformer blocks to better model both long-range dependencies and local interactions within the audio data. Our experimental results validate the effectiveness of the proposed phase-based feature representation and optimized architecture, demonstrating improvements in both DOA and distance estimation.
لیست مقالات
لیست مقالات بایگانی شده
تحلیل و بررسی تکنیکهای محاسبات تقریبی
محمد میلاد صیاد - محمد رضا بینش مروستی - سید امیر اصغری
A clonal selection mechanism for load balancing in the cloud computing system
Melika Mosayyebi - Reza Azmi
Classification of mental states of human concentration based on EEG signal
Mehran Safari Dehnavi - Vahid Safari Dehnavi - Dr Masoud Shafiee
Architectural Insights: Comparing Weight Stationary and Output Stationary Systolic Arrays for Efficient Computation
Mahdi Kalbasi
Electrophysiological Modeling and Interactive Approaches of Electrical Circuits and Hypergraphs for Understanding Neural Circuit Dynamics
Arian Baymani - Maryam Naderi Soorki
Revert Propagation: Who are responsible for a contagion initialization in a Diffusion Network?
Arman Sepehr - Mohammadzaman Zamani - Hamid Beigy - Shabnam Behzad
Optimal selection of seed nodes by reducing the influence of common nodes in the influence maximization problem
Farzaneh Kazemzadeh - Ali Asghar Safaei - Mitra Mirzarezaee
Reinforced Detection: Deep Reinforcement Learning for Binary VoIP Classification in Encrypted Traffic
Mohsen Rajabpour - Mohammadmoein Asefi - Siavash Khorsandi
A Joint Trajectory and Energy Harvesting Method for an UAV Enabled Disaster Response Network
Hosein Mohammadi Firozjae - Javad Zeraatkar Moghaddam - Mehrdad Ardebilipour
مدل یادگیری عمیق با بازنمایی چند مقیاسی زمان برای پیشبینی آبشار اطلاعاتی در شبکههای اجتماعی
مبینا پناهی - مهدی عمادی
بیشتر
ثمین همایش، سامانه مدیریت کنفرانس ها و جشنواره ها - نگارش 42.5.2