16th International Conference on Information and Knowledge Technology
A Deep Learning Framework for Phase-Aware Feature Representation to Improve Sound Source Direction and Distance Estimation
Authors:
Zahra Abolfazli (1)
Hamid Reza Abutalebi (2)
1- Yazd University
2- Yazd University
Keywords:
Sound Event Localization and Detection (SELD), phase spectrogram features, Conformer
Abstract:
This paper proposes a refinement of the network's input features to improve distance and direction-of-arrival (DOA) estimation in a sound event localization and detection (SELD) system. Instead of relying on Mel energies, we use phase spectrograms as the input features; these preserve inter-channel time delays and capture crucial wave propagation characteristics. We also introduce architectural changes for increased robustness. Specifically, the Huber loss replaces the mean squared error (MSE) loss, reducing sensitivity to noise. In addition, the multi-head self-attention (MHSA) layers are replaced with Conformer blocks to better model both long-range dependencies and local interactions within the audio data. Our experimental results validate the proposed phase-based feature representation and optimized architecture, showing improvements in both DOA and distance estimation.
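The two ideas named in the abstract, phase spectrogram features and the Huber loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the STFT settings or loss threshold, so the function names `phase_spectrogram` and `huber_loss`, the frame length `n_fft`, the hop size `hop`, the Hann window, and the threshold `delta` are all assumptions chosen for illustration.

```python
import numpy as np

def phase_spectrogram(audio, n_fft=512, hop=256):
    """STFT phase in radians, shape (channels, frames, n_fft // 2 + 1).

    n_fft, hop, and the Hann window are illustrative assumptions.
    """
    audio = np.atleast_2d(np.asarray(audio, dtype=float))
    n_frames = 1 + (audio.shape[1] - n_fft) // hop
    window = np.hanning(n_fft)
    # Index matrix that slices each channel into overlapping frames.
    idx = hop * np.arange(n_frames)[:, None] + np.arange(n_fft)[None, :]
    frames = audio[:, idx] * window  # (channels, frames, n_fft)
    return np.angle(np.fft.rfft(frames, axis=-1))

def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

# Toy two-channel signal: the second channel is the first delayed by 3 samples.
n_fft, k0, delay = 512, 16, 3
n = np.arange(2048)
ref = np.cos(2 * np.pi * k0 * n / n_fft)
delayed = np.cos(2 * np.pi * k0 * (n - delay) / n_fft)

phase = phase_spectrogram(np.stack([ref, delayed]), n_fft=n_fft, hop=256)
# Inter-channel phase difference, wrapped to (-pi, pi].
ipd = np.angle(np.exp(1j * (phase[1] - phase[0])))
```

In this toy case the inter-channel phase difference at bin `k0` is close to `-2 * pi * k0 * delay / n_fft`, which is how a phase spectrogram carries the time delay between microphones. The Huber loss grows only linearly for errors larger than `delta`, so a single large error contributes less than it would under MSE.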