16th International Conference on Information and Knowledge Technology
A Deep Learning Framework for Phase-Aware Feature Representation to Improve Sound Source Direction and Distance Estimation
Authors :
Zahra Abolfazli (1)
Hamid Reza Abutalebi (2)
1- Yazd University
2- Yazd University
Keywords :
Sound Event Localization and Detection (SELD), phase spectrogram features, Conformer
Abstract :
This paper proposes a refinement of the network's input features to improve distance and direction-of-arrival (DOA) estimation in sound event localization and detection (SELD) systems. Instead of relying on Mel energies, we use phase spectrograms as the input feature, which preserve inter-channel time delays and capture wave propagation characteristics. We also introduce changes to the training objective and the architecture for increased robustness. First, Huber loss replaces mean squared error (MSE), reducing sensitivity to noise. Second, multi-head self-attention (MHSA) layers are replaced with Conformer blocks to better model both long-range dependencies and local interactions within the audio data. Experimental results show that the proposed phase-based feature representation and modified architecture improve both DOA and distance estimation.
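The abstract does not give the feature pipeline in detail, so the following is only a minimal NumPy sketch of the two ideas it names: a phase spectrogram that retains inter-channel time delays, and a Huber loss that is less sensitive to large errors than MSE. The function names, the Hann window, and the `n_fft`, `hop`, and `delta` values are illustrative assumptions, not the authors' settings.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Naive Hann-windowed STFT; returns a (frames, n_fft // 2 + 1) complex array."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def phase_spectrogram(x, n_fft=512, hop=256):
    """Phase (in radians) of each time-frequency bin of one channel."""
    return np.angle(stft(x, n_fft, hop))


def inter_channel_phase_difference(x_ref, x_other, n_fft=512, hop=256):
    """Wrapped phase of x_other relative to x_ref, per time-frequency bin."""
    cross = stft(x_other, n_fft, hop) * np.conj(stft(x_ref, n_fft, hop))
    return np.angle(cross)


def huber_loss(pred, target, delta=1.0):
    """Quadratic for small errors, linear for large ones (hypothetical delta)."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))


def mse_loss(pred, target):
    """Plain mean squared error, for comparison with huber_loss."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))


# Toy two-channel signal: channel 2 is channel 1 delayed by 3 samples.
n_fft, k0, delay = 512, 16, 3
n = np.arange(4096)
ch1 = np.cos(2 * np.pi * k0 * n / n_fft)
ch2 = np.cos(2 * np.pi * k0 * (n - delay) / n_fft)

# At bin k0 the phase difference is about -2*pi*k0*delay/n_fft,
# so the delay can be read back from the phase alone.
ipd = inter_channel_phase_difference(ch1, ch2, n_fft=n_fft)
estimated_delay = -ipd[:, k0] * n_fft / (2 * np.pi * k0)
```

In this toy case `estimated_delay` stays close to 3 samples in every frame, which is the kind of inter-channel timing information a magnitude-only Mel energy feature discards. For the loss, `huber_loss([0, 0], [0.5, 3.0])` grows linearly with the large error, while `mse_loss` on the same inputs grows quadratically.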