0% Complete
English
صفحه اصلی
/
شانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
A Deep Learning Framework for Phase-Aware Feature Representation to Improve Sound Source Direction and Distance Estimation
نویسندگان :
Zahra Abolfazli
1
Hamid Reza Abutalebi
2
1- Yazd University
2- Yazd University
کلمات کلیدی :
(Sound Event Localization and Detection (SELD،phase spectrogram features،Conformer
چکیده :
This paper proposes a novel refinement of the network’s input features to improve distance and Direction-of-Arrival estimation in the sound event localization and detection system. Instead of relying on Mel energies, we propose using phase spectrograms as input feature, which effectively preserve inter-channel time delays and capture crucial wave propagation characteristics. Furthermore, we introduce architectural improvements for increased robustness. Specifically, Huber loss replaces MSE, reducing sensitivity to noise. Additionally, MHSA layers are replaced with Conformer blocks to better model both long-range dependencies and local interactions within the audio data. Our experimental results validate the effectiveness of the proposed phase-based feature representation and optimized architecture, demonstrating improvements in both DOA and distance estimation.
لیست مقالات
لیست مقالات بایگانی شده
Enhancing Software Effort Estimation with an Integrated Approach of Particle Swarm Optimization and Genetic Algorithms in Analogy-based Method
Ehsan Nasr - Keyvan Mohebbi
Web Service Ranking based on QoS and Use Prefer
Seyed Hossein Siadat - Danial Ramezani - Fatemeh Ahani
Information Technology Risk Management Model for Remote Control Vehicles
Hamid Reza Naji - Aref Ayati
بهبود دقت و کارایی در شبکههای عصبی کانولوشنی با استفاده از روشهای محاسبات تقریبی
محمدرضا رفیعی نژاد - محمدرضا بینش مروستی - سید امیر اصغری
KGLM-QA: A Novel Approach for Knowledge Graph-Enhanced Large Language Models for Question Answering
Alireza Akhavan safaei - Pegah Saboori - Reza Ramezani - Mohammadali Nematbakhsh
An ESB-based Architecture for Authentication as a Service Through Enterprise Application Integration
Masoumeh Hashemi - Mehdi Sakhaei-nia - Morteza Yousef Sanati
Targeted Vaccination for COVID-19 Using Mobile Communication Networks
Mohammadmohsen Jadidi - Pegah Moslemi - Saeed Jamshidiha - Iman Masroori - Abbas Mohammadi - Vahid Pourahmadi
A clonal selection mechanism for load balancing in the cloud computing system
Melika Mosayyebi - Reza Azmi
Automatic Analysis of Inconsistencies in Inter-Enterprise Business Processes: Introducing a Formal Adaptation Patterns Catalog
Somayeh Ashourian - Shohreh َAjoudanian
بررسی روش یادگیری انتقالی جهت پیشبینی پیوند
علی روحانی فر - کمال میرزایی بدرآبادی
بیشتر
ثمین همایش، سامانه مدیریت کنفرانس ها و جشنواره ها - نگارش 43.8.0