14th International Conference on Information and Knowledge Technology
Enhancing Supervised Learning in Speech Emotion Recognition through Unsupervised Representations
Authors:
Niloufar Faridani (1), Amirali Soltani Tehrani (2), Ramin Toosi (3)
1- School of Electrical and Computer Engineering, University of Tehran
2- School of Electrical and Computer Engineering, University of Tehran
3- School of Electrical and Computer Engineering, University of Tehran
Keywords:
Speech Emotion Recognition, Self-supervised Learning, Convolutional Neural Network
Abstract:
Speech Emotion Recognition (SER) enhances human-computer interaction by giving systems a deeper understanding of users' emotional states, which supports more empathetic and effective communication across applications. This study proposes an approach that combines self-supervised feature extraction with supervised classification to recognize emotion from short audio segments. In the preprocessing step, a self-supervised feature extractor based on the Wav2Vec model captures acoustic features from the audio, eliminating the need for hand-crafted audio features. The resulting feature maps are then fed to a custom-designed Convolutional Neural Network (CNN)-based model that performs emotion classification. On the ShEMO dataset, the proposed method surpasses two baseline methods: a support vector machine classifier and transfer learning with a pre-trained CNN. A comparison with state-of-the-art SER techniques also indicates the superiority of the proposed method. These findings underscore the role of deep unsupervised feature learning in advancing SER and improving emotional understanding in human-computer interaction.
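To make the data flow in the abstract concrete, the following minimal sketch works out the size of the feature map that a Wav2Vec-style extractor would hand to the CNN for one audio segment. Everything numeric here is an assumption rather than a detail from the paper: the abstract does not say which Wav2Vec variant, sampling rate, or segment length was used, so the sketch assumes the wav2vec 2.0 base convolutional encoder layout, 16 kHz audio, a 768-dimensional output, and a hypothetical 1-second segment. The function names are illustrative only.

```python
# Assumed wav2vec 2.0 base feature-encoder layout: (kernel, stride) per
# 1-D convolution layer. The paper's actual extractor may differ.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]
SAMPLE_RATE = 16_000  # Hz (assumption)
HIDDEN_DIM = 768      # per-frame feature width of the base model (assumption)


def split_into_segments(num_samples, segment_seconds, sample_rate=SAMPLE_RATE):
    """Return (start, end) sample indices of non-overlapping, full-length
    segments; a trailing remainder shorter than one segment is dropped."""
    seg_len = int(segment_seconds * sample_rate)
    return [(start, start + seg_len)
            for start in range(0, num_samples - seg_len + 1, seg_len)]


def encoder_frames(num_samples, layers=CONV_LAYERS):
    """Number of output frames of an unpadded 1-D convolution stack."""
    length = num_samples
    for kernel, stride in layers:
        length = (length - kernel) // stride + 1
    return length


def feature_map_shape(segment_seconds, sample_rate=SAMPLE_RATE):
    """(frames, channels) of the feature map passed to the CNN classifier."""
    return (encoder_frames(int(segment_seconds * sample_rate)), HIDDEN_DIM)


if __name__ == "__main__":
    utterance_samples = int(3.5 * SAMPLE_RATE)        # a 3.5 s utterance
    segments = split_into_segments(utterance_samples, 1.0)
    print(len(segments), "segments, each mapped to", feature_map_shape(1.0))
```

Under these assumptions, a 3.5-second utterance yields three 1-second segments, and each segment becomes a 49 × 768 feature map that a CNN classifier could treat like a small single-channel image.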