14th International Conference on Information and Knowledge Technology
Enhancing Supervised Learning in Speech Emotion Recognition through Unsupervised Representations
Authors:
Niloufar Faridani (1), Amirali Soltani Tehrani (2), Ramin Toosi (3)
1- School of Electrical and Computer Engineering, University of Tehran
2- School of Electrical and Computer Engineering, University of Tehran
3- School of Electrical and Computer Engineering, University of Tehran
Keywords:
Speech Emotion Recognition, Self-supervised Learning, Convolutional Neural Network
Abstract:
Speech Emotion Recognition (SER) plays a key role in human-computer interaction: by inferring a speaker's emotional state, it supports more empathetic and effective communication across many applications. This study proposes an approach that combines self-supervised feature extraction with supervised classification to recognize emotion from short audio segments. In the preprocessing step, a self-supervised feature extractor based on the Wav2Vec model captures acoustic features from the audio, which removes the need for handcrafted audio features. The resulting feature maps are then fed to a custom-designed Convolutional Neural Network (CNN)-based model that performs emotion classification. Evaluated on the ShEMO dataset, the proposed method surpasses two baseline methods: a support vector machine classifier and transfer learning with a pre-trained CNN. A comparison with state-of-the-art SER techniques also indicates that the proposed method performs better. These findings underscore the value of deep unsupervised feature learning for SER and for emotional understanding in human-computer interaction.
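The data flow described in the abstract (a feature map from a self-supervised extractor, followed by a convolutional classifier) can be illustrated with a small sketch. This is not the authors' model: the abstract does not give the layer configuration, so the single convolution layer, the pooling step, all sizes, the class count, and every function name below are hypothetical. The Wav2Vec extractor is replaced by random numbers so that the example runs with the Python standard library alone, and the weights are untrained.

```python
import math
import random


def fake_wav2vec_features(num_frames, dim, seed=0):
    """Stand-in for a Wav2Vec feature map (num_frames x dim).
    Assumption: a real extractor would produce this from raw audio."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]


def init_params(dim, channels, kernel_size, num_classes, seed=1):
    """Random, untrained weights; all sizes are placeholders."""
    rng = random.Random(seed)

    def rand():
        return rng.uniform(-0.1, 0.1)

    return {
        "kernels": [[[rand() for _ in range(dim)] for _ in range(kernel_size)]
                    for _ in range(channels)],
        "conv_bias": [0.0] * channels,
        "dense": [[rand() for _ in range(channels)] for _ in range(num_classes)],
        "dense_bias": [0.0] * num_classes,
    }


def conv1d_relu(features, kernels, bias):
    """Valid 1D convolution over the time axis, followed by ReLU.
    features: T x D, kernels: C x K x D, result: (T - K + 1) x C."""
    k = len(kernels[0])
    out = []
    for t in range(len(features) - k + 1):
        row = []
        for c, kernel in enumerate(kernels):
            s = bias[c]
            for i in range(k):
                s += sum(w * x for w, x in zip(kernel[i], features[t + i]))
            row.append(max(0.0, s))
        out.append(row)
    return out


def global_avg_pool(fmap):
    """Average each channel over time, giving one vector per segment."""
    n = len(fmap)
    return [sum(row[c] for row in fmap) / n for c in range(len(fmap[0]))]


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, params):
    """Feature map -> convolution -> pooling -> dense layer -> probabilities."""
    hidden = conv1d_relu(features, params["kernels"], params["conv_bias"])
    pooled = global_avg_pool(hidden)
    logits = [sum(w * h for w, h in zip(row, pooled)) + b
              for row, b in zip(params["dense"], params["dense_bias"])]
    return softmax(logits)


NUM_CLASSES = 6  # placeholder class count, not taken from the abstract
features = fake_wav2vec_features(num_frames=20, dim=8)
params = init_params(dim=8, channels=4, kernel_size=3, num_classes=NUM_CLASSES)
probs = classify(features, params)
print(len(probs), round(sum(probs), 6))
```

In a real system the random feature map would be replaced by the output of a pre-trained Wav2Vec model, and the classifier weights would be learned from labeled segments. The sketch only shows how a time-by-feature map is reduced to one probability per emotion class.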