0% Complete
فارسی
Home
/
چهاردهمین کنفرانس بین المللی فناوری اطلاعات و دانش
Enhancing Supervised Learning in Speech Emotion Recognition through Unsupervised Representations
Authors :
Niloufar Faridani
1
Amirali Soltani Tehrani
2
Ramin Toosi
3
1- دانشکده برق و کامپیوتر دانشگاه تهران
2- دانشکده برق و کامپیوتر دانشگاه تهران
3- دانشکده برق و کامپیوتر دانشگاه تهران
Keywords :
Speech Emotion Recognition،Self-supervised Learning،Convolutional Neural Network
Abstract :
Speech Emotion Recognition (SER) is pivotal in enhancing human-computer interaction by enabling a deeper understanding of emotional states across various applications, contributing to more empathetic and effective communication. This study proposes an innovative approach integrating self-supervised feature extraction with supervised classification for emotion recognition from small audio segments. In the preprocessing step, to eliminate the need to craft audio features, we employed a self-supervised feature extractor based on the Wav2Vec model to capture acoustic features from audio data. Then, the output feature maps of the preprocessing step are fed to a custom-designed Convolutional Neural Network (CNN)–-based model to perform emotion classification. Utilizing the ShEMO dataset as our testing ground, the proposed method surpasses two baseline methods, i.e., support vector machine classifier and transfer learning of a pre-trained CNN. Comparing the proposed method to the state-of-the-art techniques in the SER task indicates the superiority of the proposed method. Our findings underscore the pivotal role of deep unsupervised feature learning in elevating the landscape of SER, offering enhanced emotional comprehension in the realm of human-computer interactions.
Papers List
List of archived papers
بررسی کارآمدی فناوری وب 0.2 در پشتیبانی از فرآیندهای انسان محور و دانش مبنا
سید احسان ملیحی - فاطمه مشایخی کردکلا
Simulanteus Load Balancing of Servers and Controllers in SDN-based IoMT
Somaye Imanpour - Ahmadreza Montazerolghaem - Saeed Afahari
Optimal selection of seed nodes by reducing the influence of common nodes in the influence maximization problem
Farzaneh Kazemzadeh - Ali Asghar Safaei - Mitra Mirzarezaee
A Comparison between Slimed Network and Pruned Network for Head Pose Estimation
Amir Salimiparsa - Hadi Veisi - Mohammad-shahram Moin
PersianRAG A Retrieval Augmented Generation System for Persian Language
Hossein Hosseini - Mohammad Sobhan Zare - Amir Hossein Mohammadi - Arefeh Kazemi - Zahra Zojaji - Mohammad Ali Nematbakhsh
Listening with Precision: ASR-Guided Method and Fusion Strategy for Text-Dependent Speaker Verification
Mohammad Reza Molavi - Reza Khodadadi - Hossein Zeinali
تبیین ضرورت وجودی حکمرانی و تجزیه و تحلیل داده در سازمان با تاکید بر چرخه فناوری گارتنر
پیمان گرجی - سید محمدباقر جعفری
Automatic Analysis of Inconsistencies in Inter-Enterprise Business Processes: Introducing a Formal Adaptation Patterns Catalog
Somayeh Ashourian - Shohreh َAjoudanian
تشخیص زودهنگام سندروم داون از روی تصاویر سونوگرافی جنین با استفاده از مدلهای عمیق پیشآموزش دیده
فائزه سادات حسینی نیا - محرم منصوری زاده - حسن ختنلو
Integrating Wasserstein GANs for High-Speed Transformer-Based Neural Machine Translation
Parisa Nekoogol - Mostafa Salehi
more
Samin Hamayesh - Version 42.5.2