16th International Conference on Information and Knowledge Technology
Enhancing Persian Speech Emotion Recognition with Contrastive Learning and Multimodal Fusion
Authors:
Mobina Esmaeili (1), Vajiheh Sabeti (2)
1 - Alzahra University
2 - Alzahra University
Keywords:
Multimodal Emotion Recognition, Representation Learning, Speech-Text Fusion, ShEMO Dataset
Abstract:
Emotion recognition from both speech and text in low-resource languages such as Persian presents significant challenges due to linguistic complexity and the scarcity of labeled datasets. Conventional multimodal fusion methods often struggle to capture nuanced cross-modal interactions and typically neglect inter-class emotional relationships. To address these limitations, this paper introduces a novel contrastive learning framework that employs pre-trained projection networks to enhance multimodal representations through a combination of intra-modal, inter-modal, and semi-contrastive objectives. The refined embeddings are integrated via a lightweight fusion layer for final emotion classification. In addition, an automatic speech recognition (ASR) system is incorporated to enrich textual inputs and improve linguistic diversity. Experiments on the ShEMO corpus demonstrate that the proposed approach achieves an accuracy of 83.04% and an unweighted average recall (UAR) of 88.1%, substantially outperforming traditional fusion-based baselines. These results confirm the effectiveness of the framework in improving cross-modal alignment and representation quality, highlighting its potential for intelligent interactive systems, social media sentiment analysis, and automated affective computing applications.
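The abstract reports both accuracy and unweighted average recall (UAR). As a minimal sketch of how the two metrics differ, the following computes each on a small set of toy labels. The function names and the label lists are hypothetical and are not taken from the paper or from ShEMO; the only assumption about the metric itself is the standard definition of UAR as the mean of per-class recalls.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally
    regardless of how many samples it has."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # total samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Toy, imbalanced labels (hypothetical; not ShEMO data).
y_true = ["neutral"] * 6 + ["anger"] * 3 + ["sadness"] * 1
y_pred = ["neutral"] * 4 + ["anger"] * 2 + ["anger"] * 3 + ["sadness"] * 1

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"UAR      = {unweighted_average_recall(y_true, y_pred):.4f}")
```

In this toy case all errors fall in the majority class, so UAR comes out higher than accuracy. That is a general property of the two metrics on imbalanced data and is not a claim about how the paper's reported figures arose.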