0% Complete
فارسی
Home
/
شانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
Enhancing Persian Speech Emotion Recognition with Contrastive Learning and Multimodal Fusion
Authors :
Mobina Esmaeili
1
Vajiheh Sabeti
2
1- دانشگاه الزهرا(س)
2- دانشگاه الزهرا(س)
Keywords :
Multimodal Emotion Recognitiont،Representation Learning،Representation Learning،Speech-Text Fusion،ShEMO Dataset
Abstract :
Emotion recognition from both speech and text in low-resource languages such as Persian presents significant challenges due to linguistic complexity and the scarcity of labeled datasets. Conventional multimodal fusion methods often struggle to capture nuanced cross-modal interactions and typically neglect inter-class emotional relationships. To address these limitations, this paper introduces a novel contrastive learning framework that employs pre-trained projection networks to enhance multimodal representations through a combination of intra-modal, inter-modal, and semi-contrastive objectives. The refined embeddings are integrated via a lightweight fusion layer for final emotion classification. In addition, an automatic speech recognition (ASR) system is incorporated to enrich textual inputs and improve linguistic diversity. Experiments on the ShEMO corpus demonstrate that the proposed approach achieves an accuracy of 83.04% and an unweighted average recall (UAR) of 88.1%, substantially outperforming traditional fusion-based baselines. These results confirm the effectiveness of the framework in improving cross-modal alignment and representation quality, highlighting its potential for intelligent interactive systems, social media sentiment analysis, and automated affective computing applications.
Papers List
List of archived papers
Classical-Quantum Multiple Access Wiretap Channel with Common Message: One-shot Rate Region
Hadi Aghaee - Dr Bahareh Akhbari
Two Novel Designs of Efficient Single-Bit Comparators in QCA Technology with Ultra-Low Energy Dissipation
Shobeir Fayazi - Hatam Abdoli
بررسی امنیت وفقی در اینترنت وسایل نقلیه
سیده یگانه غیور باغبانی - دکتر سعید جلیلی سیده یگانه غیور باغبانی - سعید جلیلی -
Aligning the Brick and Mortar cosmetic with digital transformation as the right way to overhaul the In-store Experience
Mehrgan Malekpour - Dr Federica Caboni
Handling Data Heterogeneity in Federated Medical Images Classification
Alireza Maleki - Hassan Khotanlou
A Hybrid Method to Reduce the Voltage Consumption in the Spiking Neural Networks
Shaghayegh Mehdizadeh saraj - Seyyed Amir Asghari - Mohammadreza Binesh Marvasti
کنترل کیفیت پیش_بینانه آمیزه_های لاستیکی مدلی یکپارچه بر اساس استاندارد پذیرش متغیرهای ANSI Z1.9 و پایش رئولوژیکی برخط
آکو یاری - فرهاد محمدزاده
Load Balancing in Software-Defined Networks Using Multi-Level Thresholds and Hybrid Switch Migration Strategies
Alireza Karimi - Mohammad yousef Darmani
Knowledge Distillation through a Knowledge Representation Approach (Knowledge Engineering)
Mohammad Hadi Safari Nader
Open-domain question classification and completion in conversational information search
Omid Mohammadi Kia - Mahmood Neshati - Mahsa Soudi Alamdari
more
Samin Hamayesh - Version 42.5.2