0% Complete
English
صفحه اصلی
/
شانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
Enhancing Persian Speech Emotion Recognition with Contrastive Learning and Multimodal Fusion
نویسندگان :
Mobina Esmaeili
1
Vajiheh Sabeti
2
1- دانشگاه الزهرا(س)
2- دانشگاه الزهرا(س)
کلمات کلیدی :
Multimodal Emotion Recognitiont،Representation Learning،Representation Learning،Speech-Text Fusion،ShEMO Dataset
چکیده :
Emotion recognition from both speech and text in low-resource languages such as Persian presents significant challenges due to linguistic complexity and the scarcity of labeled datasets. Conventional multimodal fusion methods often struggle to capture nuanced cross-modal interactions and typically neglect inter-class emotional relationships. To address these limitations, this paper introduces a novel contrastive learning framework that employs pre-trained projection networks to enhance multimodal representations through a combination of intra-modal, inter-modal, and semi-contrastive objectives. The refined embeddings are integrated via a lightweight fusion layer for final emotion classification. In addition, an automatic speech recognition (ASR) system is incorporated to enrich textual inputs and improve linguistic diversity. Experiments on the ShEMO corpus demonstrate that the proposed approach achieves an accuracy of 83.04% and an unweighted average recall (UAR) of 88.1%, substantially outperforming traditional fusion-based baselines. These results confirm the effectiveness of the framework in improving cross-modal alignment and representation quality, highlighting its potential for intelligent interactive systems, social media sentiment analysis, and automated affective computing applications.
لیست مقالات
لیست مقالات بایگانی شده
پیشبینی میزان بقای بیماران مبتلا به سرطان ریه با استفاده از ترکیب کارآمد روشهای دادهکاوی و بهینهسازی رقابت استعماری
رخشان رمضانی سرچشمه - مهدی هاشمزاده - امین گلزاری اسکوئی
A Topic Based Method to Classify the Question Clarity in CQA Networks
Alireza Khabbazan - Dr Ahmad Ali Abin
Information Technology Risk Management Model for Remote Control Vehicles
Hamid Reza Naji - Aref Ayati
تشخیص و جلوگیری از حمله انعکاسی/تقویتی SSDP در شبکه های نرم افزار محور مبتنی بر 4P با استفاده از الگوریتم های یادگیری ماشین
امیرحسین کرمی - رضا محمدی
Two Novel Designs of Efficient Single-Bit Comparators in QCA Technology with Ultra-Low Energy Dissipation
Shobeir Fayazi - Hatam Abdoli
یک رویکرد سریع تحلیل و شناسایی آسیب پذیری Next-Intent در برنامه های کاربردی اندروید
زهرا کلوندی - دکتر مهدی سخائی نیا زهرا کلوندی - مهدی سخائی نیا -
PC-MCLD: Pose-Constrained and Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
Hanieh Fazli - Reza Azmi
Generalized Self-Attentive Spatiotemporal GCN with OPTICS Clustering for Recommendation Systems
Saba Zolfaghari - Seyed Mohammad Hossein Hasheminejad
Spatial On–Off Keying Modulation with Mirror-Array Optical IRSs for Indoor Machine-to-Machine Visible Light Communication
Babak Sadeghi - Seyed Mohammad Sajad Sadough
Using Deconvolutional Variational Autoencoder for Answer Selection in Community Question Answering
Golshan Afzali Boroujeni - Heshaam Faili
بیشتر
ثمین همایش، سامانه مدیریت کنفرانس ها و جشنواره ها - نگارش 43.8.0