16th International Conference on Information and Knowledge Technology
Prompt-Based Composed Fashion Image Retrieval via Gated Detail-Enhanced Dual Cross-Attention Difference Modeling
Authors:
Kosar Keshavarz (1), Reza Azmi (2)
1- Alzahra University
2- Alzahra University
Keywords:
Composed image retrieval, Composed query, Contrastive learning, Fashion retrieval, Multimodal retrieval, Text-guided image retrieval
Abstract:
With the rapid growth of online shopping and the vast amount of fashion-related visual content on the internet, accurate methods for fashion image retrieval have become increasingly important to enhance user satisfaction. The fashion domain is inherently fine-grained, characterized by subtle details such as color, pattern, cut, and embellishments, where even small variations lead to distinct styles. To address the limitations of purely text-based or image-based queries, we adopt a text-guided retrieval approach in which a reference image and a natural-language description jointly define the user’s intent. This paper extends sentence-level prompt-based retrieval frameworks by introducing explicit image-difference modeling. The proposed Gated Detail-Enhanced Dual Cross-Attention (GDD-CA) module models the relationship between reference and target images through dual cross-attention and a gated detail-enhancement mechanism, enabling the network to capture subtle, fine-grained visual variations. Experimental results on the Fashion-IQ dataset demonstrate that integrating detail-enhanced image-difference modeling into the prompt-based structure improves retrieval performance, achieving a 1.14% gain in Recall over previous methods.
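The abstract reports its gain in terms of Recall. As a rough illustration of how such a metric is typically computed for composed retrieval, the sketch below ranks gallery items by cosine similarity to a composed query embedding and counts how often the true target lands in the top K. This is not the authors' evaluation code: the function names, the toy three-dimensional embeddings, and the item identifiers are all hypothetical, and the composed query vectors stand in for whatever the reference image and text encoder would actually produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(query_embs, target_ids, gallery, k):
    """Fraction of queries whose true target is among the k most similar gallery items."""
    hits = 0
    for query, target in zip(query_embs, target_ids):
        # Rank gallery ids from most to least similar to the composed query.
        ranked = sorted(gallery, key=lambda gid: cosine(query, gallery[gid]), reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(query_embs)


# Hypothetical toy gallery: item id -> embedding (assumed values, for illustration only).
gallery = {
    "dress_red": [1.0, 0.0, 0.0],
    "dress_blue": [0.0, 1.0, 0.0],
    "shirt_striped": [0.0, 0.0, 1.0],
}

# Hypothetical composed-query embeddings (reference image + text) and their true targets.
queries = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9], [0.6, 0.7, 0.0]]
targets = ["dress_red", "shirt_striped", "dress_red"]

r1 = recall_at_k(queries, targets, gallery, k=1)
r2 = recall_at_k(queries, targets, gallery, k=2)
```

In this toy setup the third query ranks its target second, so it counts as a miss at K=1 and a hit at K=2. Results on Fashion-IQ are commonly reported at larger cutoffs such as K=10 and K=50; the abstract itself does not state which cutoffs its 1.14% figure refers to.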