0% Complete
English
صفحه اصلی
/
شانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
Prompt-Based Composed Fashion Image Retrieval via Gated Detail-Enhanced Dual Cross-Attention Difference Modeling
نویسندگان :
Kosar Keshavarz
1
Reza Azmi
2
1- دانشگاه الزهرا(س)
2- دانشگاه الزهرا(س)
کلمات کلیدی :
Composed image retrieval،Composed query،Contrastive learning،Fashion retrieval،Multimodal retrieval،Text-guided image retrieval
چکیده :
With the rapid growth of online shopping and the vast amount of fashion-related visual content on the internet, accurate methods for fashion image retrieval have become increasingly important to enhance user satisfaction. The fashion domain is inherently fine-grained, characterized by subtle details such as color, pattern, cut, and embellishments, where even small variations lead to distinct styles. To address the limitations of purely text-based or image-based queries, we adopt a text-guided retrieval approach in which a reference image and a natural-language description jointly define the user’s intent. This paper extends sentence-level prompt-based retrieval frameworks by introducing explicit image-difference modeling. The proposed Gated Detail-Enhanced Dual Cross-Attention (GDD-CA) module models the relationship between reference and target images through dual cross-attention and a gated detail-enhancement mechanism, enabling the network to capture subtle, fine-grained visual variations. Experimental results on the Fashion-IQ dataset demonstrate that integrating detail-enhanced image-difference modeling into the prompt-based structure improves retrieval performance, achieving a 1.14% gain in Recall over previous methods.
لیست مقالات
لیست مقالات بایگانی شده
بهبود هزینههای تراکنش در معماری مدیریت زنجیرهی تامین مبتنی بر زنجیرهی بلوکی
مژگان نوروزی نژاد - دکتر زهرا موحدی مژگان نوروزی نژاد - زهرا موحدی -
A Blockchain-Based Smart Contract Framework for Peer-to-Peer Energy Trading in Smart Grids
Hossein Shahinzadeh - Farshad Ebrahimi - S. Mohammadali Zanjani - Amirafshin Zamani - Saiedeh Mehrabani-Najafabadi - Gevork B. Gharehpetian
Predictive Maintenance using LSTM and Adaptive Windowing
Aien Ghanbari Adivi - Behrouz Shahgholi Ghahfarokhi
IoT-Driven Water Quality Management System using Deep Q-Network
Shakiba Rajabi - Komeil Moghaddasi
Embedding-Consistent Contrastive Learning: A Robust Approach for Imbalanced Classification
Sobhan Siamak - Eghbal Mansoori
Revolutionizing Credit Scoring: The Synergy of Mamba State Space and CNN Models
Behnam Sabzalian
Leveraging Retrieval-Augmented Generation for Persian University Knowledge Retrieval
Arshia Hemmat - Mohammad Hassan Heydari - Kianoosh Vadaei - Afsaneh Fatemi
Violence detection using one-dimensional convolutional networks
Narges Honarjoo - Ali Abdari - Dr Azadeh Mansouri
Using Deconvolutional Variational Autoencoder for Answer Selection in Community Question Answering
Golshan Afzali Boroujeni - Heshaam Faili
An Efficient Link Prediction Method using Community Structures
Dr Hadi Shakibian - Setareh Mokhtari
بیشتر
ثمین همایش، سامانه مدیریت کنفرانس ها و جشنواره ها - نگارش 43.8.0