16th International Conference on Information and Knowledge Technology
An LLM-Based Approach for Clarifying the Decisions of Vision Models in Autonomous Vehicles
Authors :
Omid Mosalmani (1)
Mohammad Javad Rashti (2)
Seyed Enayat Alavi (3)
1- Shahid Chamran University of Ahvaz
2- Shahid Chamran University of Ahvaz
3- Shahid Chamran University of Ahvaz
Keywords :
Explainable AI, Prompt Engineering, Large Language Models, Autonomous Vehicles, Textual Explanation
Abstract :
As autonomous vehicles become more widely used, the transparency and explainability of their decisions have become crucial for gaining user trust and enhancing road safety. Current textual explanation methods rely on limited datasets, which leads to repetitive and superficial explanations. This research presents a hybrid system in which the ADAPT decision-making model predicts driving actions and its attention maps serve as the interface between the visual data and the explanation module. Large language models from the Gemini and GPT families then receive the final decision, the attention map, and a carefully designed prompt, and generate concise, understandable textual explanations. The primary innovation of this approach lies in combining the decision-making model with LLMs, whose extensive knowledge extends beyond the constraints of the training data and enables more precise and diverse explanations. The system is evaluated on the BDD-X dataset and scored with standard captioning metrics, including BLEU-4, METEOR, ROUGE-L, CIDEr-D, and SPICE. The results indicate that our system's explanations outperform those of the baseline ADAPT, particularly in multi-reference scenarios, and are more fluent and contextually rich. For instance, the output of the Gemini 2.5 Pro model achieves a METEOR score of approximately 19.45, a relative improvement of about 28 percent over ADAPT's 15.2. Furthermore, supplementary experiments show that using a contour representation of the attention map and fine-tuning the models lead to greater visual-textual consistency and more stable results. In summary, by linking the visual attention of the decision-making model to the linguistic capabilities of LLMs, this research takes a step toward more explainable and trustworthy autonomous vehicles.
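The pipeline described in the abstract (predicted action plus attention map plus prompt, passed to an LLM) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the helper names `attention_contour`, `build_prompt`, and `relative_improvement` are hypothetical, the quantile threshold is an arbitrary choice, and the prompt wording is invented. The contour is summarized here as a coarse text region, whereas the described system supplies the attention map itself to the LLM. No LLM is called. The last helper only reproduces the arithmetic behind the reported figure of about 28 percent.

```python
import numpy as np


def attention_contour(attn: np.ndarray, quantile: float = 0.8) -> np.ndarray:
    """Boolean mask marking the boundary of the high-attention region.

    Hypothetical helper: thresholding at a quantile of the map is an
    assumption, not the authors' procedure.
    """
    mask = attn >= np.quantile(attn, quantile)
    # Pad with False so cells on the image border count as boundary cells.
    padded = np.pad(mask, 1, constant_values=False)
    # A cell is interior if all four of its neighbours are also in the mask.
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    return mask & ~interior


def build_prompt(action: str, contour: np.ndarray) -> str:
    """Assemble an illustrative prompt from the action and the contour location."""
    rows, cols = np.nonzero(contour)
    if rows.size == 0:
        region = "no salient region"
    else:
        height, width = contour.shape
        # Map the contour centroid to a coarse 3x3 grid label.
        vert = ("top", "middle", "bottom")[min(int(3 * rows.mean() / height), 2)]
        horiz = ("left", "center", "right")[min(int(3 * cols.mean() / width), 2)]
        region = f"{vert}-{horiz}"
    return (
        f"The driving model chose the action: {action}.\n"
        f"Its visual attention is concentrated in the {region} of the frame.\n"
        "Explain this decision in one concise sentence."
    )


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Toy 5x5 attention map with a bright 3x3 block in the middle.
attn = np.zeros((5, 5))
attn[1:4, 1:4] = 1.0
prompt = build_prompt("the car slows down", attention_contour(attn))

# METEOR scores quoted in the abstract: 19.45 (Gemini 2.5 Pro) vs 15.2 (ADAPT).
print(round(relative_improvement(19.45, 15.2), 2))  # 27.96
```

The contour keeps only the outline of the attended region, so the prompt describes where the model looked without passing a dense heat map. The relative gain (19.45 - 15.2) / 15.2 is about 0.28, which matches the 28 percent quoted above.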