0% Complete
فارسی
Home
/
پانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
Leveraging Retrieval-Augmented Generation for Persian University Knowledge Retrieval
Authors :
Arshia Hemmat
1
Mohammad Hassan Heydari
2
Kianoosh Vadaei
3
Afsaneh Fatemi
4
1- University of Isfahan
2- University of Isfahan
3- University of Isfahan
4- University of Isfahan
Keywords :
Large Language Models،Natural Language Processing،Retrieval Augmented Generation،Dataset Generation،QuestionAnswering System
Abstract :
This paper introduces an innovative approach using Retrieval-Augmented Generation (RAG) pipelines with Large Language Models (LLMs) to enhance information retrieval and query response systems for university-related question answering. By systematically extracting data from the university's official website, primarily in Persian, and employing advanced prompt engineering techniques, we generate accurate and contextually relevant responses to user queries. We developed a comprehensive university benchmark, UniversityQuestionBench (UQB), to rigorously evaluate our system’s performance. UQB focuses on Persian-language data, assessing accuracy and reliability through various metrics and real-world scenarios. Our experimental results demonstrate significant improvements in the precision and relevance of generated responses, enhancing user experiences, and reducing the time required to obtain relevant answers. In summary, this paper presents a novel application of RAG pipelines and LLMs for Persian-language data retrieval, supported by a meticulously prepared university benchmark, offering valuable insights into advanced AI techniques for academic data retrieval and setting the stage for future research in this domain.\footnote{Dataset is publicly available at \url{https://huggingface.co/datasets/UIAIC/UQB}}
Papers List
List of archived papers
Binary water stream algorithm: a new meta-heuristic optimization technique
Faezeh Rahimi Sebdani - Mehdi Nasri
Inner and Outer Bearing Fault Diagnosis of electrical Motors Using a Proposed Algorithm and Vibration Signals
Vahid Safari Dehnavi - Masoud Shafiee
Targeted Vaccination for COVID-19 Using Mobile Communication Networks
Mohammadmohsen Jadidi - Pegah Moslemi - Saeed Jamshidiha - Iman Masroori - Abbas Mohammadi - Vahid Pourahmadi
جمعآوری، تحلیل و خلاصه سازی نظرات کاربران فارسی زبان در شبکههای اجتماعی پیرامون بیماری فراگیر کووید-19
محمدرضا شمس - محمد یاسین فخار محمدرضا شمس - محمد یاسین فخار -
A Novel Decentralized Privacy Preserving Federated Learning Model for Healthcare Applications
Saba Ameri - Reza Ebrahimi Atani
Integration of Electric Vehicles in Smart Grid using Deep Reinforcement Learning
Farkhondeh Kiaee
Aspect-Based Sentiment Analysis of After-Sales Service Quality: A Case Study of Snowa and Competitors Using Digikala Reviews
Safiyeh Samadanian - Marjan Kaedi
A perceptual loss for screen content image super-resolution
Hossein Sekhavaty-Moghadam - Marzieh Hosseinkhani - Dr Azadeh Mansouri
تخلیهبار محاسباتی ریزدانه تحرکآگاه در رایانش لبه برای اینترنت اشیاء
شکوفه نوروزی - دکتر زینب موحدی شکوفه نوروزی - زینب موحدی -
An OWA-Powered Dynamic Customer Churn Modeling in the banking industry Based on Customer Behavioral Vectors
Masoud Alizadeh - Mohammad Soleymannejad - Behzad Moshiri
more
Samin Hamayesh - Version 42.3.1