0% Complete
English
صفحه اصلی
/
پانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
GanjNet: Leveraging Network Modeling with Large Language Models for Persian Word Sense Induction
نویسندگان :
Amir Mohammad Kouyeshpour
1
Hadi Veisi
2
Saman Haratizadeh
3
1- دانشگاه تهران ٫ دانشکده علوم و فنون نوین
2- دانشگاه تهران ٫ دانشکده علوم و فنون نوین
3- دانشگاه تهران ٫ دانشکده علوم و فنون نوین
کلمات کلیدی :
Word Sense Induction،Network Modeling،Community Detection،Large Language Models،Persian NLP،Lexical Semantics
چکیده :
Abstract—This paper introduces GanjNet, a novel approach to Word Sense Induction (WSI) in the Persian language that leverages network modeling and community detection in conjunction with large language models (LLMs). We present a method that constructs semantic graphs from lexical substitutes generated by LLMs and applies community detection algorithms to uncover and distinguish word senses in unannotated text. GanjNet addresses challenges such as limited annotated resources, high degrees of polysemy, and context-sensitive meanings in Persian. By leveraging unsupervised techniques, we enhance sense induction without relying on extensive labeled data. Our experiments demonstrate that GanjNet outperforms existing methods on a custom dataset derived from MirasText, achieving a V-measure of 47% and a paired F-score of 58%, compared to the best baseline method with a V-measure of 41% and a paired F-score of 53%. These results showcase the potential of integrating community detection and LLMs for unsupervised semantic tasks in morphologically rich languages like Persian. Moreover, GanjNet’s flexibility offers practical applicability across various domains, including automatic thesaurus and WordNet generation, as well as assisting writers in context-sensitive word choice, demonstrating its broader impact on natural language understanding.
لیست مقالات
لیست مقالات بایگانی شده
AOV-IDS: Arithmetic Optimizer with Voting classifier for Intrusion Detection System
Amir Soltany Mahboob - Mohammad Reza Ostadi Moghaddam - Shima Yousefi
ElectroCNN: Regressive CNN-based Energy Consumption Forecasting Leveraging Weather Data
Dharmi Patel - Mann Patel - Krisha Darji - Rajesh Gupta - Sudeep Tanwar - Jitendra Bhatia - Hossein Shahinzadeh
Business Process Improvement Challenges: A Systematic Literature Review
Hanieh Kashfi - Fereidoon Shams Aliee
Identifying Children's Personality Styles through Drawing Analysis using Machine Learning
Maedeh Mosharraf - Faezeh Banabazi
تشخیص مراحل خواب با کمک جنگل تصادفی و ویژگی های فرکانسی استخراج شده از سیگنال های EEG و EOG
سیدعلی حسینی
تشخیص زودهنگام سندروم داون از روی تصاویر سونوگرافی جنین با استفاده از مدلهای عمیق پیشآموزش دیده
فائزه سادات حسینی نیا - محرم منصوری زاده - حسن ختنلو
ارائه یک رویکرد معنایی مبتنی بر آنتولوژی به منظور شناسایی تاکتیکهای معماری
احسان شریفی - دکتر احمد عبدالله زاده بارفروش
An Optimized GBDT-Based Model Using SMOTE for Effective Diagnosis of Coronary Heart Disease
Elahe Moradi - Mohammad Javadian
ParsEL 1.0: Unsupervised Entity Linking in Persian Social Media Texts
Majid Asgari-bidhendi - Farzane Fakhrian - Dr Behrouz Minaei-bidgoli
IoT-Driven Water Quality Management System using Deep Q-Network
Shakiba Rajabi - Komeil Moghaddasi
بیشتر
ثمین همایش، سامانه مدیریت کنفرانس ها و جشنواره ها - نگارش 43.8.0