15th International Conference on Information and Knowledge Technology

Knowledge Extraction from Technical Reports Based on Large Language Models: An Exploratory Study

Authors:

Parsa Bakhtiari¹, Hassan Bashiri², Alireza Khalilipour³, Masoud Nasiripour⁴, Moharram Challenger⁵

1- Hamedan University of Technology 2- Hamedan University of Technology 3- University of Antwerp 4- Hamedan University of Technology 5- University of Antwerp

Keywords:

Knowledge Extraction, Large Language Model, Fine Tuning

Abstract:

Organizations and companies accumulate vast numbers of documents over the years. These documents contain valuable information and knowledge that can help experts resolve the ambiguities and challenges they face. Information retrieval and knowledge management systems retrieve documents relevant to users’ information needs, which addresses only part of the challenge of extracting knowledge from these collections. With the emergence of generative artificial intelligence and large language models that exhibit strong capabilities in understanding textual documents, knowledge extraction solutions have shifted towards these models. Large language models possess general knowledge acquired during pre-training, and there are various approaches to infusing domain-specific knowledge into this general understanding. This research first examines the possible techniques for fine-tuning a large language model in a specific domain. We then fine-tune the model on a collection of industrial documents and technical reports. Finally, we measure the improvement in the model’s capability to extract domain-specific knowledge.