Subclass Extraction of Taxonomic Relations Using Large Language Models (LLMs) for Quran Translation Texts

Authors

  • Rohana Binti Ismail
  • Mokhairi Bin Makhtar
  • Hasni Binti Hasan

DOI:

https://doi.org/10.22452/quranica.vol17no2.23

Keywords:

Taxonomy, Quranic Studies, Ontology Learning, LLMs

Abstract

The Qur’an is the primary text of Islam and has a complex textual structure, which poses challenges for verse understanding, semantic search, knowledge extraction, and question-answering systems. Ontologies offer a structured way to address these challenges by organizing information into key elements, such as concepts, relations, and object instances, in a formal, machine-readable format. Manual ontology construction, however, is time- and labour-intensive. Ontology Learning (OL) supports the construction process through semi-automatic or automatic methods for extracting ontology elements from text, which reduces reliance on manual work and improves the efficiency of ontology development. Nonetheless, extracting subclasses for taxonomic relations, such as 'is-a' or 'part-of', remains a major constraint, especially in semantically rich texts such as the Qur’an, where these relations are usually implicit and difficult to detect. Recent advances in Artificial Intelligence have led to the widespread application of Large Language Models (LLMs) across various domains, and their capabilities can help automate and accelerate ontology construction. However, their potential in Qur’anic studies has not yet been extensively explored. This study evaluates the extraction of subclass taxonomic relations from the Qur’an, focusing on texts related to the Hajj rituals. Experiments were conducted using GPT-4.0 with the Chain of Thought (CoT) prompting strategy. Performance varied across classes. The model achieved its highest precision for the Location class, with a precision (P) of 0.80 and a recall (R) of 0.57, followed by moderate performance for Living Creation, with a precision (P) of 0.66 and a recall (R) of 0.40. Although no correct subclass taxonomic relations were identified for some classes, the model still generated a total of 39 new subclass taxonomic relations across all classes. In conclusion, LLMs show promising results for extracting taxonomic relations from Qur’anic text for certain classes, and they can also generate subclasses beyond the gold standard.
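
The precision and recall figures reported above can be read as set comparisons between the relations the model extracts and a gold standard. The following is a minimal sketch of that calculation, assuming exact matching of (subclass, class) pairs; the abstract does not state the matching criteria actually used, and the function name `evaluate` and the toy pairs below are hypothetical illustrations, not data from the study.

```python
def evaluate(extracted, gold):
    """Compare extracted (subclass, class) pairs against a gold standard.

    Assumption: a relation counts as correct only on an exact match.
    Returns precision, recall, and the set of relations not in the gold standard.
    """
    extracted, gold = set(extracted), set(gold)
    correct = extracted & gold
    # Precision: share of extracted relations that appear in the gold standard.
    precision = len(correct) / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard relations that were extracted.
    recall = len(correct) / len(gold) if gold else 0.0
    # "New" relations: extracted by the model but absent from the gold standard.
    new_relations = extracted - gold
    return precision, recall, new_relations


# Hypothetical toy data for one class (not taken from the paper).
gold = {
    ("Mina", "Location"),
    ("Arafat", "Location"),
    ("Safa", "Location"),
    ("Marwah", "Location"),
}
extracted = {
    ("Mina", "Location"),
    ("Arafat", "Location"),
    ("Kaaba", "Location"),
}

precision, recall, new_relations = evaluate(extracted, gold)
print(f"P = {precision:.2f}, R = {recall:.2f}")
print("New relations:", sorted(new_relations))
```

In this sketch, relations outside the gold standard lower precision even when they may be valid, which is one reason such "new" relations are usually reviewed separately by a domain expert.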

Published

30-09-2025