METHODS OF CONCEPT EXTRACTION IN LITERARY WORKS
DOI: https://doi.org/10.32782/2307-1222.2025-60-30

Keywords: conceptual analysis, large language models, automated concept extraction, ontological modeling, BERT, natural language processing

Abstract
The article explores both traditional and contemporary approaches to concept extraction in literary texts, with a particular focus on lexical analysis, semantic parsing, ontological modeling, and automated techniques based on large language models. Traditional manual methods are valued for their ability to capture nuanced literary meanings – such as symbolism, metaphor, and intertextual references – while taking into account cultural and stylistic context. However, their reliance on intensive human interpretation makes them difficult to apply to large text corpora and comparative studies. Modern automated approaches, especially those utilizing transformer architectures like BERT, introduce significant advantages in processing speed and scalability. Through mechanisms such as self-attention, these models effectively identify long-range contextual relationships and latent patterns within texts, enabling rapid detection and classification of key concepts across extensive datasets. Yet, the article emphasizes that these systems still face limitations when dealing with the figurative richness and semantic ambiguity inherent in literary discourse. The practical component of the research involves an experimental analysis of the concept of tolerance in articles from major English-language media outlets, including The New York Times, BBC News, and The Guardian. Automated extraction methods demonstrated strong potential for identifying general trends and conceptual usage patterns. Nonetheless, the findings underscore the need for expert interpretation to refine outputs, especially in cases involving subtle semantic shifts or interdisciplinary cultural references. In conclusion, the article proposes an integrative methodological framework that combines automated processing with human expertise. 
Automated tools are recommended for the preliminary structuring and classification of large volumes of textual data, while expert scholars are responsible for interpretative depth and validation. Future research directions include enhancing LLM adaptation for literary texts, building specialized training corpora, and incorporating ontological models to improve conceptual precision and reliability in literary studies.
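As a minimal illustration of the self-attention mechanism the abstract refers to, the sketch below implements scaled dot-product attention over toy token vectors in pure Python. The vectors, their dimensionality, and the example inputs are invented for demonstration and are not drawn from the article's experiments or from any particular BERT implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over toy token vectors.

    Each argument is a list of d-dimensional vectors (lists of floats),
    one per token. Every output is a weighted mix of ALL value vectors,
    which is how a transformer layer captures long-range relationships
    between tokens regardless of their distance in the text.
    """
    d = len(queries[0])
    outputs = []
    for q in queries:
        # Similarity of this token's query to every token's key,
        # scaled by sqrt(d) as in the standard formulation.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of the value vectors.
        context = [sum(w * v[i] for w, v in zip(weights, values))
                   for i in range(d)]
        outputs.append(context)
    return outputs

# Toy example: three "token" embeddings of dimension 2,
# used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts = self_attention(tokens, tokens, tokens)
```

In a real BERT-style model the queries, keys, and values are learned linear projections of contextual embeddings and attention runs over many heads and layers; this sketch only shows the core weighting step that lets automated concept extraction relate distant mentions of a concept within a text.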
References
Aggarwal C.C., Zhai C. A survey of text clustering algorithms. In: Mining Text Data. New York: Springer, 2012. P. 77–128. DOI: https://doi.org/10.1007/978-1-4614-3223-4_4.
Amid Mosul’s ruins, Pope denounces religious fanaticism: Live updates. The New York Times. 2021.
Areshey A.M. Exploring transformer models for sentiment classification: A comparison of BERT, RoBERTa, ALBERT, DistilBERT, and XLNet. Expert Systems. 2024. Vol. 41. P. 1–27. DOI: https://doi.org/10.1111/exsy.13701.
Bartalesi V., Meghini C. Using an ontology for representing the knowledge on literary texts: The Dante Alighieri case study. Semantic Web. 2017. Vol. 8. P. 385–394. DOI: https://doi.org/10.3233/SW-150198.
Benson A. F1 teams face tougher tests on flexi-wings at Chinese GP. BBC News. 2025.
Brewster C. Ontology learning from text: Methods, evaluation and applications. Computational Linguistics. 2006. Vol. 34. P. 569–572.
Bunting M. The problem with tolerance. The Guardian. 2011.
Davydiuk Yu.B. Methodology of conceptual analysis of a literary text. Mova i kultura. 2014. Issue 37, No. 1. P. 289–293. URL: http://nbuv.gov.ua/UJRN/Mik_2014_17_1_51 (in Ukrainian).
Fisak I.V. The category of «concept» in contemporary scholarly discourse. Filolohichni nauky. 2014. Issue 17. P. 69–77. URL: http://nbuv.gov.ua/UJRN/Fil_Nauk_2014_17_12 (in Ukrainian).
Giglou H.B., D’Souza J., Auer S. LLMs4OL: Large language models for ontology learning. In: The Semantic Web – ISWC. Springer, 2023. P. 408–427. DOI: https://doi.org/10.1007/978-3-031-47240-4_22.
Giglou H.B., D’Souza J., Enge F. LLMs4OM: Matching ontologies with large language models. ESWC 2024 Special Track on LLMs for Knowledge Engineering. 2024. P. 23–34. DOI: https://doi.org/10.13140/RG.2.2.10832.42240.
Hlibovska A.A. Rendering the national and cultural features of the verbalization of the concept «tolerance» in contemporary media language in translation from English into Ukrainian. Kyiv, 2020. (in Ukrainian).
Hu Y., Liu D., Wang Q. Automating knowledge discovery from scientific literature via LLMs: A dual-agent approach with progressive ontology prompting. arXiv. 2024. DOI: https://doi.org/10.48550/arXiv.2409.00054.
Lezard N. Treatise on Tolerance by Voltaire review – An attack on fanaticism. The Guardian. 2016.
Liu Y., Ott M., Goyal N. RoBERTa: A robustly optimized BERT pretraining approach. arXiv. 2019. DOI: https://doi.org/10.48550/arXiv.1907.11692.
Mahboub A., Zater M. E., Al-Rfooh B. Evaluation of semantic search and its role in retrieved-augmented-generation (RAG) for Arabic language. arXiv. 2024. DOI: https://doi.org/10.48550/arXiv.2403.18350.
Malik K. Ideas can be tolerated without being respected. The distinction is key. The Guardian. 2020.
Mukanova A., Milosz M., Dauletkaliyeva A. LLM-powered natural language text processing for ontology enrichment. Applied Sciences. 2024. Vol. 14. P. 5860–5875. DOI: https://doi.org/10.3390/app14135860.
Mykhailiuk A., Mykhailiuk O., Pylypchuk O. Building a linguistic ontology based on a structured encyclopedic resource. Radioelektronni i kompiuterni systemy. 2012. Issue 4. P. 81–89. URL: http://nbuv.gov.ua/UJRN/recs_2012_4_14 (in Ukrainian).
Nananukul N., Kejriwal M. HALO: An ontology for representing and categorizing hallucinations in large language models. In: SPIE Defense + Commercial Sensing (DCS 2024). 2024. P. 1–15. DOI: https://doi.org/10.1117/12.3014048.
Nixon R. «Zero tolerance» immigration policy surprised agencies, report finds. The New York Times. 2018.
Onishi N., Meheut C. In France’s military, Muslims find a tolerance that is elusive elsewhere. The New York Times. 2021.
Riley A. Resurrection plants: The drought-resistant «zombie plants» that come back from the dead. BBC News. 2025.
Sacked Bradburn fined for discriminatory comments. BBC News. 2025.
To H.Q., Liu M. Towards efficient large language models for scientific text: A review. arXiv. 2024. DOI: https://doi.org/10.48550/arXiv.2408.10729.
Toro S., Anagnostopoulos A.V., Bello S.M. Dynamic retrieval augmented generation of ontologies using artificial intelligence (DRAGON-AI). Journal of Biomedical Semantics. 2024. Vol. 15, No. 19. DOI: https://doi.org/10.1186/s13326-024-00317-z.
Vaswani A., Shazeer N., Parmar N. Attention is all you need. arXiv. 2017. DOI: https://doi.org/10.48550/arXiv.1706.03762.
Lan Z., Chen M., Goodman S. ALBERT: A lite BERT for self-supervised learning of language representations. arXiv. 2019. DOI: https://doi.org/10.48550/arXiv.1909.11942.
Zulkipli Z.Z., Maskat R., Teo N.H.I. A systematic literature review of automatic ontology construction. Indonesian Journal of Electrical Engineering and Computer Science. 2022. Vol. 28. P. 878–889. DOI: https://doi.org/10.11591/ijeecs.v28.i2.pp878-889.