AUTOMATED GENERATION OF TECHNICAL DOCUMENTATION IN THE IT INDUSTRY USING LARGE LANGUAGE MODELS
DOI:
https://doi.org/10.32782/2307-1222.2025-59-22
Keywords:
automation, technical documentation, large language models, ontologies, text generation, knowledge structuring
Abstract
Automated generation of technical documentation is one of the key challenges in the modern IT industry, because high-quality, up-to-date documentation is essential to the work of developers, testers, and end users. This paper explores the potential of large language models (LLMs) for generating technical documentation and evaluates the role of the ontological approach in structuring the documentation process.
Traditional methods of creating documentation face several difficulties: a significant amount of manual labor, the challenge of keeping content relevant, and the need to ensure consistency across sections. The lack of integration between documentation and software development processes leads to outdated materials and complicates their updates. In fast-paced development cycles, developers also often pay insufficient attention to documentation, which makes it hard to keep current.
Special attention is given to the conceptual integration of LLMs with ontological approaches, which makes it possible not only to generate text but also to organize it according to formal knowledge models. The use of ontologies improves documentation accuracy, reduces errors, and helps establish a unified documentation standard. The study presents an overview of approaches to integrating LLMs with ontologies, including automated knowledge extraction methods and the application of ontological modeling to documentation structuring.
The research also analyzes the advantages and limitations of automated documentation generation. Key benefits include faster document creation, standardization, reduced workload for developers, and fewer human errors. The challenges identified include the need to train models on domain-specific corpora, ensuring the accuracy of generated text, and integrating LLMs into the software development lifecycle.
A separate section of the study analyzes practical cases of using LLMs for documentation generation. One example is the automatic creation of documentation for an open-source software project written in Python, where GPT-4 was used to generate technical descriptions and user instructions. The results indicate that language models significantly improve documentation quality by ensuring a clear structure, proper formatting, and compliance with widely accepted standards. For full compliance with official requirements, however, additional editing and more detailed descriptions of parameters and variables may be necessary.
The paper also discusses the future prospects of documentation automation, particularly hybrid approaches that combine language models, ontologies, and quality control algorithms. It highlights the potential integration of LLMs into integrated development environments (IDEs), which would allow documentation to be updated automatically as the code changes.
Automating technical documentation generation with LLMs and ontologies is a promising direction that significantly increases developer efficiency, minimizes errors, and reduces the time required to maintain documentation. To ensure high accuracy and compliance with technical standards, further research is required on adapting language models to specific domains, developing methods for controlling text accuracy, and integrating automatic documentation systems with existing DevOps and code management tools.
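The combination described in the abstract, where a formal knowledge model decides what a document must contain and a language model writes the prose, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `DOC_ONTOLOGY` dictionary, the function names, and the prompt wording are all hypothetical, and the call to a model such as GPT-4 is deliberately left out. The sketch only shows how source code could be mapped to ontology slots with the standard `ast` module so that the prompt names exactly the fields still to be written.

```python
import ast

# Hypothetical mini-ontology: each concept and the slots its documentation must fill.
DOC_ONTOLOGY = {
    "Function": ["name", "parameters", "returns", "summary"],
    "Parameter": ["name", "annotation"],
}


def extract_functions(source: str) -> list[dict]:
    """Map every function definition in `source` to a Function instance."""
    instances = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            params = [
                {
                    "name": arg.arg,
                    "annotation": ast.unparse(arg.annotation) if arg.annotation else None,
                }
                for arg in node.args.args
            ]
            instances.append(
                {
                    "name": node.name,
                    "parameters": params,
                    "returns": ast.unparse(node.returns) if node.returns else None,
                    "summary": ast.get_docstring(node),
                }
            )
    return instances


def missing_slots(instance: dict) -> list[str]:
    """List the Function slots that are still None, i.e. what the model must supply."""
    return [slot for slot in DOC_ONTOLOGY["Function"] if instance.get(slot) is None]


def build_prompt(instance: dict) -> str:
    """Assemble a structured prompt; sending it to a model is out of scope here."""
    params = ", ".join(
        f"{p['name']}: {p['annotation'] or 'unknown'}" for p in instance["parameters"]
    )
    missing = ", ".join(missing_slots(instance))
    return (
        f"Document the function `{instance['name']}`.\n"
        f"Parameters: {params or 'none'}\n"
        f"Returns: {instance['returns'] or 'unknown'}\n"
        f"Fill in these missing fields: {missing or 'none'}"
    )


if __name__ == "__main__":
    sample = "def area(width: float, height: float) -> float:\n    return width * height\n"
    for function in extract_functions(sample):
        print(build_prompt(function))
```

Keeping the ontology as plain data means the same check can run again after every code change, which is one way the automatic documentation updates mentioned above might be triggered.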