Clarifying maize knowledge graph question answering method based on large language model

邹佳贤; 陈雷; 何浩楠; 袁媛

doi:10.11975/j.issn.1002-6819.202503135

Abstract

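Abstract: Agricultural planting is affected by many factors, which makes user questions ambiguous. To address this, this study proposes a clarification-based maize knowledge graph question answering (QA) framework built on large language models. The framework consists of a knowledge retriever, a fact analyzer, a question clarifier, and an answer generator. Action tokens are introduced to model the clarification QA workflow, enabling coordinated control among the modules and controllable execution of the QA process. Taking maize cultivation as an example, a maize knowledge graph and an ambiguous QA dataset on maize cultivation were constructed, and experiments were conducted on them. The results show that the method effectively retrieves relevant knowledge from the knowledge graph and proactively interacts with users by generating clarification questions, thereby accurately identifying their real needs. Compared with ShizishanGPT, the best-performing large model among the general-domain and agricultural-domain baselines, C-CornKGQA improved ROUGE-L (a recall-oriented metric based on the longest common subsequence) and BLEU-4 (a precision-oriented metric based on 4-grams) by 24.38 and 6.63 percentage points, respectively. The method also showed applicability and practical value in QA scenarios and can provide technical support for intelligent service systems in agriculture.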
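The action-token control flow described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the token strings, the `State` fields, the toy triples with placeholder values, and the rule-based modules standing in for the LLM-driven components. It is not the paper's implementation, only one way a loop could dispatch among the four modules by token.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical action tokens; the paper's actual token vocabulary is not given here.
RETRIEVE, ANALYZE, CLARIFY, ANSWER = "<RETRIEVE>", "<ANALYZE>", "<CLARIFY>", "<ANSWER>"

# Toy (head, relation, tail) triples with placeholder values, not real agronomic data.
TOY_KG = [
    ("variety_A", "sowing_density", "value_1"),
    ("variety_B", "sowing_density", "value_2"),
    ("variety_A", "maturity_period", "value_3"),
]

@dataclass
class State:
    question: str
    facts: list = field(default_factory=list)
    constraint: Optional[str] = None      # entity chosen by the user, if any
    trace: list = field(default_factory=list)

def retrieve(state):
    """Knowledge retriever stand-in: keep triples whose relation appears in the question."""
    state.facts = [t for t in TOY_KG if t[1].replace("_", " ") in state.question]
    return ANALYZE

def analyze(state):
    """Fact analyzer stand-in: more than one candidate head entity means ambiguity."""
    if state.constraint is not None:
        state.facts = [t for t in state.facts if t[0] == state.constraint]
    heads = {t[0] for t in state.facts}
    return CLARIFY if len(heads) > 1 else ANSWER

def clarify(state, ask_user):
    """Question clarifier stand-in: offer the candidate entities as options."""
    options = sorted({t[0] for t in state.facts})
    state.constraint = ask_user(f"Which variety do you mean? Options: {options}")
    return ANALYZE

def run(question, ask_user, max_steps=10):
    """Dispatch on the current action token until the answer step is reached."""
    state, action = State(question), RETRIEVE
    for _ in range(max_steps):
        state.trace.append(action)
        if action == RETRIEVE:
            action = retrieve(state)
        elif action == ANALYZE:
            action = analyze(state)
        elif action == CLARIFY:
            action = clarify(state, ask_user)
        else:  # ANSWER: return the tails of the remaining facts plus the token trace
            return [t[2] for t in state.facts], state.trace
    raise RuntimeError("step budget exhausted")
```

Calling `run("What sowing density should I use?", lambda prompt: "variety_B")` walks through retrieve, analyze, clarify, analyze, and answer, because two varieties match until the user picks one. A question about the maturity period matches a single triple and skips the clarification step. Recording the token trace is a design choice of this sketch that makes each run inspectable.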

Abstract: Knowledge-based intelligent services can improve production efficiency and decision-making in smart agriculture. Agricultural question-answering (QA) systems integrate natural language processing and knowledge retrieval to support precise cultivation. However, conventional QA systems cannot resolve ambiguous queries because agricultural knowledge is complex and domain-specific. In this study, a clarification-based maize knowledge graph QA framework built on large language models (LLMs) was proposed. The framework consists of a knowledge retriever, a fact analyzer, a clarification question generator, and an answer generator. The modules interact through action tokens, which control the clarification process. Knowledge retrieval was implemented with a semantic relation matching model, and ComplEx was used to calculate the relevance score between entities and the question for candidate answer retrieval. An encoder-decoder structure was employed to generate both clarification question templates and disambiguating options, with a contextual Transformer jointly encoding the user query and the candidate answer entities. Technical QA for maize cultivation was taken as an example. A maize knowledge graph was constructed by using LLMs to extract fine-grained attributes and relations from variety websites. An ambiguous QA multi-turn dialogue dataset and a clarification QA trajectory dataset were then built on the maize knowledge graph to evaluate the proactive interaction of the QA system. The framework was compared with existing LLMs (ChatGLM-4, Qwen2.5, Llama3.3, and ShizishanGPT) on the ambiguous QA task. It retrieved domain-specific knowledge from the structured knowledge graph and generated contextually appropriate clarification questions for users over multiple turns, thereby accurately identifying their actual information needs.
Compared with the best baseline models, the framework improved ROUGE-1, ROUGE-2, ROUGE-L, and BLEU-4 by 21.07, 15.53, 24.38, and 6.63 percentage points, respectively, on the ambiguous QA multi-turn dialogue dataset. The candidate answer selection model accurately retrieved the possible answers to ambiguous questions from the knowledge graph in most cases, with improvements of 23.7, 11.8, and 9.4 percentage points on P@1, R@1, and mAP, respectively. The clarification question generation model also outperformed the best baseline model, with improvements of 3.45, 6.57, 3.67, and 6.52 percentage points on ROUGE-1, ROUGE-2, ROUGE-L, and BLEU-4, respectively. Ambiguities in user inputs were accurately detected, and clarification questions targeting the uncertainties were generated. Ablation experiments were performed to assess the contributions of the knowledge retrieval module, the question clarification module, and the trajectory learning mechanism. Both modules enhanced the overall performance of the QA system. Finally, the models were evaluated on real-world agricultural QA cases. These practical evaluations showed that the framework effectively identified ambiguities in user queries and generated contextually appropriate clarification questions to provide accurate answers, indicating its applicability in complex agricultural knowledge scenarios. In summary, the agricultural knowledge graph QA framework enables proactive interaction in QA systems and enhances interactivity and usability in practical applications, such as agricultural consultation and intelligent agricultural customer service.
Particularly in knowledge-intensive domains such as agriculture, healthcare, and law, this approach can help QA systems bridge knowledge gaps so that they understand and accurately respond to ambiguous queries.
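The ComplEx-based relevance scoring mentioned in the abstract can be sketched as follows. ComplEx scores a triple as the real part of the sum over dimensions of head × relation × conjugate(tail), with all embeddings complex-valued. The sketch assumes that the question embedding takes the place of the relation embedding when ranking candidate answers. That is an assumption, since the abstract does not spell out the wiring. The function names and the hand-picked one-dimensional vectors are likewise illustrative, not trained embeddings.

```python
def complex_score(head, relation, tail):
    """ComplEx score: Re(sum_k head_k * relation_k * conj(tail_k))."""
    return sum((h * r * t.conjugate()).real for h, r, t in zip(head, relation, tail))

def rank_candidates(head_vec, question_vec, entity_vecs, top_k=None):
    """Rank candidate answer entities by their ComplEx score, highest first.

    Assumption: question_vec plays the role of the relation embedding.
    """
    scored = [
        (name, complex_score(head_vec, question_vec, vec))
        for name, vec in entity_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]

# Hand-picked toy embeddings (hypothetical, one complex dimension each).
topic_entity = [1 + 0j]
question = [0 + 1j]
candidates = {"candidate_a": [0 + 1j], "candidate_b": [0 - 1j]}

ranking = rank_candidates(topic_entity, question, candidates)
```

Because the tail is conjugated, swapping head and tail generally changes the score, which lets ComplEx represent asymmetric relations. In a clarification setting, several candidates with similarly high scores could be what signals that the question is ambiguous, although the abstract does not state the exact criterion.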
