Abstract
Knowledge intelligence services can improve production efficiency and decision-making in smart agriculture. Agricultural question-answering (QA) systems integrate natural language processing and knowledge retrieval to support precision cultivation. However, conventional QA systems cannot resolve ambiguous queries because agricultural knowledge is complex and domain-specific. In this study, a maize knowledge graph QA framework was proposed that uses large language models (LLMs) to let the QA system ask clarification questions. The framework was composed of a knowledge retriever, a fact analyzer, a clarification question generator, and an answer generator. These modules interacted via action tokens that controlled the clarification process. Knowledge retrieval was implemented by a semantic relation matching model. ComplEx was used to calculate the relevance score between entities and the question for candidate answer retrieval. An encoder-decoder structure was employed to generate both clarification question templates and disambiguating options. During encoding, a contextual Transformer jointly encoded the user query and the candidate answer entities. Technical QA on maize cultivation was taken as the case study. A maize knowledge graph was constructed by using LLMs to extract fine-grained attributes and relations from variety websites. An ambiguous QA multi-turn dialogue dataset and a clarification QA trajectory dataset were constructed to evaluate the proactive interaction of the QA system using the maize knowledge graph. The framework was compared with existing LLMs (ChatGLM-4, Qwen2.5, Llama3.3, and ShizishanGPT) on the ambiguous QA task. The structured knowledge graph was used to retrieve domain-specific knowledge, and contextually appropriate clarification questions were posed to users over multiple turns. In this way, the users' actual information needs were accurately identified and met.
Compared with the best baseline models, the framework achieved improvements of 21.07, 15.53, 24.38, and 6.63 percentage points on the ROUGE-1, ROUGE-2, ROUGE-L, and BLEU-4 metrics, respectively, on the ambiguous QA multi-turn dialogue dataset. Furthermore, the candidate answer selection model accurately retrieved the possible answers to ambiguous questions from the knowledge graph in most cases, with improvements of 23.7, 11.8, and 9.4 percentage points on the P@1, R@1, and mAP metrics, respectively. The clarification question generation model also outperformed the best baseline model, with improvements of 3.45, 6.57, 3.67, and 6.52 percentage points on ROUGE-1, ROUGE-2, ROUGE-L, and BLEU-4, respectively. Ambiguities in user inputs were accurately detected, and clarification questions targeting those uncertainties were generated. Ablation experiments were performed to assess the contributions of the knowledge retrieval module, the question clarification module, and the trajectory learning mechanism. Both modules enhanced the overall performance of the QA system. Finally, the performance of the various models was verified using real-world agricultural QA cases. These practical evaluations confirmed that the framework effectively identified ambiguities in user queries, generated contextually appropriate clarification questions, and provided accurate answers, indicating its applicability to complex agricultural knowledge scenarios. In summary, the agricultural knowledge graph QA framework enables proactive interaction in QA systems and enhances their interactivity and usability in practical applications, such as agricultural consultation and intelligent agricultural customer service.
Particularly in knowledge-intensive domains, such as agriculture, healthcare, and law, this approach can help bridge knowledge gaps so that a QA system can understand and accurately respond to ambiguous queries.
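
As an illustration of the ComplEx relevance score mentioned above, the following minimal sketch ranks candidate answer entities. It is not the implementation used in this study. It assumes that the question embedding takes the role of the relation in the standard ComplEx scoring function, Re(Σ_k h_k · q_k · conj(t_k)), and the function names, entity names, and toy embeddings are all hypothetical.

```python
import numpy as np


def complex_score(head, question, tail):
    """Standard ComplEx score: Re(sum_k head_k * question_k * conj(tail_k)).

    All three arguments are complex-valued vectors of the same length.
    Using the question embedding in place of the relation is an assumption.
    """
    return float(np.real(np.sum(head * question * np.conj(tail))))


def rank_candidates(head, question, candidates):
    """Return (entity_name, score) pairs sorted by descending score."""
    scores = {
        name: complex_score(head, question, emb)
        for name, emb in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy two-dimensional embeddings (hypothetical values, for illustration only).
head = np.array([1 + 0j, 0 + 1j])       # topic entity, e.g. a maize variety
question = np.array([1 + 0j, 1 + 0j])   # encoded user question
candidates = {
    "candidate_a": np.array([1 + 0j, 0 + 1j]),
    "candidate_b": np.array([-1 + 0j, 0 - 1j]),
}

ranking = rank_candidates(head, question, candidates)
```

In a setting like the one described, several candidates with similar scores could be taken as a sign that the question is ambiguous and that a clarification question is needed; that threshold logic is not shown here.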