Abstract:
In the context of smart agriculture and agricultural informatization, agricultural knowledge intelligence services have become key to improving production efficiency and decision-making. As a core component, agricultural question-answering (QA) systems integrate natural language processing and knowledge retrieval to offer precise cultivation guidance. However, traditional QA systems often fail to resolve ambiguous queries because agricultural knowledge is complex and highly domain-specific. To address this issue, we proposed a clarification-based corn knowledge graph question-answering framework built on large language models (LLMs) to make QA systems more proactive. The framework was composed of a knowledge retriever, a fact analyzer, a clarification question generator, and an answer generator, which interacted via action tokens to control the clarification process. Knowledge retrieval was implemented by a semantic relation matching model, which used the ComplEx method to calculate the relevance score between entities and the question in order to retrieve candidate answers. For clarification question generation, an encoder-decoder model was employed to generate both clarification question templates and disambiguating options; during encoding, a contextual Transformer jointly encoded the user query and the candidate answer entities. Technical question answering for corn cultivation was used as the example domain in this study. A detailed corn knowledge graph was constructed by leveraging LLMs to extract fine-grained attributes and relations from variety websites. To evaluate the proactive interaction capability of the QA system, an ambiguous QA multi-turn dialogue dataset and a clarification-based QA trajectory dataset were constructed from the corn knowledge graph.
Comparisons with existing LLMs (ChatGLM-4, Qwen2.5, Llama3.3, ShizishanGPT) on the ambiguous QA task indicated that the proposed method effectively used the structured corn knowledge graph to retrieve relevant domain-specific knowledge and interacted with users over multiple turns by generating contextually appropriate clarification questions, thereby identifying and fulfilling their actual information needs more accurately. Compared with the best-performing baseline models, the proposed method achieved improvements of 21.07, 15.53, 24.38, and 6.63 percentage points on the ROUGE-1, ROUGE-2, ROUGE-L, and BLEU-4 metrics, respectively, on the ambiguous QA multi-turn dialogue dataset. The candidate answer selection model also performed well, accurately retrieving the possible answers to ambiguous questions from the knowledge graph in most cases and achieving improvements of 23.7, 11.8, and 9.4 percentage points on the P@1, R@1, and mAP metrics, respectively. The proposed clarification question generation model outperformed the best baseline model, achieving improvements of 3.45, 6.57, 3.67, and 6.52 percentage points on ROUGE-1, ROUGE-2, ROUGE-L, and BLEU-4, respectively. These results validate the model's capability to accurately detect ambiguities in user inputs and to generate clarification questions that address the identified uncertainties. Furthermore, ablation experiments were performed to assess the contributions of the knowledge retrieval module, the question clarification module, and the trajectory learning mechanism. The results demonstrated that both modules play a crucial role in enhancing the overall performance of the QA system. Finally, the performance of the various models was analyzed and compared on real-world agricultural question-answering cases.
In these practical evaluations, the proposed method demonstrated its advantages in identifying and resolving ambiguities in user queries. The results validate the method's ability to generate contextually appropriate clarification questions and provide accurate answers, confirming its applicability in complex agricultural knowledge scenarios. In summary, the proposed clarification-based agricultural knowledge graph question-answering framework offers a new research direction for proactive interaction in agricultural question-answering systems, enhancing interactivity and usability in practical applications such as agricultural consultation and intelligent agricultural customer service. More broadly, for knowledge-intensive domains such as agriculture, healthcare, and law, this approach helps bridge knowledge gaps and enhances a question-answering system's ability to understand and accurately respond to ambiguous queries.
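
As an illustration of the ComplEx relevance scoring mentioned above, the following minimal sketch ranks candidate answer entities for a question. It assumes, as one plausible reading of the abstract, that the question embedding takes the place of the relation embedding in the standard ComplEx score Re(<h, r, conj(t)>). The function names, the toy two-dimensional embeddings, and the candidate labels are all hypothetical and are not taken from the proposed system.

```python
import numpy as np


def complex_score(head, relation, tail):
    """Standard ComplEx score: real part of sum(head * relation * conj(tail))."""
    return float(np.real(np.sum(head * relation * np.conj(tail))))


def rank_candidates(head, question, candidates):
    """Return candidate names sorted by descending ComplEx score.

    Assumption: the question embedding is used in the relation slot.
    """
    scores = {
        name: complex_score(head, question, vec)
        for name, vec in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy complex-valued embeddings (hypothetical values, for illustration only).
head = np.array([1 + 0j, 0 + 1j])        # topic entity found in the query
question = np.array([1 + 0j, 1 + 0j])    # encoded user question
candidates = {
    "candidate_a": np.array([1 + 0j, 0 + 1j]),
    "candidate_b": np.array([-1 + 0j, 0 - 1j]),
    "candidate_c": np.array([0 + 1j, 1 + 0j]),
}

ranked = rank_candidates(head, question, candidates)
print(ranked)
```

When several top-ranked candidates have similar scores, a system of this kind would have grounds to treat the query as ambiguous and ask a clarification question rather than answer directly; the threshold for that decision is not specified in the abstract.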