Abstract:
Tea-flavored liquor has become one of the most popular alcoholic beverages in recent years. Its combination of tea and liquor flavors is produced through raw material preparation, tea maceration, sugar supplementation, alcoholic fermentation, distillation, aging, and blending. However, its aroma profile is highly complex and variable, and accurately assessing and predicting aroma quality remains a substantial challenge. In this study, an explainable machine learning (ML) framework was developed to predict the sensory quality grades of tea-flavored liquor and to identify the key volatile compounds underlying quality differentiation. A total of 110 tea-flavored liquor samples were collected. A tasting panel carried out the sensory assessment following standardized protocols, and each sample was assigned to one of three predefined quality grades (Grade A, high quality; Grade B, medium quality; Grade C, low quality). The aroma compound profiles of the samples were then determined using headspace solid-phase microextraction coupled with gas chromatography-mass spectrometry (HS–SPME–GC–MS). Furthermore, 13 ML models were benchmarked, and their performance was assessed using accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUC). To enhance model interpretability, Shapley Additive exPlanations (SHAP) were employed to quantify the contribution of each aroma compound to the predictions. According to the sensory evaluation, 28 samples were classified as Grade A, 42 as Grade B, and 40 as Grade C. Among the 32 volatile aroma compounds, 24 exhibited significant differences (P < 0.05) across the three quality grades, including 10 terpenes, 10 esters, and 4 higher alcohols. Grade A and Grade B samples displayed broadly similar aroma profiles, whereas most compounds in Grade C samples were present at significantly lower concentrations. Among the 13 ML models, the Radial Support Vector Machine (Radial SVM) achieved the best predictive performance, with an AUC of 0.92 and accuracy, precision, recall, and F1-score all exceeding 0.8. The SHAP analysis further revealed that terpenes constituted the largest subgroup among the top 20 most influential compounds, followed by 7 esters and 3 higher alcohols. The key aroma compounds contributing to the prediction included linalool (floral), anethole (spicy), methyl salicylate (mint-like), nerol (floral), ethyl undecanoate (fruity), and isoamyl alcohol (alcoholic). Linalool, anethole, methyl salicylate, and ethyl undecanoate contributed strongly to the sensory quality of the tea-flavored liquor. Nerol was likewise positively associated with quality, but its contribution followed a complex, nonlinear trend: its positive influence first increased and then diminished as its concentration rose. At lower concentrations, synergistic interactions with esters and other terpenes amplified the sensory perception of nerol. By contrast, the accumulation of isoamyl alcohol diminished the overall aroma quality of the tea-flavored liquor. In conclusion, an accurate and interpretable ML model can grade the sensory quality of tea-flavored liquor and identify the volatile compounds most critical to quality differentiation. These findings provide a scientific basis for the targeted optimization of production, quality control, and flavor enhancement of tea-flavored liquor.
Future work should focus on the interactions among key aroma compounds, in order to enrich the theoretical foundation of flavor chemistry in alcoholic beverages.
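As an illustration of how the reported evaluation metrics relate to a three-grade classification, the following minimal sketch computes accuracy and macro-averaged precision, recall, and F1-score from a confusion matrix. The confusion matrix is entirely hypothetical and is not taken from the study; only its row totals (28, 42, and 40 samples for Grades A, B, and C) follow the abstract. Macro averaging is also an assumption, since the abstract does not state which averaging scheme was used, and the function name `macro_metrics` is introduced here for illustration only.

```python
from typing import Dict, List

GRADES = ["A", "B", "C"]

# HYPOTHETICAL confusion matrix (rows = true grade, columns = predicted grade).
# Only the row totals (28, 42, 40) come from the abstract; the cell values are
# invented for illustration and are NOT the study's results.
CONFUSION: List[List[int]] = [
    [24, 3, 1],   # true Grade A
    [3, 35, 4],   # true Grade B
    [1, 4, 35],   # true Grade C
]


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Return accuracy and macro-averaged precision, recall, and F1-score."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    for name, value in macro_metrics(CONFUSION).items():
        print(f"{name}: {value:.3f}")
```

Because the grade sizes are unequal, macro averaging weights each grade equally regardless of its sample count; a support-weighted average would be an equally plausible choice and would give slightly different values.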