An Integrated Prediction Model of Chlorophyll Content in Fresh Tobacco Leaves Based on Wavelength Selection of FMR-NSGA
KANG Yapeng, MENG Lingfeng, WEN Junming, REN Jie, ZHU Rongguang, JIANG Guihua, CHANG Pengfei, KE Lihua, LI Sihong, ZHANG Xueqing, CHEN Junliang
Abstract
To achieve rapid, non-destructive detection of chlorophyll content in fresh tobacco leaves, this study developed a feature wavelength selection method and an ensemble prediction model based on the transmission spectra of fresh tobacco leaves at different maturity levels. First, the FMR-NSGA feature wavelength selection method was constructed by integrating the univariate regression F-test, mutual information, recursive feature elimination, and the non-dominated sorting genetic algorithm. Next, the Mahalanobis distance method was used to eliminate outliers, and three dataset partitioning methods (K-S, SPXY, and stratified sampling) were compared. Different regressors (PLSR, SVR, RF), preprocessing methods (Savitzky-Golay smoothing, mean centering, first derivative, and detrending), and feature wavelength selection methods (SPA, CARS, FMR-NSGA) were then combined to construct base regressors, from which the three best-performing base regressors were selected. Finally, the ensemble model was constructed with the Stacking method and optimized with the Grey Wolf Optimizer. The results indicate that stratified sampling gave the best dataset partitioning performance, and that Savitzky-Golay smoothing and FMR-NSGA performed best across multiple regressors. The optimal base regressor (SVR-SG-FMR-NSGA) achieved $R_\mathrm{t}^2$, RMSE, and RPD values of 0.821, 101.745 mg/L, and 2.364, respectively. The ensemble model outperformed the optimal base regressor, achieving $R_\mathrm{t}^2$, RMSE, and RPD values of 0.864, 88.697 mg/L, and 2.711, respectively, a relative increase of 5.24% in $R_\mathrm{t}^2$, which enables accurate prediction of chlorophyll content in fresh tobacco leaves. This method provides technical support and a theoretical reference for the development of online detection equipment for intrinsic chemical components in fresh tobacco leaves.
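The three reported metrics are closely related, and a short sketch may help in reading the numbers above. The abstract does not state the exact formulas used, so the definitions below are assumptions: $R^2$ is taken as one minus the ratio of residual to total sum of squares, RMSE as the root mean squared prediction error, and RPD as the population standard deviation of the reference values divided by RMSE. All function names are hypothetical and chosen for illustration only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SSE / SST (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse / sst


def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD of references over RMSE.

    Assumption: the population SD (divide by n) is used. Some authors
    use the sample SD (divide by n - 1), which gives slightly larger values.
    """
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in y_true) / n)
    return sd / rmse(y_true, y_pred)


def rpd_from_r2(r2_value):
    """Under the definitions above, RPD = 1 / sqrt(1 - R^2)."""
    return 1.0 / math.sqrt(1.0 - r2_value)


def relative_gain_percent(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    # Values taken from the abstract; everything else here is illustrative.
    print(rpd_from_r2(0.821))                   # base regressor
    print(rpd_from_r2(0.864))                   # ensemble model
    print(relative_gain_percent(0.864, 0.821))  # gain in R^2
```

Under these assumed definitions, the reported $R_\mathrm{t}^2$ values of 0.821 and 0.864 correspond to RPD values of about 2.364 and 2.711, which agrees with the reported figures, and the relative gain in $R_\mathrm{t}^2$ comes out at about 5.24%.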