A novel approach integrating multispectral imaging and machine learning to identify seed maturity and vigor in smooth bromegrass
Plant Methods volume 21, Article number: 45 (2025)
Abstract
Smooth bromegrass (Bromus inermis) was used as the experimental material for identifying seed maturity using a combination of multispectral imaging and machine learning. Trials were conducted to investigate the effects of three nitrogen application levels (0, 100 and 200 kg N ha−1, defined as CK, N1 and N2, respectively) and two spikelet grain positions, superior grain (SG) at the basal position and inferior grain (IG) at the upper position, on smooth bromegrass seeds. The germination characteristics revealed that nitrogen application and grain position significantly influenced seed vigor. Seed vigor increased gradually with maturity, reaching a high level at 30 and 36 days after anthesis. A stacking ensemble learning approach was employed to identify seed maturity based on multispectral imaging and autofluorescence imaging. The results demonstrated that the Ensemble model outperformed Support Vector Machine, Bayesian, XGBoost and Random Forest across all evaluated metrics in different scenarios. The model accuracies in CK, N1 and N2 were 89%, 87% and 93%, respectively. Furthermore, the SHapley Additive exPlanations method was used to interpret the Ensemble model, identifying important features such as 405, 430, 540, 630, 645, 690, 850, 880 and 970 nm. These features were significantly correlated with fresh weight, shoot length and vigor index. These findings show the high accuracy and generalizability of the Ensemble model for identifying the maturity and quality of smooth bromegrass seeds, and offer a new strategy for evaluating seed maturity and vigor level.
Introduction
Perennial forage grasses play a pivotal role in enhancing the sustainability of agricultural and livestock systems. Smooth bromegrass (Bromus inermis), a high-quality perennial forage grass of the Poaceae family, is mainly produced as hay and silage for ruminants [1, 2]. However, low seed yield and poor seed quality have remained a persistent concern for growers. Harvest timing plays a pivotal role in obtaining high-quality smooth bromegrass seeds. Harvesting at an immature stage results in diminished seed yield and suboptimal seed quality [3], whereas delayed harvesting reduces seed yield, often through seed shattering [4]. In crops, superior grains exhibit early flowering and better developmental characteristics, while inferior grains display delayed flowering and poorer development [5, 6]. This leads to inconsistent seed development, making it difficult for growers to discern seed maturity. Nitrogen fertilization is a crucial agronomic practice for enhancing seed yield and quality [7, 8]. However, variations in soil conditions and fertilizer application rates can influence seed development and result in differences in seed quality [9, 10]. The conventional approach to determining the optimal harvest time for smooth bromegrass relies heavily on the grower’s judgment. Seed maturity and seed quality are traditionally evaluated through germination characteristics and physiological indices, but these measurements are time-consuming and labor-intensive. Consequently, a rapid, non-destructive and high-throughput method is needed to ascertain the seed maturity of smooth bromegrass.
With the advancement of non-destructive testing techniques, near-infrared spectroscopy (NIR), hyperspectral imaging (HSI), multispectral imaging (MSI), autofluorescence imaging and computer vision technologies have been widely employed for varietal identification [11, 12], seed quality assessment [13,14,15,16] and development of seed pelleting formulations [17]. These studies accomplished seed-related tasks by combining imaging with diverse machine learning models. Owing to their strong predictive performance, ensemble learning methods are gaining popularity [18,19,20].
The methods used to fuse multiple base learners are collectively referred to as ensemble strategies. Ensemble learning approaches, including boosting and bagging algorithms, combine multiple base learners to yield more accurate predictions [21, 22]. Both boosting and bagging combine existing classification or regression algorithms in a specific manner to create a more powerful classifier [23]. Multiple studies have indicated that ensemble models perform more robustly than individual models [11, 24]. Stacking is a further ensemble strategy that can outperform bagging and boosting. It uses a multi-layer learning architecture in which the outputs of various base models serve as input features for a meta-learner that performs the model integration [25]. The meta-learner can improve on the predictions of the individual base learners and on overall model performance. However, few studies have used stacking ensemble machine learning to predict seed maturity.
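The stacking idea described above can be illustrated with a minimal, dependency-free Python sketch. It is not the pipeline used in this study: the two base learners (nearest centroid and 1-nearest neighbour), the lookup-table meta-learner, the function names and the toy data are all illustrative assumptions. The point it shows is that the meta-learner is trained on out-of-fold predictions of the base learners rather than on the raw features.

```python
from collections import Counter, defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_centroid(X, y):
    """Illustrative base learner 1: predict the class with the nearest centroid."""
    rows_by_label = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_label[label].append(row)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in rows_by_label.items()
    }
    return lambda row: min(centroids, key=lambda lab: sq_dist(row, centroids[lab]))

def fit_1nn(X, y):
    """Illustrative base learner 2: predict the label of the nearest training row."""
    train = list(zip(X, y))
    return lambda row: min(train, key=lambda pair: sq_dist(row, pair[0]))[1]

def fit_stack(X, y, base_fitters, k=5):
    """Train a stacked classifier and return its predict function."""
    n = len(X)
    meta_rows = [None] * n
    # Level 0: collect out-of-fold base predictions for every training sample.
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in held_out]
        y_tr = [y[i] for i in range(n) if i not in held_out]
        models = [fit(X_tr, y_tr) for fit in base_fitters]
        for i in held_out:
            meta_rows[i] = tuple(model(X[i]) for model in models)
    # Level 1 (meta-learner): most frequent true label for each tuple of base predictions.
    votes = defaultdict(Counter)
    for key, label in zip(meta_rows, y):
        votes[key][label] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    # Refit the base learners on all training data for use at prediction time.
    final_models = [fit(X, y) for fit in base_fitters]

    def predict(row):
        key = tuple(model(row) for model in final_models)
        # Unseen combination of base predictions: fall back to the first base learner.
        return table.get(key, key[0])

    return predict

# Hypothetical two-feature toy data, not data from this study.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
     [10, 10], [10, 11], [11, 10], [11, 11], [10.5, 10.5]]
y = ["early"] * 5 + ["late"] * 5
stack = fit_stack(X, y, [fit_centroid, fit_1nn])
print(stack([0.2, 0.3]), stack([10.2, 10.9]))
```

Using out-of-fold predictions keeps the meta-learner from simply memorizing base-learner outputs on data those learners have already seen.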
Previous studies generally worked with highly uniform seeds because the samples were obtained through controlled procedures. In practice, however, the inherent inconsistencies in seed maturation, coupled with varying environmental conditions and agronomic practices, create disparities within seed lots and ultimately affect prediction stability. This study was designed to address this complexity by incorporating diverse grain positions and nitrogen application conditions. The aim was to determine seed maturity with a stacking ensemble model by integrating seed germination characteristics and multispectral imaging techniques, and thereby to offer novel strategies for determining the optimal harvesting period for high yield and superior quality of smooth bromegrass seeds. Figure 1 shows the technical route of this study.
Materials and methods
Experimental design and treatment
Samples were collected from Yuershan Ranch in Chengde City, Hebei Province, China (41°44′N, 116°8′E; 1455 m elevation). Before sowing, the soil (0–30 cm) contained 20.58 mg·kg−1 available nitrogen (N), 10.40 mg·kg−1 available phosphorus (P), 53.25 mg·kg−1 available potassium (K) and 27.63 g·kg−1 organic matter. The experiment used a randomized complete block design with four blocks and three nitrogen application levels (0, 100 and 200 kg·N·ha−1, denoted as CK, N1 and N2, respectively). The plot size was 4 m × 5 m. Bromus inermis seeds (originating from Canada, purchased from Beijing Rytway Ecological Technology Co., LTD) were sown at 0.45 m row spacing on July 8, 2020.
Sampling and measurement
Approximately 600 fertile tillers from the same plot that bloomed on the same day were marked at the flowering stage. At 16, 23, 30 and 36 days after anthesis (DAA), 150 fertile tillers (10–15 middle spikelets of the fertile tillers) were collected. Superior grain (SG) and inferior grain (IG) were defined following Jiang et al. [26] with minor modification. After air drying (at 20–25 °C for 2–3 days), the basal grains (the first and second grains from the bottom of the spikelet) in the middle spikelets were categorized as SG and the third to fifth grains were categorized as IG. Half of the samples were used to obtain morphological, multispectral and autofluorescence data; the remaining samples were used to measure seed germination and physiological characteristics. The dry weight, fresh weight and moisture content were measured (Table 1).
Germination assays
Germination assays were conducted according to the Rules of the International Seed Testing Association (ISTA, 2020) with 8 h of light and 16 h of darkness, a light intensity of 66% and a fluctuating temperature of 25/15 °C. The experiment was repeated four times with 100 seeds per replicate. The first germination count was taken on the 7th day to determine germination potential (GPT). Shoot length (SL), root length (RL), seedling fresh weight (FW), germination index (GI), mean germination time (MGT), vigor index (VI) and germination percentage (GPC) were determined at the final germination count on the 14th day. Seeds with primary roots at least 2 mm long were recorded every day until the 14th day to calculate MGT using Eq. (1):
GI was calculated according to Eq. (2):
VI was calculated according to Eq. (3):
where T is the number of days counted from the beginning of germination, N is the number of seeds germinated on day T, and Gt is the number of germinated seeds per day.
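Equations (1) to (3) are not reproduced in this copy of the text. The sketch below uses the forms commonly found in the seed testing literature that are consistent with the symbol definitions above, namely MGT = Σ(N·T)/ΣN and GI = Σ(Gt/T); these are assumptions about the exact equations used here. For VI, the usual form is GI multiplied by a seedling growth measure; which measure was used is not recoverable from this text, so it is left as the hypothetical argument `growth`. The daily counts are made-up illustration values.

```python
def mean_germination_time(daily_counts):
    """MGT = sum(N * T) / sum(N).

    daily_counts maps day T (counted from the start of germination)
    to the number of seeds N that germinated on that day.
    """
    total = sum(daily_counts.values())
    if total == 0:
        raise ValueError("no seeds germinated")
    return sum(day * n for day, n in daily_counts.items()) / total

def germination_index(daily_counts):
    """GI = sum(Gt / T), with Gt the seeds germinated on day T (T >= 1)."""
    return sum(n / day for day, n in daily_counts.items())

def vigor_index(gi, growth):
    """VI = GI * growth; the choice of `growth` (e.g. a length or weight) is an assumption."""
    return gi * growth

# Hypothetical counts for one replicate of 100 seeds: 10, 20 and 10 seeds on days 3, 4 and 5.
counts = {3: 10, 4: 20, 5: 10}
print(mean_germination_time(counts))  # 160 / 40 = 4.0 days
print(germination_index(counts))      # 10/3 + 20/4 + 10/5
```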
Morphological, multispectral and autofluorescence data acquisition and preprocessing
A total of 200 uniform seeds from each treatment were selected for image acquisition using VideometerLab4™ equipment (Videometer A/S, Herlev, Denmark). Monochrome images at 19 different wavelengths (365, 405, 430, 450, 470, 490, 515, 540, 570, 590, 630, 645, 660, 690, 780, 850, 880, 940 and 970 nm) were collected using 19 LED flashes. The images had a high spatial resolution of 40 μm/pixel, a bit depth of 32 bits/pixel and a size of 2192 × 2192 pixels. By installing long-pass (LP) filters, autofluorescence images were obtained at 8 excitation/emission combinations: 365/400, 365/500, 405/500, 430/500, 450/500, 630/700, 645/700 and 660/700 nm.
Analysis of autofluorescence and multispectral image
After the multispectral images were acquired, the regions of interest (ROIs), representing the average reflected light intensity of seeds at 19 multispectral wavelengths and 8 excitation-emission autofluorescence combinations, were extracted from the background using VideometerLab v3.14 software. The morphological, multispectral and autofluorescence features were compiled into a dataset. Table S1 provides a detailed description of the morphological features. Altogether, this dataset contained 42 features and 4431 samples. These features were subsequently gathered into a matrix (X) for analysis of their relationship with the corresponding maturity stages, nitrogen application treatments and grain positions (Y).
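As a quick check on the feature count, the columns of X can be laid out as in the following Python sketch. The wavelengths and excitation/emission pairs are those listed in the text, and the count of 15 morphological features is the one reported in the Results; the column naming scheme and the placeholder names `morph_1` to `morph_15` are assumptions made for illustration (the actual morphological features are described in Table S1).

```python
# Reflectance wavelengths (nm) and autofluorescence excitation/emission pairs from the text.
WAVELENGTHS_NM = [365, 405, 430, 450, 470, 490, 515, 540, 570, 590,
                  630, 645, 660, 690, 780, 850, 880, 940, 970]
AUTOFLUORESCENCE_NM = ["365/400", "365/500", "405/500", "430/500",
                       "450/500", "630/700", "645/700", "660/700"]
N_MORPHOLOGICAL = 15  # reported count; real feature names are in Table S1

def feature_names():
    """Return the column names of the feature matrix X (naming is hypothetical)."""
    morphological = [f"morph_{i + 1}" for i in range(N_MORPHOLOGICAL)]
    reflectance = [f"refl_{w}nm" for w in WAVELENGTHS_NM]
    fluorescence = [f"fluo_{pair}nm" for pair in AUTOFLUORESCENCE_NM]
    return morphological + reflectance + fluorescence

names = feature_names()
print(len(names))  # 15 + 19 + 8 = 42
```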
Stacking ensemble and base models
The stacking ensemble, comprising Random Forest (RF), Support Vector Machine (SVM), XGBoost and Bayesian algorithms, was built with the ‘mlr3verse(v0.30)’ R package [27]. The dataset was partitioned into a 60% training subset and a 40% validation subset, with the validation subset strictly isolated from all training and tuning procedures. Within the training subset, model development incorporated leave-one-out cross-validation (LOO-CV) to preserve maximal training data utility, while 10-fold cross-validation was employed for hyperparameter optimization via random search. The optimised hyperparameters are presented in Table 2, with all non-specified parameters kept at the default values of each model.
Model performance was evaluated using accuracy, Kappa, sensitivity, specificity and precision, where TP, FP, TN and FN represent true positive, false positive, true negative and false negative, respectively.
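The metric equations themselves are not reproduced in this copy of the text. The Python sketch below computes the five metrics named in the Results from a confusion matrix using their standard definitions: sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), precision = TP/(TP+FP), overall accuracy as the share of correct predictions, and Cohen's kappa. Treating each class one-vs-rest and macro-averaging across classes is an assumption, since the averaging scheme is not stated; the function name and the example matrix are hypothetical.

```python
def multiclass_metrics(cm):
    """Compute accuracy, kappa and macro-averaged per-class metrics.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class occurs and is predicted at least once (no zero denominators).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]                               # true counts per class
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]    # predicted counts per class

    accuracy = sum(cm[i][i] for i in range(k)) / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)

    per_class = []
    for c in range(k):
        tp = cm[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        tn = total - tp - fn - fp
        per_class.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "precision": tp / (tp + fp),
        })
    macro = {name: sum(m[name] for m in per_class) / k for name in per_class[0]}
    return {"accuracy": accuracy, "kappa": kappa, **macro}

# Hypothetical two-class confusion matrix, not taken from this study.
metrics = multiclass_metrics([[8, 2], [1, 9]])
print(metrics)
```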
Data analysis
Tukey’s test was employed to compare smooth bromegrass seeds at different maturity stages and nitrogen levels (P < 0.05). Student’s t-test was employed to compare seeds at the two grain positions (P < 0.05). The ‘agricolae(v1.3-7)’ R package was used to facilitate the analysis. Principal component analysis (PCA) of seed germination characteristics was performed with the ‘FactoMineR(v2.1.0)’ R package. The ROIs of smooth bromegrass seeds from different nitrogen levels, maturity stages and grain positions were visualized using Nonlinear Canonical Discriminant Analysis (nCDA) in VideometerLab v3.14 software.
Results
Effect of DAA, nitrogen and grain position on seed germination
Seed germination characteristics, including GPT, GPC, RL, SL, FW, MGT, GI and VI, were influenced by DAA, nitrogen application and grain position (Table 3). All these indices were significantly (P < 0.001) affected by DAA. GPT, VI, SL and FW were significantly (P < 0.05) affected by nitrogen application. Grain position significantly (P < 0.05) affected all seed germination indices except GPC. The interaction of DAA and nitrogen application significantly (P < 0.05) affected GPT; however, no significant interactions were observed among DAA, nitrogen application and grain position on seed germination indices. According to the seed germination indices (Table 4), the values of RL, SL, FW and VI increased gradually as days after anthesis increased. In contrast, GPT and GI values rose initially and then declined, peaking at 23 DAA. The GPC of seeds at 16 DAA was significantly (P < 0.05) lower than that at 23 DAA, 30 DAA and 36 DAA. Nitrogen application significantly (P < 0.05) enhanced the SL of smooth bromegrass seeds. The SL (P < 0.001) and VI (P < 0.05) of SG were significantly greater than those of IG; conversely, the MGT of IG was significantly (P < 0.05) higher than that of SG.
Additionally, a PCA was performed for all germination indices (Fig. 2). We found that PC1 and PC2 explained 55.6% and 20.6% of the total variation, respectively.
The PCA revealed that seeds from different DAA, N treatments and spikelet positions were primarily separated by DAA. Specifically, 16 DAA and 23 DAA samples were distinctly separated along PC1, and both were clearly distinguished from 30 DAA and 36 DAA, which clustered together. Furthermore, FW, VI, GI, SL, GPT, RL and GPC were positively correlated with PC1, while MGT was negatively correlated. Notably, FW made the largest positive contribution to PC1. Regarding PC2, FW, VI, SL, RL and MGT were positively correlated, whereas GI, GPC and GPT were negatively correlated. In addition, FW, VI, SL and RL displayed higher correlation coefficients (Fig. 2A). Furthermore, evaluation of the contributions of the germination characteristics to the first two principal components showed that FW, VI, GI, SL and GPT were the foremost contributors, while GPC contributed the least (Fig. 2B).
Fig. 2 PCA of seed germination characteristics for different treatments. (A) Biplot of the PCA results for germination characteristics. (B) Contribution of germination characteristics to PC1 and PC2. Points of different colors indicate distinct treatments. Replicates are denoted by the four smaller points of the same color, and mean values are represented by larger circles. The contribution of each germination characteristic to the principal components is represented by the length of its arrow, while the correlation between germination characteristics is indicated by the orientation of the arrows. The red dotted line represents the average contribution of the variables
Exploration and analysis of morphological, multispectral and autofluorescence data
Fifteen morphological features were derived from RGB images of seeds at four maturity stages, three nitrogen application treatments and two grain positions (Table S1). The values of BetaShape_a, BetaShape_b, CIELab A*, CIELab L*, Compactness Circle, Hue, Width/Length Ratio and Width in seeds at 16 DAA were significantly (P < 0.05) different from those at 36 DAA under all three nitrogen application treatments. The values of Vertical orientation in seeds at 16 DAA were significantly (P < 0.05) different from those at 36 DAA under N1 treatment. Moreover, the values of Vertical orientation and Saturation in seeds at 16 DAA were significantly (P < 0.05) different from those at 36 DAA under N2 treatment. The other morphological features did not clearly distinguish the four maturity levels.
Furthermore, the mean multispectral reflectance of seeds from the different treatments exhibited similar trends. The average reflectance increased with wavelength (Fig. S1, Table S2). Within the wavelength range of 405 to 490 nm, the reflectance of seeds at 16 and 23 DAA was lower than that at 30 and 36 DAA under all three nitrogen application treatments. In the wavelength range of 630 to 660 nm, the reflectance of seeds at 16, 23 and 30 DAA was lower than that at 36 DAA under CK, whereas in both N1 and N2 treatments, the reflectance of seeds at 16 DAA was lower than that at 23, 30 and 36 DAA in the wavelength range from 570 to 690 nm. Furthermore, the reflectance at 16 and 23 DAA was higher than that at 30 and 36 DAA in the spectral range of 780 to 970 nm under all three treatments. Additionally, the effect of grain position (SG versus IG) on the spectrum appeared mainly at 16 and 23 DAA. In the wavelength range of 405 to 540 nm, the reflectance of SG at 16 DAA was lower than that of IG under CK and N1 treatment, whereas the reflectance of SG was higher than that of IG under N2 treatment. In the wavelength range of 780 to 970 nm, the reflectance of SG at 23 DAA was lower than that of IG under CK, whereas the reflectance of SG was higher than that of IG under N1 and N2 treatment.
We additionally obtained autofluorescence spectra at eight excitation/emission wavelength combinations under the three nitrogen application conditions (Fig. S2). Under CK, the autofluorescence values at 660/700 nm, 645/700 nm, 630/700 nm, 450/500 nm, 430/500 nm and 405/500 nm of smooth bromegrass seeds decreased gradually with increasing maturity, while under N1 and N2 treatments they first increased and then decreased. The autofluorescence values at 660/700 nm, 645/700 nm and 630/700 nm clearly distinguished the seeds at 16 DAA and 23 DAA from those at 30 DAA and 36 DAA under all three nitrogen application treatments. Interestingly, under CK, the SG and IG at 23 DAA could be distinguished by the autofluorescence values at 660/700 nm, 645/700 nm, 630/700 nm, 450/500 nm, 430/500 nm and 405/500 nm (Fig. S2 A). Under N1 treatment, the autofluorescence values at 660/700 nm, 450/500 nm, 430/500 nm, 405/500 nm and 365/500 nm could distinguish between the SG and IG at 16 DAA. Meanwhile, the SG and IG at 23 DAA could be differentiated by the autofluorescence values at 450/500 nm, 430/500 nm, 405/500 nm, 365/500 nm and 365/400 nm. Furthermore, the autofluorescence values at 450/500 nm, 405/500 nm, 365/500 nm and 365/400 nm could distinguish the SG and IG at 36 DAA (Fig. S2 B). Under N2 treatment, the SG and IG at 23 DAA could be differentiated by the autofluorescence values at 405/500 nm, 365/500 nm and 365/400 nm. Meanwhile, the autofluorescence values at 660/700 nm, 645/700 nm and 630/700 nm could distinguish the SG and IG at 30 DAA. Furthermore, the autofluorescence values at 660/700 nm and 405/500 nm could distinguish the SG and IG at 36 DAA (Fig. S2 C).
To explore the separation patterns of seeds at different maturity stages under various features and N treatments, we conducted LDA. For morphological characteristics, the variance explained by LD1 showed an increasing trend with higher nitrogen application, rising from 78.7% in CK to 89.8% in N1 and 93.3% in N2 (Fig. 3A, B, C). However, morphological features alone could not effectively distinguish seeds at different maturity stages. Regarding multispectral features, LD1 explained 72.4%, 69.2%, and 80.4% of the variance under CK, N1, and N2 treatments, respectively. Under CK treatment, seeds at 16 DAA were distinctly separated from other maturity stages, while N1 treatment showed overlapping distributions across all maturity stages (Fig. 3D, E). Notably, under N2 treatment, seeds at different maturity stages demonstrated better separation (Fig. 3F). Although autofluorescence features showed an increasing trend in LD1 explained variance from CK (64.6%) to N1 (74.8%) and N2 (81.1%), the separation between maturity stages was not distinct across all nitrogen treatments (Fig. 3G, H, I). Further analysis of the integrated dataset revealed that LD1 explained 66.9%, 69.5%, and 79.6% of the variance under CK, N1, and N2 treatments, respectively. N2 treatment exhibited the best separation between different maturity stages (Fig. 3J, K, L). Interestingly, we found that in both multispectral and integrated datasets, the gap between SG and IG decreased with increasing nitrogen application, suggesting that nitrogen application promotes the development of inferior grains.
Fig. 3 Two-dimensional scatter plots of LDA results based on different data types at various days after anthesis (DAA) under nitrogen treatments. (A–C) Morphological characteristics, (D–F) Multispectral features, (G–I) Autofluorescence features, and (J–L) Integrated dataset analysis. Each column from left to right represents CK, N1, and N2 nitrogen treatment levels, with data points colored according to DAA (16, 23, 30, and 36 days). Different colored ellipses represent DAA, with solid and dashed ellipses indicating SG and IG positions, respectively. The percentages in parentheses show the proportion of variance explained by each LD axis
We conducted nCDA on the spectra of seeds from the different treatments (Fig. 4). Seeds of the 16_IG and 36_SG treatments were randomly selected for normalization and assigned blue and red colors, respectively, in the nCDA analysis. The nCDA visualizations were produced by randomly selecting 50 seeds from each treatment. The results showed that seeds at 16 DAA had the lowest nCDA values (blue), whereas seeds at 36 DAA had the highest nCDA values (red). As the number of days after anthesis increased, the number of red seeds in the nCDA images gradually increased.
Model analysis using stacking ensemble machine learning for separating different seed maturity
We built a stacking ensemble model (Ensemble) consisting of RF, SVM, XGBoost and Bayesian algorithms (Fig. 5). To compare the performance of different models, we used several evaluation metrics: accuracy, Kappa values, sensitivity, specificity and precision. The results showed that under the three nitrogen application treatments and the mixture treatment (combining CK, N1 and N2 treatments), Ensemble was the best for all metrics on the multi-source fusion data compared to SVM, Bayesian, XGBoost and RF. Under CK, the accuracy scores on the testing set for Ensemble, SVM, Bayesian, XGBoost and RF were 0.89, 0.66, 0.49, 0.54 and 0.63, respectively. The kappa values were 0.85, 0.55, 0.31, 0.39 and 0.50. The Ensemble exhibited the highest precision value (0.89). The sensitivity values across the five models ranged from 0.48 to 0.89, whereas the specificity values varied from 0.83 to 0.96. The confusion matrix of Ensemble showed that the accuracies of distinguishing seeds at 16, 23, 30 and 36 DAA were 98%, 95%, 84% and 87%, respectively (Fig. 5A). Under N1 treatment, the accuracy scores on the testing set for Ensemble, SVM, Bayesian, XGBoost and RF were 0.87, 0.66, 0.49, 0.56 and 0.64, respectively. The kappa values were 0.82, 0.55, 0.32, 0.41 and 0.52. Ensemble showed the highest precision value (0.87). The sensitivity values across the five models varied between 0.50 and 0.86, and the specificity values ranged from 0.83 to 0.96. The confusion matrix of Ensemble showed that the accuracies of distinguishing seeds at 16, 23, 30 and 36 DAA were 91%, 80%, 90% and 96%, respectively (Fig. 5B). Under N2 treatment, the accuracy scores on the testing set for Ensemble, SVM, Bayesian, XGBoost and RF were 0.93, 0.72, 0.47, 0.56 and 0.67, respectively. The kappa values were 0.90, 0.62, 0.30, 0.42 and 0.56. Ensemble showed the highest precision value (0.93). The sensitivity values across the five models varied from 0.47 to 0.93, and the specificity values varied from 0.82 to 0.98. The confusion matrix of Ensemble showed that the accuracies of distinguishing seeds at 16, 23, 30 and 36 DAA were 95%, 90%, 90% and 97%, respectively (Fig. 5C). For the mixture treatment, the performance of the Ensemble declined. The accuracy scores on the testing set for Ensemble, SVM, Bayesian, XGBoost and RF were 0.81, 0.68, 0.46, 0.54 and 0.63, respectively. The confusion matrix of Ensemble showed that the accuracies of distinguishing seeds at 16, 23, 30 and 36 DAA were 92%, 84%, 81% and 91%, respectively (Fig. 5D).
These results indicated that the Ensemble classified seed maturity best under N2 treatment, followed by CK and then N1 treatment. Under mixture treatment, the decline in Ensemble model performance may be due to the increased inconsistency among seeds caused by nitrogen application and grain position. Interestingly, the seeds at 16 DAA and 23 DAA were more effectively identified under CK, while the seeds at 30 DAA and 36 DAA were more accurately discerned under N1 and N2 treatment.
Furthermore, the SHapley Additive exPlanations (SHAP) method was employed to explain the Ensemble model. The results indicated that nitrogen application influenced the global importance of features, with distinct sets of features identified under each treatment (Fig. 6). Notably, 690 nm, 570 nm and 405 nm were among the important features in both N1 and N2 treatments, but not in CK. This suggests that nitrogen application might cause changes in certain substances during seed development, and that these changes could be used to better distinguish seeds at different maturity stages.
Fig. 6 Interpretation of the Ensemble model under CK, N1, N2 and mixture treatment using SHAP. (A) Top fifteen important features of the Ensemble model under CK; (B) N1 treatment; (C) N2 treatment; (D) mixture treatment. The horizontal axis of each scatterplot shows the SHAP value of each feature across the 200 samples. High feature values are shown in yellow and low feature values in purple
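SHAP attributions are built on Shapley values from cooperative game theory. To show what such a value measures, the Python sketch below computes exact Shapley values by enumerating every feature subset for a tiny hypothetical model. This is an illustration only: SHAP software generally relies on approximations for realistic models, and the toy model, baseline and function names are assumptions, not part of this study.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for one sample x.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                # Marginal contribution of feature i when added to this coalition.
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi

# Hypothetical model with one additive term and one interaction term.
def toy_model(f):
    return 2 * f[0] + f[1] * f[2]

phi = shapley_values(toy_model, x=[1, 2, 3], baseline=[0, 0, 0])
print(phi)
```

In this toy case the additive term is credited entirely to the first feature, the interaction term is shared equally between the other two, and the values sum to the difference between the prediction for x and the prediction for the baseline.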
Fifteen features that were important in the mixture treatment were extracted for a correlation analysis with the top four contributors to germination characteristics (Fig. 7). The results revealed a significant correlation (P < 0.001) between saturation and CIELab B*. Notably, saturation was negatively correlated (P < 0.05) with the wavelengths 850 nm, 880 nm and 970 nm, as well as with 405/500 nm, 430/500 nm and 450/500 nm. A significant (P < 0.05) negative correlation was observed between CIELab B* and 690 nm. Positive correlations were observed within the following groups of features: 405 nm, 430 nm and 540 nm; 540 nm, 630 nm, 645 nm and 690 nm; and 850 nm, 880 nm, 970 nm, 365/500 nm, 405/500 nm, 430/500 nm and 450/500 nm. Both 405 nm and 430 nm were negatively correlated with 850 nm, 880 nm, 970 nm, 365/500 nm, 405/500 nm, 430/500 nm and 450/500 nm. FW exhibited a significant (P < 0.05) positive correlation with 405 nm, 430 nm, 540 nm, 630 nm, 645 nm and 690 nm, and a negative correlation with 880 nm. Additionally, SL showed a significant (P < 0.05) positive correlation with 405 nm and 430 nm, and VI showed a significant (P < 0.05) positive correlation with 405 nm, 430 nm, 540 nm and 645 nm. Furthermore, SL and VI showed significant (P < 0.05) negative correlations with 850 nm, 880 nm and 970 nm.
Fig. 7 Pearson correlation between FW, GI, SL, VI and important features, as well as the correlation among important features. ‘*’, ‘**’ and ‘***’ indicate significant correlations at P < 0.05, P < 0.01 and P < 0.001, respectively, while ns indicates no significant correlation (P > 0.05). A solid line signifies a positive correlation, whereas a dotted line signifies a negative correlation
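For reference, the Pearson coefficient used in this correlation analysis can be computed as in the short Python sketch below. The function name and the example vectors are hypothetical and do not come from the study data; the significance test behind the P-value thresholds is not included.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

# Hypothetical values: a feature that rises with a germination index, and one that falls.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3], [3, 2, 1]))
```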
Discussion
Grain positions and nitrogen application promoted seed variation
The results of the variance analysis revealed significant effects of nitrogen application and grain position on seed germination (Table 3). The position of grains on the spike in cereal crops leads to differences in seed size and quality [28, 29]. However, we found that the interaction effects between DAA, N application and grain position were not significant. A possible reason is that the biological effects of these factors are relatively independent, implying that they affect seeds through different pathways and have independent direct effects on seed germination characteristics. Heavier, well-filled grains are called superior, while lighter, poorly filled grains are inferior [30]. In wheat (Triticum aestivum), superior grains bloom early at the bottom of the spikelet, while inferior grains bloom later at the top [26]. In this study, we found that superior grain had a greater weight than inferior grain (Table 1), and that germination characteristics differed between superior and inferior grains (Table 4). These differences may be attributed to several physiological factors. Superior grains, which develop at the bottom of the spike, benefit from a higher grain filling rate, better dry matter accumulation, and a longer period for nutrient supply, while inferior grains, developing later at the top, are less favored in these aspects. The late differentiation of florets, lower carpel weight at anthesis, and weaker endosperm development further contribute to these differences [26]. Inferior grains show greater spatial and temporal variation and are more sensitive to environmental factors like water, temperature, and fertilizer [31]. Appropriate application of nitrogen and potassium fertilizers facilitates the development of inferior grains in wheat and rice (Oryza sativa), resulting in an increase in grain weight [5, 6, 32]. In this study, the application of nitrogen significantly enhanced both grain weight (Table 1) and seedling shoot length (Table 4). These results revealed that grain position and nitrogen application introduced inconsistencies into the seed samples, which made the prediction accuracy for seed maturity unstable across scenarios.
Morphological, multispectral and autofluorescence information variation at different seed maturity
Seed vigor is acquired progressively during seed development and reaches its highest level upon the completion of seed physiological maturation. The appropriate timing of harvest is a pivotal factor in obtaining seeds of high vigor [3]. Previous studies primarily used methods like seed aging and artificial heating to achieve different vigor levels, enhancing spectral differentiation [33, 34]. During production, factors such as harvest time, grain position, irrigation and fertilization contribute to divergences in seed vigor. In this study, we examined four levels of seed maturity, three nitrogen application treatments and two grain positions to evaluate their impact on seed quality and vigor. These factors are crucial for optimizing seed harvest timing, improving seed quality, and understanding how nitrogen fertilization and grain position influence seed development. The results showed that BetaShape_a, BetaShape_b, CIELab A*, CIELab L*, Compactness Circle, Hue, Width/Length Ratio and Width could effectively distinguish between seeds at 16 DAA and 36 DAA under the three nitrogen treatments (Table S1). However, these features were not sufficient to accurately distinguish seeds at 23 DAA and 30 DAA. Multispectral imaging has proven effective in discerning seeds through their spectral characteristics [14, 35, 36]. This study showed that seeds at early stages (16 and 23 DAA) had significantly lower reflectance than those at later stages (30 and 36 DAA) in the wavelength range of 405 nm to 490 nm (Table S2), which is related to changes in chlorophyll content and seed color [37, 38]. Furthermore, the reflectance of seeds in the early stages of development was significantly higher than that of seeds in the later stages in the wavelength range of 780 nm to 970 nm, which could be related to seed lipid content and water content [39]. 
Moreover, both 405 nm and 430 nm showed a significant positive correlation with FW, SL and VI, whereas 850 nm, 880 nm and 970 nm all showed a significant negative correlation with SL and VI (Fig. 7). These results indicate that these features have potential as indicators of seed quality. Additionally, under the N2 treatment, 630/700 nm offered more distinct identification of seed maturity stages, in line with findings from prior studies [14, 40, 41]. Harvesting in this study included a 2–3 day air-drying step, which led to chlorophyll degradation and the natural dark brown colour of smooth bromegrass seeds. This may explain why 630/700 nm, 645/700 nm and 660/700 nm were not effective in distinguishing seed maturity stages under the other treatments in this study.
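As an illustration of how relationships between spectral features and germination indices can be quantified, the following Python sketch computes a correlation coefficient and its p-value between reflectance at two wavelengths and a vigor index. It assumes a Pearson correlation, which is not restated in this section, and it uses synthetic values generated in the code rather than the measurements of this study. All variable names are hypothetical.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_seeds = 120

# Synthetic stand-in data (NOT the study's measurements): a latent maturity
# score drives both the reflectance features and the vigor index.
maturity = rng.uniform(0.0, 1.0, n_seeds)
features = {
    "405 nm": 20 + 10 * maturity + rng.normal(0, 1.5, n_seeds),  # rises with maturity
    "970 nm": 60 - 12 * maturity + rng.normal(0, 1.5, n_seeds),  # falls with maturity
}
vigor_index = 5 + 8 * maturity + rng.normal(0, 1.0, n_seeds)

# Pearson r and two-sided p-value for each feature against the vigor index.
results = {}
for name, values in features.items():
    r, p = pearsonr(values, vigor_index)
    results[name] = (r, p)
    print(f"{name}: r = {r:+.2f}, p = {p:.2e}")
```

With real data, the synthetic arrays would be replaced by per-seed mean reflectance and the measured FW, SL or VI values. A rank-based coefficient may be preferable if the relationships are not linear.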
The models for identifying seed maturity in different application scenarios
Traditionally, seed maturity has been assessed using indicators such as color, water content, and chlorophyll fluorescence signals [42,43,44]. In this study, seed vigor increased progressively with maturity, reaching its peak at 30 and 36 days after anthesis (DAA), as evidenced by seed germination characteristics (Table 4). PCA differentiated seeds at 16 and 23 DAA from those at 30 and 36 DAA (Fig. 2). This differentiation can serve as a valuable reference for determining the optimal harvest time for smooth bromegrass seeds, and it suggests that PCA is a promising tool for optimizing seed harvest timing and supporting seed quality assessment. However, traditional methods of seed analysis are often time-consuming and subjective, relying heavily on the interpretations of different seed analysts. This subjectivity can compromise the reliability of the analysis and requires a high degree of consistency in the samples being examined [45]. In contrast, non-destructive seed quality identification has been successfully achieved by combining machine learning algorithms with multispectral and hyperspectral image analysis [46, 47]. Supervised methods such as Linear Discriminant Analysis (LDA) and non-Linear Canonical Discriminant Analysis (nCDA) have been employed to distinguish between different seed types [48,49,50]. In this study, nCDA effectively distinguished seeds at various maturity levels (Fig. 4). However, factors such as nitrogen application and grain position were found to influence the effectiveness of LDA in distinguishing seed maturity (Fig. 3).
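The contrast between unsupervised PCA and supervised LDA described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses scikit-learn, which is not necessarily the software used in this study, and synthetic reflectance values in which late-stage seeds differ from early-stage seeds at short and near-infrared wavelengths. The wavelength list is taken from the abstract; every other name and number is hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
wl = np.array([405, 430, 540, 630, 645, 690, 850, 880, 970])  # nm
stages = np.repeat([16, 23, 30, 36], 50)                      # DAA labels

# Synthetic reflectance (illustrative only): late stages are shifted up
# below 500 nm and down above 800 nm; adjacent stages differ only slightly.
late = (stages >= 30).astype(float)
shift = np.where(wl < 500, 2.0, np.where(wl > 800, -2.0, 0.0))
fine = np.isin(stages, [23, 36]).astype(float)
X = rng.normal(0, 1, (stages.size, wl.size))
X += np.outer(late, shift) + 0.5 * fine[:, None]

# Unsupervised view: project standardized features onto two components.
X_std = StandardScaler().fit_transform(X)
scores = PCA(n_components=2).fit_transform(X_std)

# Supervised view: cross-validated LDA accuracy over the four stages.
lda = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())
lda_acc = cross_val_score(lda, X, stages, cv=5).mean()
print(f"LDA 5-fold accuracy: {lda_acc:.2f}")
```

In this synthetic setting the first principal component mainly separates the early group from the late group, while separating adjacent stages relies on the class labels that LDA uses.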
Most previous research has focused on single models such as RF, neural networks (NN) and SVM [33, 46, 51]. However, this study found that single classification models performed suboptimally in differentiating seed maturity (Fig. 5). This could be ascribed to the variation introduced by grain position and nitrogen application, which increased seed inconsistency and thereby affected the classification outcomes. An ensemble learning model was therefore devised using a stacking technique to address this problem. Stacking is a technique that integrates a multi-tiered learning architecture to combine the outputs of various learning models [25]. Before constructing a stacking model, it is paramount to rigorously evaluate both the accuracy and the variability of candidate models, because this evaluation guides the judicious selection of base learners [19]. In this study, we constructed a stacking ensemble algorithm with RF, XGBoost, SVM and Bayesian models as the base models. The rationale behind this selection was that a diverse set of base learners offers more comprehensive insight into the relationships between input and output features. However, directly using the base learners’ outputs as the training set for the meta-learner could introduce a risk of overfitting. To mitigate this, the framework incorporated k-fold cross-validation, so that the meta-learner was trained on predictions for samples that the corresponding base learners had not used for training. Compared with SVM, Bayesian, XGBoost and RF, the Ensemble model exhibited superior performance across all evaluated metrics in the four application scenarios (model performance: N2 > CK > N1 > mixture) (Fig. 5). The stacking ensemble not only improved the accuracy of seed maturity identification but also exhibited strong generalization ability and robustness.
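The stacking scheme described above can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the pipeline used in this study: it relies on scikit-learn, substitutes GradientBoostingClassifier for XGBoost and Gaussian naive Bayes for the Bayesian model, assumes a logistic regression meta-learner and five folds, and runs on a synthetic four-class dataset standing in for the four maturity stages.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic four-class data (hypothetical stand-in for seed features).
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_redundant=2, n_classes=4, n_clusters_per_class=1,
                           class_sep=1.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Base learners: stand-ins for RF, XGBoost, SVM and the Bayesian model.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(),
                          SVC(probability=True, random_state=0))),
    ("nb", GaussianNB()),
]

# cv=5: the meta-learner is fitted on out-of-fold base-learner predictions,
# so it never sees predictions made on a base learner's own training rows.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)
stack.fit(X_train, y_train)
stack_acc = accuracy_score(y_test, stack.predict(X_test))

# Fit each base learner alone for a side-by-side comparison.
base_acc = {name: accuracy_score(y_test, est.fit(X_train, y_train).predict(X_test))
            for name, est in base_learners}
print(f"stack: {stack_acc:.2f}", {k: round(v, 2) for k, v in base_acc.items()})
```

Whether the ensemble improves on its best base learner depends on the data, so the comparison printed at the end should be read as a check rather than a guaranteed outcome.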
Interestingly, the global importance of the model’s features varied across application scenarios, implying that nitrogen application altered certain seed characteristics during development. Furthermore, seeds at 16 and 23 DAA were identified more effectively under CK, whereas seeds at 30 and 36 DAA were discerned more accurately under the N1 and N2 treatments. In summary, the stacking ensemble model showed significant potential for seed maturity classification under varying fertilizer conditions. In future research, validating the model with seed data from multiple locations and years could improve both its accuracy and its applicability.
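How global feature importance can shift between scenarios is illustrated by the sketch below. It does not use the SHapley Additive exPlanations method applied in this study; it uses permutation importance from scikit-learn as a simpler, model-agnostic stand-in. The two scenarios are synthetic, each constructed so that a different band carries the class signal, and the band names and the helper top_feature are hypothetical. For brevity, importance is computed on the training data, whereas a held-out set would normally be preferred.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(2)
feature_names = ["405 nm", "540 nm", "690 nm", "850 nm", "970 nm"]

def top_feature(informative_idx):
    """Fit a classifier on synthetic data in which one band carries the
    signal and return the band with the largest permutation importance."""
    X = rng.normal(size=(300, len(feature_names)))
    y = (X[:, informative_idx] + 0.3 * rng.normal(size=300) > 0).astype(int)
    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    # Importance = mean drop in accuracy when one column is shuffled.
    result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
    return feature_names[int(np.argmax(result.importances_mean))]

# Two hypothetical scenarios with different informative bands.
top_by_scenario = {"scenario_A": top_feature(0), "scenario_B": top_feature(4)}
print(top_by_scenario)
```

Applied to models trained separately per treatment, a comparison of this kind shows which features each model relies on, although the rankings from permutation importance and from SHAP need not agree.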
Conclusion
Seed germination characteristics showed that seed vigor increased gradually with maturity, reaching a high level at 30 and 36 DAA. This study employed a stacking ensemble learning approach to identify the seed maturity of smooth bromegrass in four application scenarios based on multispectral imaging and autofluorescence imaging. In a comprehensive comparison with SVM, Bayesian, XGBoost and RF, the Ensemble model showed the best performance in all four application scenarios, and its important features were significantly correlated with seed germination indices. These findings demonstrated the high accuracy and strong generalizability of the Ensemble model for identifying the maturity and quality of smooth bromegrass seeds.
Data availability
No datasets were generated or analysed during the current study.
Code availability
Not applicable.
References
Ferdinandez YSN, Coulman BE. Nutritive values of smooth bromegrass, meadow bromegrass, and meadow × smooth bromegrass hybrids for different plant parts and growth stages. Crop Sci. 2001;41:473–8.
Smart AJ, Schacht WH, Volesky JD, Moser LE. Seasonal changes in dry matter partitioning, yield, and crude protein of intermediate wheatgrass and smooth bromegrass. Agron J. 2006;98:986–91.
Finch-Savage WE, Bassel GW. Seed vigour and crop establishment: extending performance beyond adaptation. J Exp Bot. 2016;67:567–91.
Dong Y, Wang Y-Z. Seed shattering: from models to crops. Front Plant Sci [Internet]. 2015 [cited 2023 Oct 12];6. Available from: http://journal.frontiersin.org/Article/10.3389/fpls.2015.00476/abstract
Liang W, Zhang Z, Wen X, Liao Y, Liu Y. Effect of non-structural carbohydrate accumulation in the stem pre-anthesis on grain filling of wheat inferior grain. Field Crops Res. 2017;211:66–76.
Yang J, Zhang J. Grain-filling problem in ‘super’ rice. J Exp Bot. 2010;61:1–5.
Sinclair TR, Rufty TW. Nitrogen and water resources commonly limit crop yield increases, not necessarily plant genetics. Global Food Secur. 2012;1:94–8.
Triboi E, Abad A, Michelena A, Lloveras J, Ollier JL, Daniel C. Environmental effects on the quality of two wheat genotypes: 1. quantitative and qualitative variation of storage proteins. Eur J Agron. 2000;13:47–64.
Liu Y, Liao Y, Liu W. High nitrogen application rate and planting density reduce wheat grain yield by reducing filling rate of inferior grain in middle spikelets. Crop J. 2021;9:412–26.
Wen D, Xu H, He M, Zhang C. Proteomic analysis of wheat seeds produced under different nitrogen levels before and after germination. Food Chem. 2021;340:127937.
Jia Z, Sun M, Ou C, Sun S, Mao C, Hong L, et al. Single seed identification in three Medicago species via multispectral imaging combined with stacking ensemble learning. Sensors. 2022;22:7521.
Yang L, Zhang Z, Hu X. Cultivar discrimination of single alfalfa (Medicago sativa L.) seed via multispectral imaging combined with multivariate analysis. Sensors. 2020;20:6575.
Amanah HZ, Joshi R, Masithoh RE, Choung M-G, Kim K-H, Kim G, et al. Nondestructive measurement of anthocyanin in intact soybean seed using Fourier transform near-infrared (FT-NIR) and Fourier transform infrared (FT-IR) spectroscopy. Infrared Phys Technol. 2020;111:103477.
Jia Z, Ou C, Sun S, Wang J, Liu J, Sun M, et al. Integrating optical imaging techniques for a novel approach to evaluate Siberian wild rye seed maturity. Front Plant Sci. 2023;14:1170947.
Varela JI, Miller ND, Infante V, Kaeppler SM, De Leon N, Spalding EP. A novel high-throughput hyperspectral scanner and analytical methods for predicting maize kernel composition and physical traits. Food Chem. 2022;391:133264.
Wang Y, Peng Y, Zhuang Q, Zhao X. Feasibility analysis of NIR for detecting sweet corn seeds vigor. J Cereal Sci. 2020;93:102977.
Jia Z, Ou C, Sun S, Wang J, Liu J, Li M, et al. A novel approach using multispectral imaging for rapid development of seed pellet formulations to mitigate drought stress in alfalfa. Comput Electron Agric. 2023;212:108136.
Alves Ribeiro VH, Reynoso-Meza G. Ensemble learning by means of a multi-objective optimization design approach for dealing with imbalanced data sets. Expert Syst Appl. 2020;147:113232.
Li Q, Song Z. Prediction of compressive strength of rice husk ash concrete based on stacking ensemble learning model. J Clean Prod. 2023;382:135279.
Sun S, Wang S, Wei Y. A new ensemble deep learning approach for exchange rates forecasting and trading. Adv Eng Inform. 2020;46:101160.
González S, García S, Del Ser J, Rokach L, Herrera F. A practical tutorial on bagging and boosting based ensembles for machine learning: algorithms, software tools, performance study, practical perspectives and opportunities. Inform Fusion. 2020;64:205–37.
Webb GI, Zheng Z. Multistrategy ensemble learning: reducing error by combining ensemble learning techniques. IEEE Trans Knowl Data Eng. 2004;16:980–91.
Zhou ZH. Ensemble methods: foundations and algorithms [Internet]. 1st ed. Chapman and Hall/CRC; 2012 [cited 2023 Oct 12]. Available from: https://www.taylorfrancis.com/books/9781439830055
Farooq F, Ahmed W, Akbar A, Aslam F, Alyousef R. Predictive modeling for sustainable high-performance concrete from industrial wastes: A comparison and optimization of models using ensemble learners. J Clean Prod. 2021;292:126032.
Wu T, Zhang W, Jiao X, Guo W, Alhaj Hamoud Y. Evaluation of stacking and blending ensemble learning methods for estimating daily reference evapotranspiration. Comput Electron Agric. 2021;184:106039.
Jiang D, Cao W, Dai T, Jing Q. Activities of key enzymes for starch synthesis in relation to growth of superior and inferior grains on winter wheat (Triticum aestivum L.) spike. Plant Growth Regul. 2003;41:247–57.
Lang M, Binder M, Richter J, Schratz P, Pfisterer F, Coors S, et al. mlr3: A modern object-oriented machine learning framework in R. J Open Source Softw. 2019;4:1903.
Ishimaru T, Hirose T, Matsuda T, Goto A, Takahashi K, Sasaki H, et al. Expression patterns of genes encoding carbohydrate-metabolizing enzymes and their relationship to grain filling in rice (Oryza sativa L.): comparison of caryopses located at different positions in a panicle. Plant Cell Physiol. 2005;46:620–8.
Liu Y, Liang H, Lv X, Liu D, Wen X, Liao Y. Effect of polyamines on the grain filling of wheat under drought stress. Plant Physiol Biochem. 2016;100:113–29.
Langer RHM, Hanif M. A study of floret development in wheat (Triticum aestivum L.). Ann Bot. 1973;37:743–51.
Peng T, Du Y, Zhang J, Li J, Liu Y, Zhao Y, et al. Genome-wide analysis of 24-nt siRNAs dynamic variations during rice superior and inferior grain filling. PLoS ONE. 2013;8:e61029.
Fu J, Huang Z, Wang Z, Yang J, Zhang J. Pre-anthesis non-structural carbohydrate reserve in the stem enhances the sink strength of inferior spikelets during grain filling of rice. Field Crops Res. 2011;123:170–82.
Barboza Da Silva C, Oliveira NM, De Carvalho MEA, De Medeiros AD, De Lima Nogueira M, Dos Reis AR. Autofluorescence-spectral imaging as an innovative method for rapid, non-destructive and reliable assessing of soybean seed quality. Sci Rep. 2021;11:17834.
Ma T, Tsuchikawa S, Inagaki T. Rapid and non-destructive seed viability prediction using near-infrared hyperspectral imaging coupled with a deep learning approach. Comput Electron Agric. 2020;177:105683.
Hu X, Yang L, Zhang Z. Non-destructive identification of single hard seed via multispectral imaging analysis in six legume species. Plant Methods. 2020;16:116.
Zhang S, Zeng H, Ji W, Yi K, Yang S, Mao P, et al. Non-destructive testing of alfalfa seed vigor based on multispectral imaging technology. Sensors. 2022;22:2760.
ElMasry G, Mandour N, Wagner M-H, Demilly D, Verdier J, Belin E, et al. Utilization of computer vision and multispectral imaging techniques for classification of cowpea (Vigna unguiculata) seeds. Plant Methods. 2019;15:24.
Boelt B, Shrestha S, Salimi Z, Jørgensen JR, Nicolaisen M, Carstensen JM. Multispectral imaging – a new tool in seed quality assessment? Seed Sci Res. 2018;28:222–8.
Barlocco N, Vadell A, Ballesteros F, Galietta G, Cozzolino D. Predicting intramuscular fat, moisture and Warner-Bratzler shear force in pork muscle using near infrared reflectance spectroscopy. Anim Sci. 2006;82:111–6.
Donaldson L, Williams N. Imaging and spectroscopy of natural fluorophores in pine needles. Plants. 2018;7:10.
Kenanoglu BB, Demir I, Jalink H. Chlorophyll fluorescence sorting method to improve quality of Capsicum pepper seed lots produced from different maturity fruits. HortScience. 2013;48:965–8.
Ellis RH. Temporal patterns of seed quality development, decline, and timing of maximum quality during seed development and maturation. Seed Sci Res. 2019;29:135–42.
Jalink H, Van Der Schoor R, Frandas A, Van Pijlen JG, Bino RJ. Chlorophyll fluorescence of Brassica oleracea seeds as a non-destructive marker for seed maturity and seed performance. Seed Sci Res. 1998;8:437–43.
Zhao P, Chu L, Wang K, Zhao B, Li Y, Yang K, et al. Analyses on the pigment composition of different seed coat colors in Adzuki bean. Food Sci Nutr. 2022;10:2611–9.
Rahman A, Cho B-K. Assessment of seed quality using non-destructive measurement techniques: a review. Seed Sci Res. 2016;26:285–305.
Batista TB, Mastrangelo CB, De Medeiros AD, Petronilio ACP, Fonseca De Oliveira GR, Dos Santos IL, et al. A reliable method to recognize soybean seed maturation stages based on autofluorescence-spectral imaging combined with machine learning algorithms. Front Plant Sci. 2022;13:914287.
Xia Y, Xu Y, Li J, Zhang C, Fan S. Recent advances in emerging techniques for non-destructive detection of seed viability: A review. Artif Intell Agric. 2019;1:35–47.
Mortensen AK, Gislum R, Jørgensen JR, Boelt B. The use of multispectral imaging and single seed and bulk near-infrared spectroscopy to characterize seed covering structures: methods and applications in seed testing and research. Agriculture. 2021;11:301.
Shrestha S, Deleuran L, Gislum R. Classification of different tomato seed cultivars by multispectral visible-near infrared spectroscopy and chemometrics. J Spectr Imaging. 2016;5:a1.
Vrešak M, Halkjaer Olesen M, Gislum R, Bavec F, Ravn Jørgensen J. The use of image-spectroscopy technology as a diagnostic method for seed health testing and variety identification. PLoS ONE. 2016;11:e0152011.
Pang L, Wang J, Men S, Yan L, Xiao J. Hyperspectral imaging coupled with multivariate methods for seed vitality estimation and forecast for Quercus variabilis. Spectrochim Acta Part A Mol Biomol Spectrosc. 2021;245:118888.
Funding
This research was supported by the earmarked fund for CARS (CARS-34).
Author information
Contributions
PM conceived and designed the experiment. CO performed the experiments. CO, ZJ and SZ analyzed the data. SZ, SS, MS and JL contributed to the experiment. SJ and ML provided writing suggestions. CO, SZ and ZJ wrote the paper, and PM revised the paper. All authors contributed to the article and approved the submitted version.
Ethics declarations
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ou, C., Jia, Z., Zhao, S. et al. A novel approach integrating multispectral imaging and machine learning to identify seed maturity and vigor in smooth bromegrass. Plant Methods 21, 45 (2025). https://doi.org/10.1186/s13007-025-01359-8
DOI: https://doi.org/10.1186/s13007-025-01359-8