
A statistical method for high-throughput emergence rate calculation for soybean breeding plots based on field phenotypic characteristics

Abstract

In smart breeding, rapid statistics of soybean emergence rate are an important part of breeding screening, but they face challenges under environmental constraints, especially in the selection of soybean varieties for dense planting. Owing to environmental factors, existing methods suffer from low throughput, low efficiency, and insufficient precision, so an effective and precise statistical method is required. In this study, UAV (Unmanned Aerial Vehicle) imagery combined with ground measurement data was used to explore the feasibility of improving the throughput, efficiency, and accuracy of breeding screening under intensive soybean planting. To this end, a technical pipeline comprising background removal, object detection, and accurate counting was designed. First, a combined background segmentation method based on contrast-enhancement filtering together with the excess-green feature and the Otsu algorithm was proposed to remove the complex background in remote sensing images while retaining the morphological information of soybean seedlings. Second, a deep learning object detection model was used to infer on the processed images and label the soybean seedlings. A soybean seedling counting algorithm was then constructed: based on a soybean seedling growth model, the idea of "growth normalization" was proposed, and expansion and shrinkage factors were defined to eliminate the influence of inconsistent seedling growth on counting. After statistical analysis of the growth and planting characteristics of soybean seedlings under overlapping conditions, an "inter-seedling occlusion counting algorithm" was proposed to solve the problem of counting overlapping seedlings. To handle overlapping bounding boxes, a soft strategy was designed to avoid the redundant values they introduce. Finally, based on the calculation results, a thematic map of soybean emergence rate at the breeding-plot level was produced. Experiments show that the proposed method can effectively count the number of soybean seedlings in an image, with an overall accuracy of 99.18% and an error rate of 0.82%. In addition, YOLOv8n had the best recognition performance in the soybean seedling detection task, with an mAP (0.5–0.95) of 85.15%, and the proposed background segmentation method increased the mAP (0.5–0.95) of the detection results by 4.06%. Experimental tests verify that this method provides solid support for the statistics of soybean emergence rate under intensive planting; it helps accelerate the breeding process and offers new ideas and reference directions for further exploration of efficient screening.

Introduction

Rapid statistics of soybean emergence rate are of great significance for field managers: they can be used to assess soybean variety quality, predict yields, guide field management, and optimize planting strategies, helping managers make more informed planting decisions and improve production efficiency. Traditional crop emergence statistics, however, rely mainly on manual observation, which is inefficient and responds slowly to environmental changes and emergencies, delaying management and decision-making. In recent years, remote sensing based on UAV platforms has developed rapidly and, owing to its reliability, flexibility, and economy, has become an effective means of regional farmland monitoring [1].

At present, traditional image detection methods perform poorly on soybean seedling images with dim brightness and shadows and are prone to false and missed detections. In addition, in dense planting environments it is difficult for existing methods to count crop seedlings rapidly, accurately, and efficiently. It is therefore particularly important to improve the detection accuracy of soybean seedling images and to design a high-throughput soybean seedling counting algorithm.

Accurate detection of seedlings in images is a critical step in counting them. The mainstream methods for image segmentation and object detection are currently based on deep learning [2,3,4]. Quan et al. [5] employed VGG16 as a pre-trained network for Faster R-CNN to detect corn seedlings at various growth stages against field backgrounds of weeds and soil. Machefer et al. fine-tuned an existing deep learning architecture, Mask R-CNN, to jointly count and segment individual plants of two low-density crops, potatoes and lettuces. Computer vision techniques have also been applied, using algorithms such as edge detection and watershed to extract plant contours, followed by threshold segmentation algorithms such as Otsu to isolate the plants themselves [6,7,8]. Zhao et al. [6] applied Otsu thresholding to segment rapeseed objects from vegetation index images and extracted shape features such as area, aspect ratio, and ellipse fit from the segmented plants to build a regression model for seedling counting. Features such as vegetation indices, colors, and textures have likewise been combined with machine learning algorithms, including K-means clustering, Random Forest (RF), and Support Vector Machine (SVM), for classification [9,10,11]. Chen et al. [10] first performed K-means clustering and then identified corn seedlings by feeding a multi-feature fusion input to a support vector machine. Although these methods can detect seedlings in images, their performance is not ideal when image brightness is low and shadows interfere.

At present, the main means of detecting and counting seedlings in the field is to combine UAV remote sensing imagery with deep learning. Liu et al. [12] developed a rapid estimation system for maize seedling emergence based on deep learning algorithms, using UAV-acquired RGB images to build an optimal model for recognizing seedling position, spacing, and size. The authors of [10] proposed an efficient, fast, real-time method for counting cabbage seedlings that combines an improved YOLOv8n, tracking algorithms, and image processing to accurately track cabbage seedlings in the field from UAV imagery. Zhuang et al. [13] developed a comprehensive solution for determining maize seedling emergence and leaf emergence rates using a field orbital phenotyping platform and convolutional neural networks, aiming to extract corn seedling information in field settings swiftly and precisely while minimizing labor costs. Gao et al. [14] introduced an automated recognition approach for maize seedlings from UAV images that adapts to various complex scenarios (different varieties and seedling developmental stages) by fine-tuning a Mask R-CNN model. Pan et al. [15] provided a method based on a modified Faster R-CNN for the automatic detection and counting of sugarcane seedlings from aerial photography. The most common approaches combine image or video streams with deep learning or computer vision to identify seedlings. Tan et al. [16] examined CenterTrack, a tracking approach leveraging deep convolutional neural networks, for tallying cotton seedlings and flowers in video sequences, and in follow-up work [17] refined the cotton seedling tracking method by integrating a one-stage object detection network with optical flow, improving both tracking speed and counting precision. To address the limitations of manual observation, Yu [18] investigated computer vision for automated detection at two crucial maize growth stages: seedling emergence and the three-leaf stage. Oh et al. [19] introduced a deep learning-driven plant counting algorithm that reduces input variables and eliminates the need for geometric or statistical information, employing You Only Look Once version 3 (YOLOv3) and photogrammetry to isolate, localize, and enumerate cotton plants at the seedling stage. Zhang et al. [20] developed a CNN model to identify leaves in UAV images and used it to estimate rapeseed stand count; CNN-based leaf counting performed best at the four-to-six-leaf stage, confirming the feasibility of automatically, quickly, and accurately estimating rapeseed stand numbers in the field. For rapid, automated counting of corn seedlings, Liu et al. [21] developed three models to estimate maize seedling numbers in fields: a corner detection model (C), a linear regression model (L), and a deep learning model (D); notably, the L model adapted well to varied data. Feng et al. [22] evaluated cotton emergence two weeks after planting using high-resolution narrowband spectral indices derived from a UAV-mounted pushbroom hyperspectral imager flown 50 m above the ground. These studies highlight the potential of computer vision, regression models, and spectral indices to improve the efficiency and accuracy of crop monitoring and counting, contributing to the development of precision agriculture.

Most of the crop seedlings examined above were corn, cotton, sugarcane, cabbage, and the like, planted at relatively low density and with distinct individual characteristics. In dense planting environments, however, it is difficult for the above methods to produce rapid, accurate, and efficient seedling counts. For soybean seedlings at the three-leaf stage, overlap and occlusion between seedlings are severe, and a statistical method suited to this environment is urgently needed. To solve these problems, this paper proposes a high-throughput statistical method for the emergence rate of soybean breeding plots based on field phenotypic characteristics. To handle the overlap and uneven growth of soybean seedlings, we designed a plot-level soybean seedling counting algorithm by establishing a soybean seedling growth model and analyzing it in depth, so that the number of soybean seedlings can be calculated more accurately on top of deep learning object detection. Second, this paper proposes a preprocessing method for soybean seedling images that can quickly segment farmland plots in batches and optimize image quality, helping to improve the accuracy of the detection model. In addition, to solve the problem of duplicate labels in the detection results, this paper designs an elimination strategy that avoids the counting errors caused by redundant labels. Finally, the emergence rate information of the screened plots was obtained quickly, and soybean seedling emergence was displayed intuitively.

Materials and methods

Data collection

The experiment was carried out at the soybean breeding experimental base of Shenyang Agricultural University, Shenyang City, Liaoning Province, China, as part of a screening trial of density-tolerant soybean varieties. The field measures about 93 m from northwest to southeast and about 81 m from southwest to northeast and contains a total of 2599 breeding plots, as shown in Fig. 1. The sowing date was May 1, 2024; the planting density was 200,000 plants/ha, with 50 cm row spacing and 10 cm plant spacing; and 20 soybean plants were planted in each breeding plot along the same horizontal line.

Fig. 1 Diagram of soybean breeding experimental area

The data collection was conducted at the three-leaf stage of the soybean seedlings (May 24, 2024) to acquire UAV orthophotos and ground emergence rate data. A DJI M300 UAV (Shenzhen, Guangdong Province, China) equipped with a P1 camera was used, with a resolution of 8192 × 5460 (45 million effective pixels). The flight took place at 15:00 on a clear, windless day along a route pre-set in the DJI GO software, with 80% side overlap, 80% course overlap, and a flight altitude of 22 m. Digital image registration, correction, and stitching were completed with DJI Terra software, yielding an orthophoto of the test area measuring 76,875 × 75,268 pixels. A batch segmentation algorithm was designed for the soybean breeding plots, dividing the experimental area into 2599 plots with each breeding plot as a unit; the image size of each plot was 2040 × 480. A total of 694 breeding plots were randomly selected as the original data source and divided into training, validation, and test sets at a ratio of 7:2:1 for training the detection model.

Batch segmentation and background removal of soybean breeding plots

Design of batch segmentation method for breeding plots

To obtain counting results at the plot level, we leveled the remote sensing images of soybean seedlings and cut them in batches. First, with the help of Photoshop, the overall image was rotated to horizontal and divided into four parts. The plot rows were then made parallel by oblique cutting so that each plot forms a standard grid cell. Finally, Python was used to crop the individual plots in batches. The process is shown in Fig. 2.

Fig. 2 Schematic diagram of batch segmentation process based on breeding plot units
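As a concrete illustration of this step, the following is a minimal Python sketch of the batch plot cropping, assuming the orthomosaic has already been leveled and the plots form a regular grid; the file names, grid origin, and row/column counts are illustrative assumptions, not the authors' code.

```python
import os
import cv2

PLOT_W, PLOT_H = 2040, 480  # pixel size of one breeding plot, as in the text

def crop_plots(mosaic_path: str, out_dir: str, rows: int, cols: int,
               x0: int = 0, y0: int = 0) -> None:
    """Cut a leveled orthomosaic into rows x cols plot images."""
    os.makedirs(out_dir, exist_ok=True)
    mosaic = cv2.imread(mosaic_path)
    for r in range(rows):
        for c in range(cols):
            x, y = x0 + c * PLOT_W, y0 + r * PLOT_H
            plot = mosaic[y:y + PLOT_H, x:x + PLOT_W]
            cv2.imwrite(os.path.join(out_dir, f"plot_r{r:03d}_c{c:03d}.png"), plot)
```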

Background removal of breeding plot image

Background information in the original images (e.g., shadows, dead grass, and soil) interferes to varying degrees with the annotation and detection of soybean seedlings and thus with the seedling counts. To reduce this interference, we remove the background from the original soybean seedling images, improving the accuracy of the detection results.

Contrast enhancement

Because of the dim brightness of the original images, soybean seedlings are difficult to distinguish from shadowed areas, which degrades the seedling image after background removal. To suppress useless information while retaining the seedlings' morphological information as completely as possible, we brightened the seedlings and their surroundings in the original image, enhancing the contrast between the seedlings and the surrounding shadows and soil. This helps separate the seedlings from the background [23]. Figure 3 shows the effect before and after the adjustment.

Fig. 3 Comparison of the effects of background removal
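The paper does not specify the exact enhancement operator; one plausible minimal choice is a linear brightness/contrast adjustment, sketched below, where the gain and offset values are illustrative rather than the authors' settings.

```python
import cv2

def enhance_contrast(img, alpha: float = 1.4, beta: int = 25):
    """Linear adjustment out = alpha * img + beta, clipped to [0, 255]."""
    return cv2.convertScaleAbs(img, alpha=alpha, beta=beta)
```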

Noise filtering

The collected soybean seedling images also contain noise spots; if the background were removed directly, complex speckles would remain in the image. These spots blur the seedlings and their boundaries, affecting the subsequent detection. Therefore, this study filters out the spots in the seedling images. Four mainstream filtering methods were compared: median filtering [24], bilateral filtering [25], Gaussian filtering [26], and mean filtering [27]. Experimental comparison showed that a median filter with a 3 × 3 convolutional kernel produced the sharpest image while keeping texture details largely intact. Figure 4 shows background-removed soybean seedling images without and with filtering.

Fig. 4 Comparison of results of different filtering methods
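The four candidate filters can be compared in a few lines of OpenCV; the 3 × 3 median kernel matches the choice reported above, while the parameters of the other three filters are illustrative assumptions.

```python
import cv2

img = cv2.imread("plot.png")
median    = cv2.medianBlur(img, 3)               # 3 x 3 median filter (selected)
bilateral = cv2.bilateralFilter(img, 9, 75, 75)  # edge-preserving smoothing
gaussian  = cv2.GaussianBlur(img, (3, 3), 0)     # Gaussian smoothing
mean      = cv2.blur(img, (3, 3))                # box (mean) filter
```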

Background removal using the excess-green feature combined with the Otsu algorithm

To separate the background and reduce interference, we extract the soybean seedlings using a combination of the excess-green feature segmentation algorithm [28] and the Otsu algorithm [26]. The excess-green segmentation algorithm exploits the distinctive color features of the green vegetation parts of plant images: in RGB color space, green vegetation has a relatively specific distribution of color components. The excess-green feature value uses this regularity, employing a specific calculation to highlight the color differences between the vegetation area and other background areas (such as soil and shadow), and thereby segments and extracts the vegetation from the image. The Otsu algorithm selects the threshold automatically, requires no manually set parameters, and is simple and stable to compute.

Usually, the excess-green feature value of each pixel is computed first, based on the values of the three RGB color channels. In this paper, the color index of green vegetation is calculated as follows:

$$\begin{array}{c}ExG=0.441R-0.811G+0.385B+18.78745\end{array}$$
(1)

where R, G, and B represent the RGB intensity values of each pixel. Through this calculation, the feature values of vegetation pixels, with their large green components, are clearly separated from those of non-vegetation background areas.
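A minimal sketch of this segmentation step is shown below: the index of Eq. (1) is computed per pixel, normalized, and thresholded with Otsu's method. Which side of the threshold corresponds to vegetation should be verified on sample images; the mask polarity chosen here is an assumption.

```python
import cv2
import numpy as np

def remove_background(img_bgr: np.ndarray) -> np.ndarray:
    """Keep pixels classified as vegetation by Eq. (1) + Otsu; zero the rest."""
    b, g, r = cv2.split(img_bgr.astype(np.float32))
    index = 0.441 * r - 0.811 * g + 0.385 * b + 18.78745   # Eq. (1)
    index8 = cv2.normalize(index, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # Otsu picks the threshold automatically; INV keeps the low-index class,
    # which is where strongly green pixels fall under this coefficient set.
    _, mask = cv2.threshold(index8, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return cv2.bitwise_and(img_bgr, img_bgr, mask=mask)
```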

As can be seen from Fig. 5, the boundary details of the soybean seedling contours become clearer after background removal, improving the annotation accuracy of the seedling sample data.

Fig. 5 Comparison chart of data annotation effects before and after background removal

Semi-automatic annotation of soybean seedling images

Labelme is commonly used for object detection and segmentation tasks in deep learning. Annotating soybean seedling data with this tool requires manually drawing a rectangle around every seedling in the image, which is time-consuming. To reduce the workload and time of manual annotation, this paper adopts a semi-automatic annotation method: a deep learning model pre-labels the soybean seedlings in each image, and the annotator then only fine-tunes the labeled regions to complete the annotation quickly. The specific steps are as follows. First, 200 sample images were labeled manually. Second, these data were fed into the initial deep learning model for pre-training, yielding a trained weight file. Third, using these weights, the model identified and labeled the next 200 soybean seedling images, and the labels were fine-tuned manually. The detection model was then trained a second time on all 400 samples, producing a second weight file. Finally, using the second-round weights, the model labeled the remaining images, and the results were fine-tuned. Intermediate results of the labeling process are shown in Fig. 6: after the first round of training, the model could label the seedlings automatically but poorly, with overlapping labels; after the second round, annotation accuracy improved.

Fig. 6 Schematic diagram of the semi-automatic annotation process of soybean seedling images
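The iterative loop described above can be condensed into the following sketch using the Ultralytics YOLO API; dataset YAML files, directory names, and epoch counts are placeholders, and the conversion between Labelme annotations and YOLO's label format, as well as the manual fine-tuning after each predict step, is omitted.

```python
from ultralytics import YOLO

# Round 1: train on the 200 manually labeled images, then pre-label the next 200.
model = YOLO("yolov8n.pt")                                  # pretrained weights
model.train(data="seedlings_round1.yaml", epochs=100)
model.predict(source="unlabeled_batch_1/", save_txt=True)   # boxes to fine-tune

# Round 2: retrain on all 400 corrected images, then pre-label the rest.
model = YOLO("runs/detect/train/weights/best.pt")
model.train(data="seedlings_round2.yaml", epochs=100)
model.predict(source="unlabeled_rest/", save_txt=True)
```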

Selection of a soybean seedling target detection model

To count seedling emergence in each plot, the algorithm designed in this paper first detects the soybean seedlings in the image to obtain the boundary lengths of the detection boxes. The YOLO series of object detection algorithms is widely used owing to its speed, accuracy, strong multi-target processing, versatility, simple network structure, and continuous optimization and iteration [29]. In this experiment we compared the recognition performance of five object detection models from the YOLO series; YOLOv8n performed best in terms of accuracy and speed when detecting soybean seedlings and was therefore adopted as the base network for soybean seedling image detection [30].

The object detection network marks each seedling position with a rectangular box and records its coordinates, from which the box boundary lengths and other relevant information can be calculated.
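In the Ultralytics result format, this coordinate extraction reads roughly as follows; the weight and image paths are placeholders, and this is our sketch rather than the authors' code.

```python
from ultralytics import YOLO

model = YOLO("best.pt")
result = model.predict(source="plot_0001.png")[0]
boxes = result.boxes.xyxy.cpu().numpy()   # one row per box: x1, y1, x2, y2
widths = boxes[:, 2] - boxes[:, 0]        # horizontal width of each detection box
```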

Design of an emergence counting method based on soybean phenotype

Counting algorithm for uniformly grown soybean seedlings

To count soybean seedlings at the plot level, we designed a soybean seedling counting algorithm that works with the deep learning model to calculate the number of seedlings detected in each plot. First, to facilitate analysis, we simulated the growth of soybean seedlings in a plot, as shown in Fig. 7; the length units in the image are pixels. According to the actual planting scheme, 20 soybean seedlings are planted per plot along the same horizontal line. Assuming that the seedlings in all planting pits grow uniformly and are just unoccluded by one another, the rectangular space occupied by each seedling is \({{w}_{\alpha }\times h}_{\alpha }\) (\({h}_{\alpha }{=w}_{\alpha }\)).

Fig. 7 Schematic diagram of counting algorithm for uniformly grown soybean seedlings

Therefore, the number of soybean seedlings in each plot can be statistically calculated according to the formula:

$$\begin{array}{c}{Num}_{0}=k\times \frac{1}{W}\sum {w}_{\alpha }\end{array}$$
(2)

where \(k\) is the number of soybean seedlings actually planted in each plot (20 by default), \(W\) is the sum of the horizontal widths of the 20 planting pits in the unit plot (its effective value is 1716.3182 after statistical calculation), and \({w}_{\alpha }\) is the horizontal width of each soybean seedling under uniform growth.
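Equation (2) translates directly into code; K and W below take the values given in the text.

```python
K = 20           # seedlings planted per plot
W = 1716.3182    # total horizontal width (px) of the 20 planting pits

def count_uniform(widths) -> float:
    """Eq. (2): plot-level count under the uniform-growth assumption."""
    return K * sum(widths) / W
```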

Algorithm for counting soybean seedlings with uneven growth—“growth normalization”

The above calculation has limitations: it does not apply to seedlings with uneven growth. We therefore simulated the growth of seedlings in a plot under uneven growth (without overlap), analyzed the growth information of individual seedlings, and designed a counting formula for single seedlings under uneven growth, as shown in Fig. 8; the length units in the image are pixels.

Fig. 8 Schematic diagram of “growth normalization”

We propose the idea of "growth normalization": by introducing the expansion factor \(\rho\) and the shrinkage factor \(\sigma\), every unevenly grown seedling is rescaled to a standard seedling of uniform size (the horizontal pixel width of a standard seedling equals the horizontal pixel width of one planting pit, i.e., a seedling that is just unoccluded) before counting. The calculation formulas are as follows:

$$\begin{array}{c}{Num}_{1}=k\times \frac{1}{W}{\times w}_{\omega }\end{array}$$
(3)
$$\begin{array}{c}{w}_{\omega }=\sum {w}_{\alpha }+\sum \rho \cdot {w}_{\beta }+\sum \sigma \cdot {w}_{\gamma }\end{array}$$
(4)
$$\begin{array}{c}\sigma =\frac{{w}_{\alpha }}{{w}_{\gamma }}\end{array}$$
(5)
$$\begin{array}{c}\rho =\frac{{w}_{\alpha }}{{w}_{\beta }}\end{array}$$
(6)

where \(k\) is the number of seedlings actually planted per plot (20 by default); \(W\) is the sum of the horizontal widths of the 20 planting pits in the unit plot; \({w}_{\omega }\) is the total horizontal width of the soybean seedling detection boxes; \({w}_{\alpha }\) is the horizontal width of the detection box of a seedling under standard growth; \({w}_{\beta }\) is the horizontal width of the detection box of a seedling under weak growth; \({w}_{\gamma }\) is the horizontal width of the detection box of a seedling under strong growth; \(\rho\) is the expansion factor; and \(\sigma\) is the shrinkage factor.
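A compact sketch of Eqs. (3)-(6): every detected single-seedling box is rescaled to the standard pit width before summing. Classifying a box as weak or strong by comparing its width with the standard width is our reading of the scheme.

```python
K, W = 20, 1716.3182
W_STD = W / K                      # standard per-pit width, 85.8159 px

def count_normalized(widths) -> float:
    """Eqs. (3)-(6): growth-normalized count for non-overlapping seedlings."""
    w_omega = 0.0
    for w in widths:
        factor = W_STD / w         # rho if w < W_STD (weak), sigma if w > W_STD
        w_omega += factor * w      # each seedling contributes exactly W_STD
    return K * w_omega / W         # Eq. (3)
```

Note that because \(\rho \cdot {w}_{\beta }=\sigma \cdot {w}_{\gamma }={w}_{\alpha }\), each non-overlapping seedling contributes exactly one standard width, so \({Num}_{1}\) reduces to the number of single-seedling boxes, as the worked example in the Results section confirms.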

Counting algorithm under overlap between soybean seedlings

The images to be detected contain many overlapping seedlings whose individuals cannot be identified and counted directly. To solve this problem, we statistically analyzed all overlapping soybean seedling images and, combining the growth and planting characteristics of seedlings at the three-leaf stage, designed an algorithm to count the seedlings in each plot under overlap. The growth of seedlings in a unit plot was simulated under overlapping conditions and divided into three critical cases that may cause counting errors: strong growth on both sides, weak growth on both sides, and strong growth on one side with weak growth on the other. As shown in Fig. 9, the length units in the image are pixels.

Fig. 9 Schematic diagram of overlapping soybean seedlings

After analysis, we found that the key factor affecting the count of overlapping soybean seedlings is the uneven growth of the seedlings on the two sides. With fixed planting points, uneven growth on the two sides changes the horizontal width of the detection box and thus biases calculations based on box edge lengths. We therefore statistically analyzed the growth of soybean seedlings in all plots and found that seedling growth in this period was less than 1.25 times the standard growth, i.e., \(\Delta {d}_{1} <\frac{W}{4k}\) (\(\Delta {d}_{1}={w}_{\alpha }-{w}_{0}\), with \({w}_{\alpha }\) representing the actual horizontal width of each soybean seedling). At the same time, the planting point of each seedling lies at the center of its planting pit, so \(\Delta {d}_{2} <\frac{W}{2k}\) (\(\Delta {d}_{2}={w}_{0}-{w}_{\alpha }\)). Based on these conditions, we designed a statistical algorithm for the number of soybean seedlings when seedlings in a plot overlap; the specific calculation formulas are as follows:

$$\begin{array}{c}{Num}_{2}=\sum \left[\frac{{\omega }_{\text{w}}}{{\text{w}}_{0}}\right]\end{array}$$
(7)
$$\begin{array}{c}{\text{w}}_{0}=\frac{1}{n}\sum \frac{{W}_{n}}{k}\end{array}$$
(8)

where \({Num}_{2}\) is the total number of overlapping seedlings in the unit plot; \({\omega }_{\text{w}}\) is the horizontal width of each overlapping-seedling detection box in the unit plot; \({W}_{n}\) is the sum of the horizontal widths of the 20 planting pits in each unit plot; \({\text{w}}_{0}\) is the effective horizontal pixel width of a standard planting pit in the unit plot (85.8159 after statistical calculation); and the square brackets denote rounding to the nearest integer.
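Equation (7) in code form, with the bracket operator implemented as rounding to the nearest integer, consistent with the worked examples in the Results section:

```python
W0 = 85.8159   # effective standard pit width (px), from the text

def count_overlapping(overlap_widths) -> int:
    """Eq. (7): seedlings in overlapping regions, one term per region."""
    return sum(round(w / W0) for w in overlap_widths)
```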

Calculation improvements to eliminate the impact of overlapping detection boxes

Overlapping labels can occur during soybean seedling detection, as shown in the red-marked area of Fig. 10. This phenomenon depends on how well the deep learning model is trained, and existing techniques cannot completely eliminate such recognition errors through training alone. We therefore propose a soft strategy: the union of the horizontal intervals covered by the overlapping detection boxes is used as the valid information, eliminating the redundant width contributed by label overlap. The calculation formulas are as follows:

$$w^{\prime}_{\omega } = \left|\left[{x}_{l1}, {x}_{r1}\right] \cup \left[{x}_{l2}, {x}_{r2}\right] \cup \ldots \cup \left[{x}_{li}, {x}_{ri}\right]\right|$$
(9)
$$Num_{3} = \frac{{w_{\omega }^{\prime } }}{{{\text{w}}_{\alpha } }}$$
(10)
$$\begin{array}{c}Num=Nu{m}_{1}+Nu{m}_{2}+Nu{m}_{3}\end{array}$$
(11)

where \(\left[{x}_{li}, {x}_{ri}\right]\) is the horizontal interval covered by overlapping detection box \({Box}_{i}\); \({x}_{li}\) and \({x}_{ri}\) are the \(x\) coordinates of the upper-left and lower-right corners of that box, respectively; \({w}^{\prime}_{\omega }\) is the length of the union of the intervals covered by the overlapping boxes; \({Num}_{3}\) is the number of soybean seedlings counted within the overlapping detection boxes; and \(Num\) is the final number of soybean seedlings per unit plot.
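A sketch of this soft strategy (Eqs. 9-10): the horizontal intervals of mutually overlapping boxes are merged, and the length of their union replaces the double-counted sum of widths. The interval-merging routine is standard; its use here follows our reading of the equations.

```python
W0 = 85.8159   # standard seedling width (px), the divisor in Eq. (10)

def union_length(intervals) -> float:
    """Total length of the union of [x_l, x_r] intervals (Eq. 9)."""
    total, cur_l, cur_r = 0.0, None, None
    for l, r in sorted(intervals):
        if cur_r is None or l > cur_r:       # disjoint: close the current run
            if cur_r is not None:
                total += cur_r - cur_l
            cur_l, cur_r = l, r
        else:                                # overlapping: extend the run
            cur_r = max(cur_r, r)
    if cur_r is not None:
        total += cur_r - cur_l
    return total

def count_overlapped_labels(intervals) -> float:
    """Eq. (10): seedlings inside mutually overlapping detection boxes."""
    return union_length(intervals) / W0
```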

Fig. 10 Schematic diagram of overlapping markers

Results

Evaluation index and test environment

To evaluate model performance on the tasks addressed in this paper, a series of evaluation metrics was selected, including precision (\(P\)), recall (\(R\)), mean average precision (\(mAP\)), and the \(F1\) score, calculated as follows:

$$\begin{array}{c}P =\frac{TP}{TP+FP}\end{array}$$
(12)
$$\begin{array}{c}R =\frac{TP}{TP+FN}\end{array}$$
(13)
$$\begin{array}{c}AP = {\int }_{0}^{1}PdR\end{array}$$
(14)
$$\begin{array}{c}mAP =\frac{1}{N} {\sum }_{i=1}^{N}{AP}_{i}\end{array}$$
(15)
$$\begin{array}{c}F1 = \frac{2PR}{P+R}\end{array}$$
(16)

Here, TP (true positive) is the number of positive samples correctly predicted by the model, FP (false positive) is the number of negative samples incorrectly predicted as positive, and FN (false negative) is the number of positive samples incorrectly predicted as negative. N is the number of categories (the only object to be detected in this paper is soybean seedlings, so N = 1). \({AP}_{i}\) is the area under the precision-recall curve for each class, mAP is the average of \({AP}_{i}\) across classes, and the F1 score weights precision and recall equally, measuring both.

In this paper, mAP (0.5) and mAP (0.5–0.95) are used as evaluation indicators, denoting the average precision at an IoU threshold of 0.5 and averaged over IoU thresholds from 0.5 to 0.95, respectively. In addition, GFLOPs (giga floating-point operations), parameter count, and model size were used to measure model complexity.
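The scalar metrics of Eqs. (12), (13), and (16) follow directly from the raw counts, as the small sketch below shows; mAP (Eqs. 14-15) is normally taken from the detection framework's evaluation rather than computed by hand.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Eqs. (12), (13), (16) from true/false positive and false negative counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)
```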

The network models in this experiment were trained with the PyTorch deep learning framework. The test hardware environment was as follows: an Intel(R) Core(TM) i7-12700F CPU clocked at 2.10 GHz, 32 GB of memory, and an NVIDIA GeForce RTX 4070 GPU with 12 GB of video memory, running Windows 10 with Python 3.10.

Comparison of object detection effects of different models

Table 1 shows the detection results of different models on the soybean seedling dataset after background removal. The models differ across indicators. YOLOv5n's precision and mAP(0.5) were 0.95 and 0.98, respectively, the best among all models, and its model size, parameter count, and GFLOPs were also the best, but its recall and mAP(0.5–0.95) were weaker. YOLOv8n achieved the best mAP(0.5–0.95), and its model size, inference time, parameter count, and GFLOPs were close to those of YOLOv5n. Since the detected soybean seedlings must subsequently be counted, a high mAP(0.5–0.95) is required of the detection model. Therefore, with comparable detection speed, YOLOv8n is more suitable as the detection model in this study.

Table 1 Comparison of detection effects of different models

Effect of image background removal on object detection model

Comparison of the performance of object detection models

Table 2 compares the accuracy and performance of the YOLOv8n model on soybean seedlings before and after image background removal. Test No. 1 is the result before background removal, and test No. 2 the result after. The precision, recall, and mean average precision of test No. 2 were all higher than those of test No. 1, indicating that removing the background of soybean seedling images helps improve detection.

Table 2 Comparison of the performance of object detection models

Comparison of training and validation processes

Figure 11 compares the metric curves of the model training process before and after background removal, including precision (P), recall (R), mAP(0.5), and mAP(0.5–0.95). As panels (a) and (d) show, the corresponding indicators improved significantly, indicating that the image background clearly interferes with the model's localization. Removing the image background therefore strengthens the model's ability to identify and locate soybean seedlings in the detection task and yields better performance.

Fig. 11 Comparison of metric curves

Comparison of the results of the confusion matrix

Figure 12 compares the confusion matrices of the object detection model trained on the dataset before (Fig. 12a) and after (Fig. 12b) removing the background of the soybean seedling images. Before background removal, the model correctly boxed 1014 soybean seedlings, with a total of 167 false and missed detections. After background removal, the model correctly boxed 1035 seedlings, with 139 false and missed detections. Training on the background-removed dataset thus yielded 21 more correctly boxed seedlings, 7 fewer falsely boxed seedlings, and 11 fewer missed detections. The results show that removing the image background effectively improves the detection performance of the model.

Fig. 12 Comparison of confusion matrices

Comparison of detection results

Figure 13 shows the detection results on the validation set before (Fig. 13a) and after (Fig. 13b) removing the background of the soybean seedling images. Before background removal, detection was poor, with overlapping detection boxes, missed detections, and false detections. After background removal, these phenomena were reduced. The results show that removing the background of the soybean seedling images improves the detection performance of the model.

Fig. 13 Comparison of detection results

Testing and evaluation of soybean emergence counting methods

Statistical results of seedlings of non-uniform growth

We tested the proposed soybean seedling counting formula. An image of soybean seedlings under non-uniform growth was randomly selected from the test set; the results are shown in Fig. 14. Figure 14a is the original image and Fig. 14b the result after counting, where 'Size' is the image size, 'Time' the detection time, and 'Totals' the number of soybean seedlings counted.

Fig. 14 Statistical results of seedlings of non-uniform growth

Three regions containing soybean seedlings were detected in the image, i.e., three seedlings with uneven growth. After information extraction, the horizontal pixel widths of the three regions were \({w}_{\gamma 1}=89.3969\), \({w}_{\gamma 2}=137.0114\), and \({w}_{\beta }=80.2890\), totaling 306.6973.

Counting by the per-plant statistical formula:

$$\begin{array}{c}\rho =\frac{{w}_{\alpha }}{{w}_{\beta }}\end{array}$$
(17)
$$\begin{array}{c}\sigma =\frac{{w}_{\alpha }}{{w}_{\gamma }}\end{array}$$
(18)
$$\begin{array}{c}{w}_{\omega }=\sum {\sigma \cdot w}_{\gamma }+\sum {\rho \cdot w}_{\beta }=257.4477\end{array}$$
(19)
$$\begin{array}{c}Nu{m}_{1}=k\times \frac{1}{W}\times {w}_{\omega }=20\times \frac{1}{1716.3182}\times 257.4477=3\end{array}$$
(20)

If the soybean seedlings are counted by the sum of the horizontal widths of the detection frames in the figure, the calculation result is as follows:

$$\begin{array}{c}{Num}{\prime}=\left[\frac{306.6973}{85.8159}\right]=\left[3.5739\right]=4 \quad (\text{Error rate: } 33.33\%)\end{array}$$
(21)

Comparing the results of the two calculation methods, it can be seen that the number of soybean seedlings in the image can be counted more accurately by using the soybean seedling counting method proposed in this paper, and the effectiveness of the soybean seedling counting formula under non-uniform growth is verified.

Counting results under the influence of overlap between soybean seedlings

We tested the proposed soybean seedling counting formula on an image selected from the collected soybean seedling test set. The results are shown in Fig. 15: Fig. 15a is the original image, and Fig. 15b the result after counting, where 'Size' is the image size, 'Time' the detection time, and 'Totals' the number of soybean seedlings counted.

Fig. 15 Statistical results of overlap between soybean seedlings

Three regions containing soybean seedlings were detected in the image: two overlapping-seedling regions and one single-plant region. After information extraction, the horizontal pixel widths of the three regions were \({w}_{\gamma }=105.6736\), \({\omega }_{\text{w}1}=1148.5964\), and \({\omega }_{\text{w}2}=261.4735\), totaling 1515.7435.

Counting formula for the number of soybean seedlings without overlap:

$$\begin{array}{c}\sigma =\frac{{w}_{\alpha }}{{w}_{\gamma }}\end{array}$$
(22)
$$\begin{array}{c}{w}_{\omega }=\sum {\sigma \cdot w}_{\gamma }=85.8159\end{array}$$
(23)
$$\begin{array}{c}Nu{m}_{1}=k\times \frac{1}{W}\times {w}_{\omega }=20\times \frac{1}{1716.3182}\times 85.8159=1\end{array}$$
(24)

Counting formula for the number of soybean seedlings with overlap:

$$Num_{2} = \sum \left[\frac{{\omega }_{\text{w}}}{{\text{w}}_{0}}\right] = \left[\frac{261.4735}{85.8159}\right] + \left[\frac{1148.5964}{85.8159}\right] = \left[3.0469\right] + \left[13.3844\right] = 3 + 13 = 16$$
(25)

Therefore, the total number of soybean seedlings in the image is:

$$\begin{array}{c}Num=Nu{m}_{1}+Nu{m}_{2}=1+16=17\end{array}$$
(26)

If the soybean seedlings are counted by the sum of the horizontal widths of the detection frames in the figure, the calculation result is as follows:

$$\begin{array}{c}{Num}{\prime}=\left[\frac{1515.7435}{85.8159}\right]=\left[17.6627\right]=18 \quad (\text{Error rate: } 5.88\%)\end{array}$$
(27)

Comparing the two calculation methods shows that a rough count, dividing the summed horizontal pixel widths of the detection boxes by the standard width, produces statistical errors, whereas the soybean seedling counting method proposed in this paper counts the seedlings in the image accurately, verifying the effectiveness of the counting formula under overlap.

Counting results under the influence of overlapping markers

Figure 16 shows a soybean seedling breeding plot before and after overlapping markers were eliminated. Before elimination, two overlapping areas appeared in the image and 25 soybean seedlings were detected, 6 more than the actual number. After elimination, 19 seedlings were detected, consistent with the true number. The results show that the marker overlap elimination strategy effectively improves the accuracy of the soybean seedling statistics.

Fig. 16 Comparison of the results before and after the elimination of the marker overlapping effect

High-throughput calculation and auxiliary screening of emergence rate in soybean breeding plots

Soybean emergence counting method test

The accuracy of the method was verified on a randomly cropped region containing 48 breeding plots, in which the number of soybean seedlings was calculated quantitatively and compared with ground measurements, as shown in Fig. 17. The total number of seedlings measured on the ground was 852, the total obtained by the counting method was 859, and the overall error was 7 (marked by the white boxes in Fig. 18a), giving an overall error rate of 0.82%. Analysis showed that the errors arose from the overgrowth of a few seedlings and the misdetection of a small number of scattered leaves near the plot edges. The overall accuracy is 99.18%, which meets the application requirements. The statistical results of the soybean emergence rate are shown in Fig. 18b.

Fig. 17 Soybean seedling counting results in the test plot

Fig. 18 Visualization results

Soybean emergence rate calculation and auxiliary screening

The soybean emergence counting method was used to calculate, at high throughput, the emergence rate of all 2599 breeding plots in the screening trial of density-tolerant soybean varieties; the visualization is shown in Fig. 19. According to the needs of breeding screening, the soybean emergence rate was divided into five grades. First-grade breeding plots contained 17–20 seedlings (858 plots, 33% of the total); second-grade plots contained 12–16 seedlings (598 plots, 23%); third-grade plots contained 8–11 seedlings (390 plots, 15%); fourth-grade plots contained 4–7 seedlings (312 plots, 12%); and fifth-grade plots contained 0–3 seedlings (441 plots, 17%). Seedling emergence in the first- and second-grade plots was good, and these plots can serve as key objects for follow-up tracking and observation. Emergence in the third-grade plots was moderate, while that in the fourth- and fifth-grade plots was poor; owing to the cold spring climate this year, these results cannot fully reflect the emergence traits of the varieties, and the trial needs to be repeated next year for further verification. The soybean emergence counting method proposed in this paper can calculate the soybean emergence rate effectively and accurately, and the statistical results provide quantitative data support for screening soybean varieties under intensive planting.
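The five-grade classification used for screening maps directly from the per-plot count; a minimal sketch with the thresholds given above:

```python
def emergence_grade(num_seedlings: int) -> int:
    """Map a per-plot seedling count to the five screening grades."""
    if num_seedlings >= 17:
        return 1    # grade 1: 17-20 seedlings, good emergence
    if num_seedlings >= 12:
        return 2    # grade 2: 12-16, good emergence
    if num_seedlings >= 8:
        return 3    # grade 3: 8-11, moderate
    if num_seedlings >= 4:
        return 4    # grade 4: 4-7, poor
    return 5        # grade 5: 0-3, poor
```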

Fig. 19 Statistical results of soybean emergence rate

Discussion

This study integrates image processing and deep learning to propose a high-throughput statistical method for soybean seedling emergence rate based on field phenotypic characteristics. In the image preprocessing stage, the designed background removal and equidistant cutting and stitching methods, implemented in Python, achieved good results and improved the usability of the images. Median filtering performed well in noise handling and laid a foundation for subsequent analysis. The background segmentation method combining the excess-green feature with the Otsu method effectively separates soybean seedlings from the background in most cases, and the batch segmentation and stitching functions simplify the experimental workflow.

The semi-automatic annotation method has significantly improved the data annotation efficiency and saved labor and time costs. In terms of model selection, YOLOv8n performs relatively outstandingly in the image detection of the soybean seedling stage. It has good recognition effects and also has certain advantages in terms of model size and detection time. After background removal, its detection performance is further enhanced.

The designed counting algorithm for soybean seedlings innovatively proposes the concept of "growth normalization" and related algorithm strategies, which effectively solve the problems of soybean seedling growth and overlapping counting to a certain extent and achieve a relatively high accuracy rate in dense-planting scenarios. The explored precise statistical technical path covers the complete process from image acquisition and processing to detection and counting. The generated emergence thematic map plays a certain auxiliary role in breeding selection.

Previous studies have addressed seedling counting for maize, wheat, and rice, mostly using mapping and statistical methods based on UAV spectra [31,32,33]; crop density limits their estimation ability and accuracy. The statistical method designed in this paper is mechanistically interpretable, has some transferability, and is applicable to UAV-based crop seedling counting in dense planting environments.

However, this study has obvious limitations. In terms of weed interference, the colors of weeds in the breeding plots are similar to those of soybean seedlings, which is extremely likely to lead to misdetection during the actual detection process. Although weeds or lodging situations have been excluded through manual screening and other methods during the experiment, the specific influence mechanism and quantitative relationship of weeds on the counting results of soybean seedlings are still unclear and need further in-depth research. Although the deep-learning-based object detection model is a feasible direction for distinguishing weeds, the training effect and generalization ability of this model have not been fully verified at present. There are uncertainties in aspects such as its adaptability in different environments and the stability of the model.

In terms of growth range adaptability, the statistical method constructed in this study applies only to densely planted soybean seedlings whose growth is less than 1.25 times the standard growth and whose planting rows are clearly separated. When soybean growth exceeds this range, seedling morphology and leaf extension change in complex ways that the existing algorithms struggle to identify and count accurately. When the row intervals are not distinct, seedlings in adjacent rows are easily confounded, and counting errors increase significantly.

In terms of the expansion of relevant information research, this study has not yet involved aspects such as disease identification and statistics related to soybean seedlings, growth trend prediction, and the acquisition of information related to other crops. Research on the health status of soybeans, growth trend prediction, and the correlation with other crops is crucial for a comprehensive understanding of the soybean planting system and the optimization of planting strategies. Subsequent research can be carried out around these unaddressed areas to further improve the soybean field phenotype research system.

Conclusion

This study designed Python-based methods for image background removal and equidistant cutting and stitching. Median filtering was determined by comparison to be the best denoising method, and combining the excess-green feature with the Otsu method achieved excellent background segmentation. A semi-automatic annotation method was used to annotate the soybean seedling image data, and after comparing five models, YOLOv8n was selected as the most suitable detection model. The "growth normalization" concept was proposed, and several algorithmic strategies were designed to construct a soybean seedling counting algorithm with an accuracy of 99.18%. A precise statistical pipeline for soybean emergence rate in intensive planting scenarios was established, and a thematic map of seedling emergence was produced to assist breeding selection. These results provide important references for soybean variety breeding and planting strategy optimization. Nevertheless, there is still room for improvement regarding weed interference, growth range adaptability, and the expansion of related information research, which awaits further in-depth study.

Data availability

No datasets were generated or analysed during the current study.

References

  1. Lukas R, et al. High-throughput field phenotyping of soybean: spotting an ideotype. Remote Sens Environ. 2022. https://doi.org/10.1016/j.rse.2021.112797.

  2. Ma J, et al. Improving segmentation accuracy for ears of winter wheat at flowering stage by semantic segmentation. Comput Electron Agric. 2020. https://doi.org/10.1016/j.compag.2020.105662.

  3. Majeed Y, et al. Deep learning based segmentation for automated training of apple trees on trellis wires. Comput Electron Agric. 2020. https://doi.org/10.1016/j.compag.2020.105277.

  4. Zan X, et al. Automatic detection of maize tassels from UAV images by combining random forest classifier and VGG16. Remote Sens. 2020;12:3049. https://doi.org/10.3390/rs12183049.

  5. Quan L, et al. Maize seedling detection under different growth stages and complex field environments based on an improved Faster R-CNN. Biosyst Eng. 2019;184:1–23. https://doi.org/10.1016/j.biosystemseng.2019.05.002.

  6. Zhao B, et al. Rapeseed seedling stand counting and seeding performance evaluation at two early growth stages based on unmanned aerial vehicle imagery. Front Plant Sci. 2018. https://doi.org/10.3389/fpls.2018.01362.

  7. García-Martínez H. Digital count of corn plants using images taken by unmanned aerial vehicles and cross correlation of templates. Agronomy. 2020;10:469. https://doi.org/10.3390/agronomy10040469.

  8. Li B, et al. The estimation of crop emergence in potatoes by UAV RGB imagery. Plant Methods. 2019;15:15. https://doi.org/10.1186/s13007-019-0399-7.

  9. Zhou C, et al. An integrated skeleton extraction and pruning method for spatial recognition of maize seedlings in MGV and UAV remote images. IEEE Trans Geosci Remote Sens. 2018;56(8):4618–32. https://doi.org/10.1109/TGRS.2018.2830823.

  10. Chen Y, et al. Weed and corn seedling detection in field based on multi feature fusion and support vector machine. Sensors. 2021;21:212. https://doi.org/10.3390/s21010212.

  11. Wang, et al. Review of plant identification based on image processing. Arch Comput Methods Eng. 2017;24:637–54. https://doi.org/10.1007/s11831-016-9181-4.

  12. Liu M, et al. Quantitative evaluation of maize emergence using UAV imagery and deep learning. Remote Sens. 2023;15:1979. https://doi.org/10.3390/rs15081979.

  13. Zhuang L, et al. Maize emergence rate and leaf emergence speed estimation via image detection under field rail-based phenotyping platform. Comput Electron Agric. 2024. https://doi.org/10.1016/j.compag.2024.108838.

  14. Gao Y, et al. Enhancing green fraction estimation in rice and wheat crops: a self-supervised deep learning semantic segmentation approach. Plant Phenomics. 2023;5:64. https://doi.org/10.34133/plantphenomics.0064.

  15. Pan Y, et al. Identification and counting of sugarcane seedlings in the field using improved Faster R-CNN. Remote Sens. 2022;14:5846. https://doi.org/10.3390/rs14225846.

  16. Tan C. Anchor-free deep convolutional neural network for tracking and counting cotton seedlings and flowers. Comput Electron Agric. 2023. https://doi.org/10.1016/j.compag.2023.108359.

  17. Tan C. Towards real-time tracking and counting of seedlings with a one-stage detector and optical flow. Comput Electron Agric. 2022. https://doi.org/10.1016/j.compag.2021.106683.

  18. Yu Z. Automatic image-based detection technology for two critical growth stages of maize: emergence and three-leaf stage. Agric For Meteorol. 2013;174–175:65–84. https://doi.org/10.1016/j.agrformet.2013.02.011.

  19. Oh S, et al. Plant counting of cotton from UAS imagery using deep learning-based object detection framework. Remote Sens. 2020;12:2981. https://doi.org/10.3390/rs12182981.

  20. Zhang J, et al. Rapeseed stand count estimation at leaf development stages with UAV imagery and convolutional neural networks. Front Plant Sci. 2020. https://doi.org/10.3389/fpls.2020.00617.

  21. Liu S, et al. Estimating maize seedling number with UAV RGB images and advanced image processing methods. Precis Agric. 2022;23:1604–32. https://doi.org/10.1007/s11119-022-09899-y.

  22. Feng A, et al. Evaluation of cotton emergence using UAV-based narrow-band spectral imagery with customized image alignment and stitching algorithms. Remote Sens. 2020;12:1764. https://doi.org/10.3390/rs12111764.

  23. Tong J, et al. Image-based vegetation analysis of desertified area by using a combination of ImageJ and Photoshop software. Environ Monit Assess. 2024;196:306. https://doi.org/10.1007/s10661-024-12479-4.

  24. Erkan U, et al. Different applied median filter in salt and pepper noise. Comput Electr Eng. 2018;70:789–98. https://doi.org/10.1016/j.compeleceng.2018.01.019.

  25. Durand F, et al. Fast bilateral filtering for the display of high-dynamic-range images. ACM Trans Graph. 2002;21:257–66. https://doi.org/10.1145/566654.566574.

  26. Otsu N. A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern. 1979;9(1):62–6. https://doi.org/10.1109/TSMC.1979.4310076.

  27. Yin H, et al. Side window filtering. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA; 2019. p. 8750–8. https://doi.org/10.1109/CVPR.2019.00896.

  28. Kataoka T, et al. Crop growth estimation system using machine vision. In: Proceedings 2003 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM 2003), Kobe, Japan; 2003. vol. 2, p. b1079–83. https://doi.org/10.1109/AIM.2003.1225492.

  29. Badgujar CM, et al. Agricultural object detection with You Only Look Once (YOLO) algorithm: a bibliometric and systematic literature review. Comput Electron Agric. 2024. https://doi.org/10.1016/j.compag.2024.109090.

  30. Varghese R, et al. YOLOv8: a novel object detection algorithm with enhanced performance and robustness. In: 2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS), Chennai, India; 2024. p. 1–6. https://doi.org/10.1109/ADICS58448.2024.10533619.

  31. Shu M, et al. Using the plant height and canopy coverage to estimation maize aboveground biomass with UAV digital images. Eur J Agron. 2023;151:126957. https://doi.org/10.1016/j.eja.2023.126957.

  32. Wei L, et al. Wheat biomass, yield, and straw-grain ratio estimation from multi-temporal UAV-based RGB and multispectral images. Biosyst Eng. 2023;234:187–205. https://doi.org/10.1016/j.biosystemseng.2023.08.002.

  33. Gao X, et al. Maize seedling information extraction from UAV images based on semi-automatic sample generation and Mask R-CNN model. Eur J Agron. 2023. https://doi.org/10.1016/j.eja.2023.126845.


Funding

The research was funded by the Key Research Project of the Liaoning Provincial Department of Education, China (Grant No. JYTZD2023123). This funding supported our research activities.

Author information


Contributions

Design of the work: [YC],[XA]; methodology: [YS], [ML]; formal analysis and investigation: [YS], [ML],[JZ]; writing—original draft preparation: [YS], [YC]; writing—review and editing: [YC],[XA]; funding acquisition: [YC]. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Yingli Cao or Xue Ao.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent to publish

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article


Cite this article

Sun, Y., Li, M., Liu, M. et al. A statistical method for high-throughput emergence rate calculation for soybean breeding plots based on field phenotypic characteristics. Plant Methods 21, 40 (2025). https://doi.org/10.1186/s13007-025-01356-x
