
Wind speed prediction for site selection and reliable operation of wind power plants in coastal regions using machine learning algorithm variants


Comparing predictive models using evaluation metrics

After implementing the 14 ML models in a Python environment, the prediction results, including RMSE, MAE, MSE, and R2, were obtained; they are displayed in Tables 9, 10, 11, 12. Overall, all models exhibit similar performance, providing moderate predictions. Notably, the CatBoost model outperforms the other machine-learning models across the performance metrics for both weather stations.
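For reference, the four evaluation metrics can be computed from paired observed and predicted wind speeds as in the minimal sketch below. The function name `regression_metrics` and the sample values are illustrative assumptions; this is not the study's code or data.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 for paired observations and predictions."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot                    # undefined if all observations are equal
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Made-up wind speeds in m/s, for illustration only
metrics = regression_metrics([3, 4, 5, 6], [2.5, 4.5, 5, 7])
```

Lower MSE, RMSE, and MAE and a higher R2 indicate a better fit, which is the sense in which "best" is used in the tables.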

Table 9 Creating and comparing 14 models using tenfold cross-validation and hyperparameter tuning with Hyperopt optimization for Dataset 1 (best results are bolded)
Table 10 The evaluation metrics for 14 models on both validation and test segment for Dataset 1 (best results are bolded)
Table 11 Creating and comparing 14 models using tenfold cross-validation and hyperparameter tuning with Hyperopt optimization for Dataset 2 (best results are bolded)
Table 12 The evaluation metrics for 14 models on both validation and test segment for Dataset 2 (best results are bolded)

Utilizing tenfold cross-validation, CatBoost (CATB) demonstrated the best performance across all datasets.

In Dataset 1, the model attained an MSE of 0.3745, MAE of 0.3984, and R2 of 0.6218 for Kutubdia station. For Cox’s Bazar station, the model yielded an MSE of 0.9462, MAE of 0.6164, and R2 of 0.514. In Dataset 2, the model achieved MSE values of 0.3224 and 0.3541, MAE values of 0.4117 and 0.4347, and R2 values of 0.8618 and 0.8755 for Kutubdia and Cox’s Bazar, respectively. After hyperparameter optimization, the model’s performance improved slightly on both datasets. It is worth noting, however, that in some scenarios the LSTM and GRU models, without any parameter tuning, exhibit superior performance under tenfold cross-validation. Following hyperparameter tuning, CatBoost emerged as the top-performing model in Dataset 1: for Kutubdia, it achieved an MSE of 0.3744, MAE of 0.399, and R2 of 0.6218, and for Cox’s Bazar, an MSE of 0.9382, MAE of 0.6162, and R2 of 0.518. In Dataset 2, CatBoost was again the top-performing model, with MSE values of 0.3218 and 0.3533, MAE values of 0.4117 and 0.4342, and R2 values of 0.8621 and 0.8758 for Kutubdia and Cox’s Bazar, respectively.

In the validation phase, CatBoost performed best, with an MSE of 0.3388, MAE of 0.3912, and R2 of 0.6409 for Kutubdia station in Dataset 1. Similarly, it attained the best results for Cox’s Bazar station, with an MSE of 0.9328, MAE of 0.6157, and R2 of 0.5192. In Dataset 2, CatBoost again outperformed the competing models, with MSE values of 0.3309 and 0.3713, MAE values of 0.415 and 0.4398, and R2 values of 0.858 and 0.8714 for Kutubdia and Cox’s Bazar, respectively. The LGBM model followed closely, with the second-best performance for all datasets. In the testing phase, in Dataset 1, CatBoost achieved an MSE of 0.3942, MAE of 0.4042, and R2 of 0.6242 for Kutubdia station, and an MSE of 0.9906, MAE of 0.6363, and R2 of 0.4994 for Cox’s Bazar. In Dataset 2, CatBoost again performed best, with an MSE of 0.3305, MAE of 0.4164, and R2 of 0.8552 for Kutubdia. For Cox’s Bazar, the model’s performance is nearly identical to that for Kutubdia, with an MSE of 0.3744, MAE of 0.4415, and R2 of 0.867. In both the validation and testing phases, the LGBM model closely trailed the leading model in all datasets. Conversely, AdaBoost demonstrated relatively lower performance than the other models, with the exception of the Cox’s Bazar station in Dataset 1, where the Lasso model attained the poorest evaluation metrics.

Apart from the results shown in Tables 9, 10, 11, 12, the difference between the observed and predicted wind speeds, based on the best-performing prediction model during the testing phase, is also depicted in scatter plots, histograms, and box plots (Figs. 7, 8, and 9). Figures 7 and 8 show the scatter plots and forecasting error histograms, respectively, for both datasets during the testing phase.

Fig. 7

Scatter plots of wind speed prediction for Dataset 1 and Dataset 2

Fig. 8

Histograms and Gaussian kernel density functions of wind speed prediction for Dataset 1 and Dataset 2

Fig. 9

Boxplots of the prediction error for Dataset 1 and Dataset 2

The scatter plots present the predicted versus the observed wind speed values. They show the relationship between predicted and observed wind speed and measure the strength of the association between these two variables using the coefficient of determination R2. In terms of R2 for Kutubdia and Cox’s Bazar in Dataset 1, the CatBoost model produced the best prediction performance (R2 = 0.642 and 0.5342, respectively). Similarly, in Dataset 2 the model produced the best results (R2 = 0.8552 and 0.867, respectively) for both stations. Additionally, there is considerably less deviation from the regression line in Dataset 2 for all cluster points compared to Dataset 1; that is, the CatBoost model exhibited more robust prediction performance for Dataset 2 than for Dataset 1. In summary, compared with the BMD data, the CatBoost model exhibited the least deviation from the regression line for the NASA data across all data samples. This aligns with the accuracy metrics, particularly the R2 values presented in Tables 9, 10, 11, 12.

The histogram plots show the error distribution by displaying the number of error values within each range, and they include the Gaussian kernel density function to check how closely the errors follow a normal distribution. The plots indicate that in Dataset 1, the CatBoost model exhibits a standard deviation of 0.6278 and 0.9952 for Kutubdia and Cox’s Bazar, respectively, suggesting that the errors cluster closely around the mean. Meanwhile, in Dataset 2, the CatBoost model demonstrates a standard deviation of 0.5749 and 0.6119 for Kutubdia and Cox’s Bazar, respectively. For both datasets, the smaller standard deviation was achieved at Kutubdia station, which implies that the errors are more tightly grouped around the mean for this station.

Figure 9 displays boxplots illustrating prediction errors for various models using test datasets. Each graph represents the distribution of residual errors, indicating key statistics like minimum, first quartile, median, third quartile, and maximum values. The bagging and boosting ensemble models, particularly RF, GBR, XGBoost, LightGBM, and CatBoost, showcase similar performance, with noticeable differences in the width of the box across all datasets. Regarding outliers, all models perform in a similar manner.

Quartile percent values, which may provide additional information about the efficacy of each individual model, are shown in Tables 13 and 14. CatBoost produces a smaller IQR (0.5408) for Kutubdia station in Dataset 1 than the other models do. For Cox’s Bazar station, the decision tree (DT) model has the smallest IQR of 0.6845. In Dataset 2, the CatBoost model generates the smallest IQR, measuring 0.6369 for Kutubdia and 0.6730 for Cox’s Bazar station. The bagging and boosting ensemble models exhibit lower standard deviations in prediction values, primarily due to their ensemble learning nature and effective handling of outliers. These models combine multiple weak learners and apply regularization techniques to prevent overfitting, resulting in more stable and consistent predictions. Additionally, their focus on important features contributes to the reduced variability in predictions across different data points.
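The box-plot statistics and the IQR discussed above can be obtained from the prediction errors as sketched below. The function name `residual_spread`, the quartile interpolation method, and the use of the sample standard deviation are assumptions; the study does not state which conventions it used.

```python
import statistics


def residual_spread(observed, predicted):
    """Five-number summary, IQR and standard deviation of prediction errors."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    # Quartiles by linear interpolation between data points ("inclusive" method)
    q1, median, q3 = statistics.quantiles(residuals, n=4, method="inclusive")
    return {
        "min": min(residuals),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(residuals),
        "iqr": q3 - q1,                        # width of the box in a box plot
        "std": statistics.stdev(residuals),    # sample standard deviation
    }


# Made-up values, for illustration only
spread = residual_spread([1, 2, 3, 4, 5], [0, 0, 0, 0, 0])
```

A smaller IQR means that the middle half of the errors falls in a narrower band, which is how the models are ranked in Tables 13 and 14.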

Table 13 Quartile percent of the prediction error for Dataset 1 (minimum Std. deviation and IQR are bolded)

As stated earlier, in this study, 14 ML techniques, including MLR, Lasso, Ridge, Elastic Net, KNN, DT, RF, GBR, AdaBoost, XGBoost, LightGBM, CatBoost, LSTM, and GRU, have been used to produce short-term wind speed forecasts. The results show that the CatBoost model is the most proficient predictor of short-term wind speed, exhibiting the smallest error metric scores and the highest accuracy among the methods considered. In addition, the forecasting accuracy for Dataset 2 surpasses that for Dataset 1. Table 15 compares the best outcome achieved here with models examined in previous studies.

Table 14 Quartile percent of the prediction error for Dataset 2 (minimum Std. deviation and IQR are bolded)
Table 15 Performance comparison of the suggested models with the models from prior studies

Generation scale and turbine compatibility

Wind resource assessment is a critical step in evaluating the viability of a location for harnessing wind energy. It involves understanding the wind characteristics unique to a specific site, which is essential for optimizing the design and performance of wind energy projects. In this study, maximum likelihood estimation (MLE) of the Weibull distribution is used to model the probability distribution of the observed and predicted wind speeds of both stations, providing valuable insights into the expected wind energy potential.
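One common way to obtain the maximum likelihood estimates of the Weibull shape k and scale c is to solve the likelihood equation for k numerically and then compute c in closed form. The sketch below does this by bisection. The function name `fit_weibull_mle`, the search bracket, and the synthetic sample are assumptions for illustration, and the study may have used a different solver.

```python
import math
import random


def fit_weibull_mle(speeds, lo=0.05, hi=50.0, tol=1e-9):
    """Estimate Weibull shape k and scale c from strictly positive wind speeds."""
    x = list(speeds)
    if len(x) < 2 or min(x) <= 0 or min(x) == max(x):
        raise ValueError("need at least two distinct, strictly positive speeds")
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(x)
    x_max = max(x)  # scaling by x_max avoids overflow in v ** k

    def score(k):
        # Likelihood equation: sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x) = 0
        weights = [(v / x_max) ** k for v in x]
        weighted_log = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
        return weighted_log - 1.0 / k - mean_log

    # score(k) increases with k, so bisection finds the root if it lies in [lo, hi]
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    k = 0.5 * (lo + hi)
    c = x_max * (sum((v / x_max) ** k for v in x) / len(x)) ** (1.0 / k)
    return k, c


# Synthetic sample with scale 6 m/s and shape 2 (not the study's measurements)
random.seed(0)
synthetic = [random.weibullvariate(6.0, 2.0) for _ in range(5000)]
k_hat, c_hat = fit_weibull_mle([v for v in synthetic if v > 0])
```

Calm records (zero wind speed) have to be handled before fitting, because the likelihood involves the logarithm of each speed; here they are simply filtered out.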

Based on the superior prediction accuracy demonstrated, we have opted to proceed with the satellite data for further investigation, favoring it over BMD data.

To match typical hub heights of commercial turbines (50 m and 120 m), the wind speed data were extrapolated from 10 m to those heights using the logarithmic wind profile law. The weather station is located in a built-up area, and the roughness length (z0) in this context falls within the range of 0.1 to 0.4 m. For our analysis, we have adopted the value of 0.3 m (Islam et al., 2013). Figure 10 illustrates the probability density function (PDF) plot of both observed and predicted wind speed data for both stations. The average wind velocity and wind power density have been computed using the Weibull distribution parameters (k and c) detailed in Tables 16 and 17, corresponding to heights of 50 m and 120 m. Wind power class and generation scale have been assigned based on the calculated wind power density. While the parameter values for the predicted data differ slightly from those for the observed data, the wind power class and generation scale match in all cases except for Kutubdia station at 120 m height.
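The height extrapolation and the Weibull-based quantities can be sketched as follows, using the standard logarithmic profile v(h) = v_ref · ln(h/z0) / ln(h_ref/z0), the Weibull mean c·Γ(1 + 1/k), and the wind power density 0.5·ρ·c³·Γ(1 + 3/k). The function names and the air density of 1.225 kg/m³ are assumptions, as the text does not state the density used.

```python
import math


def extrapolate_log_law(v_ref, height, ref_height=10.0, z0=0.3):
    """Scale a wind speed from ref_height to height (m) with the log profile."""
    return v_ref * math.log(height / z0) / math.log(ref_height / z0)


def weibull_mean_speed(k, c):
    """Mean wind speed (m/s) of a Weibull distribution with shape k, scale c."""
    return c * math.gamma(1.0 + 1.0 / k)


def weibull_power_density(k, c, air_density=1.225):
    """Mean wind power density (W/m^2); air_density in kg/m^3 is an assumption."""
    return 0.5 * air_density * c ** 3 * math.gamma(1.0 + 3.0 / k)


# Illustrative values only: a 5 m/s reading at 10 m, and k = 2, c = 6 m/s
v_50 = extrapolate_log_law(5.0, 50.0)
v_120 = extrapolate_log_law(5.0, 120.0)
wpd = weibull_power_density(2.0, 6.0)
```

With z0 = 0.3 m, the profile multiplies every 10 m speed by the same constant for a given hub height (about 1.46 at 50 m and 1.71 at 120 m).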

Fig. 10

Probability density function of observed and predicted wind speed data for both stations

Table 16 Weibull k and c parameters, mean wind speed, wind power density, and generation scale at 50 m
Table 17 Weibull k and c parameters, mean wind speed, wind power density, and generation scale at 120 m

When confronted with a location characterized by small and marginal generation-scale wind speeds (e.g., Kutubdia and Cox’s Bazar), particular factors must be taken into account when selecting and optimizing turbines. It becomes imperative to opt for turbines specifically engineered to function effectively in such circumstances. For instance, specialized turbines with efficient blades are crucial for capturing energy from slower winds. A larger rotor diameter allows for more effective energy extraction at lower wind speeds, and a lower cut-in speed ensures that power generation starts at lower wind speeds, maximizing overall energy yield. Optimizing pitch control and fine-tuning the turbine’s speed regulation system, including adjusting the generator’s speed curve, further enhance efficiency in these conditions. In addition, careful consideration of wake effects and proper spacing between turbines, coupled with advanced wake modeling techniques, plays a pivotal role in optimizing energy production within the wind farm. Actual turbine specifications vary by manufacturer and model, so the manufacturer’s specifications should be consulted for precise details. Turbines from manufacturers such as Vestas, Siemens Gamesa, General Electric (GE) Renewables, Nordex, Enercon, Senvion, Suzlon, Goldwind, Ming Yang, and Envision Energy are commonly employed in sites with lower wind speeds. Table 18 displays the attributes of some low-speed wind turbines of different models as observed in recent years (Bauer, 2023). Turbine selection involves evaluating two key criteria: capacity factor (CF), which is widely used as the primary decision factor, and annual average energy output (Darwish et al., 2019).

Table 18 Characteristics of some on-shore wind turbines for the chosen sites

In this investigation, the capacity factor is considered an evaluation metric for choosing the suitable turbine based on the observed satellite data. Table 19 displays the annual average energy output and capacity factor associated with each turbine type listed in Table 18, based on the observed satellite data for both locations. The findings indicate that among the various turbine models, the Goldwind model exhibits the most favorable performance. Specifically, the turbine GW 171/3850 distinguishes itself as the most fitting choice, demonstrating the highest capacity factor for both locations (37.17% and 46.99% for Kutubdia and Cox’s Bazar, respectively). It is important to highlight that the turbines with a capacity factor equal to or exceeding 20% are considered viable for the respective sites (Islam et al., 2013).
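The capacity factor relates the annual energy output to what the turbine would deliver if it ran at rated power all year: CF = E_annual / (P_rated × 8760 h). A minimal sketch follows; the function name is hypothetical, and the rated power of 3850 kW is an assumption read off the model designation rather than a value taken from Table 18.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def capacity_factor_from_energy(annual_energy_kwh, rated_power_kw):
    """Capacity factor as a fraction between 0 and 1."""
    return annual_energy_kwh / (rated_power_kw * HOURS_PER_YEAR)


def annual_energy_from_cf(cf, rated_power_kw):
    """Inverse relation: annual energy output in kWh for a given capacity factor."""
    return cf * rated_power_kw * HOURS_PER_YEAR


# Energy implied by a 37.17% capacity factor at an assumed 3850 kW rating
energy_kwh = annual_energy_from_cf(0.3717, 3850.0)
```

Under that assumed rating, a capacity factor of 37.17% corresponds to roughly 12.5 GWh per year.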

Table 19 Annual energy output and capacity factor of considered turbines for 120 m hub height

Figure 11 shows the power curve (power performance curve) of the turbine with the highest CF (Goldwind GW 171/3850), which illustrates the relationship between observed wind speed and the electrical power output of the turbine at 120 m hub height. The curve shows how the turbine’s power output increases with wind speed until it reaches the rated power (Assareh et al., 2016). The power curves exhibit identical characteristics for both stations. The wind turbine begins to generate power at the cut-in wind speed, the minimum speed required for power generation. At the rated wind speed, the turbine achieves its maximum designed power output. Beyond the cut-out wind speed, the maximum wind speed at which the turbine is designed to operate, the turbine shuts down to prevent damage.

Fig. 11

Power curve of Goldwind GW 171/3850 for 120 m hub height
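The cut-in, rated, and cut-out behavior described above can be captured by a piecewise power curve, and averaging that curve over a Weibull wind speed distribution gives an estimate of the capacity factor. The sketch below is a simplification: the cubic ramp between cut-in and rated speed, the speeds of 3, 10, and 25 m/s, the 3850 kW rating, and all function names are illustrative assumptions, not the manufacturer's data or the study's method.

```python
import math


def power_output_kw(v, cut_in, rated_speed, cut_out, rated_kw):
    """Idealized power curve: zero, cubic ramp, rated plateau, zero after cut-out."""
    if v < cut_in or v >= cut_out:
        return 0.0
    if v >= rated_speed:
        return rated_kw
    # Assumed cubic ramp between cut-in and rated speed
    return rated_kw * (v ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)


def weibull_pdf(v, k, c):
    """Weibull probability density for wind speed v > 0."""
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))


def weibull_capacity_factor(k, c, cut_in, rated_speed, cut_out, rated_kw, dv=0.01):
    """Mean power over the Weibull distribution divided by rated power."""
    steps = int(round(cut_out / dv))
    mean_kw = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv  # midpoint rule on [0, cut_out]
        mean_kw += power_output_kw(v, cut_in, rated_speed, cut_out, rated_kw) \
            * weibull_pdf(v, k, c) * dv
    return mean_kw / rated_kw


# Hypothetical turbine and site parameters, for illustration only
cf_example = weibull_capacity_factor(2.0, 8.0, 3.0, 10.0, 25.0, 3850.0)
```

Because the output is capped at rated power and is zero outside the operating range, the capacity factor depends on how much of the wind speed distribution falls between the cut-in and cut-out speeds.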

In low-wind sites, ensuring a continuous power supply requires the integration of a hybrid system. This system combines a wind turbine with an alternative power source, such as solar panels or a small-scale generator, to supplement energy production during periods of low or no wind. If wind speeds are inadequate, the hybrid system switches to the alternative power source so that the supply is maintained. This approach secures a reliable and uninterrupted power supply and is particularly effective for sites with low and unstable winds. By balancing the wind and secondary energy sources, a hybrid system improves the overall performance and sustainability of the energy generation system in such conditions.

Reliable operation of a wind power plant involves various strategies to ensure the effective and stable performance of the wind turbines as well as the overall plant. Accurate prediction of wind speeds provides insights that support the optimization and stability of plant operation. Some key techniques regarding the reliable operation of wind power plants are mentioned here (Commission, 2022):

  • Operators can use wind speed predictions to anticipate variations in wind conditions. The plant can optimize energy production and maximize the efficiency of power generation by adjusting the pitch and yaw of the turbines based on predicted wind speeds.

  • Predicting wind speeds in advance enables the use of advanced control systems to manage loads on the turbines. The control algorithms can adjust the operating parameters of the turbines to maintain optimal performance and minimize wear and tear, contributing to the long-term stability of the equipment.

  • Accurate prediction of wind speeds can be used to manage the integration of wind power into the electrical grid. Grid operators can anticipate swings in energy production and take proactive measures, such as adjusting energy reserves or activating alternative sources, to maintain grid stability.

  • Operators can use predictions of wind conditions to anticipate periods of increased stress on turbine components. This allows maintenance activities to be planned for periods of lower wind speeds, reducing downtime and ensuring the reliability of the plant.

  • Wind speed forecasts can help allocate resources effectively, including human resources and spare parts. Operators can schedule inspections, repairs, and maintenance tasks according to predicted wind conditions, optimizing the allocation of resources to enhance system reliability.

  • Wind speed predictions support grid operators, energy market participants, and plant owners, who rely on a stable and predictable energy output for planning and operational decision-making.

  • Precise wind speed predictions can be used by utilities and grid operators for long-term planning and grid development. Predicting future wind conditions helps in determining feasible locations for new wind projects and in planning the extension of the existing grid infrastructure to accommodate increased renewable energy capacity.


