Cefixime removal via WO3/Co-ZIF nanocomposite using machine learning methods – Scientific Reports


To use the optimization models in the GA artificial intelligence software and in the SOLVER software, non-coded regression coefficients must first be provided; these are obtained with lm in the R software from the coded regression results. SOLVER can be utilized as an optimization tool to find optimal parameter values, employing various optimization algorithms to determine the best parameter values for the objective function. In the SOLVER model, after inputting the regression coefficients from R, the desired formula was written in the SOLVER software. The upper and lower bounds for each parameter were set, the objective cell was specified, and the changeable cells were assigned. Constraints for each variable were defined48. By executing the command in the SOLVER software, subject to the necessary limitations, the best optimization results were obtained.

Utilizing GA offers significant advantages owing to its randomized mechanisms, which are particularly effective in solving problems with a large number of variables. The algorithm can reach local or global optima by improving the population through evolutionary processes across generations, and its population-based search makes it highly beneficial for practical and industrial optimization scenarios. GA optimization was carried out in the Anaconda software within the Jupyter environment. Initially, the required libraries for the GA model were installed, the required size settings for the system were determined, and the objective function for the model was defined; the necessary adjustments were made to maximize this objective function. The genetic algorithm settings included the number of individuals, the number of generations, and the mutation rate. The minimum and maximum range of each variable was specified and entered into the software. Next, the initial population was created within these ranges. Arrays were initialized to store the best and mean fitness values in each generation. The parameters were restricted to valid ranges, and fitness values were calculated for each individual in the population; the best and mean values were appended to the corresponding arrays. Parents were then selected probabilistically from the population, offspring were generated using crossover and mutation, and the offspring replaced the parents in the population. Interaction plots for all parameter pairs were generated: a grid of values for the current pair was created, fitness values were calculated for each combination, and the interaction plot was drawn. Layout adjustments were made to prevent clipping of titles, and the vertical space between subplots was increased. The convergence curve with the best and mean fitness values was also plotted. Finally, the best solution and the optimal removal for the current model were presented49,50.
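The article does not reproduce the implementation itself; the following is a minimal NumPy sketch of the kind of GA loop described above, assuming fitness-proportional selection, single-point crossover, and Gaussian mutation. The objective function is only a placeholder for the fitted regression of Eq. (7), and the parameter bounds are illustrative, not the study's exact search ranges.

```python
# Minimal sketch of the GA loop described above (NumPy only).
import numpy as np

rng = np.random.default_rng(42)

# Four decision variables: [Cefixime conc. (mg/L), pH, time (min), catalyst dose (g/L)].
# These bounds are illustrative assumptions, not the study's exact ranges.
bounds = np.array([[5.0, 25.0], [3.0, 9.0], [30.0, 120.0], [0.1, 0.6]])

def objective(x):
    """Placeholder removal-efficiency surface; in the study this is Eq. (7)."""
    conc, ph, t, dose = x
    return 60 - 0.5 * conc - 4 * (ph - 3) ** 2 + 0.3 * t - 50 * (dose - 0.2) ** 2

pop_size, n_gen, mut_rate = 50, 150, 0.1
dim = len(bounds)

pop = rng.uniform(bounds[:, 0], bounds[:, 1], size=(pop_size, dim))  # initial population
best_hist, mean_hist = [], []                                        # convergence arrays

for gen in range(n_gen):
    pop = np.clip(pop, bounds[:, 0], bounds[:, 1])       # restrict to valid ranges
    fitness = np.apply_along_axis(objective, 1, pop)     # fitness of each individual
    best_hist.append(fitness.max())
    mean_hist.append(fitness.mean())

    # Parent selection (fitness-proportional here, as an assumption)
    probs = fitness - fitness.min() + 1e-9
    probs /= probs.sum()
    parents = pop[rng.choice(pop_size, size=pop_size, p=probs)]

    # Single-point crossover between consecutive parents
    children = parents.copy()
    for i in range(0, pop_size - 1, 2):
        cut = rng.integers(1, dim)
        children[i, cut:] = parents[i + 1, cut:]
        children[i + 1, cut:] = parents[i, cut:]

    # Gaussian mutation on a random fraction of genes, scaled by each variable's range
    mask = rng.random(children.shape) < mut_rate
    scale = (bounds[:, 1] - bounds[:, 0])[np.where(mask)[1]]
    children[mask] += rng.normal(0.0, 0.1, mask.sum()) * scale

    pop = children                                        # offspring replace parents

best = pop[np.argmax(np.apply_along_axis(objective, 1, pop))]
print("Best parameters:", best, "Predicted removal:", objective(best))
```

The `best_hist` and `mean_hist` arrays correspond to the best and mean fitness curves plotted later in the convergence chart.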

SEM images were employed to investigate the structure and morphology of WO3 and the WO3/Co-ZIF nanocomposite. The presence of a layered structure in WO3, with a thickness of approximately 90 nm, is clearly evident (Fig. 1a). Following the deposition of Co-ZIF crystals on WO3, no significant alteration in the initial structure of the two compounds was observed (Fig. 1b). SEM mapping confirmed the homogeneous structure of the nanocomposite, verifying the presence of Co, C, N, O, and W elements in the nanocomposite structure (Fig. 1c). Moreover, AFM analysis provided insights into the surface topology of the WO3/Co-ZIF nanocomposite (Fig. 1d), corroborating the SEM results regarding surface morphology and the homogeneity of the nanocomposite. This analysis of the pores was conducted over a 2.0 µm × 2.0 µm region (Fig. 1e). The bright regions in the presented 2D and 3D images give a better understanding of the sample’s surface structure. The RMS roughness was determined to be approximately 1.049 nm, the length of the profile line was determined to be around 516.5 nm, and a nominal diameter of 138.3 nm was assigned to this nanocomposite.

The objective of designing based on this model is to aid in analyzing the simultaneous effects of the various variables on a response. After the design matrix was generated using CCD in the R software, the data and responses were fitted with three RSM models (the factorial, quadratic, and quadratic factorial models). These three models were utilized for statistical analysis and for designing the response surface models; the model outputs are included in the supplementary file. A comparison of the three RSM models according to the mentioned criteria is presented in Table 1.
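As a rough illustration of this model-fitting step (performed in R in the study), an equivalent full second-order model can be fitted in Python with statsmodels; the file name and column names below are assumptions, not the study's actual data set.

```python
# Hedged Python equivalent of fitting the second-order response-surface model;
# the study performed this step in R, and the names here are illustrative.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("ccd_design.csv")   # hypothetical coded CCD matrix plus measured removal (%)

# Full quadratic model: first-order (FO), two-way interaction (TWI) and pure
# quadratic (PQ) terms for the four coded factors.
formula = ("removal ~ x1 + x2 + x3 + x4 "
           "+ x1:x2 + x1:x3 + x1:x4 + x2:x3 + x2:x4 + x3:x4 "
           "+ I(x1**2) + I(x2**2) + I(x3**2) + I(x4**2)")
model = smf.ols(formula, data=df).fit()

print(model.summary())                      # F-statistic, R-squared, coefficient p-values
print("AIC:", model.aic, "RSS:", model.ssr)
```

The reduced factorial and quadratic variants can be compared by dropping the corresponding term groups from the formula and comparing AIC, RSS, and the lack-of-fit statistics, as was done in Table 1.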

Based on the results presented in Table 1, it is evident that the Quadratic Factorial Model, with a higher F-statistic and R-squared (R²) and lower AIC, RSS, and p-value compared to the other two models, as well as an insignificant lack of fit, can be considered the best model for fitting the data. Therefore, the Quadratic Factorial Model was utilized for predicting results and designing the relevant formula. The ANOVA analysis for the Quadratic Factorial Model is presented in Table 2 (x₁ = initial Cefixime concentration (mg L⁻¹), x₂ = pH, x₃ = time (min), x₄ = catalyst dosage (g L⁻¹)).

In the ANOVA table, it is entirely evident that in the Quadratic Factorial Model the four first-order terms (FO(x₁, x₂, x₃, x₄)), the pure quadratic terms of two of the variables (PQ), and three of the two-way interactions between variables (TWI) are significant and play a fundamental role in the model’s fit to the corresponding data; the significant p-values in the first three rows of the ANOVA table emphasize this fact. Therefore, based on the results presented in the ANOVA table, the regression table for the Quadratic Factorial Model was designed and is provided in Table 3. It is worth noting that all RSM designs in the R software are based on the coded data. Based on Table 3, the corresponding formula for predicting the results is provided in Eq. (6). The predicted results of the model are presented in Table S1 of the supporting information (SI).
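For reference, the coded quadratic factorial model underlying Eq. (6) has the general second-order form shown below; the specific coefficients come from Table 3, and only the terms found significant in the ANOVA are retained in the final equation.

```latex
% General form of the coded second-order (quadratic factorial) model
y = \beta_0
  + \sum_{i=1}^{4} \beta_i \, x_i
  + \sum_{i<j} \beta_{ij} \, x_i x_j
  + \sum_{i=1}^{4} \beta_{ii} \, x_i^{2}
```

Here y is the predicted Cefixime removal efficiency and x₁–x₄ are the coded factors defined above.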

Based on the ANOVA analysis, the impact of the interactions between independent variables on the response (dependent variable) was investigated by designing contour and perspective plots. Contour and perspective plots examining the interaction effects between x₂:x₃, x₂:x₄, and x₃:x₄ are presented in Fig. S2, SI. In Figs. S2 (a) and S2 (d), the interaction effect between the independent parameters x₂ (pH) and x₃ (time) was examined. As is evident, when x₂ is 4.5, increasing x₃ enhances the performance, and the effect is especially noticeable at pH values below 4.5. At x₂ values above 4.5, no change in removal performance was observed when the two parameters were increased simultaneously. Reducing x₂ while increasing x₃ improves the performance of the photocatalytic process. This could have various causes, one of which might be alteration of the photocatalyst structure with changing pH. It is plausible that at lower x₂ levels the size and shape of the nanoparticles change, improving light absorption and photocatalytic activity. Additionally, at lower pH levels the number of reactive sites on the surface of the photocatalyst increases, enhancing photocatalytic activity. Increased doping and photocatalyst oxidation with changing x₂ could be another reason for the improved photocatalytic activity at lower x₂ levels. The decrease in pH also facilitates the phase-transfer process from liquid to solid (e.g., removal of by-products), enhancing photocatalytic activity. It is further possible that changes in the catalyst conditions at lower x₂ increase the selectivity of the radicals present in the reaction. A decrease in pH, owing to the increased interaction of light with the active materials at the reaction site and the generation of electron-hole pairs, can likewise improve the photocatalytic activity, and it can enhance the optical properties (light absorption) of the catalyst, leading to increased efficiency.

In Figs. S2 (b) and S2 (e), the interaction effect of x₂ (pH) and x₄ (catalyst amount) on the removal performance of Cefixime was examined. It is observed that the removal efficiency of Cefixime increases with a decrease in x₂ and an increase in x₄. An increase in the catalyst amount in the reaction environment facilitates electron transfer, and the reduction in activation energy with an increased catalyst amount can accelerate the photocatalytic reaction rate. In some cases catalysts may degrade in the reaction environment; increasing the catalyst amount improves their resistance to erosion. However, it should be noted that at pH values above 4.5 an increase in the catalyst amount did not affect the removal performance.

Based on Figs. S2 (c) and S2 (f), it was observed that simultaneous increases in x₃ and x₄ lead to an increase in photocatalyst efficiency. The plots clearly show that below 80 min of reaction time, increasing the catalyst amount has no effect on removal efficiency, whereas above 80 min of reaction time, increasing the catalyst amount (x₄) significantly enhances removal efficiency. Therefore, according to Fig. S4, acidic pH values, a time above 80 min, and a catalyst amount above 14.0 g L⁻¹ demonstrate the best performance in photocatalytic removal efficiency.
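A hedged sketch of how one such contour plot could be generated is given below; it reuses the fitted `model` from the earlier statsmodels sketch, and the variable names and coded ranges are illustrative only.

```python
# Sketch of one contour plot of predicted removal over coded pH (x2) and
# time (x3), holding x1 and x4 at their centre points (coded 0).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

g2, g3 = np.meshgrid(np.linspace(-2, 2, 60), np.linspace(-2, 2, 60))
grid = pd.DataFrame({"x1": 0.0, "x2": g2.ravel(), "x3": g3.ravel(), "x4": 0.0})
Z = model.predict(grid).to_numpy().reshape(g2.shape)   # predictions from the fitted RSM model

cs = plt.contourf(g2, g3, Z, levels=20, cmap="viridis")
plt.colorbar(cs, label="Predicted removal (%)")
plt.xlabel("x2 (coded pH)")
plt.ylabel("x3 (coded time)")
plt.show()
```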

After training the data, the predicted results for the test and training sets were calculated and presented in Fig. 2. In this figure, predicted versus actual values are plotted for both the test and training datasets. These plots serve as a crucial tool for evaluating the performance of the ANN model: the X and Y axes represent the actual and predicted values, respectively, and each point represents a sample from the test or training data. The closeness of the points to the regression line, or their location on it, indicates the high accuracy of the model in data analysis. In the results for the test data (8 of the 39 experiments, selected at random), a slight deviation from the regression line was observed, whereas this deviation was much smaller for the 31 training data points, most of which lie on the regression line.
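The article does not state which library or network architecture was used for the ANN; the following is a hedged scikit-learn sketch of the workflow described above (31 training and 8 test samples, parity plots), with placeholder data standing in for the 39 CCD experiments.

```python
# Hedged sketch of the ANN workflow; architecture, library and data are assumptions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.uniform([5, 3, 30, 0.1], [25, 9, 120, 0.6], size=(39, 4))   # placeholder design matrix
y = 60 - 4 * (X[:, 1] - 3) + 0.3 * X[:, 2] + rng.normal(0, 2, 39)   # placeholder removal (%)

# 31 training / 8 test samples, as in the article
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=8, random_state=0)

ann = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0))
ann.fit(X_train, y_train)

# Parity (actual vs. predicted) plots for the training and test sets
for name, Xs, ys in [("training", X_train, y_train), ("test", X_test, y_test)]:
    plt.scatter(ys, ann.predict(Xs), label=name)
plt.plot([y.min(), y.max()], [y.min(), y.max()], "k--")   # 1:1 reference line
plt.xlabel("Actual removal (%)")
plt.ylabel("Predicted removal (%)")
plt.legend()
plt.show()
```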

To investigate further, residual values were computed for both the test and training datasets, and the corresponding plots are presented in Fig. 3. In other words, the difference between the actual (observed) and predicted values of the model was calculated, and the results are visualized in Fig. 3. These plots were used to examine the distribution of errors and the prediction accuracy of the model under scrutiny. In a good model, residuals should be randomly scattered and close to zero, indicating good predictions on both the test and training data. A plot with a specific, non-random pattern indicates that there are structures or patterns in the data that the model failed to capture; in such cases the model needs to be improved or the input features modified. Large differences between residuals indicate the existence of outliers that need to be removed from the data, and in the present study some large residuals made it necessary to remove outliers. A good residual plot consists of points scattered around the zero line, distributed randomly and uniformly. Examining the plots in Fig. 3 shows that in both the test and training residual plots the data points are well scattered around the zero line; on closer inspection it is evident that the points are distributed randomly, and the homogeneity of the points is visible in both plots. Furthermore, to examine the model’s performance more thoroughly, additional evaluation metrics, namely MAE (mean absolute error), RMSE (root mean squared error), and the R² score, were scrutinized for both datasets. The evaluation indicated that the MAE, RMSE, and R² score for the testing data were 3.22, 3.91, and 0.92, respectively, while for the training data these values were 0.9, 1.02, and 0.99, respectively. The higher MAE and RMSE, along with the lower R² score, for the testing data compared to the training data are visible in Fig. 3.
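The quoted MAE, RMSE, and R² values can be computed with standard scikit-learn metrics; the sketch below simply continues the ANN example above and is illustrative only.

```python
# Evaluation metrics and residuals for the ANN sketch above
# (`ann`, X_train, X_test, y_train, y_test come from that sketch).
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

for name, Xs, ys in [("training", X_train, y_train), ("test", X_test, y_test)]:
    pred = ann.predict(Xs)
    residuals = ys - pred                     # values plotted in the residual plots
    mae = mean_absolute_error(ys, pred)
    rmse = np.sqrt(mean_squared_error(ys, pred))
    r2 = r2_score(ys, pred)
    print(f"{name}: MAE={mae:.2f}  RMSE={rmse:.2f}  R2={r2:.2f}")
```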

To investigate the importance of features in the ANN model, a feature-importance plot for the neural network model was employed (Fig. 4). This plot illustrates the importance of each parameter and its impact on predicting the model results, depicting the importance of the parameters as a bar chart ordered from largest to smallest. Thus, parameter x has the highest importance and x the least; in other words, parameter x has the most significant effect on predicting the model results. Since all the bars are positive, an increase in the values of these four parameters has a positive impact on the model’s output. The prediction results of the model are presented in Table S1, SI.
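The article does not specify how the ANN feature importance in Fig. 4 was computed; permutation importance is one common, model-agnostic option and is used here purely as an assumption, continuing the ANN sketch above.

```python
# Permutation importance as one possible way to obtain Fig. 4-style bars
# (assumption; the study's actual importance method is not reported).
import matplotlib.pyplot as plt
from sklearn.inspection import permutation_importance

result = permutation_importance(ann, X_test, y_test, n_repeats=30, random_state=0)
names = ["x1 (conc.)", "x2 (pH)", "x3 (time)", "x4 (catalyst)"]
order = result.importances_mean.argsort()[::-1]            # largest to smallest
plt.bar([names[i] for i in order], result.importances_mean[order])
plt.ylabel("Permutation importance")
plt.show()
```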

Several plots were utilized to interpret the output of the SVR model. Initially, a plot of actual versus predicted values for the randomly split test and training data was created; the results are presented in Fig. 5. Interpreting these plots provides a better understanding of the SVR model’s performance. In contrast to the ANN model, the points in the test-data plot lie very close to the regression line, indicating low error and excellent model performance on the test data. In the regression plot for the training data, some points are scattered around the regression line even though a significant number lie directly on it, indicating that the model fits the training data somewhat less closely.
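As with the ANN, the SVR implementation details are not reported; the sketch below assumes a scikit-learn SVR with an RBF kernel and illustrative hyperparameters, reusing the train/test split from the ANN sketch.

```python
# Hedged sketch of the SVR model; kernel and hyperparameters are assumptions.
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
svr.fit(X_train, y_train)

print("Test R2:    ", svr.score(X_test, y_test))
print("Training R2:", svr.score(X_train, y_train))
```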

For a more in-depth examination of the outliers around the regression lines, residual plots were employed (Fig. 6). It can be observed that there are 2 and 4 outliers or errors in the test and training data, respectively. This discrepancy might be associated with the model’s failure to adapt to the relevant test and training data, leading to incorrect predictions. Another reason could be the model’s dependence on specific features; errors in the test data may also stem from features that are absent in the training data, and a different distribution of the test data compared to the training data could be another cause. Outliers themselves can also contribute to significant errors in the model. Furthermore, the evaluation metrics MAE, RMSE, and R² score were examined. According to the results extracted from the SVR software, the MAE, RMSE, and R² score for the test and training data were determined to be (1.54, 2.02, 0.98) and (3.85, 2.02, 0.88), respectively. These results indicate that for the test data the MAE is 1.54, reflecting a low average prediction error, and an RMSE of less than 2.03 indicates high accuracy in predicting the test data. The R² score of 0.98 indicates that the model interprets the test data well and has adapted well to them; such a high R² signifies that the model explains 98% of the variance in the response variable. It is important to note that the test data are the most critical basis for evaluating performance, as they indicate how well the model can adapt to unseen data: the better the performance on the test data, the better the model generalizes to new data. On the other hand, the training data are used to train the model, and the model’s predictions are based on the conditions of the training data. In conclusion, considering the results for the test data, the model performs very well in predicting results, and the MAE and RMSE values indicate very low errors. Therefore, the error values displayed in Fig. 6 are not likely to significantly interfere with the model’s performance in predicting test results.

The importance of each parameter was evaluated based on its weight in the SVR model (Fig. 7). This plot assisted in determining the level of influence of each parameter in the model’s decision-making process: through the presented weights, the impact of each variable in predicting the results can be determined. Based on Fig. 7, three weight groups are evident: a group with very high positive weights, a group with very high negative weights, and a group with weights close to zero. Parameters such as x and x, which have significantly positive weights, indicate that the predicted model output increases as these two parameters increase. Conversely, an increase in the parameter x leads to a decrease in the predicted model output, playing a crucial role in negative model decisions. Parameter x, with a negative weight close to zero, has minimal influence on the model’s decision-making, indicating that the model pays little attention to this parameter; consequently, it has a negligible role in the final model prediction. It is important to note that the sign of a parameter’s weight, whether positive or negative, is not indicative of its importance in the model’s decision-making; rather, the magnitude of the weight assigned to each parameter determines its significance in the decision-making process. Therefore, according to the SVR model, the most and least important parameters in decision-making are x and x, respectively. The model prediction results are presented in Table S1, SI.
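Signed per-feature weights such as those in Fig. 7 are directly available only for a linear kernel (the `coef_` attribute); the sketch below therefore assumes that a linear SVR, or a linear surrogate, was used for the weight analysis, which the article does not confirm.

```python
# Signed feature weights from a linear-kernel SVR (assumption; the study's
# kernel is not reported). Reuses X_train/y_train from the earlier sketches.
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

lin_svr = make_pipeline(StandardScaler(), SVR(kernel="linear", C=10.0))
lin_svr.fit(X_train, y_train)

weights = lin_svr.named_steps["svr"].coef_.ravel()   # one signed weight per factor
for name, w in zip(["x1", "x2", "x3", "x4"], weights):
    print(f"{name}: {w:+.3f}")
```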

Comparing the performance of the SVR and ANN models on the basis of the reported test-set metrics, the SVR model (MAE 1.54, RMSE 2.02, R² 0.98) predicted the unseen test data more accurately than the ANN model (MAE 3.22, RMSE 3.91, R² 0.92), whereas the ANN model fitted the training data more closely (R² 0.99 versus 0.88).

To optimize the parameters, artificial intelligence (AI) tools, namely the Genetic Algorithm (GA) and SOLVER, were employed. After defining the objective function for the GA software, the crucial parameters that significantly impact the process were input into the software. These essential parameters were extracted from the non-coded data using Eq. (7), previously prepared in the R software; the formula was derived through regression analysis based on the non-coded parameter values, and this regression for the non-coded values is provided in Table S2 of the supporting information file. After the important parameters were determined through the relevant formula, the range of each variable was specified and input into the software; the goal of this step was to constrain the parameter search space. Subsequently, the initial population was defined and the desired operations were selected. Following that, the processes of crossover, mutation, and population updating, along with the evaluation of the fitness function, were executed. In some cases it may be necessary to adjust the parameters of the genetic algorithm or the search space; in such cases the above steps need to be repeated to achieve the optimal result.

The convergence chart for the genetic algorithm is presented in Fig. 8. This chart has two axes, “Generation” and “Fitness Value,” which play a crucial role in tracking the progress of the genetic algorithm; it illustrates the change in the fitness value over the generations. In this chart, the number of generations has been set at 150, with each population evolved from the previous generation; the value 150 indicates that the genetic algorithm was executed up to the 150th generation, and the results are presented on that basis. The value of 180 for the fitness represents the fitness value in the corresponding generation, indicating the overall improvement in the algorithm’s performance at that point. These results depict the execution trend of the algorithm at a specific stage, showing how optimal each parameter is in the search space and allowing each parameter’s value to be adjusted in each generation for further optimization. Two curves, the “Best Fitness Curve” and the “Mean Fitness Curve,” are observable in the chart, depicting two different states of the algorithm. The Best Fitness Curve shows the best fitness, displaying the points that indicate the best fitness conditions in the corresponding generation; the higher the points, the better the solutions the algorithm has reached. The Mean Fitness Curve indicates the algorithm’s progress in improving the overall fitness of the population. According to the Mean Fitness Curve, the curve quickly tends towards the optimal value, indicating rapid convergence, although oscillatory transitions are observed, pointing to issues such as encountering local optima in the search space. Overall, based on the observed curves, it is clear that the algorithm readily reached the optimal values for the parameters.

Interaction charts were designed and plotted for all parameter pairs in the genetic algorithm (Fig. 9). In these charts, the interactive effect between two parameters of the genetic algorithm was examined and analyzed. Analyzing these charts helps identify the best parameter settings for improving the algorithm’s efficiency and aids in selecting the best model with minimal complexity. The charts indicate the sensitivity of the parameters to one another, which can be very useful when making decisions about the effects of each parameter; they thoroughly evaluate the interaction between different parameters and demonstrate the optimization impact of these parameters. Considering the dense clustering of the population in the chart, it is evident that there is a strong interaction between parameters x and x, as well as between x and x, while there is a weak interaction between x and x, x and x, and x and x; the weakest interaction is between x and x. Weak interaction between parameters indicates that a change in one parameter will not significantly affect the other; in other words, the impact of each parameter on optimization should be examined separately. Conversely, strong interaction indicates a strong interactive effect between parameters and their simultaneous impact on optimization. When there is a strong interaction between parameters, the optimization of the model proceeds more harmoniously because the interactive effects are taken into account, whereas when a parameter is examined alone, parameter settings may not be as effective. In reality, making correct, principled decisions about parameter tuning and achieving better, more accurate optimization results requires a precise evaluation of the interaction between parameters and their effects on the genetic algorithm. As portrayed in Fig. 9, the colors represent objective-function values, with yellow and blue indicating the highest and lowest fitness values, respectively. Clusters of red dots in certain areas signify optimal points based on the interaction of two independent parameters, and yellow dots denote the best positions for achieving optimal results. The optimal ranges for X₁ and X₂ are 5 to 10 mg L⁻¹ and 2.6 to 4, respectively, while the optimal ranges for X₃ and X₄, considering the parameter interaction, are 115.5 to 118 min and 0.1 to 0.6 g L⁻¹, respectively. Some parameters show minimal interactive effects. This diagram helps pinpoint model strengths and weaknesses and identify optimal points within the genetic algorithm; the color gradient reflects model improvement, shifting from blue to yellow to indicate better optimization, with yellow clusters highlighting high-value algorithmic optimal points. Figure 9 demonstrates a substantial interaction between the parameter pairs (X, X) and (X, X); the red points are concentrated in a yellow area, indicating a strong correlation between these pairs. Conversely, for the parameter pairs (X, X) and (X, X), the interaction between the parameters is weaker; the red points are more dispersed in the yellow area, depicting a weaker correlation. The red points in the charts represent potential solutions identified by the genetic algorithm; these points are generated randomly during the search process and are then assessed based on their fitness to the objective function, indicating that they are potential solutions likely to yield the best values of the objective function.
It is important to note that the number of red points in the charts depends on the genetic algorithm’s settings. Some red points may be situated in a darker area; these represent potential solutions with lower fitness to the objective function that may still be worth considering. To pinpoint the most accurate optimal point, a more precise parameter search may be necessary. For the parameter pairs (X, X) and (X, X), the optimal points are widely dispersed and do not follow a distinct pattern, suggesting that the relationship between these parameters is more complex and that finding a single optimal solution is challenging. The genetic algorithm is inspired by the process of evolution in nature: a set of initial solutions is randomly generated, these solutions are evaluated based on their fitness to the objective function, and solutions with higher fitness are more likely to be paired with each other; this process is repeated until an “optimal” solution is found. In these charts, the genetic algorithm has repeatedly altered the parameter values and calculated the objective function for each new combination. Since the genetic algorithm operates randomly, the dispersion of optimal points in the (X, X) and (X, X) charts indicates that the relationship between these parameters is complex and that the optimal value of one parameter depends significantly on the value of the other. Conversely, for the parameter pairs (X, X) and (X, X), the dispersion of optimal points is smaller than for the pairs (X, X) and (X, X) (if the optimal points formed a straight line or a smooth curve, it would indicate a linear or weak relationship between the two parameters). This suggests that the relationship between these parameters is weaker: while the optimal value of one parameter in the first pairs depends significantly on the value of the other, this dependency is smaller for the latter pairs, meaning that their optimal values can be found relatively independently of each other. The smaller dispersion of optimal points in those charts also indicates that the algorithm was considerably less sensitive to changes in these parameters than to changes in the strongly interacting pairs.

Figure 10 presents the optimal conditions and values for each parameter determined by the GA. In the chart, INDEX and VALUE indicate the parameter’s position and its optimal value, respectively; for instance, parameter x₃ is positioned at index 2 in the chart and has an optimal value of 117.65 min. In this way, the variables are distinguished from each other. According to the chart, the best values for x₁, x₂, x₃, and x₄ are 6.14 mg L⁻¹, 3.13, 117.65 min, and 0.19 g L⁻¹, respectively. This chart allows decision-makers in various industrial sectors to gain a better understanding of the performance of the genetic algorithm in the different optimization phases.

Initially, after activating SOLVER, the uncoded formula provided by the R software (Eq. 7) was used to define and specify the variables and the objective function. In other words, using Eq. (7), the optimization function was created in a cell, and the variables whose optimal values needed to be determined were set up in a column. First, the cell containing the objective function was selected; then the goal of the objective function was specified, indicating whether it should reach a specific value, a minimum, or a maximum. Next, the variables to be optimized were selected and the necessary constraints on the objective function were specified. Subsequently, the solving method, such as Simplex LP or GRG Nonlinear, was chosen. Optimization was then performed, and the best variable values were reported in the results. To maximize the response variable, the values of the x₁, x₂, x₃, and x₄ parameters were determined to be 5 mg L⁻¹, 3, 120 min, and 0.19 g L⁻¹, respectively. Comparing the results of the SOLVER and GA models therefore yields the following outcomes: the two models give slightly different optimal parameter values. Specifically, for parameter x₁ the SOLVER model suggests an optimal value of 5 mg L⁻¹, while the GA model indicates a slightly higher value of 6.14 mg L⁻¹. The two models converge on nearly the same optimal value for parameter x₂ (3 versus 3.13). There is a slight discrepancy in the optimal value of parameter x₃, with the SOLVER model suggesting 120 min and the GA model indicating 117.65 min, whereas both models agree on an optimal value of 0.19 g L⁻¹ for parameter x₄. Overall, while there are minor differences between the optimization results of the SOLVER and GA models, they generally agree in identifying the optimal parameter values. Table 4 provides a comparison of different processes for pollutant removal using various artificial intelligence models.
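For reference, the SOLVER (GRG Nonlinear) step described above can be approximated in Python with a bounded gradient-based optimizer; the objective below is only a placeholder for Eq. (7), and the bounds are illustrative assumptions rather than the study's exact ranges.

```python
# Python analogue of the SOLVER step: maximise the fitted removal model
# under box constraints (placeholder objective, illustrative bounds).
import numpy as np
from scipy.optimize import minimize

def removal(x):
    """Placeholder for the non-coded regression model of Eq. (7)."""
    conc, ph, t, dose = x
    return 60 - 0.5 * conc - 4 * (ph - 3) ** 2 + 0.3 * t - 50 * (dose - 0.2) ** 2

bounds = [(5, 25), (3, 9), (30, 120), (0.1, 0.6)]   # assumed variable ranges
x0 = np.array([15.0, 6.0, 75.0, 0.35])              # mid-range starting point

res = minimize(lambda x: -removal(x), x0, method="L-BFGS-B", bounds=bounds)
print("Optimal parameters:", res.x)
print("Predicted maximum removal:", -res.fun)
```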
