Machine-learning-aided thermochemical treatment of biomass: a review

Thermochemical treatment is a promising technique for biomass disposal and valorization. Recently


Introduction
Biomass is a renewable energy resource with huge reserves and environmentally friendly, carbon-neutral properties. Using biomass is an effective way to help meet future energy needs while mitigating the greenhouse effect and environmental pollution (Leng et al., 2019b; Chen et al., 2022c). Thermochemical processes, such as torrefaction, pyrolysis, gasification, and hydrothermal carbonization/liquefaction/gasification, are important methods to valorize biomass or biowastes (carbon-neutral materials), such as forestry waste (wood, woody biomass, etc.), agricultural wastes (straw, husks, grasses, etc.), sewage sludge, animal manure, food waste, and algae (Peterson et al., 2008; Leng et al., 2021b; Perera et al., 2021; Xu et al., 2022). During the thermochemical treatment of biomass, valuable oil and gas products are generated, which can be used to produce biofuels and biochemicals that replace fossil resources (Tuck et al., 2012; Ragauskas et al., 2014). In addition, versatile carbonaceous materials are generated, which can replace fossil resources as char fuel (Liu et al., 2015; Leng et al., 2021a, 2020b; Li et al., 2023), while char carbon (black carbon) can also be applied to soil to sequester carbon and mitigate climate change (Leng et al., 2019b and c). Many variables affect thermochemical treatment performance, and investigating the reaction behavior and mechanisms, as well as optimizing the gas/oil/char products, by conventional experimental methods is time-consuming and labor-intensive. Incorporating machine learning (ML) technology can help overcome these limitations.
The term "artificial intelligence (AI)" was introduced by the American scientist John McCarthy at a conference at Dartmouth College in 1956. However, AI did not become popular until the end of the 20th century and the beginning of the 21st century, when computer science and AI algorithms had matured (Xu et al., 2021b). Currently, AI, particularly ML, is widely used in various areas, including the recognition of speech and visual objects, detection of objects, and prediction of the yields, compositions, and properties of products, as well as the reaction behavior of chemical reactions, e.g., thermochemical reactions. ML is believed to be as popular as, if not more popular than, thermodynamic equilibrium, kinetics, and computational fluid dynamics (CFD) for modeling highly complex processes efficiently (Ascher et al., 2022b). For example, ML reduced the computational expense of detailed kinetic models by four orders of magnitude and predicted the outputs of the detailed kinetic models with very high accuracy for new data (Hough et al., 2017). However, ML has only recently been widely applied to aid the thermochemical treatment of biomass.
The number of studies, journal distribution, and funding agencies related to the ML-aided thermochemical treatment of biomass, based on the Web of Science dataset, are shown in Figure 1 and Table 1. Before 2018, only a few studies were published on the thermochemical treatment of biomass using ML (Fig. 1). However, in the last three years (2020-2022), extensive research has been conducted to predict the yields, compositions, and properties of char, oil, gas, and aqueous phases from the thermochemical treatment of biomass. Some researchers have published reviews exclusive to individual thermochemical technologies (Table 2). ML and statistical approaches for biomass torrefaction have been reviewed (Manatura et al., 2023). In another review published earlier, the applications of AI-based [...] and 2019, but only a limited number of cases related to thermochemical treatment processes were included, mainly cases of biomass pyrolysis and gasification (Liao and Yao, 2021). Several other reviews have recently been published concerning either ML-aided hydrothermal treatment (Umenweke et al., 2022; Zhang et al., 2023) or gasification/pyrolysis of biomass (Ascher et al., 2022a). However, to the best of our knowledge, no review has been published that covers and compares the ML-aided wet and dry thermochemical treatments of biomass. In addition, the input and output features of ML models and the predictive performance and interpretation of the studied ML models have not been systematically reviewed. The present review aims to summarize and compare the up-to-date research on both ML-aided wet and dry thermochemical treatments of biomass and provide guidance for future studies. In the second section of the review, biomass and wet/dry thermochemical processes are introduced, with particular emphasis on the descriptors of the variables used in previous ML studies.
The third section summarizes the ML schemes and popular algorithms used in this area. In the fourth section, the applications of ML for predicting the yields, compositions, and properties (structural characteristics are also included as properties) of products from wet and dry thermochemical treatments of biomass are compared and discussed. Finally, challenges and strategies to address them are provided before the conclusions are presented.

Biomass characterization
Biomass is mainly composed of organic matter; the yield of biomass volatile matter is generally more than 50%, and the major elements in biomass, in descending order of weight percentage, are C, O, H, and N (Vassilev et al., 2010; Leng et al., 2021c). The ash yield of biomass varies significantly, ranging from near 0 to 50% or even higher, and the composition of ash is even more complex, comprising dozens, if not hundreds, of inorganic components (Vassilev et al., 2012). The remainder, moisture ignored, is fixed carbon, which is less abundant than volatile matter and ash (Vassilev et al., 2010; Leng et al., 2021c).
Biomass types are commonly classified according to their biological compositions. Lignin, cellulose, and hemicellulose are the major components of most traditional lignocellulosic biomasses, namely forestry and agricultural biomasses, including wood and woody biomass, husks, straw, and grasses, and the contents of these three components vary depending on the specific biomass type (Vassilev et al., 2010). For lignocellulosic biomass, other biological components are generally not characterized, and extractives are determined by subsequent leaching with various solvents (Vassilev et al., 2012). For biomass resources originating from microorganisms and animals, as well as fruit wastes, lipids, proteins, and carbohydrates are the dominant components (Ge et al., 2021). In algal biomass (microalgae and macroalgae), sewage sludge, and food waste, non-fibrous carbohydrates also exist (Vassilev et al., 2010). From a biochemical perspective, lignin is a polymer of phenolic units (p-hydroxyphenyl, guaiacyl, and syringyl), cellulose is a polymer of glucose, hemicellulose is made of various sugars (including glucose, xylose, and mannose), and proteins are polymers of 20 α-amino acids. By contrast, lipids are composed of triglycerides, whose building blocks, i.e., fatty acids, can vary significantly depending on the lipid source (Sawangkeaw and Ngamprasertsith, 2013; Leng et al., 2020a). Generally, lipids with relatively short carbon chains and highly saturated fatty acids are fats, more commonly found in biomass of animal origin, whereas lipids with long carbon chains and low saturation levels are oils, more commonly seen in biomass of plant and microbial origin (Sawangkeaw and Ngamprasertsith, 2013; Leng et al., 2020a). During thermochemical treatment, lignin tends to yield biochar and lipids tend to yield bio-oil, while the other biochemical components contribute differently to the char, oil, gas, and aqueous phases, depending on the processing conditions.
It should be noted that both elemental and biological compositions can be expressed either on a dry basis or a dry-ash-free (DAF) basis. The numerical and category descriptors for biomass are listed in Table 3.
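The relationships between proximate-analysis descriptors and the two reporting bases can be written out explicitly. The following is a minimal sketch, assuming all values are weight percentages on a moisture-free basis; the function names are hypothetical and are not taken from any of the cited studies.

```python
def fixed_carbon_by_difference(volatile_matter: float, ash: float) -> float:
    """Fixed carbon (wt%, dry basis) as the remainder after volatile matter and ash."""
    fixed_carbon = 100.0 - volatile_matter - ash
    if fixed_carbon < 0.0:
        raise ValueError("volatile matter and ash must not sum to more than 100 wt%")
    return fixed_carbon


def dry_to_daf(value_dry: float, ash_dry: float) -> float:
    """Convert a wt% value from a dry basis to a dry-ash-free (DAF) basis."""
    if not 0.0 <= ash_dry < 100.0:
        raise ValueError("ash content must be in the range [0, 100) wt%")
    # Removing the ash shrinks the reference mass, so the percentage grows.
    return value_dry * 100.0 / (100.0 - ash_dry)
```

For instance, a carbon content of 45 wt% on a dry basis in a biomass with 10 wt% ash corresponds to 50 wt% on a DAF basis, which illustrates why the basis should always be stated when descriptors from different studies are merged into one dataset.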
The descriptors for biomass are related to each other, and some biomass descriptors can be predicted by other biomass descriptors using ML. For example, the elemental composition of biomass can be predicted by proximate analysis (Ghugare et al., 2017;Olatunji et al., 2019) or infrared spectroscopy (Tao et al., 2020), and the cellulose, hemicellulose, and lignin content of biomass can be predicted by ultimate analysis (Xing et al., 2019;Kartal and Özveren, 2021). Although biomass, as a material, can be characterized by many other structure and property descriptors, such as mechanical properties, these descriptors were seldom used as variables for ML. The main reason is that only a few studies have reported such descriptors.

Dry thermochemical treatment
Dry thermochemical treatment processes mainly refer to torrefaction, slow/fast pyrolysis, and gasification, which are generally used to process biomass with low moisture content, such as lignocellulosic biomass. These three processes are differentiated by the process parameters (mainly the temperature and heating rate) and their major products. Torrefaction operates at low temperatures and heating rates of < 300 °C and < 20 °C/min, respectively (Dai et al., 2019), and is generally used to pretreat biomass for combustion or to facilitate subsequent processes by yielding a solid product (da Silva et al., 2018). Slow pyrolysis also occurs at low heating rates, but the temperatures are generally higher than those of torrefaction, i.e., 300-700 °C, with biochar as the dominant product (Liu et al., 2015). Owing to the low heating rates and mild reaction conditions, the residence time in torrefaction and slow pyrolysis ranges from several hours to days. Similarly, fast pyrolysis is generally conducted at 300-700 °C but at much higher heating rates, i.e., ranging from 100 °C/min to more than 1000 °C/s, with bio-oil as the major product. Gasification involves high heating rates and high working temperatures, ranging from 800 to more than 1000 °C, with syngas being the dominant product (Molino et al., 2016). Owing to the high heating rates and severe reaction conditions, the residence time in fast pyrolysis and gasification is generally within several seconds.
Apart from the temperature, heating rate, and residence time mentioned above, other process parameters can be used to describe these processes and may have considerable effects on biomass thermal treatment performance. For example, the purge gas for pyrolysis is normally inert, and an oxygen-deficient atmosphere is used in commercial facilities. In torrefaction, an oxidative atmosphere such as air is also commonly used to enhance the energy density of torrefaction char (da Silva et al., 2018), whereas reactive atmospheres, such as air, water, and O2, are required during gasification to enhance the efficient breaking of chemical bonds. When microwave pyrolysis is used, the microwave power replaces the temperature and acts as the dominant descriptor of this process (Mari Selvam and Balasubramanian, 2022). Other important descriptors include particle size, purge gas flow rate, reactor type and characteristics (e.g., bed materials, heating source), and catalyst. The descriptors for the dry thermochemical treatment processes are shown in Table 3.

Wet thermochemical treatment
Wet thermochemical treatment processes are generally used to treat biomass with high moisture content, such as algae, sludge, manure, and food waste, so that the biomass moisture can be used as a reaction solvent, thus removing the need for energy-intensive pre-drying. The wet thermochemical treatment processes include i) hydrothermal carbonization (HTC), also called wet torrefaction, which works at low temperatures (180-260 °C) and low pressures (2-5 MPa), and the main product is hydrochar (also called biochar) (Zhai et al., 2017); ii) hydrothermal liquefaction (HTL), which is conducted at temperatures of 250-400 °C and pressures of 5-20 MPa, with bio-oil being the dominant product (Huang et al., 2013; Huang and Yuan, 2015); and iii) supercritical water gasification (SCWG), or hydrothermal gasification, which processes biomass at temperatures of 380-650 °C and pressures of 20-40 MPa to produce syngas rich in H2, CO, and CH4 (He et al., 2014; Su et al., 2015).
In addition to temperature and pressure during hydrothermal treatment, other parameters, such as residence time, solid content (solid loading), moisture content (water content), heating rate, reaction solvent, extraction procedure and solvent, and catalyst (shown in [...] although the same heating time is required. However, the heating rate of HTL and SCWG can be up to hundreds of degrees Celsius per minute in fast-heating reactors, which can be achieved by immersing the reactors in a pre-heated sand bath (Akiya and Savage, 2002; Peterson et al., 2008).
In addition, for both wet and dry thermochemical processes, parameters can be integrated and presented as new parameters, e.g., the reaction severity index, which generally combines temperature and residence time in a single function. Moreover, other thermochemical treatment processes, such as hydrolysis and "thermal-dissolution-based carbon enrichment", have also been proposed; they share process descriptors similar to those mentioned above (Hu et al., 2022).
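The text does not specify which function is used to combine temperature and residence time. As an illustration only, the sketch below assumes the widely used severity factor of the form log10(t · exp((T − 100)/14.75)), with t in minutes and T in °C; the function name and default constants are assumptions and may differ from those in the studies reviewed here.

```python
import math


def severity_factor(temperature_c: float, time_min: float,
                    ref_temp_c: float = 100.0, omega: float = 14.75) -> float:
    """Log10 severity factor combining temperature (°C) and residence time (min).

    Assumed form: log10(t * exp((T - T_ref) / omega)); the constants are the
    conventional defaults, not values confirmed by the reviewed studies.
    """
    if time_min <= 0.0:
        raise ValueError("residence time must be positive")
    reaction_ordinate = time_min * math.exp((temperature_c - ref_temp_c) / omega)
    return math.log10(reaction_ordinate)
```

A single engineered descriptor of this kind can replace two correlated inputs (temperature and time) in an ML model, at the cost of losing the ability to interpret their individual effects.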

Product characterization
According to previous reviews, the differences in the compositions and properties of oil/char/gas/aqueous phase products between wet and dry thermochemical treatments are small (Kambo and Dutta, 2015; Leng et al., 2018c and 2021b), and many descriptors for these products from wet and dry processes are the same. As shown in Table 3, there are three branches of descriptors for the products from the thermochemical treatment of biomass: yields, compositions, and properties. While the yields of bio-oil, biochar, and gas phases are mainly calculated based on the weight ratio of these products to the weight of the dry (more commonly used) or DAF-based biomass, the yield of the aqueous phase is generally calculated as the carbon recovery rate in the aqueous phase because weighing organics in the aqueous phase is difficult (Leng et al., 2018d).
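The two yield definitions described above can be sketched as follows. This is a minimal illustration with hypothetical function names; the carbon masses for the aqueous phase are assumed to be measured separately (e.g., from a total organic carbon analysis), which the paragraph above does not specify.

```python
from typing import Optional


def mass_yield(product_mass_g: float, dry_biomass_mass_g: float,
               ash_wt_percent: Optional[float] = None) -> float:
    """Product yield (wt%) on a dry basis, or on a DAF basis if ash content is given."""
    basis = dry_biomass_mass_g
    if ash_wt_percent is not None:
        # DAF basis: subtract the ash mass from the dry biomass mass.
        basis = dry_biomass_mass_g * (1.0 - ash_wt_percent / 100.0)
    if basis <= 0.0:
        raise ValueError("reference biomass mass must be positive")
    return 100.0 * product_mass_g / basis


def carbon_recovery(carbon_in_phase_g: float, carbon_in_feed_g: float) -> float:
    """Carbon recovered in a product phase as a percentage of the carbon in the feed."""
    if carbon_in_feed_g <= 0.0:
        raise ValueError("feed carbon mass must be positive")
    return 100.0 * carbon_in_phase_g / carbon_in_feed_g
```

When yields from different references are pooled into one dataset, converting them to a common basis first avoids mixing dry-basis and DAF-basis values in a single output column.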
The properties and compositions of the four phases vary significantly. For use as an engine fuel, bio-oil fuel properties such as calorific value (mainly higher heating value, HHV, and lower heating value, LHV), carbon/solid residue, viscosity, density, flash point, pour point, and acid number are required and have been reported frequently (Kan et al., 2016; Leng et al., 2018c). Other properties, such as pH, molecular weight, lubricity, and boiling range, have also been used to describe bio-oil (Kanaujia et al., 2014; Kan et al., 2016). These properties depend on the composition. Elemental compositions (C, H, O, N, and S), atomic ratios (O/C, H/C, and (O+N)/C), chemical compounds (relative or absolute contents of various chemicals), and water content are commonly reported. The water content of bio-oil has mainly been reported in studies dealing with biomass pyrolysis, and the value is approximately 10-30%, constituting a large part of the bio-oil (Leng et al., 2018c). However, water content has only occasionally been reported for bio-oil from HTL because HTL bio-oil is typically dewatered during the solvent extraction and separation procedure, and the value, if reported, is generally much lower (Leng et al., 2018c). N and S in bio-oil can be detrimental because they cause pollution upon combustion, but they can be beneficial if N/S-rich chemicals or materials are targeted (Leng et al., 2020c and 2020e).
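The atomic ratios mentioned above follow directly from the elemental composition. The sketch below assumes the composition is given in weight percent and uses standard atomic masses; the function and variable names are hypothetical.

```python
# Standard atomic masses (g/mol) for the elements typically reported.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007, "S": 32.06}


def atomic_ratios(wt_percent: dict) -> dict:
    """Return H/C, O/C, and (O+N)/C atomic ratios from elemental wt% values."""
    # Convert each mass fraction to a relative number of moles.
    moles = {el: wt_percent.get(el, 0.0) / mass for el, mass in ATOMIC_MASS.items()}
    if moles["C"] <= 0.0:
        raise ValueError("carbon content must be positive")
    return {
        "H/C": moles["H"] / moles["C"],
        "O/C": moles["O"] / moles["C"],
        "(O+N)/C": (moles["O"] + moles["N"]) / moles["C"],
    }
```

Because the ratios are deterministic functions of the elemental composition, including both as inputs to an ML model adds correlated descriptors rather than new information, which is worth keeping in mind when interpreting feature importance.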
Biochar can also be used as a fuel for boilers; thus, calorific values such as HHV and LHV are also generally reported, and many equations are available for their calculation. However, using biochar as a carbon material, replacing activated carbon and other materials, is more promising than using it as fuel. Material properties such as specific surface area (SSA), porosity, aromaticity, electrical conductivity, and [...] Table 5 lists the indicators for the application performance of thermochemical products in different areas and the closely related compositions and properties of the char/oil/gas/aqueous phases, which should be investigated in the future. Gas generated by the thermochemical treatment of biomass is generally considered a gas fuel. The chemical components, LHV, and tar content of the gas are the most important factors in its application (Kan et al., 2016). The enrichment of gas with H2 and CH4 has received the most interest because these components are the most effective and clean fuel components, and they can be used as fuel or applied to produce various chemicals (Molino et al., 2016). Gases from HTC, HTL, torrefaction, and slow pyrolysis are not frequently collected and analyzed because of their low contents of H2 and CH4 and high content of CO2 (Leng et al., 2020f).
The aqueous phase, however, is a byproduct of the thermochemical treatment of biomass, with hydrothermal treatment processes producing a large amount, i.e., 1-20 times the dry biomass weight (Leng et al., 2018a, 2020d, and 2021b). Pyrolysis only produces an aqueous phase amounting to 10-30% of the mass of dry biomass, and most frequently, it is mixed with bio-oil, which is why the water content in bio-oil from pyrolysis is high. The treatment and valorization of the aqueous phase, especially from hydrothermal treatment, is becoming a challenge for pathways toward commercial viability (Watson et al., 2020). Common wastewater indicators such as pH, chemical oxygen demand (COD), total organic carbon (TOC), and total nitrogen (TN) have been reported in the literature, and they help indicate the wastewater properties and facilitate the matching of suitable technologies to manage the wastewater. Total phosphorus (TP) is mainly reported in the aqueous phase from the thermochemical treatment of P-rich biomass, such as algae, sludge, manure, and food waste (Leng et al., 2019a; Chen et al., 2022b).
Chemical compounds of organics in the aqueous phase have also been characterized, and their compositions are similar to those of bio-oil; however, the contents of polar fractions are higher and nonpolar fractions are lower than those of bio-oil. It is worth mentioning that biomass inorganics, such as K and Na, are distributed predominantly in the aqueous phase.

Machine learning schemes
A typical ML scheme is shown in Figure 3. ML starts with data collection, either from published references (using tools such as Plot Digitizer or AI tools to extract the data) or from experiments and open sources, from which data can be collected more easily. The distribution of each descriptor should first be analyzed to ensure that the collected data are representative, cover the required ranges, and, ideally, are approximately normally distributed. The ML model is built on the collected data; thus, its applicability and generalization capability are limited to the ranges covered by the descriptors. When the dataset is ready, data processing is required because different descriptors have different value scales; normalization of all descriptors to the range of -1 to 1 is generally conducted. Dimensionality reduction is sometimes required when the number of inputs is much higher than the number of outputs or when input descriptors are highly correlated, and principal component, discriminant, and independent component analyses are commonly used for this purpose. After the dataset is prepared and processed, it can be used for ML modeling (Fig. 3). The first step of modeling is selecting suitable algorithms (see Section 3.2) for the prediction tasks. Then, the model is trained, and the corresponding hyperparameters are tuned with selected input data, generally 70-90% of the data collected. During hyperparameter tuning, cross-validation, which includes hold-out, leave-one-out (Bagheri et al., 2019), rolling-window analysis (Elmaz and Yücel, 2020), and k-fold methods, allows more of the data to be used for both training and testing and is required to avoid overfitting, thus ensuring accurate prediction by the final optimum ML model (Fig. 3).
For example, in k-fold cross-validation, the training dataset is randomly split into k non-overlapping folds; the ML model is then trained using k-1 folds and tested with the remaining fold, and this procedure is repeated k times so that all training data are used for both training and validation, which is beneficial for the subsequent evaluation on the test dataset (Fig. 4a). However, in some studies, cross-validation was not used, for example, ML with a test set but without cross-validation (Chen et al., 2018; Ozbas et al., 2019) (Fig. 4c) and ML with validation and test sets but without cross-validation (Qasem et al., 2023) (Fig. 4d). The average values of the coefficient of determination (R²) and root mean square error (RMSE) from cross-validation are used to assess the predictive performance, and the hyperparameters with the highest fitness (R²) and/or lowest error (RMSE) are selected as the optimum hyperparameters. Several other statistical criteria can be used to determine the performance, accuracy, and reliability of ML models (Umenweke et al., 2022). With the optimum hyperparameters, the data used during hyperparameter tuning (i.e., 70-90% of the data collected) are input as the training data to retrain the ML model, and the remainder (10-30%) is used as test data to test the model. However, in some studies, the model was not further tested after cross-validation, and a model test was implemented within the cross-validation process (Fig. 4b) (Yapıcı et al., 2022), indicating that the models were not tested with data from out-of-dataset cases. The R² and RMSE of such models would be calculated again to evaluate whether the model is acceptable; if not, the [...]
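The normalization and k-fold procedure described above can be sketched in a few lines. This is an illustrative sketch only: a one-feature least-squares line stands in for the ML model, and all function names are hypothetical rather than taken from a specific library or study.

```python
import math
import random


def scale_to_unit_range(column):
    """Min-max scale a list of numbers to the range [-1, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for any ML model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def r2_rmse(y_true, y_pred):
    """Coefficient of determination and root mean square error."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return r2, math.sqrt(ss_res / n)


def k_fold_scores(xs, ys, k=5, seed=0):
    """Average R2 and RMSE over k non-overlapping folds of the training data."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # each index lands in exactly one fold
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [intercept + slope * xs[i] for i in held_out]
        scores.append(r2_rmse([ys[i] for i in held_out], predictions))
    mean_r2 = sum(s[0] for s in scores) / k
    mean_rmse = sum(s[1] for s in scores) / k
    return mean_r2, mean_rmse
```

In a hyperparameter search, `k_fold_scores` would be evaluated once per candidate setting and the setting with the best average score retained before the final retraining and test. In practice, the scaling limits are usually derived from the training portion only, so that no information from the held-out data leaks into the model.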
For the optimum ML model, the model interpretation is studied so that the working mechanisms of the "black box" model can be understood to some extent (Fig. 3). Feature importance analysis, Shapley additive explanations (SHAP) analysis, and partial dependence plots (PDP) are generally used to study the correlations between inputs and outputs. The interpretation of a model may be unreasonable even if its predictive performance is acceptable. However, the data never lie, and the problem may be that the collected data are biased. The dataset needs to be checked carefully; deleting anomalous data may solve the interpretation problems, or as much new data as possible should be added to allow the reconstruction of models with both satisfactory predictive performance and reasonable interpretation. For the final optimum models, online or offline platforms can be developed to allow the models to be used by others (Fig. 3). For example, an online graphical user interface (GUI) website or offline software can be developed. With the optimum model, optimization can be conducted to obtain the optimal thermochemical operating conditions, as well as biomass characteristics or mixing recipes, that will achieve the targets (generally high yield, high energy recovery, and favorable properties), which should be the major research direction in the future.
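Of the interpretation tools listed above, the partial dependence calculation is the simplest to write out: one feature is fixed at each value of a grid, and the model predictions are averaged over the dataset. The sketch below assumes `predict` is any callable that maps one row of input values to a single prediction; the names are hypothetical.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average model prediction when one feature is fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)            # copy so the dataset is left untouched
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve
```

Plotting the returned curve against the grid gives the partial dependence plot for that feature. A curve that contradicts well-established process knowledge (for example, char yield rising with pyrolysis temperature over the whole range) is one sign that the underlying dataset may be biased.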

Machine learning algorithms
Many frameworks and libraries are available for conducting ML modeling, e.g., scikit-learn, TensorFlow, PyTorch, CRAN, Keras, Weka, H2O, mlpack, and those from Amazon, IBM, and Google, among which the free Python-based library scikit-learn seems to be the most popular. The core ML algorithms of these libraries are similar, and the major differences are the working languages (Python, C++, R, Java, etc.) and the operating companies or organizations. ML can be classified into three groups: supervised learning, in which the machine learns explicitly from data with clearly defined outputs; unsupervised learning, in which the machine learns from data without any defined output; and reinforcement learning, in which the machine learns how to act within a certain environment to maximize the rewards because each learning task returns a reward (Mahmood and Wang, 2021). Supervised learning, mainly used to solve regression and classification problems, is generally used to predict the yields, compositions, and properties of products obtained from biomass thermochemical treatment.
Many regression and classification algorithms are available, and typical algorithms are described in Figure 5. The accuracy and generalizability of the as-built ML model depend on the algorithm, and each algorithm has its own merits and weaknesses. Regression algorithms can be used to build the relationship between dependent and independent variables; linear regression is the simplest, and an artificial neural network (ANN, Fig. 5c) is the most complex. An ANN is a biologically inspired algorithm that imitates the human brain's functionality and consists of an input layer for receiving input variables, one or more hidden layers for identifying nonlinearity and correlating inputs and outputs, and an output layer for representing output variables. ANNs have been widely used in this area owing to their ability to model highly nonlinear processes (Liao et al., 2019; Xia et al., 2021). However, ANN models are rather complex, and screening network structures, training algorithms, and activation functions is challenging. Some studies have compared ANNs with different modeling skills. For example, comparisons between standard and ordinary activation functions indicate that the former outperforms the latter in predicting hydrogen production (Ayodele et al., 2021). In addition, the ANN has limited interpretability, and few interpretations are included in published references, posing challenges to understanding its working mechanism.
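The layered structure of an ANN described above can be illustrated with a forward pass through a single hidden layer. This is a minimal sketch: the tanh activation, the linear output layer, and the weight layout are assumptions chosen for illustration, and training (e.g., by backpropagation) is not shown.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b), one weight row per neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Input layer -> tanh hidden layer -> linear output layer."""
    hidden = dense(inputs, hidden_weights, hidden_biases, math.tanh)
    return dense(hidden, output_weights, output_biases, lambda z: z)
```

Here the inputs would be scaled descriptors such as temperature and carbon content, and the outputs would be targets such as product yields. The number of hidden layers and neurons and the choice of activation function are the structural settings that, as noted above, make ANN screening challenging.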
Tree-based models (Fig. 5d), such as decision tree (DT) (Quinlan, 1986) and random forest (RF) (Pavlov, 2001), have been increasingly used in this area because of their improved interpretability over ANN and comparable predictive performance. For example, RF is an ensemble model with many DTs that uses bagging, or bootstrap aggregation, to achieve more accurate prediction and better generalization than single-DT models (Fig. 5d) (Pavlov, 2001). Tree-based models are classification algorithms that can be used to identify object categories, and they are favorable for processing non-numerical variables, such as category variables. For example, DT outperformed ANN in predicting hydrogen production. RF showed better performance than the multilayer perceptron neural network (MLP-NN) in predicting the yields of oil, char, and gas, as well as the compositions of gas and char (Shahbeik et al., 2022). However, there are few studies comparing ANN and tree-based models, and the lower predictive performance of ANN may be due to limited tuning strategies. For example, suitable value-assigning methods for non-classification algorithms, such as the one-hot encoding method (Ascher et al., 2022a), may offset this shortcoming of regression algorithms, such as ANN, and improve the predictive performance.
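One-hot encoding, mentioned above as a way to feed category variables to algorithms that expect numerical inputs, can be sketched as follows. The function name and the example categories are hypothetical.

```python
def one_hot_encode(values):
    """Map each category to a binary indicator vector (one column per category)."""
    categories = sorted(set(values))
    column_of = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[column_of[value]] = 1
        encoded.append(row)
    return categories, encoded
```

For a descriptor such as reactor type or biomass category, each sample then carries a 1 in the column of its own category and 0 elsewhere, so no artificial ordering is imposed on the categories, unlike when they are simply numbered 1, 2, 3.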
Other regression and classification algorithms, such as support vector machine (SVM) (Cortes and Vapnik, 1995) (Fig. 5e) and gradient boosting regression (GBR) (Fig. 5f), have also been increasingly applied to predict the yields, compositions, and properties of oil, char, gas, and aqueous phases. In SVM, the training algorithm identifies the separating hyperplane that classifies two classes and maximizes the distance between the hyperplane and the nearest data points; the hyperplane is constructed from support vectors (data points from either class closest to the hyperplane) (Fig. 5e). By contrast, the GBR algorithm is trained using a boosting strategy: a first tree is fitted to predict the errors, i.e., the differences between the actual values and the initial predictions; the predictions are then updated from the previous ones, and a new tree is fitted to the new errors, which is repeated until there is no obvious decrease in the residuals (Fig. 5f). The predictive performance of these models relative to ANN or tree-based models varies between studies. For example, the SVM model had a more satisfactory prediction performance than ANN for predicting the HHV of bio-oil (Chen et al., 2018); GBR outperformed RF in predicting bio-oil yield and elemental compositions; and the RF model showed better prediction performance than SVM and DT, while multilinear regression (MLR) had the worst performance in predicting bio-oil yield. Although the predictive performance depends not only on algorithms but also on dataset characteristics, many researchers have proposed that models such as RF, SVM, and GBR may be preferable to ANN for problems with small datasets.
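The boosting strategy described above can be sketched with one-split regression trees (stumps) on a single feature. This is a simplified illustration with hypothetical names: it uses a fixed number of trees and a learning rate instead of stopping when the residuals no longer decrease, and it assumes squared-error loss.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    thresholds = sorted(set(xs))[:-1]
    if not thresholds:
        raise ValueError("at least two distinct x values are required")
    best = None
    for threshold in thresholds:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def boosted_predict(model, x):
    """Sum the initial prediction and the scaled contributions of all stumps."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)
```

Each new stump corrects part of the error left by the ensemble built so far, which is what distinguishes boosting from the bagging used in RF, where trees are built independently on bootstrap samples and then averaged.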

Prediction of the yields of products
The yields of the four phases of wet and three phases (excluding the aqueous phase) of dry thermochemical processes have been predicted by researchers, with most cases concerning the yields of char and oil, and the predictive performance is shown in Figure 6a (see Supplementary Information). Most studies used elemental compositions and thermochemical operation conditions as inputs. Proximate analysis was included in most studies dealing with dry thermochemical processes, while atomic ratios and biochemical compositions were also often considered (see Supplementary Information). However, only a few studies considered solvents (reaction solvents and extraction solvents for HTL), catalysts (Castro [...]), and metal compositions of biomass. Even when these variables are included as inputs, biased prediction and interpretation may be encountered if cases including these variables are too few or have a poor distribution (biased dataset). In this case, the ML models can be built case-by-case for each particular solvent or catalyst (Zhou et al., 2022b), but models built in this way have low generalizability.
Strictly speaking, the predictive performance of the wet and dry thermochemical processes should not be compared directly because of the differences in datasets, algorithms, etc.; however, the statistical summaries of the predictive performance of all cases from the many references indicate that the ML predictive performance for yields from the dry thermochemical processes is superior to that for the wet ones (Fig. 6a). The R² values for the dry process models, mainly around 0.85-0.94, are slightly higher than those for the wet ones (R² mainly 0.85-0.90) (Fig. 6a), while the RMSE values for the former (lower than five) are better than those for the latter. Considering that the acceptable errors of biomass thermochemical processes are within 5% (experimental uncertainty), the models obtained for predicting yields of products from dry thermochemical processes may be preferable, although an RMSE of 3-8 for the wet processes is also acceptable (Fig. 6a).

Prediction of the compositions of products
Unlike the prediction of the yields of products, the prediction of elemental and proximate compositions of the oil or char, as well as the compositions of the gas phase, in wet thermochemical process models is better than in the dry ones, with R² values of approximately 0.90 and RMSE ranging from near 0.5 to 2 for the former, and R² values of approximately 0.85 and RMSE ranging from near 0.2 to 4 for the latter (Fig. 6b). Only one study considered a catalyst as an input for the prediction of compositions among the cases shown in Figure 6b, with others including thermochemical operation conditions and biomass compositions (elemental [...] Table 6). The yield of glucose from wet torrefaction of microalgae and sorghum distillery residue using H2SO4 as a catalyst was also predicted (Chen et al., 2022e). To enhance the predictive performance of N-heterocycles in bio-oil, the yield of bio-oil and the [...]

Prediction of the properties of products
The prediction of the HHV or LHV of the oil, char, or gas is the most popular among researchers, and the predictive performance of the dry and wet thermochemical processes is similar, with an R² and RMSE of approximately 0.90 and 1.5, respectively (Fig. 6c), which is comparable to, if not better than, the performance of ML in predicting the HHV of various municipal solid wastes including biomass, as shown in a previous review (Bagheri et al., 2019). It should be noted that the range of calorific values of gas is much narrower than that of oil and char; therefore, the RMSE for the prediction of the calorific values of gas is generally lower (Table 7). However, there are still many other properties, such as the density, flash point, pour point, and acid number of bio-oil and the porosity (total pore volume, micro-/meso-pore volume, and average pore size) and cation/anion exchange capacity of biochar, that are important for the application of these products (Ippolito et al., 2020) and have not yet been reported; thus, they are worthy of further investigation.

Other predictions
(i) Simultaneous prediction of yields, compositions, and properties
Most current studies have performed single-task predictions, with a few reporting multi-task predictions for the simultaneous prediction of the yields of two or more product phases. In these multi-task predictions, a predictive performance comparable to that of single-target predictions was achieved. However, no study has simultaneously predicted the yields of multiple product phases together with their compositions and/or properties. Although many targets were predicted in some studies, they were obtained using the single-target prediction mode (Ascher et al., 2022a; Shafizadeh et al., 2022; Shahbeik et al., 2022). According to the biorefinery concept, all product phases from the thermochemical treatment of biomass should be utilized or disposed of to valorize the biomass resource fully (Fan et al., 2020; Watson et al., 2020). In addition, although it is possible to use multiple single-target models to predict the required targets, it would be difficult to mediate among the targets of many separate models when applying them. Therefore, multi-task predictions for simultaneous prediction and mediation of the yields of multiple product phases and their compositions and properties within one model are favorable and encouraged. However, the availability of data limits their implementation.
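The idea of predicting several targets within one model can be illustrated with a minimal sketch. A k-nearest-neighbour regressor is used here only as a simple stand-in and is not the algorithm of any reviewed study; the function name, the input scaling, and all data values are hypothetical. Because the prediction is an average of measured yield vectors, the predicted char, oil, and gas yields still sum to 100 wt.%, which separate single-target models do not guarantee.

```python
import math

def knn_multi_output(train_x, train_y, query, k=3):
    """Predict all targets at once by averaging the target vectors
    of the k nearest training samples (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    n_targets = len(train_y[0])
    return [sum(train_y[i][j] for i in nearest) / k for j in range(n_targets)]

# Hypothetical samples: inputs are (temperature in deg C / 100, residence time in h);
# targets are (char, oil, gas) yields in wt.%, each row summing to 100.
train_x = [(3.0, 1.0), (4.0, 1.0), (5.0, 1.0), (6.0, 1.0), (5.0, 2.0)]
train_y = [(60.0, 25.0, 15.0), (45.0, 35.0, 20.0), (35.0, 40.0, 25.0),
           (30.0, 35.0, 35.0), (33.0, 38.0, 29.0)]

char, oil, gas = knn_multi_output(train_x, train_y, query=(4.6, 1.2), k=3)
print(f"char = {char:.1f}, oil = {oil:.1f}, gas = {gas:.1f}, "
      f"sum = {char + oil + gas:.1f} wt.%")
```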

(ii) Prediction of the thermochemical behavior and kinetics
Other studies have reported the prediction of thermochemical conversion behavior or kinetics, for example, the degree of dehydration and decarboxylation of char during HTC (Table 8). Such predictions can be useful for understanding the thermal degradation behavior and reaction kinetics of biomass or the thermochemical process performance, and more studies should be conducted in this area.

Interpretation of the ML models
ML is very popular because it is not just a "black box" for predictions; it can also be interpreted to help understand the reaction mechanisms and engineer the thermochemical processes. First, the factors affecting the target can be ranked according to their importance to the target through feature importance analysis (Fig. 3). Additionally, PDP and SHAP analyses can be used to indicate the correlations between variables and targets. In PDP, the effect of any one input variable, or the mutual influence of any two input variables, on the predicted target can be plotted to show the linear, monotonic, or more complex connections between the input and output features. SHAP analysis can also be used to analyze feature importance levels as well as influence trends. In addition, SHAP assigns a value to each data point of the input feature to indicate its importance to the target, and the accumulation of the SHAP values of all data points results in a full interpretation of the studied feature. These three interpretation methods have been widely used in the reviewed studies. Table 9 lists the top features for predicting the yields, compositions, and properties of the thermochemical products reported in the reviewed articles. The yield of bio-oil from HTL is dominated by temperature and lipids (the contents of C and H are highly related to lipids (Leng et al., 2022c)) (Table 9). Temperature is also the most important factor for the bio-oil yield, as well as the char yield, from pyrolysis, but the other important features can vary between studies. C and H are important to HHV, as they are the major elements that can be combusted to release heat. N can be very significant to the yields, compositions, and properties of bio-oil from HTL and pyrolysis because it participates in reactions to yield oil components, e.g., through the Maillard reaction (Leng et al., 2020c and 2020e).
It seems that N has a more significant effect on the yield of char during HTC than on that of oil during HTL. N plays a more prominent role for oil and char in wet thermochemical processes than in dry processes because biomass with a higher N content is more commonly used in wet processes. The prediction of biochar SSA indicates that the parameters during activation are much more significant than the pyrolysis parameters (Liao et al., 2019), which is why biochar requires activation before it can be used to replace activated carbon from fossil resources. There are many other interesting and useful pieces of information in the ranking lists shown in Table 9, but these will not be detailed due to limited space.
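The one-dimensional partial dependence described in this section can be sketched as follows. For each grid value of one feature, that feature is overwritten in every sample and the model predictions are averaged. The function toy_model is a hypothetical stand-in for a trained ML model, and its coefficients and the sample values are assumptions chosen only to give a readable curve.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """For each grid value, overwrite one feature in every row and
    average the model predictions (one-dimensional partial dependence)."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

def toy_model(row):
    """Hypothetical stand-in for a trained model: bio-oil yield (wt.%)
    from temperature (deg C) and lipid content (wt.%)."""
    temperature, lipid = row
    return 40.0 - 0.002 * (temperature - 320.0) ** 2 + 0.5 * lipid

# Hypothetical samples: (temperature in deg C, lipid content in wt.%).
rows = [(280.0, 5.0), (300.0, 10.0), (320.0, 20.0), (350.0, 15.0)]
grid = [260.0, 290.0, 320.0, 350.0]
curve = partial_dependence(toy_model, rows, 0, grid)
for t, y in zip(grid, curve):
    print(f"T = {t:.0f} deg C -> mean predicted yield = {y:.2f} wt.%")
```

In this toy case, the curve peaks at 320 deg C, which is the kind of non-monotonic influence trend that a feature importance ranking alone would not reveal.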

Application of the ML models
The ML models obtained are mainly used to aid the thermochemical treatment of biomass. First, they can be used to predict the yields, compositions, and properties without performing experiments, which is the major focus of current studies. Additionally, ML models can be used to provide raw data for subsequent applications. For example, input data for life cycle assessment (LCA) and economic analysis studies can be obtained from ML predictions of the yields, compositions, and properties of char/oil/gas/aqueous phases, which can facilitate a more comprehensive LCA (Cheng et al., 2020a and b). Second, ML can be applied to solve optimization problems, e.g., finding the optimal processing parameters for a given biomass (with known compositions, which are used as inputs) to produce target products, which is considered forward optimization. In this forward optimization, iterations of processing parameters with set step sizes yield corresponding targets, from which preferable solutions can be identified with suitable functions. In addition, optimization of the biomass mixing ratio can be achieved by evaluating the results obtained from different biomass mixtures. ML models can also reverse design not only processing parameters but also biomass compositions (or biomass mixing recipes), which is considered reverse optimization. Through these methods, ML models can be applied to aid experimental studies to effectively determine the optimal solutions of thermochemical treatment processes and verify the validity of the models. For example, forward optimization of bio-oil production from HTL of specified algae using the iteration method and reverse optimization using the particle swarm optimization method with model compounds to obtain preferable bio-oil were conducted with experimental verification, and the results were satisfactory (Fig. 7).

Table 9. Machine-learning-aided prediction of thermochemical conversion behavior or kinetics.
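The forward and reverse optimization routes described in this section can be sketched as follows. The function surrogate is a hypothetical stand-in for a trained ML model, and its coefficients, the parameter bounds, and the PSO settings (inertia 0.7, acceleration coefficients 1.5) are all assumptions made for illustration. The forward route steps through processing parameters for a fixed biomass, whereas the reverse route lets a minimal particle swarm also search the biomass composition.

```python
import random

def surrogate(temperature, time_min, lipid):
    """Hypothetical stand-in for a trained ML model: bio-oil yield (wt.%)."""
    return (45.0 - 0.002 * (temperature - 320.0) ** 2
            - 0.01 * (time_min - 30.0) ** 2 + 0.3 * lipid)

def forward_grid(lipid, temps, times):
    """Forward optimization: fixed biomass, step through processing parameters."""
    return max((surrogate(t, m, lipid), t, m) for t in temps for m in times)

def pso_maximize(func, bounds, n_particles=20, n_iter=100, seed=0):
    """Minimal particle swarm optimization within box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [func(*p) for p in pos]        # personal best values
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                # Clamp the new position to the allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = func(*pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_val, g_pos

# Forward: biomass with 10 wt.% lipids; 10 deg C and 5 min step sizes.
fwd_yield, fwd_temp, fwd_time = forward_grid(10.0, range(260, 381, 10), range(10, 61, 5))
print(f"forward: {fwd_yield:.1f} wt.% at {fwd_temp} deg C, {fwd_time} min")

# Reverse: search temperature, time, and lipid content (0-30 wt.%) together.
bounds = [(260.0, 380.0), (10.0, 60.0), (0.0, 30.0)]
rev_val, rev_pos = pso_maximize(surrogate, bounds)
print("reverse:", round(rev_val, 2), [round(x, 1) for x in rev_pos])
```

A grid search is transparent but scales poorly with the number of parameters, which is one reason a population-based method such as PSO is attractive once biomass composition is added to the search space.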

Limitations of current studies
ML models trained with experimental data can predict outputs accurately, and the application of ML to aid the thermochemical treatment of biomass has been receiving growing attention in recent years. As discussed in the previous parts of this work, many authors have published papers on the application of ML for predicting the yields, compositions, and properties of products from wet and dry thermochemical treatments of biomass. However, there are some limitations associated with these studies:
(i) Data. Data unavailability is common, and data details, including input and output datasets, are missing in many studies. The accuracy of data is also crucial for the ML model's performance, yet it was doubtful in some studies because of issues such as inconsistent calculation formulas (some on a dry basis and others on a DAF basis), feature engineering, and the scientific processing of textual data. Furthermore, few articles have considered variables from processes related to the thermochemical treatment, such as biomass pretreatment, bio-oil separation, and catalytic conversion, when constructing ML models of biomass thermochemical conversion.
(ii) Modeling process. Different researchers used varied ML schemes during the construction of the ML model (Fig. 3) to obtain optimized models with the highest R² and/or lowest RMSE in training and testing. However, the schemes used in some studies are problematic. For example, models trained without cross-validation may suffer from overfitting. Moreover, more indicators (e.g., generalizability), in addition to R² and RMSE, could be introduced to evaluate the models more thoroughly. ML models have many hyperparameters, but only one or two of them were tuned in most studies, and it is unknown whether the others were tuned or left at their default values.
(iii) Model application. Few studies focused on ML model optimization for enhanced thermochemical treatment. On the other hand, some studies only reported the predictive performance but with no interpretation of the ML model, particularly research based on ANN. In addition, the exploration of thermochemical conversion mechanisms based on ML model interpretation is rare.

Practical implications of this review
This review summarized and compared the up-to-date research in both machine-learning-aided wet and dry thermochemical treatment of biomass. In addition, the ML schemes, as well as strategies and descriptors of the input and output features in thermochemical processes, were also introduced. This study would make a significant practical contribution to the research and application work of the thermochemical treatment of biomass.
First, researchers can find the state of the art, the major and most influential journals, and the funding agencies of the ML-aided thermochemical treatment of biomass from this review. The summary and comparison of characterizations of biomass, technologies, and products would help interested researchers to have a deeper understanding of the field of biomass thermochemical conversion.
Second, ML schemes and algorithms were introduced in this review, which would be very useful for new researchers interested in carrying out studies on ML-aided thermochemical conversion of biomass. Moreover, the discussion of the application of ML for predicting the yields, compositions, and properties of products from wet and dry thermochemical treatments of biomass can provide new inspiration and guidance for researchers, which would be meaningful and helpful for the rapid development of the research area.
Third, the limitations of the present studies on the ML-aided thermochemical treatment of biomass have also been overviewed, and the major challenges and perspectives were put forward, which could shed light on bridging the major gaps between the studies and real-world needs.

Improving the predictive performance of models
Predictive performance is the top priority of ML studies because it is the basis of interpretation, optimization, and application (monitoring, controlling, etc.). Data availability, consistency, and accuracy are vital for predictive performance. Including more cases to enlarge the dataset (several hundred data points or more is preferable) and introducing new input descriptors are effective approaches to increase the data dimension; such descriptors include subunit compositions of lignin, cellulose/hemicellulose, and protein (as detailed in Section 2.1); image (Ögren et al., 2018) and color of products (Li et al., 2018a); category descriptors for biomass or thermochemical processes (Ascher et al., 2022a); biomass ash compositions (Yan et al., 2020); and molecular simulation results from model biomass (Freitas et al., 2022). Interpolation of missing data in a dataset is meaningful for ensuring data availability; interpolation by algorithms directly (Sun et al., 2022) and by building another ML model (Palansooriya et al., 2022) are both effective methods. Additionally, redundant input variables identified during ML modeling or feature analysis can be removed from the dataset for better prediction performance. Data consistency requires the data to be generated under the same or highly comparable conditions. For example, the consistency of the calculation equations for the input and output variables is the first thing to note. Researchers are advised to carefully check whether the elemental compositions of biomass and char, as well as the yields of all products, are calculated on a dry, DAF, or other basis. In addition, thermochemical reactors and their configurations may differ greatly between studies, and category descriptors for reactors should be listed as variables to overcome such inconsistencies. Currently, data are collected indiscriminately from references in most studies, and the accuracy of the data is not considered. Future studies may explore effective methods to exploit only accurate data for ML modeling. For example, data from modeling, e.g., data from Aspen Plus process modeling (Sezer and Özveren, 2021), should be validated before use in ML modeling.
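As one concrete instance of the basis check recommended above, an elemental content reported on a dry basis can be converted to a dry-ash-free (DAF) basis by dividing by the ash-free fraction of the dry sample. The sketch below assumes that the ash content is itself reported on a dry basis; the function names and the example values are hypothetical. Product yields need a separate treatment because the ash in both feedstock and product enters the balance, so they are not covered here.

```python
def dry_to_daf(value_dry, ash_dry):
    """Convert an elemental content from dry basis to dry-ash-free (DAF) basis.
    Both arguments are in wt.% on a dry basis."""
    if not 0.0 <= ash_dry < 100.0:
        raise ValueError("ash content must be in [0, 100) wt.%")
    return value_dry * 100.0 / (100.0 - ash_dry)

def daf_to_dry(value_daf, ash_dry):
    """Inverse conversion: DAF basis back to dry basis."""
    if not 0.0 <= ash_dry < 100.0:
        raise ValueError("ash content must be in [0, 100) wt.%")
    return value_daf * (100.0 - ash_dry) / 100.0

# Hypothetical high-ash feedstock: 30 wt.% C and 40 wt.% ash, both on a dry basis.
carbon_daf = dry_to_daf(30.0, 40.0)
print(f"C = {carbon_daf:.1f} wt.% (DAF)")  # C = 50.0 wt.% (DAF)
```

For such a high-ash sample, the same carbon content differs by 20 percentage points between the two bases, so mixing bases within one dataset would introduce errors far larger than the experimental uncertainty.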
Model screening, feature selection, and model hyperparameter tuning are key to predictive performance. Current studies mainly screen ML models based on evaluation metrics such as R² and RMSE because most researchers in this area are not from computer science, and many are not truly familiar with the working mechanisms of ML algorithms. Collaboration with peers in computer science, and model screening based on the characteristics of each ML technique that suit a given problem, are encouraged. Suitable features and hyperparameters should be selected based on the domain expertise of the collaborators in thermochemical treatment and computer science. Additionally, optimization algorithms can be used before modeling to screen feature pairs or during modeling to obtain the optimal hyperparameters, which is user-friendly for non-expert ML users. Genetic algorithms and particle swarm optimization (PSO) are more commonly used in these optimizations than other algorithms, such as the Rao algorithm, sine cosine algorithm, and grey wolf optimization. When tuning the hyperparameters, cross-validation should be used, and the number of folds may have a considerable effect on the RMSE; for example, the test RMSE was reduced from 8.43 to 8.07 when the fold number increased from 10 to 100. However, some of the reviewed studies did not use cross-validation, which can result in overfitting because the optimum hyperparameters were most probably obtained by chance, although running trial-and-error modeling several times may be beneficial for increasing accuracy.
For multi-target ML, optimizing the weight percent of each target (generally treated equally in most studies) can also balance the ML to obtain preferable predictive performance for all studied targets. Finally, advanced algorithms and modeling techniques such as deep learning (Lecun et al., 2015) can be used to improve predictive performance, especially for cases with a large amount of data.
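Hyperparameter selection by k-fold cross-validation, as recommended above, can be sketched in a few lines. A one-dimensional k-nearest-neighbour regressor serves only as a simple stand-in for the algorithms used in the reviewed studies, and the synthetic data, the candidate values of k, and the fold count are assumptions. Each candidate is scored by its mean validation RMSE over the folds, so the chosen value does not depend on a single lucky train/test split.

```python
import math
import random

def knn_predict(train, x, k):
    """1-D k-nearest-neighbour regression: mean target of the k closest points."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cv_rmse(data, k, n_folds=5):
    """Mean validation RMSE of a k-NN model over n_folds folds."""
    fold_rmse = []
    for f in range(n_folds):
        valid = data[f::n_folds]  # every n_folds-th sample, starting at offset f
        train = [p for i, p in enumerate(data) if i % n_folds != f]
        sq = [(knn_predict(train, x, k) - y) ** 2 for x, y in valid]
        fold_rmse.append(math.sqrt(sum(sq) / len(sq)))
    return sum(fold_rmse) / n_folds

rng = random.Random(0)
# Synthetic data: yield (wt.%) versus temperature (deg C) with Gaussian noise.
data = [(t, 50.0 - 0.05 * (t - 300.0) + rng.gauss(0.0, 2.0))
        for t in (rng.uniform(250.0, 450.0) for _ in range(60))]

# Score each candidate hyperparameter value and keep the one with the lowest CV RMSE.
scores = {k: cv_rmse(data, k) for k in (1, 3, 5, 9, 15)}
best_k = min(scores, key=scores.get)
for k, s in scores.items():
    print(f"k = {k:2d}: CV RMSE = {s:.2f}")
print("selected k:", best_k)
```

A final check on data held out from the whole tuning procedure would still be needed to report an unbiased test error.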

Increasing model generalizability
An ML model cannot be evaluated simply by R² and RMSE; other indicators, such as the model's generalizability, are also important, and the trade-off between predictive performance and generalizability should be considered. A model with good predictive performance (high R² and low RMSE) does not necessarily have high generalizability. A model built based on a specific biomass type or thermochemical parameters, such as a specific thermochemical reactor, is likely to work only within this specific condition; it may not have the generalizability to predict under other conditions. For example, even if the number of data points is higher than 1000 and the R² of the predictive model is higher than 0.95, models built based on one or two biomasses in a particular gasifier cannot be used to accurately predict other gasification processes. To obtain good generalizability, the coverage of the descriptors, amount of data in the dataset, and data distribution should be assessed carefully, with data distribution being the most important; a poor data distribution, e.g., data of limited or biased coverage, would lead to poor generalizability. Creating highly generalizable models suitable across a wide range of feedstocks as well as thermochemical parameters and ranges should be promoted in the future. Integrating dry and wet thermochemical treatment datasets to predict the yields, compositions, and properties of the char/oil/gas/aqueous phases without differentiating products from dry or wet processes may be a promising direction for testing. In addition, future ML models should be built with the extrapolative ability to explore the "unseen" space, such as the ML-aided discovery of new materials and chemicals (Butler et al., 2018), which is challenging but of high priority.
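One simple way to probe generalizability, offered here as an assumption rather than a method reported in the reviewed studies, is to hold out all samples of one group (for example, one feedstock or one reactor) at a time instead of splitting the data at random. The sketch below only generates the index splits; the group labels are hypothetical, and any model can then be trained on the training indices and scored on the held-out group.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) so that every
    sample of one group (e.g., one feedstock or reactor) is held out together."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test

# Hypothetical feedstock label for each sample in a dataset.
feedstock = ["wood", "wood", "straw", "algae", "straw", "algae", "wood"]
splits = list(leave_one_group_out(feedstock))
for name, train_idx, test_idx in splits:
    print(f"held out: {name:5s} train = {train_idx} test = {test_idx}")
```

A large gap between the random-split error and the held-out-group error would indicate that the model has learned the particular feedstocks or reactors in the dataset rather than transferable relationships.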

Increasing model interpretability and aiding thermochemical conversion mechanistic studies
Model interpretation results can not only be used to understand the fundamentals behind ML model-based decision-making but also be applied to guide thermochemical treatment mechanistic studies. Through feature ranking and PDP, the effects of biomass compositions and thermochemical parameters on a target can be understood in a rational manner. Mechanistic studies can be conducted for screened cases that are indicative of the connections between biomass compositions/thermochemical parameters and the yields, compositions, and properties of oil, char, gas, and aqueous phases. For some features, such as model biomass chemicals, catalysts, solvents, or additives that are composed of specific chemicals or elements, their molecular modeling data can be used directly as input features to understand how the structures, compositions, or properties of these features affect the yields, compositions, and properties of the oil, char, gas, and aqueous phases or the biomass thermochemical conversion behavior. In addition, more advanced and promising interpretation algorithms can be developed to help understand the connections between inputs and outputs. However, model complexity and interpretability should be balanced because increasing interpretability can result in higher structural complexity of ML.

Enhancing the real-world application of ML models
The ultimate target of ML studies is the real-world application of ML models. Current studies mainly concentrate on predicting the yields, compositions, and properties of thermochemical products; future studies should focus more on ML optimization. For example, thermochemical products should be engineered by integrating forward and reverse optimizations with the application performance of thermochemical products in different areas. During engineering, the ML model targets (e.g., yields, compositions, and properties of the oil/char/gas/aqueous products) should be screened based on the effects of the compositions and properties of the oil/char/gas/aqueous products on application performance. Therefore, the relationships between the properties/structures of the oil/char/gas/aqueous products and their application performance should be understood first, preferably by ML (e.g., those shown in Table 4). The main descriptors of oil/char/gas/aqueous products determining the application performance will be used as targets in thermochemical process ML models, with biomass compositions and thermochemical parameters as inputs. Therefore, the ML models in Table 4 can be integrated with the models presented in Section 4 to guide the production of smart products (see Fig. 8).
For example, the main descriptors determining CO2 adsorption capacity, namely SSA, total pore volume, contents of N and O, and mesopore volume, can be optimized within the as-built thermochemical treatment-for-biochar production prediction ML models to obtain optimal biochar production parameters, produce smart biochar, and achieve the highest CO2 adsorption capacity. The engineering of oil/char/gas/aqueous products in other areas, such as those in Table 5, can also be conducted in this manner. However, no studies have yet been conducted in this area.
In comparison to the prediction of the exact yields, compositions, and properties, the classification of the oil/char/gas/aqueous products according to different applications, such as the classification of the slagging degree of char (Bi et al., 2023) and carbon stability level of char (Leng et al., 2019b), can be useful for the application of these products. ML can also be used to optimize computational parameters in other computational models, such as CFD and kinetic models, to indirectly aid thermochemical treatment. In addition to solving prediction and optimization problems, the ML model can be used for classification and control. Examples include identifying and classifying images of oil/char/gas/aqueous products from biomass thermochemical treatment processes or images from computational modeling for advanced predictions and monitoring. However, few studies have been conducted in this direction. There are other considerations when applying the ML model to the real world (Meena et al., 2021). For example, computational cost and efficiency (computation time) are vital if the model is used for online monitoring and control. Only a few studies have recorded the computation times of developed models. One study reported that the MLP-NN model was approximately three times faster than the adaptive neuro-fuzzy inference system (ANFIS) model. Another concern is whether ML models are reliable and efficient enough (uncertainty quantifiable and acceptable) to guide and replace human-expert decision-making.

Promoting data and model sharing in the community
Sharing data and as-built models in published papers should be encouraged. Large-scale and high-quality databases may be built by researchers in this community, thus facilitating high-quality ML studies. Additionally, model sharing allows the models to be evaluated, used, and even rebuilt by others to promote the development of this area more effectively. Simple offline apps and online GUIs have been developed by some researchers (Leng et al., 2022c) and can be adopted by others.

Conclusions
General ML schemes and strategies were summarized in this review. Descriptors for the input and output features in the ML models for dry and wet thermochemical processes are similar, and predictive performance is preferable. The predictive performance for the yields of oil/char/gas/aqueous phases in modeling dry thermochemical processes is better than that of wet processes, while an inverse trend was observed for predicting the product compositions. The interpretation of the ML model indicates the key features affecting the yields, compositions, and properties of oil/char/gas/aqueous products, which can be useful in guiding future experimental studies on biomass thermochemical treatment. Improving predictive performance, increasing model generalizability, increasing model interpretability, aiding mechanistic studies, enhancing the real-world application of ML models in various areas, and sharing data and as-built models in the community are the frontiers of future investigations to bring ML to the next stage. In the near future, the development and research of biomass thermochemical treatment processes are envisaged to be accelerated by ML-aided prediction of yields, compositions, and properties of oil/char/gas/aqueous products, thermochemical conversion behavior and kinetics, as well as the characterization and application performance of different biomass products in various areas, in addition to ML-aided optimization, monitoring, and control of the thermochemical processes.