Machine Learning and Natural Language Processing Enable a Data-Oriented Experimental Design Approach for Producing Biochar and Hydrochar from Biomass

Abstract

Carbon functional materials (CFMs) such as biochar and hydrochar can be obtained from hundreds of biomass precursors, ranging from urban sludge to agricultural wastes. They can be produced through dozens of synthesis methods and post-synthesis processing steps, each tuned to specific conditions (e.g., temperature, time, and chemical concentration). To achieve a “rational design” platform for a system with a high-dimensional parameter space such as CFMs, we processed 10,975 scientific articles on the subject (published from 2000 to 2020) with automatic reading-interpreting-extracting computational routines (the a.RIX engine). The a.RIX engine automatically recognized more than a hundred precursors, among which wheat straw, rice husk, and rice straw were the most studied for CFM synthesis and for application in agriculture (e.g., as an amendment), as a fuel (energy generation), and as an adsorbent. Parameters related to CFM synthesis conditions, such as carbonization temperature and time, and parameters related to CFM properties, such as surface area and heavy metal adsorption capacity, can also be extracted from the articles. Comparing synthesis conditions across precursors indicated very little statistical difference in the carbonization temperatures and times used for different precursors. Essentially, precursors are carbonized at temperatures ranging from 100 to 900 °C for 30 min to 6 h using pyrolysis, hydrothermal carbonization, or gasification. When restricting the analysis to CFMs produced by pyrolysis (biochar), we observed that peanut shells can produce materials with higher surface areas than other precursors (p < 0.05).
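As an illustration of the kind of parameter extraction described above, the following minimal sketch pulls carbonization temperatures and times out of a sentence with regular expressions. The patterns, the function name `extract_conditions`, and the example sentence are assumptions made for illustration only; they are not the rules used by the a.RIX engine, which are not specified here.

```python
import re

# Hypothetical patterns (assumptions for illustration, not the a.RIX rules).
# Temperature: a number followed by an optional degree sign (or the letter "o",
# as often seen in extracted text) and a capital "C".
TEMP_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:°|º|o)?\s*C\b")
# Time: a number followed by a minute or hour unit.
TIME_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(minutes?|min|hours?|h)\b", re.IGNORECASE)


def extract_conditions(sentence):
    """Return temperatures (in degrees Celsius) and times (in minutes) found in a sentence."""
    temperatures = [float(m.group(1)) for m in TEMP_RE.finditer(sentence)]
    times = []
    for m in TIME_RE.finditer(sentence):
        value = float(m.group(1))
        unit = m.group(2).lower()
        # Convert hours to minutes so that all times share one unit.
        times.append(value * 60 if unit.startswith("h") else value)
    return {"temperature_C": temperatures, "time_min": times}


# Made-up example sentence, not taken from any processed article.
example = "Peanut shells were pyrolyzed at 500 °C for 2 h under nitrogen."
conditions = extract_conditions(example)
```

A real pipeline would likely need additional handling for ranges (e.g., "300-700 °C"), other units, and values reported in tables, none of which this sketch attempts.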
Correlating biochar synthesis conditions with biochar properties confirmed general trends: (i) the higher the carbonization temperature, the lower the H/C and O/C ratios, and (ii) the surface area can be increased by preserving a high degree of aromaticity (low H/C ratio) and a low oxidation level (low O/C ratio). However, a deeper understanding of the relation between CFM synthesis/post-synthesis methods and the resulting properties can only be achieved using clustering algorithms (e.g., k-means) and complex network analysis. The a.RIX engine groups articles describing optimized synthesis conditions and CFM properties (e.g., low carbonization temperature, short carbonization time, large surface area, and high adsorption capacity) and automatically recognizes the synthesis/post-synthesis steps used in these groups. The program recognized that precursors such as peanut shells can be converted into highly porous biochar through experimental routes such as “pyrolysis” -> “activation” -> “drying” -> “ashing” -> “washing” -> “filtration.” With this approach, we show that a non-computational review of scientific articles for materials with a very large parameter space such as CFMs is largely obsolete. Taken together, the results provide a powerful platform for data-oriented experimental design of CFMs produced from biomass.
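One simple way to represent experimental routes like the one quoted above as a network is to count how often one step directly follows another across a group of articles, which yields a weighted directed graph of synthesis and post-synthesis steps. The sketch below assumes this representation for illustration; the function name `transition_counts` and the second and third routes are hypothetical, and the actual network analysis performed by the a.RIX engine may differ.

```python
from collections import Counter


def transition_counts(routes):
    """Count directed step-to-step transitions across a list of routes.

    Each route is an ordered list of step names. The result maps
    (source_step, target_step) pairs to the number of times the target
    directly follows the source.
    """
    edges = Counter()
    for route in routes:
        for source, target in zip(route, route[1:]):
            edges[(source, target)] += 1
    return edges


# Only the first route comes from the abstract; the others are made up
# for illustration.
routes = [
    ["pyrolysis", "activation", "drying", "ashing", "washing", "filtration"],
    ["pyrolysis", "activation", "washing", "drying"],
    ["drying", "pyrolysis", "washing", "filtration"],
]

edges = transition_counts(routes)

# The heaviest edges indicate the most frequently reported step sequences.
most_common_edges = edges.most_common(2)
```

Edge weights computed this way could then be passed to a graph library for further analysis, although that step is outside the scope of this sketch.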