To control and reduce environmental pollution effectively, we must first understand the sources, distribution, transformation, and toxic effects of pollutants, and only then carry out pollution control or remediation. This requires predicting the environmental effects of chemicals, which is an important means of preventing and controlling the risks of hazardous chemicals and a major direction of current toxicology research. Predictions of environmental biological effects have played an irreplaceable role in major environmental issues such as the 2005 Songhua River pollution incident and the causes of atmospheric haze.
Research on and applications of nanomaterials are developing rapidly in electronics and machinery, medicine, the chemical industry, energy, the environment, and many other fields. However, predicting the environmental effects of nanomaterials still suffers from a lack of databases, weak model generalization, and missing environmental transformation scenarios, which hampers national efforts to prevent and control the risks of nanomaterials.
Recently, the team of Professor Hu Xiangang at the College of Environmental Science and Engineering, Nankai University, made a breakthrough in developing machine learning algorithms that predict the biological effects of nanomaterials and, by enhancing the interpretability of machine learning, explore the mechanisms behind those effects. The work offers a new research approach to the problems above. On May 26, the paper "Deep exploration of random forest model boosts the interpretability of machine learning studies of complicated immune responses and lung burden of nanoparticles" was published in the internationally renowned journal Science Advances.
Screenshot of the paper
Machine learning models are now widely used to predict the environmental biological effects of nanomaterials. However, because machine learning offers limited interpretability, it remains very difficult to use such models to reveal the mechanisms of complex nanotoxicology.
Building on their previous work (PNAS, 2020, 117, 10492-10499; ES&T, 2018, 52, 9666-9676), Prof. Hu Xiangang and his team created a database of nanomaterials and their biological effects, constructed a regression model linking the two, and proposed a tree-based random forest feature importance and feature interaction network analysis framework (TBRFA). Using a multi-indicator importance analysis, the framework overcomes the bias in feature importance analysis caused by small data sets. It also builds a feature interaction network from the working mechanism of the random forest to reveal potential interacting factors that affect the biological effects of nanomaterials.
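The idea of combining several importance indicators can be illustrated with a minimal sketch. It assumes a simple mean-rank aggregation, which is not necessarily the scheme used in TBRFA. The indicator names, feature names, and scores are all hypothetical and invented for illustration; they are not results from the paper.

```python
from statistics import mean


def rank_scores(scores):
    """Map each feature to its rank (1 = most important) under one indicator.

    Ties are broken by dictionary order; a real analysis would handle them explicitly.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def aggregate_importance(indicators):
    """Average each feature's rank across several importance indicators."""
    per_indicator_ranks = [rank_scores(scores) for scores in indicators.values()]
    features = per_indicator_ranks[0].keys()
    return {f: mean(ranks[f] for ranks in per_indicator_ranks) for f in features}


# Hypothetical scores (assumption: made-up numbers, not data from the study).
indicators = {
    "impurity_decrease": {"recovery_time": 0.40, "surface_area": 0.35, "size": 0.15, "charge": 0.10},
    "permutation": {"recovery_time": 0.30, "surface_area": 0.20, "size": 0.35, "charge": 0.15},
    "split_count": {"recovery_time": 12, "surface_area": 15, "size": 9, "charge": 4},
}

mean_ranks = aggregate_importance(indicators)
ranking = sorted(mean_ranks, key=mean_ranks.get)  # lowest mean rank first
print(ranking)
```

In this toy input no single indicator agrees with the others on the top feature, yet the averaged ranks still give one ordering, which is the kind of balancing a multi-indicator analysis aims for.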
The TBRFA framework
The TBRFA framework comprises importance analysis and feature interaction network analysis. The importance analysis uses multiple importance indicators to balance the bias of a traditional single indicator, and it identified exposure recovery time, specific surface area, and material size as important factors affecting the biological effects induced by nanomaterials. By analyzing the tree structure of the random forest, the feature interaction network analysis calculates an interaction coefficient between each pair of features. It identified that specific surface area and surface charge, specific surface area and length, and length and diameter constrain and influence one another in inducing biological effects.
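How a pairwise interaction score might be read off tree structure can be sketched as follows. This is only a simple proxy under stated assumptions: it counts how often two different features appear as directly adjacent splits (parent and child) across a forest and normalizes the counts. The article does not give the definition of the TBRFA interaction coefficient, so the formula, the tuple-based tree representation, and the toy forest below are all hypothetical.

```python
from collections import Counter

# Assumed representation: a split node is (feature_name, left_child, right_child);
# a leaf is None.


def parent_child_pairs(node):
    """Yield (parent_feature, child_feature) for every pair of adjacent splits."""
    if node is None:
        return
    feature, left, right = node
    for child in (left, right):
        if child is not None:
            yield feature, child[0]
            yield from parent_child_pairs(child)


def interaction_scores(forest):
    """Share of adjacent-split pairs contributed by each unordered feature pair."""
    counts = Counter()
    for tree in forest:
        for parent, child in parent_child_pairs(tree):
            if parent != child:  # ignore a feature splitting directly below itself
                counts[frozenset((parent, child))] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tuple(sorted(pair)): n / total for pair, n in counts.items()}


# Toy forest (assumption: hand-written trees, not fitted to any data).
forest = [
    ("surface_area", ("charge", None, None), ("length", None, ("diameter", None, None))),
    ("length", ("diameter", None, None), ("surface_area", ("charge", None, None), None)),
    ("surface_area", ("charge", None, None), None),
]

scores = interaction_scores(forest)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

The resulting dictionary can be treated as a weighted edge list, with features as nodes and scores as edge weights, which is one straightforward way to draw a feature interaction network.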
Professor Hu Xiangang said that the study offers guidance for developing environmentally friendly nanomaterials and will provide new strategies for assessing the ecological and environmental safety of nanomaterials. The machine learning algorithm is suitable not only for analyzing the environmental effects of nanomaterials but also for predicting and evaluating the environmental biological effects of other pollutants, such as heavy metals and organic pollutants.
Nankai University is the sole institution credited with this paper. Yu Fubo, a PhD candidate at Nankai University, is the first author, and Professor Hu Xiangang is the corresponding author. The research was supported by the National Natural Science Foundation of China (NSFC), the National Key Research and Development Program, and the Outstanding Youth Fund of Tianjin Science and Technology Bureau.