Combining plant biology and machine learning let researchers sort through tens of thousands of genes to determine which make specialized metabolites.
Researchers have combined plant biology and machine learning to sort through tens of thousands of genes and determine which ones make specialized metabolites.
Some metabolites attract pollinators while others repel pests. Ever wonder why deer eat tulips and not daffodils? It’s because daffodils have metabolites to fend off the critters who’d dine on them.
The results could lead to improved plants, as well as to the development of plant-based pharmaceuticals and environmentally safe pesticides, says Shin-Han Shiu, a plant computational biologist at Michigan State University.
“Plants are amazing—they are their own mini factories, and we wanted to recreate what they do in a lab to produce synthetic chemicals to make drugs, disease-resistant crops and even artificial flavors,” Shiu says.
“Our research found that it is possible to pick out the right gene by automating the process since machines are more capable of picking out minute differences among thousands of genes.”
Taking a machine-learning approach, an interdisciplinary team of biochemists and computational biologists created a model that looked at more than 30,000 genes in Arabidopsis thaliana, a small flowering plant that is called the “lab rat of plant science.”
The model is based on technology that e-commerce companies use to forecast consumer behavior and create targeted advertising, such as the ads that appear on a person’s Facebook page. Basically, the technology sorts through thousands of ads and, based on a user’s previous online behavior, selects those geared toward that user’s interests and activities.
In the plant study, scientists wrote a program that sorted through the more than 30,000 genes to home in on the ones related to making specialized metabolites.
“Machine learning was a novel approach for us in plant biology, a new application of tools widely used in other fields,” Shiu says.
“The model we created with machine learning can now be applied to other plant species that produce medicinally or industrially useful compounds to speed up the process of discovering the genes responsible for their production.”
“We’ve known for a long time that plants make a wealth of useful, valuable compounds, but this work really throws open that treasure chest in important new ways,” says Clifford Weil, a program director in the National Science Foundation’s Plant Genome Research Program, which funded the research. “It’s a great advance in how, and how well, we can explore nature’s most-creative biofactories.”
The research appears in Proceedings of the National Academy of Sciences. Additional researchers who contributed to the project are from Michigan State University and the University of Michigan.
Source: Michigan State University