WoodZog

Artificial intelligence helps predict performance

Published on October 20, 2022

A Brazilian study published in Scientific Reports shows that artificial intelligence (AI) can be used to create efficient models for genomic selection of sugarcane and forage grass varieties and to predict their performance in the field based on their DNA.

Compared with traditional breeding techniques, the methodology, developed with support from FAPESP, improved predictive power by more than 50%. This is the first time that a highly efficient genomic selection method based on machine learning has been proposed for polyploid plants (in which cells have more than two complete sets of chromosomes), including the grasses studied.

Machine learning is a branch of AI and computer science that involves statistics and optimization, with numerous applications. Its main goal is to create algorithms that automatically extract patterns from data sets. It can be used to predict a plant's performance, including whether it will be resistant or tolerant to biotic stresses such as pests and diseases caused by insects, nematodes, fungi or bacteria, and to abiotic stresses such as cold, drought, salinity or insufficient soil nutrients.

Crossbreeding is the most commonly used technique in traditional breeding programs. “You create populations by crossing interesting plants. In the case of sugarcane, for example, you cross a variety that produces a lot of sugar with another that is more resistant. You cross them and then assess the performance of the resulting genotypes in the field,” said computer scientist Alexandre Hild Aono, lead author of the article published in Scientific Reports. Aono is a researcher at the Center for Molecular Biology and Genetic Engineering (CBMEG-UNICAMP) at the State University of Campinas. He graduated from the Federal University of São Paulo (UNIFESP).

“But this assessment process takes a long time and is very expensive. The method we propose can predict the performance of these plants even before they grow. Based on the genetic material, we managed to predict the yield. This is important because it saves many years of assessment,” Aono explained.

In the case of sugarcane, the challenge is especially complex. Traditional breeding takes between nine and 12 years and comes at a high cost, according to Anete Pereira de Souza, a professor of plant genetics at UNICAMP's Institute of Biology and Aono's PhD supervisor at CBMEG.

“When breeders identify an interesting plant, they multiply it by cloning so that the genotype is not lost, but that takes time and a lot of money. An extreme example is rubber tree breeding, which can take up to 30 years,” Souza said. One way to overcome these problems is what she called “plant breeding 4.0”, which makes heavy use of data analysis and highly efficient computational and statistical tools. Each genotyping-by-sequencing run can involve 1 billion sequences.

The biggest hurdle scientists face in trying to grow better varieties of polyploid plants like sugarcane and forage grass is the complexity of their genomes. “In this case, we didn't even know whether genomic selection would be possible, given the scarce resources and the difficulty of working with this complexity,” Aono said.

Methods:

The researchers started the genomic selection process with diploid plants [containing cells with two sets of chromosomes], because they are simpler. “The problem is that high-value tropical plants like sugarcane are not diploids, but polyploids, which is a complication,” Souza said.

While humans and almost all animals are diploid, sugarcane can contain as many as 12 copies of each chromosome. Each individual of the species Homo sapiens can have up to two variants of each gene, one inherited from the father and the other from the mother. Sugarcane is more complex because theoretically each gene can have many variants in the same individual. There are regions of its genome with six sets of chromosomes, others with eight, 10 and even 12 sets. “The genetics are so complex that breeders work with sugarcane as if it were diploid,” Souza said.
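To make the jump in complexity concrete, here is a minimal Python sketch. It is not taken from the study. It assumes a single biallelic SNP (one reference and one alternative allele) and counts how many distinct copy numbers, often called dosage classes, the alternative allele can take at each ploidy level mentioned above. The function name `dosage_classes` is hypothetical.

```python
def dosage_classes(ploidy: int) -> list[int]:
    """Return every possible copy count of the alternative allele
    at one biallelic SNP, for an organism with the given ploidy."""
    if ploidy < 1:
        raise ValueError("ploidy must be a positive integer")
    # With `ploidy` chromosome copies, the alternative allele can be
    # present on none of them, on all of them, or anywhere in between.
    return list(range(ploidy + 1))


# Ploidy levels mentioned in the article: diploid, then 6, 8, 10 and 12 sets.
for ploidy in (2, 6, 8, 10, 12):
    classes = dosage_classes(ploidy)
    print(f"ploidy {ploidy:>2}: {len(classes)} dosage classes {classes}")
```

Under this assumption a diploid has three possible genotypes per SNP, while a region with 12 chromosome sets has 13, and real sugarcane loci can carry more than two alleles, which widens the space further.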

In 2001, Theodorus Meuwissen, a Dutch scientist who is currently a professor of animal breeding and genetics at the Norwegian University of Life Sciences (NMBU), proposed genomic selection to predict complex traits of animals and plants in association with their phenotypes (observable traits resulting from the interaction of their genotypes with the environment). The advantage of this approach in plant breeding is that it links phenotypic traits of interest, such as yield, sugar level or precocity, to single nucleotide polymorphisms (SNPs). A “snip” (as SNP is pronounced) is a genomic variant located at a single base position in the DNA, Souza explained.

“It's the difference in the genome of two individuals. For example, one can have an A [corresponding to the nucleotide adenine] and yield slightly more than another with a G [guanine] at the same place in the genome. That changes everything,” she said. “If you find an association between what you're looking for, such as high sugar production, and specific SNPs at different locations in the genome, you can sequence only the population that your breeding work is targeting.”
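The A versus G comparison in the quote can be sketched in a few lines of Python. This is an illustration only: the yield values are invented, the helper `mean_yield_by_allele` is hypothetical, and a real association study would test many SNPs with proper statistics rather than compare two group means.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (allele carried at one SNP position, yield in arbitrary units).
observations = [
    ("A", 10.4), ("A", 11.0), ("A", 10.7),
    ("G", 9.8), ("G", 10.1), ("G", 9.9),
]


def mean_yield_by_allele(records):
    """Group yields by the allele carried at the SNP and average each group."""
    groups = defaultdict(list)
    for allele, value in records:
        groups[allele].append(value)
    return {allele: mean(values) for allele, values in groups.items()}


means = mean_yield_by_allele(observations)
difference = means["A"] - means["G"]
print(f"mean yield with A: {means['A']:.2f}")
print(f"mean yield with G: {means['G']:.2f}")
print(f"difference (A - G): {difference:.2f}")
```

A consistent difference of this kind across many plants is what marks a SNP as potentially associated with the trait.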

The advances suggested by Aono and colleagues eliminate the need for planting and phenotyping throughout the growing cycle. “We are doing field experiments in the early stages of the program to obtain the phenotype of interest for each clone,” Souza said. “At the same time, we sequence all clones in the breeding population in a fairly straightforward way, without needing the whole genome for each clone. This is called genotyping-by-sequencing: partial sequencing that looks for the differences and similarities in the base pairs of the clones and their association with the production of each clone. The association between phenotype and genome shows which clone produces more and which SNPs are associated with higher production. In this way, we can identify clones with a high proportion of SNPs that contribute to the higher production observed in the first experiments and obtain the most productive variety faster and cheaper.”
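The workflow Souza describes, learning SNP effects from clones that were both genotyped and field-tested, then ranking untested clones from their genotypes alone, can be sketched as follows. This is not the machine learning pipeline from the study. It swaps in ridge regression, a common baseline for genomic selection, written in plain Python. All genotypes, yields, clone names and the `penalty` value are invented for illustration, and dosages are coded 0 to 2 as if the plants were diploid.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ridge(genotypes, phenotypes, penalty=1.0):
    """Estimate one effect per SNP by ridge regression on centered data."""
    n, p = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    x = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    y = [value - y_mean for value in phenotypes]
    # Normal equations: (X'X + penalty * I) effects = X'y
    xtx = [[sum(x[i][j] * x[i][k] for i in range(n)) + (penalty if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(x[i][j] * y[i] for i in range(n)) for j in range(p)]
    return solve(xtx, xty), col_means, y_mean


def predict(genotype, effects, col_means, y_mean):
    """Predict a phenotype from SNP dosages alone."""
    return y_mean + sum(e * (g - m) for e, g, m in zip(effects, genotype, col_means))


# Hypothetical training clones: SNP dosages (rows) and field-measured yields.
training_genotypes = [[2, 0, 1], [1, 1, 0], [0, 2, 1], [2, 1, 2], [0, 0, 0], [1, 2, 2]]
training_yields = [61.0, 55.0, 51.0, 62.0, 50.0, 57.0]

# Hypothetical clones that were genotyped but never grown in the field.
candidates = {"C1": [2, 2, 2], "C2": [0, 1, 0], "C3": [1, 0, 1]}

effects, col_means, y_mean = fit_ridge(training_genotypes, training_yields)
scores = {name: predict(g, effects, col_means, y_mean) for name, g in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: predicted yield {scores[name]:.2f}")
```

The ridge penalty shrinks every SNP effect toward zero, which keeps the estimates stable when markers far outnumber field-tested clones, the usual situation in genomic selection.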

The project has been successful thanks to many years of collaboration with scientists from different research institutions and universities, such as the Luiz de Queiroz College of Agriculture of the University of São Paulo (ESALQ-USP), the UNIFESP Institute of Science and Technology, the Campinas Agronomic Institute (IAC) and the sugarcane center in Ribeirão Preto, the Beef Cattle Unit of the Brazilian Agricultural Research Corporation (EMBRAPA) in Campo Grande, Mato Grosso do Sul state, the Aeronautical Technology Institute (ITA) in São José dos Campos, São Paulo state, and the University of Edinburgh's Roslin Institute in the United Kingdom.

About São Paulo Research Foundation (FAPESP)

The São Paulo Research Foundation (FAPESP) is a public institution whose mission is to support scientific research in all fields of knowledge by awarding grants and fellowships to researchers affiliated with higher education and research institutions in the State of São Paulo, Brazil. FAPESP is aware that the very best research can only be done by collaborating internationally with the best researchers. Therefore, it has established partnerships with funding agencies, higher education institutions, private companies and research organizations in other countries known for the quality of their research, and has encouraged scientists funded by its grants to further develop their international collaboration. You can learn more about FAPESP at www.fapesp.br/en and visit the FAPESP news agency at www.agencia.fapesp.br/en to stay informed about the latest scientific breakthroughs that FAPESP is helping to achieve through its many programs, awards and research centers. You can also subscribe to the FAPESP news agency at http://agencia.fapesp.br/subscribe