Scientists wonder whether evolution could have taken a different path: whether the emergence of human beings was inevitable, or whether we are the product of a series of natural events that might never have happened, in which case the world would look very different. There is no definitive answer, but today artificial intelligence (AI) can run evolutionary experiments. One of them, published this week in the journal Science, reveals that in the design of one type of protein there were other possible routes that nature did not explore. The technology could also provide valuable clues for creating new therapies and other applications.
In his 1989 book Wonderful Life, evolutionary biologist Stephen Jay Gould posed a thought experiment: if the tape of the evolution of life on Earth could be rewound to the beginning and played again, would the result be the same as the one we know, or a completely different one? Gould argued for the second: in a new game, to borrow a simile from video games, evolution would have taken a very different path and humans would not exist. “Replay the tape a million times… and I doubt anything like Homo sapiens could evolve again,” he wrote.
Gould’s thesis has been widely debated ever since, with some arguing for determinism and others defending contingency. In his 1952 story A Sound of Thunder, science fiction author Ray Bradbury told how a time traveler who stepped on a butterfly in the age of the dinosaurs changed the course of the future. Gould expressed the same idea: “Alter any early event, even very slightly and of no apparent importance at the time, and evolution flows into an entirely different channel.”
Speaking the language of proteins
Scientists have investigated this problem through experiments that try to recreate evolution in the laboratory or in nature, or by comparing species that have arisen under similar conditions. Today there is a new way: AI. In New York, a group of former researchers at Meta, the parent company of Facebook, Instagram and WhatsApp, founded EvolutionaryScale, an AI startup focused on biology. The company's ESM3 system (EvolutionaryScale Model 3) is a generative language model, the same type of system as the well-known ChatGPT, except that ESM3 generates not text but proteins, the fundamental building blocks of life.
ESM3 is fed sequence, structure and function data from existing proteins so that it learns the biological language of these molecules and can create new ones. Its creators trained it on 771 billion tokens (the units of data a language model reads) derived from 3.15 billion sequences, 236 million structures and 539 million functional annotations. Training required more than one trillion teraFLOPs of computation, that is, more than 10^24 floating-point operations, which the company itself describes as the most computing power ever used in biology.
“ESM3 takes a step towards a future of biology, where AI is a tool to build from first principles, the way we build structures, machines and microchips,” says Alexander Rives, co-founder and chief scientist of EvolutionaryScale, who led the new study. His view is that biology is the most advanced technology ever created and that it is programmable, since it uses a common alphabet: the genetic code, which is translated into amino acids, the building blocks of proteins. “ESM3 understands all this biological data, translates it and speaks it fluently to use it as a generative tool.”
The protein that never was
Rives and his collaborators applied ESM3 to the problem of creating a new green fluorescent protein (GFP). GFP is a natural protein that glows green under ultraviolet light and is used in research as a marker. The first was discovered in a jellyfish, but other versions exist in corals and anemones. The scientists used ESM3 to create a new GFP, and the result surprised them: a fluorescent protein, which they named esmGFP, whose sequence is only 58% similar to that of the closest known fluorescent protein. According to the researchers, that distance is equivalent to simulating 500 million years of evolution. ESM3 is now available to the scientific community as a tool for designing new proteins for therapeutic, environmental remediation and other uses.
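To make a figure such as "58% similar" more concrete, here is a minimal sketch of how the percentage of matching positions between two protein sequences can be computed. It assumes the two sequences have already been aligned to the same length, with "-" marking gaps, and it counts only positions where neither sequence has a gap; other conventions for the denominator exist, and this is not necessarily the measure used in the study. The function name and the toy sequences are hypothetical, chosen only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of compared alignment columns where both residues match.

    Assumption: inputs are pre-aligned strings of equal length, with '-'
    marking a gap. Columns containing a gap are skipped entirely.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Toy 10-residue fragments (hypothetical, not real GFP data):
# they differ at 2 of 10 positions.
toy_a = "MSKGEELFTG"
toy_b = "MSKGAELFSG"
print(percent_identity(toy_a, toy_b))  # 80.0
```

By this kind of measure, a value of 58% would mean that roughly four out of every ten compared positions differ.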
AI has thus found a path that nature could have taken 500 million years ago but, for unknown reasons, did not. Rives and his collaborators explain that just a few mutations can destroy GFP's fluorescence, and yet ESM3 has found a new space of fluorescent proteins that could have existed but never did: “Under these sequences there is a fundamental language of protein biology that can be understood using language models.”
According to Jonathan Losos, a professor at Washington University in St. Louis who works on the question of rewinding evolution by studying species in nature, “this study is a shining example that there are many ways evolution could have proceeded.” Losos sees the results as confirmation of the contingency defended by Gould. Zachary Blount shares this view. A professor at Michigan State University, he demonstrated the contingency of evolution in a famous bacterial culture experiment that his former supervisor, Richard Lenski, started in 1988 and that is still running after more than 80,000 generations.
“The study shows that there are viable biological possibilities that have not evolved (we believe) on Earth, suggesting genuine paths that evolution could have taken, but did not because the necessary history did not occur,” says Blount. He cautions that there is also some determinism in nature: in the ESM3 experiment, the new protein still shares 58% similarity with the closest known fluorescent protein. Blount doesn’t think AI will solve the problem of rewinding the tape, but he does believe it will help us understand what is contingent, what isn’t, and why: “It gives us ways to probe the realm of biological possibilities, allowing us to compare what is biologically possible with what exists or has existed.”