Research, Health News, ET HealthWorld

AI technology creates new proteins from scratch: research

Washington: Scientists have developed an AI system that can create artificial enzymes from scratch. In laboratory experiments, some of these enzymes worked as well as enzymes found in nature, even though their artificially created amino acid sequences differed significantly from known natural proteins.

The experiment shows that natural language processing, although it was developed to read and write language text, can learn at least some of the underlying principles of biology. Salesforce Research developed the AI program, called ProGen, which uses next-token prediction to assemble amino acid sequences into artificial proteins.
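
To make the idea of next-token prediction concrete, here is a minimal Python sketch. It is not ProGen: it swaps the large language model for a toy bigram counter, and the training strings, the `^` and `$` markers, and the function names are all made up for illustration. It only shows the general loop of predicting one residue at a time from what came before.

```python
import random
from collections import Counter, defaultdict

START = "^"  # hypothetical start-of-sequence marker
END = "$"    # hypothetical end-of-sequence marker


def train_bigram(sequences):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for seq in sequences:
        prev = START
        for token in seq + END:
            counts[prev][token] += 1
            prev = token
    return counts


def generate(counts, rng, max_len=50):
    """Build a sequence by repeatedly sampling the next token."""
    out = []
    prev = START
    while len(out) < max_len:
        options = counts[prev]
        tokens = list(options)
        weights = [options[t] for t in tokens]
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if token == END:
            break
        out.append(token)
        prev = token
    return "".join(out)


# Made-up toy "proteins" in one-letter amino acid codes (not real lysozymes).
toy_training_set = ["MKAL", "MKVL"]
model = train_bigram(toy_training_set)
print(generate(model, random.Random(0)))
```

A real model conditions on the whole preceding sequence rather than only the last residue, which is what lets it pick up longer-range patterns.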

Scientists say the new technique could be more powerful than directed evolution, the Nobel Prize-winning protein design technique, and will give a boost to the 50-year-old field of protein engineering by accelerating the development of new proteins that can be used for almost anything, from therapeutics to degrading plastic.

“The artificial designs perform far better than designs inspired by the evolutionary process,” said Dr. James Fraser, professor of bioengineering and therapeutic sciences at the UCSF School of Pharmacy and an author of the work, published in Nature Biotechnology.

An earlier version of the paper has been available on the preprint server bioRxiv since July 2021, where it received dozens of citations before publication in a peer-reviewed journal.

“The language model is learning aspects of evolution, but it is different from the normal evolutionary process,” Fraser said. “We can now tune the generation of these properties for specific effects. For example, an enzyme that is incredibly thermostable, or likes acidic environments, or won’t interact with other proteins.”

To create the model, the scientists fed the amino acid sequences of 280 million proteins of all kinds into a machine learning model and let it digest the information over several weeks. Then they fine-tuned the model with 56,000 sequences from five lysozyme families, along with some contextual information about these proteins.

The model quickly generated one million sequences, and the research team selected 100 to test based on how closely they resembled the sequences of natural proteins and on how natural the AI proteins’ underlying amino acid “grammar” and “semantics” were.

From this first batch of 100 proteins, which were screened in vitro by Tierra Biosciences, the team made five artificial proteins to test in cells and compared their activity to that of an enzyme found in chicken egg whites, known as hen egg white lysozyme (HEWL). Similar lysozymes are found in human tears, saliva and milk, where they defend against bacteria and fungi.

Two of the artificial enzymes were able to break down bacterial cell walls with activity comparable to HEWL, yet their sequences were only about 18% identical to each other. The two sequences were about 90% and 70% identical, respectively, to any known protein.
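
The percentages above are sequence identities. As a rough illustration of how such a figure can be computed, the Python sketch below counts matching positions between two sequences that have already been aligned. The article does not say which definition or alignment method the team used, so the gap handling here (skipping any column that contains a `-`) and the example strings are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns in which both residues match.

    Assumes the two inputs are already aligned to equal length, with
    '-' marking gaps. Other definitions use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Made-up aligned fragments: 4 of the 5 gap-free columns match.
print(percent_identity("MKAL-V", "MRALGV"))  # 80.0
```

In practice the alignment step itself matters as much as the counting, and dedicated alignment tools are normally used to produce the aligned strings.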

A single mutation in a natural protein can make it stop working, but in another round of screening the team found that AI-generated enzymes showed activity even when as little as 31.4% of their sequence resembled any known natural protein.

The AI was even able to learn how the enzymes should be shaped simply by studying the raw sequence data. Measured with X-ray crystallography, the atomic structures of the artificial proteins looked just as they should, although their sequences were like nothing seen before.

Salesforce Research developed ProGen in 2020 based on a type of natural language programming originally developed by researchers to generate English text.

They knew from their previous work that AI systems could teach themselves grammar, the meaning of words, and other basic rules for writing well.

“When you train sequence-based models with lots of data, they are really powerful at learning structure and rules,” said Dr. Nikhil Naik, AI research lead at Salesforce Research and senior author of the paper. “They learn which words can co-occur, and also compositionality.”

For proteins, the design choices were almost limitless. Lysozymes are small as proteins go, with up to about 300 amino acids. But with 20 possible amino acids at each position, there is an enormous number (20^300) of possible combinations. That is more than the number of all the humans who have ever lived, multiplied by the number of grains of sand on Earth, multiplied by the number of atoms in the universe.
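
For a sense of scale, the figure 20^300 can be worked out directly. The short Python sketch below assumes a fixed length of exactly 300 residues and 20 choices per position; shorter sequences would add to the total, so this is only a rough size estimate.

```python
import math

ALPHABET_SIZE = 20  # standard amino acids
LENGTH = 300        # approximate upper length of a lysozyme, per the article

# Python integers have arbitrary precision, so this value is exact.
combinations = ALPHABET_SIZE ** LENGTH

digits = len(str(combinations))
exponent = LENGTH * math.log10(ALPHABET_SIZE)

print(digits)              # 391
print(round(exponent, 1))  # 390.3
```

In other words, 20^300 is roughly 2 followed by 390 zeros.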

Given the limitless possibilities, it is surprising that the model can easily generate working enzymes.

“The ability to instantly generate functional proteins from scratch shows that we are entering a new era of protein design,” said Dr. Ali Madani, a former research scientist at Salesforce Research, founder of Profluent Bio, and the paper’s first author. “It’s a versatile new tool for protein engineers to use, and we’re looking forward to seeing the therapeutic applications.”
