Under the first model (a), mutations with high fitness are rare within the subset of mutations that are evolutionarily plausible

... antibody binding also guide efficient evolution across diverse protein families and selection pressures, including antibiotic resistance and enzyme activity, suggesting that these results generalize to many settings.

Subject terms: Molecular evolution, Machine learning, Drug discovery

A general protein language model guides protein evolution with 20 or fewer variants needed for testing.

== Main ==

Evolution searches across an immense space of possible sequences for rare mutations that improve fitness1,2. In nature, this search is based on simple processes of random mutation and recombination1, but using the same approach for directed evolution of proteins in the laboratory3 imposes a considerable experimental burden. Artificial evolution based on random guessing or brute-force search typically devotes substantial effort to interrogating weakly active or nonfunctional proteins, requiring high experimental throughput to identify variants with improved fitness4,5.

Although evolutionary fitness is determined, in part, by specific selection pressures, there are also properties that apply more generally across a protein family or are prerequisites for fitness and function across most proteins; for example, some mutations maintain or improve stability or evolvability6,7, whereas others are structurally destabilizing7 or induce incompetent, misfolded states8. One approach to improving the efficiency of evolution is to ensure that mutations adhere to these general properties, which we refer to as evolutionary plausibility. Identifying plausible mutations could help guide evolution away from invalid regimes9, thereby indirectly improving evolutionary efficiency without requiring any explicit knowledge of the function of interest.
However, this strategy is also challenging because, first, protein sequences are governed by complex rules, and, second, even if we restrict search to evolutionarily plausible mutations, those that also improve a specific definition of fitness might still be rare beyond practical utility (Fig. 1a). More broadly, a major open question10 is whether general evolutionary information (for example, learning patterns from sequence variation across past evolution) is sufficient to enable efficient evolution under specific selection pressures (for example, higher binding affinity to a specific antigen).

== Fig. 1. Guiding evolution with protein language models. ==

a,b, Two possible models for relating the space of mutations with high evolutionary plausibility (for example, mutations seen in antibodies) to the space with high fitness under specific selection pressures (for example, mutations that result in high binding affinity to a specific antigen). Both models assume that mutations with high fitness make up a rare subset of the full mutational space and that, in general, high-fitness mutations are also evolutionarily plausible. Under the first model (a), mutations with high fitness are rare within the subset of mutations that are evolutionarily plausible. Under the second model (b), when restricted to the regime of plausible mutations, improvements to fitness become much more common. c, Protein language models, trained on millions of natural protein sequences, learn amino acid patterns that are likely to be seen in nature. We hypothesized that most mutations with high language model likelihood would also be evolutionarily plausible. Assuming that this is true, and if the second model (b) better describes nature, then a language model with no information about specific selection pressures can still efficiently guide evolution.
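The hypothesis in panel c can be illustrated with a small sketch. This is not the authors' pipeline: the function name `plausible_substitutions`, the toy per-position probabilities, and the selection rule (keep a substitution when the model assigns it a higher probability than the wild-type residue) are all assumptions made for illustration. In practice the probabilities would come from a trained protein language model rather than a hand-written table.

```python
# Illustrative sketch only: the probabilities are made up, and the selection
# rule (mutant probability > wild-type probability) is an assumed criterion.

def plausible_substitutions(wild_type, position_probs):
    """Rank single-residue substitutions that a model scores above wild type.

    wild_type: amino-acid sequence as a string.
    position_probs: one dict per position, mapping amino acid -> probability
        (hypothetical stand-in for language-model output).
    Returns a list of (mutation_name, likelihood_ratio), highest ratio first.
    """
    candidates = []
    for index, (wt_aa, probs) in enumerate(zip(wild_type, position_probs)):
        wt_p = probs.get(wt_aa, 0.0)
        for aa, p in probs.items():
            if aa == wt_aa or p <= wt_p:
                continue  # keep only substitutions scored above wild type
            ratio = p / wt_p if wt_p > 0 else float("inf")
            # Name mutations as <wild type><1-based position><mutant>, e.g. K2R.
            candidates.append((f"{wt_aa}{index + 1}{aa}", ratio))
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates


# Toy three-residue "protein" with invented probabilities.
toy_wild_type = "MKV"
toy_probs = [
    {"M": 0.90, "L": 0.05},             # wild type strongly preferred
    {"K": 0.20, "R": 0.60, "E": 0.10},  # R scored above wild-type K
    {"V": 0.30, "I": 0.45, "A": 0.25},  # I scored above wild-type V
]

ranked = plausible_substitutions(toy_wild_type, toy_probs)
print([name for name, _ in ranked])  # ['K2R', 'V3I']
```

The function uses no information about binding affinity or any other selection pressure. Under model (b), a short list like this would be expected to contain fitness-improving mutations at a useful rate; under model (a), it would not.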
Here we show that evolutionary information alone can lead to improved fitness under specific selection pressures with high efficiency (Fig. 1b). For our main experimental test case, we focused on affinity maturation of human antibodies, in which our specific selection pressure is defined as stronger binding affinity to a particular antigen. In nature, a process known as somatic hypermutation evolves or matures an antibody lineage to have higher affinity for an antigen via repeated mutagenesis11–13. In the laboratory, affinity maturation is a major application of directed evolution due to the therapeutic potential of antibodies with high affinity for disease.