The curse of the protein ribbon diagram

Citation: Bourne PE, Draizen EJ, Mura C (2022) The curse of the protein ribbon diagram. PLoS Biol 20(12): e3001901.

https://doi.org/10.1371/journal.pbio.3001901

Published: December 12, 2022

Copyright: © 2022 Bourne et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Profound advances in protein structure prediction, and our own recent work on exploring protein fold space, both of which use deep learning approaches, got us thinking about something one of us (PEB) has taught for a long time: the curse of the ribbon. Do some reductionist models and/or scientific representations, such as the ribbon diagram illustrating protein structure, facilitate research, only to ultimately hinder further insight?

Science magazine chose [1] the AI-driven software, AlphaFold-2 (AF2) [2], as the 2021 Breakthrough of the Year, as it effectively solved a long-standing challenge in molecular biology, namely, predicting a 3D structure of a protein from its 1D sequence. One can argue the nuances: (i) AF2 might not be solving the protein folding problem, since we do not know the exact mechanism by which folding occurs; (ii) it does not determine the exact structure to the level achieved by experiment for every case; and (iii) at the time it relied upon having many homologs available, e.g., to build multiple sequence alignments (so it is not predicting from 1 sequence alone, though single-sequence structure prediction is a very active area). Nonetheless, it is a monumental advance that will impact how we think about protein function, protein design, and much more. In short, AF2 and its developers at DeepMind deserve the accolade.

The question then becomes: why did AI succeed where humans have failed? Again, there are nuances. Humans have not failed exactly. The brilliantly conceived and executed Critical Assessment of Structure Prediction (CASP), held since 1994 [3], has shown significant progress in structure prediction over time; however, all efforts fell far short of what was achieved by AF2 and its predecessor, AlphaFold [4], and by other efforts, notably RoseTTAFold [5]. What do these algorithms “see” that a human does not? Part of the answer to that question is not what is being “seen” but rather how much is being seen. Even the savviest structural biologist, with an eidetic memory, cannot simultaneously hold as many features of proteins in their head as a well-trained neural network can. In a sense, this is analogous to the software engineering principle that “given enough eyeballs, all bugs are shallow” [6]: With enough protein sequence (input) and structure (output) data, a deep model can “learn” a solution, mapping input to output. Maybe another part of the answer is that human neural networks inaccurately (or at least suboptimally) conceive of protein structures as singular, rigid structures (like the frozen ribbons we see on a page in an article), rather than as the fluid, physiologically functional entities that they are in reality, and which a deep neural network can “learn” as an implicit (latent) representation?

A protein structure, whether experimental or theoretical, once known, is described by a set of 3D Cartesian coordinates, where each (x, y, z) coordinate represents the position of an atom. A standard human-readable text format, either PDB or mmCIF [7], provides a list of all atoms and other metadata used to represent the protein. Staring at such a list of numerical data is essentially futile. Early in the history of structural biology, according to Jane Richardson [8], it was Dick Dickerson who was the first to make a protein schematic and Irving Geis the first to show successive peptide planes with ribbons tracing the protein backbone. These diagrams are now the stuff of legend, as they should be, and they can be found on the walls of laboratories and homes of structural biologists. Jane herself, with husband David, illustrated the full range of protein structures with a variety of ribbon diagrams in a landmark 1981 article [9]. That tour de force, from which one of us (PEB) learnt about and became fascinated by protein structure, cataloged all 75 protein structures available at the time (there are now 196,979; October 28, 2022).
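To make the coordinate list described above concrete, here is a minimal Python sketch. It assumes the legacy fixed-column PDB layout for ATOM records (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26, and x, y, z in columns 31-54). The names Atom, format_atom, parse_atoms, and ca_trace are hypothetical helpers introduced here, and the coordinates are invented for illustration; they are not taken from 3EBX or any other real entry.

```python
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    name: str       # atom name, e.g. "CA" for an alpha carbon
    res_name: str   # three-letter residue name
    chain: str      # one-character chain identifier
    res_seq: int    # residue sequence number
    x: float
    y: float
    z: float


def format_atom(serial: int, atom: Atom) -> str:
    """Hypothetical helper: write one ATOM record in fixed PDB columns."""
    return (
        f"ATOM  {serial:5d}  {atom.name:<3s} {atom.res_name:>3s} "
        f"{atom.chain}{atom.res_seq:4d}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
    )


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Read ATOM records by slicing the fixed-width columns."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, and other record types
        atoms.append(Atom(
            name=line[12:16].strip(),
            res_name=line[17:20].strip(),
            chain=line[21],
            res_seq=int(line[22:26]),
            x=float(line[30:38]),
            y=float(line[38:46]),
            z=float(line[46:54]),
        ))
    return atoms


def ca_trace(atoms: List[Atom]) -> List[Tuple[float, float, float]]:
    """Keep only alpha-carbon positions, one point per residue."""
    return [(a.x, a.y, a.z) for a in atoms if a.name == "CA"]


# Invented coordinates for two residues; not physically meaningful.
ILLUSTRATIVE_ATOMS = [
    Atom("N", "GLY", "A", 1, 0.0, 1.5, 3.0),
    Atom("CA", "GLY", "A", 1, 1.0, 2.0, 3.0),
    Atom("N", "ALA", "A", 2, 3.5, 2.5, 3.0),
    Atom("CA", "ALA", "A", 2, 4.8, 2.0, 3.0),
]
SAMPLE = "\n".join(
    format_atom(i + 1, atom) for i, atom in enumerate(ILLUSTRATIVE_ATOMS)
)

print(SAMPLE)                         # the raw list a human would stare at
print(ca_trace(parse_atoms(SAMPLE)))  # the backbone points a ribbon follows
```

Keeping only the alpha-carbon positions is roughly the reduction that a ribbon drawing starts from: side chains and every other atom in the list drop out of view.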

As is often the case in the biological sciences, comparative analysis proved to be the way forward to understand protein structure. By comparing ribbon diagrams, or similar, initially hand-drawn sketches (and later generated by a variety of increasingly powerful molecular graphics programs), similarities between structures started to become apparent; these 3D spatial “motifs” started to accumulate names like jelly-roll, Greek key, and Rossmann fold as humans drew comparisons to either known objects and patterns, or to the person who first observed the commonality. As the number of structures increased, the reliance on these simplified visualizations necessarily increased (Fig 1).

Fig 1. Cartoon ribbon diagrams as a blessing and a curse.

The earliest era of structural biology made clear the necessity of molecular visualization for even small proteins, such as the 62-amino acid snake venom toxin shown here (PDB ID 3EBX). In this process, (a) atomic coordinates are visually rendered on a computer display as (b) lines, “sticks,” spheres, etc., thereby creating a representation of the protein’s 3D structure. Though useful for detailed, atomic-scale analyses, e.g., of enzyme mechanisms, such renditions are too visually cluttered and complicated (incomprehensible, essentially) to enable one to grasp a protein’s overall architecture and topology. For that purpose, (c) ribbon diagrams are a blessing: these diagrams are powerful abstractions of a single protein entity, but do they (d) mask other features and relationships?

https://doi.org/10.1371/journal.pbio.3001901.g001

With the possible exception of Feynman diagrams, we cannot think of a compact visual representation of scientific knowledge, specific to a given domain, that has had more impact on our understanding (in this case, of the relationship between sequence, structure, and function) than the ribbon diagram and variations thereof. In short, it is a blessing. So why are we saying it is a curse, too? We would argue that this singular representational style has become too ingrained in our thinking, to the point that non-experts imagine proteins to be literally like (static) ribbons. In gazing at ribbon cartoons on a page, we abandon the physicochemical properties that underlie the structure; consider dynamics as only variations of the ribbon; and think less about solvent, other interacting molecules, cellular location, evolution, and function. In short, the geometric shape, exemplified by the ribbon, dominates our thinking (and, even then, we neglect topography and other geometric features of the surface, e.g., drug-binding pockets and such). Therein lies the curse. Perhaps it is time to minimize the ribbon? Or at least teach an understanding of proteins that has students think beyond the ribbon?

Without ribbon representations of proteins, would humans have solved the protein structure prediction problem? A better question is: has the degree to which we are steeped in thinking about proteins as ribbons limited a type of understanding (models, etc.) of proteins that is important to better understand their form and function? There the answer is not so clear. This is exactly why we encourage students to view proteins as collections of bonded atoms undergoing dynamic sets of interactions with each other and the environment. Such a picture is impossible to conceptualize fully, but the value in opening one’s mind to alternatives would seem important.

Are there other examples where our thinking becomes “locked in”? Taxonomies and ontologies come to mind. The tree of life, while an evolutionary anchor point, is more accurately seen as dynamic and changeable. If Woese and Fox [10] had not thought so, the discovery of Archaea would have been delayed.

The original anatomy and taxonomy of protein structure [9] was indeed derived in part by a human visual assessment of ribbon diagrams. This and later classifications were pivotal in our progress in understanding protein sequence-structure-function and evolutionary relationships. Then again, is it in some ways too limiting and restrictive to classify entities such as proteins by placing them into mutually exclusive bins, as is done in existing hierarchical schemes? What if such hierarchical binning has caused us to miss important relationships, for example, relationships arising as shared structural “themes,” which in turn hint at rather distant evolutionary relationships (and suggest deep homology)?

There has been a long debate as to whether the space of all protein folds is discrete or continuous [11]. Current thinking would tend to favor a more continuous model. If so, the hierarchical binning that occurs in existing classifications might miss important relationships. We posit that indeed we have missed remote linkages, such as between distinct protein “superfolds,” and proposed the existence of the Urfold [12]. An Urfold exists when there is architectural similarity despite topological variability, irrespective of considerations of (known) homology. Ironically, the evidence that suggested the existence of an Urfold was obtained from the visual inspection of many perhaps-related proteins, including ribbon views. More recently, a machine learning study tries to quantitatively “define” the Urfold in terms of learned embeddings in deep generative models, whereby a protein’s sequence, structure, and physicochemical properties can be viewed as being compressed into a lower-dimensional “latent space” representation [13]; though not as readily visualized as ribbon diagrams, such distilled feature representations do suggest a new view of protein relationships. Further scrutiny over time will determine the value of such representations.

What is clear is that machine learning approaches allow us to “look” beyond human-digestible metaphors, like the protein ribbon, and will cause us to reevaluate our thinking in many areas of biology. The curse has been lifted in ways we have yet to fully understand.
