May 7, 2024

MIT Biologists Glean New Insight Into Repetitive Protein Sequences

Using their method, the researchers analyzed all of the proteins found in eight different species, from bacteria to humans. They found that while LCRs can vary between proteins and species, they often share a similar role: helping the protein in which they're found to join a larger-scale assembly such as the nucleolus, an organelle found in nearly all human cells.
“Instead of looking at specific LCRs and their functions, which might seem separate because they're involved in different processes, our broader approach allows us to see similarities between their properties, suggesting that maybe the functions of LCRs aren't so disparate after all,” says Byron Lee, an MIT graduate student.
The researchers also found differences between LCRs of different species. They showed that these species-specific LCR sequences correspond to species-specific functions, such as forming plant cell walls.
Lee and graduate student Nima Jaberi-Lashkari are the lead authors of the study, which was recently published in the journal eLife. Eliezer Calo, an assistant professor of biology at MIT, is the senior author of the paper.
Using computational analysis, scientists have found that many repetitive sequences are shared across proteins and are similar in species from bacteria to humans. Credit: Courtesy of the researchers
Large-scale study
Previous research has revealed that LCRs are involved in a variety of cellular processes, including cell adhesion and DNA binding. These LCRs are often rich in a single amino acid such as alanine, lysine, or glutamic acid.
Finding these sequences and then studying their functions individually is a time-consuming process, so the researchers decided to use bioinformatics, an approach that uses computational methods to analyze large sets of biological data, to evaluate them as a larger group.

“What we wanted to do is take a step back and, instead of looking at individual LCRs, try to take a look at all of them and see if we could observe some patterns on a larger scale that might help us figure out what the ones that have assigned functions are doing, and also help us learn a bit about what the ones that don't have assigned functions are doing,” Jaberi-Lashkari says.
To do that, the MIT team used a technique called dot-plot matrix, which is a way to visually represent amino acid sequences, to generate images of each protein under study. They then used computational image processing methods to compare thousands of these matrices at the same time.
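To illustrate what a dot-plot matrix encodes, here is a minimal sketch in Python. It assumes the simplest possible form, in which cell (i, j) is marked whenever residues i and j of the same sequence are identical. The function names `dot_plot` and `render` and the toy sequence are hypothetical, and the study's actual pipeline may well differ.

```python
def dot_plot(seq):
    """Return a self-comparison matrix: entry [i][j] is 1 when residues i and j match."""
    return [[1 if a == b else 0 for b in seq] for a in seq]


def render(matrix):
    """Draw the matrix as text, with '#' for a match and '.' for a mismatch."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in matrix)


# Toy sequence (not a real protein): a lysine-rich stretch flanked by mixed residues.
toy = "MSTKKKEKKKAG"
print(render(dot_plot(toy)))
```

In a plot like this, a stretch dominated by one amino acid shows up as a dense square block along the diagonal, which is the kind of visual feature that image processing methods can pick out across many proteins.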
Using this technique, the researchers were able to categorize LCRs based on which amino acids were most frequently repeated in the LCR. They also grouped LCR-containing proteins by the number of copies of each LCR type found in the protein. Analyzing these traits helped the researchers learn more about the functions of these LCRs.
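The two groupings described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the LCRs are taken as already identified, each LCR is labeled by its single most common residue, and the names `lcr_type` and `copy_numbers`, along with the toy sequences, are hypothetical rather than taken from the study.

```python
from collections import Counter


def lcr_type(lcr_seq):
    """Label an LCR by its most frequent amino acid (one-letter code)."""
    residue, _count = Counter(lcr_seq).most_common(1)[0]
    return residue


def copy_numbers(lcr_seqs):
    """Count how many LCRs of each type a single protein contains."""
    return Counter(lcr_type(s) for s in lcr_seqs)


# Hypothetical protein with three lysine-rich (K) LCRs and one glutamic acid-rich (E) LCR.
protein_lcrs = ["KKKEKKKSKK", "KKRKKKK", "EEEDEEEE", "KKKKAKK"]
print(copy_numbers(protein_lcrs))
```

With counts like these in hand, proteins can be grouped by LCR type and copy number, for example by collecting every protein that carries three lysine-rich LCRs.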
As one demonstration, the researchers picked out a human protein known as RPA43, which has three lysine-rich LCRs. This protein is one of many subunits that make up an enzyme called RNA polymerase I, which synthesizes ribosomal RNA. The researchers found that the copy number of lysine-rich LCRs is important for helping the protein integrate into the nucleolus, the organelle responsible for synthesizing ribosomes.
Biological assemblies
In a comparison of the proteins found in eight different species, the researchers found that some LCR types are highly conserved between species, meaning that the sequences have changed very little over evolutionary timescales. These sequences tend to be found in proteins and cell structures that are also highly conserved, such as the nucleolus.
“These sequences seem to be important for the assembly of certain parts of the nucleolus,” Lee says. “Some of the principles that are known to be important for higher-order assembly seem to be at play because the copy number, which might control how many interactions a protein can make, is important for the protein to integrate into that compartment.”
The MIT team also found differences between LCRs seen in two different types of proteins that are involved in nucleolus assembly. They found that a nucleolar protein known as TCOF contains many glutamic acid-rich LCRs that can help scaffold the formation of assemblies, while nucleolar proteins with only a few of these glutamic acid-rich LCRs could be recruited as clients (proteins that interact with the scaffold).
Another structure that appears to have many conserved LCRs is the nuclear speckle, which is found inside the cell nucleus. The researchers also found many similarities between LCRs that are involved in forming larger-scale assemblies such as the extracellular matrix, a network of molecules that provides structural support to cells in plants and animals.
The research team also found examples of structures with LCRs that seem to have diverged between species. Plants have distinctive LCR sequences in the proteins that they use to scaffold their cell walls, and these LCRs are not seen in other types of organisms.
Now the researchers plan to expand their LCR analysis to additional species.
“There's so much to explore, because we can expand this map to essentially any species,” Lee says. “That gives us the framework and the opportunity to identify new biological assemblies.”
Reference: “A unified view of low complexity regions (LCRs) across species” by Byron Lee, Nima Jaberi-Lashkari and Eliezer Calo, 13 September 2022, eLife. DOI: 10.7554/eLife.77058.
The research was funded by the National Institute of General Medical Sciences, National Cancer Institute, the Ludwig Center at MIT, a National Institutes of Health Pre-Doctoral Training Grant, and the Pew Charitable Trusts.

MIT researchers used a technique called dot-plot matrix, which is a way to visually represent amino acid sequences, to compare protein sequences known as “low-complexity regions” across many different species. Credit: Courtesy of the researchers, and modified by MIT News.
Computational analysis reveals that many repetitive sequences are shared across proteins and are similar in species from bacteria to humans.
Approximately 70 percent of all human proteins include at least one sequence consisting of a single amino acid repeated many times, with a few other amino acids sprinkled in. These “low-complexity regions” (LCRs) are also found in the proteins of most other organisms.
Although the proteins that contain these sequences have many different functions, MIT biologists have now developed a way to identify and study them as a unified group. Their technique allows them to analyze similarities and differences between LCRs from different species, and helps them to determine the functions of these sequences and the proteins in which they are found.
