Cells organize their interiors not only with membranes but also through liquid-liquid phase separation (LLPS), the spontaneous demixing that creates membraneless organelles like nucleoli and stress granules. The stickers-and-spacers model has guided thinking about this phenomenon: aromatic residues act as "stickers" that drive intermolecular contacts through π–π and cation–π interactions, while polar residues like glycine and serine serve as flexible "spacers." Yet this framework has been validated primarily in prion-like RNA-binding proteins, leaving uncertainty about whether the same rules apply across diverse cellular contexts. Understanding which sequence motifs promote phase separation, and whether those motifs differ among protein families, could enable rational design of synthetic condensates with tailored properties.
A team led by Dr. Ana S. Pina from the New University of Lisbon, Portugal, publishing in Biomacromolecules, took a data-driven approach to this question. They assembled a database of 178 experimentally validated phase-separating proteins spanning six functional categories: RNA-binding, DNA-binding, chromatin-binding, regulatory, hydrolase, and structural proteins. Using the FuzDrop algorithm, they identified 712 droplet-promoting regions (DPRs) within these sequences. They then systematically catalogued every peptide motif from three to six residues in length, comparing each motif's presence and frequency against a negative control database of 208 proteins with low phase-separation probability. A combined fold score captured both how many DPRs contained each motif and how often it repeated within sequences.
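To make the cataloguing step concrete, here is a minimal Python sketch of how motifs of three to six residues might be enumerated and scored against a negative set. It is an illustration under stated assumptions, not the authors' implementation: the function names, the pseudocount, and the choice to multiply a presence fold by a frequency fold are all hypothetical, since the exact formula for the combined fold score is not given here.

```python
from collections import Counter


def kmers(seq, k_min=3, k_max=6):
    """Yield every contiguous peptide of length k_min..k_max in seq."""
    for k in range(k_min, k_max + 1):
        for i in range(len(seq) - k + 1):
            yield seq[i:i + k]


def motif_stats(regions):
    """Count motifs over a list of sequences.

    presence[m]  = number of regions containing motif m at least once
    frequency[m] = total overlapping occurrences of m across all regions
    """
    presence, frequency = Counter(), Counter()
    for seq in regions:
        counts = Counter(kmers(seq))
        frequency.update(counts)
        presence.update(counts.keys())
    return presence, frequency


def fold_scores(positive, negative, pseudocount=1.0):
    """Hypothetical combined score: presence fold times frequency fold.

    The pseudocount (an assumption) keeps motifs that are absent from
    the negative set from causing a division by zero.
    """
    pos_p, pos_f = motif_stats(positive)
    neg_p, neg_f = motif_stats(negative)
    n_pos = len(positive) + pseudocount
    n_neg = len(negative) + pseudocount
    scores = {}
    for m in pos_p:
        presence_fold = ((pos_p[m] + pseudocount) / n_pos) / (
            (neg_p[m] + pseudocount) / n_neg
        )
        frequency_fold = ((pos_f[m] + pseudocount) / n_pos) / (
            (neg_f[m] + pseudocount) / n_neg
        )
        scores[m] = presence_fold * frequency_fold
    return scores


# Toy sequences, invented for illustration only.
toy_dprs = ["GRGGFGG", "RGGYGG"]
toy_controls = ["AKLLEA", "ELKKAL"]
toy_scores = fold_scores(toy_dprs, toy_controls)
top_motifs = sorted(toy_scores, key=toy_scores.get, reverse=True)[:5]
```

In this sketch a motif scores well only if it is both widespread across regions and repeated within them, which mirrors the two quantities the fold score is described as capturing.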
This analysis yielded 129 enriched motifs characterized by glycine-rich backbones interspersed with aromatic, charged, and polar residues. Tetrapeptides dominated, suggesting they represent an optimal unit balancing length and interaction potential. Familiar patterns emerged: RGG-adjacent sequences, YGG motifs, and elastin-like VPGVG repeats all appeared prominently. But the study also uncovered previously unreported motifs including HHP, PAPA, DSSS, and DEDD. Importantly, motif preferences varied by protein family. RNA-binding proteins favored longer penta- and hexapeptides rich in aromatic residues, while chromatin-binding proteins showed enrichment in positively charged sequences that facilitate histone interactions. Structural proteins displayed the distinctive Val-Pro-Gly patterns characteristic of elastin. These family-specific signatures suggest that phase separation mechanisms adapt to functional requirements.
The researchers then asked whether motif co-occurrence patterns could guide peptide design. They developed an algorithm that scored all possible combinations of three motifs based on how frequently they appeared together in DPRs and how symmetrically they co-occurred. Top-scoring trios were merged into minimal peptide sequences averaging 12 residues. Eight candidates spanning diverse compositions were synthesized and tested. Every designed peptide formed liquid droplets when mixed with buffer at physiological temperature. The highest-scoring sequence, PJ1, containing phenylalanine, glycine, arginine, and aspartate, produced abundant droplets two to five micrometers in diameter at moderate concentrations. Fluorescence recovery after photobleaching confirmed liquid-like dynamics across all peptides, though the extent of recovery varied from 36% to 92%. Intriguingly, peptides built from frequently co-occurring motifs showed lower recovery, suggesting their denser interaction networks create more stable condensates, while peptides containing proline displayed enhanced fluidity.
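The design step described above can be sketched in Python as well. This is a rough illustration rather than the published algorithm: the symmetry term (regions containing all three motifs divided by regions containing any of them), the product used as the trio score, and the overlap-based merge are all assumptions made for the example, and every function name is hypothetical.

```python
from itertools import combinations, permutations


def containing(regions, motif):
    """Indices of the regions in which the motif occurs."""
    return {i for i, seq in enumerate(regions) if motif in seq}


def score_trios(regions, motifs):
    """Rank motif trios by joint occurrence weighted by symmetry.

    Assumed scoring: (regions with all three) * (all three / any of three).
    """
    hits = {m: containing(regions, m) for m in motifs}
    scored = []
    for trio in combinations(sorted(motifs), 3):
        sets = [hits[m] for m in trio]
        together = set.intersection(*sets)
        if not together:
            continue
        symmetry = len(together) / len(set.union(*sets))
        scored.append((len(together) * symmetry, trio))
    return sorted(scored, reverse=True)


def overlap_merge(a, b):
    """Append b to a, reusing the longest suffix of a that prefixes b."""
    if b in a:
        return a
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return a + b[k:]
    return a + b


def merge_trio(trio):
    """Try every motif order and keep the shortest merged peptide."""
    candidates = []
    for order in permutations(trio):
        peptide = order[0]
        for motif in order[1:]:
            peptide = overlap_merge(peptide, motif)
        candidates.append(peptide)
    # Ties on length are broken alphabetically so the result is stable.
    return min(candidates, key=lambda s: (len(s), s))


# Toy regions and motifs, invented for illustration only.
toy_regions = ["RGGFGYGG", "RGGFYGG", "AAAYGG"]
toy_motifs = ["RGG", "GGF", "YGG", "AAA"]
ranked = score_trios(toy_regions, toy_motifs)
best_peptide = merge_trio(ranked[0][1])
```

Merging by overlap produces a short sequence that still contains every motif in the trio, which is one plausible reading of "minimal peptide sequences"; the actual candidates averaged 12 residues, so real motif sets would be longer or less overlapping than these toy inputs.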
By moving beyond the canonical stickers-and-spacers framework validated in RNA-binding proteins, this work reveals that phase separation draws on a broader and more context-dependent vocabulary of sequence motifs. The successful experimental validation of computationally designed peptides demonstrates that co-occurrence analysis can identify functional combinations even from short sequence elements. For researchers seeking to build synthetic condensates for applications ranging from drug delivery to metabolic engineering, this motif dictionary and design strategy offer a practical starting point.