Identification of Enriched PTM Crosstalk Motifs from Large-Scale Experimental Data Sets

ABSTRACT: Post-translational modifications (PTMs) play an important role in the regulation of protein function. Mass spectrometry-based proteomics experiments now identify tens of thousands of PTMs in a single experiment, and a wealth of data has therefore become publicly available. The biological function of each PTM is evidently the key question to be addressed; however, such analyses focus primarily on single PTM events. This ignores the fact that PTMs may act in concert in the regulation of protein function, a process termed PTM crosstalk. Relatively little is known about the frequency and functional relevance of crosstalk between PTM sites. In a bioinformatics approach, we extracted PTMs occurring in proximity in the protein sequence from publicly available databases. These PTMs and their flanking sequences were subjected to stringent motif searches, including scoring for evolutionary conservation. Our unbiased approach detected a sizable set of motifs, about half of which had been described previously. For these known motifs, we could add many new proteins harboring them. We also extracted several novel motifs, which, through their widespread appearance and high conservation, may point to previously unannotated concerted PTM actions. By employing network analyses on these proteins, we propose putative functional roles for these novel motifs with two PTM sites in close proximity.
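As a rough illustration of the first step described in the abstract (collecting PTM sites that lie close together in a protein sequence, along with their flanking residues), a minimal Python sketch could look like the following. It is not the pipeline used in the paper: the function name `proximal_ptm_pairs`, the distance cutoff `max_distance`, the flank width `flank`, and the input format are all assumptions made for this example, and the sequence and site list are toy input.

```python
from itertools import combinations


def proximal_ptm_pairs(sequence, sites, max_distance=5, flank=3):
    """Collect pairs of PTM sites that lie close together in one sequence.

    sequence     -- protein sequence as a string of one-letter codes
    sites        -- list of (position, ptm_type) tuples, positions 1-based
    max_distance -- assumed cutoff (in residues) for "close proximity"
    flank        -- assumed number of flanking residues kept on each side
    """
    pairs = []
    # Sorting guarantees pos_a <= pos_b for every pair we look at.
    for (pos_a, type_a), (pos_b, type_b) in combinations(sorted(sites), 2):
        distance = pos_b - pos_a
        # Keep only two distinct sites within the cutoff.
        if not 0 < distance <= max_distance:
            continue
        # Convert to 0-based slice bounds and clip at the sequence ends.
        start = max(pos_a - 1 - flank, 0)
        end = min(pos_b + flank, len(sequence))
        pairs.append({
            "sites": ((pos_a, type_a), (pos_b, type_b)),
            "distance": distance,
            "window": sequence[start:end],
        })
    return pairs


# Toy input for illustration only (not taken from the paper's data set).
toy_sequence = "ARTKQTARKSTGGKAPRKQLA"
toy_sites = [(9, "methyl"), (10, "phospho"), (14, "acetyl")]

for pair in proximal_ptm_pairs(toy_sequence, toy_sites):
    print(pair["sites"], pair["distance"], pair["window"])
```

The resulting windows would then be the input to a motif search and conservation scoring, which this sketch does not attempt to reproduce.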

 

Dell promotional clip

In February this year, our group took part in a promotional video clip for Dell that presents our recently acquired data storage cluster for proteomics data.

When a fluorescent sea anemone glows on a coral reef, it’s using a protein. When we move, proteins help our muscles contract. And when we curse our wrinkles as we age, it’s the lack of the protein collagen we lament.

{youtube}tRfN5Os79P4{/youtube}

Proteins also tell us about human health. By identifying proteins linked with specific diseases, we can develop novel diagnostics and therapies. At Utrecht University in the Netherlands, the Mass Spectrometry and Proteomics Group is exploring new methods of protein research. They can identify a protein or peptide every 100 milliseconds, producing vast amounts of data that must be stored safely and easily retrieved. Often, only a tiny number of the proteins identified in a project are significant. Quick access to data is crucial for efficient analysis and isolation of the proteins that can fuel drug discovery.

The group wanted a new storage solution that would prevent bottlenecks and cut out painstaking backups. The Dell DX Object Storage Platform emerged as the best fit, based on its simplicity, flexibility and integrated data protection. Instead of slow tape backups, the IT team now relies on mirroring and automatic replication. When the group looks after data for other organisations, it knows the data will be safe and easy to access. And, crucially, researchers can download files twice as quickly because the platform has higher throughput. Without delayed downloads, they can dedicate their time to discovering the patterns and anomalies in protein behaviour that can change the way we manage disease.

Watch the video to hear the Utrecht University team talk about the challenges and rewards of proteomics.