Protein interactions at multiple scales (SP 2.3.1)
Project leader: J. Heringa, Vrije Universiteit, Amsterdam
Participants: M. Huynen, Radboud University, Nijmegen; R. van Ham, Wageningen University and Research Centre; J.L. Bos, University Medical Centre Utrecht; R.J. Siezen, Radboud University, Nijmegen / NIZO food research; B. Snel, Utrecht University
Genotype defines phenotype through a network of molecular interactions that is key to all cellular processes. This project develops and tests bioinformatics tools that analyse interactions based on a variety of (genomics) data. The scale at which the interactions take place varies, as do the interacting substrates. In spite of these variations, the underlying mechanisms are quite similar. The complementary expertise of the various groups provides an excellent basis for collaboration.
This project will study protein complexes to illuminate their association/dissociation dynamics in solution, and even the formidable problem of complex and cluster formation in different cellular environments. Pair-wise protein interactions will be studied by machine learning techniques. Various types of omics data will be used, as well as sequence and structure information of the interacting partners. Protein-DNA interactions will be predicted for the transcription regulatory networks in bacteria by 'phylogenetic footprinting'. This will enable detection of regulatory elements in whole genomes. The Golgi apparatus functions as a central delivery system in the cell. Are there signals present in protein sequences that interact with this apparatus? Finally, two extremely complex signal-transduction pathways, associated with oncogenesis, will be analysed. This analysis will in part be driven by previous results, and will direct new wet-lab experimentation.
Overview of subprojects and results:
Subproject SP 2.3.1.5
Project leader: R.J. Siezen, Radboud University, Nijmegen / NIZO food research
Introduction and objectives
The research within this subproject was carried out as part of the Wageningen Centre for Food Science (WCFS), later termed Top Institute Food and Nutrition (TIFN), in the Host-Microbe Interactions subprogram. The research goal within WCFS and later TIFN was to understand how the human intestine perceives and responds to four species of lactic acid bacteria that are natural constituents of sauerkraut and yoghurts, including three probiotics. Most of the data consisted of whole-genome gene expression data obtained through microarray technology. Experiments were carried out together with the Maastricht University Medical Centre, where healthy human adult volunteers drank beverages containing large numbers of four different species of lactic acid bacteria every half hour, for a total of six hours. Biopsies were taken from the small intestine of these volunteers and used for histology and microarray gene profiling.
This project used bioinformatics to obtain sets of genes that were significantly expressed in response to the different bacteria, and applied functional annotations of these genes and of the proteins they encode. For all these proteins, protein-protein interactions were retrieved from databases and visualised in network diagrams. A combination of gene set comparisons, gene regulatory network analysis and pathway analysis was used to correlate changes in gene expression, plotted onto protein-protein interaction networks, with histology of the human intestine. Using the advanced gene and protein annotations, it was also determined which pathways and gene regulatory networks were active in epithelial cells, and which ones in different immune cells. Based on these results, it could be reconstructed how the intestinal balance (homeostasis) was preserved in healthy humans while consuming very large amounts of bacteria.
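The gene set comparison and network steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the subproject: the gene identifiers, the interaction list and the helper names (pairwise_overlap, induced_network) are hypothetical, and the step that selects significantly responding genes from the microarray data is assumed to have been done already.

```python
from itertools import combinations

# Hypothetical input: genes that respond significantly to each bacterial species.
responsive_genes = {
    "species_1": {"g1", "g2", "g3", "g4"},
    "species_2": {"g2", "g3", "g5"},
    "species_3": {"g3", "g6"},
}

# Hypothetical protein-protein interactions, written as pairs of encoding genes.
ppi_edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g5"), ("g4", "g7"), ("g6", "g8")]


def pairwise_overlap(gene_sets):
    """Return the genes shared by each pair of conditions."""
    return {
        (a, b): gene_sets[a] & gene_sets[b]
        for a, b in combinations(sorted(gene_sets), 2)
    }


def induced_network(edges, genes):
    """Keep only the interactions whose two partners are both in `genes`."""
    return [(a, b) for a, b in edges if a in genes and b in genes]


overlaps = pairwise_overlap(responsive_genes)
core = set.intersection(*responsive_genes.values())  # genes responding to every species
networks = {
    name: induced_network(ppi_edges, genes)
    for name, genes in responsive_genes.items()
}

for name, edges in networks.items():
    print(name, "responsive interaction network:", edges)
print("genes responding to all species:", sorted(core))
```

In such a sketch, the per-species networks are the part that would be drawn as network diagrams and compared with histology; the overlaps indicate which responses are shared between bacterial species and which are specific.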
Results
It was shown that the human intestine perceived all four bacterial species in a different way, even though the intestine of adult humans is already colonised by millions of commensal (symbiotic) bacteria. Moreover, the protein-protein interactions and the pathways that the proteins and their encoding genes participated in could potentially explain why some probiotics do contribute to human health. It was hypothesised that efficacy of probiotics could depend on the strength of the response of individual volunteers at the level of gene transcription. This work has shown that it is possible to make use of predicted molecular interactions to generate testable hypotheses and obtain novel findings, even in very challenging studies. Currently, the same approaches are being applied in more experimental studies using mouse models and in vitro cell lines.
Subproject SP 2.3.1.6
Project leader: B. Snel, Utrecht University
Introduction and objectives
Reversible phosphorylation of proteins plays a crucial role in eukaryotes. Recent development of high-throughput phosphoproteomics techniques has resulted in the availability of many large datasets that contain lists of phosphorylated residues for a wide range of model organisms. Phosphorylation is enzymatically catalysed by kinases that attach phosphate groups covalently to serine, threonine and tyrosine residues. This project aims to compare these novel phosphoproteomics datasets to improve the understanding of the complex regulatory networks formed by kinases and their substrates. Given the pivotal role of these phosphorylation networks in regulating the cell, such insights are expected to provide leads for unravelling cancer and other complex diseases.
Results
Unfortunately, comparing phosphoproteomics data is complicated: for example, the choice of experimental method has a strong impact on which part of the phosphoproteome is measured, and thus also on the overlap between phosphoproteomes sampled in different experiments. After careful analysis of such complications, the computational analysis methods were adjusted accordingly. This allowed the first-ever demonstration of a significant evolutionary signal in the overlap between phosphoproteomics datasets from different species. A neighbour-joining tree built with the overlap between datasets as the distance measure has the same topology as the tree of life. In addition, a strong functional signal could be detected: compared to phosphosites found in only a single species, relatively many phosphosites with homologs in two or more species are found in proteins with functions related to information storage and processing, proteins often involved in complex diseases like cancer.
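As an illustration of how overlap between datasets can serve as a distance measure for neighbour joining, a minimal Python sketch follows. The exact distance definition used in the subproject is not given here; the sketch assumes one minus the Jaccard index of the phosphosite sets, and assumes that sites from different species have already been mapped to shared identifiers through homology. The site identifiers and set contents are invented toy data, and the tree is returned as nested tuples without branch lengths.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Assumed distance: 1 - |a & b| / |a | b| for two sets of phosphosites."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def distance_matrix(datasets):
    """Pairwise distances, keyed by the unordered pair of dataset names."""
    return {
        frozenset((x, y)): jaccard_distance(datasets[x], datasets[y])
        for x, y in combinations(datasets, 2)
    }


def neighbour_joining(labels, dist):
    """Return an unrooted topology as nested tuples (branch lengths omitted)."""
    nodes = list(labels)
    d = dict(dist)
    while len(nodes) > 3:
        n = len(nodes)
        # Sum of distances from each node to all other nodes.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimises the neighbour-joining Q criterion.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]],
        )
        new = (i, j)
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d[frozenset((i, j))]
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    return tuple(nodes)


# Invented toy data: phosphosites mapped to shared (hypothetical) site identifiers.
datasets = {
    "human": {"s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"},
    "mouse": {"s1", "s2", "s3", "s4", "s5", "s6", "s7", "s9"},
    "fly": {"s1", "s2", "s3", "s10", "s11"},
    "yeast": {"s1", "s2", "s12", "s13", "s14"},
}

dist = distance_matrix(datasets)
tree = neighbour_joining(datasets, dist)
print(tree)
```

For this toy input the two datasets with the largest overlap (human and mouse) end up on one side of the internal branch, and fly and yeast on the other. With real data, the resulting topology would be compared with the accepted species tree.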
Next, bioinformatics analyses were performed that estimated the impact of different experimental techniques on the similarity between phosphoproteomes by comparing two datasets from different experimental pipelines to a common reference dataset. This makes it possible to compare not only datasets generated specifically for this purpose, but also the ever-increasing wealth of publicly available phosphorylation data.
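The idea of comparing two pipelines against a common reference can be sketched as follows. This is only an illustration under stated assumptions: the similarity measure (Jaccard index), the dataset names and the site identifiers are all hypothetical and are not taken from the subproject.

```python
def jaccard(a, b):
    """Jaccard index of two phosphosite sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Invented toy data: one reference dataset and two datasets from different pipelines.
reference = {"p1", "p2", "p3", "p4", "p5", "p6"}
pipelines = {
    "pipeline_a": {"p1", "p2", "p3", "p4", "p7"},
    "pipeline_b": {"p1", "p2", "p8", "p9"},
}

# Similarity of each pipeline's dataset to the common reference.
similarity = {name: jaccard(sites, reference) for name, sites in pipelines.items()}

# The difference between the two similarities is a rough indication of how
# strongly the choice of pipeline affects the measured overlap.
method_effect = similarity["pipeline_a"] - similarity["pipeline_b"]

for name, value in similarity.items():
    print(f"{name}: similarity to reference = {value:.2f}")
print(f"difference attributable to pipeline (toy data) = {method_effect:.2f}")
```

Such a difference is only interpretable when the two datasets are otherwise comparable, for example when they come from the same species and a similar sample type.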
In summary, the rapidly growing amount of data from high-throughput mass spectrometry analysis is expected to make comparative phosphoproteomics a powerful tool for predicting, evaluating and understanding reversible phosphorylation.

