QUERY INPUT:


Query Organism: enter organism common name, scientific name or NCBI tax ID. Organisms that are not found in the drop down box were not included in our study.

Query Protein: enter a protein name, a common or systematic gene name, or a protein ID (Ensembl, JGI, Refseq, Uniprot). By clicking "By Fasta", users can enter the protein sequence instead. If the protein sequence is not included in ProtPhylo, as is the case for newly sequenced organisms, the user can first search for an orthologous sequence in a species of interest. ProtPhylo then predicts functional associations based on the matched protein.

Species WITH the Phenotypic Properties: enter common name, scientific name or NCBI tax ID of organisms with the phenotype of interest. Each organism needs to be entered one at a time.

Species WITHOUT the Phenotypic Properties (Optional): enter common name, scientific name or NCBI tax ID of organisms without the phenotype of interest. Each organism needs to be entered one at a time.

Orthology Method: users can choose among four established methods to search for orthologs of the query protein: One-way Best Hits, Best Reciprocal Hits, OrthoMCL (1), and eggNOG (2). One-way Best Hits is set as default.

Click "Search" to predict protein-protein or phenotype-to-protein functional associations based on the above criteria. The list of candidate proteins can be further prioritized by applying any combination of filter options (listed below). Alternatively, filter options can be selected before the run to obtain a pre-filtered, ranked list of candidates predicted to be functionally associated with the query protein or phenotype.

FILTER OPTIONS:


Profile Similarity:
Subcellular Localization (Filter Type: AND, OR): users can filter candidate proteins based on common evidence of subcellular localization from more than one prediction method (AND). Alternatively, users can filter candidate proteins based on combined evidence of subcellular localization from any of the selected methods (OR).
Others:

QUERY OUTPUT:


The output of a ProtPhylo run is a list of candidate proteins (rows) that are predicted to be functionally associated with a query protein (Protein Phylogenetic Profiling) or phenotype (Phenotype Phylogenetic Profiling) solely based on evidence of coevolution. Candidate proteins are automatically ranked by their phylogenetic distance (Hamming distance, HD) to the query protein or the phenotype. The lower the HD, the stronger the predicted functional association. The total number of hits (predicted functional associations) is indicated at the top of the list. By default, only the first 100 hits are shown; to display the next 100 hits, the user can click the “Show more results” button in the top right corner below the filter options. Note that the first protein listed in a “Protein Phylogenetic Profiling” output (highlighted in red) is always the query protein. Once all hits are displayed, the user can sort them by any column header. For each protein the following additional information (columns) can be retrieved:


Click “Export” to export the results into a csv text file.



Protein Phylogenetic Profiling: the MICU1 case study


INPUT:

Users should use the “Protein Phylogenetic Profiling” option to search for additional proteins that could be functionally associated with a protein of interest (‘Query Protein’) within a user-defined organism (‘Query Organism’), for example members of the same protein complex or pathway, or receptors and their ligands. As an example, we show how ProtPhylo can be used to identify human proteins that are functionally associated with Micu1, the mitochondrial calcium uptake 1 protein (12,13). Micu1 was identified by phenotype phylogenetic profiling in 2010 as the founding member of the human mitochondrial calcium uniporter (12). The uniporter was then shown to include additional subunits such as Mcu, Mcub, Micu2, and Micu3 (14-16). Can ProtPhylo predict a functional association between Micu1 and other subunits of the mitochondrial calcium uniporter solely based on phylogenetic profiling analysis?

First, type and select ‘Homo sapiens’ in the ‘Query Organism’ text box. Next, enter the protein name ‘Micu1’ in the ‘Query Protein’ text box and select ‘One-way Best Hits’ (default) as ‘Orthology Method’. To retrieve only the top 1% of human proteins with the lowest Hamming distance to the query protein, select an HD percentile <1st. Now, click the ‘Search’ button.


OUTPUT:

ProtPhylo predicts a total of 219 human proteins that match the above criteria. Only the top four hits (lowest HD) are shown here; they include Micu2, Micu3, Mcu, and Mcub, known components of the mitochondrial calcium uniporter. An HD of 29 (Micu2, Micu3) means that the phylogenetic profile of the query protein, Micu1, is identical to that of Micu2 or Micu3 in all but 29 of the 2048 organisms used in the analysis.
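The HD-based ranking described above can be illustrated with a minimal Python sketch. It assumes that each phylogenetic profile is a tuple of 0/1 values (ortholog absent or present) over the same ordered set of organisms. The function names and the six-organism toy profiles are hypothetical and are not taken from ProtPhylo itself.

```python
from typing import Dict, List, Tuple

# 1 = ortholog present in that organism, 0 = absent (assumed encoding).
Profile = Tuple[int, ...]


def hamming_distance(a: Profile, b: Profile) -> int:
    """Count the organisms in which two presence/absence profiles disagree."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same organisms")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hd(query: str, profiles: Dict[str, Profile]) -> List[Tuple[str, int]]:
    """Return (protein, HD) pairs sorted by increasing HD to the query.

    The query itself has an HD of 0 and is placed first among ties.
    """
    q = profiles[query]
    scored = [(name, hamming_distance(q, p)) for name, p in profiles.items()]
    return sorted(scored, key=lambda item: (item[1], item[0] != query, item[0]))


# Toy profiles over six hypothetical organisms (not real ProtPhylo data).
toy_profiles = {
    "QUERY": (1, 1, 0, 1, 0, 1),
    "CAND_A": (1, 1, 0, 1, 0, 1),
    "CAND_B": (1, 1, 0, 1, 1, 1),
    "CAND_C": (0, 0, 1, 0, 1, 0),
}
ranking = rank_by_hd("QUERY", toy_profiles)
for name, hd in ranking:
    print(name, hd)
```

In this toy example the candidate with an identical profile ranks directly after the query, and the candidate with the opposite profile ranks last. With 2048 organisms, an HD of 29 corresponds to disagreement in roughly 1.4% of the profile positions.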




Phenotype Phylogenetic Profiling: the MCU case study


INPUT:

Users should use the “Phenotype Phylogenetic Profiling” option when none of the proteins involved in a phenotype of interest are known. Here, users can search for proteins within a user-defined organism (‘Query Organism’) that are functionally associated with a phenotype of interest. An example is shown below for the identification of human mitochondrial proteins involved in mitochondrial calcium uptake (12,17). First, users need to select the query organism for which the phenotype-to-protein associations should be predicted. In the example below, we selected ‘Homo sapiens’ because we are interested in identifying human proteins involved in mitochondrial calcium uptake. Second, in the textbox ‘Species WITH the Phenotype of Interest’, users need to type the common or scientific name of species that show the phenotype of interest. The species need to be entered one at a time. As a result, a string of blue boxes with numbers will appear. The number in each box is the NCBI tax ID of the selected species. If the phenotype of interest is known to be absent in other species, users should enter these species in the textbox ‘Species WITHOUT the Phenotype of Interest’. For example, yeast mitochondria are known to be unable to take up calcium; therefore, we enter S. cerevisiae in the textbox ‘Species WITHOUT the Phenotype of Interest’.


Next, users need to select the ‘Orthology Method’ of choice. Here, we select ‘One-way Best Hits’. Because we are searching for proteins involved in calcium uptake in mitochondria, we restrict the analysis to human proteins that are predicted to be localized in mitochondria based on common (AND) evidence of mitochondrial localization from TargetP, MitoProtII, LocTree and Uniprot. A Hamming distance of zero is set as default; therefore, only perfect matches will be retrieved. Now, click the ‘Search’ button.
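The difference between the AND and OR filter types can be sketched as set intersection versus set union. The sketch below assumes that each localization method contributes a set of protein IDs it predicts to be mitochondrial. The function name, the data layout and the protein IDs are hypothetical; only the four method names come from the text above.

```python
from typing import Dict, Iterable, Set


def filter_by_localization(
    evidence: Dict[str, Set[str]],
    methods: Iterable[str],
    mode: str = "AND",
) -> Set[str]:
    """Combine per-method sets of proteins predicted to be mitochondrial.

    AND keeps proteins supported by every selected method (intersection);
    OR keeps proteins supported by at least one selected method (union).
    """
    selected = [evidence[m] for m in methods]
    if not selected:
        return set()
    if mode == "AND":
        return set.intersection(*selected)
    if mode == "OR":
        return set.union(*selected)
    raise ValueError("mode must be 'AND' or 'OR'")


# Hypothetical predictions; P1..P5 are placeholder protein IDs.
toy_evidence = {
    "TargetP": {"P1", "P2", "P3"},
    "MitoProtII": {"P2", "P3"},
    "LocTree": {"P2", "P3", "P4"},
    "Uniprot": {"P3", "P5"},
}
all_methods = ["TargetP", "MitoProtII", "LocTree", "Uniprot"]
strict = filter_by_localization(toy_evidence, all_methods, mode="AND")
relaxed = filter_by_localization(toy_evidence, all_methods, mode="OR")
print(sorted(strict), sorted(relaxed))
```

The AND filter yields a smaller, higher-confidence candidate set, while OR trades specificity for coverage.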




OUTPUT:

ProtPhylo predicts a total of 35 human proteins that match the above criteria. All of the candidate proteins are equally likely to be associated with the phenotype of interest, as all have a Hamming distance of zero. Users can now plan targeted experiments to test whether any of these candidate proteins affects mitochondrial calcium uptake (the phenotype of interest). As we can see, the set of candidate proteins includes known components of the mitochondrial calcium uniporter (Micu1, Micu2, Micu3, Mcu, and Mcub).
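The matching logic behind such a run can be sketched as follows. This sketch assumes that the distance to a phenotype is counted only over the species entered by the user: one mismatch for every WITH species that lacks an ortholog and one for every WITHOUT species that has one. Whether ProtPhylo computes the distance in exactly this way is an assumption, and the function name and protein names are hypothetical. The NCBI tax IDs used are 9606 (Homo sapiens), 10090 (Mus musculus) and 4932 (Saccharomyces cerevisiae).

```python
from typing import Dict, List, Set


def phenotype_matches(
    presence: Dict[str, Set[int]],
    with_taxa: Set[int],
    without_taxa: Set[int],
    max_hd: int = 0,
) -> List[str]:
    """Return proteins whose orthologs track the phenotype across the entered species.

    presence maps each query-organism protein to the tax IDs in which an
    ortholog was found (assumed input format).
    """
    hits = []
    for protein, taxa in presence.items():
        missing = len(with_taxa - taxa)        # no ortholog where the phenotype exists
        unexpected = len(without_taxa & taxa)  # ortholog where the phenotype is absent
        if missing + unexpected <= max_hd:
            hits.append(protein)
    return sorted(hits)


# Hypothetical ortholog presence for three placeholder human proteins.
toy_presence = {
    "PROT_A": {9606, 10090},        # absent from yeast: perfect match
    "PROT_B": {9606, 10090, 4932},  # also present in yeast: one mismatch
    "PROT_C": {9606},               # missing from mouse: one mismatch
}
perfect = phenotype_matches(toy_presence, {9606, 10090}, {4932})
relaxed = phenotype_matches(toy_presence, {9606, 10090}, {4932}, max_hd=1)
print(perfect, relaxed)
```

With the default threshold of zero, only the protein that is present in every WITH species and absent from every WITHOUT species is retained, which mirrors the "perfect matches" behaviour described above.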



Note: We suggest that users first run ProtPhylo with each orthology method and then evaluate the ranking of known interacting or functionally associated proteins (if available) based on HD, HD percentile, and the reciprocal HD percentile. We also suggest that users who are not familiar with orthology inference methods consult relevant reviews on this topic that provide example decision trees for choosing an appropriate orthology detection tool, for example Kuzniar et al. (18).

REFERENCES:


  1. Li L., Stoeckert C.J. Jr., Roos D.S. (2003). OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Res. 13: 2178-2189.
  2. Powell S., Forslund K., Szklarczyk D., Trachana K., Roth A., Huerta-Cepas J., et al. (2014). eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Res. 42, D231–D239.
  3. Van Dongen, S. (2000). “Graph clustering by flow simulation.” Ph.D thesis, University of Utrecht, The Netherlands.
  4. Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO (1999) Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc Natl Acad Sci U S A 96: 4285–4288
  5. Emanuelsson O, Brunak S, von Heijne G, Nielsen H (2007). Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc 2:953–971.
  6. M.G. Claros, P. Vincens (1996). Computational method to predict mitochondrially imported proteins and their targeting sequences. Eur. J. Biochem. 241, 779-786.
  7. Goldberg T., Hecht M., Hamp T., Karl T., Yachdav G., Ahmed N., Altermann U., Angerer P., Ansorge S., Balasz K (2014). LocTree3 prediction of localization. Nucleic Acids Res. 42 (Web Server issue):W350-5
  8. Pagliarini, D.J., Calvo, S.E., Chang, B., Sheth, S.A., Vafai, S.B., Ong, S.E., Walford, G.A., Sugiana, C., Boneh, A., Chen, W.K., et al. (2008). A mitochondrial protein compendium elucidates complex I disease biology. Cell 134, 112-123.
  9. A. Krogh, B. Larsson, G. von Heijne, and E. L. L. Sonnhammer (2001). Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. Journal of Molecular Biology, 305(3):567-580.
  10. Finn, R.D., Bateman, A., Clements, J., Coggill, P., Eberhardt, R.Y., Eddy, S.R., Heger, A., Hetherington, K., Holm, L., Mistry, J. et al. (2014) Pfam: the protein families database. Nucleic acids research, 42, D222-230.
  11. Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J., Minguez, P., Bork, P., von Mering, C. et al. (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic acids research, 41, D808-815.
  12. Perocchi F, Gohil VM, Girgis HS, Bao XR, McCombs JE, Palmer AE, Mootha VK (2010). MICU1 encodes a mitochondrial EF hand protein required for Ca(2+) uptake. Nature. 467(7313):291-6.
  13. Baughman JM*, Perocchi F*, Girgis HS, Plovanich M, Belcher-Timme CA, Sancak Y, Bao XR, Strittmatter L, Goldberger O, Bogorad RL, Koteliansky V, Mootha VK (2011). Integrative genomics identifies MCU as an essential component of the mitochondrial calcium uniporter. Nature. 476(7360):341-5.
  14. Plovanich, M., Bogorad, R.L., Sancak, Y., Kamer, K.J., Strittmatter, L., Li, A.A., Girgis, H.S., Kuchimanchi, S., De Groot, J., Speciner, L. et al. (2013) MICU2, a paralog of MICU1, resides within the mitochondrial uniporter complex to regulate calcium handling. PloS one, 8, e55785
  15. Raffaello, A., De Stefani, D., Sabbadin, D., Teardo, E., Merli, G., Picard, A., Checchetto, V., Moro, S., Szabo, I. and Rizzuto, R. (2013) The mitochondrial calcium uniporter is a multimer that can include a dominant-negative pore-forming subunit. The EMBO journal, 32, 2362-2376
  16. Patron, M., Checchetto, V., Raffaello, A., Teardo, E., Vecellio Reane, D., Mantoan, M., Granatiero, V., Szabo, I., De Stefani, D. and Rizzuto, R. (2014) MICU1 and MICU2 finely tune the mitochondrial Ca2+ uniporter by exerting opposite effects on MCU activity. Molecular cell, 53, 726-737
  17. De Stefani, D., Raffaello, A., Teardo, E., Szabo, I. and Rizzuto, R. (2011) A forty-kilodalton protein of the inner membrane is the mitochondrial calcium uniporter. Nature, 476, 336-34
  18. Kuzniar, A., van Ham, R.C., Pongor, S. and Leunissen, J.A. (2008) The quest for orthologs: finding the corresponding gene across genomes. Trends in genetics : TIG, 24, 539-551