Navigator | Similarity | Statistics | Inputs | Outputs | Q&A |

Navigator

The GO Navigator tools draw a path between Gene Ontology (GO) terms. The output file is a tarball containing .png files that show the drawn paths. The Single Term tool draws a path from the root node to the node provided by the user. The Term Pairs tool draws the path between the two nodes in the pair provided by the user. When using these tools, first select the type of input (single input, multiple entries, or file upload). Then choose whether to draw the parents, the children, or both in the path.



Navigation with single GO term


This tool takes a single GO term, a file, or a user-entered list of GO terms. Here is an explanation of each field:

1
This selects whether to draw child nodes, parent nodes, or both.
2

This field determines how the user will input the data. 'GO Term Single Input' activates field #3, 'GO Term File Upload (Newline-delimited)' activates the file upload field #4, 'GO Term Multiple (Newline-delimited)' activates the text area #5.

3

This allows for single entry of a GO term in the form GO:xxxxxxx

4

This allows for entry of multiple GO terms using file input. * Each GO term must be on its own line (newline-delimited)! A sample of the file format, which also applies to the text area input, is available here: gotermsingle.txt

5

Text box for multiple GO term input. * Each GO term must be on its own line (newline-delimited)!
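
The newline-delimited format above can be checked before uploading. The following is a minimal sketch, not part of the service: the function name read_go_terms is hypothetical, and the pattern assumes that GO:xxxxxxx means "GO:" followed by exactly seven digits.

```python
import re

# Assumption: a GO term is "GO:" followed by seven digits, e.g. GO:0003674.
GO_ID = re.compile(r"GO:\d{7}")

def read_go_terms(text):
    """Return the GO terms found in newline-delimited text, one per line."""
    terms = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        term = line.strip()
        if not term:
            continue  # ignore blank lines
        if not GO_ID.fullmatch(term):
            raise ValueError(f"line {line_no}: not a GO term: {term!r}")
        terms.append(term)
    return terms

print(read_go_terms("GO:0003674\nGO:0005488\n"))
# prints ['GO:0003674', 'GO:0005488']
```

A term without the "GO:" prefix, or two terms on one line, raises ValueError with the offending line number.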






Navigation with GO term pair


This tool takes a pair of GO terms in the form GO:xxxxxxx, a tab-delimited file, or a user-entered list of GO term pairs. Here is an explanation of each field:

1
This selects whether to draw child nodes, parent nodes, or both.
2

This field determines how the user will input the data. 'GO Term Single Input' activates field #3, 'GO Term File Upload (Tab-Delimited)' activates the file upload field #4, 'GO Term Multiple (Tab-Delimited)' activates the text area #5.

3

This allows for the entry of a pair of GO terms in the form GO:xxxxxxx. * Both must be filled out!

4

This allows for entry of multiple GO term pairs using file input. * Use a TAB to separate the two GO terms in a pair! A sample of the file format for this tool is available here: gotermpairs.txt

5

Text box for multiple GO term pair input. * Use a single space to separate the two GO terms in a pair!
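
Because the file upload and the text box use different delimiters, it can help to check pair input against the delimiter that the chosen field expects. This is a minimal sketch under the same assumption as the form description (GO:xxxxxxx is "GO:" plus seven digits); parse_go_pairs is a hypothetical helper, not a function of the service.

```python
import re

GO_ID = re.compile(r"GO:\d{7}")

def parse_go_pairs(text, delimiter):
    """Parse one GO term pair per line, split on the given delimiter."""
    pairs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.strip().split(delimiter)
        if len(fields) != 2 or not all(GO_ID.fullmatch(f) for f in fields):
            raise ValueError(f"line {line_no}: expected two GO terms: {line!r}")
        pairs.append((fields[0], fields[1]))
    return pairs

# File upload (field #4) uses a TAB; the text box (field #5) uses one space.
file_pairs = parse_go_pairs("GO:0003674\tGO:0005488\n", "\t")
box_pairs = parse_go_pairs("GO:0003674 GO:0005488\n", " ")
```

Passing space-delimited text with the TAB delimiter (or the reverse) raises ValueError, which is the mistake this check is meant to catch.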



What should I do if I encounter an error with a "child node"?
This means that the given GO term is a leaf node and has no children, so its children cannot be drawn. It is also possible that the term was introduced recently and our database does not contain it yet. Please make sure that the terms are neither leaf nodes nor newly introduced terms.






Similarity

Writing the Contents of Gene Products


This form helps users create the contents of gene products from UniProt IDs:

1
This activates the #4 'File Upload' option, which uploads a text file containing the list of gene product IDs
2
This activates the #5 'Text Area' option, where the list of gene product IDs can be typed directly
3
This filters the list of gene products to the selected predefined species only
4
File upload interface activated by choosing #1
5
Direct input text area activated by choosing #2





Similarity in Gene Ontologies


This form uses different algorithms to generate a text file that reports the similarity between two GO terms. An explanation of each field is found below:

1
This dropdown allows for the selection of the algorithm by name. Choose from Resnik, Lin, Gentleman, Ye, Jiang, or Schlicker.
Note:
Jiang's method requires field #4 to be filled out!
  • Gentleman, R. (2010). "Visualizing and Distances Using GO." from http://www.bioconductor.org/packages/release/bioc/vignettes/GOstats/inst/doc/GOvis.pdf.
  • Jiang, J. J. and D. W. Conrath (1997). Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy. International Conference Research on Computational Linguistics (ROCLING X).
  • Lin, D. (1998). An Information-Theoretic Definition of Similarity. In Proceedings of the 15th International Conference on Machine Learning, Morgan Kaufmann.
  • Resnik, P. (1995). "Using Information Content to Evaluate Semantic Similarity in a Taxonomy." In Proceedings of the 14th International Joint Conference on Artificial Intelligence: 448-453.
  • Schlicker, A., F. S. Domingues, et al. (2006). "A new measure for functional similarity of gene products based on Gene Ontology." BMC Bioinformatics 7: 302.
  • Ye, P., B. D. Peyser, et al. (2005). "Gene function prediction from congruent synthetic lethal interactions in yeast." Mol Syst Biol 1: 2005 0026.
2

This field determines how the user will input the data. 'GO Term Single Input' activates field #3, 'GO Term File Upload (Tab-Delimited)' activates the file upload field #5, 'GO Term Multiple (a GO pair per a line)' activates the text area #6.

3

This allows for the entry of a pair of GO terms in the form GO:xxxxxxx. * Both must be filled out!

4

Only filled out if using Jiang's Method. Otherwise, input is disabled and ignored.

5

This allows for entry of multiple GO term pairs using file input. * Use a TAB to separate the two GO terms in a pair! A sample of the file format for this tool is available here: gotermpairs.txt

6

Text box for multiple GO term pair input. * Use a single space to separate the two GO terms in a pair!
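
For orientation, three of the listed measures can be written in terms of information content (IC), following the published definitions of Resnik (1995), Lin (1998), and Jiang and Conrath (1997). The sketch below is illustrative only: the term names, the ancestor sets, and the probabilities are invented toy data, and nothing here describes how this service computes its scores or what the extra field for Jiang's method contains. Jiang and Conrath define a distance, so it is shown as a distance.

```python
import math

# Toy data invented for illustration; these are not real GO terms or counts.
ancestors = {  # each term maps to itself plus all of its ancestors
    "root": {"root"},
    "A": {"A", "root"},
    "B": {"B", "A", "root"},
    "C": {"C", "A", "root"},
}
prob = {"root": 1.0, "A": 0.5, "B": 0.125, "C": 0.25}  # annotation probability

def ic(term):
    """Information content: log2(1 / p(term))."""
    return math.log2(1 / prob[term])

def resnik(t1, t2):
    """IC of the most informative common ancestor."""
    return max(ic(t) for t in ancestors[t1] & ancestors[t2])

def lin(t1, t2):
    """2 * IC(common ancestor) / (IC(t1) + IC(t2)); undefined if both ICs are 0."""
    return 2 * resnik(t1, t2) / (ic(t1) + ic(t2))

def jiang_distance(t1, t2):
    """Jiang-Conrath distance: IC(t1) + IC(t2) - 2 * IC(common ancestor)."""
    return ic(t1) + ic(t2) - 2 * resnik(t1, t2)

print(resnik("B", "C"))          # prints 1.0
print(lin("B", "C"))             # prints 0.4
print(jiang_distance("B", "C"))  # prints 3.0
```

In this toy graph, B and C share the ancestor A (IC 1.0) and the root (IC 0.0), so A is the most informative common ancestor.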






Similarity in Gene Products


This tool uses gene products along with the GO similarity algorithms to produce an output text file of GO similarity. This form only takes file or text area input.

1
The first dropdown selects the GP similarity method. Choose from AveMax, HdfDist, AveNMS, AveMatch, or Match. The second dropdown selects the GO similarity algorithm by name. Choose from Resnik, Lin, Gentleman, Ye, Jiang, or Schlicker. Note: Jiang's method requires field #5 to be filled out! Also note that "AveMatch" and "Match" ignore the GO similarity method.
  • AveMatch : average number of matching GO terms between two gene products
  • Match : the score is 1 if the two gene products share a matching GO term, otherwise 0
  • AveMax : Wang, J. Z., Z. Du, et al. (2007). "A new method to measure the semantic similarity of GO terms." Bioinformatics 23(10): 1274-1281.
  • HdfDist : del Pozo, A., F. Pazos, et al. (2008). "Defining functional distances over gene ontology." BMC Bioinformatics 9: 50.
  • AveNMS : Lerman, G. and B. E. Shakhnovich (2007). "Defining functional distance using manifold embeddings of gene ontology annotations." Proc Natl Acad Sci U S A 104(27): 11334-11339.
2

This field determines how the user will input the data. 'File Input' activates the file upload field #3, 'Text Input (Space-Delimited)' activates the text input field #4.

3

This enables file upload. The example file for GP content: gps.txt. The example file for GP pairs: gppairs.txt

4

This takes input formatted like the sample files above.

5

Only filled out if using Jiang's Method. Otherwise, input is disabled and ignored.
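
The Match method described under field #1 is simple enough to sketch directly. The function below is a hypothetical illustration of that one-sentence definition, not the service's code. AveMatch is left out because the description above does not say how the average is normalised.

```python
def match_score(go_terms_a, go_terms_b):
    """Return 1 if the two gene products share at least one GO term, else 0."""
    return 1 if set(go_terms_a) & set(go_terms_b) else 0

# Two gene products described by their GO term lists.
gp_one = ["GO:0003674", "GO:0005488"]
gp_two = ["GO:0005488", "GO:0003824"]
gp_three = ["GO:0003824"]

print(match_score(gp_one, gp_two))    # prints 1
print(match_score(gp_one, gp_three))  # prints 0
```

Since the score depends only on shared GO terms, no GO similarity algorithm is involved, which is consistent with the note that Match ignores the second dropdown.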






Expected GO Term Pairs


This form generates a list of GO term pairs that share only the root node, GO:0003674. Enter the number of expected pairs and the tool will generate the pairs in tab-delimited format.

1
Enter the expected number of GO term pairs that share only the root node (GO:0003674)





Statistics

Number of Annotated Gene Products


This graph shows the number of gene products annotated in each gene ontology category. The graph is redrawn daily, and the actual number of gene products can be seen by moving the mouse pointer over a square in the graph.






Number of GO terms


This graph shows the number of gene ontology terms in each gene ontology category. The graph is redrawn daily, and the actual number of GO terms can be seen by moving the mouse pointer over a square in the graph.






Inputs

Direct Input Vs. File Upload


This service provides two types of input: direct input through the web browser and text file upload.
Each type requires its own format. The formats are described below:


1
For single input, both direct input and text file upload MUST contain one GO term per line, so a newline is required after each GO term. (i.e. Navigator > Single Term)

2
For pair input, direct input MUST separate the two terms with a single space, and each line MUST contain one GO term pair, so a newline is required between pairs. Text file upload uses the same format, except that a TAB separates the two GO terms within a pair. (i.e. Navigator > Term Pairs, Similarity > Similarity in Gene Ontologies)

3
For both direct input and file upload, the contents of a gene product (GP) consist of the GP name and its corresponding Gene Ontology (GO) molecular function terms. The GP name MUST start with the symbol '>' without any space. The corresponding GO terms follow the GP name. Each line MUST contain a single GO term or a single GP name, and lines MUST be separated by newlines. (i.e. Similarity > Contents of Gene Products)

4
Users can make any combination of gene products. The format of a query GP pair is the same as for GO pairs: direct input MUST separate the two GPs with a single space, and each line MUST contain one GP pair, so a newline is required between pairs. Text file upload uses the same format, except that a TAB separates the two GPs within a pair. (i.e. Similarity > Similarity in Gene Products)
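
The GP content format in item 3 can be read with a few lines of code. The sketch below is an assumption-laden illustration: parse_gp_content is a hypothetical name, and the GP names GP_ONE and GP_TWO are placeholders rather than real UniProt IDs.

```python
def parse_gp_content(text):
    """Map each gene product name to the list of GO terms that follow it."""
    contents = {}
    current = None  # name of the gene product currently being read
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            current = line[1:]  # drop the leading '>'
            contents.setdefault(current, [])
        elif current is None:
            raise ValueError(f"line {line_no}: GO term before any '>' name")
        else:
            contents[current].append(line)
    return contents

# Placeholder GP names; each '>' line starts a new gene product.
sample = ">GP_ONE\nGO:0003674\nGO:0005488\n>GP_TWO\nGO:0003824\n"
print(parse_gp_content(sample))
# prints {'GP_ONE': ['GO:0003674', 'GO:0005488'], 'GP_TWO': ['GO:0003824']}
```

A GO term that appears before the first '>' line has no gene product to belong to, so the sketch rejects it.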





Outputs

GO Navigator


The result of GO Navigator is compressed as tar.gz and can be decompressed as follows:
- Windows : use a decompression program (e.g. 7-zip)
- Linux : use the command "tar -xvzf download_file.tar.gz"
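
On any platform with Python installed, the archive can also be unpacked with the standard tarfile module. This is a minimal sketch; extract_results is a hypothetical helper, and the file and directory names in the usage comment are examples only.

```python
import tarfile

def extract_results(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the member names."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names

# Example usage (assumes download_file.tar.gz is in the current directory):
# for name in extract_results("download_file.tar.gz", "results"):
#     print(name)
```

The returned list makes it easy to see which .png files the archive contained.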


1
Single node drawn with all ancestor nodes: green marks the query GO term
2
Single node drawn with all descendant nodes: green marks the query GO term
3
Single node drawn with all ancestor and descendant nodes: green marks the query GO term
4
GO term pair drawn with all ancestor nodes: green marks the query GO terms and pink marks the common ancestors of the two GO terms





The list of outputs


Once the job is processed, the download page is shown. The results are compressed as tar.gz.

1
Clicking this button starts the download. The downloaded file is a tar.gz archive, so you need to decompress it.
- Windows : use a decompression program (e.g. 7-zip)
- Linux : use the command "tar -xvzf download_file.tar.gz"

2
After the file is decompressed, you can see four files, including:
- GP_AveMax_Resnik.txt : this file contains gene product similarity calculated by Resnik's method with AveMax.
- GO_unique_Resnik.txt : this file contains the unique GO similarities used for calculating gene product similarity.


3
Each button links to the decompressed version of a file. Clicking a button opens the contents in the web browser.





The contents of outputs


This section explains the contents of outputs

A
This file contains the results of gene product similarity:
- column #1 : list of gene product (GP) pairs given by the user.
- column #2 : GP similarity score based on directly annotated GPs in the GO database.
- column #3 : GP similarity score based on annotated GPs including all ancestor terms in the GO database.
- column #4 : GP similarity score based on directly annotated GPs in the UniProt database.
- column #5 : GP similarity score based on annotated GPs including all ancestor terms in the UniProt database.

B
This file contains the results of GO term similarity:
- column #1 : list of unique gene ontology (GO) pairs belonging to GPs.
- column #2 : GO term similarity score based on directly annotated GPs in the GO database.
- column #3 : GO term similarity score based on annotated GPs including all ancestor terms in the GO database.
- column #4 : GO term similarity score based on directly annotated GPs in the UniProt database.
- column #5 : GO term similarity score based on annotated GPs including all ancestor terms in the UniProt database.






Q&A


How should I format my files?


For drawing single GO term nodes:

Upload the GO terms with a line break between each GO term. Do not remove the "GO:" prefix from the GO Term. A sample file is provided here: gotermsingle.txt



For drawing GO term pair nodes or calculating GO term similarity:

Upload the GO term pairs delimited by tabs. There should be a tab between the GO terms in the pair. There should be a line break between each pair. Do not remove the "GO:" prefix from the GO terms. A sample file is provided here: gotermpairs.txt



For calculating GP similarity:

Upload the GP files using the format shown in the sample files:
gps.txt (GP Content)
gppairs.txt (GP Pairs)

