Network Query

The network query function allows you to query gene-gene associations based on cosine similarity scores. You can enter a list of genes and query the association network of these genes. Currently, we only support HGNC official IDs for genes. We provide an ID mapper function to map other gene names/synonyms to HGNC IDs. Optionally, you can include GO terms in the query to obtain a hypergeometric p-value matrix between the genes and the input GO terms.
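
As a rough illustration of what a cosine similarity score measures, the sketch below computes it for two numeric vectors. This page does not describe how GAIL builds the per-gene vectors, so the `profiles` dictionary and its numbers are made up for illustration only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical gene profiles (made-up numbers, not GAIL data).
profiles = {
    "BRCA1": [1.0, 0.0, 2.0],
    "TP53": [2.0, 0.0, 4.0],
    "EGFR": [0.0, 3.0, 0.0],
}

same_direction = cosine_similarity(profiles["BRCA1"], profiles["TP53"])
orthogonal = cosine_similarity(profiles["BRCA1"], profiles["EGFR"])
```

Vectors pointing in the same direction score close to 1, and vectors with no overlap score 0.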

You can enter a list of HGNC IDs/GO IDs into the box on the left and click on 'Search' to query results. You can click on 'Clear' to empty the box. You can enter a list of GO terms into the box below after clicking on 'Optional GO terms'.

Network Query Result

Optional GO terms

You can include GO terms in the network query. If you do, a p-value matrix between the queried genes and the GO terms is calculated. You can further use this matrix in bayesGO. See the Software page for details.

Use GO terms

If you do not know which GO terms you are interested in, you can still specify the number of GO terms to include in the p-value matrix, or use all the GO terms in the database. Note that using all GO terms is not recommended because the computation will be very slow.

View Matrix

You can view the resulting gene-gene cosine similarity matrix by clicking on the 'View' button and choosing 'Network Matrix'. If additional GO terms are queried, you can also view the hypergeometric p-value matrix between genes and GO terms.

Download Graph and Matrix

You can download the resulting network graph as well as the gene-gene cosine similarity matrix by clicking on the 'Download' button and choosing the corresponding option. If additional GO terms are queried, you can also download the hypergeometric p-value matrix between genes and GO terms. The downloaded graph is in '.png' format and the downloaded matrix is in '.csv' format.
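
Once downloaded, a '.csv' matrix can be loaded with a few lines of Python. The sketch below assumes a layout in which the first row holds column labels and the first cell of every later row holds the row label. This layout is an assumption, so check it against your own file. The sample content is made up.

```python
import csv
import io

def read_matrix(handle):
    """Parse a labeled CSV matrix into (row_labels, col_labels, values)."""
    reader = csv.reader(handle)
    header = next(reader)
    col_labels = header[1:]  # the first header cell is the corner cell
    row_labels, values = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        row_labels.append(row[0])
        values.append([float(x) for x in row[1:]])
    return row_labels, col_labels, values

# Made-up sample standing in for a downloaded file (assumed layout).
sample = io.StringIO(",BRCA1,TP53\nBRCA1,1.0,0.42\nTP53,0.42,1.0\n")
rows, cols, matrix = read_matrix(sample)
```

For a real file, pass `open(path, newline="")` instead of the in-memory sample.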

Community Detection

We provide a function called 'Community Detection' to detect clusters in the gene-gene network. You can perform community detection by clicking on the 'Community Detection' button.

A tab in the 'Network Analysis' section will show up when the detection finishes. You can change the color of the cluster or check the averaged p-value between each GO term and the cluster of genes.

Global GO Analysis

'Global GO Analysis' calculates the averaged p-value of a given gene with all GO terms.
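
The averaged p-values mentioned for Community Detection and Global GO Analysis can be read as plain arithmetic means over a p-value matrix. The sketch below makes that assumption. The exact averaging used by the website is not described here, and the gene and GO term labels are placeholders.

```python
from statistics import mean

# Placeholder p-values: gene -> GO term -> p-value (made-up numbers).
p_values = {
    "GENE1": {"GO:A": 0.01, "GO:B": 0.50},
    "GENE2": {"GO:A": 0.03, "GO:B": 0.70},
}

def cluster_average(p_values, cluster):
    """Mean p-value of each GO term over the genes in one cluster."""
    terms = p_values[cluster[0]].keys()
    return {t: mean(p_values[g][t] for g in cluster) for t in terms}

def gene_average(p_values, gene):
    """Mean p-value of one gene over all GO terms."""
    return mean(p_values[gene].values())

per_term = cluster_average(p_values, ["GENE1", "GENE2"])
per_gene = gene_average(p_values, "GENE1")
```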

Network Customization

You can customize the network by changing the options under the 'Customize the Network' tab.

The details of all customizations are listed below:

Customization | Description
Edge Display | Two methods are available for displaying the network. The 'Cosine Similarity' method constructs the network from the computed cosine similarity scores: you specify a cut-off value, and two genes are connected if their similarity score is above the cut-off. The other method limits the maximum number of associations (edges) allowed for each gene. The associations are picked ... For example, the default value of 2 restricts each gene to at most 2 associations with other genes.
Find a Gene | You can search for a specific gene in the input box. The camera zooms in when you select a gene.
Network Layout | You can update the network layout by changing the distance between nodes. This is useful after you have performed community detection, because genes within a cluster are placed near each other. You can also restore the original network by clicking on 'Zoom Back'.
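
The two edge display methods can be sketched as follows. The cut-off rule follows the description above. The rule used to pick edges under the per-gene limit is not specified on this page, so the greedy "most similar pairs first" rule below is an assumption, and the scores are made up.

```python
def edges_by_cutoff(similarity, cutoff):
    """Keep every gene pair whose similarity is above the cut-off."""
    return sorted(pair for pair, score in similarity.items() if score > cutoff)

def edges_by_max_degree(similarity, max_edges=2):
    """Greedy sketch (assumed rule): visit pairs from most to least similar
    and keep a pair only while both genes have fewer than max_edges edges."""
    degree = {}
    kept = []
    ranked = sorted(similarity.items(), key=lambda item: item[1], reverse=True)
    for (a, b), _score in ranked:
        if degree.get(a, 0) < max_edges and degree.get(b, 0) < max_edges:
            kept.append((a, b))
            degree[a] = degree.get(a, 0) + 1
            degree[b] = degree.get(b, 0) + 1
    return kept

# Made-up similarity scores between placeholder genes A-D.
similarity = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("A", "D"): 0.7,
    ("B", "C"): 0.2,
}

above_half = edges_by_cutoff(similarity, 0.5)
at_most_two = edges_by_max_degree(similarity, max_edges=2)
```

With the limit of 2, gene A keeps only its two strongest associations, so the pair (A, D) is dropped even though its score is higher than that of (B, C).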

Network Info

You can find information about genes and gene-gene associations by clicking on a node or an edge. When you click on a gene node, the name and symbol of that gene are displayed. In addition, you can go to GeneCard to find more information about the gene by clicking on 'Link to GeneCard'.

When you click on a gene-gene association, symbols of two gene nodes and the cosine similarity between two genes will display. You can also compare the two genes by clicking on 'Compare Nodes'. This will direct you to a table that lists hypergeometric p-values between these two genes and all GO terms. You can download the corresponding matrix of the comparison.

Data Table

You can find the cosine similarity between each pair of genes in the data table tab. Note that the data are the same as in the cosine similarity matrix. Only the display format differs, so that you can sort by gene name or by similarity score.
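
The relationship between the matrix and the data table can be sketched as a reshaping step: each pair above the diagonal of the symmetric matrix becomes one table row. The gene labels and scores below are made up, and the website's own implementation may differ.

```python
def pair_table(genes, matrix):
    """List (gene_a, gene_b, score) once per unordered pair of genes."""
    rows = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            rows.append((genes[i], genes[j], matrix[i][j]))
    return rows

genes = ["A", "B", "C"]  # placeholder gene labels
matrix = [
    [1.0, 0.3, 0.8],
    [0.3, 1.0, 0.5],
    [0.8, 0.5, 1.0],
]

table = pair_table(genes, matrix)
# Order by similarity score, highest first.
by_score = sorted(table, key=lambda row: row[2], reverse=True)
```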

Search Again

You can return to the search page and start over by clicking on 'Search Again'.

Check Documentation

If you forget the above information once you are on the results page, do not worry: you can still check the documentation by clicking on the book icon.

Gene/GO Query

The query functions allow you to query hypergeometric test p-values between genes and GO terms. You can enter a list of genes/GO terms (one per line). Currently, we only support HGNC official IDs for genes and GO official IDs for GO terms. We provide an ID mapper function to map other gene names/synonyms to HGNC IDs.

You can enter a list of HGNC IDs/GO IDs into the box on the left and click on 'Search' to query results. You can click on 'Clear' to empty the box.

Gene/GO Query Result

Check Genes/GO terms

You can check each retrieved gene/GO term by clicking on the corresponding tab on the left. The page on the right displays detailed information about the selected gene/GO term. This information includes the hypergeometric p-values between the selected gene/GO term and all GO terms/genes that occur with it in the same PubMed abstract.
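
A hypergeometric co-occurrence test can be sketched as below. The setup is an assumption, since this page does not spell out the exact test: out of `total` abstracts, `with_gene` mention the gene, `with_term` mention the GO term, and `with_both` mention both. The p-value is the probability of seeing at least `with_both` co-occurrences by chance. All parameter names are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(total, with_gene, with_term, with_both):
    """P(X >= with_both) when with_term abstracts are drawn without
    replacement from total abstracts, with_gene of which mention the gene."""
    upper = min(with_gene, with_term)
    denom = comb(total, with_term)
    numer = sum(
        comb(with_gene, k) * comb(total - with_gene, with_term - k)
        for k in range(with_both, upper + 1)
    )
    return numer / denom

# Toy counts for illustration only.
p = hypergeom_upper_tail(total=10, with_gene=4, with_term=3, with_both=2)
```

Exact binomial coefficients are fine for toy counts like these. For counts on the scale of a full literature database, a statistics library would be a more practical choice.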

Hypergeometric p-values and Bonferroni-corrected p-values

The table contains both the hypergeometric p-value between the gene and the GO term and the Bonferroni-corrected p-value. A p-value shown in green is significant, that is, it passes the Bonferroni cutoff of 0.05.
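
The standard Bonferroni correction multiplies each p-value by the number of tests and caps the result at 1. The sketch below applies it with the 0.05 cutoff mentioned above. Which set of tests the website counts is not stated here, so taking the length of the input list as the number of tests is an assumption.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (raw p, corrected p, significant) for each test."""
    m = len(p_values)  # assumed number of tests
    results = []
    for p in p_values:
        corrected = min(1.0, p * m)
        results.append((p, corrected, corrected < alpha))
    return results

# Made-up p-values for illustration.
rows = bonferroni([0.001, 0.02, 0.5])
significant = [flag for _raw, _corrected, flag in rows]
```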

Download Matrix

You can download the hypergeometric p-values for the selected gene/GO term by clicking on the 'Download' button.

Link to GeneCard

Each gene can be linked to GeneCard, an integrated database for genes. You can explore more details about the specific gene in GeneCard. In 'Gene Query', the external link appears under the name header on each gene page. In 'GO Query', the external link can be reached by clicking on each term in the table.

Link to GO Consortium

Each GO term can be linked to the GO Consortium, the official website for Gene Ontology. You can explore more details about the specific GO term on the GO Consortium website. In 'GO Query', the external link appears under the name header on each GO page. In 'Gene Query', the external link can be reached by clicking on each term in the table.

Search Again

You can return to the search page and start over by clicking on 'Search Again'.

No Match

If there is no match for a queried ID, the ID will show up in the red box. You can click 'x' at the top right to close the box.

Check Documentation

If you forget the above information once you are on the results page, do not worry: you can still check the documentation by clicking on the book icon.

ID Mapper Query

Currently, our query functions only support HGNC IDs. The ID Mapper takes a list of gene names/synonyms and returns the mapped HGNC IDs. You can click on 'Clear' to clear the box. The box on the right side contains similar documentation for the ID Mapper.

You can enter a list of gene names/synonyms into the box on the left and click on 'Search' to search mappings. Various types of input are supported, including gene names/synonyms, Ensembl ID and NCBI Accession Number.

Supported Input | Example of Input | Example of Output
Gene Names | breast cancer 1 | BRCA1
Gene Synonyms | BRCC1 | BRCA1
Ensembl ID | ENSG00000012048 | BRCA1
NCBI Accession | NM_007294 | BRCA1
Previous Names/Synonyms | PNCA4 | BRCA1

ID Mapper Results

Copy Selected

You can copy mapped HGNC IDs into your clipboard and use them in the query functions. First, you need to select the ones you want to copy. All unique matched terms are selected by default. Next, you need to click on the 'Copy Selected' button to copy selected terms into your clipboard. Note, if multiple queried terms are mapped to the same HGNC ID, the function only copies one HGNC ID to your clipboard.
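
The de-duplication described above can be sketched in a few lines: several queried terms that map to the same HGNC ID yield a single copied entry. This is an illustration of the behavior, not the website's own code, and the `mapped` list is a made-up example.

```python
def unique_ids(mapped):
    """Drop repeated HGNC IDs while keeping first-seen order."""
    return list(dict.fromkeys(mapped))

# Hypothetical mapping results for four queried terms.
mapped = ["BRCA1", "BRCA1", "TP53", "BRCA1"]

# One ID per line, ready to paste into a query box.
clipboard_text = "\n".join(unique_ids(mapped))
```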

Search Again

You can return to the search page and start over by clicking on 'Search Again'.

No Match

If there is no match for a queried term, the term will show up in the red box. You can click 'x' at the top right to close the box.

Find Similar Term

For terms without matches, you can click on 'Search for Similar Terms'. The search function retrieves the top 5 matches in our database for each term. This can take a while if there are many unmatched terms.
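
A "top 5 similar terms" lookup can be approximated locally with Python's standard `difflib` module. This is a swapped-in technique for illustration only. The similarity measure used by the website is not described here, and the `known_terms` list is a made-up stand-in for the database.

```python
from difflib import get_close_matches

def similar_terms(unmatched, known_terms, limit=5):
    """Map each unmatched term to at most `limit` close matches."""
    return {
        term: get_close_matches(term, known_terms, n=limit, cutoff=0.6)
        for term in unmatched
    }

# Made-up stand-in for the list of known gene symbols.
known_terms = ["BRCA1", "BRCA2", "TP53"]
suggestions = similar_terms(["BRCA", "XYZ"], known_terms)
```

Terms with no sufficiently close candidate map to an empty list.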

Multiple Matches

A name/synonym may return multiple matches. Most duplicates result from previous gene synonyms. Some synonyms are annotated for two or more different genes and therefore return multiple matches. You can check the box on the right to select the match you want and copy it to your clipboard.

Check Documentation

If you forget the above information once you are on the results page, do not worry: you can still check the documentation by clicking on the book icon.

Software

BayesGO

bayesGO is a Bayesian hierarchical model that simultaneously identifies pathway-modulating genes based on the literature mining data and facilitates interpreting functions of these new genes using Gene Ontology terms. This approach allows rigorous inference of gene-gene relationships based on the literature mining data while the GAIL web interface allows users to implement dynamic and interactive exploratory analyses.

Download

The R package bayesGO implements this statistical model and provides a simple, user-friendly interface for its statistical inference. The 'P-Value Matrix' output from the GAIL web interface can be used as input for this software. Note that bayesGO requires JAGS. You can download JAGS and bayesGO through the following links:

Usage

After you download the 'P-Value Matrix' output from the GAIL web interface, you can load the file into R with the function read.csv(). The R function bayesGO() then fits the bayesGO model, taking these data as input. Next, the R function predict() clusters the genes and GO terms and identifies associations between genes and GO terms. Finally, the R function plot() visualizes the analysis results as a heatmap, in which the column and row side bars show the gene and GO term cluster indices and red colors indicate stronger association between genes and GO terms. Please see Yu et al. (2018) for the step-by-step analysis guideline and Chung et al. (2017) for more details about the statistical model.

Development

The data are integrated from HUGO, Ensembl, GenBank, UniProt and Gene Ontology (GO). All data are stored as graph structures in a Neo4j graph database. The web interface is developed using the Django framework.

Database Statistics

PubMed abstracts: 20,003,700
Genes: 41,138
GO terms: 45,000
GO-gene correlations: 5,052,454
Gene-gene correlations: 302,254,780