GATHER Frequently Asked Questions

  1. What does GATHER do?

    GATHER helps you understand the function of a group of genes, such as a cluster of co-regulated genes from microarrays. If you type (or, more easily, copy and paste) a list of genes into the text box, GATHER shows you the annotations that distinguish your genes from the other genes in the genome.

  2. How do I cite GATHER in a publication?

    Please cite:
    JT Chang and JR Nevins. "GATHER: A Systems Approach to Interpreting Genomic Signatures." Bioinformatics 22(23), 2006.

  3. What organisms/annotations are supported?

    Although human genes are completely supported, other organisms lack support for one or more types of annotations.

    Supported Annotations
      Human Mouse Rat Fly Worm Yeast
    Gene Ontology Yes Yes Yes Yes Yes Yes
    MEDLINE Words Yes Yes Yes Yes Yes
    MeSH Yes Yes Yes Yes Yes
    KEGG Pathway Yes Yes Yes
    Protein Binding Yes
    Literature Net Yes Yes Yes Yes Yes
    miRNA Yes Yes
    TRANSFAC Yes Yes
    Chromosome Yes Yes

    Supported Predictions
      Human Mouse Rat Fly Worm Yeast
    Include Homologs Yes Yes Yes Yes Yes Yes
    Infer from Network Yes Yes Yes Yes

    We hope to add support for more organisms and annotations in the future. If your favorite organism or annotation is missing, please let us know!

  4. What kind of gene names does GATHER accept?

    The genes should be separated by spaces, commas, semicolons, or newlines.
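
    If you prepare gene lists in a script before pasting or submitting them, the separator rule above can be mirrored on the client side. The following is a minimal sketch, not GATHER's own parser; the function name split_gene_list is hypothetical, and treating tabs as separators is an assumption that goes slightly beyond the rule stated above.

```python
import re


def split_gene_list(text):
    """Split raw input on spaces, commas, semicolons, or newlines."""
    # A run of separators counts as a single boundary, and empty tokens
    # are dropped. Note: \s also matches tabs, which is an assumption
    # beyond the separators listed in this FAQ.
    return [token for token in re.split(r"[\s,;]+", text) if token]


genes = split_gene_list("e2f1, e2f3;myc\nTP53  BRCA1")
print(genes)  # ['e2f1', 'e2f3', 'myc', 'TP53', 'BRCA1']
```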

    GATHER accepts Probe Set IDs from the following Affymetrix chips:
    Human: U133A, U133Av2, U133B, U133+2, U95A, U95Av2, Hu35KA/B/C/D, Hu6800.
    Mouse: U74Av2, U74Bv2, U74Cv2, 430 2.0, 430A 2.0, Mu11KA/B.
    Rat: U34A, 230A/B/2.0.
    Fly: Genome 2.0.

    GATHER accepts IDs from the following Operon chips: Human v3, v4; Mouse v3, v4; Drosophila V1.

    If your favorite database or microarray chip is not included in this list, please let us know. If it's a common ID that many people request, we'll try to add it.

  5. What is a Bayes factor?

    A Bayes factor is a measure of the strength of the evidence supporting an association of an annotation with your gene list. Higher Bayes factors indicate stronger evidence that the annotation is relevant to your genes.

    GATHER reports Bayes factors on a natural-log scale, so the reported value can be negative. A positive value indicates that the evidence supports the hypothesis that the annotation is more strongly associated with your list of genes than with other genes in the genome. A negative value indicates that the evidence suggests that the annotation is more strongly associated with other genes in the genome.

    It is hard to choose an appropriate cutoff for Bayes factors. A Bayes factor measures the amount of evidence supporting the annotation, so the cutoff depends on what you intend to do with the information. If you need to prove an association beyond a shadow of a doubt, you need a high Bayes factor cutoff. On the other hand, if you can accept annotations on a preponderance of the evidence, you can use a lower Bayes factor cutoff.

    Still want a cutoff? OK, we recommend 6, as that cutoff appears to balance false positives against false negatives (see Chang and Nevins, 2006).

  6. Do the p-values account for multiple hypothesis testing?

    Yes. They are calculated based on the probability of seeing a Bayes factor of a particular magnitude in a query. See the publication for more details.

  7. Why does the new webpage say the entry does not exist when I click on a TRANSFAC matrix?

    We did the analysis using TRANSFAC Professional v8.2. However, BIOBASE makes only the matrices from TRANSFAC v7.0 publicly available. If you would like more information about a v8.2 matrix, you will need to contact BIOBASE for access to their most recent database.

  8. How do I interpret the Gene Ontology annotations?

    A Gene Ontology annotation has the format:
    1. GO:0007049 [5]: cell cycle
    GO:0007049 is the unique identifier and "cell cycle" is the description for this GO annotation. The [5] indicates the depth of that annotation. GO terms are organized hierarchically, so deeper terms are refinements of their parents. For example, "cell cycle" at depth 5 is a refinement of "cellular physiological process," which is at depth 4.
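
    If you process these annotation lines in a script, the three parts can be pulled apart with a regular expression. This is a minimal sketch based only on the single format example shown above; the names GoAnnotation and parse_go_annotation are hypothetical, and the pattern assumes the leading list number (such as "1.") is optional.

```python
import re
from typing import NamedTuple


class GoAnnotation(NamedTuple):
    go_id: str        # unique identifier, e.g. "GO:0007049"
    depth: int        # depth of the term in the GO hierarchy
    description: str  # human-readable description


# Optional leading index ("1. "), then the ID, the bracketed depth, a colon,
# and the description. The pattern is inferred from the one example above.
GO_PATTERN = re.compile(r"^(?:\d+\.\s*)?(GO:\d+)\s*\[(\d+)\]:\s*(.+)$")


def parse_go_annotation(line):
    """Parse a line such as '1. GO:0007049 [5]: cell cycle'."""
    match = GO_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a GO annotation: {line!r}")
    go_id, depth, description = match.groups()
    return GoAnnotation(go_id, int(depth), description)


print(parse_go_annotation("1. GO:0007049 [5]: cell cycle"))
```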

  9. How do I save the results locally?

    Once you have entered a set of genes and chosen the desired type of annotations, click on the Download button at the bottom of the screen. This saves a tab-delimited text file containing a table of the results to your computer. You can open this file in Microsoft Excel or a text editor such as TextPad.

    The resultant table contains the following columns:
    #
      An index to number the annotations.
    Annotation
      The name of the annotation.
    Total Genes With Ann
      The number of genes from your list that have the annotation. If the Include Homologs inference is used, then this number will also include the homologous genes with the annotation from other organisms.
    Your Genes (With Ann)
      The number of genes from your list with the annotation.
    Your Genes (No Ann)
      The number of genes from your list without the annotation.
    Genome (With Ann)
      The number of genes in the genome (excluding those in your list) with the annotation.
    Genome (No Ann)
      The number of genes in the genome (excluding those in your list) without the annotation.
    ln(Bayes factor)
      The natural logarithm of the Bayes factor, which quantifies the amount of evidence supporting the hypothesis that the annotation is associated with your gene list. This is the same number that is shown on the website, but here it is not rounded, so it contains more significant digits.
    neg ln(p value)
      The negative natural logarithm of the p value calculated from the Bayes factor (see Supplementary materials for the publication). The website shows the actual p values, but here, they are reported as logarithms for a more compact representation.
    FE: neg ln(p value)
      The negative natural logarithm of the p value calculated using a Fisher's exact test.
    FE: neg ln(FDR)
      The negative natural logarithm of the false discovery rate based on the Fisher's exact p value.
    Genes
      The symbols of the genes that have the annotation. If the Include Homologs inference is used, the homologous genes that have the annotation will also appear, but with a :H suffix. Similarly, if the Infer from Network inference is used, the genes that were included based on the network inference, and also have the annotation, will have a :N suffix.
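
    A downloaded table can also be read in a script. The sketch below assumes that the file starts with a header row whose column names match the list above exactly, and that the suggested cutoff of 6 applies to the ln(Bayes factor) column; neither assumption has been checked against a real download. The function names and the sample rows are hypothetical and for illustration only.

```python
import csv
import io
import math


def read_gather_table(handle):
    """Read a tab-delimited GATHER download into a list of dicts.

    Assumes the first row is a header with the column names listed in
    this FAQ (an assumption, not verified against a real file).
    """
    return list(csv.DictReader(handle, delimiter="\t"))


def significant_rows(rows, ln_bf_cutoff=6.0):
    """Keep annotations whose ln(Bayes factor) meets the cutoff."""
    kept = []
    for row in rows:
        ln_bf = float(row["ln(Bayes factor)"])
        if ln_bf >= ln_bf_cutoff:
            # The file stores -ln(p), so p = exp(-value).
            p_value = math.exp(-float(row["neg ln(p value)"]))
            kept.append((row["Annotation"], ln_bf, p_value))
    return kept


# Made-up sample with only three of the columns; values are not real results.
SAMPLE = (
    "Annotation\tln(Bayes factor)\tneg ln(p value)\n"
    "GO:0007049 [5]: cell cycle\t12.5\t9.0\n"
    "GO:0000000 [3]: placeholder term\t-1.25\t0.5\n"
)

for name, ln_bf, p_value in significant_rows(read_gather_table(io.StringIO(SAMPLE))):
    print(name, ln_bf, p_value)
```

    For a real file, pass open("your_download.txt", newline="") in place of the in-memory sample.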

  10. Why do I get more results when I download the annotations to a file?

    For speed, the webpage shows only the annotations whose ln(Bayes factor) is greater than 0. The downloaded report shows all the annotations associated with any of your input genes, regardless of Bayes factor.

  11. How do I query the server using a script?

    Use your favorite programming language (or wget) and send GET or POST requests to:
    http://gather.genome.duke.edu/?cmd=report&gene_box=<genes>&tax_id=<organism>&annot_type=<annot_type>&network=<network>&homologs=<homologs>

    For example:
    http://gather.genome.duke.edu/?cmd=report&gene_box=e2f1+e2f3+myc&tax_id=9606&annot_type=gene_ontology&network=0&homologs=1

    The server will return the results as a tab-delimited text file. Please add a 2-second delay between queries so that the server is not saturated with requests. We monitor for excessive use and will throttle or cut off a connection if it is degrading the quality of service for others. If you need to run a large batch of queries, contact us.
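
    As one way to script such queries, here is a minimal Python sketch that builds the request URL from the parameters shown above and pauses 2 seconds between requests. The function names are hypothetical, the default parameter values are simply those of the example query, and gene_ontology is the only annot_type value this FAQ shows. The response is assumed to be decodable as UTF-8 text.

```python
import time
import urllib.parse
import urllib.request

BASE_URL = "http://gather.genome.duke.edu/"


def build_query_url(genes, tax_id=9606, annot_type="gene_ontology",
                    network=0, homologs=1):
    """Build a report URL; defaults mirror the example query in this FAQ."""
    params = {
        "cmd": "report",
        "gene_box": " ".join(genes),  # urlencode turns spaces into '+'
        "tax_id": tax_id,
        "annot_type": annot_type,
        "network": network,
        "homologs": homologs,
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_reports(gene_lists, delay_seconds=2.0, fetch=None, sleep=time.sleep):
    """Fetch one report per gene list, pausing between requests.

    `fetch` and `sleep` can be swapped out (for example, in tests) so that
    no network traffic is generated.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return response.read().decode("utf-8", errors="replace")
    reports = []
    for index, genes in enumerate(gene_lists):
        if index > 0:
            sleep(delay_seconds)  # be polite: wait before every later query
        reports.append(fetch(build_query_url(genes)))
    return reports


print(build_query_url(["e2f1", "e2f3", "myc"]))
```

    Calling fetch_reports([["e2f1", "e2f3", "myc"], ["myc"]]) would send two GET requests 2 seconds apart and return the two tab-delimited reports as strings.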

  12. Nothing happens after I type in some genes (or other odd things are happening).

    Please make sure that JavaScript is turned on in your web browser. We have tested GATHER with Internet Explorer (versions 6.0 and 7.0), Firefox (1.5, 2.0), and Safari (2.0). There seem to be problems with Firefox 1.5 beta 1. Other browsers may work as well, but we have not tested them. If JavaScript is enabled and the webpage still doesn't work, please write to us at the address below.

  13. Something is not working. How do I report this?

    Please send an email to Jeffrey Chang at jeffrey.chang@duke.edu that includes the following information:
    1. A detailed but concise description of the problem.
    2. The list of genes that caused the problem.
    3. The name of your operating system, including its version (e.g. Windows 2000).
    4. The name of your browser, including its version (e.g. Firefox 1.0.4).
    5. Whether JavaScript and/or cookies are enabled.

More questions? Write Jeffrey Chang at jeffrey.chang@duke.edu.