WormBase Userguide

User's Guide for WormBase


Frequently Asked Questions

The Sanger Centre maintains a help page for AceDB (the underlying database management system of WormBase). It includes instructions for the Genetic map, Physical map and Sequence map: http://www.acedb.org

The ACeDB software and the C. elegans database can both be obtained from the NCBI site. Go to ftp://ftp.ncbi.nlm.nih.gov/repository/acedb/ace4/ to download and install the acedb software, and to ftp://ftp.ncbi.nlm.nih.gov/repository/acedb/celegans/ to download and install the C. elegans data. Other software that drives WormBase can be downloaded from ftp://ftp.wormbase.org/pub/wormbase/software

General Questions

Query and Web Interface

How to find homologues?

Batch Download

Gene/Clone

Cell/Lineage


How do I find cell lineage pedigrees?

There are two kinds of pedigree display: the Cell pedigree tree (on the Cell Page) and the Lineage pedigree tree (in the Pedigree Browser). The Cell Page is simple and easy to use, with a full description of the cell layout, while the Pedigree Browser displays complete lineage pathways (from P0) with the cell(s) of interest highlighted.

Start from the Search box on the WormBase home page: select "Cell" from the pull-down menu and enter the cell name. A "cell summary" display will appear with a Cell pedigree display box showing three generations of cells. Your cell appears in red on the pedigree. You can move the pedigree tree up or down in the lineage by clicking on the parent cell or the daughter cells. Another way to reach a pedigree is through "Cell and Pedigree Search" (under the More Searches menu), which searches for specific cells, cell groups, or lineages.

Back to FAQ

What's the nomenclature for C. elegans cells?

There is a very good article explaining everything about embryonic cell lineage and nomenclature. It is:

Sulston JE et al. (1983) Dev Biol. "The embryonic cell lineage of the nematode C. elegans."

That article is the "dictionary" everyone refers to.

P0 is the zygote, the first cell of C. elegans after fertilization. The first few rounds of division produce six "founder cells": E, MS, AB, C, D and P4. Each of these founder cells generates different tissues. From then on, cells are named after these founder cells. For example, the daughters of AB are called ABa ('a' means anterior) and ABp ('p' means posterior). ABa will generate daughters ABal ('l' means left) and ABar ('r' means right)... If a cell divides dorsoventrally, 'd' or 'v' is added to the names of the daughters.

So when you see ABalppp, you know that it comes from:

P0->AB->ABa->ABal->ABalp->ABalpp->ABalppp

Not only will you see the lineage pathway from the cell name, you will also see in which direction cells have divided and what the sister cells are for each step of the division.
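
The naming rule above is mechanical enough to sketch in code. The following is a minimal illustration only, assuming that a lineage name is one of the six founder names followed by division letters (a, p, l, r, d, v); the function names split_name, lineage_path and sister are made up for this example and are not part of WormBase.

```python
# Founder cells named in the text above. Longer names come first so that
# prefix matching is unambiguous.
FOUNDERS = ("AB", "MS", "P4", "E", "C", "D")

# Division letters described in the text above.
AXES = {"a": "anterior", "p": "posterior", "l": "left",
        "r": "right", "d": "dorsal", "v": "ventral"}
OPPOSITE = {"a": "p", "p": "a", "l": "r", "r": "l", "d": "v", "v": "d"}


def split_name(name):
    """Split a lineage name into (founder, division letters).

    Dots, as in 'AB.plappapp', are ignored.
    """
    name = name.replace(".", "")
    for founder in FOUNDERS:
        if name.startswith(founder):
            suffix = name[len(founder):]
            if all(letter in AXES for letter in suffix):
                return founder, suffix
    raise ValueError(f"not a recognised lineage name: {name!r}")


def lineage_path(name):
    """Return every ancestor from the founder cell down to the cell itself."""
    founder, suffix = split_name(name)
    return [founder + suffix[:i] for i in range(len(suffix) + 1)]


def sister(name):
    """Return the sister cell: same mother, opposite final division letter."""
    founder, suffix = split_name(name)
    if not suffix:
        raise ValueError("a founder cell has no sister under this naming rule")
    return founder + suffix[:-1] + OPPOSITE[suffix[-1]]


if __name__ == "__main__":
    path = lineage_path("ABalppp")
    print("->".join(["P0"] + path))
    for cell in path[1:]:
        print(cell, "is the", AXES[cell[-1]], "daughter; its sister is", sister(cell))
```

Only the AB example from the text is shown with P0 prepended; the other founder cells arise from P0 through intermediate divisions that this sketch does not model.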

Back to FAQ

How can I know each C. elegans cell's function and exactly at which stage of the embryonic lineage it appears?

Most of the information you need for a cell should be in its Cell Report, which can be found through the "Cell and Pedigree" search. In WormBase, if you read the Tree Display of a Cell Report, there is a tag called "Embryo_division_time"; it is the time when the cell divides or dies. Unfortunately, for cells generated after hatching, there is no such information in WormBase.

Back to FAQ

What is the connection between the cell P0, and the cells P1, P2, P3, ..., P7, P8, etc?

There are two sets of P cells. One set arises from early embryonic divisions; its members are called P0, P1', P2', P3' ... in WormBase, and these are lineage names. The other set is called P1, P2, P3, ... These are postembryonic blast cells, which are not related to the embryonic founder cells.

P1, P2, P3 ... are adult names for postembryonic blast cells present from hatching until the middle of the first larval stage (L1). Many cells have two names: a lineage name and an adult name. An adult name is the name given to a cell that becomes terminal and differentiates (such as a neuron), or that does not differentiate but divides to produce an important lineage (such as the P1, P2 ... lineages). Adult names are assigned by cell position and function, so they form a different naming system. Cells with the same adult name can come from different lineages, depending on how bilateral symmetry is broken; for example, P7 can develop from either AB.plappapp or AB.prappapp.

Lineage names are accurate and unique but hard for most people to remember, so researchers usually use adult names when querying. That is why, in WormBase cell nomenclature, a cell is referred to by its adult name whenever one exists, and its lineage name is stored inside a data field.

Back to FAQ

How to retrieve nematode specific genes with no homology to yeast, fly, mouse, and human?

From the "advanced query" WormBase page, construct the following query:

find predicted_gene NOT Pep_homol

Back to FAQ

What is the meaning of the abbreviations "Protein SW", "Protein TR" and "Protein WP" used by WormBase? In addition, in the TR database the species of origin (e.g. C. elegans) is sometimes missing; how can I find it? Furthermore, how can I get from a TR database entry to the corresponding predicted gene in the C. elegans genome?

SW stands for Swiss-Prot, TR stands for TrEMBL and WP stands for WormPep. In case you're not familiar with any of these protein databases, you can go to http://www.expasy.org/sprot/ and http://www.sanger.ac.uk/Projects/C_elegans/wormpep/ for an explanation and access to them.

Inside Protein SW or Protein TR, you will find the Swiss-Prot or TrEMBL accession number. You can get all details of the protein (including the species of origin) by going to http://www.expasy.org/sprot/ and entering the accession number.

Back to FAQ

How to obtain all the abstracts in WormBase and the particular genes that they are associated with?

There are two ways:
1) Go to ftp://ftp.sanger.ac.uk/pub/wormbase/current_release/ and get the acedb data files.
2) Use AcePerl to get the abstracts. You can do this easily with an AcePerl script:
use strict;
use warnings;
use Ace;

# Replace 'yourhost.com' with the host of your own acedb server
my $db = Ace->connect(-host => 'yourhost.com') || die "Connection failure: ", Ace->error;
my $iterator = $db->fetch_many(-query => qq(find Paper where Abstract));
while (my $obj = $iterator->next) {
    # grab info from the object
    my @genes = $obj->Gene;
    # ... etc ...
    print join(' ', @genes), "\n";
}

--Todd Harris & Lincoln Stein

Back to FAQ

How to get all the cell types (neurons, actually) in which a gene is expressed?

When we curate a gene, we enter all the cells and cell groups that express the gene. This information can be viewed by clicking the "Details" link on the gene page.
For example, to look up eat-16, which is expressed in neurons:

1. At the WormBase home page, select "Any gene", search for "eat-16", and select "Exact Match". This takes you to the Gene Summary page for eat-16.
2. In the Function section, you will see "Anatomic Expression Pattern", with some information about the eat-16 expression pattern. At the very end of the entry is a "Details" link.
3. Click this link to reach the Expression Pattern page for eat-16, which shows the detailed cell and cell group information associated with eat-16.
(To keep annotation manageable, when a gene is expressed in many cells, we enter the cell group name instead of listing every cell name. Each cell group includes the list of its associated cells.)

Back to FAQ

How to find out how many genes contain expression patterns generated with a specific method (for example, in_situ)?

Type the following command into the "Advanced Search" box (under the More Searches menu). The line below searches for Expr_pattern objects that have all three types of method. If each '&' is replaced by '|', the command will search for Expr_pattern objects with In_situ OR Antibody OR Reporter_gene data.

find Expr_pattern Type="In_situ" & Type="Antibody" & Type="Reporter_gene"

You may change the words following the same syntax to search for other objects.

Back to FAQ

What do the colored bars for briggsae alignments mean?

Dark blue bars are regions of strong similarity. Light blue bars are regions of weak similarity. Dashed areas don't match.

When there are multiple bars in the same region, it means that there are several C. briggsae clones that all match the region.

--Lincoln Stein

Back to FAQ

Whom can I contact about getting a cDNA clone?

Please go to NextDB (http://nematode.lab.nig.ac.jp/db/index.html), Yuji Kohara's cDNA database repository. You can obtain cDNA clones from Yuji Kohara at the National Institute of Genetics, Mishima, Japan: ykohara@LAB.nig.ac.jp

Back to FAQ

How to download the alignments of EST sequences to genomic sequences?

You can extract them from the GFF files that we provide with every release of WormBase. For more information on GFF files see:

http://www.sanger.ac.uk/Software/formats/GFF/

Basically, we release one GFF file per chromosome, and this contains the coordinates and details of most features that we can map onto chromosome base pair coordinates.

These files are accessible from the main WormBase page (see the Feature table links on the right) and should also be on the WormBase and Sanger Institute FTP sites.

You will need to extract only a subset of these files, i.e. lines that match the pattern 'BLAT_EST_'. This is very easy to do if you have access to a UNIX/Linux system (use the 'grep' command).

E.g. here are two sample lines from the Chromosome II file (these will probably wrap around your screen):

CHROMOSOME_II BLAT_EST_BEST similarity 5754433 5755008 100 . . Target "Sequence:yk776e12.5" 21 596
CHROMOSOME_II BLAT_EST_OTHER similarity 5755968 5755971 98.4 . . Target "Sequence:yk4g4.5" 116 119

Within these lines are details of the chromosome coordinates, the BLAT score, the matching sequence name, and the coordinates within the matching sequence.
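
If you prefer a script to grep, the same extraction can be sketched in Python. This is a minimal example only, assuming the files are tab-delimited with nine columns (as the GFF format specifies) and that the last column carries the Target attribute in the form shown in the sample lines. The names EstAlignment and est_alignments are made up for this illustration.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Matches: Target "Sequence:yk776e12.5" 21 596
TARGET_RE = re.compile(r'Target\s+"(?:Sequence:)?([^"]+)"\s+(\d+)\s+(\d+)')

# The two sample lines from the text, written with explicit tab separators.
SAMPLE = (
    'CHROMOSOME_II\tBLAT_EST_BEST\tsimilarity\t5754433\t5755008\t100\t.\t.\t'
    'Target "Sequence:yk776e12.5" 21 596\n'
    'CHROMOSOME_II\tBLAT_EST_OTHER\tsimilarity\t5755968\t5755971\t98.4\t.\t.\t'
    'Target "Sequence:yk4g4.5" 116 119\n'
)


class EstAlignment(NamedTuple):
    chromosome: str
    source: str
    start: int
    end: int
    score: float
    est_name: str
    est_start: int
    est_end: int


def est_alignments(lines: Iterable[str]) -> Iterator[EstAlignment]:
    """Yield one record per GFF line whose source starts with 'BLAT_EST_'."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or not fields[1].startswith("BLAT_EST_"):
            continue
        match = TARGET_RE.search(fields[8])
        if match is None:
            continue  # attribute column is not in the assumed form
        yield EstAlignment(
            chromosome=fields[0],
            source=fields[1],
            start=int(fields[3]),
            end=int(fields[4]),
            score=float(fields[5]),
            est_name=match.group(1),
            est_start=int(match.group(2)),
            est_end=int(match.group(3)),
        )


if __name__ == "__main__":
    for hit in est_alignments(SAMPLE.splitlines()):
        print(hit)
```

To run it on a real file, pass an open file handle for one of the chromosome GFF files to est_alignments instead of the sample lines.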

--Keith Bradnam

Back to FAQ

Is there a file showing the lineage map of the worm?

Leon Avery has something like that on the Web: http://elegans.swmed.edu/parts/

--Raymond Lee

Back to FAQ

How to cite WormBase?

See "About WormBase" link from the WormBase home page.

Back to FAQ

How to register a new lab?

New lab and allele designations should be registered directly with Jonathan Hodgkin (jah@bioch.ox.ac.uk) of the CGC.

Back to FAQ

What's the difference between the sequence displayed in "Sequence Report" and that in "Genome Browser"?

(For example: Sequence F35E8) The coordinates given in the Sequence Report under 'Genomic Location' are for the sequence F35E8, which is not the full sequence of the clone F35E8. The clone is represented under the diagram of the sequence features, with an arrow pointing off the left end to indicate that the clone extends to the left.

When you click through to the Genome Browser, you're seeing the sequence of F35E8, with the clone again represented under the diagram of sequence features with an arrow pointing left. You have to zoom out to see the full extent of the clone F35E8.

--John Spieth

Back to FAQ

What's the easiest way for users to find the 'TRUE' ends and thus insert size of a clone?

The set of clone ends is dumped as part of the gff files:

http://www.sanger.ac.uk/Projects/C_elegans/WORMBASE/GFF_files.shtml

This is the source for the extents displayed in WormBase.

The caveat with this is that the 'true' end is not marked up for all clones. The early cosmids do not have such annotations because nobody thought about marking them up. Later cosmids do have clone left and right ends as this became part of the standard procedure. Finally, many of the YACs do not have clone ends because the segment submitted to GenBank/EMBL is much smaller than the full clone, and hence the true ends lie within sequences already finished at that stage of the sequencing (i.e. we never went back to update clone ends in sequence already finished).

--Dan Lawson

Back to FAQ

How to FTP download the genomic DNA database and the EST database for C. elegans?

Our underlying database for WormBase is built on the acedb software (available freely from www.acedb.org). If you have acedb installed locally, you can download the entirety of our database from: ftp://ftp.sanger.ac.uk/pub/wormbase/current_release

However, a simpler approach may be to just download a GFF file and DNA file for each chromosome from: ftp://ftp.sanger.ac.uk/pub/wormbase/current_release/CHROMOSOMES/

These two sets of files contain information on all sequence features (coordinates of genomic clones, genes, BLAST hits etc.). EST and mRNA sequences can be downloaded from: ftp://ftp.sanger.ac.uk/pub/C.elegans_sequences/ESTS/C.elegans_nematode_ESTs.gz

--Keith Bradnam

Back to FAQ

How to retrieve Acedb timestamps from the command line as well as with Perl?

We use AcePerl to retrieve some timestamp information. This is done via an AQL query.

E.g. if you wanted to find the timestamp of a tag in a particular object, belonging to a particular class, you could do:
my $aql_query = "select s,s->$tag.node_session from s in object(\"$class\",\"$object\")";
my @aql = $db->aql($aql_query);
my $timestamp = $aql[0]->[1];

--Keith Bradnam

Back to FAQ

Where is the flat file for the gene annotation of each chromosome of C. elegans?

You should take a look at the Feature Tables (GFF), which you can pick up from the same 'WormBase Downloads' page where you found the "Summary Tables" (http://www.wormbase.org/downloads.html).

You should also look at the 'Batch Downloads' page at WormBase (http://www.wormbase.org/db/searches/info_dump), where you can build your own tables of gene annotations.

One other WormBase page you should look at is the "Genome Dumper" (http://www.wormbase.org/db/searches/advanced/dumper).

--John Spieth

Back to FAQ

Where can I find details of the methods used to create WormPep?

We make WormPep during each release of WormBase and the starting point is always a translation of our latest set of gene predictions. Gene predictions are initially based on the GeneFinder prediction program with human modification as is deemed necessary. The level of human involvement really depends on what other supporting data is available. Aside from routine inspection of gene predictions based on EST/mRNA data we also evaluate our predictions based on information from published papers and direct contact from the worm community. All gene predictions have been looked at by a human to some level.

We have started to distinguish subsets of WormPep. Thus all WormPep proteins can be thought of as either 'CONFIRMED', 'PARTIALLY CONFIRMED', or 'PREDICTED'. The first set contains all genes where there is transcript evidence for every base of every exon of the gene (note that this can still - in theory - mean that there are unpredicted exons in a 'CONFIRMED' gene). The second set contains genes for which there is some transcript evidence but the whole gene is not yet supported...either due to lack of transcript evidence or errors in our current gene prediction. The last set is everything else, i.e. genes with no transcript support. In the future we may expand this classification system to take account of other evidence (e.g. homology info from C. briggsae).

Each new build usually sees a slight increase in the first two categories and a drop in the third category. The status of each WormPep entry is included in its FASTA header in each WormPep release.

--Keith Bradnam

Back to FAQ

How to search for pseudogenes on the WormBase website?

AQL queries for pseudogenes take a long time. A faster way to retrieve the information from the WormBase website is as follows.

Go to More Searches -> Advanced Search at http://www.wormbase.org/db/searches/query

In Query Acedb, type in

find sequence *; pseudogene

You should get a result of pseudogene objects within a couple of seconds.

--Chao-Kung Chen

Back to FAQ

What technology does WormBase use for its database?

1) At the back end, WormBase data are deposited in an object-oriented database, ACeDB, which is the "master" database containing all data. ACeDB can be accessed both remotely and locally, through both the command line and a web server.

2) Some data (especially sequence data, including genomic sequence, ESTs, OSTs, SNPs, genes, RNAs etc.) are extracted from ACeDB and deposited in a "slave" MySQL database to support key features such as gbrowse (see below).

3) At the front end sits the Apache server with mod_perl. The WormBase software package, which contains configuration files and a series of CGI scripts, runs on the Apache server. The CGI scripts provide users with a number of ways to browse and search WormBase.

4) Some key features of the WormBase package:
i. gbrowse (http://www.wormbase.org/db/seq/gbrowse?source=wormbase):
developed by Lincoln Stein for the GMOD consortium and widely used for other model organisms. It allows users to browse the whole genome for feature tracks corresponding to specific genome regions. gbrowse is highly configurable and supports multiple languages.
ii. synteny browser (http://www.wormbase.org/db/seq/ebsyn?name=CBG22984):
also recently developed by Lincoln Stein for the GMOD consortium. It gives a side-by-side comparative view of two genomes, focusing on the syntenic regions.

--Jack Chen

Back to FAQ

What are the different types of clones in WormBase?

1. There are seven different types of "Clone" objects in WormBase:
Cosmid : cosmids (A*,B*,C*,D*,F*,J*,K*,M*,R*,T*,W*,Z*)
Fosmid : fosmids (H*)
YAC : yacs (Y*)
cDNA : Yuji et al (yk*)
Plasmid : PCR clones (V*,EGAP*)
Other Text : telo clones, 1 BAC
Most cosmids, fosmids, and YACs can be requested from Sanger, and cDNA clones (yk*) from Dr. Yuji Kohara. The EGAP* plasmids can be obtained from MRC Geneservice. The V* plasmids are no longer available.

--Raymond Lee

Back to FAQ

How to find allele information?

We do have information on many thousands of alleles in WormBase. We have also tried to extract the molecular details of the mutations (where known) and add those to WormBase. Some examples:

Go to a gene page: http://www.wormbase.org/db/gene/gene?name=unc-71;class=Locus
Then click on the link to the 'ay47' allele (near the bottom of the page). This takes you to:
http://www.wormbase.org/db/gene/allele?name=ay47;class=Allele
You can see that there is a 'c' to 't' substitution in this gene. If you go to the genome browser display for this gene:
http://www.wormbase.org/db/seq/gbrowse/wormbase?name=unc-71
Then turn on the 'SNPs, Knockouts, and other Alleles' track and you will see the positions of the alleles in this gene.

To find other alleles, you can go to the query page:
http://www.wormbase.org/db/searches/wb_query
and type the following queries (everything between the single quotes):
'Find Allele; Substitution'
'Find Allele; Deletion'
'Find Allele; Insertion'

--Keith Bradnam

Back to FAQ

How is a pseudogene determined?

Most pseudogenes in WormBase represent genes that exhibit homology with other functional genes but which have become inactive through accumulation of mutations (frame shifts and internal stop codons).

Other pseudogenes in WormBase may contain valid open reading frames and additionally show evidence of transcription (EST matches), but have been classified as a pseudogene on the basis of sequence comparison to other genes. E.g. a multiple sequence alignment might reveal a high degree of sequence conservation between different genes, but one of those genes may have a premature stop codon in comparison to all other family members.

Pseudogenes are usually annotated in this way on the basis of comments from experts working on that gene family. Alternatively, some tRNA genes are classified as pseudogenes on the basis of the tRNA-scan program which automatically classifies potential pseudogenic tRNAs.

It should be noted that all of these pseudogenes in WormBase are only classified as such in relation to the sequenced Bristol N2 strain that was used for the genome project. Orthologous genes in closely related species (and possibly in other C. elegans strains) may not be pseudogenes.

--Keith Bradnam

Back to FAQ

Where can I find a list of classes and subclasses for ACeDB?

You can find a list of Acedb classes by first clicking on the More Searches link on the upper right corner of the WormBase home page. From here, select the WormBase Class Browser, which will bring you to a searchable drop-down menu of all the Acedb classes.

For performing queries, it is helpful to know the data model for each of the classes that you would like to search. The data models can be accessed from this same page by typing the class of interest into the search box and then selecting "Model" from the drop-down menu. This will lead you to a Tree Display that diagrams how data for a particular class is represented in Acedb.

Also, from the More Searches link, you can access the Advanced AQL Search, which has further documentation and examples for querying the database.

--Kimberly Van Auken

Back to FAQ

How do I retrieve the gi numbers only for a list of entries having the GO term selected?

At present, you can retrieve Genbank identifiers (i.e. AAMxxxxx, AAKxxxxx, AAFxxxxx, etc.) for CDS's that are associated with a particular GO term by performing an AQL query. Here are the steps:
1) At the top right corner of the WormBase homepage, click on the More Searches link.
2) Under the general heading, select the Advanced AQL Search link.
3) Type the following query into the box:
select a, b, c[1] from a in class CDS,
b in a->go_term,
c in a->protein_id where b = "GO:0003700"
4) You should get back a three-column table listing each CDS, the GO term you selected, and a Genbank ID.

If you are interested, the rationale for the query can be better understood by looking at our data model for CDSs, which is at http://www.wormbase.org/db/misc/etree?name=%3FCDS&class=Model;expand=Visible#Visible. The above query searches the CDS class, taking the attribute go_term (restricted here to "GO:0003700") and the attribute protein_id, which holds the unique text ID that serves as the database identifier. The [1] after the letter c in the query indicates that the search retrieves the contents of column 1, the text column, of protein_id; the sequence column is column 0.

--Kimberly Van Auken

Back to FAQ

How to download C. briggsae 3' UTRs in bulk?

We don't really have a strictly empirical set of 3' UTRs (3' flanking sequences taken from cDNAs). However, I take it that what you really want are predicted 3' UTR regions. You can get those by going to the Genome Dumper:

http://wormbase.org/db/searches/advanced/dumper

selecting the species "C. briggsae", and then filling in the options for the download that you want. The main stumbling block is that the list you need for briggsae genomic sequences is rather long. However, I've tried typing this:

cb25.*

into the window "Type in a list of sequence or chromosome names", and that seems to successfully prompt the Genome Dumper to search through all available C. briggsae genomic contigs.

Also, you should pick the "Integrated ('hybrid') briggsae gene set", and select some reasonable value (e.g., 1000 bp) for the length of the 3' flank sequences that you want.

--Erich Schwarz

Back to FAQ

How to find the coding sequences of alleles for particular gene(s) having SNPs in C. elegans?

For instance, to find the SNP sequences for the H39E23.1a gene, you can use the following AQL query:
select a->predicted_gene, a, a->flanking_sequences[1],
a->flanking_sequences[2], a->substitution from a in class allele where
a->predicted_gene = "H39E23.1a" and a->method = "snp"

The output of the query (in text mode) looks like this:
H39E23.1a snp_AH10.2 tgaaaaaaactaatttttaatgtga tcttggccacaattgacctagtttg [A/G]
H39E23.1a snp_AH10.3 ctgaacaactgaaaaaggaaagaaa agggaaaaagttcgaccacaaaaaa [G/A]
Here the first column is the gene name, second is the allele name, third and fourth are sequences flanking the allele and the last one is the actual allele sequence change. You can modify the query to retrieve information for genes that you're interested in.
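
Once you have the text output, the two allele sequences can be rebuilt from each row. The sketch below is illustrative only. It assumes the five whitespace-separated columns shown above, that the two flanks directly abut the substituted base, and that [A/G] lists two alternative bases at that position (it makes no assumption about which one is wild type). The function name parse_snp_row is made up for this example.

```python
import re

# Matches a single-base substitution written as [A/G].
CHANGE_RE = re.compile(r"\[([ACGT])/([ACGT])\]")

# The two output rows shown in the text above.
ROWS = [
    "H39E23.1a snp_AH10.2 tgaaaaaaactaatttttaatgtga tcttggccacaattgacctagtttg [A/G]",
    "H39E23.1a snp_AH10.3 ctgaacaactgaaaaaggaaagaaa agggaaaaagttcgaccacaaaaaa [G/A]",
]


def parse_snp_row(row):
    """Split one output row and rebuild the sequence for each listed base.

    The substituted base is kept in upper case so that it stands out
    between the lower-case flanks.
    """
    gene, allele, left_flank, right_flank, change = row.split()
    match = CHANGE_RE.fullmatch(change)
    if match is None:
        raise ValueError(f"unexpected substitution field: {change!r}")
    first, second = match.groups()
    return {
        "gene": gene,
        "allele": allele,
        "sequences": (
            left_flank + first + right_flank,
            left_flank + second + right_flank,
        ),
    }


if __name__ == "__main__":
    for row in ROWS:
        record = parse_snp_row(row)
        print(record["gene"], record["allele"])
        for sequence in record["sequences"]:
            print("  ", sequence)
```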

--Igor Antoshechkin

Back to FAQ

How to download the C. elegans-Human gene homology map?

You can download a file that lists the best blastp match to human, fly, yeast, C. briggsae, and Swiss-Prot & TrEMBL proteins for every C. elegans protein from the WormBase FTP site:
ftp://ftp.sanger.ac.uk/pub/wormbase/current_release
The file name is best_blastp_hits.WSXXX.gz where XXX is the release number.

--Igor Antoshechkin

Back to FAQ

How to download the spliced and the non-spliced regions for all the available C. elegans or C. briggsae genes?

You can download spliced/unspliced sequences for a list of genes using the Batch Genes tool:
http://www.wormbase.org/db/searches/info_dump
Paste the list of genes you're interested in into the search box and select the Spliced and Unspliced check boxes in the Sequence field. If you output the data as text, you'll be able to save it to your hard disk.

To get the list of C. briggsae genes (so that you can paste it into the search box), you can use the following query:
select a from a in class cds where a->species like "*briggsae*"

--Igor Antoshechkin

Back to FAQ

How to download C. elegans-C.briggsae ortholog and the coding segments?

One possible way to retrieve those would be to download a C. elegans-C.briggsae ortholog file:
ftp://ftp.wormbase.org/pub/wormbase/briggsae/supporting_data_stein_2003/orthologs_and_orphans/orthologs.txt
and C. briggsae gene sequences in fasta format (briggenes.fa.gz):
ftp://ftp.wormbase.org/pub/wormbase/briggsae/
and write a script that extracts the C. briggsae ortholog sequences by C. elegans gene name.
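
Such a script could look roughly like the sketch below. It is a hedged illustration, not a description of the actual files: it assumes that orthologs.txt is whitespace-delimited with the C. elegans gene name and the C. briggsae gene name in two columns (the column positions are parameters, so check the real file and adjust them), and that the FASTA identifiers are the first word after '>'. All function names and the demo gene names are hypothetical.

```python
def read_fasta(lines):
    """Return {identifier: sequence} from FASTA-formatted lines."""
    chunks = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # first word of the header line
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def read_orthologs(lines, elegans_col=0, briggsae_col=1):
    """Return {elegans gene: [briggsae genes]}; column positions are assumed."""
    pairs = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) <= max(elegans_col, briggsae_col):
            continue  # line is too short to hold both columns
        pairs.setdefault(fields[elegans_col], []).append(fields[briggsae_col])
    return pairs


def ortholog_sequences(elegans_gene, orthologs, sequences):
    """Return {briggsae gene: sequence} for one C. elegans gene name."""
    return {
        name: sequences[name]
        for name in orthologs.get(elegans_gene, [])
        if name in sequences
    }


if __name__ == "__main__":
    # Tiny made-up inputs, used only to show the flow of data.
    demo_orthologs = ["elegans_gene_1 briggsae_gene_1", "elegans_gene_2 briggsae_gene_2"]
    demo_fasta = [">briggsae_gene_1 demo", "ATGAAA", "TTTTAA", ">briggsae_gene_2", "ATGCCC"]
    orthologs = read_orthologs(demo_orthologs)
    sequences = read_fasta(demo_fasta)
    print(ortholog_sequences("elegans_gene_1", orthologs, sequences))
```

For the real data, open the unpacked orthologs.txt and the FASTA file (for a .gz file, gzip.open(path, "rt") from the standard library gives a text handle) and pass the handles to the two readers.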

--Igor Antoshechkin

Back to FAQ

How are the WormBase entries created and maintained?

There is no simple answer to that. WormBase has a team of about 30 people who generate and curate data in many different ways. The genome sequence of C. elegans was determined at two of the four WormBase groups, and so a lot of data pertaining to gene predictions and other features annotated on to the genome are created and maintained by those groups.

The group at Caltech does a lot of literature curation, extracting all sorts of information from the published literature (from hand-curated descriptions of gene function to details of individual RNAi experiments).

Also a lot of data comes from 3rd party collaborators who submit bulk datasets direct to WormBase (e.g. Orfeome data, 'knockout' deletion alleles). In contrast we also get directly submitted data from users at a very small level, e.g. individual allele submissions.

Finally, we also generate data de novo as part of the database build procedure, e.g. calculating molecular weights of proteins.

--Keith Bradnam

Back to FAQ

How to find homologues?

How to retrieve the best-scoring blastp homologies of worm genes (produced automatically with each WormBase build, roughly every 2 weeks):
a. Go to the WormBase FTP site by following the "Bulk Downloads" link in the "Web Site Directory" section of the WormBase homepage or by entering the following URL in your browser:
ftp://ftp.sanger.ac.uk/pub/wormbase
Select the most current WormBase release (e.g. WS130).

b. Download the two best blastp files in this folder:
best_blastp_hits.WS130.gz (elegans homologies)
best_blastp_hits_brigpep.WS130.gz (briggsae homologies)

c. Unpack the compressed files using suitable software, e.g. gunzip (Linux).

d. The files have 15 columns delimited by commas. The contents of the columns are as follows:
Column 1 : Wormbase peptide accession number for elegans peptide
Column 2 : Wormbase peptide accession number for highest homology elegans peptide
Column 3 : e value for best elegans peptide/worm peptide hit
Column 4 : Ensembl accession number for highest homology Ensembl sequence
Column 5 : e value for best elegans peptide/Ensembl sequence hit
Column 6 : Wormbase peptide accession number for highest homology briggsae peptide
Column 7 : e value for best elegans peptide/briggsae peptide hit
Column 8 : Flybase accession number for highest homology fly protein
Column 9 : e value for best elegans peptide/fly protein hit
Column 10: Saccharomyces Genome Database accession number from highest homology yeast protein
Column 11: e value for best elegans peptide/yeast protein hit
Column 12: Swissprot/Uniprot name from highest homology sequence
Column 13: e value for best elegans peptide/swissprot sequence hit
Column 14: TrEMBL accession number from highest homology sequence
Column 15: e value for best elegans peptide/TrEMBL sequence hit

e. You might also want a file that maps Wormbase peptide accession numbers to the corresponding Gene in Wormbase (warning, a single gene may correspond to multiple peptides). For this you will have to perform an AQL query on Wormbase:
- on the banner at the top of the Wormbase homepage select "Searches"
- select the top search from the resulting list, "Acedb Searches(AQL)"
- copy paste the following text into the search text box:
select a, a->Cgc_name, c from a in class Gene,
c in a->Molecular_name
where c like "CE*"
order by :1 asc
- choose the "Text output" radio button and click Query ACeDB (the search may take a few minutes)
- the resulting file contains a tab-delimited mapping of WormBase gene accession numbers to the CGC-approved name for each gene (if it has one) and to the peptide accession number for that gene. Save the results file to your hard drive.
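
The 15-column layout described in step d can be read with a few lines of Python. This is a minimal sketch, assuming the unpacked file is plain comma-separated text with exactly the columns listed above and no header row. The field names in COLUMNS and the function names are labels chosen for this example, and the demo row contains placeholder values, not real accession numbers.

```python
import csv
import io

# One label per column, in the order given in step d (labels are illustrative).
COLUMNS = (
    "elegans_peptide", "worm_peptide", "worm_evalue",
    "ensembl_id", "ensembl_evalue",
    "briggsae_peptide", "briggsae_evalue",
    "flybase_id", "fly_evalue",
    "sgd_id", "yeast_evalue",
    "swissprot_name", "swissprot_evalue",
    "trembl_id", "trembl_evalue",
)


def read_best_hits(handle):
    """Yield one dict per 15-column row; rows of any other width are skipped."""
    for row in csv.reader(handle):
        if len(row) != len(COLUMNS):
            continue
        yield dict(zip(COLUMNS, (value.strip() for value in row)))


def hits_below(records, id_field, evalue_field, cutoff):
    """Return (elegans peptide, hit id, e value) for hits at or below the cutoff."""
    selected = []
    for record in records:
        try:
            evalue = float(record[evalue_field])
        except ValueError:
            continue  # empty or non-numeric e value, treated as no hit
        if evalue <= cutoff:
            selected.append((record["elegans_peptide"], record[id_field], evalue))
    return selected


if __name__ == "__main__":
    # A single placeholder row, used only to show the column order.
    demo = io.StringIO(
        "PEP_A,PEP_B,1e-50,ENS_X,2e-10,BRIG_Y,3e-40,"
        "FLY_Z,4e-05,SGD_Q,0.5,SW_NAME,1e-20,TR_ID,1e-15\n"
    )
    records = list(read_best_hits(demo))
    print(hits_below(records, "flybase_id", "fly_evalue", 1e-3))
```

For a downloaded file you can skip the gunzip step by opening it with gzip.open(path, "rt") from the standard library and passing that handle to read_best_hits.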

--Eimear Kenny

Back to FAQ




Page maintained by Wen J. Chen. Documentation by Wen J. Chen. Graphics by Wen J. Chen.
Send comments or questions to WormBase.