AceView Worm Genome
http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/
AceView/WormGenes
gene
vha-6
http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?exdb=AceView&db=worm&term=
Arabidopsis Genome Initiative (TAIR, TIGR, MIPS)
http://www.arabidopsis.org
gene
protein
At2g17950
http://mips.gsf.de/cgi-bin/proj/thal/search_gene?code=
http://www.tigr.org/tigr-scripts/euk_manatee/shared/ORF_infopage.cgi?db=ath1&orf=
http://arabidopsis.org/servlets/TairObject?type=locus&name=
AGRICultural OnLine Access
http://agricola.nal.usda.gov/
bib=0000-05160
ID example is strange; can't find db entry point
Sergei Egorov
2006-02-17
AGRICultural OnLine Access
http://agricola.cos.com/
IND84014403
AGRICultural OnLine Access
http://agricola.nal.usda.gov/
TP248.2 P76 v.14
ID example is strange
Sergei Egorov
2006-02-17
A Systematic Annotation Package for Community Analysis of Genomes
https://asap.ahabs.wisc.edu/annotation/php/ASAP1.htm
8
ID example is strange: can 8 be an ID? Server requires login...
Sergei Egorov
2006-02-17
American Type Culture Collection database
http://www.atcc.org/
CCL-240
HL-60
http://www.atcc.org/common/catalog/numSearch/numResults.cfm?collection=ce&ATCCNum=
American Type Culture Collection database
http://www.atcc.org/
ATCC(dna)
DNA
CCL-64D
Mink genomic DNA (Mv 1 Lu)
http://www.atcc.org/common/catalog/numSearch/numResults.cfm?collection=ce&ATCCNum=
American Type Culture Collection database
http://www.atcc.org/
ATCC(in host)
123456
ID example is fake
Sergei Egorov
2006-02-17
A Xenopus laevis database
http://www.dkfz-heidelberg.de/molecular_embryology/axeldb.htm
gene
32B3.1
http://indigene.ibaic.u-psud.fr/cgi-bin/ace/generic/tree/default?class=Locus&name=
Berkeley Drosophila Genome Project EST database
http://www.fruitfly.org/EST/index.shtml
123456
ID example is fake
Sergei Egorov
2006-02-17
Berkeley Drosophila Genome Project database -- Insertion
http://www.fruitfly.org/
123456
ID example is fake
Sergei Egorov
2006-02-17
BioModels Database
http://www.ebi.ac.uk/biomodels/
BIOMD0000000045
http://www.ebi.ac.uk/compneur-srv/biomodels-main/publ-model.do?mid=
BIOSIS previews
http://www.biosis.org/
200200247281
BRENDA, The Comprehensive Enzyme Information System
http://www.brenda.uni-koeln.de/
4.2.1.3
http://www.brenda.uni-koeln.de/php/result_flat.php4?ecno=
CAS Registry
http://www.cas.org/
chemical substance
registry number
up to 9 digits, divided by hyphens into 3 parts; the rightmost digit is a check digit
58-08-2
caffeine
[0-9]+[-][0-9]+[-][0-9]
http://webbook.nist.gov/cgi/cbook.cgi?ID=&
http://www.chemindustry.com/apps/chemicals?m=s\&t=&
CAS ID
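A minimal Python sketch of how the CAS entry above could be validated. It assumes the standard CAS check-digit rule: the digits to the left of the check digit are weighted 1, 2, 3, ... from right to left, and the sum modulo 10 must equal the check digit. The pattern is the one listed in the entry; the function name cas_is_valid is hypothetical and not part of this registry.

```python
import re

# Pattern copied from the CAS entry above.
CAS_PATTERN = re.compile(r"[0-9]+[-][0-9]+[-][0-9]")


def cas_is_valid(cas: str) -> bool:
    """Return True if `cas` matches the pattern and its check digit is correct."""
    if not CAS_PATTERN.fullmatch(cas):
        return False
    digits = cas.replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    # Weight the body digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check
```

For the caffeine example, cas_is_valid("58-08-2") holds because 8*1 + 0*2 + 8*3 + 5*4 = 52 and 52 mod 10 = 2.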
Center for Biological Sequence Analysis
http://www.cbs.dtu.dk/
NetNGlyc
Check http://www.cbs.dtu.dk/services/TMHMM/ ?
Sergei Egorov
2006-02-17
Conserved Domain Database
http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml
pfam01484
Candida Genome Database
http://www.candidagenome.org/
CAL0005516
http://www.candidagenome.org/cgi-bin/locus.pl?sgdid=
Candida Genome Database
http://www.candidagenome.org/
orf19.2475
http://www.candidagenome.org/cgi-bin/locus.pl?locus=
Another ID example is HPW1
Sergei Egorov
2006-02-17
Candida Genome Database
http://www.candidagenome.org/
1490
http://www.candidagenome.org/cgi-bin/reference/reference.pl?refNo=
Compugen Gene Ontology Gene Association Data
PrID131022
CGSC: E.coli Genetic Stock Center
http://cgsc.biology.yale.edu/
rbsK
server entry point: check http://cgsc.biology.yale.edu/cgi-bin/sybgw/cgsc/Site/
Sergei Egorov
2006-02-17
Chemical Entities of Biological Interest (ChEBI) database of small molecules
http://www.ebi.ac.uk/chebi/
chemical compound
sequential identifier
positive numeral
27732
caffeine
[1-9][0-9]*
http://www.ebi.ac.uk/chebi/searchId.do?chebiId=
CHEBI:
Cell Type Ontology
http://lists.sourceforge.net/lists/listinfo/obo-cell-type
0000041
NCBI COG cluster
http://www.ncbi.nlm.nih.gov/COG/
COG0001
http://www.ncbi.nlm.nih.gov/COG/new/release/cow.cgi?cog=
NCBI COG function
http://www.ncbi.nlm.nih.gov/COG/
H
http://www.ncbi.nlm.nih.gov/COG/new/release/coglist.cgi?fun=
NCBI COG pathway
http://www.ncbi.nlm.nih.gov/COG/
14
http://www.ncbi.nlm.nih.gov/COG/new/release/coglist.cgi?pathw=
EST database maintained at the NCBI
http://www.ncbi.nlm.nih.gov/dbEST/index.html
BP535535
Variation database maintained at the NCBI
http://www.ncbi.nlm.nih.gov/SNP/
rs133073
STS database maintained at the NCBI
http://www.ncbi.nlm.nih.gov/dbSTS/index.html
BV210161
DictyBase
http://dictybase.org
DDB0001836
http://dictybase.org/db/cgi-bin/gene_page.pl?dictybaseid=
DictyBase
http://dictybase.org
mlcE
http://dictybase.org/db/cgi-bin/gene_page.pl?gene_name=
DictyBase literature references
http://dictybase.org
10157
http://dictybase.org/db/cgi-bin/dictyBase/reference/reference.pl?refNo=
INSD:
Dictyostelium genome database
http://dictybase.org/
DDB0191090
The IUBMB Enzyme Commission
http://www.chem.qmw.ac.uk/iubmb/enzyme/
enzyme
registry number
N.N.N.N where N is either a number or '-'
3.4.11.4
tripeptide aminopeptidase
([0-9]+|[-])[.]([0-9]+|[-])[.]([0-9]+|[-])[.]([0-9]+|[-])
http://www.genome.ad.jp/dbget-bin/www_bget?ec:
EC Number
server entry point: check http://www.chem.qmw.ac.uk/iubmb/enzyme/EC1/1/1/1.html
Sergei Egorov
2006-02-17
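A short Python sketch of how the EC pattern and URL template above might be used together. It assumes that a link is formed by appending the identifier to the template, which the registry does not state explicitly; the function name ec_url is hypothetical.

```python
import re

# Pattern and template copied from the EC entry above.
EC_PATTERN = re.compile(
    r"([0-9]+|[-])[.]([0-9]+|[-])[.]([0-9]+|[-])[.]([0-9]+|[-])"
)
EC_TEMPLATE = "http://www.genome.ad.jp/dbget-bin/www_bget?ec:"


def ec_url(ec_number: str) -> str:
    """Validate an EC number and append it to the template (assumed convention)."""
    if not EC_PATTERN.fullmatch(ec_number):
        raise ValueError(f"not an EC number: {ec_number!r}")
    return EC_TEMPLATE + ec_number
```

Partially specified classes such as 3.4.-.- also match, because each of the four positions may be '-'.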
The Encyclopedia of E. coli metabolism
http://ecocyc.org/
P2-PWY
http://malibu.ai.sri.com:1555/ECOLI/new-image?type=PATHWAY&object=?
CGSC: E.coli Genetic Stock Center
http://cgsc.biology.yale.edu/
EG10818
server entry point: check http://cgsc.biology.yale.edu/cgi-bin/sybgw/cgsc/Site/315
Sergei Egorov
2006-02-17
CGSC: E.coli Genetic Stock Center
http://cgsc.biology.yale.edu/
deoC
INSD:
Database of automatically annotated genomic data
http://www.ensembl.org
HUMAN-Gene-ENSG00000007102
http://www.ensembl.org/perl/protview?peptide=
The Swiss Institute of Bioinformatics database of Enzymes
http://www.expasy.ch/
1.1.1.1
http://www.expasy.ch/cgi-bin/nicezyme.pl?
redirect to EC:?
Sergei Egorov
2006-02-17
EBI's EST library identifier
1200
Database of Functional Annotation of Mouse
http://fantom.gsc.riken.go.jp/
0610005A07
FlyBase
http://flybase.bio.indiana.edu/
FBgn0000024
http://fly.ebi.ac.uk:7081/.bin/fbidq.html?
http://flybase.bio.indiana.edu/.bin/fbidq.html?
two URLs, one service?
Sergei Egorov
2006-02-17
Database of Genetic and molecular data of Drosophila
http://www.flybase.org/
FBgn0000024
http://flybase.bio.indiana.edu/.bin/fbidq.html?
http://fly.ebi.ac.uk:7081/.bin/fbidq.html?
two URLs, one service?
Sergei Egorov
2006-02-17
Network of Different Plant Genomic Research Projects
https://gabi.rzpd.de/
HA05J18
INSD:
Human Genome Database accession numbers
http://www.gdb.org/
G00-128-600
http://www.gdb.org/gdb-bin/genera/accno?accessionNum=
URL example suggests GDB:306600 as ID format
Sergei Egorov
2006-02-17
INSD:
Curated gene database for Schizosaccharomyces pombe, Leishmania major and Trypanosoma brucei
http://www.genedb.org/
SPCC285.16c
http://www.genedb.org/genedb/Search?organism=All%3A*&name=
GeneDB_Gmorsitans
http://www.genedb.org/glossina
Gmm-0142
http://www.genedb.org/genedb/Search?organism=glossina&name=
GeneDB_Lmajor
http://www.genedb.org/leish
LM5.32
http://www.genedb.org/genedb/Search?organism=leish&name=
GeneDB_Pfalciparum
http://www.genedb.org/malaria
PFD0755c
http://www.genedb.org/genedb/Search?organism=malaria&name=
GeneDB_Spombe
http://www.genedb.org/pombe
SPAC890.04C
http://www.genedb.org/genedb/Search?organism=pombe&name=
GeneDB_Tbrucei
http://www.genedb.org/tryp
Tb927.1.5250
http://www.genedb.org/genedb/Search?organism=tryp&name=
Entrez Gene Database (replaces NCBI Locus Link)
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
gene
sequential identifier
positive numeral
7157
tumor protein p53
[1-9][0-9]*
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=full_report&list_uids=
LocusLink ID
GermOnline
http://www.germonline.org/
140116
http://germonline.unibas.ch/gene_page.php?orf_id=
http://germonline.yeastgenome.org/gene_page.php?orf_id=
http://germonline.biochem.s.u-tokyo.ac.jp/gene_page.php?orf_id=
GenInfo identifier, used as a unique sequence identifier for nucleotide and proteins
1234567890
ID example is fake; cannot resolve GIs to URLs without knowing their type (protein/nucleotide)
Sergei Egorov
2006-02-17
Gene Ontology Database
http://www.geneontology.org/
concept
sequential identifier
7-digit numeral
0006915
apoptosis
[0-9]{7}
http://godatabase.org/cgi-bin/go.cgi?query=GO:
GO ID
Gene Ontology Database references
0000001
Gene Ontology Annotation Database Identifier
http://www.ebi.ac.uk/GOA/
P01100
Gramene: A Comparative Mapping Resource for Grains
http://www.gramene.org/
P93436
http://www.gramene.org/perl/protein_search?acc=
Gramene: A Comparative Mapping Resource for Grains
http://www.gramene.org/
659
http://www.gramene.org/db/mutant/search_mutant?id=
ID example may be fake; need to try GR:0060198
Sergei Egorov
2006-02-17
Gramene: A Comparative Mapping Resource for Grains
http://www.gramene.org/
110916
http://www.gramene.org/db/protein/protein_search?protein_id=
Gramene: A Comparative Mapping Resource for Grains
http://www.gramene.org/
659
http://www.gramene.org/perl/pub_search?ref_id=
H-Invitational Database
http://www.h-invitational.jp
HIX0000001
H-Invitational Database
http://www.h-invitational.jp/
AK093148
http://www.h-invdb.jbic.or.jp/soup/pub_Detail.pl?acc_id=
http://www.jbirc.aist.go.jp/hinv/soup/pub_Detail.pl?acc_id=
H-Invitational Database
http://www.h-invitational.jp/
HIX0014446
http://www.h-invdb.jbic.or.jp/soup/pub_Locus.pl?locus_id=
http://www.jbirc.aist.go.jp/hinv/soup/pub_Locus.pl?locus_id=
High-quality Automated and Manual Annotation of microbial Proteomes
http://us.expasy.org/sprot/hamap/
MF_00031
http://us.expasy.org/unirules/
HUGO Gene Nomenclature Committee
http://www.gene.ucl.ac.uk/nomenclature/
gene
sequential identifier
positive numeral
11998
tumor protein p53
[1-9][0-9]*
http://www.gene.ucl.ac.uk/nomenclature/data/get_data.php?hgnc_id=HGNC:
HGNC ID
HGNC:ID as used by NCBI
Sergei Egorov
2006-02-17
HUGO Gene Nomenclature Committee
http://www.gene.ucl.ac.uk/nomenclature/
gene
symbol
a unique series of Latin letters (upper case in human) and Arabic numerals, usually no longer than six characters; case-sensitive
BRCA1
breast cancer, early onset 1
http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl?field=all_text&anchor=equals&symbol_search=Search&number=50&format=html&sortby=symbol&match=
HUGO Symbol
Database of homology-derived secondary structure of proteins
http://www.sander.ebi.ac.uk/hssp/
12GS
Institute for Fermentation, Osaka
http://www.ifo.or.jp/index_e.html
3189
Immunogenetics database, immunoglobulin and T-cell receptor genes
http://imgt.cines.fr
IMGT/GENE-DB
IGKC
Immunogenetics database, human MHC
http://www.ebi.ac.uk/imgt/hla/
IMGT/HLA
HLA00031
Immunogenetics database, immunoglobulins and T-cell receptors
http://imgt.cines.fr
IMGT/LIGM
U03895
chemical compound
self-describing identifier
canonicalized formula with ordered layers separated by /
1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
caffeine
[-+/.,*;()?A-Za-z0-9]+
may need urlencoding
Sergei Egorov
2006-02-17
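The note "may need urlencoding" can be illustrated with a small Python sketch. It percent-encodes the caffeine identifier above using the standard library; the function name encode_id is hypothetical, and no URL template is assumed because the entry lists none.

```python
from urllib.parse import quote, unquote

# Identifier example copied from the entry above.
INCHI = "1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3"


def encode_id(identifier: str) -> str:
    """Percent-encode every reserved character, including '/'."""
    return quote(identifier, safe="")


encoded = encode_id(INCHI)
# Decoding restores the original identifier unchanged.
restored = unquote(encoded)
```

Passing safe="" matters here: by default quote leaves '/' unescaped, and the layer separators would then be read as path separators.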
International Nucleotide Sequence Database Collaboration (GenBank, EMBL, DDBJ)
http://www.insdc.org/
nucleotide sequence
accession number
uppercase letters, followed by digits and optional .version suffix; e.g. LNNNNN, LLNNNNNN
J01749
Cloning vector pBR322, complete genome
[A-Z]+[0-9]+([.][0-9]+)?
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=
http://www.ebi.ac.uk/cgi-bin/emblfetch?style=html&Submit=Go&id=
http://arsa.ddbj.nig.ac.jp/arsa/ddbjSplSearch?KeyWord=
GenBank ID
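A Python sketch of how the INSD entry above might be consumed: it checks an accession against the listed pattern, separates the optional version suffix, and builds one link per mirror. Appending the accession to each template is an assumption, and the function names split_accession and mirror_urls are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Pattern and templates copied from the INSD entry above.
INSD_PATTERN = re.compile(r"[A-Z]+[0-9]+([.][0-9]+)?")
INSD_TEMPLATES = [
    "http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=",
    "http://www.ebi.ac.uk/cgi-bin/emblfetch?style=html&Submit=Go&id=",
    "http://arsa.ddbj.nig.ac.jp/arsa/ddbjSplSearch?KeyWord=",
]


def split_accession(accession: str) -> Tuple[str, Optional[int]]:
    """Return (base accession, version or None) for a valid accession."""
    if not INSD_PATTERN.fullmatch(accession):
        raise ValueError(f"not an INSD accession: {accession!r}")
    base, _, version = accession.partition(".")
    return base, int(version) if version else None


def mirror_urls(accession: str) -> List[str]:
    """Build one URL per template by appending the accession (assumed convention)."""
    split_accession(accession)  # raises ValueError on malformed input
    return [template + accession for template in INSD_TEMPLATES]
```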
IntAct protein interaction database
http://www.ebi.ac.uk/intact/
EBI-17086
http://www.ebi.ac.uk/intact/search/do/search?searchString=
InterPro protein sequence database
http://www.ebi.ac.uk/interpro/
Interpro
IPR002928
http://www.ebi.ac.uk/interpro/DisplayIproEntry?ac=
International Protein Index
http://www.ebi.ac.uk/IPI/IPIhelp.html
IPI00000005.1
International Standard Book Number
http://isbntools.com/
0781702534
http://my.linkbaton.com/get?lbCC=q&nC=q&genre=book&item=
Insertion sequence elements database
ISA1083-2
International Standard Serial Number
http://www.issn.org/
1234-1231
The IUPHAR Compendium of Receptor Characterization and Classification
2.1.CBD
Japan Collection of Microorganisms
http://www.jcm.riken.go.jp/
1339
([a-z]+)[:](.*)
KEGG_\1:\2
Kyoto Encyclopedia of Genes and Genomes
http://www.genome.ad.jp/kegg/
chemical compound
accession number
C followed by 5 digits
C00074
Phosphoenolpyruvate
C[0-9]{5}
http://www.genome.ad.jp/dbget-bin/www_bget?cpd:
KEGG_COMPOUND:
KEGG_DRUG:
Kyoto Encyclopedia of Genes and Genomes
http://www.genome.ad.jp/kegg/
drug
accession number
D followed by 5 digits
D00245
Bisacodyl (JP14/USP)
D[0-9]{5}
http://www.genome.ad.jp/dbget-bin/www_bget?dr:
EC:
GeneID:
GeneID:
KEGG_PATHWAY:
Kyoto Encyclopedia of Genes and Genomes
http://www.genome.ad.jp/kegg/
pathway
accession number
lowercase letters, followed by 5 digits
ot00020
Citrate cycle (TCA cycle)
[a-z]+[0-9]{5}
http://www.genome.ad.jp/dbget-bin/www_bget?path:
GeneID:
KEGG LIGAND Database
http://www.genome.ad.jp/kegg/docs/upd_ligand.html#COMPOUND
1.1.1.1
http://www.genome.ad.jp/dbget-bin/www_bget?ec:
http://www.genome.ad.jp/dbget-bin/www_bget?cpd:
needs editing or redirection
Sergei Egorov
2006-02-17
GeneID:
MaizeGDB
http://www.maizegdb.org
881225
http://www.maizegdb.org/cgi-bin/id_search.cgi?id=
MaizeGDB
http://www.maizegdb.org
ZmPK1
http://www.maizegdb.org/cgi-bin/displaylocusresults.cgi?term=?
The Medline literature database
20572430
superseded by PubMed (see PMID:)
Sergei Egorov
2006-02-17
MEROPS - the Peptidase Database
http://merops.sanger.ac.uk/
A01.001
MEROPS: The Peptidase Database
http://merops.sanger.ac.uk/
M18
http://merops.sanger.ac.uk/famcards/
check if .htm needs to be appended to URLs
Sergei Egorov
2006-02-17
Medical Subject Headings
http://www.nlm.nih.gov/mesh/2005/MBrowser.html
mitosis
http://www.nlm.nih.gov/cgi/mesh/2005/MB_cgi?mode=&term=
need to preserve case and urlencode commas, spaces, ...?
Sergei Egorov
2006-02-17
The Metabolic Encyclopedia of metabolic and other pathways
http://metacyc.org/
GLUTDEG-PWY
http://biocyc.org:1555/META/new-image?object=
Mouse Genome Database
http://www.informatics.jax.org/
Adcy9
http://www.informatics.jax.org/searches/marker.cgi?
Mouse Genome Informatics
http://www.informatics.jax.org/
gene
sequential identifier
positive numeral
98834
transformation related protein 53
[1-9][0-9]*
http://www.informatics.jax.org/searches/accession_report.cgi?id=
http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:
MGI ID
OMIM:
MIM: is NCBI usage
Sergei Egorov
2006-02-17
MIPS Functional Catalogue
http://mips.gsf.de/proj/funcatDB/
11.02
http://mips.gsf.de/cgi-bin/proj/funcatDB/search_advanced.pl?action=2&wert=
The MGED Ontology
http://mged.sourceforge.net/ontologies/MGEDontology.php
Action
http://mged.sourceforge.net/ontologies/MGEDontology.php#
Nottingham Arabidopsis Stock Centre Seeds Database
http://arabidopsis.info
N3371
http://seeds.nottingham.ac.uk/NASC/stockatidb.lasso?code=
INSD:
NCBI GenPept
http://www.ncbi.nlm.nih.gov/
EAL72968
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=
NCBI RefSeq
http://www.ncbi.nlm.nih.gov/
123456
ID example is fake
Sergei Egorov
2006-02-17
NCBI RefSeq
http://www.ncbi.nlm.nih.gov/
123456
ID example is fake
Sergei Egorov
2006-02-17
Nematode Expression Pattern DataBase
http://nematode.lab.nig.ac.jp/
CELK01662
NIA Mouse cDNA Project
http://lgsun.grc.nia.nih.gov/cDNA/cDNA.html
L0304H12-3
Online Mendelian Inheritance in Man database
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
gene
gene-related genetic disorders
sequential identifier
positive numeral
191170
TUMOR PROTEIN p53
[1-9][0-9]*
http://www3.ncbi.nlm.nih.gov/htbin-post/Omim/dispmim?
http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=
OMIM ID
Protein Data Bank
http://www.rcsb.org/pdb/
1A4U
http://www.rcsb.org/pdb/cgi/explore.cgi?pid=223051005992697&pdbId=
Pfam: Protein families database of alignments and HMMs
http://www.sanger.ac.uk/Software/Pfam/
PF00046
http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?
Pfam-B supplement to Pfam
http://www.sanger.ac.uk/Software/Pfam/
PB014624
Plant Genome Network
http://pgn.cornell.edu
aam01-1ms3-a05
Protein Information Resource
http://pir.georgetown.edu/
protein
accession number
uppercase letters, followed by digits
S02192
cellular tumor antigen p53 - rat
[A-Z]+[0-9]+
http://pir.georgetown.edu/cgi-bin/pirwww/nbrfget?uid=
PIR ID
PIR Superfamily Classification System
http://pir.georgetown.edu/pirsf/
SF002327
http://pir.georgetown.edu/cgi-bin/ipcSF?id=
PubMed
http://pubmed.gov/
article
sequential identifier
positive numeral
16446403
Mol Cancer Res. 2006 Jan;4(1):15-25.
[1-9][0-9]*
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Abstract&list_uids=
Plant Ontology Consortium Database
http://www.plantontology.org/
0009004
http://www.plantontology.org/amigo/go.cgi?action=query&view=query&search_constraint=terms&query=
template may have PO: before the ID?
Sergei Egorov
2006-02-17
Schizosaccharomyces pombe protein data
SPAC890.04C
PRINTS compendium of protein fingerprints
http://umber.sbs.man.ac.uk/dbbrowser/PRINTS/
PR00025
http://umber.sbs.man.ac.uk/cgi-bin/dbbrowser/PRINTS/DoPRINTS.pl?cmd_a=Display&qua_a=none&fun_a=Text&qst_a=
ProDom protein domain families automatically generated from Swiss-Prot and TrEMBL
http://prodes.toulouse.inra.fr/prodom/current/html/home.php
PD000001
http://prodes.toulouse.inra.fr/prodom/current/cgi-bin/request.pl?question=DBEN&query=
Prosite. Database of protein families and domains
http://www.expasy.ch/prosite/
PS00365
http://www.expasy.ch/cgi-bin/prosite-search-ac?
INSD:
EMBL pseudo protein identifier
CAC44644.1
NCBI PubChem database of chemical structures
http://pubchem.ncbi.nlm.nih.gov/
chemical compound
sequential identifier
positive numeral
2519
Caffeine
[1-9][0-9]*
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=pccompound&term=
http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=
NCBI PubChem database of chemical substances
http://pubchem.ncbi.nlm.nih.gov/
chemical substance
sequential identifier
positive numeral
199601
Coffee
[1-9][0-9]*
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=pcsubstance&term=
http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?sid=
Rat Genome Database
http://ratmap.gen.gu.se/
5
fake ID?
Sergei Egorov
2006-02-17
Reactome human pathway database
http://www.reactome.org/
70635
http://www.reactome.org/cgi-bin/eventbrowser?DB=gk_current&ID=
REBASE, The Restriction Enzyme Database
http://rebase.neb.com/rebase/rebase.html
EcoRI
http://rebase.neb.com/rebase/enz/
template may need to add .html after the ID: use ere?
Sergei Egorov
2006-02-17
RESID Database of Protein Modifications
AA0062
Rat Genome Database
http://rgd.mcw.edu/
gene
sequential identifier
positive numeral
3889
tumor protein p53
[1-9][0-9]*
http://rgd.mcw.edu/tools/genes/genes_view.cgi?id=
RGD ID
RGD:
Rice database accession numbers
http://www.gramene.org/
AA231856
The RNA Modification Database
http://medlib.med.utah.edu/RNAmods/
037
http://medlib.med.utah.edu/cgi-bin/rnashow.cgi?
Resource Centre Primary Database Clone Identifiers
http://www.rzpd.de/
IMAGp998I142450Q6
Saccharomyces Genome Database
http://www.yeastgenome.org/
S000006169
http://db.yeastgenome.org/cgi-bin/locus.pl?dbid=
Saccharomyces Genome Database
http://www.yeastgenome.org/
GAL4
http://db.yeastgenome.org/cgi-bin/locus.pl?locus=
another ID example: YEL001C
Sergei Egorov
2006-02-17
Saccharomyces Genome Database
http://www.yeastgenome.org/
S000049602
http://db.yeastgenome.org/cgi-bin/reference/reference.pl?dbid=
Simple Modular Architecture Research Tool
http://smart.embl-heidelberg.de/
SM00005
http://smart.embl-heidelberg.de/smart/do_annotation.pl?BLAST=DUMMY&DOMAIN=
Glycine max Genome Database
http://soybase.agron.iastate.edu/
Satt005
stkecm_([A-Z]+)_([0-9]+)
STKECM_\1:\2
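The two lines above read like a pattern and a replacement with numbered backreferences. A minimal Python sketch under that assumption follows; the input string "stkecm_CMC_15493" is a hypothetical example assembled from the CMC_ prefix and ID shown in the STKE entries, and the function name rewrite_prefix is not part of the registry.

```python
import re

# Pattern and replacement copied from the two lines above.
PATTERN = re.compile(r"stkecm_([A-Z]+)_([0-9]+)")
REPLACEMENT = r"STKECM_\1:\2"


def rewrite_prefix(identifier: str) -> str:
    """Rewrite a matching identifier; return anything else unchanged."""
    match = PATTERN.fullmatch(identifier)
    if match is None:
        return identifier
    # expand() substitutes \1 and \2 with the captured groups.
    return match.expand(REPLACEMENT)
```

With this reading, rewrite_prefix("stkecm_CMC_15493") gives "STKECM_CMC:15493", and identifiers that do not match pass through untouched.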
Science's STKE Connections Map
http://stke.sciencemag.org/
chemical substance
sequential identifier
positive numeral
15493
Acetylcholinesterase (AChE)
[1-9][0-9]*
http://stke.sciencemag.org/cgi/cm/stkecm;CMC_
Science's STKE Connections Map
http://stke.sciencemag.org/
pathway
sequential identifier
positive numeral
10827
JNK MAPK Pathway
[1-9][0-9]*
http://stke.sciencemag.org/cgi/cm/stkecm;CMP_
Bacillus subtilis genome sequencing project
http://genolist.pasteur.fr/SubtiList/
BG10001
Bacillus subtilis Genome Sequence Project
http://genolist.pasteur.fr/SubtiList/
accC
The Arabidopsis Information Resource
http://www.arabidopsis.org/
gene:2062713
http://www.arabidopsis.org/servlets/TairObject?accession=
should gene: in the ID be part of the template?
Sergei Egorov
2006-02-17
NCBI's taxonomic identifier
http://www.ncbi.nlm.nih.gov/Taxonomy/tax.html
4932
this needs to be linked ASAP
Sergei Egorov
2006-02-17
The Transport Protein Database
http://tcdb.ucsd.edu/tcdb/
9.A.4.1.1
http://tcdb.ucsd.edu/tcdb/tcprotein.php?substrate=
Tetrahymena Genome Database
http://www.ciliate.org/
PDD1
http://db.ciliate.org/cgi-bin/locus.pl?locus=
another ID example: U66363
Sergei Egorov
2006-02-17
Tetrahymena Genome Database
http://www.ciliate.org/
T000005818
http://db.ciliate.org/cgi-bin/reference/reference.pl?dbid=
The Institute for Genomic Research, Arabidopsis thaliana database
http://www.tigr.org/tdb/e2k1/ath1/ath1.shtml
At3g01440
http://www.tigr.org/tigr-scripts/euk_manatee/shared/ORF_infopage.cgi?db=ath1&orf=
The Institute for Genomic Research, Comprehensive Microbial Resource
http://www.tigr.org/
VCA0557
http://www.tigr.org/tigr-scripts/CMR2/GenePage.spl?locus=
The Institute for Genomic Research, EGAD database
http://www.tigr.org/
74462
http://www.tigr.org/tigr-scripts/CMR2/ht_report.spl?prot_id=
The Institute for Genomic Research, Genome Properties
http://www.tigr.org/
GenProp0120
http://www.tigr.org/tigr-scripts/CMR2/genome_property_def.spl?prop_acc=
The Institute for Genomic Research, Plasmodium falciparum database
http://www.tigr.org/tdb/e2k1/pfa1/pfa1.shtml
PFB0010w
http://www.tigr.org/tigr-scripts/euk_manatee/shared/ORF_infopage.cgi?db=pfa1&orf=
http://www.tigr.org/tdb/GO_REF/GO_REF.shtml
GO_ref
is this ID real?
Sergei Egorov
2006-02-17
The Institute for Genomic Research, Trypanosoma brucei database
http://www.tigr.org/tdb/e2k1/tba1/
25N14.10
http://www.tigr.org/tigr-scripts/euk_manatee/shared/ORF_infopage.cgi?db=tba1&orf=
The Institute for Genomic Research, TIGR Gene Index
http://www.tigr.org/
Cattle_TC123931
http://www.tigr.org/tigr-scripts/nhgi_scripts/tc_report.pl?tc=?
The Institute for Genomic Research, TIGRFAMs HMM collection
http://www.tigr.org/
TIGR00254
http://www.tigr.org/tigr-scripts/CMR2/hmm_report.spl?acc=
UniProtKB-TrEMBL, a computer-annotated protein sequence database supplementing UniProtKB and containing the translations of all coding sequences (CDS) present in the EMBL Nucleotide Sequence Database but not yet integrated in UniProtKB/Swiss-Prot
http://www.uniprot.org
O31124
http://www.ebi.uniprot.org/entry/
The University of Minnesota Biocatalysis/Biodegradation Database
http://umbbd.ahc.umn.edu/index.html
e0413
http://umbbd.ahc.umn.edu:8007/umbbd/servlet/pageservlet?ptype=ep&enzymeID=
The University of Minnesota Biocatalysis/Biodegradation Database
http://umbbd.ahc.umn.edu/index.html
acr
http://umbbd.ahc.umn.edu/
may need to turn acr into acr/acr_map.html -- use ere?
Sergei Egorov
2006-02-17
NCBI's UniGene database
http://www.ncbi.nih.gov/UniGene
cluster of nucleotide sequences
sequential identifier (organism-specific)
2/3-letter organism abbreviation followed by dot and positive numeral
Hs.408312
p53 cluster
([A-Z][a-z][a-z]?)[.]([1-9][0-9]*)
http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=\1\&CID=\2
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene&cmd=search&term=
UniGene ID
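The first UniGene template uses \1 and \2 to refer to the two capture groups of the pattern. A Python sketch of that substitution follows. It assumes the \& in the template is an escaped literal ampersand (as in sed-style replacements, where a bare & means the whole match), so the Python template below writes a plain &; the function name unigene_url is hypothetical.

```python
import re

# Pattern copied from the UniGene entry above.
UNIGENE_PATTERN = re.compile(r"([A-Z][a-z][a-z]?)[.]([1-9][0-9]*)")
# Template from the entry, with the escaped ampersand written as a plain '&'.
UNIGENE_TEMPLATE = r"http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=\1&CID=\2"


def unigene_url(cluster_id: str) -> str:
    """Split an ID such as 'Hs.408312' and fill both template slots."""
    match = UNIGENE_PATTERN.fullmatch(cluster_id)
    if match is None:
        raise ValueError(f"not a UniGene cluster ID: {cluster_id!r}")
    return match.expand(UNIGENE_TEMPLATE)
```

For the example ID, unigene_url("Hs.408312") yields http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=Hs&CID=408312.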
Unified Library Database, a library-level view of the EST and SAGE libraries present in dbEST, UniGene and SAGEmap
1002
UniProt Archive; a non-redundant archive of protein sequences extracted from Swiss-Prot, TrEMBL, PIR-PSD, EMBL, Ensembl, IPI, PDB, RefSeq, FlyBase, WormBase, European Patent Office, United States Patent and Trademark Office, and Japanese Patent Office
http://www.ebi.ac.uk/uniparc/
UPI000000000A
http://www.ebi.ac.uk/cgi-bin/dbfetch?db=uniparc&id=
The Universal Protein Knowledgebase, a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR
http://www.uniprot.org/
UniProtKB/Swiss-Prot
protein
accession number
6 alphanumerical characters
P08251
Sodium/potassium-transporting ATPase beta-1 chain
[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]
http://www.ebi.uniprot.org/entry/
Swiss-Prot Accession
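A brief Python sketch that checks accessions against the UniProtKB pattern above and builds an entry link. Appending the accession to the template is an assumption, and the function name uniprot_url is hypothetical.

```python
import re

# Pattern and template copied from the UniProtKB entry above.
UNIPROT_PATTERN = re.compile(r"[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]")
UNIPROT_TEMPLATE = "http://www.ebi.uniprot.org/entry/"


def uniprot_url(accession: str) -> str:
    """Validate a 6-character accession and append it to the template."""
    if not UNIPROT_PATTERN.fullmatch(accession):
        raise ValueError(f"not a UniProtKB accession: {accession!r}")
    return UNIPROT_TEMPLATE + accession
```

The pattern is case-sensitive and fixes the first character to O, P, or Q, so lowercase or shortened input is rejected rather than silently linked.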
section of the UniProt Knowledgebase, containing annotated records, which include curator-evaluated computational analysis as well as information extracted from the literature
http://www.uniprot.org
UniProtKB/Swiss-Prot
P12345
redirect to UniProt?
Sergei Egorov
2006-02-17
section of the UniProt Knowledgebase, containing computationally analysed records waiting for full manual annotation
http://www.uniprot.org
UniProtKB/TrEMBL
Q00177
Integrative database of germ-line V genes from the immunoglobulin loci of human and mouse
http://www.dnaplot.de/vbase2/vbase2.php
humIGKV165
WormBase, database of nematode biology
http://www.wormbase.org/
lin-12
http://www.wormbase.org/db/get?class=Locus;name=
WormBase, database of nematode biology
http://www.wormbase.org/
cgc467
http://www.wormbase.org/db/misc/paper?name=
may need to put %5Bcgc467%5D;class=Paper after the template -- use ere?
Sergei Egorov
2006-02-17
C. elegans ORFeome cloning project
http://worfdb.dfci.harvard.edu/
pos-1
Caenorhabditis elegans Genome Database
http://www.wormbase.org/
R13H7
Wormpep, database of proteins of C. elegans
http://www.wormbase.org/
CE25104
http://www.wormbase.org/db/get?class=Protein;name=
may need to put WP%3A between the template and the ID -- add to template?
Sergei Egorov
2006-02-17
The Zebrafish Information Network
http://zfin.org/
ZDB-GENE-990415-103
http://zfin.org/cgi-bin/ZFIN_jump?record=