Bioinformatics Resources

Nucleotide Sequence Databases (the principal ones)

  • NCBI - National Center for Biotechnology Information

  • EBI - European Bioinformatics Institute

  • DDBJ - DNA Data Bank of Japan

Protein Sequence Databases

  • SWISS-PROT & TrEMBL - Protein sequence database and computer-annotated supplement

  • UniProt - the Universal Protein Resource, a comprehensive catalog of information on proteins. It is a central repository of protein sequence and function, created by joining the information contained in Swiss-Prot, TrEMBL, and PIR.

  • PIR - Protein Information Resource

  • MIPS - Munich Information centre for Protein Sequences

  • HUPO - HUman Proteome Organization

Database Searching by Sequence Similarity

Sequence Alignment

  • USC Sequence Alignment Server - align two sequences with all possible varieties of dynamic programming

  • T-COFFEE - multiple sequence alignment

  • ClustalW @ EBI - multiple sequence alignment

  • MSA 2.1 - optimal multiple sequence alignment using the Carrillo-Lipman method

  • BOXSHADE - pretty printing and shading of multiple alignments

  • Splign - a utility for computing cDNA-to-genomic (spliced) sequence alignments. At the heart of the program is a global alignment algorithm that specifically accounts for introns and splice signals.

  • Spidey - an mRNA-to-genomic alignment program

  • Wise2 - align a protein or profile HMM against genomic sequence to predict a gene structure, and related tools

  • PipMaker - computes alignments of similar regions in two (long) DNA sequences

  • VISTA - align + detect conserved regions in long genomic sequences

  • myGodzilla - align a sequence to its ortholog in the human genome

Human Genome Databases

Databases of other Organisms

Genome-wide Analysis

  • MBGD - comparative analysis of completely sequenced microbial genomes

  • COGs - phylogenetic classification of orthologous proteins from complete genomes

  • STRING - detect whether a given query gene occurs repeatedly with certain other genes in potential operons

  • Pedant - automatic whole genome annotation

  • GeneCensus - various whole genome comparisons

Protein Domains: Databases and Search Tools

  • InterPro - integration of Pfam, PRINTS, PROSITE, SWISS-PROT + TrEMBL

  • PROSITE - database of protein families and domains

  • Pfam - alignments and hidden Markov models covering many common protein domains

  • SMART - analysis of domains in proteins

  • ProDom - protein domain database

  • PRINTS Database - groups of conserved motifs used to characterise protein families

  • Blocks - multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins

  • Protein Domain Profile Analysis @ BMERC - search a library of profiles with a protein sequence

  • TIGRFAMs - yet more protein families based on Hidden Markov Models

Motif and Pattern Search in Sequences

  • Gibbs Motif Sampler - identification of conserved motifs in DNA or protein sequences

  • AlignACE Homepage - gene regulatory motif finding

  • MEME - motif discovery and search in protein and DNA sequences

  • SAM - tools for creating and using Hidden Markov Models

  • Pratt - discover patterns in unaligned protein sequences

  • Motivated Proteins - a web facility for exploring small hydrogen-bonded motifs

Protein 3D Structure

Phylogeny & Taxonomy

Gene Prediction

Gene Expression Databases (including RNA-seq and single cell)

Gene Regulation

Metabolic, Gene Regulatory & Signal Transduction Network Databases

  • KEGG - Kyoto Encyclopedia of Genes and Genomes

  • BioCarta

  • DAVID - Database for Annotation, Visualization and Integrated Discovery - a useful server for annotating microarray and other genetic data

  • STKE - Signal Transduction Knowledge Environment

  • BIND - Biomolecular Interaction Network Database

  • EcoCyc

  • WIT

  • PathGuide - a very useful collection of resources dealing primarily with pathways

  • SPAD - Signaling Pathway Database

  • CSNDB - Cell Signalling Networks Database

  • PathDB

  • Transpath

  • DIP - Database of Interacting Proteins

  • PFBP - Protein Function and Biochemical Pathways

  • Alliance for Cellular Signalling

Systems Biology

Other Databases (Annotations, Ontologies, Consortia, etc.)

Miscellaneous Tools

Computational Resources

Bioinformatics on-line course materials and tutorials (not an exhaustive collection)

Intro to bioinformatics and computational biology:

Algorithms:

Miscellaneous:

Web Sites for Background Information & News

Other Collections of Bioinformatics Resources