Projects
To achieve my research goals, I use genomics (functional and comparative) and computational biology tools (bioinformatic approaches).
Here's a list of our efforts, past and current, as well as those in the pipeline.
Click on the links below to learn more.
Comparative Genomics Tools (in-house developed and published online databases and servers):
Name: Trafac
Purpose: Combines the results of conserved sequence analyses with those of transcription factor binding site analyses and allows detection and visualization of compositionally similar cis-element clusters in the context of conserved sequences. The results are depicted as a Regulogram and a Trafacgram.
Reference: Jegga et al., 2002 (Citing articles via Google Scholar)
Name: Cis-Mols
Purpose: Identifies compositionally predicted cis-clusters that occur in groups of co-regulated genes within each of their ortholog-pair evolutionarily conserved cis-regulatory regions. It is designed to search for regulatory clusters not just in the upstream region of co-expressed genes but also in the non-coding intronic and 5' and 3' flanking genomic regions.
Reference: Jegga et al., 2005 (Citing articles via Google Scholar)
Name: GenomeTrafac
Purpose: Web-accessible database resource that allows genome-wide detection and characterization of compositionally similar cis-clusters that occur in gene orthologs between any two genomes, for both microRNA genes and conventional RNA-encoding genes.
Reference: Jegga et al., 2007 (Citing articles via Google Scholar).
Name: ConCisE Scanner
Purpose: Enables you to select one or more transcription factor binding sites and search all genes in the database for clusters containing the selected site(s). Within each cluster, you can view the exact position of each conserved binding site.
Functional Genomics Server and Repository
Name: PolyDoms
Purpose: Integrates the results of multiple algorithmic procedures and functional criteria applied to the entire Entrez dbSNP dataset. In addition to predicting structural and functional impacts of all nsSNPs, filtering functions enable group-based identification of potentially harmful nsSNPs among multiple genes associated with specific diseases, anatomies, mammalian phenotypes, gene ontologies, pathways or protein domains. PolyDoms provides a means to derive a list of candidate SNPs to be evaluated in experimental or epidemiological studies for impact on protein functions and disease risk associations.
Reference: Jegga et al., 2007 (Citing articles via Google Scholar).
Systems Biology
Name: ToppFun
Purpose: Performs functional enrichment analysis of gene lists using as many as 17 annotation categories. Results can be downloaded as a tab-delimited file or displayed as a chart. The hypergeometric distribution with Bonferroni correction is used as the standard method for determining statistical significance.
Access: http://toppgene.cchmc.org
Help: http://toppgene.cchmc.org/help/help.jsp
Reference: Chen et al., 2009 (Citing articles via Google Scholar)
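The significance test described for ToppFun can be illustrated with a small sketch. This is not ToppFun's code: the function names, the toy gene and term identifiers, and the background size are all hypothetical, and the example assumes every gene in the input list belongs to the background population.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the annotation."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

def enrich(gene_list, annotations, background_size):
    """Return {term: (raw_p, bonferroni_p)} for every annotation term tested."""
    genes = set(gene_list)
    n_tests = len(annotations)  # Bonferroni: multiply by the number of tests
    results = {}
    for term, members in annotations.items():
        overlap = len(genes & members)
        p = hypergeom_upper_tail(overlap, background_size, len(members), len(genes))
        results[term] = (p, min(1.0, p * n_tests))
    return results

# Hypothetical toy data: 20 background genes, two annotation terms.
annotations = {
    "term_A": {"g1", "g2", "g3", "g4", "g5"},
    "term_B": {"g10", "g11"},
}
results = enrich(["g1", "g2", "g3", "g9"], annotations, background_size=20)
for term, (raw_p, adj_p) in results.items():
    print(term, round(raw_p, 4), round(adj_p, 4))
```

The Bonferroni step here simply multiplies each raw p-value by the number of terms tested and caps the result at 1, which is the most conservative of the common multiple-testing corrections.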
Name: ToppGene
Purpose: Generates a representative profile of the training genes using as many as 17 features and identifies over-represented terms from the training genes. This first step is performed by the ToppFun component. The test set genes are then compared to this representative profile of the training set or to the over-represented terms from the training genes.
Access and Help: same as above
Reference: Chen et al., 2007 (Citing articles via Google Scholar).
Name: ToppNet
Purpose: Gene prioritization based on protein–protein interaction network (PPIN) analyses. Based on the observation that biological networks share many properties with Web and social networks, ToppNet uses extended versions of three algorithms from White and Smyth (PageRank with Priors, HITS with Priors and K-step Markov) to prioritize disease candidate genes by estimating their relative importance in the PPIN to the disease-related genes.
Access and Help: same as above
Reference: Chen et al., 2009 (Citing articles via Google Scholar).
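To give a feel for the network-based idea, here is a minimal sketch of a basic PageRank-with-priors iteration on an undirected interaction graph. It is a textbook-style simplification, not ToppNet's extended algorithm: the function name, the restart probability `beta`, the iteration count, and the four-node toy network are all assumptions made for illustration, and every seed gene is assumed to be a node in the graph.

```python
def pagerank_with_priors(adjacency, seeds, beta=0.3, iterations=100):
    """Score nodes by their relative importance to the seed (training) genes.

    adjacency: {node: set of neighbours} for an undirected network.
    seeds: training genes; the random walk restarts at them with probability beta.
    """
    nodes = list(adjacency)
    prior = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {v: beta * prior[v] for v in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # An isolated node hands its mass back to the priors.
                for v in nodes:
                    new_rank[v] += (1.0 - beta) * rank[u] * prior[v]
                continue
            share = (1.0 - beta) * rank[u] / len(neighbours)
            for v in neighbours:
                new_rank[v] += share
        rank = new_rank
    return rank

# Hypothetical path-shaped network A - B - C - D with A as the known disease gene.
network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
scores = pagerank_with_priors(network, seeds={"A"})
candidates = sorted(["B", "C", "D"], key=scores.get, reverse=True)
print(candidates)
```

In this toy network the candidates closer to the seed gene receive higher scores, which is the intuition behind ranking candidate genes by their importance relative to known disease genes.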
Name: ToppGeNet
Purpose: Differs from ToppGene and ToppNet in that the test set is derived from the protein interactome. In other words, for a training set of known disease genes, the test set is generated by mining the protein interactome and compiling the genes that interact either directly or indirectly (based on user input) with the training set. The interactome-based test set genes can be prioritized using either a functional annotation-based method (ToppGene) or a PPIN-based method (ToppNet).
Access and Help: same as above
Reference: Chen et al., 2009 (Citing articles via Google Scholar)
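The idea of building a test set from direct or indirect interactors can be sketched as a breadth-first search over the interactome. The function name, the `max_distance` parameter (1 for direct interactors only, 2 to include interactors of interactors), and the toy interaction map are hypothetical; the actual server may construct its test sets differently.

```python
from collections import deque

def interactome_test_set(interactions, training_genes, max_distance=1):
    """Collect genes within max_distance interaction steps of any training gene."""
    training = set(training_genes)
    distance = {gene: 0 for gene in training}
    queue = deque(training)
    while queue:
        gene = queue.popleft()
        if distance[gene] == max_distance:
            continue  # do not expand beyond the requested neighbourhood
        for partner in interactions.get(gene, ()):
            if partner not in distance:
                distance[partner] = distance[gene] + 1
                queue.append(partner)
    # The training genes themselves are excluded from the test set.
    return {gene for gene in distance if gene not in training}

# Hypothetical interaction map: T1 is the only training gene.
interactions = {
    "T1": {"A", "B"},
    "A": {"T1", "C"},
    "B": {"T1"},
    "C": {"A", "D"},
    "D": {"C"},
}
print(sorted(interactome_test_set(interactions, ["T1"], max_distance=1)))
print(sorted(interactome_test_set(interactions, ["T1"], max_distance=2)))
```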
Name: PhenoHM
Purpose: Human–mouse comparative phenome–genome server that facilitates cross-species identification of genes associated with orthologous phenotypes. By cross-mapping mouse–human phenotype terms, extracting implicated genes and extrapolating phenotype-gene associations between species, PhenoHM enables rapid identification of genes that trigger similar outcomes in human and mouse.
Reference: Sardana et al., 2010 (Citing articles via Google Scholar)
Name: GATACA
Purpose: GATACA, or Gene(tic) Associations To Anatomy and Clinical Abnormalities, is a disease-centered knowledgebase that enables biomedical researchers to explore, analyze, and hypothesize genetic pathways, networks and processes responsible for disease.
Access: http://gataca.cchmc.org
Status: Ongoing
Name: Orphan Diseasome (Orphan or Rare Disease Networks)
Purpose: The Orphan Diseasome web site allows investigators to explore orphan disease (OD), or rare disease, relationships based on shared genes and shared enriched features (e.g., Gene Ontology Biological Process, Cellular Component, Pathways, Mammalian Phenotype). Users can also explore networks of orphan disease causal genes, where the nodes are orphan disease-causing mutant genes (ODMG) and an edge represents a shared OD or a protein-protein interaction.
Access: http://research.cchmc.org/od
Reference: Zhang et al., 2011 (Citing articles via Google Scholar)
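As a rough illustration of a disease network built on shared genes, the sketch below links two diseases whenever their causal gene sets overlap. The function name and the disease and gene identifiers are placeholders rather than data from the Orphan Diseasome, and the real resource also links diseases through shared enriched features, which this sketch does not cover.

```python
from itertools import combinations

def shared_gene_network(disease_genes):
    """Return {(disease_1, disease_2): shared genes} for every overlapping pair."""
    edges = {}
    for d1, d2 in combinations(sorted(disease_genes), 2):
        shared = disease_genes[d1] & disease_genes[d2]
        if shared:
            edges[(d1, d2)] = shared
    return edges

# Placeholder disease-to-gene map.
disease_genes = {
    "disease_A": {"gene_1", "gene_2"},
    "disease_B": {"gene_2", "gene_3"},
    "disease_C": {"gene_4"},
}
for pair, genes in shared_gene_network(disease_genes).items():
    print(pair, sorted(genes))
```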
Name: IPF Database
Purpose: Compilation of differentially expressed genes in IPF from different published studies.
Status: Ongoing