Information and Telecommunication Technology Center (ITTC)

ITTC Project


CAREER: Mining Genome-wide Chemical-Structure Activity Relationships in Emergent Chemical Genomics Databases

Project Award Date: 07-01-2009



Description

ITTC will develop an integrated research and education program to advance the underlying theoretical and computational principles of data mining in emergent chemical genomics databases. The core technical innovations are advances in (i) developing effective kernel-based representations, along with structure pattern extraction and selection methods, to capture the intrinsic characteristics of irregular and discrete spaces such as chemical space; (ii) designing methods for adaptive and scalable similarity search in large databases of complex data, and methods for accurate classification model construction with imbalanced and out-of-domain data; and (iii) deriving application-oriented validation.

A key strength of this work is the application of its theoretical and computational advances to two real-world problems: chemical toxicity prediction based on microarray gene expression profiles, and high-throughput chemical screening. By developing innovative tools for graphs and geometric structures, ITTC will enable more effective techniques for searching, mining, and analyzing domains of complex data. This timely effort integrates and advances knowledge in three communities: cheminformatics, data mining, and machine learning.


Investigators

Faculty Investigator(s): Jun Huan (PI)

Student Investigator(s): Brian Quanz


Project Sponsors


Primary Sponsor(s): National Science Foundation