ITTC will develop an integrated research and education program to advance the theoretical and computational principles underlying data mining in emerging chemical genomics databases. The core technical innovations are (i) effective kernel-based representations, together with structure pattern extraction and selection methods, that capture the intrinsic characteristics of irregular and discrete spaces such as chemical space; (ii) methods for adaptive and scalable similarity search in large databases of complex data, and for constructing accurate classification models from imbalanced and out-of-domain data; and (iii) application-oriented validation.
A key strength of this work is that it applies the theoretical and computational advances to real-world problems, namely chemical toxicity prediction based on microarray gene expression profiles and high-throughput chemical screening. By developing new tools for graphs and geometric structures, ITTC aims to enable more effective techniques for searching, mining, and analyzing complex data. The effort integrates and advances knowledge in three communities: cheminformatics, data mining, and machine learning.
Faculty Investigator(s): Jun Huan (PI)
Student Investigator(s): Brian Quanz
Copyright © 2008 by the University of Kansas
