Ph.D., Carnegie Mellon University, Pittsburgh, PA

Director

Bioinformatics and Computational Life Sciences Laboratory

Information and Telecommunication Technology Center

Associate Professor

Electrical Engineering and Computer Science Department

2001 Eaton Hall, University of Kansas

1520 West 15th Street
Lawrence, KS 66045-7621

 

Tel: (785) 864-8825 (Eaton), or 864-4559 (Nichols)

Fax: (785) 864-3226 or -0387 (ITTC)  

Email: xwchen AT ku DOT edu

 

 

News | Research Projects | Publications | Software | Lab Members | Awards | Teaching

News

-      I am always looking for highly motivated students who are interested in our PhD program and in doing research in the areas of machine learning, data mining, and bioinformatics. If you are interested, feel free to email me with your CV and supporting material (if any).

-      If you are interested in developing and evaluating algorithms for protein-protein interaction (PPI) prediction, please check out our website KUPS, which generates positive and negative PPI pairs.

-      We hold a weekly seminar, the Intelligent Informatics Tea Time. Feel free to stop by.

-      Interested in Bioinformatics? Join us in the IEEE CS Technical Committee on Bioinformatics (TCBI).

-      IEEE Conference on Healthcare Informatics, Imaging, and Systems Biology (HISB), San Jose, California, 2011

Research Projects

My group is interested in developing novel machine learning and data mining algorithms to accelerate knowledge discovery in life sciences and engineering fields. Currently, our work focuses on multi-label learning, learning from large-scale data, small sample classification, and dimensionality reduction. We also computationally analyze a variety of biological data, including high-throughput expression data (e.g., microarray), protein interaction data, and protein sequence data, with applications to biological pathway understanding, GWAS, cancer biology, and healthcare informatics.

We gratefully acknowledge the support from the following sponsors: NSF, DoD, NIH, HRSA, NASA/EPSCoR, JR & Inez Jay Fund, KTEC, KCALSI, and KU.  

-      National Science Foundation CAREER Project

-      National Science Foundation CDI Project (Under Construction)

Publications (selected, 2005 and after)

 

2010

-      X. Chen, J. Jeong, and P. Dermyer: KUPS: Constructing datasets of interacting and non-interacting protein pairs with associated attributes. Nucleic Acids Research, 2010; doi: 10.1093/nar/gkq943

-      M. Wasikowski and X. Chen: Combating the Small Sample Class Imbalance Problem Using Feature Selection. IEEE Transactions on Knowledge and Data Engineering, vol. 22(10):1388-1400, 2010.

-      Y. Chen, H. Yu, B. Luo, and X. Chen: iLike: Integrating Visual and Textual Features for Vertical Search. ACM Multimedia 2010 (MM 10, full paper), Oct. 2010, Firenze, Italy.

-      X. Lin and X. Chen: Soft Relevance for Multi-label Classification. The 19th ACM Conference on Information and Knowledge Management, Oct. 2010 (CIKM 10, full paper), Canada.

-      J. Jeong, X. Lin, and X. Chen: On Position-specific Scoring Matrix for Protein Function Prediction. IEEE/ACM Trans. on Computational Biology and Bioinformatics (TCBB), 2010.

-      B. Han, M. Park, and X. Chen: A Markov blanket-based method for detecting causal SNPs in GWAS. BMC Bioinformatics, 11(suppl 3):S5, doi:10.1186/1471-2105-11-S3-S5, 2010.

-      H. Xiong, Y. Zhang, X. Chen, and J. Yu: Cross-platform microarray data integration using the normalized linear transform. International Journal of Data Mining and Bioinformatics, vol. 4(2): 142-157, 2010.

-      X. Chen and H. Arabnia: Special Issue on Data Mining in Bioinformatics and Biomedicine. IEEE Trans. on Information Technology in Biomedicine, vol. 14(1), 2010, Editorial.

2009

-      M. Liu, X. Chen, and R. Jothi: Knowledge-guided Inference of Domain-domain Interactions from Incomplete Protein-protein Interaction Networks. Bioinformatics, 25(19): 2492-2499, 2009.

-      A. Senf and X. Chen: Identification of Genes Involved in the Same Pathway Using a Hidden Markov Model-based Approach. Bioinformatics, 25(22): 2945-2954, 2009.

-      X. Chen and J. Jeong: Sequence-based Prediction of Protein Interaction Sites with an Integrative Method. Bioinformatics, 25(5): 585-591, 2009.

-      X. Lin, M. Liu, and X. Chen: Assessing reliability of protein-protein interactions by integrative analysis of data in model organisms. BMC Bioinformatics, 2009, 10(Suppl 4):S5.

-      X. Chen, H. Wang, and X. Lin: Learning to rank with a novel kernel perceptron method. The 18th ACM Conference on Information and Knowledge Management, 505-512, 2009 (CIKM 09).

-      X. Wang, A. Zaidi, R. Pal, A. Garrett, R. Braceras, X. Chen, M. Michaelis, and E. Michaelis: Genomics and Biochemical Approaches in the Discovery of Mechanisms for Selective Neuronal Vulnerability to Oxidative Stress. BMC Neuroscience, 10:12, doi: 10.1186/1471-2202-10-12, 2009.

-      Z. Liu, R. Gartenhaus, X. Chen, and M. Tan: Survival Prediction and Gene Identification with Penalized Global AUC Maximization. Journal of Computational Biology, ahead of print. doi: 10.1089/cmb.2008.0188, 2009.

-      B. Han, X. Chen, X. Wang, and M. Michaelis: Integrating Multiple Microarray Data for Cancer Pathway Analysis Using Bootstrapping K-S Test. Journal of Biomedicine and Biotechnology, Article ID 707580, doi: 10.1155/2009/707580, 2009.

2008

-      X. Chen, M. Liu, and R. Ward: Protein Function Assignment through Mining Cross Species Protein-protein Interactions. PLoS ONE, 3(2): e1562, 2008.

-      X. Chen, G. Anantha, and X. Lin: Improving Bayesian Network Structure Learning with Mutual Information-based Node Ordering in the K2 Algorithm. IEEE Transactions on Knowledge and Data Engineering, vol. 20(5): 628-640, 2008. 

-      X. Chen and M. Wasikowski: FAST:  A ROC-based Feature Selection Metric for Small Samples and Imbalanced Data Classification Problems. The 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), 2008.

-      J. Yu, F. Cheng, H. Xiong, W. Qu, and X. Chen: A Bayesian approach to support vector machines for the binary classification. Neurocomputing, vol. 72 (1-3), 177 – 185, 2008.

-      X. Lin, M. Liu, and X. Chen: Protein-protein Interaction Prediction and Assessment from Model Organisms. Proceedings of IEEE International Conference on Bioinformatics and Biomedicine (BIBM), regular paper, 187 – 192, 2008.

-      X. Chen, B. Han, J. Fang, and R. Haasl: Large-scale protein-protein interaction prediction using novel kernel methods. International Journal of Data Mining and Bioinformatics, 2(2), 145-156, 2008.

2007

-      X. Chen and J. Jeong: Minimum Reference Set Based Feature Selection for Small Sample Classifications. Proceedings of the 24th International Conference on Machine Learning (ICML 07), 153 – 160.

-      A. Tarca, V. Carey, X. Chen, R. Romero, and S. Draghici: Machine Learning and Its Applications to Biology. PLoS Computational Biology, vol. 3(6), e116, 2007.

-      H. Xiong, Y. Zhang, and X. Chen: Data-dependent Kernel Machines for Microarray Data Classification. IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 4(4), 583-595, 2007.

-      X. Wang, R. Pal, X. Chen, K. Kumar, OJ. Kim, and E. Michaelis: Genome-wide Transcriptome Profiling of Region-specific Vulnerability to Oxidative Stress in the Hippocampus. Genomics, doi:10.1016/j.ygeno.2007.03.007, 2007.

-      X. Chen, B. Gerlach, D. Chen, and Z. Liu: Structural risk minimization based gene expression profiling analysis. International Journal on Bioinformatics Research and Applications, vol. 3(2), 153-169, 2007.

-      A. Smalter, S. F. Lei, and X. Chen: Human Disease-gene Classification with Integrative Sequence-based and Topological Features of Protein-protein Interaction Networks. Proceedings of Sixth IEEE International Conference on Bioinformatics and Biomedicine, 209-214, 2007.

2006

-      X. Chen, G. Anantha, and X. Wang: An effective structure learning method for constructing gene networks. Bioinformatics, 22(11):1367-1374, 2006.

-      H. Xiong and X. Chen: Kernel-Based Distance Metric Learning for Microarray Data Classification. BMC Bioinformatics, 7:299, 2006.

-      X. Chen, X. Zeng, and D. van Alphen: Multi-class feature selection for texture classification. Pattern Recognition Letters, vol. 27(14), pp. 1685-1691, 2006.

-      X. Chen: Margin based wrapper methods for gene identification using microarray. Neurocomputing, vol. 69 (16-18): 2236-2243, 2006

-      X. Chen and M. Liu: Domain based predictive models for protein-protein interaction prediction. EURASIP Journal on Applied Signal Processing, special issue in Bioinformatics, vol. 2006, Article ID 32767, 2006.

2005

-      X. Chen and M. Liu: Prediction of Protein-protein Interactions Using Random Decision Forest Framework. Bioinformatics, 21(24): 4394-4400, 2005.

-      J. Yu and X. Chen: Bayesian Neural Network Approaches to Ovarian Cancer Identification from High-resolution Mass Spectrometry Data. Bioinformatics, 21 (suppl_1):i487-i494, 2005.

-      J. Yu, S. Ongarello, R. Fiedler, X. Chen, G. Toffolo, C. Cobelli, and Z. Trajanoski: Ovarian Cancer Identification Based on Dimensionality Reduction for High-Throughput Mass Spectrometry Data. Bioinformatics, vol. 21(10), pp. 2200-2209, 2005.

-      X. Zheng and X. Chen: SMO Based Pruning Method for Sparse Least Squares Support Vector Machines. IEEE Transactions on Neural Networks, vol. 16(6), pp. 1541-1546, 2005.

-      X. Wang, R. Pal, X. Chen, N. Limpeachob, K. Kumar, and E. Michaelis: High Intrinsic Oxidative Stress May Underlie Selective Vulnerability of the Hippocampal CA1 Region. Mol. Brain Research, 140: 120-126, 2005.

-      H. Xiong and X. Chen: Optimized Kernel Machine Based Cancer Classification Using Gene Expression Data. Proceedings of 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 268-274, 2005.

-      X. Chen and J. Chen: Protein Flexibility Modeling Using Kernel Based Methods. International Joint Conference on Neural Networks, vol. 1:521-526, 2005.

-      X. Chen, B. Gerlach, and D. Casasent: Pruning Support Vectors for Imbalanced Data Classification. International Joint Conference on Neural Networks, vol. 3:1883-1888, 2005.

Software, Code and Websites

 

-      KUPS: The University of Kansas Proteomics Services

 

 

KUPS provides high-quality protein-protein interaction (PPI) datasets for researchers who are interested in developing and evaluating their computational models for PPI prediction. With KUPS, users can generate both positive and negative PPIs with several choices of features. KUPS also provides benchmark results.

 

(Paper | Database)
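
As a rough illustration of what building a positive/negative PPI dataset involves, the Python sketch below draws negative pairs uniformly at random from the protein pairs that are absent from a given positive set. This is a generic baseline assumed here for illustration only; it is not KUPS's own procedure, and the function name `sample_negative_pairs` and the protein IDs are hypothetical.

```python
from itertools import combinations
import random


def sample_negative_pairs(proteins, positive_pairs, n, seed=0):
    """Draw n distinct unordered protein pairs that are not known positives.

    Hypothetical helper for illustration; not part of KUPS.
    """
    # Store positives as frozensets so (A, B) and (B, A) compare equal.
    positives = {frozenset(pair) for pair in positive_pairs}
    # Enumerate every unordered pair of distinct proteins, then drop positives.
    candidates = [
        pair
        for pair in combinations(sorted(set(proteins)), 2)
        if frozenset(pair) not in positives
    ]
    if n > len(candidates):
        raise ValueError("not enough non-interacting candidate pairs")
    # A seeded generator keeps the sampled dataset reproducible.
    return random.Random(seed).sample(candidates, n)


# Placeholder identifiers, not real proteins or real interactions.
proteins = ["P1", "P2", "P3", "P4"]
positive_pairs = [("P1", "P2"), ("P3", "P2")]
negative_pairs = sample_negative_pairs(proteins, positive_pairs, n=3)
```

Enumerating all candidate pairs is only practical for small protein sets; for proteome-scale data, rejection sampling or a biologically informed filter would be more appropriate.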

 

-      Markov blanket-based method for causal SNP detection in GWAS

 

DASSO-MB is a new Markov blanket-based approach to detect epistatic interactions in case-control genome-wide association studies (GWAS).

 

(Paper | Software)

 

-      HMM Method for Uncovering Genes in the Same Pathway

 

A Hidden Markov Model (HMM) based algorithm for detecting groups of genes functionally similar to a set of input genes from microarray expression data.

(Paper | Software and Data)

-      DDINet: Network of Interacting Protein Domains


DDINet provides a network of interacting protein domains, modeled as an undirected graph in which vertices correspond to Pfam domains and edges represent interactions inferred using our proposed model.

(Paper | Software and Data)
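
The graph representation described for DDINet can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DDINet implementation: the class name `DomainInteractionGraph` is hypothetical, and the Pfam-style accessions and edges below are placeholders rather than inferred interactions.

```python
from collections import defaultdict


class DomainInteractionGraph:
    """Undirected graph: vertices are domain IDs, edges are interactions.

    Hypothetical class for illustration; not the DDINet code.
    """

    def __init__(self):
        # Adjacency sets: each domain maps to the set of its partners.
        self._adj = defaultdict(set)

    def add_interaction(self, a, b):
        # Undirected edge: record it in both directions.
        # A domain may also interact with itself (a == b).
        self._adj[a].add(b)
        self._adj[b].add(a)

    def interacts(self, a, b):
        return b in self._adj.get(a, set())

    def neighbors(self, domain):
        return sorted(self._adj.get(domain, set()))

    def num_edges(self):
        # Ordinary edges appear in two adjacency sets, self-loops in one.
        total = sum(len(partners) for partners in self._adj.values())
        loops = sum(1 for d, partners in self._adj.items() if d in partners)
        return (total + loops) // 2


# Placeholder edges, invented for this example.
graph = DomainInteractionGraph()
for a, b in [("PF00001", "PF00002"),
             ("PF00002", "PF00003"),
             ("PF00001", "PF00001")]:
    graph.add_interaction(a, b)
```

Adjacency sets keep membership queries such as `graph.interacts("PF00002", "PF00001")` constant time on average and make the symmetry of an undirected edge explicit.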

 

 

 

-      Binding: software for protein binding site prediction

 

 

 

Identification of protein interaction sites has a significant impact on understanding protein function, elucidating signal transduction networks, and drug design studies. With protein sequence data growing exponentially, methods that predict protein interaction sites from sequence information alone have drawn increasing interest. In this paper, we propose a predictive model for identifying protein interaction sites.

(Paper | Software and Data)

 

-      CSIDOP: software for protein function assignment

CSIDOP is a new method for protein function assignment based on the shared interacting domain patterns extracted from cross-species protein-protein interaction (PPI) data.

(Paper | Software and Data)

-      RFPPI: random forest based PPI prediction

RFPPI is a random forest-based approach for protein-protein interaction prediction using domain information.

(Paper | Software and Data)

-      DataKernel: Matlab codes for data-dependent kernel

We developed a data-dependent kernel for microarray data analysis.

(Paper | Codes)

Lab Members

Current Students:

Jong Cheol Jeong, PhD Candidate

Bing Han, PhD Candidate

Meeyoung Park, PhD Candidate

Alex Senf, PhD Candidate

Wenrong Zeng, PhD Candidate

Hariprasad Sampathkumar, PhD Candidate

Patrik Dermyer, REU Student

 

Alumni:

Mei Liu (Vanderbilt University), PhD

Mike Wasikowski (ATDCA Center), MSc

Jim Vallandingham (GARMIN), MSc

Gopal Anantha (Sprint), MSc

Byron Gerlach (IBM), MSc

Jeremy Chen (Cerner), MSc

 

Huilin Xiong (Shanghai Jiaotong University, Professor)

Jiangsheng Yu (Beijing University, Associate Professor)

 

Awards

-      NSF CAREER Award

-      2007 Miller Professional Development Award for Distinguished Research, KU School of Engineering

-      2008 Miller Professional Development Award for Distinguished Service, KU School of Engineering

Teaching

-      EECS168: C++ Programming (2010 Fall)
