Computational Biology, Bioinformatics, Statistical Genetics

Bayesian inference for a generative model of transcriptome profiles from single-cell RNA sequencing. R. Lopez, J. Regier, M. Cole, M. I. Jordan, and N. Yosef. Nature Methods, 15, 1053-1058, 2018.

A deep generative model for semi-supervised classification with noisy labels. M. Langevin, E. Mehlman, J. Regier, R. Lopez, M. I. Jordan, and N. Yosef. arxiv.org/abs/1809.05957, 2018.

A marked Poisson process driven latent shape model for 3D segmentation of reflectance confocal microscopy image stacks of human skin. S. Ghanta, M. I. Jordan, K. Kose, D. Brooks, M. Rajadhyaksha, and J. Dy. IEEE Transactions on Image Processing, 26, 172-184, 2017.

Mining massive amounts of genomic data: A semiparametric topic modeling approach. E. Fang, M-D. Li, M. I. Jordan, and H. Liu. Journal of the American Statistical Association, 10.1080/01621459.2016.1256812, 2016.

Changepoint analysis for efficient variant calling. A. Bloniarz, A. Talwalkar, J. Terhorst, M. I. Jordan, D. Patterson, B. Yu, and Y. Song. International Conference on Research in Computational Molecular Biology (RECOMB), Pittsburgh, PA, 2014.

SMASH: A benchmarking toolkit for variant calling. A. Talwalkar, J. Liptrap, J. Newcomb, C. Hartl, J. Terhorst, K. Curtis, M. Bresler, Y. Song, M. I. Jordan, and D. Patterson. Bioinformatics, DOI:10.1093/bioinformatics/btu345, 2014.

Evolutionary inference via the Poisson indel process. A. Bouchard-Côté and M. I. Jordan. Proceedings of the National Academy of Sciences, 110, 1160-1166, 2013.

Molecular function prediction for a family exhibiting evolutionary tendencies towards substrate specificity swapping: Recurrence of tyrosine aminotransferase activity in the Iα subfamily. K. Muratore, B. Engelhardt, J. Srouji, M. I. Jordan, S. Brenner, and J. Kirsch. Proteins: Structure, Function, and Bioinformatics, DOI:10.1002/prot.24318, 2013.

A million cancer genome warehouse. D. Haussler, D. A. Patterson, M. Diekhans, A. Fox, M. I. Jordan, A. D. Joseph, S. Ma, B. Paten, S. Shenker, T. Sittler and I. Stoica. Technical Report UCB/EECS-2012-211, Department of EECS, University of California, Berkeley, 2012.

Phylogenetic inference via sequential Monte Carlo. A. Bouchard-Côté, S. Sankararaman, and M. I. Jordan. Systematic Biology, 61, 579-593, 2012.

Genome-scale phylogenetic function annotation of large and diverse protein families. B. Engelhardt, M. I. Jordan, J. Srouji, and S. Brenner. Genome Research, 21, 1969-1980, 2011.

Nonparametric combinatorial sequence models. F. Wauthier, M. I. Jordan, and N. Jojic. 15th Annual International Conference on Research in Computational Molecular Biology (RECOMB), Vancouver, BC, 2011.

Feature space resampling for protein conformational search. B. Blum, M. I. Jordan, and D. Baker. Proteins: Structure, Function, and Bioinformatics, 78, 1583-1593, 2010. [Supplementary information].

Neighbor-dependent Ramachandran probability distributions of amino acids developed from a hierarchical Dirichlet process model. D. Ting, G. Wang, M. Shapovalov, R. Mitra, M. I. Jordan, and R. Dunbrack. PLoS Computational Biology, 6, e1000763, 2010.

Active site prediction using evolutionary and structural information. S. Sankararaman, F. Sha, J. Kirsch, M. I. Jordan, and K. Sjolander. Bioinformatics, 26, 617-624, 2010.

Genomic privacy and the limits of individual detection in a pool. S. Sankararaman, G. Obozinski, M. I. Jordan, and E. Halperin. Nature Genetics, 41, 965-967, 2009.

Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data. J. Yin, M. I. Jordan, and Y. Song. Bioinformatics, 25, i231-i239, 2009.

Efficient inference in phylogenetic InDel trees. A. Bouchard-Côté, M. I. Jordan, and D. Klein. In D. Koller, Y. Bengio, D. Schuurmans and L. Bottou (Eds.), Advances in Neural Information Processing Systems (NIPS) 22, 2009.

Association mapping and significance estimation via the coalescent. G. Kimmel, R. Karp, M. I. Jordan, and E. Halperin. American Journal of Human Genetics, 83, 675-683, 2008.

A dual receptor cross-talk model of G protein-coupled signal transduction. P. Flaherty, M. A. Radhakrishnan, T. Dinh, M. I. Jordan, and A. P. Arkin. PLoS Computational Biology, 4, e1000185, 2008.

Consistent probabilistic outputs for protein function prediction. G. Obozinski, C. E. Grant, G. R. G. Lanckriet, M. I. Jordan, and W. S. Noble. Genome Biology, 9, S7, 2008.

On the inference of ancestries in admixed populations. S. Sankararaman, G. Kimmel, E. Halperin, and M. I. Jordan. Genome Research, 18, 668-675, 2008.

Feature selection methods for improving protein structure prediction with Rosetta. B. Blum, M. I. Jordan, D. Kim, R. Das, P. Bradley, and D. Baker. In J. Platt, D. Koller, Y. Singer and A. McCallum (Eds.), Advances in Neural Information Processing Systems (NIPS) 21, 2008.

Quantitative gene function assignment from genomic datasets in M. musculus. L. Pena-Castillo, et al. Genome Biology, 9, 2008.

A randomization test for controlling population stratification in whole-genome association studies. G. Kimmel, M. I. Jordan, E. Halperin, R. Shamir, and R. Karp. American Journal of Human Genetics, 81, 895-905, 2007.

Bayesian haplotype inference via the Dirichlet process. E. P. Xing, M. I. Jordan, and R. Sharan. Journal of Computational Biology, 14, 267-284, 2007.

Bayesian multi-population haplotype inference via a hierarchical Dirichlet process mixture. E. P. Xing, K.-A. Song, M. I. Jordan, and Y. W. Teh. Proceedings of the 23rd International Conference on Machine Learning (ICML), 2006.

A statistical graphical model for predicting protein molecular function. B. Engelhardt, M. I. Jordan, and S. Brenner. Proceedings of the 23rd International Conference on Machine Learning (ICML), 2006.

Robust design of biological experiments. P. Flaherty, M. I. Jordan, and A. P. Arkin. In Y. Weiss, B. Schoelkopf, and J. Platt (Eds.), Advances in Neural Information Processing Systems (NIPS) 19, 2006.

Mining the Caenorhabditis Genetic Center bibliography for genes related to life span. D. M. Blei, M. I. Jordan, and S. Mian. BMC Bioinformatics, 7, 250-269, 2006.

Protein function prediction by Bayesian phylogenomics. B. E. Engelhardt, M. I. Jordan, K. E. Muratore, and S. E. Brenner. PLoS Computational Biology, 1, e45, 2005.

Subtree power analysis and species selection for comparative genomics. J. D. McAuliffe, M. I. Jordan, and L. Pachter. Proceedings of the National Academy of Sciences, 102, 7900-7905, 2005. [preprint]

Genome-wide requirements for resistance to functionally distinct DNA-damaging agents. W. Lee, R. P. St. Onge, M. Proctor, P. Flaherty, M. I. Jordan, A. P. Arkin, R. W. Davis, C. Nislow, and G. Giaever. PLoS Genetics, 1, 235-246, 2005.

Sulfur and nitrogen limitation in Escherichia coli K12: specific homeostatic responses. P. Gyaneshwar, O. Paliy, J. McAuliffe, A. Jones, M. I. Jordan, and S. Kustu. Journal of Bacteriology, 187, 1074-1090, 2005.

A latent variable model for chemogenomic profiling. P. Flaherty, G. Giaever, J. Kumm, M. I. Jordan, and A. P. Arkin. Bioinformatics, 21, 3286-3293, 2005.

Multiple-sequence functional annotation and the generalized hidden Markov phylogeny. J. D. McAuliffe, L. Pachter, and M. I. Jordan. Bioinformatics, 20, 1850-1860, 2004.

Chemogenomic profiling: Identifying the functional interactions of small molecules in yeast. G. Giaever, P. Flaherty, J. Kumm, M. Proctor, D. F. Jaramillo, A. M. Chu, M. I. Jordan, A. P. Arkin, and R. W. Davis. Proceedings of the National Academy of Sciences, 101, 793-798, 2004.

A statistical framework for genomic data fusion. G. R. G. Lanckriet, T. De Bie, N. Cristianini, M. I. Jordan, and W. S. Noble. Bioinformatics, 20, 2626-2635, 2004.

Graphical models. M. I. Jordan. Statistical Science (Special Issue on Bayesian Statistics), 19, 140-155, 2004.

Bayesian haplotype inference via the Dirichlet process. E. P. Xing, R. Sharan, and M. I. Jordan. Proceedings of the 21st International Conference on Machine Learning (ICML), 2004.

Kernel-based data fusion and its application to protein function prediction in yeast. G. R. G. Lanckriet, M. Deng, N. Cristianini, M. I. Jordan, and W. S. Noble. Pacific Symposium on Biocomputing (PSB), 2004. [Supplementary information].

Robust sparse hyperplane classifiers: application to uncertain molecular profiling data. C. Bhattacharyya, L. R. Grate, M. I. Jordan, L. El Ghaoui, and I. S. Mian. Journal of Computational Biology, 11, 1073-1089, 2004.

LOGOS: A modular Bayesian model for de novo motif detection. E. P. Xing, W. Wu, M. I. Jordan, and R. M. Karp. Journal of Bioinformatics and Computational Biology, 2, 127-154, 2004.

Toward a protein profile of Escherichia coli: Comparison to its transcription profile. R. W. Corbin, O. Paliy, F. Yang, J. Shabanowitz, M. Platt, C. E. Lyons, Jr., K. Root, J. D. McAuliffe, M. I. Jordan, S. Kustu, E. Soupene, and D. F. Hunt. Proceedings of the National Academy of Sciences, 100, 9232-9237, 2003.

Kernel-based integration of genomic data using semidefinite programming. G. R. G. Lanckriet, N. Cristianini, M. I. Jordan, and W. S. Noble. In B. Schoelkopf, K. Tsuda and J-P. Vert (Eds.), Kernel Methods in Computational Biology, Cambridge, MA: MIT Press, 2003.

A hierarchical Bayesian Markovian model for motifs in biopolymer sequences. E. P. Xing, M. I. Jordan, R. M. Karp and S. Russell. In S. Becker, S. Thrun, and K. Obermayer (Eds.), Advances in Neural Information Processing Systems (NIPS) 16, 2003.

Integrated analysis of transcript profiling and protein sequence data. L. R. Grate, C. Bhattacharyya, M. I. Jordan, and I. S. Mian. Mechanisms of Ageing and Development, 124, 109-114, 2003.

LOGOS: A modular Bayesian model for de novo motif detection. E. P. Xing, W. Wu, M. I. Jordan, and R. M. Karp. IEEE Computer Society Bioinformatics Conference (CSB), 2003.

Simultaneous relevant feature identification and classification in high-dimensional spaces: Application to molecular profiling data. C. Bhattacharyya, L. R. Grate, A. Rizki, D. Radisky, F. J. Molina, M. I. Jordan, M. J. Bissell, and I. S. Mian. Signal Processing, 83, 729-743, 2003.

Simultaneous relevant feature identification and classification in high-dimensional spaces. L. R. Grate, C. Bhattacharyya, M. I. Jordan and I. S. Mian. Workshop on Algorithms in Bioinformatics, 2002. [matlab code], [perl/lp_solve code].

Feature selection for high-dimensional genomic microarray data. E. P. Xing, M. I. Jordan, and R. M. Karp. Machine Learning: Proceedings of the Eighteenth International Conference, San Mateo, CA: Morgan Kaufmann, 2001.