Readings

An introduction to kernel-based learning algorithms.
K.R. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf.
IEEE Transactions on Neural Networks, 12(2):181-201, 2001.

Nonlinear component analysis as a kernel eigenvalue problem.
B. Schölkopf, A. Smola, and K.R. Müller.
Neural Computation, 10:1299-1319, 1998.

Kernel independent component analysis.
F. R. Bach and M. I. Jordan. Journal of Machine Learning Research,
3, 1-48, 2002. [Read sections 2.1 and 3.2 for now].

Convolution kernels on discrete structures.
D. Haussler. Technical Report UCSC-CRL-99-10,
University of California, Santa Cruz.

Positive definite rational kernels.
C. Cortes, P. Haffner, and M. Mohri.
Proceedings of the Conference on Computational Learning Theory,
2003.

Marginalized kernels for biological sequences.
K. Tsuda, T. Kin and K. Asai. Bioinformatics,
18(Suppl 1), 268-275, 2002.

Learning the kernel matrix with semidefinite programming.
G. R. G. Lanckriet, N. Cristianini, L. El Ghaoui, P. L. Bartlett,
and M. I. Jordan. Journal of Machine Learning Research,
5:27-72, 2004.

Prediction with Gaussian processes: from linear regression to linear
prediction and beyond.
C. Williams. In ``Learning and Inference in Graphical Models,''
MIT Press, 1999.

On spectral clustering: Analysis and an algorithm.
A. Ng, M. I. Jordan, and Y. Weiss.
In Advances in Neural Information Processing (NIPS) 14,
MIT Press, 2002.

Multiclass spectral graph partitioning.
S. X. Yu and J. Shi.
In International Conference on Computer Vision,
2003.

Learning spectral clustering.
F. Bach and M. I. Jordan.
In Advances in Neural Information Processing (NIPS) 16,
MIT Press, 2004.

Cluster kernels for semi-supervised learning.
O. Chapelle, J. Weston and B. Schölkopf.
In Advances in Neural Information Processing (NIPS) 14,
MIT Press, 2002.

An introduction to MCMC for machine learning.
C. Andrieu, N. de Freitas, A. Doucet and M. I. Jordan.
Machine Learning, 50, 5-43, 2003.

A Bayesian analysis of some nonparametric problems.
T. S. Ferguson.
Annals of Statistics, 1, 209-230, 1973.

Ferguson distributions via Pólya urn schemes.
D. Blackwell and J. MacQueen.
Annals of Statistics, 1, 353-355, 1973.

Mixtures of Dirichlet processes with applications to
Bayesian nonparametric problems.
C. Antoniak.
Annals of Statistics, 2, 1152-1174, 1974.

Bayesian density estimation and inference using mixtures.
M. Escobar and M. West.
Journal of the American Statistical Association, 90, 577-588, 1995.

A constructive definition of Dirichlet priors.
J. Sethuraman. Statistica Sinica, 4, 639-650, 1994.

Markov chain sampling methods for
Dirichlet process mixture models.
R. Neal. Technical Report 9815, Department of Statistics,
University of Toronto, 1998.

Bayesian haplotype inference via the Dirichlet process.
E. P. Xing, R. Sharan and M. I. Jordan.
Technical Report CSD-03-1275, Division of Computer Science,
University of California, Berkeley, 2003.

Hierarchical topic models and the nested Chinese restaurant process.
D. M. Blei, T. Griffiths, M. I. Jordan, and J. Tenenbaum.
In press: Advances in Neural Information Processing Systems (NIPS) 16, 2003.

Hierarchical Dirichlet processes.
Y. W. Teh, M. I. Jordan, M. J. Beal and D. M. Blei.
Technical Report 653, Department of Statistics,
University of California, Berkeley, 2004.

Convexity, classification, and risk bounds.
P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe.
Technical Report 638, Department of Statistics,
University of California, Berkeley, 2003.

Concentration-of-measure inequalities.
G. Lugosi. Department of Economics,
Pompeu Fabra University, 2004.

A few notes on Statistical Learning Theory.
S. Mendelson. In Advanced Lectures in Machine Learning,
LNCS 2600, New York: Springer, 2003.