CS 281, Spring 1998, Machine Learning
Reading List




Books

Required: Machine Learning, by Tom Mitchell, McGraw Hill, 1997.

Required: Neural Networks for Pattern Recognition, by Chris Bishop, Oxford, 1995.

Recommended: Artificial Intelligence: A Modern Approach, by Stuart Russell and Peter Norvig, Prentice Hall, 1995.

Week 1 (1/19, Wed. only): Introduction: Learning in intelligent agents

Russell and Norvig Ch.1 (skim), Ch.2.

Week 2 (1/26): Models of learning; function learning; version spaces

Russell and Norvig Ch.18.1-2
Mitchell Ch.1-2
Bishop Ch.1.1-1.7

Week 3 (2/2): Function learning (decision trees)

Mitchell Ch.3

Claude Sammut, Scott Hurst, Dana Kedzier, and Donald Michie, ``Learning to Fly.'' In Proc. ML-92.

George John, Ron Kohavi, and Karl Pfleger, ``Irrelevant features and the subset selection problem.'' In Proc. ML-94.

Ron Kohavi, ``A study of cross-validation and bootstrap for accuracy estimation and model selection.'' In Proc. IJCAI-95.

Week 4 (2/9): Function learning, theoretical analysis

Usama Fayyad and Keki Irani, ``Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning.'' In Proc. IJCAI-93.

Andrew W. Moore and Mary Soon Lee, ``Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets.'' JAIR, to appear.

Mitchell Ch.7.1-7.3

Week 6 (2/23): Bayesian learning, Naive Bayes classifier

Bishop Ch.1.8-1.10, Ch.2.1-2.3
Mitchell Ch.6.1-6.10

Week 7 (3/2): Mixture models, Bayesian networks

Bishop Ch.2.6
Mitchell Ch.6.12
(Background: Russell and Norvig Ch.14, Ch.15.1-3)

Mitchell Ch.6.11

John Binder, Daphne Koller, Stuart Russell, Keiji Kanazawa, ``Adaptive Probabilistic Networks with Hidden Variables.'' Machine Learning, 29, 213-244, 1997.

Week 8 (3/9): Learning Bayesian networks contd.

David Heckerman, ``A tutorial on learning Bayesian networks,'' Microsoft Research Technical Report MSR-TR-95-06, 1995.

Nir Friedman, ``Learning belief networks in the presence of missing values and hidden variables.'' In Proc. ML-97.

Week 9 (3/16): Probabilistic temporal models, speech

L. Rabiner and B. Juang, ``An introduction to hidden Markov models,'' IEEE ASSP Magazine, 1986.

Geoff Zweig and Stuart Russell, ``Speech Recognition with Dynamic Bayesian Networks.'' Submitted for publication.

Week 10 (3/23): Spring break

Schulz, C. et al., ``Comics.'' In Hearst, W. R. III (Ed.), Sunday Examiner/Chronicle, March 22, 1998.

Week 11 (3/30): Instance-based methods

Bishop Ch.2.5
Mitchell Ch.8 (except 8.4)

C. G. Atkeson, S. A. Schaal, and Andrew W. Moore, ``Locally Weighted Learning.'' AI Review, 11, 11-73, 1997.

Week 12 (4/6): Linear models

Bishop Ch.3

Week 13 (4/13): Neural networks

Bishop Ch.4

Tom Dietterich and Ghulam Bakiri, ``Error-correcting output codes: A general method for improving multiclass inductive learning programs.'' In Proc. AAAI-91.

Week 14 (4/20): RBFs, SVMs

Bishop Ch.5
Mitchell Ch.7.4

C. J. C. Burges, ``A Tutorial on Support Vector Machines for Pattern Recognition.'' Submitted for publication.

Week 15 (4/27): Ensemble methods

Mitchell Ch.7.5

Tom Dietterich, ``Machine Learning Research.'' AI Magazine, 18(4), 97-105 (first part), 1997.

Yoav Freund and Robert Schapire, ``Experiments with a new boosting algorithm.'' In Proc. ML-96.

Andy Golding and Dan Roth, ``Applying Winnow to context-sensitive spelling correction.'' In Proc. ML-96.

Chan, P. K. and Stolfo, S. J., ``Learning arbiter and combiner trees from partitioned data for scaling machine learning.'' In Proc. KDD-95.

Week 16 (5/4): Rule learning, inductive logic programming

Mitchell Ch.10

Week 17 (5/11, Mon. only): ILP contd.

Ashwin Srinivasan, Stephen Muggleton, Mike Sternberg, and Ross King, ``Theories for Mutagenicity: A Study in First-Order and Feature-Based Induction.'' Artificial Intelligence, 85, 277-299, 1996.

Supplementary reading: Reinforcement learning, NLP

Arthur Samuel, ``Some Studies in Machine Learning Using the Game of Checkers.'' In E. A. Feigenbaum and J. Feldman (Eds.), Computers and Thought. New York: McGraw-Hill, 1963.

Russell and Norvig Ch.17.1-3, Ch.20

Gerry Tesauro, ``Temporal difference learning of backgammon strategy.'' In Proc. ML-92.

Andrew Moore and Chris Atkeson, ``Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time.'' Machine Learning, 13(1), 103-130, 1993.

Ron Parr and Stuart Russell, ``Reinforcement Learning with Hierarchies of Machines.'' In Proc. NIPS-97.

Eric Brill and Ray Mooney (Eds.), Special Issue on Empirical Natural Language Processing. AI Magazine, 18(4), 13-80, 1997.