Multiclass classification

S. Godbole, S. Sarawagi, S. Chakrabarti, Scaling multi-class Support Vector Machines using inter-class confusion, KDD 2002.

    Multiclass SVM

See the Hsu and Lin paper below for a survey and references therein.

    Multiclass by combination of binary classifiers

R. Rifkin, A. Klautau, In Defense of One-Vs-All Classification, The Journal of Machine Learning Research, 2004.
A very nice paper that gives an overview of the techniques that have been proposed for multiclass classification, with a
critical look at their respective published analyses and a thorough experimental investigation.

C.-W. Hsu and C.-J. Lin, A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13:415-425, 2002.
Comparison of multiclass SVM, one-vs-all (OVA), all-vs-all (AVA), and DAGSVM.

Kaibo Duan & S. Sathiya Keerthi. Which is the Best Multiclass SVM Method? An Empirical Study,  Proceedings of NIPS, 2003
Claims that AVA with pairwise coupling is best. For pairwise coupling, see the Calibration for All-vs-All entry below.
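
As a rough illustration of the two reductions compared in the papers above, the sketch below builds one-vs-all (OVA) and all-vs-all (AVA) classifiers from a binary learner. It is not taken from any of the papers: a plain perceptron stands in for the binary SVM, the toy data is made up, and all function names (train_perceptron, train_ova, train_ava, and so on) are hypothetical.

```python
from itertools import combinations

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_perceptron(X, y, max_epochs=1000):
    """Binary perceptron with labels in {-1, +1}; a stand-in for a binary SVM."""
    w = [0.0] * (len(X[0]) + 1)              # last weight acts as the bias
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in zip(X, y):
            xb = list(x) + [1.0]
            if t * dot(w, xb) <= 0:          # mistake: move w toward the example
                w = [wi + t * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:                    # training set separated, stop early
            break
    return w

def score(w, x):
    return dot(w, list(x) + [1.0])

def train_ova(X, y, classes):
    # One classifier per class: class k versus all the others.
    return {k: train_perceptron(X, [1 if c == k else -1 for c in y])
            for k in classes}

def predict_ova(models, x):
    # Pick the class whose classifier returns the largest real-valued score.
    return max(models, key=lambda k: score(models[k], x))

def train_ava(X, y, classes):
    # One classifier per unordered pair, trained only on those two classes.
    models = {}
    for a, b in combinations(classes, 2):
        pairs = [(x, 1 if c == a else -1) for x, c in zip(X, y) if c in (a, b)]
        models[(a, b)] = train_perceptron([p[0] for p in pairs],
                                          [p[1] for p in pairs])
    return models

def predict_ava(models, x, classes):
    # Each pairwise classifier casts one vote; the most-voted class wins.
    votes = {k: 0 for k in classes}
    for (a, b), w in models.items():
        votes[a if score(w, x) > 0 else b] += 1
    return max(classes, key=lambda k: votes[k])

# Made-up toy data: three well-separated clusters in the plane.
classes = [0, 1, 2]
X = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (0, 10), (1, 10), (0, 11)]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
ova = train_ova(X, y, classes)
ava = train_ava(X, y, classes)
ova_pred = [predict_ova(ova, x) for x in X]
ava_pred = [predict_ava(ava, x, classes) for x in X]
```

OVA trains K classifiers on the full data set, while AVA trains K(K-1)/2 classifiers on smaller subsets. The simple voting used here for AVA is only one way of combining the pairwise outputs; pairwise coupling is another.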

    Calibration for One-vs-All

J. Platt, Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods, Advances in Large Margin Classifiers, A. Smola, P. Bartlett, B. Scholkopf, D. Schuurmans, eds., MIT Press, (1999), to appear.

H.-T. Lin, C.-J. Lin, and R. C. Weng, A note on Platt's probabilistic outputs for support vector machines. May, 2003.
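
To give a feel for what these two references are about, here is a minimal sketch of sigmoid calibration: a classifier score f is mapped to P(y=1 | f) = 1 / (1 + exp(A*f + B)), with A and B fitted on held-out scores using smoothed targets. The optimizer below is plain gradient descent, chosen for brevity; it is not the procedure used in either paper. The function names and the toy scores are assumptions made for illustration.

```python
import math

def prob_from_score(f, A, B):
    """Sigmoid P(y=1 | f) = 1 / (1 + exp(A*f + B)), computed without overflow."""
    z = A * f + B
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def fit_platt(scores, labels, lr=0.1, iters=5000):
    """Fit A and B by gradient descent on the cross-entropy with smoothed targets."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)    # smoothed target for positives
    t_neg = 1.0 / (n_neg + 2.0)              # smoothed target for negatives
    targets = [t_pos if y == 1 else t_neg for y in labels]
    A, B = 0.0, math.log((n_neg + 1.0) / (n_pos + 1.0))
    n = len(scores)
    for _ in range(iters):
        gA = gB = 0.0
        for f, t in zip(scores, targets):
            d = t - prob_from_score(f, A, B)  # derivative of the loss w.r.t. A*f + B
            gA += d * f
            gB += d
        A -= lr * gA / n
        B -= lr * gB / n
    return A, B

# Made-up decision values (e.g. SVM outputs) with labels in {0, 1}.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
A, B = fit_platt(scores, labels)
calibrated = [prob_from_score(f, A, B) for f in scores]
```

A fitted A below zero means that larger scores map to larger probabilities. In practice the scores used for fitting should come from held-out data or cross-validation rather than from the classifier's own training set.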

    Calibration for All-vs-All

Ting-Fang Wu, C-J Lin, R. Weng,  Probability estimates for Multi-Class classification by pairwise coupling. Journal of Machine Learning Research 5 (2004) 975-1005

Structured classification

    Structured perceptron

Michael Collins.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms.
The Structured Perceptron is very easy to implement as a first attempt at structured classification. For an application to machine translation, see An End-to-End Discriminative Approach to Machine Translation by Percy Liang, Alex Bouchard-Cote, Dan Klein, and Ben Taskar, ACL 2006.
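
To back up the claim that it is easy to implement, here is a minimal sketch of a structured perceptron for sequence tagging, with Viterbi decoding over emission and transition features. It leaves out the weight averaging that is commonly used in practice, and the feature templates, function names, and toy sentences are all assumptions made for illustration rather than details from the paper.

```python
from collections import defaultdict

def features(words, tags):
    """Count emission (tag, word) and transition (prev_tag, tag) features."""
    feats = defaultdict(int)
    prev = "<s>"
    for w, t in zip(words, tags):
        feats[("emit", t, w)] += 1
        feats[("trans", prev, t)] += 1
        prev = t
    return feats

def viterbi(words, tagset, weights):
    """Return the highest-scoring tag sequence under the current weights."""
    # best[t] = (score, path) of the best partial sequence ending in tag t
    best = {t: (weights[("emit", t, words[0])] + weights[("trans", "<s>", t)], [t])
            for t in tagset}
    for w in words[1:]:
        new = {}
        for t in tagset:
            s, path = max(((best[p][0] + weights[("trans", p, t)], best[p][1])
                           for p in tagset), key=lambda sp: sp[0])
            new[t] = (s + weights[("emit", t, w)], path + [t])
        best = new
    return max(best.values(), key=lambda sp: sp[0])[1]

def train(data, tagset, max_epochs=500):
    """Perceptron updates: add gold features, subtract predicted features."""
    weights = defaultdict(float)
    for _ in range(max_epochs):
        mistakes = 0
        for words, gold in data:
            guess = viterbi(words, tagset, weights)
            if guess != gold:
                mistakes += 1
                for f, c in features(words, gold).items():
                    weights[f] += c
                for f, c in features(words, guess).items():
                    weights[f] -= c
        if mistakes == 0:        # every training sequence decoded correctly
            break
    return weights

# Made-up toy corpus: D = determiner, N = noun, V = verb.
tagset = ["D", "N", "V"]
data = [
    ("the dog barks".split(), ["D", "N", "V"]),
    ("a cat sleeps".split(), ["D", "N", "V"]),
    ("dogs bark".split(), ["N", "V"]),
]
weights = train(data, tagset)
decoded = [viterbi(words, tagset, weights) for words, _ in data]
```

The only structured component is the decoder: replacing viterbi with another argmax procedure (for example a parser or a beam search) gives a structured perceptron for a different output space.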

    CRF

Gentle introduction by Hanna M. Wallach
The above page also contains valuable pointers to software for CRF.

    M3net and max-margin structured learning

Gentle introduction by Simon Lacoste-Julien

Tutorial by Ben Taskar and Dan Klein on  Max-Margin Methods for NLP: Estimation, Structure, and Applications. The Association for Computational Linguistics (ACL05), Ann Arbor, MI, June 2005.