
Dan Klein
Assistant Professor
Computer Science Division
University of California at Berkeley
Contact Information
| Email | |
| Mail | 775 Soda Hall, Berkeley, CA 94720-1776 |
| Phone | (510) 643-0805 |
Research
My research focuses on the automatic organization of natural language
information. Some topics of interest to me are:
- Unsupervised language acquisition
- Machine translation
- Efficient algorithms for NLP
- Information extraction
- Linguistically rich models of language
- Integrating symbolic and statistical methods for NLP
- Organization of the web
My group's web page.
Background
My education, in reverse order.
Some fellowships / awards:
Some paper awards I've won:
- Best Paper Award, ACL 2003, for "Accurate Unlexicalized
Parsing" with Chris Manning
- Best Paper Award, EMNLP 2004, for "Max-Margin Parsing"
with Ben Taskar, Mike Collins, Chris Manning, and Daphne Koller
- Best Student Paper Award, NAACL 2006, for "Prototype-Driven
Learning for Sequence Models" with Aria Haghighi
A vaguely current CV. [pdf]
Teaching
This term I am teaching
cs294-19, the graduate statistical NLP course (once known as cs294-5 and cs294-7).
Last term I taught
cs188, the undergraduate introduction to artificial intelligence.
Publications
2007
- A Probabilistic Approach to Diachronic Phonology, Alexandre Bouchard-Côté, Percy Liang, Thomas Griffiths, and Dan Klein, In proceedings of EMNLP 2007. [pdf] [slides]
- The Infinite PCFG using Hierarchical Dirichlet Processes, Percy Liang, Slav Petrov, Michael Jordan, and Dan Klein, In proceedings of EMNLP 2007. [pdf] [slides]
- Learning Structured Models for Phone Recognition, Slav Petrov, Adam Pauls, and Dan Klein, In proceedings of EMNLP-CoNLL 2007. [pdf] [slides] [bib]
- A* Search via Approximate Factoring, Aria Haghighi, John DeNero, and Dan Klein, In proceedings of AAAI (Nectar Track) 2007. [pdf]
- Learning and Inference for Hierarchically Split PCFGs, Slav Petrov and Dan Klein, In proceedings of AAAI (Nectar Track) 2007. [pdf] [slides] [bib]
- Unsupervised Coreference Resolution in a Nonparametric Bayesian Model, Aria Haghighi and Dan Klein, In proceedings of ACL 2007. [pdf] [slides] [bib]
- Tailoring Word Alignments to Syntactic Machine Translation, John DeNero and Dan Klein, In proceedings of ACL 2007. [pdf] [slides]
- Improved Inference for Unlexicalized Parsing, Slav Petrov and Dan Klein, In proceedings of HLT-NAACL 2007. [pdf] [slides] [bib]
- Approximate Factoring for A* Search, Aria Haghighi, John DeNero, and Dan Klein, In proceedings of HLT-NAACL 2007. [pdf] [slides] [bib]
2006
- Learning Accurate, Compact, and Interpretable Tree Annotation, Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein, In proceedings of COLING-ACL 2006. [pdf] [slides] [bib]
- Non-Local Modeling with a Mixture of PCFGs, Slav Petrov, Leon Barrett, and Dan Klein, In proceedings of CoNLL 2006. [pdf] [slides] [bib]
- An End-to-End Discriminative Approach to Machine Translation, Percy Liang, Alexandre Bouchard-Côté, Dan Klein, and Ben Taskar, In proceedings of COLING-ACL 2006. [pdf] [slides] [bib]
- Why Generative Phrase Models Underperform Surface Heuristics, John DeNero, Dan Gillick, James Zhang, and Dan Klein, Workshop on Statistical Machine Translation at HLT-NAACL 2006. [pdf] [slides] [bib]
- Alignment by Agreement, Percy Liang, Ben Taskar, and Dan Klein, In proceedings of NAACL 2006. [pdf] [slides] [bib]
- Prototype-Driven Learning for Sequence Models, Aria Haghighi and Dan Klein, In proceedings of HLT-NAACL 2006. [pdf] [slides] [bib]
- Prototype-Driven Grammar Induction, Aria Haghighi and Dan Klein, In proceedings of COLING-ACL 2006. [pdf] [slides] [bib]
- Word Alignment Via Quadratic Assignment, Simon Lacoste-Julien, Ben Taskar, Dan Klein, and Michael Jordan, In proceedings of NAACL 2006. [pdf] [bib]
2005
- Robust Textual Inference via Graph Matching, Aria Haghighi, Andrew Ng, and Christopher Manning, In proceedings of HLT-EMNLP 2005. [pdf] [bib]
- A Discriminative Matching Approach to Word Alignment, Ben Taskar, Simon Lacoste-Julien, and Dan Klein, In proceedings of EMNLP 2005. [pdf] [bib]
- The Unsupervised Learning of Natural Language Structure, Dan Klein, Ph.D. Thesis, Stanford University 2005. [pdf]
- Unsupervised Learning of Field Segmentation Models for Information Extraction, Trond Grenager, Dan Klein, and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2005. [pdf]
2004
- Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2004. [pdf]
- Max-Margin Parsing, Ben Taskar, Dan Klein, Michael Collins, Daphne Koller, and Chris Manning, In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2004. [pdf]
- Review of Data-Oriented Parsing, edited by Rens Bod, Remko Scha, and Khalil Sima'an, Dan Klein, Computational Linguistics 2004.
2003
- Accurate Unlexicalized Parsing, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2003. [pdf]
- Factored A* Search for Models over Sequences and Trees, Dan Klein and Chris Manning, In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) 2003. [pdf]
- A* Parsing: Fast Exact Viterbi Parse Selection, Dan Klein and Chris Manning, In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) 2003. [pdf]
- Named Entity Recognition with Character-Level Models, Dan Klein, Joseph Smarr, Huy Nguyen, and Chris Manning, In Proceedings of the Conference on Natural Language Learning (CoNLL) 2003. [pdf]
- Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network, Kristina Toutanova, Dan Klein, Chris Manning, and Yoram Singer, In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) 2003. [pdf]
- Spectral Learning, Sepandar Kamvar, Dan Klein, and Chris Manning, In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) 2003. [pdf]
2002
- A Generative Constituent-Context Model for Improved Grammar Induction, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2002. [pdf]
- Parsing and Hypergraphs, Dan Klein and Chris Manning, Bunt, Carroll, and Satta, eds., New Developments in Parsing Technology, Kluwer Academic Publishers 2002.
- Fast Exact Inference with a Factored Model for Natural Language Processing, Dan Klein and Chris Manning, In Advances in Neural Information Processing Systems 15 (NIPS) 2002. [pdf]
- Conditional Structure versus Conditional Estimation in NLP Models, Dan Klein and Chris Manning, In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2002. [pdf]
- Combining Heterogeneous Classifiers for Word-Sense Disambiguation, Dan Klein, Kristina Toutanova, Tolga Ilhan, Sepandar Kamvar, and Chris Manning, ACL Workshop on Word Sense Disambiguation 2002. [pdf]
- From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering, Dan Klein, Sepandar Kamvar, and Chris Manning, In Proceedings of the International Conference on Machine Learning (ICML) 2002. [pdf]
- Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based Approach, Sepandar Kamvar, Dan Klein, and Chris Manning, In Proceedings of the International Conference on Machine Learning (ICML) 2002. [pdf]
- Evaluating Strategies for Similarity Search on the Web, Taher Haveliwala, Aristides Gionis, Dan Klein, and Piotr Indyk, In Proceedings of the International World Wide Web Conference (WWW) 2002. [pdf]
2001
- Natural Language Grammar Induction Using a Constituent-Context Model, Dan Klein and Chris Manning, In Advances in Neural Information Processing Systems (NIPS) 2001. [pdf]
- Distributional Phrase Structure Induction, Dan Klein and Chris Manning, In Proceedings of the Conference on Natural Language Learning (CoNLL) 2001. [pdf]
- Parsing and Hypergraphs, Dan Klein and Chris Manning, In Proceedings of the International Workshop on Parsing Technologies (IWPT) 2001. [pdf]
- Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2001. [pdf]
- An O(n^3) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars, Dan Klein and Chris Manning, Stanford Technical Report 2001. [pdf]
Tutorials
- Structured Bayesian Nonparametric Models with Variational Inference, Presented at ACL 2007 with Percy Liang. [pdf]
- Introduction to Classification: Likelihoods, Margins, Features, and Kernels, Presented at NAACL 2007. [pdf]
- Machine Learning for Natural Language Processing: New Developments and Challenges, Presented at NIPS 2006.
- Max-Margin Methods for NLP: Estimation, Structure, and Applications, Presented at ACL 2005 with Ben Taskar. [pdf]
- Maxent Models, Conditional Estimation, and Optimization, without the Magic, Presented at NAACL 2003 and ACL 2003 with Chris Manning. [pdf slides] [pdf handouts]
- Lagrange Multipliers without Permanent Scarring. Permanently in rough draft form, it seems! [pdf-draft]
Personal
I do actually exist outside of the CS/linguistics world. I took
karate for most of my life, and then spent many years with ballroom
dance. Competitive ballroom dance is just like karate, but with more music
and less scowling. I competed and taught for the Stanford
Ballroom Dance Team, and previously competed for the Cornell
Team and the Oxford Team.