Instructor: Noah Smith (Assistant Professor in the LTI) — email me (fix the address first)
Time: Spring 2010, Tuesdays and Thursdays 3-4:20pm
Place: GHC 4101
Web page: makeitunderstand.org
History: Spring 2010, Spring 2009, Spring 2008
Prerequisites: Fundamental Data Structures and Algorithms (15-211) and strong programming capabilities
Textbook: Speech and Language Processing (second edition, 2007, Prentice-Hall), by Daniel Jurafsky and James Martin

DAVID BOWMAN: Open the pod bay doors, HAL.
HAL 9000: I'm sorry, Dave, I'm afraid I can't do that.
- Stanley Kubrick and Arthur C. Clarke, screenplay of 2001: A Space Odyssey
The field this course covers is called Natural Language Processing, or Computational Linguistics, and it is highly multidisciplinary. The course will therefore include some ideas central to Machine Learning and to Linguistics.
We'll cover computational treatments of words, sounds, sentences, meanings, and conversations. We'll see how probabilities and real-world text data can help. We'll see how different levels interact in state-of-the-art approaches to applications like translation and information extraction.
From a software engineering perspective, there will be an emphasis on rapid prototyping, a useful skill in many other areas of Computer Science. In particular, we will introduce some high-level languages (e.g., regular expressions and Dyna) and some scripting languages (e.g., Python and Perl) that can greatly simplify prototype implementation.
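As a rough illustration of the kind of rapid prototyping meant here, the sketch below uses a Python regular expression to split text into tokens and then estimates word probabilities from counts. It is not taken from the course materials: the sample text, the token pattern, and the names `tokenize` and `relative_frequency` are assumptions made for this example, and a real tokenizer would need to handle many more cases (numbers, hyphens, non-ASCII text).

```python
import re
from collections import Counter

# Hypothetical sample text, chosen only for illustration.
TEXT = "Open the pod bay doors, HAL. I'm sorry, Dave, I'm afraid I can't do that."

# A token is either a run of letters with an optional internal apostrophe
# (so "can't" stays whole) or a single punctuation character.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|[^\w\s]")


def tokenize(text):
    """Return the list of tokens matched by TOKEN_PATTERN, in order."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize(TEXT)

# Count lowercased tokens so that "Open" and "open" share one entry.
counts = Counter(token.lower() for token in tokens)
total = sum(counts.values())


def relative_frequency(word):
    """Estimate P(word) as its count divided by the total token count."""
    return counts[word.lower()] / total


if __name__ == "__main__":
    for word, count in counts.most_common(5):
        print(word, count, round(relative_frequency(word), 3))
```

The whole prototype is about twenty lines, and changing what counts as a token means editing only the single pattern, which is the main appeal of this style for quick experiments.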
Should I take this course?
Yes, if: