Natural Language Processing (11-411 for undergraduates, 11-611 for graduate students)
Instructors:
Noah Smith
Time: Spring 2013, Tuesdays and Thursdays 3-4:20pm
Place: GHC 4102
Syllabus: [pdf]
Prerequisites: CS courses on data structures and algorithms, and strong programming capabilities
Textbook:
Speech and Language Processing (second edition, 2007, Prentice-Hall), by Daniel Jurafsky and James Martin
Course Description
This course is about a variety of ways to represent human languages
(like English and Chinese) as computational systems, and how to
exploit those representations to write programs that do neat stuff with
text and speech data, like
- translation,
- summarization,
- extracting information,
- question answering,
- natural interfaces to databases, and
- conversational agents.
This field is called Natural Language
Processing or Computational Linguistics, and it is extremely
multidisciplinary. This course will therefore include some ideas central to Machine Learning
and to Linguistics.
We'll cover computational treatments of words, sounds, sentences,
meanings, and conversations. We'll see how probabilities and real-world text data can help.
We'll see how different levels interact in state-of-the-art approaches to applications
like translation and information extraction.
From a software engineering perspective, there will be an emphasis on
rapid prototyping, a useful skill in many other areas of Computer
Science. In particular, we will introduce some high-level languages
(e.g., regular expressions and Dyna) and some scripting languages
(e.g., Python and Perl) that can greatly simplify prototype
implementation.
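As a rough illustration of the kind of rapid prototyping meant here, the following is a minimal sketch (not course material) that uses a regular expression in Python to split text into word tokens and count them. The function names `tokenize` and `word_counts`, the token pattern, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    # A run of letters, optionally followed by an apostrophe and more letters,
    # so that a contraction such as "didn't" stays a single token.
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_counts(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


# Hypothetical sample text, used only to exercise the functions.
sample = "The cat sat on the mat. The mat didn't mind."
counts = word_counts(sample)
print(counts.most_common(2))  # [('the', 3), ('mat', 2)]
```

A few lines like these are often enough to start exploring a text collection before committing to a more careful design.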
Competitive Project
A major component will be the project: build a program that, given
a web page P, outputs a set of questions about the content of P
(questions that a human could answer after reading P), and that,
given a question Q about the content of P, answers Q intelligently.
Projects will be pitted against each other in a competition at the
end of the course.
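To make the two tasks concrete, here is a deliberately naive sketch of what such a program might look like. It is not the required interface or a recommended approach: the function names `ask` and `answer`, the assumption that the page has already been reduced to plain text, the "X is Y" question template, and the word-overlap answering heuristic are all hypothetical choices made for illustration.

```python
import re


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    """Return the set of lowercase alphanumeric tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def ask(text):
    """Generate a 'What is X?' question for each sentence of the form 'X is Y.'"""
    questions = []
    for sentence in split_sentences(text):
        match = re.match(r"^(.+?) is (.+?)[.!?]?$", sentence)
        if match:
            questions.append("What is " + match.group(1) + "?")
    return questions


def answer(text, question):
    """Return the sentence sharing the most words with the question."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    question_words = words(question)
    return max(sentences, key=lambda s: len(question_words & words(s)))


# Hypothetical page content, assumed to be already extracted as plain text.
page_text = "Pittsburgh is a city in Pennsylvania. It has three rivers."
print(ask(page_text))
print(answer(page_text, "How many rivers does it have?"))
```

A baseline like this ignores syntax and meaning entirely, which is exactly where the representations covered in the course come in.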
Evaluation
Students will be evaluated by exam (midterm and final, totaling 40%),
regular short quizzes and weekly pencil-and-paper or small
programming homework problems (30% together), and the group
project (30%).
FAQ
Should I take this course?
Yes, if:
- you're a CS student interested in languages, language technology, or information processing
- you're a CS student who needs an "applications" credit
- you're a language technology minor (this course is an elective option)
- you're a linguistics student who can write computer programs (this course is an elective option)
- you always suspected natural language was kind of like Lisp (or Java or ...)
- you want computers to take over the world
- you don't want computers to take over the world, but if they do, you want to negotiate your release
- you like AI, machine learning, and/or theoretical computer science, and want to apply them to a hard real-world problem
Related courses elsewhere (not exhaustive!)
University of California, Berkeley, Brown University, University of Colorado,
Columbia University, Cornell University, Harvard University,
University of Illinois at Urbana-Champaign, Johns Hopkins University,
University of Maryland, New York University, University of Pennsylvania,
Stanford University, University of Utah, University of Wisconsin-Madison