Noah's ARK

Animals entering the ark.

Edward Hicks' rendition of Noah's ark, and an excuse to violate One Sense Per Discourse.

Noah's ARK[1] is Noah Smith's informal research group at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University. (The research is formal; the group is informal.) As you may have guessed, our research focuses on problems of ambiguity and uncertainty in natural language processing, including morphology, syntax, semantics, translation, and behavioral/social phenomena observed through language.

[1] The acronym is ambiguous; possible interpretations might include Ambiguity Research Kith or Ambiguity Resolution K. or A. R. Kibbutz. With apologies to the Bible and DAGS.

Lab meetings: Mondays at 11am, NSH 4513

ARK Research in News and Blogs

May 2008: Little Green Footballs, a political blog, happened on some data Tae had on her website, prompting fascinating (to us) speculation about what we were up to. See Noah and William's response here.

January 2008: The New Scientist and Tech Digest blogs commented on Danny and Noah's relative keyboard.

November 2007: Yahoo granted us access to M45, a 4,000-processor supercomputer.

Projects and Resources

Researchers

| Name | Position | Topics | Languages (Spoken and/or Written) | Languages (Researched) | Languages (Hacked In) | Favorite Term of Venery |
| --- | --- | --- | --- | --- | --- | --- |
| Shay Cohen | Ph.D. student, LTI | morphosyntactic parsing, unsupervised parsing, parsing algorithms | Hebrew, English, Swedish (lite) | Arabic, Chinese, English, German, Hebrew | C++, C, Perl, Scheme, Java, C# | shoal |
| Dipanjan Das | Ph.D. student, LTI | sentence-sentence semantic relationships, parsing (RAVINE) | English, Bengali, Hindi, Sanskrit (swalpam) | English, Bengali | Java, C, C++, Perl | bloat |
| Kevin Gimpel | Ph.D. student, LTI | statistical NLP and translation | English, German, Italian | English, Portuguese, German | Java, C, C++, Matlab | crash |
| Mohammad Haque | undergraduate researcher, CS | | | | | |
| Michael Heilman | Ph.D. student, LTI | question-answer modeling (RAVINE) | English, Japanese (sukoshi) | English, English as a second language | Java, Perl, PHP, Ruby, C++ | mob |
| Dimitry Levin | undergraduate researcher, Computational Finance | | | | | |
| André Martins | Ph.D. student, LTI and Universidade Técnica de Lisboa | structured and kernel machine learning, parsing | Portuguese, English, Spanish, German (ein bisschen), French (un petit peu) | Portuguese, English, Spanish | C, C++, Matlab | scurry |
| Nate Schneider | Ph.D. student, LTI | parsing, grammar learning, semantics; cognitive linguistics | English, Hebrew (כמה), Arabic (قليل), French (un peu) | English, Hebrew | Python, Java, PHP, Scheme, C#, C++ | smack |
| Tae Yano | Ph.D. student, LTI | NLP in the political domain, rich models of structured NL data (e.g., blogs) | Japanese, English, Spanish, French | English | C, C++, Java, Perl, Python | husk |
| Noah Smith | Assistant Professor, LTI & MLD | (all of the above) | English, French (un peu) | Arabic, Bulgarian, Czech, English, French, German, Hebrew, Korean, Mandarin, Portuguese, Turkish | C++, Perl, Dyna | parade |

Alumni

Friends & collaborators

We are very open to collaboration, inside and outside of CMU. Technical correspondence is welcome. Some people we've been talking and working with lately: Pedro Aguiar (Universidade Técnica de Lisboa), Alan Black (LTI), William Cohen (MLD, LTI), Ric Crabbe (U.S. Naval Academy), Jason Eisner (Johns Hopkins University), Mário Figueiredo (Universidade Técnica de Lisboa), Rebecca Hwa (University of Pittsburgh), Shimon Kogan (Tepper School of Business), John Lafferty (CSD, MLD), Alon Lavie (LTI), Teruko Mitamura (LTI), Thuy Linh Nguyen (ISL, LTI), Kemal Oflazer (Sabancı University), Daniel Rashid (LTI), Bryan Routledge (Tepper School of Business), Jacob Sagi (Vanderbilt University), Stu Shulman (University of Pittsburgh), Rob Simmons (CSD), David Smith (Johns Hopkins University), Doug Vail (CSD), Ashish Venugopal (ISL, LTI), Stephan Vogel (ISL, LTI), Mengqiu Wang (Stanford University), Shuly Wintner (University of Haifa), Joy Zhang (ISL, LTI), and Andreas Zollmann (ISL, LTI).

Acknowledgments

Our research is supported in part by the DARPA Computer Science Study Panel program (grant numbers HR-00110110013 and NBCH-1080004), by the National Science Foundation (grant numbers IIS-0713265 and IIS-0836431 and a graduate fellowship to Michael Heilman), by an IBM Faculty Award, by a grant from Google, and by computational resources provided by Yahoo.
