TBSD: Using TurboParser to Get Stanford Dependencies

This page describes how to use TurboParser to obtain the popular Stanford Dependencies. TurboParser is a dependency parser written by André Martins.

Background

Stanford typed dependencies are a widely desired representation of natural language sentences, but parsing is one of the major computational bottlenecks in text analysis systems. In light of the evolving definition of the Stanford dependencies and developments in statistical dependency parsing algorithms, this paper revisits the question of Cer et al. (2010): what is the tradeoff between accuracy and speed in obtaining Stanford dependencies in particular? We also explore the effects of input representations on this tradeoff: part-of-speech tags, the novel use of an alternative dependency representation as input, and distributional representations of words. We find that direct dependency parsing is a more viable solution than it was found to be in the past.

Further Reading

The main technical ideas behind this software appear in the paper:

[1]  Lingpeng Kong and Noah A. Smith.
An Empirical Comparison of Parsing Methods for Stanford Dependencies.
April, 2014

Download

Pre-trained Models

For Stanford Dependencies v3.3.0, we trained three models that generate Stanford basic dependencies: full, standard, and basic. All three require TurboParser v2.1.0.

Software Package

We provide the code for our additional inference rules, together with scripts that conveniently produce Stanford Dependencies from raw input files, here.
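
Once dependencies have been produced, a small reader can turn them into relation(head, dependent) triples for inspection. The following is a minimal sketch, not part of the package: it assumes the predictions are in the tab-separated, 10-column CoNLL-X format that TurboParser uses for its dependency files, with sentences separated by blank lines. The function name read_conll_triples and the sample sentence are hypothetical and serve only as illustration.

```python
from typing import Iterable, List, Tuple

# (relation, head word, dependent word)
Triple = Tuple[str, str, str]


def read_conll_triples(lines: Iterable[str]) -> List[List[Triple]]:
    """Group CoNLL-X lines into sentences and return typed-dependency triples.

    Assumed column layout (CoNLL-X): ID, FORM, LEMMA, CPOSTAG, POSTAG,
    FEATS, HEAD, DEPREL, PHEAD, PDEPREL. Only FORM, HEAD and DEPREL are used.
    """
    sentences: List[List[Triple]] = []
    rows: List[List[str]] = []
    # A trailing blank line is appended so the last sentence is always flushed.
    for line in list(lines) + [""]:
        line = line.rstrip("\n")
        if line.strip():
            rows.append(line.split("\t"))
            continue
        if rows:
            # Index 0 stands for the artificial root node.
            words = ["ROOT"] + [r[1] for r in rows]
            sentences.append([(r[7], words[int(r[6])], r[1]) for r in rows])
            rows = []
    return sentences


# Hypothetical two-token sentence in CoNLL-X format.
sample = [
    "1\tShe\t_\tPRP\tPRP\t_\t2\tnsubj\t_\t_",
    "2\tsleeps\t_\tVBZ\tVBZ\t_\t0\troot\t_\t_",
    "",
]

for rel, head, dep in read_conll_triples(sample)[0]:
    print(f"{rel}({head}, {dep})")
```

To read a real prediction file, pass an open file handle instead of the sample list; the actual output location and format depend on the scripts in the package, so check them before relying on this reader.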

Pre-trained Tagging Models

The script we provide uses TurboTagger for POS tagging. Here we offer a tagging model trained on sections 02-21 of the Penn Treebank.

For questions, bug fixes and comments, please e-mail lingpenk [strudel] cs.cmu.edu.