This page discusses a few errata in the following paper:

M. Heilman and N. A. Smith. 2010. Good Question! Statistical Ranking for Question Generation. In Proc. of NAACL/HLT.

In section 6, on page 6, in the paragraph labeled "N-Gram Language Model Features", we incorrectly stated that the feature set included unnormalized language model probabilities, which would imply 12 features rather than 6. In fact, we included only the 6 length-normalized language model features. The first sentence of that paragraph should therefore have read, "The set includes real valued features for the length-normalized log likelihoods of the question, the source sentence, and the answer phrase."
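To illustrate the distinction between the two kinds of feature, here is a minimal Python sketch. It assumes that "length-normalized" means dividing the log likelihood by the number of tokens, which is a common convention but is not spelled out on this page. The function names and the per-token probabilities are hypothetical and are not taken from the paper's code.

```python
import math


def log_likelihood(token_probs):
    """Unnormalized log likelihood: sum of per-token log probabilities."""
    return sum(math.log(p) for p in token_probs)


def length_normalized_log_likelihood(token_probs):
    """Log likelihood divided by token count (assumed normalization)."""
    if not token_probs:
        raise ValueError("cannot normalize an empty token sequence")
    return log_likelihood(token_probs) / len(token_probs)


# Hypothetical per-token probabilities from some language model.
short_text = [0.1, 0.1]
long_text = [0.1] * 6

# The unnormalized score penalizes the longer text simply for being longer;
# the normalized score is the same for both, since each token is equally likely.
unnormalized = (log_likelihood(short_text), log_likelihood(long_text))
normalized = (
    length_normalized_log_likelihood(short_text),
    length_normalized_log_likelihood(long_text),
)
```

Under this assumption, the normalized feature is comparable across questions, source sentences, and answer phrases of different lengths, whereas the unnormalized one mostly tracks length.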

In section 6, on page 6, the paragraph labeled "Grammatical Features" states that the feature set included counts for various grammatical categories appearing in the parse trees for the question and the answer phrase. Due to a bug, both sets of counts were computed from the question parse tree, so the answer phrase features were duplicates of the question features. This issue was corrected for Michael Heilman's dissertation, "Automatic Factual Question Generation from Text". The findings in the dissertation broadly agree with those in the NAACL 2010 paper.
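The effect of this kind of bug can be sketched in a few lines of Python. This is only an illustration, not the system's actual code: the tuple-based tree representation, the function names, the category list, and the example trees are all hypothetical.

```python
from collections import Counter


def count_labels(tree):
    """Count node labels in a parse tree given as nested tuples.

    A node is (label, child, ...); leaves are plain strings (words).
    """
    counts = Counter()
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):
            counts[node[0]] += 1
            stack.extend(node[1:])
    return counts


def grammatical_features(question_tree, answer_tree, categories):
    """Build per-category count features for the question and the answer."""
    question_counts = count_labels(question_tree)
    answer_counts = count_labels(answer_tree)
    features = {}
    for category in categories:
        features[f"question_{category}"] = question_counts[category]
        features[f"answer_{category}"] = answer_counts[category]
    return features


# Hypothetical example trees.
question_tree = ("SBARQ",
                 ("WHNP", ("WP", "Who")),
                 ("SQ", ("VBD", "wrote"), ("NP", ("NNP", "Hamlet"))))
answer_tree = ("NP", ("NNP", "William"), ("NNP", "Shakespeare"))
categories = ["NP", "NNP", "VBD"]

# Intended behavior: answer counts come from the answer phrase tree.
features = grammatical_features(question_tree, answer_tree, categories)

# The kind of bug described above: the question tree is used for both sets,
# so every answer_* feature duplicates the matching question_* feature.
buggy = grammatical_features(question_tree, question_tree, categories)
```

In the buggy call the answer features carry no information beyond the question features, which is why the error amounts to duplicated features rather than incorrect values.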
