17. 11. 2020

Foundations of Statistical Natural Language Processing

by Christopher D. Manning


Sometimes it felt a bit outdated, but the explanations of various algorithms and principles were very good and understandable.


(Church and Mercer 1993: 1 (Friday, December 05, 2014, 03:36 PM, page 65)

Virginia Electronic Text Center (see the website) (Friday, December 05, 2014, 03:37 PM, page 66)

in general about 90% of periods are sentence boundary indicators (Riley 198 (Monday, December 08, 2014, 04:19 PM, page 163)

learning algorithms can be found in (Dietterich 1998). A good case study, for the example of word sense disambiguation, is (Mooney 1996) (Saturday, December 13, 2014, 04:16 PM, page 236)

Bell et al. (1990) and Witten and Bell (1991) introduce a number of smoothing algorithms for the goal of improving text compression (Saturday, December 13, 2014, 04:41 PM, page 249)

Chen and Goodman (1996, 1998) present extensive evaluations of different smoothing algorithms. The conclusions of (Chen and Goodman 1998) are that a variant of Kneser-Ney back-off smoothing that they develop normally gives the best performance. (Saturday, December 13, 2014, 04:46 PM, page 251)

only consider coarse-grained distinctions, for example only those that manifest themselves across languages (Resnik and Yarowsky 1998) (Sunday, December 14, 2014, 06:39 AM, page 284)

giving more context contributes little to human disambiguation performance (Sunday, December 14, 2014, 06:42 AM, page 285)

expressed aptly by Mercer (1993): “one cannot learn a new language by reading a bilingual dictionary (Sunday, December 14, 2014, 07:08 AM, page 334)

It has been estimated that the average educated person reads on the order of one million words in a year, but hears ten times as many words spoken (Sunday, December 14, 2014, 07:15 AM, page 337)

Markov processes/chains/models were first developed by Andrei A. Markov (a student of Chebyshev). Their first use was actually for a linguistic purpose - modeling the letter sequences in works of Russian literature (Markov 1913) - but Markov models were then developed as a general statistical tool (Sunday, December 14, 2014, 07:20 AM, page 341)