Advances in Statistical Methods for Natural Language
Processing 2011
Instructor
Assistant professor, Mamoru Komachi <komachi--at--is.naist.jp>
Course description
This course gives you a brief introduction to statistical natural language
processing in comparison to previous heuristic methods.
You will see several recent statistical approaches to natural language processing, including:
- Statistical machine translation
- Statistical Japanese text input method
- Information retrieval and information extraction.
Dates
- 9:20-10:50, Dec 16, 2011: Introduction to statistical natural language
processing (slides)
- 9:20-10:50, Dec 20, 2011: Statistical machine translation
(slides)
- 9:20-10:50, Jan 6, 2012: Statistical Japanese text input method
(slides)
- (kiyoshi-k) Hal Daume III and Daniel Marcu. A Noisy-Channel Model for
Document Compression. ACL 2002.
(slides)
- 9:20-10:50, Jan 10, 2012: Information retrieval and information extraction
- (komachi) Kei Uchiumi, Mamoru Komachi, Keigo Machinaga,
Toshiyuki Maezawa, Toshinori Satou and Yoshinori Kobayashi.
Japanese Abbreviation Expansion with Query and Clickthrough Logs.
IJCNLP 2011. (slides)
- (haruka-m) Pallavi Choudhury, Chris Quirk, and Hisami Suzuki. From
pecher to pêcher ... or pécher: Simplifying French Input by Accent
Prediction. WTIM 2011. (slides)
- (kou-j) Zheng Chen and Kai-Fu Lee. A New Statistical Approach to Chinese
Pinyin Input. ACL 2000. (slides)
All slides are accessible from naist.jp.
Venue
L2 (Lecture room 2), Graduate School of Information Science
Grading
100% on your final presentation (on either Jan 6 or Jan 10). You must choose one research paper
relevant to this course and explain it using slides (you can
take up to 15 minutes). Your paper must be at least somewhat related to the noisy channel
model.
- Hal Daume III and Daniel Marcu. A Noisy-Channel Model for Document Compression. ACL 2002.
- Kevin Knight and Jonathan Graehl. Machine Transliteration. ACL 1997.
- Zheng Chen and Kai-Fu Lee. A New Statistical Approach
to Chinese Pinyin Input. ACL 2000.
- Yabin Zheng, Chen Li, and Maosong Sun. CHIME: An Efficient
Error-Tolerant Chinese Pinyin Input Method. IJCAI 2011.
- Pallavi Choudhury, Chris Quirk, and Hisami Suzuki. From pecher to
pêcher ... or pécher: Simplifying French Input by Accent
Prediction. WTIM 2011.
- Eric Brill and Robert C. Moore. An Improved Error Model for
Noisy Channel Spelling Correction. ACL 2000.
- Mark D. Kernighan, Kenneth W. Church, and William A. Gale. A Spelling
Correction Program Based on a Noisy Channel Model. COLING 1990.
- Kristina Toutanova and Robert Moore. Pronunciation Modeling for Improved
Spelling Correction. ACL 2002.
- Mark Johnson and Eugene Charniak. A TAG-based noisy channel
model of speech repairs. ACL 2004.
- Randy West, Y. Albert Park, and Roger Levy. Bilingual Random
Walk Models for Automated Grammar Correction of ESL Author-Produced Text.
EduNLP 2011.
- John Lee and Stephanie Seneff. Automatic
Grammar Correction for Second-Language Learners. InterSpeech 2006.
All students who want credit for this course must submit their
paper preferences (up to three) to komachi@is by
19:59, Dec 22, 2011.
Office hours
- 15:00-17:00, Dec 22, 2011
- 15:00-17:00, Jan 5, 2012
No appointment needed. Please come by A705. Other hours available
upon request.
References
- Philipp Koehn. 2009. Statistical Machine Translation. CUP.
- Daniel Jurafsky and James H. Martin. 2008. Speech and Language Processing.
Pearson Prentice Hall.
Mamoru Komachi <komachi--at--tmu.ac.jp>
Tokyo Metropolitan University