Natural Language Processing Using Very Large Corpora

November 1999



ABOUT THIS BOOK

This book is intended for researchers who want to keep abreast of current developments in corpus-based natural language processing. It is not meant as an introduction to this field; for readers who need one, several entry-level texts are available, including those of (Church and Mercer, 1993; Charniak, 1993; Jelinek, 1997).

This book captures the essence of a series of highly successful workshops held in the last few years. The response in 1993 to the initial Workshop on Very Large Corpora (Columbus, Ohio) was so enthusiastic that we were encouraged to make it an annual event. The following year, we staged the Second Workshop on Very Large Corpora in Kyoto. As a way of managing these annual workshops, we then decided to register a special interest group called SIGDAT with the Association for Computational Linguistics. The demand for international forums on corpus-based NLP has been expanding so rapidly that in 1995 SIGDAT was led to organize not only the Third Workshop on Very Large Corpora (Cambridge, Mass.) but also a complementary workshop entitled From Texts to Tags (Dublin).

Obviously, the success of these workshops was in some measure a reflection of the growing popularity of corpus-based methods in the NLP community. But first and foremost, it was due to the fact that the workshops attracted so many high-quality papers.


Introduction.
Implementation and Evaluation of a German HMM for POS Disambiguation; H. Feldweg.
Improvements in Part-of-Speech Tagging with an Application To German; H. Schmid.
Unsupervised Learning of Disambiguation Rules for Part-of-Speech Tagging; E. Brill, M. Pop.
Tagging French without Lexical Probabilities - Combining Linguistic Knowledge and Statistical Learning; E. Tzoukermann, et al.
Example-Based Sense Tagging of Running Chinese Text; X. Tong, et al.
Disambiguating Noun Groupings with Respect to WordNet Senses; P. Resnik.
A Comparison of Corpus-based Techniques for Restoring Accents in Spanish and French Text; D. Yarowsky.
Beyond Word N-Grams; F. Pereira, et al.
Statistical Augmentation of a Chinese Machine-Readable Dictionary; P. Fung, D. Wu.
Text Chunking Using Transformation-based Learning; L. Ramshaw, M.P. Marcus.
Prepositional Phrase Attachment through a Backed-off Model; M. Collins, J. Brooks.
On the Unsupervised Induction of Phrase-Structure Grammars; C. de Marcken.
Robust Bilingual Word Alignment for Machine Aided Translation; I. Dagan, et al.
Iterative Alignment of Syntactic Structures for a Bilingual Corpus; R. Grishman.
Trainable Coarse Bilingual Grammars for Parallel Text Bracketing; D. Wu.
Comparative Discourse Analysis of Parallel Texts; P. van der Eijk.
Comparing the Retrieval Performance of English and Japanese Text Databases; H. Fujii, W.B. Croft.
Inverse Document Frequency (IDF): A Measure of Deviations from Poisson; K. Church, W. Gale.
List of Authors.
Subject Index.
EAN: 9780792360551
ISBN: 0792360559
Subtitle: 'Text, Speech and Language Technology'. 1999 edition. Book. Language: English.
Publisher: Springer
Publication date: November 1999
Number of pages: 328 pages
Format: hardcover
