Multiword Expressions Acquisition: A Generic and Open Framework

Springer, Sep 24, 2014 - Computers - 230 pages

This book is an excellent introduction to multiword expressions. It provides a unique, comprehensive and up-to-date overview of this exciting topic in computational linguistics. The first part describes the diversity and richness of multiword expressions, including many examples in several languages. These constructions are not only complex and arbitrary, but also much more frequent than one would guess, making them a real nightmare for natural language processing applications.

The second part introduces a new generic framework for automatic acquisition of multiword expressions from texts. Furthermore, it describes the accompanying free software tool, the mwetoolkit, which comes in handy when looking for expressions in texts (regardless of the language). Evaluation is greatly emphasized, underlining the fact that results depend on parameters like corpus size, language, MWE type, etc. The last part contains solid experimental results and evaluates the mwetoolkit, demonstrating its usefulness for computer-assisted lexicography and machine translation.

This is the first book to cover the whole pipeline of multiword expression acquisition in a single volume. It addresses the needs of students and researchers in computational and theoretical linguistics, cognitive sciences, artificial intelligence and computer science. Its good balance between computational and linguistic views makes it the perfect starting point for anyone interested in multiword expressions, language and text processing in general.

Contents

Chapter 1 Introduction
A Tough Nut to Crack
Part II MWE Acquisition
Part III Applications
Appendix A Extended List of Translation Examples
Appendix B Resources Used in the Experiments
Documentation
Appendix D Tagsets for POS and Syntax
Appendix E Detailed Lexicon Descriptions

About the author (2014)

Carlos Ramisch is a researcher and lecturer at Aix-Marseille University (France). He holds a double PhD in computer science from Grenoble University (France) and UFRGS (Brazil). His research interests are multiword expressions, semantics and multilingualism. He has coordinated many events, including the MWE workshops (2010, 2011, 2013) and the ACM TSLP special issue. He is the creator and developer of the mwetoolkit.