Modular Lexicalization of Probabilistic Context-Free Grammars
This project aims to develop and implement improved statistical disambiguation methods for syntactic analyses. It also develops a clustering model for verb-argument tuples which generalises selectional restrictions over WordNet concepts.
In the next phase, the project will implement a new parameter estimation technique for the BitPar parser, which was developed in the first phase. The new method is based on ensembles of decision trees and is intended to improve the accuracy of parsing with fine-grained syntactic categories that encode information such as number, gender, and case. The project will also examine whether reranking strategies can further increase the accuracy of the parser. The reranker will use features derived from the clustering model as well as other features. The clustering model will be extended by (i) dealing with adjuncts in addition to arguments, (ii) automatically inducing noun hierarchies instead of using WordNet, and (iii) implementing a hybrid probability model. The clustering model will be applied to tasks such as word sense disambiguation.
Principal Investigator: Helmut Schmid
Staff: Richard Farkas, Thomas Müller
Former Staff: Alexander Balabanov, Christian Hying, Wiebke Wagner, Sabine Schulte im Walde
HiWi: Renjing Wang
Former HiWi: Christian Scheible, Max Kisselew
Events
2nd Interdisciplinary Workshop on Verbs: The Identification and Representation of Verb Features
November 4-5, 2010, Pisa, Italy
Organisers: Pier Marco Bertinetto (Scuola Normale Superiore di Pisa), Anna Korhonen (University of Cambridge), Alessandro Lenci (University of Pisa), Alissa Melinger (University of Dundee), Sabine Schulte im Walde (D4), Aline Villavicencio (Federal University of Rio Grande do Sul, and University of Bath)
Human Judgements in Computational Linguistics
August 23, 2008, Manchester, UK
Organisers: Ron Artstein (University of Southern California), Gemma Boleda (Universitat Politècnica de Catalunya), Frank Keller (University of Edinburgh), Sabine Schulte im Walde (D4)
Poster and Demo Session at the Annual Meeting of the DGfS
February 28-March 2, 2007, Siegen, Germany
Organisers: Stefan Evert (Universität Osnabrück), Sabine Schulte im Walde (D4)
Software
PAC is predicate-argument clustering software. It is trained on predicate-frame-argument tuples and outputs a multi-dimensional cluster analysis, including clusters of the predicates and selectional preference abstractions over the predicate arguments.