The goal of LivingKnowledge is to bring a new quality to search and knowledge management technology, yielding more concise, complete and contextualised search results.
The paper “Shallow Discourse Parsing with Conditional Random Fields”, co-written by S. Ghosh, R. Johansson, G. Riccardi and S. Tonelli, was presented at the 5th International Joint Conference on Natural Language Processing (IJCNLP 2011) in Chiang Mai, Thailand, on November 8-13, 2011, and appears in its proceedings.
Parsing discourse is a challenging natural language processing task. In this paper we take a data driven approach to identify arguments of explicit discourse connectives. In contrast to previous work we do not make any assumptions on the span of arguments and consider parsing as a token-level sequence labeling task. We design the argument segmentation task as a cascade of decisions based on conditional random fields (CRFs). We train the CRFs on lexical, syntactic and semantic features extracted from the Penn Discourse Treebank and evaluate feature combinations on the commonly used test split. We show that the best combination of features includes syntactic and semantic features. The comparative error analysis investigates the performance variability over connective types and argument positions.
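To make the idea of token-level sequence labeling concrete, the following minimal sketch shows how argument spans around an explicit connective could be encoded as one label per token, together with a few lexical features per token. It is an illustration only and not the authors' implementation: the BIO label scheme, the helper names `spans_to_bio` and `token_features`, the feature set and the example sentence are all assumptions, and no CRF is trained here.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)


def spans_to_bio(n_tokens: int, spans: Dict[str, Span]) -> List[str]:
    """Encode named spans as one BIO label per token (assumed scheme)."""
    labels = ["O"] * n_tokens
    for name, (start, end) in spans.items():
        for i in range(start, end):
            labels[i] = ("B-" if i == start else "I-") + name
    return labels


def token_features(tokens: List[str], i: int, connective_index: int) -> Dict[str, object]:
    """Hypothetical lexical features for token i; a sequence labeler
    such as a CRF would consume one such dictionary per token."""
    return {
        "word": tokens[i].lower(),
        "is_connective": i == connective_index,
        "offset_from_connective": i - connective_index,
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }


# Made-up example sentence; "because" (index 3) is the explicit connective.
tokens = "He stayed home because it was raining".split()
connective_index = 3

# Assumed gold spans: Arg1 = "He stayed home", Arg2 = "it was raining".
spans = {"ARG1": (0, 3), "ARG2": (4, 7)}

labels = spans_to_bio(len(tokens), spans)
features = [token_features(tokens, i, connective_index) for i in range(len(tokens))]

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Because every token receives its own label, the encoding places no constraint on where an argument starts or ends, which is the property the abstract highlights. Syntactic and semantic features would be added to the per-token dictionaries in the same way.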