A symbolic approach for automatic detection of nuclearity and rhetorical relations among intra-sentence discourse segments in Spanish
By:
Da Cunha I., Sanjuan E., Torres-Moreno J.-M., Cabré M.T., Sierra G.
Published:
1 Jan 2012
Abstract:
Automatic discourse analysis is currently a prominent research topic because it supports several applications, such as automatic summarization, automatic translation, and information extraction. Rhetorical Structure Theory (RST) is the most widely used theory in this area. Nevertheless, few studies address this subject in Spanish. In this paper we present the first system that assigns nuclearity and rhetorical relations to intra-sentence discourse segments in Spanish texts. To carry out the research, we analyze the learning corpus of the RST Spanish Treebank, a corpus of manually annotated specialized texts, in order to build a list of lexical and syntactic patterns that mark rhetorical relations. The system is implemented using this list of patterns together with a discourse segmenter called DiSeg. To evaluate the system, we apply it to the test corpus of the RST Spanish Treebank. The automatic and manual rhetorical analyses of each sentence are compared by means of recall and precision, with positive results. © 2012 Springer-Verlag.
Affiliations:
Da Cunha I.:
Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra, C/ Roc Boronat, 138, Barcelona 08018, Spain
Sanjuan E.:
Laboratoire Informatique d'Avignon, Université d'Avignon et des Pays de Vaucluse, 339 chemin des Meinajaries, Avignon Cedex 9 84911, France
Torres-Moreno J.-M.:
Laboratoire Informatique d'Avignon, Université d'Avignon et des Pays de Vaucluse, 339 chemin des Meinajaries, Avignon Cedex 9 84911, France
Département de Génie Informatique, École Polytechnique de Montréal, Succ. Centre Ville, Montréal, QC H3C 3A7, Canada
Cabré M.T.:
Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra, C/ Roc Boronat, 138, Barcelona 08018, Spain
Sierra G.:
Instituto de Ingeniería, Universidad Nacional Autónoma de México, Ciudad Universitaria, Mexico D.F. 04510, Mexico