Characterization of temporal complementarity: fundamentals for multi-document summarization
DOI:
https://doi.org/10.1590/1981-5794-1804-6
Keywords:
Linguistic description, Complementarity, CST, Multi-document Summarization, Natural Language Processing
Abstract
Complementarity is a common multi-document phenomenon that frequently occurs among news texts about the same event. From a set of sentence pairs (in Portuguese) manually annotated with the CST (Cross-Document Structure Theory) relations Historical background and Follow-up, which make explicit the temporal complementarity between the sentences, we identified a potential set of linguistic attributes of such complementarity. Using Machine Learning algorithms, we evaluated the capacity of these attributes to discriminate between Historical background and Follow-up. JRip learned a small set of rules with high accuracy: based on a set of 5 rules, the classifier discriminates the CST relations with 80% accuracy. According to the rules, the occurrence of a temporal expression in sentence 2 is the most discriminative feature in the task. As a contribution, the JRip classifier can improve the performance of CST discourse parsers for Portuguese.
Published
25/04/2018
How to Cite
SOUZA, J. W. da C.; FELIPPO, A. D. Characterization of temporal complementarity: fundamentals for multi-document summarization. ALFA: Revista de Linguística, São Paulo, v. 62, n. 1, 2018. DOI: 10.1590/1981-5794-1804-6. Available at: https://periodicos.fclar.unesp.br/alfa/article/view/9204. Accessed on: 26 Nov. 2024.
Issue
Section
Papers
License
Manuscripts accepted for publication and published are the property of Alfa: Revista de Linguística. Full or partial submission of the manuscript to any other journal is forbidden. Authors are solely responsible for the article's content. Translation into another language without written permission from the Editor, as advised by the Editorial Board, is prohibited.