Automatic authorship detection using textual patterns extracted from integrated syntactic graphs


By: Gómez-Adorno H., Sidorov G., Pinto D., Vilariño D., Gelbukh A.

Published: 1 Jan 2016
Abstract:
We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation integrates different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest-path walks over integrated syntactic graphs and apply them to determine the authors of documents. On average, our method outperforms state-of-the-art approaches and, unlike existing methods, gives consistently high results across different corpora. Our results show that our textual patterns are useful for the task of authorship attribution. © 2016 by the authors; licensee MDPI, Basel, Switzerland.

Affiliations:
Gómez-Adorno H.:
 Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz S/N, Mexico City, 07738, Mexico

Sidorov G.:
 Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz S/N, Mexico City, 07738, Mexico

Pinto D.:
 Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación, Av. San Claudio y 14 Sur, Puebla, 72570, Mexico

Vilariño D.:
 Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación, Av. San Claudio y 14 Sur, Puebla, 72570, Mexico

Gelbukh A.:
 Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz S/N, Mexico City, 07738, Mexico
ISSN: 1424-8220
Publisher
Multidisciplinary Digital Publishing Institute (MDPI), Kandererstrasse 25, CH-4057 Basel, Switzerland
Document type: Article
Volume: 16 Issue: 9
Pages:
WOS Id: 000385527700035
PubMed ID: 27589740
All Open Access; Gold Open Access; Green Open Access