LIPN-IIMAS at SemEval-2016 Task 1: Random forest regression experiments on align-and-differentiate and word embeddings penalizing strategies
By:
Lithgow, O., Meza, I.V., Orozco, A., Flores, J.G., Buscaldi, D.
Published:
1 Jan 2016
Abstract:
This paper describes the SOPA-N system used by the LIPN-IIMAS team in SemEval-2016 Semantic Textual Similarity (Task 1). We based our work on the SOPA 2015 system, which used 16 similarity features (including WordNet, Information Retrieval and Syntactic Dependencies) within a Random Forest learning model. We expanded this system with an Align-and-Differentiate-based strategy, word embeddings and penalization, which yielded a 6.8% improvement on the development set. However, on the evaluation data for the 2016 STS shared task, the 2015 system outperformed our newer systems. © 2016 Association for Computational Linguistics.