TY - JOUR
T1 - Transformer based Model for Coherence Evaluation of Scientific Abstracts
T2 - Second Fine-tuned BERT
AU - Gutierrez-Choque, Anyelo Carlos
AU - Medina-Mamani, Vivian
AU - Castro-Gutierrez, Eveling
AU - Núñez-Pacheco, Rosa
AU - Aguaded, Ignacio
N1 - Publisher Copyright:
© 2022. International Journal of Advanced Computer Science and Applications. All Rights Reserved.
PY - 2022
Y1 - 2022
AB - Coherence evaluation is a natural language processing problem whose complexity lies mainly in analyzing the semantics and context of the words in a text. The Bidirectional Encoder Representations from Transformers (BERT) architecture can capture both and represent them as embeddings for fine-tuning. The present study proposes a Second Fine-Tuned model based on BERT to detect inconsistent sentences (coherence evaluation) in scientific abstracts written in English and Spanish. For this purpose, two formal methods for generating inconsistent abstracts are proposed: Random Manipulation (RM) and K-means Random Manipulation (KRM). Six experiments were performed, showing that a second fine-tuning improves the detection of inconsistent sentences, reaching an accuracy of 71%. This holds even when the new retraining data are in a different language or from a different domain. The experiments also showed that mixing several methods for generating inconsistent abstracts during the second fine-tuning does not give better results than using a single technique.
KW - BERT
KW - Coherence evaluation
KW - Inconsistent sentences detection
KW - Second fine-tuned
UR - http://www.scopus.com/inward/record.url?scp=85131413431&partnerID=8YFLogxK
U2 - 10.14569/IJACSA.2022.01305105
DO - 10.14569/IJACSA.2022.01305105
M3 - Article
AN - SCOPUS:85131413431
SN - 2158-107X
VL - 13
SP - 929
EP - 937
JO - International Journal of Advanced Computer Science and Applications
JF - International Journal of Advanced Computer Science and Applications
IS - 5
ER -