SPECTER-Based Transformer Model For Multi-Label Research Paper Classification

Abstract

The ever-increasing number of research and academic publications creates problems for researchers and publication libraries alike. Cluttered data and inadequate keywords make it difficult to find the proper papers for a literature review, and they hinder academic libraries from creating adequate indexes, which leads to more and more unstructured data. This paper proposes a novel solution for classifying published research and scholarly articles, which would help solve these problems by facilitating better search engine optimization for online academic libraries. The study uses the openly available arXiv dataset of 2.3 million records, each containing the metadata of a published research paper, including its title and abstract. The model takes a paper's metadata as input and outputs multiple labels related to the paper's domain. The research compares different techniques for multi-label classification and proposes a SPECTER-based method, which uses a transformer model to classify research articles. This method achieved a significant increase in precision and accuracy on the dataset and will help improve the maintenance of scholarly articles and make relevant works more readily available.
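
The input/output flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the label names, weights, and the two-dimensional "embedding" are toy values invented for illustration, the encoder itself is not called, and joining title and abstract with a `[SEP]` token is an assumption about how the text is prepared. What the sketch does show is the multi-label decision step, where each label is scored independently with a sigmoid and every label above a threshold is kept.

```python
import math

# Hypothetical label set for illustration; the arXiv taxonomy is far larger.
LABELS = ["cs.LG", "cs.CL", "stat.ML"]


def build_input(title, abstract, sep_token="[SEP]"):
    """Join title and abstract into one string (assumed input format)."""
    return f"{title}{sep_token}{abstract or ''}"


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(embedding, weights, biases, threshold=0.5):
    """Score each label independently; keep those at or above the threshold.

    Unlike softmax classification, several labels (or none) can be selected.
    """
    selected = []
    for label, w_row, b in zip(LABELS, weights, biases):
        logit = sum(w * e for w, e in zip(w_row, embedding)) + b
        if sigmoid(logit) >= threshold:
            selected.append(label)
    return selected


# Toy stand-in for a document embedding produced by a transformer encoder.
embedding = [1.0, -1.0]
# One weight row and one bias per label (made-up values).
weights = [[2.0, 0.0], [0.0, 2.0], [1.0, -1.0]]
biases = [0.0, 0.0, 0.0]

text = build_input("A toy title", "A toy abstract.")
predicted = predict_labels(embedding, weights, biases)
print(text)
print(predicted)
```

In a real pipeline the toy embedding would be replaced by the encoder's output for `text`, and the weights would be learned with a per-label binary cross-entropy loss.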

Publication
Presented at the 6th IEEE International PuneCon
Date
Links
Coming Soon!