Automatic Indexing of Digital Books using RAKE and Word2Vec

Arhan Windu Rizki Putra Budianto; Ulla Delfana Rosiani; Vit Zuraida; Rizki Putri Ramadhani; Mohammad Alfarizi Abdullah

doi:10.20961/joive.v9i1.3050

Authors

Arhan Windu Rizki Putra Budianto, Politeknik Negeri Malang (Author)
Dr. Ulla Delfana Rosiani, ST., MT., Politeknik Negeri Malang (Author), https://orcid.org/0000-0002-1512-7528
Vit Zuraida, S.Kom., M.Kom., Politeknik Negeri Malang (Author)
Rizki Putri Ramadhani, Politeknik Negeri Malang (Translator), https://orcid.org/0009-0001-6053-7585
Mohammad Alfarizi Abdullah, Politeknik Negeri Malang (Author)

DOI:

https://doi.org/10.20961/joive.v9i1.3050

Keywords:

Automatic indexing, digital book, RAKE, Word2Vec, PDF

Abstract

Manual indexing of digital books is time-consuming and prone to inconsistency. To address this, this study developed an automatic indexing system that combines the RAKE (Rapid Automatic Keyword Extraction) method with Word2Vec. The system accepts a PDF file as input, preprocesses the text, and extracts key phrases using RAKE. These phrases are then filtered by their semantic relevance to a specified topic using an Indonesian-language Word2Vec model. Users can add phrases manually and select the relevant ones for inclusion in the final index. The resulting index lists each phrase with its page numbers and relevance score, and is inserted as an additional page at the end of the PDF document. The system was evaluated by comparing the system-generated index with the author’s manual index using precision, recall, and cosine similarity. The results indicate that although precision and recall were very low, a cosine similarity score of 0.69 suggests that the system output is semantically similar to the author’s index.
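
The pipeline described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the stopword list, the function names, the averaging of word vectors into a phrase vector, and the 0.5 threshold are all assumptions made for illustration, and the embeddings argument stands in for a trained Indonesian-language Word2Vec model. The sketch shows RAKE-style phrase scoring (word degree divided by word frequency, summed per phrase), a topic filter based on cosine similarity, and the precision and recall computation against a manual index.

```python
import math
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumption); a real system would use a
# full list for the language of the book.
STOPWORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "using", "in", "for", "by"}


def rake_phrases(text, stopwords=STOPWORDS):
    """Score candidate phrases with RAKE's degree/frequency word scores."""
    phrases = []
    # Candidate phrases are delimited by punctuation and by stopwords.
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in stopwords:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))

    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, counting the word itself

    return {" ".join(p): sum(degree[w] / freq[w] for w in p) for p in phrases}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def phrase_vector(phrase, embeddings):
    """Average the vectors of the known words in a phrase (assumed pooling choice)."""
    vectors = [embeddings[w] for w in phrase.split() if w in embeddings]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def filter_by_topic(phrases, topic, embeddings, threshold=0.5):
    """Keep phrases whose similarity to the topic meets the (hypothetical) threshold."""
    topic_vec = phrase_vector(topic, embeddings)
    kept = {}
    if topic_vec is None:
        return kept
    for phrase in phrases:
        vec = phrase_vector(phrase, embeddings)
        if vec is not None:
            score = cosine_similarity(vec, topic_vec)
            if score >= threshold:
                kept[phrase] = score
    return kept


def precision_recall(predicted, reference):
    """Exact-match precision and recall of predicted index terms against a manual index."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall
```

Because precision and recall in this sketch count only exact string matches, a system can score poorly on both while still producing phrases that lie close to the manual index in embedding space, which is consistent with the pattern of results reported in the abstract.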


Journal of Informatics and Vocational Education