A comparative study of keyword extraction algorithms for English texts

Journal of Intelligent Systems 30 (1):808-815 (2021)

Abstract

This study analyzed keyword extraction from English text. First, two commonly used algorithms, the term frequency–inverse document frequency (TF–IDF) algorithm and the keyphrase extraction algorithm (KEA), were introduced. Then, an improved TF–IDF algorithm was designed that refined the calculation of word frequency and combined it with a position weight to improve keyword extraction performance. Finally, 100 English texts were selected from the British Academic Written English Corpus for the experiment. The results showed that the improved TF–IDF algorithm had the shortest running time, taking only 4.93 s to process the 100 texts, and that the precision of the algorithms decreased as the number of extracted keywords increased. The comparison among the algorithms showed that the improved TF–IDF algorithm performed best, with a precision of 71.2%, a recall of 52.98%, and an F1 score of 60.75% when five keywords were extracted from each article. These results indicate that the improved TF–IDF algorithm is effective at extracting keywords from English text and can be further promoted and applied in practice.
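
The abstract does not give the improved formula, so the following is only a minimal sketch of the general idea: textbook TF–IDF scores multiplied by a position weight, plus the precision, recall, and F1 measures used in the evaluation. The tokenizer, the function names, and the `title_boost` weight are assumptions made for illustration and are not the paper's method.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, split on non-letters.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf_keywords(docs, index, k=5, title_boost=2.0):
    """Rank the words of docs[index] by TF-IDF times a position weight.

    docs is a list of (title, body) pairs. title_boost is a hypothetical
    position weight applied to words that occur in the title.
    """
    token_sets = [set(tokenize(t + " " + b)) for t, b in docs]
    title, body = docs[index]
    tokens = tokenize(title + " " + body)
    title_words = set(tokenize(title))
    n_docs = len(docs)
    scores = {}
    for word, count in Counter(tokens).items():
        tf = count / len(tokens)                      # term frequency
        df = sum(1 for s in token_sets if word in s)  # document frequency
        idf = math.log(n_docs / df)                   # textbook IDF
        weight = title_boost if word in title_words else 1.0
        scores[word] = tf * idf * weight
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


def precision_recall_f1(extracted, reference):
    # Compare extracted keywords against reference (gold) keywords.
    hits = len(set(extracted) & set(reference))
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, and the reported figures are consistent with it: 2 × 0.712 × 0.5298 / (0.712 + 0.5298) ≈ 0.6075.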


