A keyword extraction method for Chinese professional field text based on improved RAKE
22 February 2023
Yu-xi Xie, Jiang-ping Yang, Tai-yong Fei, Bao-zhen Yu, Xin Hu
Proceedings Volume 12587, Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022); 125871F (2023) https://doi.org/10.1117/12.2667258
Event: Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022), 2022, Shanghai, China
Abstract
This paper proposes an improved RAKE (Rapid Automatic Keyword Extraction) algorithm to address two problems that arise when extracting keywords from Chinese professional field texts: phrase conglutination (candidate phrases running together instead of being split) and the failure to extract professional terms. First, professional field stop words are extracted with the TTF-IDF (Total Term Frequency-Inverse Document Frequency) method and added to the general stop word dictionary used for phrase segmentation. Second, professional domain entity words are added to the general word segmentation dictionary and given an appropriate weight in the degree calculation. Because professional entity words carry more of the core information in professional field texts, this weighting ensures that they receive higher scores and are effectively extracted as keywords. Experiments show that the improved algorithm outperforms basic RAKE and other algorithms at keyword extraction from Chinese professional field texts.
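The two modifications described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the text has already been segmented into tokens, that the domain stop words have already been selected (the abstract does not give the TTF-IDF formula, so that step is not reproduced), and that the entity weight is a simple multiplier on a word's degree. The function names, the example tokens, and the weight value 1.5 are all hypothetical.

```python
from collections import defaultdict


def split_phrases(tokens, stop_words):
    """Split a token sequence into candidate phrases at every stop word."""
    phrases, current = [], []
    for tok in tokens:
        if tok in stop_words:
            if current:
                phrases.append(tuple(current))
                current = []
        else:
            current.append(tok)
    if current:
        phrases.append(tuple(current))
    return phrases


def rake_scores(phrases, entity_weight=None):
    """Score phrases as the sum of per-word degree/frequency ratios.

    entity_weight maps a domain entity word to a multiplier applied to
    its degree (an assumed form of the weighting); other words use 1.0.
    """
    entity_weight = entity_weight or {}
    freq = defaultdict(int)
    degree = defaultdict(float)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    word_score = {
        w: entity_weight.get(w, 1.0) * degree[w] / freq[w] for w in freq
    }
    return {p: sum(word_score[w] for w in p) for p in set(phrases)}


# Hypothetical pre-segmented sentence: radar / transmitter / (particle) /
# fault / carry-out / inspection / and / radar / antenna
tokens = ["雷达", "发射机", "的", "故障", "进行", "检查", "和", "雷达", "天线"]
general_stop = {"的", "和"}
domain_stop = {"进行"}                 # assumed output of the stop word extraction step
entity_weight = {"发射机": 1.5}        # assumed weight for a domain entity word

baseline_phrases = split_phrases(tokens, general_stop)
improved_phrases = split_phrases(tokens, general_stop | domain_stop)
scores = rake_scores(improved_phrases, entity_weight)

for phrase, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print("".join(phrase), score)
```

With only the general stop words, "故障", "进行", and "检查" stay glued together as one candidate phrase; adding the domain stop word splits them. The weighted entity word "发射机" lifts the phrase that contains it above the otherwise equivalent "雷达天线".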
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yu-xi Xie, Jiang-ping Yang, Tai-yong Fei, Bao-zhen Yu, and Xin Hu "A keyword extraction method for Chinese professional field text based on improved RAKE", Proc. SPIE 12587, Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022), 125871F (22 February 2023); https://doi.org/10.1117/12.2667258
KEYWORDS
Associative arrays
Radar
Semantics
Reliability
Neural networks
Analytical research
Data modeling