Paper
28 March 2023 Spam detection using Catboost integration algorithm
TianLin Zhang, YouFeng Niu, Rong Ma, MengYuan Zhao, DuoYang Song, HengBin Liu
Proceedings Volume 12597, Second International Conference on Statistics, Applied Mathematics, and Computing Science (CSAMCS 2022); 125973A (2023) https://doi.org/10.1117/12.2672788
Event: Second International Conference on Statistics, Applied Mathematics, and Computing Science (CSAMCS 2022), 2022, Nanjing, China
Abstract
This paper introduces the background and significance of building a CatBoost-based spam detection model and proposes a new approach that builds on the classical research model. Spam occupies a large share of network resources, to the point that 85% of a mail server's system resources are used for identifying spam. This is not only a waste of resources but may even lead to network congestion and paralysis, disrupting the normal business email communication of enterprises. In this study, we use the Enron-Spam dataset, a publicly available dataset widely used in email-related research. First, we extract features using bag-of-words and TF-IDF processing, and then train the model with the CatBoost integration algorithm. The final accuracy of the model exceeds 98%. Compared with the conventional model, the proposed model performs better and can effectively improve the accuracy of spam detection and identification.
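The feature-extraction step named in the abstract, bag-of-words (word bagging) counts followed by TF-IDF weighting, can be illustrated with a minimal sketch. The abstract does not give the authors' tokenization or their exact TF-IDF formula, so the whitespace tokenizer, the unsmoothed idf = ln(N / df), the helper names (`tokenize`, `bag_of_words`, `tf_idf`), and the three toy emails below are all assumptions made for illustration; they are not taken from the paper or from the Enron-Spam dataset.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace only.
    return text.lower().split()

def bag_of_words(docs):
    """Return (vocabulary, count_vectors) for a list of raw documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for tok, n in Counter(doc).items():
            row[index[tok]] = n
        vectors.append(row)
    return vocab, vectors

def tf_idf(count_vectors):
    """Weight counts as tf * idf, with tf = count / doc length and
    idf = ln(N / df). This unsmoothed variant is an assumption."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0]) if count_vectors else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in count_vectors if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in count_vectors:
        total = sum(row) or 1  # guard against empty documents
        weighted.append([
            (row[j] / total) * math.log(n_docs / df[j]) if df[j] else 0.0
            for j in range(n_terms)
        ])
    return weighted

# Toy messages, hypothetical and not drawn from Enron-Spam.
emails = [
    "win free money now",
    "meeting schedule for monday",
    "free free prize claim now",
]
vocab, counts = bag_of_words(emails)
features = tf_idf(counts)
```

Each row of `features` is one email expressed as a fixed-length numeric vector. Rows like these, paired with spam or non-spam labels, are the kind of input a gradient-boosting classifier such as CatBoost would then be trained on; that training step is not shown here.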
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
TianLin Zhang, YouFeng Niu, Rong Ma, MengYuan Zhao, DuoYang Song, and HengBin Liu "Spam detection using Catboost integration algorithm", Proc. SPIE 12597, Second International Conference on Statistics, Applied Mathematics, and Computing Science (CSAMCS 2022), 125973A (28 March 2023); https://doi.org/10.1117/12.2672788
KEYWORDS
Machine learning
Education and training
Data modeling
Detection and tracking algorithms
Feature extraction
Feature selection
Reflection