26 May 2023
Research and optimization of massive small file processing performance based on Ceph
Aiwu Shi, Zhencai Tian, Ge Chen, Jiyong Min, Jun Wu
Proceedings Volume 12700, International Conference on Electronic Information Engineering and Data Processing (EIEDP 2023); 127000L (2023) https://doi.org/10.1117/12.2682512
Event: International Conference on Electronic Information Engineering and Data Processing (EIEDP 2023), 2023, Nanchang, China
Abstract
Rapid technological development means that large numbers of files are now generated across many domains, and storing and processing massive numbers of small files is a significant challenge for Ceph. Ceph is a scalable, reliable, high-performance storage solution widely used in cloud computing. With a large number of small files, however, Ceph suffers from problems such as write amplification, which cause performance bottlenecks. This paper proposes a novel technique, the Extended Small Files Processing Framework (ESFPF). First, to store files efficiently, small files are deduplicated and then merged, which reduces the number of data blocks in Ceph, lowers the load, and makes data processing more efficient. Second, a prefetching mechanism and a file index are introduced to improve the efficiency of accessing small files. The experimental results indicate that the proposed approach improves the efficiency of storing and accessing massive numbers of small files on Ceph.
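As an illustration only, the deduplicate-then-merge step and the file index described in the abstract might be sketched as follows. The abstract gives no implementation details, so the class name `MergedStore`, its methods, the use of SHA-256 to detect duplicate content, and the in-memory byte buffer standing in for a single large Ceph object are all assumptions; the prefetching mechanism is not modeled.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class MergedStore:
    """Hypothetical sketch: pack many small files into one large blob.

    The blob stands in for a single large storage object; nothing here
    talks to Ceph, and this is not the ESFPF implementation.
    """
    blob: bytearray = field(default_factory=bytearray)
    # content digest -> (offset, length) of the unique copy inside the blob
    by_digest: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    # file name -> (offset, length); this plays the role of the file index
    index: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def put(self, name: str, data: bytes) -> None:
        """Append data to the blob unless identical content is already stored."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.by_digest:
            self.by_digest[digest] = (len(self.blob), len(data))
            self.blob.extend(data)
        # Duplicate files share the location of the first stored copy.
        self.index[name] = self.by_digest[digest]

    def get(self, name: str) -> bytes:
        """Look the file up in the index and slice it out of the blob."""
        offset, length = self.index[name]
        return bytes(self.blob[offset:offset + length])


store = MergedStore()
store.put("a.txt", b"hello")
store.put("b.txt", b"hello")   # duplicate content, stored only once
store.put("c.txt", b"world")
print(len(store.blob), store.get("b.txt"))
```

In this sketch three small files occupy one merged buffer, and the two files with identical content resolve to the same offset, so a read needs only one index lookup and one slice of the merged data.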
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Aiwu Shi, Zhencai Tian, Ge Chen, Jiyong Min, and Jun Wu "Research and optimization of massive small file processing performance based on Ceph", Proc. SPIE 12700, International Conference on Electronic Information Engineering and Data Processing (EIEDP 2023), 127000L (26 May 2023); https://doi.org/10.1117/12.2682512
KEYWORDS
Data storage
Distributed computing
Data processing
Cloud computing
Industrial applications
Internet