1 December 2021 Research on error correction method of Tibetan text based on deep learning
Huaguo Cairang, Secha Jia, Cairang Jia
Proceedings Volume 12079, Second IYSF Academic Symposium on Artificial Intelligence and Computer Engineering; 120791F (2021) https://doi.org/10.1117/12.2622716
Event: 2nd IYSF Academic Symposium on Artificial Intelligence and Computer Engineering, 2021, Xi'an, China
Abstract
To keep excessive errors in Tibetan text from degrading text quality, a Tibetan text error correction method based on Soft-Masked BERT is proposed. This work defines and classifies Tibetan text errors. The input to the error detection model is an embedding composed of the word embedding, position embedding and paragraph embedding. The detection model outputs a probability label for each character in the text, which identifies the characters with a high error probability. The final soft-masked embedding is obtained as a weighted sum that combines the output of the error detection model with the input embedding. The correction model, built from encoder layers, takes the full sequence of soft-masked embeddings as input, and residual connection values are obtained by connecting the input embeddings with the hidden states of the encoder layers. These residual connection values are fed to a fully connected layer, and a softmax function gives the probability that each output character is correctly replaced by each candidate character, which completes the Tibetan text error correction. Experimental results show that the method corrects Tibetan text errors efficiently and uses the error probabilities of the candidate words to select the best correction.
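The soft-masking, residual and softmax steps described in the abstract can be illustrated with a minimal sketch. It assumes the formulation commonly used for Soft-Masked BERT, in which each input embedding e_i is blended with a single mask embedding e_mask according to the detected error probability p_i, that is e'_i = p_i * e_mask + (1 - p_i) * e_i. The abstract does not give the exact formula, so the mask embedding, the function names and the toy values below are assumptions for illustration and not the authors' implementation.

```python
import math


def soft_mask(embeddings, mask_embedding, error_probs):
    """Blend each input embedding with the mask embedding.

    Assumed formula: e'_i = p_i * e_mask + (1 - p_i) * e_i,
    where p_i is the error probability from the detection model.
    """
    if len(embeddings) != len(error_probs):
        raise ValueError("one error probability is needed per character")
    return [
        [p * m + (1.0 - p) * x for m, x in zip(mask_embedding, e)]
        for e, p in zip(embeddings, error_probs)
    ]


def residual(hidden_states, embeddings):
    """Add the input embeddings back onto the encoder hidden states."""
    return [
        [h + x for h, x in zip(h_i, e_i)]
        for h_i, e_i in zip(hidden_states, embeddings)
    ]


def softmax(logits):
    """Turn candidate-character scores into probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


# Toy example: two characters with 2-dimensional embeddings (hypothetical values).
embeddings = [[1.0, 0.0], [0.0, 1.0]]
mask_embedding = [0.5, 0.5]
error_probs = [0.0, 1.0]  # first character looks correct, second looks wrong

soft = soft_mask(embeddings, mask_embedding, error_probs)
print(soft)  # [[1.0, 0.0], [0.5, 0.5]]
```

A character with error probability 0 keeps its own embedding, while a character with error probability 1 is fully replaced by the mask embedding, so the correction model sees a graded mask instead of a hard one.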
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Huaguo Cairang, Secha Jia, and Cairang Jia "Research on error correction method of Tibetan text based on deep learning", Proc. SPIE 12079, Second IYSF Academic Symposium on Artificial Intelligence and Computer Engineering, 120791F (1 December 2021); https://doi.org/10.1117/12.2622716
KEYWORDS
Error analysis

Data modeling

Computer programming

Data processing

Data corrections

Legal

Performance modeling
