1. School of Computer Science and Engineering, Southeast University, Nanjing 210000, China
2. Key Laboratory of Computer Network and Information Integration, Ministry of Education, Southeast University, Nanjing 210000, China
3. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
E-mail: syk@seu.edu.cn
libing@seu.edu.cn
lexiangli@seu.edu.cn
‡ Corresponding author
Received: 11 June 2024
Revised: 10 October 2024
Published: July 2025
Yuankang SUN, Bing LI, Lexiang LI, et al. Shared-weight multimodal translation model for recognizing Chinese variant characters[J]. Frontiers of information technology & electronic engineering, 2025, 26(7): 1066-1082. DOI: 10.1631/FITEE.2400504.
The task of recognizing Chinese variant characters aims to address the challenges of semantic ambiguity and confusion, which pose potential risks to the security of Web content and complicate the governance of sensitive words. Most existing approaches prioritize the acquisition of contextual knowledge from Chinese corpora and vocabularies during pretraining, often overlooking the inherent phonological and morphological characteristics of the Chinese language. To address these issues, we propose a shared-weight multimodal translation model (SMTM) that integrates the phonology of Pinyin and the morphology of fonts into each Chinese character token to learn the deeper semantics of variant text. Specifically, we encode the Pinyin features of Chinese characters using an embedding layer and extract their font features with convolutional neural networks. Considering the multimodal similarity between the source and target sentences in the Chinese variant-character-recognition task, we design a shared-weight embedding mechanism that uses heuristic information from the source sentences to generate target sentences during training. Experimental results show that SMTM achieves 89.550% and 79.480% on the bilingual evaluation understudy (BLEU) and F1 metrics, respectively, a significant improvement over state-of-the-art baseline models.
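The shared-weight idea summarized in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the class name `SharedMultimodalEmbedding`, the helper `glyph_features`, the toy 3x3 bitmaps, the embedding width, and the choice to fuse the three modalities by summation are all assumptions made for illustration. The only point it shows is that one set of character, Pinyin, and glyph weights can serve both the source (variant) and target (standard) sentences.

```python
import random

random.seed(0)
DIM = 4  # illustrative embedding width (assumption)


def table(n, dim=DIM):
    """Randomly initialised lookup table with n rows."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]


def glyph_features(bitmap, kernels):
    """Valid 2-D convolution, ReLU, then global max pooling: one value per kernel."""
    k = len(kernels[0])
    h, w = len(bitmap), len(bitmap[0])
    feats = []
    for kern in kernels:
        best = 0.0  # ReLU floor
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                s = sum(kern[a][b] * bitmap[i + a][j + b]
                        for a in range(k) for b in range(k))
                best = max(best, s)
        feats.append(best)
    return feats


class SharedMultimodalEmbedding:
    """Hypothetical embedding whose weights are reused for source and target."""

    def __init__(self, n_chars, n_pinyin, bitmaps):
        self.char = table(n_chars)      # character embeddings
        self.pinyin = table(n_pinyin)   # Pinyin (phonological) embeddings
        self.kernels = [[[random.uniform(-1, 1) for _ in range(2)]
                         for _ in range(2)] for _ in range(DIM)]
        self.bitmaps = bitmaps          # one toy font bitmap per character

    def embed(self, char_ids, pinyin_ids):
        out = []
        for c, p in zip(char_ids, pinyin_ids):
            g = glyph_features(self.bitmaps[c], self.kernels)
            # Fusion by summation is an assumption, not taken from the paper.
            out.append([x + y + z for x, y, z
                        in zip(self.char[c], self.pinyin[p], g)])
        return out


# Toy font bitmaps for three hypothetical characters.
bitmaps = [
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
emb = SharedMultimodalEmbedding(n_chars=3, n_pinyin=2, bitmaps=bitmaps)
src = emb.embed([0, 1], [0, 1])  # source (variant) sentence
tgt = emb.embed([1, 2], [1, 0])  # target (standard) sentence, same weights
```

Because `src` and `tgt` come from the same object, a character that appears on both sides with the same Pinyin receives an identical vector, which is one plausible reading of how source-side information could guide target generation.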