MAL: multilevel active learning with BERT for Chinese textual affective structure analysis
College of Information and Management Science, Henan Agricultural University, Zhengzhou 450002, China
E-mail: xsf@whu.edu.cn
‡ Corresponding author
Received: 31 March 2024
Revised: 22 September 2024
Published: June 2025
Shufeng XIONG, Guipei ZHANG, Xiaobo FAN, et al. MAL: multilevel active learning with BERT for Chinese textual affective structure analysis[J]. Frontiers of information technology & electronic engineering, 2025, 26(6): 833-846. DOI: 10.1631/FITEE.2400242.
Chinese textual affective structure analysis (CTASA) is a sequence labeling task that often relies on supervised deep learning methods. However, acquiring a large annotated dataset for training can be costly and time-consuming. Active learning offers a solution by selecting the most valuable samples to reduce labeling costs. Previous approaches focused on uncertainty or diversity but faced challenges such as biased models or selecting insignificant samples. To address these issues, multilevel active learning (MAL) is introduced, which leverages deep textual information at both the sentence and word levels, taking into account the complex structure of the Chinese language. By integrating the sentence-level features extracted from bidirectional encoder representations from Transformers (BERT) embeddings and the word-level probability distributions obtained through a conditional random field (CRF) model, MAL comprehensively captures the Chinese textual affective structure (CTAS). Experimental results demonstrate that MAL significantly reduces annotation costs by approximately 70% and achieves more consistent performance compared to baseline methods.
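The abstract describes a two-level acquisition criterion: word-level uncertainty from CRF label distributions and sentence-level information from BERT embeddings. The following minimal sketch illustrates how such a combined score could be computed; it is an illustration only, not the paper's actual MAL algorithm. The function names, the cosine-distance diversity measure, and the mixing weight `alpha` are all assumptions, and the per-token distributions stand in for CRF marginals produced by a trained model.

```python
import numpy as np

def token_uncertainty(marginals):
    """Mean entropy of per-token label distributions (stand-in for CRF marginals).

    marginals: array of shape (seq_len, n_labels), rows summing to 1.
    Length-normalized so long sentences are not automatically favored.
    """
    ent = -np.sum(marginals * np.log(marginals + 1e-12), axis=-1)
    return float(ent.mean())

def sentence_diversity(embedding, labeled_embeddings):
    """Cosine distance from a sentence embedding to its nearest labeled neighbor."""
    a = embedding / np.linalg.norm(embedding)
    b = labeled_embeddings / np.linalg.norm(labeled_embeddings, axis=1, keepdims=True)
    return float(1.0 - np.max(b @ a))

def select_batch(pool, labeled_embeddings, k=2, alpha=0.5):
    """Rank unlabeled samples by a combined word-level / sentence-level score.

    pool: list of (sentence_embedding, token_marginals) pairs.
    Returns the indices of the k highest-scoring samples.
    """
    scores = [
        alpha * token_uncertainty(marg)
        + (1 - alpha) * sentence_diversity(emb, labeled_embeddings)
        for emb, marg in pool
    ]
    return sorted(range(len(pool)), key=lambda i: -scores[i])[:k]

# Toy usage: one candidate has uniform (maximally uncertain) token
# distributions and an embedding far from the labeled set, so it ranks first.
labeled = np.eye(4)[:2]
peaked = np.array([[0.99, 0.01], [0.99, 0.01]])
uniform = np.array([[0.5, 0.5], [0.5, 0.5]])
pool = [(np.eye(4)[0], peaked), (np.eye(4)[3], uniform), (np.eye(4)[1], peaked)]
print(select_batch(pool, labeled, k=1))  # → [1]
```

The design point the sketch captures is that neither signal suffices alone: pure uncertainty sampling can repeatedly query near-duplicate sentences, while pure diversity sampling can select samples the model already labels confidently.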