2026, No. 01, Vol. 36, pp. 24-30
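Research on Deep Learning-Based Code Conversion Technology for Compiled Languages
Foundation: Supported by the State Grid Corporation of China Headquarters Management Science and Technology Project (5700-202418244A-1-1-ZN)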
DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0188
Abstract:

As demand grows for cross-platform software and for systems written in multiple languages, automatic source code conversion has become a key research direction in modern software engineering. Traditional rule-based and statistical conversion methods suffer from narrow grammar coverage and weak semantic consistency, so they struggle to meet the requirements of large-scale, high-precision code migration. This paper focuses on code conversion between compiled languages and proposes a deep learning-based method for automatic Java-to-C++ conversion. The method combines a Transformer encoder-decoder architecture, syntax tree modeling, a hierarchical attention mechanism, and a pointer-generator mechanism, allowing it to capture both the lexical and the structural features of source code and to handle the translation of out-of-vocabulary identifiers effectively. Systematic experiments on a purpose-built Java-C++ parallel dataset show that the proposed model improves BLEU by 6.4 percentage points, CodeBLEU by 4.7 percentage points, exact match rate by 5.7 percentage points, and functional correctness by 7.8 percentage points, significantly outperforming existing mainstream methods across multiple evaluation metrics. Ablation experiments and case analysis further confirm that each component of the model architecture contributes substantially to the performance gains.
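The abstract names a pointer-generator mechanism for out-of-vocabulary identifiers but does not give its formulation. As a rough illustration only, the sketch below follows the commonly used pointer-generator mixture, P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (sum of attention weights on source positions holding w), which may differ from the authors' actual model. The function name `pointer_generator_mix`, the toy tokens, and all probability values are assumptions made for this example and do not come from the paper.

```python
from collections import defaultdict


def pointer_generator_mix(p_gen, vocab_probs, attention, source_tokens):
    """Blend a vocabulary distribution with a copy distribution.

    Hypothetical helper, not the paper's implementation.
    p_gen         -- probability of generating from the fixed vocabulary
    vocab_probs   -- dict mapping vocabulary token -> probability
    attention     -- one attention weight per source token (sums to 1)
    source_tokens -- the tokenized source (e.g. Java) statement
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("one attention weight per source token is required")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob
    # Copy path: add attention mass to whichever token sits at each position,
    # so identifiers missing from the vocabulary still get probability.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy decoding step: "userCount" is not in the vocabulary (assumed values).
source_tokens = ["int", "userCount", "=", "0", ";"]
attention = [0.05, 0.80, 0.05, 0.05, 0.05]
vocab_probs = {"int": 0.5, "=": 0.2, "0": 0.1, ";": 0.1, "<unk>": 0.1}

final = pointer_generator_mix(0.3, vocab_probs, attention, source_tokens)
best = max(final, key=final.get)
print(best)
```

In this toy step the identifier absent from the vocabulary receives most of the probability mass through the copy path, which is the behavior the abstract attributes to the mechanism; the mixed distribution still sums to 1 because both input distributions do.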

Basic information:

CLC number: TP311.5; TP18; TP314

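Citation information:

[1] ZHANG Mingming, ZHANG Fulin, LIU Jiange, et al. Research on deep learning-based code conversion technology for compiled languages[J]. Computer Technology and Development, 2026, 36(01): 24-30. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0188.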

Release date:

2025-06-26

Publication date:

2025-06-26

Online release date:

2025-06-26
